4 Bootstrap

Theoretical (e.g. $\chi^2$) and numerical (e.g. Monte Carlo) error estimates of $\bar{\beta}_{\mathrm{min}}$ for nonlinear models are discussed, e.g., by Press et al. (1988). We estimate the errors of $\bar{\beta}_{\mathrm{min}}$ with the bootstrap for regression coefficients (Efron & Tibshirani 1986). This bootstrap also provides estimates of "other" model parameters (and their errors) that would be difficult to solve analytically for $K\!\geq\!2$, e.g. the total amplitude ($A$) and the epochs of the minima ($t_{\mathrm{min,1}}, t_{\mathrm{min,2}}, ..., t_{\mathrm{min,K}}$). The six stages of our bootstrap are:

1.

Minimizing $\chi^2$ (Eq. 8) for $\bar{y}$ with $\bar{w}$ gives $\bar{\beta}_{\mathrm{min}}$ and the residuals

$\begin{displaymath}{\epsilon}_i = y(t_i)-g(t_i,\bar{\beta}_{\mathrm{min}}) =y_i-g_i, \end{displaymath}$

(11)

i.e. the "empirical distribution of residuals".

2.

A random sample $\bar{\epsilon}^*$ is selected from $\bar{\epsilon}$ with replacement, so the number of times the same $\epsilon_i$ enters $\bar{\epsilon}^*$ may vary between 0 and $n$. The sample $\bar{\epsilon}^*$ determines a unique random sample $\bar{w}^*$, because the connection of $w_i$ being the weight of $\epsilon_i$ is preserved.

3.

A random sample $\bar{y}^{*} \! = \! \bar{g} \! + \! \bar{\epsilon}^{*}$ is obtained.

4.

Minimizing $\chi^2$ (Eq. 8) for $\bar{y}^*$ with $\bar{w}^*$ gives one estimate for $\bar{\beta'}_{\mathrm{min}}$, as well as for the "other" parameters of higher-order models ($K\!\geq\!2$). For example, measuring the difference between the minimum and maximum of $\bar{g}(\bar{\beta'}_{\mathrm{min}})$ gives a numerical estimate of $A$.

5.

The bootstrap returns to the 2nd stage until $S$ estimates of $\bar{\beta'}_{\mathrm{min}}$ and the "other" parameters have been obtained.

6.

The expectation value and variance for any component of $\bar{\beta}_{\mathrm{min}}$ are the mean and variance of its $S$ estimates in $\bar{\beta'}_{\mathrm{min}}$. The same applies to the $S$ estimates of the "other" model parameters.

Note that $\bar{y}$, $\bar{w}$, $\bar{\epsilon}$ and $\bar{g}$ remain unchanged, while $\bar{y}^*$, $\bar{w}^*$ and $\bar{\epsilon}^*$ change in every bootstrap sample, and $\bar{\epsilon}^*$ determines both $\bar{y}^*$ and $\bar{w}^*$.
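The six stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used here: Eq. (8) is assumed to be the weighted sum $\chi^2 = \sum_i w_i (y_i-g_i)^2$, a weighted straight-line fit (`fit_line`, `model_line`) stands in for the nonlinear model $g(t,\bar{\beta})$, and the amplitude is measured at the observation times only. All function and variable names are hypothetical.

```python
import random
import statistics


def fit_line(t, y, w):
    """Weighted least squares for the stand-in model g(t) = b0 + b1*t.

    Assumption: minimizes sum_i w_i * (y_i - g_i)**2 in place of Eq. (8).
    """
    sw = sum(w)
    t_mean = sum(wi * ti for wi, ti in zip(w, t)) / sw
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / sw
    s_tt = sum(wi * (ti - t_mean) ** 2 for wi, ti in zip(w, t))
    s_ty = sum(wi * (ti - t_mean) * (yi - y_mean) for wi, ti, yi in zip(w, t, y))
    b1 = s_ty / s_tt
    return (y_mean - b1 * t_mean, b1)


def model_line(t, beta):
    """Evaluate the stand-in model at the observation times."""
    return [beta[0] + beta[1] * ti for ti in t]


def residual_bootstrap(t, y, w, fit, model, n_boot=200, seed=0):
    """Residual bootstrap that keeps each weight w_i attached to its eps_i."""
    rng = random.Random(seed)
    n = len(y)
    # Stage 1: fit the original data and store the residuals.
    beta_min = fit(t, y, w)
    g = model(t, beta_min)
    eps = [yi - gi for yi, gi in zip(y, g)]
    estimates = []
    for _ in range(n_boot):  # Stage 5: repeat until n_boot (= S) estimates exist.
        # Stage 2: resample residuals with replacement; weights follow the
        # same indices, so the (eps_i, w_i) pairing is preserved.
        idx = [rng.randrange(n) for _ in range(n)]
        eps_star = [eps[j] for j in idx]
        w_star = [w[j] for j in idx]
        # Stage 3: build the bootstrap data from the unchanged model values.
        y_star = [gi + ej for gi, ej in zip(g, eps_star)]
        # Stage 4: refit, then derive an "other" parameter (amplitude A)
        # numerically from the refitted model values.
        beta_star = fit(t, y_star, w_star)
        g_star = model(t, beta_star)
        amplitude = max(g_star) - min(g_star)
        estimates.append(tuple(beta_star) + (amplitude,))
    # Stage 6: mean and variance of the S estimates of every parameter.
    columns = list(zip(*estimates))
    means = [statistics.mean(c) for c in columns]
    variances = [statistics.variance(c) for c in columns]
    return beta_min, means, variances


# Synthetic example data (made up for illustration only).
rng = random.Random(1)
t = [float(i) for i in range(30)]
y = [1.0 + 2.0 * ti + rng.gauss(0.0, 0.5) for ti in t]
w = [1.0] * len(t)
beta_min, means, variances = residual_bootstrap(t, y, w, fit_line, model_line)
print(beta_min, means, variances)
```

The last entry of `means` and `variances` refers to the amplitude, the preceding entries to the components of the fitted coefficient vector. Passing `fit` and `model` as arguments keeps the resampling logic independent of the particular model, so a nonlinear fitter could be substituted without touching the bootstrap loop.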

We introduce some notation useful in testing the "Gaussian hypothesis":

$H_{\mathrm{G}}$: An arbitrary $\bar{x}$ with $S$ components represents a random sample drawn from a Gaussian distribution.

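These $\bar{x}$ (e.g. $\bar{\epsilon}$, $\bar{M}$, $\bar{A}$, $\bar{P}$ or $\bar{t}_{\mathrm{min,1}}$) are first arranged into ascending (i.e. rank) order $x_1 \! \leq \! x_2 \!\leq ... \leq x_{\mathrm{S}}$ and then transformed to $u_i\!=\!(x_i-m_x)/s_x$, where $m_x$ and $s_x$ are the mean and standard deviation of $\bar{x}$. The sample cumulative distribution function is

$\begin{displaymath}F_{\mathrm{S}}(u)=\left\{ \begin{array}{ll} 0, & u < u_1 \\ i/S, & u_i \leq u < u_{i+1}, \;\; 1 \leq i \leq S-1 \\ 1, & u \geq u_{\mathrm{S}} \end{array} \right. \end{displaymath}$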

(12)

and the cumulative Gaussian distribution function is

$\begin{displaymath}F(u)=(2\pi)^{-1/2}\int_{-\infty}^{u} {\mathrm{e}}^{-z^2/2} \, {\rm d}z. \end{displaymath}$

(13)

The preassigned significance level $\alpha \!=\!0.01$ for rejecting $H_{\mathrm{G}}$ determines an upper limit $c(\alpha\!=\!0.01,S)$ for the Kolmogorov-Smirnov test statistic $a = {\mathrm{max}} \left[ \, \mid F_{\mathrm{S}}(u)- F(u) \mid \, \right]$. $H_{\mathrm{G}}$ is rejected if, and only if,

$\begin{displaymath}a \geq c(\alpha\!=\!0.01,S). \end{displaymath}$

(14)
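The test statistic $a$ and the rejection rule of Eq. (14) can be sketched as follows. This is a minimal illustration, not the code used here: the function names are hypothetical, $s_x$ is assumed to be the sample standard deviation with $S-1$ in the denominator, and the critical value $c(\alpha,S)$ is not computed but must be supplied by the caller (e.g. from a table). Because the sample cumulative distribution function is a step function, the maximum difference is checked on both sides of every step.

```python
import math
import random
import statistics


def gaussian_cdf(u):
    """Cumulative Gaussian distribution function F(u) of Eq. (13)."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def ks_statistic(x):
    """Kolmogorov-Smirnov statistic a = max |F_S(u) - F(u)| for a sample x."""
    S = len(x)
    m_x = statistics.mean(x)
    s_x = statistics.stdev(x)  # assumption: (S - 1) in the denominator
    u = sorted((xi - m_x) / s_x for xi in x)
    a = 0.0
    for i, ui in enumerate(u, start=1):
        F = gaussian_cdf(ui)
        # F_S jumps from (i - 1)/S to i/S at u_i, so test both values.
        a = max(a, abs(i / S - F), abs((i - 1) / S - F))
    return a


def reject_gaussian(x, c):
    """Eq. (14): reject H_G if, and only if, a >= c.

    c is the critical value c(alpha, S); it is an input, not derived here.
    """
    return ks_statistic(x) >= c


# Synthetic samples (made up for illustration only): a seeded Gaussian
# sample and a two-valued sample that is clearly not Gaussian.
rng = random.Random(0)
gauss_sample = [rng.gauss(0.0, 1.0) for _ in range(500)]
two_valued = [-5.0] * 250 + [5.0] * 250
print(ks_statistic(gauss_sample), ks_statistic(two_valued))
```

Keeping the critical value outside the function avoids committing to one particular table or approximation for $c(\alpha,S)$.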

One might (correctly) argue that preserving the connection between $\epsilon_i$ and $w_i$ during the 2nd stage of the above bootstrap distorts the statistics if the data quality is inhomogeneous. In that case some low quality data with large $\epsilon_i$ and small $w_i$ will be randomly distributed among high quality data in each bootstrap sample. This problem is avoided by checking whether the $\bar{\epsilon}$ distribution is Gaussian. If $H_{\mathrm{G}}$ is rejected for $\bar{\epsilon}$ with Eq. (14), i.e. the data quality is inhomogeneous, then the modelling is rejected (see $R_{\mathrm{I}}$ in Sect. 6.3). If, on the other hand, $H_{\mathrm{G}}$ is not rejected for $\bar{\epsilon}$, preserving the connection between $\epsilon_i$ and $w_i$ in our bootstrap is justified.
