  Up: Three stage period analysis

# 4 Bootstrap

Theoretical (e.g. ) and numerical (e.g. Monte Carlo) error estimates for nonlinear models are discussed by, e.g., Press et al. (1988). We estimate the errors with the bootstrap for the regression coefficients (Efron & Tibshirani 1986). This bootstrap also yields estimates of "other" model parameters (and their errors) that would be difficult to solve analytically for , e.g. the total amplitude ($A$) and the epochs of the minima ($t_{\min,1}, t_{\min,2}, \ldots, t_{\min,K}$). The six stages of our bootstrap are:

1. Minimizing (Eq. 8) for with gives (11)

   i.e. the "empirical distribution of residuals".

2. A random sample is selected from . The number of the same entering into may vary between 0 and $n$ in this random sample with replacement. The determine a unique random sample , where the connection of $w_i$ being the weight of is preserved.

3. A random sample is obtained.

4. Minimizing (Eq. 8) for with gives one estimate for , as well as for the "other" parameters of higher order models ( ). For example, measuring the difference between the minimum and maximum of gives a numerical estimate of $A$.

5. The bootstrap returns to the 2nd stage until $S$ estimates of and of the "other" parameters have been obtained.

6. The expectation value and variance for any component are the mean and variance of its $S$ estimates in . The same applies to the $S$ estimates of the "other" model parameters.

Note that the , , and remain unchanged, while , and are changing, and determines both and .
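The six stages can be sketched in code. The following is a minimal illustration, not the authors' implementation. It assumes a weighted straight-line fit (the hypothetical helper `weighted_line_fit`) as a stand-in for the nonlinear model and the minimization of Eq. 8, and it assumes weights of the form $w_i = \sigma_i^{-2}$. The names `residual_bootstrap`, `n_boot` (playing the role of $S$) and `seed`, as well as the synthetic data, are invented here. Resampling is done on indices so that each resampled residual carries its original weight, as the 2nd stage requires.

```python
import random
import statistics


def weighted_line_fit(t, y, w):
    """Weighted least squares for y = a + b*t.

    Assumption: this linear model only stands in for the nonlinear
    model of the text, so that the sketch stays self-contained.
    """
    sw = sum(w)
    tm = sum(wi * ti for wi, ti in zip(w, t)) / sw
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    stt = sum(wi * (ti - tm) ** 2 for wi, ti in zip(w, t))
    sty = sum(wi * (ti - tm) * (yi - ym) for wi, ti, yi in zip(w, t, y))
    b = sty / stt
    return ym - b * tm, b


def residual_bootstrap(t, y, w, n_boot=200, seed=0):
    """Residual bootstrap in which each residual keeps its own weight."""
    rng = random.Random(seed)
    n = len(t)
    # Stage 1: fit the original data; keep model values and residuals.
    a, b = weighted_line_fit(t, y, w)
    g = [a + b * ti for ti in t]
    eps = [yi - gi for yi, gi in zip(y, g)]
    estimates = []
    for _ in range(n_boot):  # Stage 5: repeat stages 2-4.
        # Stage 2: draw residuals with replacement; weights follow them.
        idx = [rng.randrange(n) for _ in range(n)]
        eps_star = [eps[j] for j in idx]
        w_star = [w[j] for j in idx]
        # Stage 3: build the bootstrap data from the unchanged model values.
        y_star = [gi + e for gi, e in zip(g, eps_star)]
        # Stage 4: refit with the resampled weights.
        estimates.append(weighted_line_fit(t, y_star, w_star))
    # Stage 6: mean and (sample) variance of each coefficient.
    a_s = [e[0] for e in estimates]
    b_s = [e[1] for e in estimates]
    return {
        "fit": (a, b),
        "mean": (statistics.fmean(a_s), statistics.fmean(b_s)),
        "variance": (statistics.variance(a_s), statistics.variance(b_s)),
        "estimates": estimates,
    }


# Synthetic demonstration data (invented for this sketch).
rng = random.Random(1)
t = [float(i) for i in range(30)]
sigma = [0.2] * 30
y = [1.0 + 0.5 * ti + rng.gauss(0.0, s) for ti, s in zip(t, sigma)]
w = [1.0 / s ** 2 for s in sigma]  # assumed weight definition
result = residual_bootstrap(t, y, w)
print(result["mean"], result["variance"])
```

The square roots of the returned variances serve as error estimates for the two coefficients. Any "other" parameter could be handled in the same way by computing it from each refit inside the loop and collecting its values.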

We introduce a few notations useful in testing the "Gaussian hypothesis":

$H_G$: An arbitrary with $S$ components represents a random sample drawn from a Gaussian distribution.

These (e.g. , , , or ) are first arranged into ascending (i.e. rank) order and then transformed to , where $m_x$ and $s_x$ are the mean and standard deviation of . The cumulative distribution function is (12)

and the cumulative Gaussian distribution function is (13)

The preassigned significance level for rejecting $H_G$ determines an upper limit for the Kolmogorov-Smirnov test statistic . $H_G$ is rejected if, and only if, (14)
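The test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure. The transformation is taken to be the usual standardization (each value minus the mean, divided by the standard deviation), the statistic is the largest distance between the empirical step function of the sorted sample and the standard Gaussian cumulative distribution, and the upper limit is the asymptotic Kolmogorov approximation, which may differ from the limit used in the text. The function name `ks_gaussian_test` and the parameter `alpha` are invented here.

```python
import math
import random
import statistics


def ks_gaussian_test(values, alpha=0.05):
    """Kolmogorov-Smirnov test of a standardized sample against N(0, 1).

    Returns (d, d_crit, rejected), where the hypothesis is rejected
    if and only if d > d_crit.
    """
    s = len(values)
    m_x = statistics.fmean(values)
    s_x = statistics.stdev(values)
    # Standardize, then arrange into ascending (rank) order.
    z = sorted((v - m_x) / s_x for v in values)
    normal = statistics.NormalDist()  # mean 0, standard deviation 1
    d = 0.0
    for i, zi in enumerate(z, start=1):
        f = normal.cdf(zi)
        # The empirical step function jumps from (i-1)/s to i/s at z_i,
        # so both sides of the jump are compared with the Gaussian CDF.
        d = max(d, i / s - f, f - (i - 1) / s)
    # Assumption: asymptotic critical value c(alpha) / sqrt(s), with
    # c(alpha) = sqrt(-0.5 * ln(alpha / 2)).
    d_crit = math.sqrt(-0.5 * math.log(alpha / 2.0)) / math.sqrt(s)
    return d, d_crit, d > d_crit


# Synthetic demonstration sample (invented for this sketch).
rng = random.Random(0)
sample = [rng.gauss(5.0, 2.0) for _ in range(300)]
d, d_crit, rejected = ks_gaussian_test(sample)
print(f"D = {d:.4f}, limit = {d_crit:.4f}, reject: {rejected}")
```

Because the mean and standard deviation are estimated from the same sample, the classical critical value tends to be conservative, so a rejection obtained this way is a comparatively strong statement.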

One might (correctly) argue that preserving the connection between and $w_i$ during the 2nd stage of the above bootstrap distorts the statistics in the case of inhomogeneous data quality. In that case some low quality data with large and small $w_i$ will be randomly distributed among high quality data in each bootstrap sample. But this problem is eliminated by checking whether the distribution is Gaussian. If $H_G$ is rejected for with Eq. (14), i.e. the data quality is inhomogeneous, then the modelling is rejected (see RI in Sect. 6.3). If, on the other hand, $H_G$ is not rejected for , preserving the connection between and $w_i$ in our bootstrap is justified.

Copyright The European Southern Observatory (ESO)