
2. Basic equations

Let x_k be values of the signal obtained at times t_k, k=1...N. In "local" (or "running") approximations it is usually assumed that the data tex2html_wrap_inline1942   tex2html_wrap_inline1944 are fitted by a function tex2html_wrap_inline1946 which depends not only on the moment t, but also on the limits of the fitting interval. Examples of such fits of a test signal by various methods are shown in Fig. 1. However, the resulting function is expected to depend on one argument only. Thus it is generally chosen so that the smoothed ("computed") value tex2html_wrap_inline1950 at the moment t_0 is equal to the value of the smoothing function tex2html_wrap_inline1954 at t=t_0:
equation237
(cf. Whittaker & Robinson 1928). In this case tex2html_wrap_inline1958 remains a free parameter which determines the statistical and spectral properties of the function tex2html_wrap_inline1960 for a fixed set of data tex2html_wrap_inline1942. Here we assume that the data are renumbered according to the trial argument interval from tex2html_wrap_inline1964 to tex2html_wrap_inline1966. Obviously, such numbering depends both on the "mean argument" t_0 and on the "filter half-width" tex2html_wrap_inline1958.

Figure 1: Approximations tex2html_wrap_inline1954 of the model discrete "signal" x_k = t_k^3 + t_k^2, defined at times t_k = k, tex2html_wrap_inline1978, using the 4 tested fits for t_0 = 10 and tex2html_wrap_inline1982 ("wm", "wp") or tex2html_wrap_inline1984 ("um", "up"). This difference in tex2html_wrap_inline1958 leads to an equal number n=19 of observations with non-zero weights. The smoothed value of tex2html_wrap_inline1990 at t=t_0 corresponds to the adopted value tex2html_wrap_inline1994

In the most common case of linear fits, the function tex2html_wrap_inline1990 may be expressed as
equation247
where the coefficients tex2html_wrap_inline1998 may be determined, e.g., by minimizing a weighted sum of the squared residuals
equation254
The weights w_k are generally characteristic of the accuracy tex2html_wrap_inline2002 of the measurements x_k and are equal to tex2html_wrap_inline2006, where tex2html_wrap_inline2008 is a "unit weight error", if p_k=1 for the data used for the fit (cf. Whittaker & Robinson 1928). The parameter tex2html_wrap_inline2008 is a scale coefficient which may be set to an arbitrary positive value. It affects neither the smoothing function nor its statistical characteristics. The "additional" weights tex2html_wrap_inline2014 were used in Paper I to make the smoothing function and its first derivative continuous.

The following concrete functions were used:
equation259
("unweighted" fits), and
equation266
("weighted" fits). As the basis functions, we have used polynomials:
equation273
The minimum of the function tex2html_wrap_inline2024 for the fixed data corresponds to a system of "normal" equations:
equation277
where
equation286
Introducing a vector of values
equation293
one may write
equation300
In our notation, tex2html_wrap_inline2026 is the matrix inverse of tex2html_wrap_inline2028, and
equation308
This means that the coefficients tex2html_wrap_inline1998 and the basis functions tex2html_wrap_inline2032 have "interchanged their places": tex2html_wrap_inline1998 are now functions of t_0 and tex2html_wrap_inline1958, whereas the values of the basis functions are constant.
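The normal-equation solution described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the equations of this section: the basis functions are powers of z = (t - t_0)/dt, the accuracy weights are w_k = 1, and the additional weights are p = 1 ("unweighted") or p = (1 - z^2)^2 ("weighted") inside |z| <= 1 and zero outside. The names `extra_weight`, `solve` and `local_fit` are hypothetical, and degenerate systems are not handled.

```python
def extra_weight(z, weighted):
    """Additional weight p(z), z = (t - t0) / dt (assumed forms, zero for |z| > 1)."""
    if abs(z) > 1.0:
        return 0.0
    return (1.0 - z * z) ** 2 if weighted else 1.0


def solve(a, b):
    """Solve the square system a c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    c_out = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * c_out[j] for j in range(i + 1, n))
        c_out[i] = (aug[i][n] - s) / aug[i][i]
    return c_out


def local_fit(t, x, t0, dt, m, weighted):
    """Coefficients C_0..C_m of a degree-m polynomial in z; C_0 is the value at t0."""
    z = [(tk - t0) / dt for tk in t]
    p = [extra_weight(zk, weighted) for zk in z]
    # Matrix and right-hand side of the normal equations (accuracy weights w_k = 1).
    a = [[sum(pk * zk ** (i + j) for pk, zk in zip(p, z)) for j in range(m + 1)]
         for i in range(m + 1)]
    b = [sum(pk * xk * zk ** i for pk, xk, zk in zip(p, x, z)) for i in range(m + 1)]
    return solve(a, b)


# Test signal of Fig. 1; the range k = 1...20 and the half-widths are assumptions.
t = list(range(1, 21))
x = [tk ** 3 + tk ** 2 for tk in t]
um = local_fit(t, x, 10, 9, 0, False)[0]   # unweighted running mean
up = local_fit(t, x, 10, 9, 2, False)[0]   # unweighted running parabola
wp = local_fit(t, x, 10, 10, 2, True)[0]   # weighted running parabola
```

Under these assumptions the mean fit returns 2030, the plain average of the 19 points, while both parabolic fits return 1100 up to rounding, which is the signal value at t_0 = 10. The cubic term does not bias the parabolic fits here because the points and the weights are symmetric about t_0.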

Introducing a vector tex2html_wrap_inline2040 similar to (9), one may write
equation318
This vector is also a function of t_0, tex2html_wrap_inline1958. The k-th component of this vector may be interpreted as the calculated value tex2html_wrap_inline1994 obtained by smoothing a signal with the unit value x_k=1 and all other signal values equal to zero. For N evenly distributed observations tex2html_wrap_inline2054 (i=1...N) with a step tex2html_wrap_inline2058, the function h of 3 variables tex2html_wrap_inline2062 becomes dependent on 2 variables tex2html_wrap_inline2064 only. In this case, one may write a convolution-type expression
equation327
Here k' is the number corresponding to t_k' = t_0 in each interval of the local approximation. This equation is valid for i=k'...N-n+k'. For the "borders" (i=1...k'-1, N-n+k'+1...N) one has to recompute the vector tex2html_wrap_inline2040. In Paper I we determined values of h for the illustrative 9-point "wp" fits.
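For evenly sampled data the convolution-type expression can be sketched as follows. The sketch assumes that t_0 coincides with a grid point and that the weights are symmetric about it, so that the odd moments vanish and the parabolic fit reduces to a 2x2 system with a closed-form solution. The names `parabola_kernel` and `smooth_interior`, the weight form (1 - z^2)^2 and the choice z = j/(J+1) are illustrative assumptions, not the notation of Paper I.

```python
def parabola_kernel(J, weighted=False):
    """Coefficients h_j (j = -J..J) giving the value of a local parabola at its centre.

    Valid only for an even grid with symmetric weights (assumed here), J >= 1.
    """
    u = list(range(-J, J + 1))
    if weighted:
        # Hypothetical additional weights p = (1 - z^2)^2 with z = j / (J + 1).
        p = [(1.0 - (uj / (J + 1)) ** 2) ** 2 for uj in u]
    else:
        p = [1.0] * len(u)
    s0 = sum(p)
    s2 = sum(pj * uj ** 2 for pj, uj in zip(p, u))
    s4 = sum(pj * uj ** 4 for pj, uj in zip(p, u))
    det = s0 * s4 - s2 * s2
    # Solution of the even part of the normal equations for the constant term.
    return [pj * (s4 - s2 * uj ** 2) / det for pj, uj in zip(p, u)]


def smooth_interior(x, h):
    """Apply the kernel as a convolution; the border points are skipped."""
    J = len(h) // 2
    return [sum(hj * x[i - J + j] for j, hj in enumerate(h))
            for i in range(J, len(x) - J)]


h5 = parabola_kernel(2)                      # 5-point unweighted parabola
signal = [i ** 3 + i ** 2 for i in range(10)]
smoothed = smooth_interior(signal, h5)
```

For J = 2 and unit weights the kernel is (-3, 12, 17, 12, -3)/35, the classical 5-point least-squares parabola. Each h_j is the response of the smoothed value to a unit impulse at offset j, which is the interpretation given above. The borders need separately computed vectors and are simply omitted in this sketch.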

In this paper we prefer to express all fit functions X (coefficients, derivatives etc.) in terms of the "projective" vectors h[X,k], because this allows one to estimate the accuracy and possible correlations between the parameters. If tex2html_wrap_inline2082 and tex2html_wrap_inline2084 are deviations of the functions X and Y caused by deviations of the observations tex2html_wrap_inline2090, then
 eqnarray336
The mathematical expectation of the left-hand side of this equation may be calculated if the correlation matrix tex2html_wrap_inline2092 or its mathematical expectation tex2html_wrap_inline2094 is known.

For uncorrelated deviations tex2html_wrap_inline2096, tex2html_wrap_inline2098, where tex2html_wrap_inline2100 is a Kronecker symbol, and Eq. (14) may be rewritten as
equation349
In the particular case Y=X one obtains the variance of X: tex2html_wrap_inline2106 For the coefficients tex2html_wrap_inline1998 one may obtain the relation
 equation354
where
equation360
and
equation368
One may note that for p_k=p=const, the matrices tex2html_wrap_inline2112 tex2html_wrap_inline2114 For the unweighted fits one usually adopts p=1, thus tex2html_wrap_inline2118 tex2html_wrap_inline2120 This last result is the usual one for least squares approximations (cf. Whittaker & Robinson 1928; Anderson 1958).
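The error propagation through the "projective" vectors for uncorrelated deviations can be sketched as below. This is the general rule Cov[X, Y] = sum over k of h[X,k] h[Y,k] sigma_k^2, written with hypothetical helper names `covariance` and `variance`; it is not a transcription of Eqs. (14) to (18), and the 5-point kernel is used only as an example.

```python
def covariance(h_x, h_y, sigma):
    """Covariance of two linear functionals X = sum h_x[k] x_k and Y = sum h_y[k] x_k.

    Assumes the deviations of the observations are uncorrelated, with rms errors sigma[k].
    """
    return sum(a * b * s * s for a, b, s in zip(h_x, h_y, sigma))


def variance(h_x, sigma):
    """Particular case Y = X."""
    return covariance(h_x, h_x, sigma)


# Classical 5-point least-squares parabola coefficients, unit errors for all points.
h = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]
var_smoothed = variance(h, [1.0] * 5)
```

With equal unit errors the variance of the smoothed value equals 17/35, which is also the central coefficient of the kernel. This illustrates the simplification that occurs for unweighted least-squares fits.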

Following Paper I, one may formally separate the "true" (index "t") values of the signal tex2html_wrap_inline2122 and the deviations of the real observations from them tex2html_wrap_inline2124. The values tex2html_wrap_inline2126 are often assumed to be uncorrelated with each other and with the "true" values, and to have zero mathematical expectation and a variance tex2html_wrap_inline2128. Usually the "true" values are unknown (except for models with known "signal" and "noise"), but they may have systematic deviations from the corresponding fit, which may be characterized by a parameter tex2html_wrap_inline2130 The weighted sum (3) of the squares of the residuals tex2html_wrap_inline2132 is
equation390
where
equation400
The second summand on the right-hand side of Eq. (19) may be called tex2html_wrap_inline2134, as it corresponds to the deviations. Equation (19) allows one to estimate tex2html_wrap_inline2136, which is needed for accuracy determinations. One may note that, for constant weights (4), the expression in brackets in Eq. (19) is equal to (n-m-1) for all non-degenerate systems of basis functions tex2html_wrap_inline2140. Usually the fits are chosen so that tex2html_wrap_inline2142 is negligible as compared with tex2html_wrap_inline2144 i.e. it is assumed that the systematic deviation of the fit from the true signal is much smaller than its statistical error. The values of tex2html_wrap_inline2144 tex2html_wrap_inline2148 tex2html_wrap_inline2150 tex2html_wrap_inline2152 and the estimate tex2html_wrap_inline2154 of tex2html_wrap_inline2136 depend on t_0 and tex2html_wrap_inline2160

The variance of the smoothed value at argument t0 is
eqnarray412
Here tex2html_wrap_inline2164 is the deviation of the smoothed value tex2html_wrap_inline1994 from the true one tex2html_wrap_inline2168.

If the argument t0 coincides with tk of the k-th observation, then one may transform Eq. (23) of Paper I into
eqnarray430
For polynomial fits tex2html_wrap_inline2176 and for the weights (4) and (5) p_k=1, thus tex2html_wrap_inline2180. For "constant" weights (4), taking into account that in this case tex2html_wrap_inline2182, one may obtain an even simpler expression tex2html_wrap_inline2184. Summing the left and right sides of Eqs. (22, 19) over all observations (or only part of them), one may also estimate tex2html_wrap_inline2136, neglecting systematic deviations of the fit from the true signal as compared with the statistical error of the signal value. We denote these values by tex2html_wrap_inline2188 and tex2html_wrap_inline2190 Another characteristic value of the variance may be defined as
equation452
For a normally distributed uncorrelated signal the values tex2html_wrap_inline2192 and tex2html_wrap_inline2194 must be very close, as they characterize the same quantity, the unbiased estimate of the unit weight variance. The parameter tex2html_wrap_inline2196 is the rms deviation of the signal from the fit; its mathematical expectation depends on tex2html_wrap_inline1958 and is biased.

One may note that the general expressions for the smoothed value and its accuracy may cause problems if the number of points n_1 in the subinterval tex2html_wrap_inline2202 is not sufficient. If n_1=m+1 and all the arguments t_k are different, then one obtains a fit interpolating all the values. If the number of different arguments is smaller than m+1, the system of normal equations is degenerate and no fit of order m is available. In this case one may either decrease m (which changes the statistical and spectral properties of the fit) or not use the fit at this data point. We prefer the second way when computing tex2html_wrap_inline2214 tex2html_wrap_inline2194 and tex2html_wrap_inline2218
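The three situations distinguished above (degenerate, interpolating, and ordinary smoothing fits) can be told apart by counting the points and the distinct arguments in the subinterval. The sketch below assumes a closed subinterval |t_k - t_0| <= dt; the function name `fit_status` and its return labels are hypothetical.

```python
def fit_status(t, t0, dt, m):
    """Classify the local fit of order m at t0 from the arguments inside the subinterval."""
    inside = [tk for tk in t if abs(tk - t0) <= dt]
    n1 = len(inside)                 # number of points in the subinterval
    n_distinct = len(set(inside))    # number of different arguments among them
    if n_distinct < m + 1:
        return "degenerate"          # normal equations are singular: no fit of order m
    if n1 == m + 1:
        return "interpolating"       # the fit passes through all the values
    return "smoothing"
```

Following the preference stated above, a caller would skip the data point when the status is "degenerate" instead of lowering m.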

Computation of the smoothed values at the moments of observations is most often carried out to compute estimates of tex2html_wrap_inline2136 and to provide time series analysis of the residuals tex2html_wrap_inline2222 of the signal from the fit. Another application is to compute the fit at an arbitrary argument t_0. For this case we propose to use the following restrictions:
a) the number of the data points inside the interval must exceed m+1 (as was mentioned above);
b) the numbers of the data points with t_k<t_0 (j_1) and t_k>t_0 (j_2) must both be nonzero;
c) the number of the data points with tex2html_wrap_inline2236 (j_3) must exceed some limiting value (practically tex2html_wrap_inline2240 and tex2html_wrap_inline2242);
d) the accuracy estimate of the smoothed value tex2html_wrap_inline2244 must not exceed some limiting value, e.g. tex2html_wrap_inline2192 or a manually inserted one;
e) the value tex2html_wrap_inline1994 must lie within the interval tex2html_wrap_inline2250 where one may recommend using tex2html_wrap_inline2252 from 0 to 0.1.
These restrictions (some of which may be left unused) allow one to obtain the fit only at arguments t_0 where it is meaningful; otherwise one may formally obtain values extrapolating the data at the edge(s) of the subinterval, as well as apparent waves which are not statistically significant.
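A partial sketch of these restrictions is given below. It covers only a), b) and d); restrictions c) and e) are omitted because their exact conditions are not reproduced here. The sketch assumes that j_1 and j_2 are counted inside a closed subinterval |t_k - t_0| <= dt, and the function name `fit_allowed` and the parameters `sigma_fit` and `sigma_limit` are hypothetical.

```python
def fit_allowed(t, t0, dt, m, sigma_fit=None, sigma_limit=None):
    """Return True if the fit at t0 passes restrictions a), b) and, optionally, d)."""
    inside = [tk for tk in t if abs(tk - t0) <= dt]
    # a) the number of points inside the interval must exceed m + 1
    if len(inside) <= m + 1:
        return False
    # b) there must be data on both sides of t0 (no extrapolation at the edges)
    j1 = sum(1 for tk in inside if tk < t0)
    j2 = sum(1 for tk in inside if tk > t0)
    if j1 == 0 or j2 == 0:
        return False
    # d) optional limit on the accuracy estimate of the smoothed value
    if sigma_fit is not None and sigma_limit is not None and sigma_fit > sigma_limit:
        return False
    return True
```

For data at t_k = 1...20, a parabolic fit (m = 2) with dt = 9 is accepted at t_0 = 10 but rejected at t_0 = 0.5, where all the points lie on one side of t_0.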

These computations become simpler if the observations are evenly sampled: the coefficients h[x,k] are then the same for all intervals (except at the edges), as are the matrices tex2html_wrap_inline2258, tex2html_wrap_inline2026, tex2html_wrap_inline2262.

Generally one may introduce 2 scale factors tex2html_wrap_inline2264 tex2html_wrap_inline2266 Then tex2html_wrap_inline2268 tex2html_wrap_inline2270 tex2html_wrap_inline2272 tex2html_wrap_inline2274 tex2html_wrap_inline2276 but the parameters tex2html_wrap_inline2278 tex2html_wrap_inline2280 tex2html_wrap_inline1946 tex2html_wrap_inline2284 tex2html_wrap_inline2286 do not depend on tex2html_wrap_inline2288 and tex2html_wrap_inline2290 thus they may be set to any nonzero value. In practice one may choose tex2html_wrap_inline2292 and tex2html_wrap_inline2294 (i.e. tex2html_wrap_inline2296 for unequal weights and w_k=1 for equal weights). It is important to note that generally the parameter tex2html_wrap_inline2136 itself does not correspond to the characteristic accuracy of the observations. The accuracy of the fit is defined in a more complex way by Eq. (21).

Foster (1996c) proposed to introduce the parameter
equation483
This quantity is scale-invariant, as it does not depend on the parameters tex2html_wrap_inline2288 and tex2html_wrap_inline2304 It has the physical meaning of the variance of the parameter
equation488
which coincides with a weighted mean. Imposing the normalization
equation493
one may define the "local" ensemble variance
equation499
where n* = n_2 - n_1 + 1 is the number of the data points (from n_1 to n_2) in the trial interval tex2html_wrap_inline2312 In the previous expressions the sums from 1 to n and from n_1 to n_2 were equal, as they contained the additional weight p_k, which is equal to zero for k outside the interval [n_1,n_2]. Using the points of a local interval is correct from the statistical point of view, but an ensemble variance which varies with t_0 is inconvenient. Thus one may introduce the "global" ensemble variance, which is defined for the whole data set and does not depend on the interval tex2html_wrap_inline2202:
equation512

Foster (1996c) proposes to use tex2html_wrap_inline2330 instead of tex2html_wrap_inline2136 as it has a clear physical meaning. In other notation, this corresponds to tex2html_wrap_inline2334 From Eqs. (23) and (25) one may define the effective number of data points
equation521
With a normalization tex2html_wrap_inline2336 one obtains tex2html_wrap_inline2338 in the form given by Foster (1996c). One may note that tex2html_wrap_inline2340 for equal weights p_k=p, and tex2html_wrap_inline2344 for unequal weights.
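The behaviour described above can be illustrated with a short sketch. It assumes that the effective number of data points has the familiar form (sum of p_k)^2 / (sum of p_k^2), which reduces to 1 / (sum of p_k^2) under the normalization sum of p_k = 1. This form is consistent with the two properties just stated, but it is an assumption here, as are the name `effective_number` and the weight form (1 - z^2)^2 used in the example.

```python
def effective_number(p):
    """Effective number of data points for weights p (assumed form, scale-invariant)."""
    s1 = sum(p)
    s2 = sum(pk * pk for pk in p)
    return s1 * s1 / s2


# Equal weights: the effective number equals the number of points.
n_equal = effective_number([1.0] * 19)

# Hypothetical "weighted" window: p = (1 - z^2)^2, z = (k - 10) / 10, k = 1...19.
p_weighted = [(1.0 - ((k - 10) / 10) ** 2) ** 2 for k in range(1, 20)]
n_weighted = effective_number(p_weighted)
```

The ratio is unchanged when all weights are multiplied by a common factor, so the normalization is a matter of convenience. For the weighted window the effective number is smaller than 19, since the points near the edges contribute little.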

In our notation, these expressions are meaningful for "um" and "wm", as x_m coincides with the smoothed value at t_0. For parabolic and other non-linear fits one may redefine the effective number of data points using tex2html_wrap_inline2350 instead of p_k in Eqs. (23, 24, 27):
equation534


Copyright by the European Southern Observatory (ESO)