Let $x_k$ be the values of the signal obtained at times $t_k$, $k=1\ldots N$. In the "local" (or "running") approximations it is usually assumed that the data $x_k$ are fitted by a function $x_c(t;\ t_0, \Delta t)$, which depends not only on the moment $t$, but also on the limits of the fitting interval. Examples of such function fitting of a test signal by various methods are shown in Fig. 1. However, the resulting function is expected to depend on only one argument. Thus it is generally chosen so that the smoothing ("computed") value at the moment $t_0$ is equal to the value of the smoothing function at $t=t_0$:

$x_c(t_0) = x_c(t_0;\ t_0, \Delta t)$   (1)

(cf. Whittaker & Robinson 1928). In this case $\Delta t$ remains a free parameter which determines the statistical and spectral properties of the function $x_c(t_0)$ for a fixed set of data $x_k$. Here we assume that the data are renumbered according to the trial argument interval from $t_0-\Delta t$ to $t_0+\Delta t$. Obviously, such numbering depends both on the "mean argument" $t_0$ and on the "filter half-width" $\Delta t$.
Figure 1: Approximations $x_c(t)$ of the model discrete "signal" $x_k = t_k^3 + t_k^2$, defined at times $t_k = k$, by using the 4 tested fits for $t_0 = 10$ and two values of $\Delta t$ ("wm", "wp" and "um", "up", respectively). Such a difference in $\Delta t$ leads to an equal number $n = 19$ of observations with non-zero weights. The smoothing value of $x_c$ at $t = t_0$ corresponds to the adopted value of $\Delta t$.
In the most common case of linear fits, the function $x_c(t)$ may be expressed as

$x_c(t) = \sum_{\alpha=1}^{m+1} C_\alpha\, f_\alpha(t),$   (2)

where the coefficients $C_\alpha$ may be determined, e.g., by minimizing the weighted sum of the squares of the residuals

$\Phi = \sum_{k=1}^{n} w_k\, (x_k - x_c(t_k))^2.$   (3)

The weights $w_k$ are generally characteristic of the accuracy of the measurements $x_k$ and are equal to $w_k = p_k\,\sigma_0^2/\sigma_k^2$, where $\sigma_0$ is the "unit weight error", if $p_k = 1$ for the data used for the fit (cf. Whittaker & Robinson 1928). The parameter $\sigma_0$ is a scale coefficient which may be set to an arbitrary positive value. It does not affect the smoothing function and its statistical characteristics. The "additional" weights $p_k$ were used in Paper I to make the smoothing function and its first derivative continuous. The following concrete functions were used:

$p_k = 1$ for $|t_k - t_0| \le \Delta t$, and $p_k = 0$ otherwise   (4)

("unweighted" fits), and

$p_k = (1 - z_k^2)^2$ for $|z_k| \le 1$, and $p_k = 0$ otherwise, where $z_k = (t_k - t_0)/\Delta t$   (5)

("weighted" fits). As the basic functions, we have used the polynomials

$f_\alpha(t) = (t - t_0)^{\alpha-1}, \qquad \alpha = 1 \ldots m+1.$   (6)
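As a minimal sketch of the local fit defined by the weighted sum (3), the weights (4)–(5), and the polynomial basic functions, the procedure can be implemented as follows. This is not the authors' code; the function name, the NumPy dependency, and the defaults are assumptions for illustration:

```python
import numpy as np

def local_fit(t, x, t0, dt, m=2, weighted=True, sigma=None):
    """Local least-squares fit of order m around t0 with filter half-width dt.

    Basic functions f_alpha(t) = (t - t0)**(alpha - 1); additional weights
    p_k = 1 inside the interval ("unweighted") or (1 - z^2)^2 ("weighted").
    Returns the coefficients C_alpha; C[0] is the smoothed value x_c(t0).
    """
    t, x = np.asarray(t, float), np.asarray(x, float)
    z = (t - t0) / dt
    p = np.where(np.abs(z) <= 1.0, (1.0 - z**2)**2 if weighted else 1.0, 0.0)
    # w_k = p_k * sigma0^2 / sigma_k^2; here sigma0 = 1 and sigma_k = 1 by default
    sig2 = np.ones_like(t) if sigma is None else np.asarray(sigma, float)**2
    w = p / sig2
    # Design matrix: columns are the basic functions evaluated at t_k
    F = np.vander(t - t0, m + 1, increasing=True)
    A = F.T @ (w[:, None] * F)       # normal-equation matrix
    B = F.T @ (w * x)                # right-hand side
    return np.linalg.solve(A, B)     # coefficients C_alpha
```

For the model signal of Fig. 1 ($x_k = t_k^3 + t_k^2$) a cubic local fit reproduces the signal exactly, so `local_fit(t, x, 10.0, 5.0, m=3)[0]` returns $10^3 + 10^2 = 1100$ up to rounding.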
The minimum of the function $\Phi$ for the fixed data $x_k$ corresponds to a system of "normal" equations:

$\sum_{\beta} A_{\alpha\beta}\, C_\beta = B_\alpha,$   (7)

where

$A_{\alpha\beta} = \sum_k w_k\, f_\alpha(t_k)\, f_\beta(t_k), \qquad B_\alpha = \sum_k w_k\, f_\alpha(t_k)\, x_k.$   (8)

Introducing a vector of values

$\mathbf{x} = (x_1, \ldots, x_n)^{\rm T},$   (9)

one may write the solution of (7) as

$C_\alpha = \sum_\beta A^{-1}_{\alpha\beta}\, B_\beta.$   (10)

In our designations, $A^{-1}$ is the matrix inverse of $A$, and

$x_c(t_0) = \sum_\alpha C_\alpha(t_0, \Delta t)\, f_\alpha(t_0).$   (11)

This means that the coefficients $C_\alpha$ and the basic functions $f_\alpha$ have "interchanged their places": the $C_\alpha$ are now functions of $t_0$ and $\Delta t$, whereas the values of the basic functions are constant. Introducing a vector $\mathbf{h}[x]$ similar to (9), one may write

$x_c(t_0) = \sum_k h[x,k]\, x_k.$   (12)
This vector is also a function of $t_0$ and $\Delta t$. Each $k$-th component $h[x,k]$ of this vector may be interpreted as the dependence of the calculated value $x_c(t_0)$ smoothing the unit value $x_k = 1$, whereas all other signal values are equal to zero. For $N$ evenly distributed observations $t_i$ ($i = 1 \ldots N$) with a constant step, the function $h$ of 3 variables becomes dependent on 2 variables only. In this case, one may write a convolution-type expression

$x_{ci} = \sum_{k=1}^{n} h[x,k]\; x_{i-k'+k}.$   (13)

Here $k'$ is the number corresponding to $t_{k'} = t_0$ in each interval of the local approximation. This equation is valid for $i = k' \ldots N-n+k'$. For the "borders" ($i = 1 \ldots k'-1$ and $i = N-n+k'+1 \ldots N$) one has to redetermine the vector $\mathbf{h}$. In Paper I we determined values of $h$ for the illustrative 9-point "wp" fits.
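Using the interpretation above, the vector $h[x,k]$ for evenly spaced data can be built directly from the normal equations, and the signal smoothed by the convolution-type expression. A sketch under the same assumptions as before (NumPy; "weighted" additional weights, unit measurement errors; helper names are hypothetical):

```python
import numpy as np

def h_vector(n, dt, m=2):
    """Projective vector h[x,k] for n evenly spaced points t_k = 1..n.

    h[x,k] = w_k * sum_b Ainv[1,b] f_b(t_k), so that x_c(t0) = sum_k h[x,k] x_k,
    with t0 placed at the central point k'. Uses p_k = (1 - z^2)^2, sigma_k = 1.
    """
    t = np.arange(1.0, n + 1.0)
    t0 = t[(n - 1) // 2]                              # central point k'
    z = (t - t0) / dt
    w = np.where(np.abs(z) <= 1.0, (1.0 - z**2)**2, 0.0)
    F = np.vander(t - t0, m + 1, increasing=True)     # f_b(t_k) = (t_k - t0)^(b-1)
    A = F.T @ (w[:, None] * F)
    Ainv = np.linalg.inv(A)
    return w * (F @ Ainv[0])                          # h[x,k]

def smooth_even(x, h):
    """Convolution-type smoothing of an evenly sampled signal.

    Valid away from the borders, where the vector h would have to be
    redetermined; border values are returned as NaN here.
    """
    n = len(h)
    kp = (n - 1) // 2
    xc = np.full(len(x), np.nan)
    for i in range(kp, len(x) - (n - 1 - kp)):
        xc[i] = h @ x[i - kp : i - kp + n]
    return xc
```

Since a polynomial fit of order $m$ reproduces any polynomial signal of that order, the components of $h$ sum to 1 and a quadratic test signal is restored exactly away from the borders.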
In this paper we prefer to express all fit functions $X$ (coefficients, derivatives etc.) in terms of the "projective" vectors $h[X,k]$, because this allows one to estimate the accuracy and possible correlations between the parameters. If $\delta X$ and $\delta Y$ are the deviations of the functions $X$ and $Y$ caused by deviations $\delta x_k$ of the observations, then

$\delta X\, \delta Y = \sum_k \sum_l h[X,k]\; h[Y,l]\; \delta x_k\, \delta x_l.$   (14)

The mathematical expectation of the left side of this equation may be calculated, if the correlation matrix of the deviations $\delta x_k\,\delta x_l$ or its mathematical expectation $\mu_{kl}$ is known. For uncorrelated deviations $\delta x_k$, $\mu_{kl} = \sigma_k^2\, \delta_{kl}$, where $\delta_{kl}$ is the Kronecker symbol, and Eq. (14) may be rewritten as

$\mathbf{E}[\delta X\, \delta Y] = \sum_k h[X,k]\; h[Y,k]\; \sigma_k^2.$   (15)

In the particular case $Y = X$ one obtains the variance of $X$:

$\sigma^2[X] = \sum_k h^2[X,k]\; \sigma_k^2.$   (16)
For the coefficients $C_\alpha$ one may obtain the relation

$\mathbf{E}[\delta C_\alpha\, \delta C_\beta] = \sigma_0^2 \sum_\gamma \sum_\delta A^{-1}_{\alpha\gamma}\, \tilde A_{\gamma\delta}\, A^{-1}_{\delta\beta},$   (17)

where

$\tilde A_{\gamma\delta} = \sum_k p_k\, w_k\, f_\gamma(t_k)\, f_\delta(t_k)$   (18)

and $h[C_\alpha,k] = w_k \sum_\gamma A^{-1}_{\alpha\gamma}\, f_\gamma(t_k)$. One may note that for $p_k = p = {\rm const}$, the matrices satisfy $\tilde A = p\,A$. For the unweighted fits one usually suggests $p = 1$, thus $\mathbf{E}[\delta C_\alpha\, \delta C_\beta] = \sigma_0^2\, A^{-1}_{\alpha\beta}$. This last result is usual for least squares approximations (cf. Whittaker & Robinson 1928; Anderson 1958).
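The error propagation of Eqs. (14)–(15) reduces, for uncorrelated deviations, to summing products of projective-vector components. A minimal sketch (function names are assumptions):

```python
import numpy as np

def covariance_from_h(hX, hY, sigma):
    """E[dX dY] for uncorrelated deviations: sum_k hX_k * hY_k * sigma_k^2."""
    s2 = np.asarray(sigma, float)**2
    return np.sum(np.asarray(hX, float) * np.asarray(hY, float) * s2)

def variance_from_h(hX, sigma):
    """Variance of X, the particular case Y = X."""
    return covariance_from_h(hX, hX, sigma)
```

For instance, the mean of $n$ equal-accuracy points corresponds to $h[X,k] = 1/n$, for which this reproduces the familiar $\sigma^2/n$.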
Following Paper I, one may formally separate the "true" (index "t") values of the signal, $x_{tk}$, and the deviations of the real observations from them, $d_k = x_k - x_{tk}$. The values $d_k$ are often believed to be uncorrelated with each other and with the "true" values, to have zero mathematical expectation and a variance $\sigma_k^2$. Usually the "true" values are unknown (except in models with known "signal" and "noise"), but they may have systematic deviations from the corresponding fit, which may be characterized by the parameter $\Phi_t$. The mathematical expectation of the weighted sum (3) of the squares of the residuals is

$\mathbf{E}[\Phi] = \Phi_t + \sigma_0^2 \Big[ \sum_k p_k - \sum_\alpha \sum_\beta A^{-1}_{\alpha\beta}\, \tilde A_{\beta\alpha} \Big],$   (19)

where

$\Phi_t = \sum_k w_k\, (x_{tk} - x_c(t_k))^2.$   (20)
The second summand in the right part of Eq. (19) corresponds to the deviations. Equation (19) allows one to estimate $\sigma_0^2$, which is needed for accuracy determinations. One may note that, for constant weights (4), the expression in brackets in Eq. (19) is equal to $(n-m-1)$ for all non-degenerate systems of basic functions $f_\alpha$. Usually the fits are chosen so that $\Phi_t$ is negligible as compared with the second summand, i.e. it is suggested that the systematic deviation of the fit from the true signal is much less than its statistical error. The values of $\Phi$ and the estimate of $\sigma_0^2$ depend on $t_0$ and $\Delta t$.
The variance of the smoothed value at argument $t_0$ is

$\sigma^2[x_c(t_0)] = \mathbf{E}[(\delta x_c(t_0))^2] = \sum_k h^2[x,k]\; \sigma_k^2.$   (21)

Here $\delta x_c(t_0)$ is the deviation of the smoothed value $x_c(t_0)$ from the true one $x_{ct}(t_0)$. If the argument $t_0$ coincides with the $t_k$ of the $k$-th observation, then one may transform Eq. (23) of Paper I into

$\sigma^2[x_{ck}] = \sigma_0^2 \sum_l \frac{p_l}{w_l}\, h^2[x,l].$   (22)

For polynomial fits $f_\alpha(t_k) = \delta_{\alpha 1}$ at $t_0 = t_k$, and for the weights (4) and (5) $p_k = 1$ at this point, thus $\sigma_k^2 = \sigma_0^2/w_k$. For "constant" weights (4), taking into account that in this case $\sum_l h^2[x,l] = h[x,k']$, one may obtain the even simpler expression $\sigma^2[x_{ck}] = \sigma_k^2\, h[x,k']$.
By summing the left and right parts of Eqs. (22, 19) over all observations (or only a part of them), one may also estimate $\sigma_0^2$, neglecting the systematic deviations of the fit from the true signal as compared with the statistical error of the signal value. These values we will mark correspondingly. Another characteristic value of the variance may be defined as

$\sigma^2 = \frac{\sum_k p_k\,(x_k - x_{ck})^2}{\sum_k p_k}.$   (23)

For a normally distributed uncorrelated signal the two estimates of $\sigma_0^2$ must be very close, as they characterize the same quantity - the unbiased estimate of the unit weight variance. The parameter $\sigma$ defined by Eq. (23) is the r.m.s. deviation of the signal from the fit; its mathematical expectation depends on $\Delta t$ and is biased.
One may note that the general expressions for the smoothed value and its accuracy may cause problems, if the number of points $n_1$ in the subinterval $[t_0-\Delta t,\ t_0+\Delta t]$ is not sufficient. If $n_1 = m+1$ and all the arguments $t_k$ are different, then one obtains the fit interpolating all the values. If the number of different arguments is smaller than $m+1$, the system of normal equations is degenerate and no fit of order $m$ is available. In this case one may either decrease $m$ (which changes the statistical and spectral properties of the fit) or not use the fit at this data point. We prefer the second way when computing the smoothed values and their accuracy estimates. Computation of the smoothed values at the moments of observations is carried out most often to compute estimates of $\sigma_0^2$ and to provide time series analysis of the residuals $x_k - x_{ck}$ of the signal from the fit.
Another application is to compute the fit at an arbitrary argument $t_0$. For this case we propose to use the following restrictions: a) the number of the data points inside the interval must exceed $m+1$ (as was mentioned above); b) the numbers of the data points with $t_k < t_0$ ($j_1$) and $t_k > t_0$ ($j_2$) must both be non-zero; c) the number $j_3$ of the data points with non-negligible weight must exceed some limiting value; d) the accuracy estimate of the smoothed value must not exceed some limiting value, e.g. a default or a manually inserted one; e) the corresponding value must lie within a prescribed interval, for which one may recommend values from 0 to 0.1. These restrictions (some of them may be omitted) allow one to obtain the fit only at arguments $t_0$ where it makes sense, because otherwise one may formally obtain values extrapolating the data at the edge(s) of the subinterval, and apparent waves which are not statistically significant.
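The restrictions above can be sketched as a pre-check before evaluating the fit at a given $t_0$. The thresholds elided in the text are replaced here by illustrative, hypothetical defaults (the weight cut 0.1 and `j3_min=3` are assumptions, as is the use of the "weighted" additional weights):

```python
import numpy as np

def fit_allowed(t, t0, dt, m, j3_min=3, var=None, var_max=None):
    """Check restrictions a)-d) before computing the fit at arbitrary t0.

    a) more than m+1 data points inside [t0-dt, t0+dt];
    b) data points on both sides of t0;
    c) at least j3_min points with non-negligible weight (cut value assumed);
    d) optionally, the variance estimate of the smoothed value below var_max.
    """
    t = np.asarray(t, float)
    inside = np.abs(t - t0) <= dt
    if inside.sum() <= m + 1:                                     # a)
        return False
    if not ((t[inside] < t0).any() and (t[inside] > t0).any()):   # b)
        return False
    z = (t - t0) / dt
    p = np.where(np.abs(z) <= 1.0, (1.0 - z**2)**2, 0.0)
    if (p > 0.1).sum() < j3_min:                                  # c)
        return False
    if var is not None and var_max is not None and var > var_max:  # d)
        return False
    return True
```

A $t_0$ beyond the edge of the data fails restriction b), which is exactly the extrapolation case the restrictions are meant to exclude.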
This is simpler if the observations are evenly sampled, and the coefficients $h[x,k]$ are the same for all intervals (except the edges), as are the matrices $A$, $A^{-1}$, $\tilde A$. Generally one may introduce 2 scale factors, multiplying $\sigma_0$ and $p_k$ by arbitrary constants. Then the weights $w_k$ are rescaled, but the smoothing function and its statistical characteristics do not depend on these factors; thus they may be set to any non-zero value. Practically one may choose $\sigma_0 = 1$ and $p = 1$ (i.e. $w_k = 1/\sigma_k^2$) for unequal weights, and $w_k = 1$ for equal weights. It is important to note that generally the parameter $\sigma_0$ itself does not correspond to the characteristic accuracy of the observations. The accuracy of the fit is defined in a more complex way, by Eq. (21).
Foster (1996c) proposed to introduce the parameter

$\nu = \frac{\sum_k w_k^2}{\left(\sum_k w_k\right)^2}.$   (24)

This quantity is scale-invariant, as it does not depend on the parameters $\sigma_0$ and $p$. It has the physical sense of the (relative) variance of the parameter

$\bar x = \frac{\sum_k w_k\, x_k}{\sum_k w_k},$   (25)

which coincides with a weighted mean. Imposing the normalization

$\sum_k w_k = 1,$   (26)

one may define the "local" ensemble variance

$\sigma_x^2 = \frac{n^*}{n^*-1} \sum_{k=n_1}^{n_2} w_k\, (x_k - \bar x)^2,$   (27)

where $n^* = n_2 - n_1 + 1$ is the number of the data points (from $n_1$ to $n_2$) in the trial interval $[t_0-\Delta t,\ t_0+\Delta t]$. In the previous expressions the sums from 1 to $n$ and from $n_1$ to $n_2$ were equal, as they contained the additional weight $p_k$, which is equal to zero for $k$ outside the interval $[n_1, n_2]$.
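The weighted mean and the "local" ensemble variance can be sketched as below; the normalization of the weights and the $n^*/(n^*-1)$ unbiasing factor follow the definitions above (the function name is an assumption):

```python
import numpy as np

def local_ensemble_variance(x, w):
    """Weighted mean and "local" ensemble variance in the trial interval.

    Only points with non-zero weight (inside the interval) contribute;
    the weights are normalized to sum to 1, and the n*/(n*-1) factor
    is the unbiasing correction.
    """
    x, w = np.asarray(x, float), np.asarray(w, float)
    nz = w > 0                      # points inside the trial interval
    x, w = x[nz], w[nz]
    w = w / w.sum()                 # normalization: sum w_k = 1
    xbar = np.sum(w * x)            # weighted mean
    nstar = len(x)                  # n*, number of points in the interval
    var = nstar / (nstar - 1.0) * np.sum(w * (x - xbar)**2)
    return xbar, var
```

For equal weights this reduces to the ordinary sample mean and the unbiased sample variance.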
The use of points in a local interval is correct from the statistical point of view, but it is inconvenient to use the ensemble variance $\sigma_x^2$, which varies with $t_0$. Thus one may introduce the "global" ensemble variance, which is defined for the whole data set and does not depend on the interval $[t_0-\Delta t,\ t_0+\Delta t]$:

$\sigma_{xg}^2 = \frac{1}{N-1} \sum_{k=1}^{N} (x_k - \bar x_g)^2,$   (28)

where $\bar x_g$ is the mean of all $N$ values. Foster (1996c) proposes to use $\sigma_{xg}^2$ instead of $\sigma_0^2$, as it has a clear physical meaning. In other notation, this corresponds to setting the unit weight error equal to the global r.m.s. deviation of the signal.
From Eqs. (23) and (25) one may define the effective number of data points

$n_{\rm eff} = \frac{\left(\sum_k w_k\right)^2}{\sum_k w_k^2}.$   (29)

With the normalization $\sum_k w_k = 1$ one obtains $n_{\rm eff} = 1/\sum_k w_k^2$, in the form used by Foster (1996c). One may note that $n_{\rm eff} = n$ for equal weights $p_k = p$, and $n_{\rm eff} < n$ for unequal weights. In our notation, these expressions are meaningful for "um" and "wm", as $\bar x$ coincides with the smoothing value at $t_0$. For parabolic and other non-linear fits one may redefine the effective number of data points by using $h[x,k]$ instead of $p_k$ in Eqs. (23, 24, 27):

$n_{\rm eff} = \frac{\left(\sum_k h[x,k]\right)^2}{\sum_k h^2[x,k]}.$   (30)
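The effective number of data points is a one-line, scale-invariant statistic of the weights (or of the components of $h[x,k]$); a minimal sketch:

```python
import numpy as np

def n_eff(w):
    """Effective number of data points: (sum w)^2 / sum w^2.

    Scale-invariant: multiplying all weights by a constant does not change
    it. Equals n for n equal weights and is smaller for unequal weights.
    """
    w = np.asarray(w, float)
    return w.sum()**2 / np.sum(w**2)
```

The same function applies unchanged when the components of the projective vector $h[x,k]$ are substituted for the weights.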