According to the classical probabilistic interpretation of the least squares algorithm [Gauss, 1809], the matrices are the covariance matrices of a Gaussian distribution in the space of the fit parameters. This interpretation is based upon the assumptions that the problem is linear and that the observation errors have independent normal distributions with unit variance. As discussed in Paper I, Section 2.2, some unit has to be chosen to express the residuals as dimensionless numbers; the units should be chosen in such a way that the expected errors of the observational procedure are of the order of unity. The variance does not, however, need to be exactly 1: its value can be estimated after the least squares fit by a classical formula, and the matrices can be rescaled accordingly.
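The classical formula alluded to here is presumably the standard unbiased estimate of the variance of the unit-weight residuals; in our own shorthand (with m scalar observations, n fit parameters, and residuals \xi_i; the symbols are ours, not necessarily those of Paper I):

\[ \hat{\sigma}^2 \;=\; \frac{1}{m-n}\,\sum_{i=1}^{m}\xi_i^2\,, \qquad \Gamma \;\longmapsto\; \hat{\sigma}^2\,\Gamma\,, \]

where \Gamma denotes the covariance matrix resulting from the fit.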
The problem is that the observation errors result mostly from a combination of systematic effects, rather than from random noise that could follow a Gaussian distribution. The errors from the same observatory are typically highly correlated, and their size depends upon the observing technology, thus changing with the observatory and over the years; in any case the estimation of the rescaling factor is subject to great uncertainties. In the limiting case of only 3 observations the fit is exactly determined, the residuals vanish, and no a posteriori information on the observation errors is available at all. For orbits determined from a small number of observations, and with small residuals, some reasonable a priori weighting (e.g., with residuals measured in arc seconds) is more realistic than a Gaussian rescaling based upon an unrealistic error model.
These real difficulties in the use of the simplest form of the Gaussian theory have led some authors to give up any quantitative assessment of the uncertainty whatsoever. This does not necessarily follow from the difficulty of using one specific, and oversimplified, error model. The information contained in the normal and covariance matrices can still be used, provided we adopt a formulation independent of the Gaussian error distribution hypothesis, such as the optimization approach of Section 2. The normal matrix has an intrinsic meaning, being proportional to the Hessian matrix of the second partial derivatives of the target function, which in turn measures the overall size of the residuals. If we are prepared to decide whether a given proposed identification is accepted or rejected on the basis of the RMS of the post-fit residuals of the observations belonging to both arcs, then we are entitled to use a prediction of the value of the post-fit target function as a selection criterion.
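To make the proportionality explicit, in a standard least squares notation (our own symbols, chosen only to match the description above): if the target function is taken to be the mean square of the residuals, then

\[ Q(X) \;=\; \frac{1}{m}\,\xi(X)\cdot\xi(X)\,, \qquad \frac{\partial^2 Q}{\partial X^2} \;=\; \frac{2}{m}\left( B^T B \,+\, \xi\cdot\frac{\partial^2 \xi}{\partial X^2} \right) \;\simeq\; \frac{2}{m}\,C\,, \]

where B = \partial\xi/\partial X is the design matrix and C = B^T B is the normal matrix; the approximation neglects the term containing the second derivatives of the residuals, which is small when the residuals are small. Note that nothing in this relation requires the \xi_i to be normally distributed.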
Indeed, the final acceptance of an identification requires a finer examination of the post-fit residuals, possibly including outlier rejection, which is of course incompatible with the hypothesis that all the residuals follow a normal distribution. Whether it is possible to perform outlier rejection, both in single arc solutions and in proposed identification solutions, in an entirely automatic way, without the intervention of an experienced human, is an open question, which we plan to address in a forthcoming paper. For the purpose of proposing identifications, however, the removal of outliers is a refinement which is meaningful only after good candidate couples have been found.
Thus the algorithm for proposing identifications needs to rely only on a rough estimate of the acceptable range of values for the observation residuals. As is shown in the histograms of Section 4, a factor of 2 in the controls does not matter much, the nonlinear effects being a potential source of much larger discrepancies between the linearly proposed identification and the actual identified orbit.
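As a concrete illustration of such a rough control, the following is a minimal sketch (our own, not code from the paper; the function name, the interpretation of the predicted target function as a mean square, and the default values are all assumptions for illustration) of an acceptance test with the factor-2 slack just discussed:

import math

def accept_candidate(predicted_q, control_rms=1.0, slack=2.0):
    """Rough acceptance test for a proposed identification.

    predicted_q -- linearly predicted post-fit value of the target
        function, interpreted here as the mean square of the joint
        residuals (a hypothetical convention, for illustration only).
    control_rms -- acceptable RMS of the residuals, in units chosen
        so that the expected observation errors are of order unity.
    slack -- tolerance on the control; per the text, a factor of 2
        does not matter much, since nonlinearity dominates the
        discrepancy between the predicted and the fitted orbit.
    """
    predicted_rms = math.sqrt(predicted_q)
    return predicted_rms <= slack * control_rms

# Example: accept_candidate(1.2) returns True, since sqrt(1.2) ~ 1.10 <= 2.0.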