2.1 Differential corrections as an optimization algorithm

The principle of least squares assumes that a target function $Q$ has to be minimized to find the nominal solution. The target function $Q=\xi\cdot\xi/m$ is formed with the sum of squares of the residuals $\xi$, with $\xi \in \Re^m$. The residuals are normalized, as discussed in Section 3.3, thus $Q$ is dimensionless. In our case, $m=2\cdot N_{obs}$ for $N_{obs}$ astrometric observations of two angular coordinates each. The residuals are functions $\xi(X)$ of the estimated parameters $X\in \Re^N$. In the simplest problems of orbit determination $N=6$ and $X$ is some vector representing the orbital elements at some initial epoch $t_0$: in this paper $X=(a,h,k,p,q,\lambda)$ are the equinoctial elements as defined in Paper I, Sect. 4.1. Some of the coordinates of the vector $X$ are not real numbers but angles, defined $\bmod\ 2\pi$ (in our case the mean longitude $\lambda$), and this introduces some complications which will be noted later.
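As a minimal numerical illustration (not taken from the paper; the function and variable names are ours), the target function for a vector of normalized residuals can be computed as follows:

import numpy as np

def target_function(xi):
    # Q = xi . xi / m, dimensionless because the residuals are normalized
    m = xi.size
    return float(xi @ xi) / m

# hypothetical example: N_obs = 3 observations, hence m = 6 residual components
xi = np.array([0.2, -0.1, 0.05, 0.3, -0.25, 0.15])
Q = target_function(xi)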

Thus the target function also depends upon $X$, and the minimum of $Q(X)$ is obtained by solving the nonlinear equations:

\begin{displaymath}{\displaystyle \partial Q \over \displaystyle \partial X}={\displaystyle 2 \over \displaystyle m}\;\xi\cdot{\displaystyle \partial \xi \over \displaystyle \partial X}(X)=0\ .\end{displaymath}

Now let the map between $X$ and $\xi$ be linearized in a neighborhood of some point $X^*$:

\begin{displaymath}\xi=B\; \Delta X \, ,\qquad\Delta X = X- X^*\end{displaymath}

where the target function is approximated by a quadratic form

\begin{displaymath}Q(X)= {\displaystyle 1 \over \displaystyle m}\; \Delta X^T \; B^T B\; \Delta X \ .\end{displaymath}

The equations to be solved for the minimum are the normal equations

\begin{displaymath}B^T B\; \Delta X=-B^T\; \xi\end{displaymath}

with normal matrix $C=B^T B$ and solution

\begin{displaymath}\Delta X= -\Gamma\;B^T\;\xi\end{displaymath}

computed with the covariance matrix $\Gamma=C^{-1}$, which exists whenever $C$ is positive-definite; this is generically the case for $m\geq N$. We shall of course assume that the linearization is performed around the solution $X^*$ such that $B^T\xi=\underline0$; in the standard differential corrections procedure $X^*$ is obtained by iterating the solution of the normal equations until convergence (pseudo-Newton method). For a standard reference on differential corrections, see [Cappellari et al., 1976].
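The following Python sketch (ours, with hypothetical names; not the code of the actual programs) shows a single differential-correction step built from the design matrix $B=\partial\xi/\partial X$ and the residual vector $\xi$: the normal matrix $C=B^TB$, the covariance matrix $\Gamma=C^{-1}$ and the correction $\Delta X=-\Gamma B^T\xi$.

import numpy as np

def correction_step(B, xi):
    # normal matrix C = B^T B and right-hand side -B^T xi of the normal equations
    C = B.T @ B
    rhs = -B.T @ xi
    # covariance matrix Gamma = C^{-1}; it exists when C is positive-definite
    Gamma = np.linalg.inv(C)
    # correction Delta X = -Gamma B^T xi
    dX = Gamma @ rhs
    return dX, C, Gamma

In practice the linear system $C\,\Delta X=-B^T\xi$ would be solved by a factorization rather than by forming the inverse explicitly; the inverse appears here only because $\Gamma$ itself is needed as the covariance matrix.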

Note that applying a single iteration of differential corrections, or indeed any fixed number of iterations, is not enough to guarantee convergence; an iterative scheme with a tight convergence control needs to be used. As an example, in our programs convergence is controlled by requiring that the correction norm satisfy

 \begin{displaymath}\vert\vert\Delta X\vert\vert= \sqrt{\Delta X \cdot C\, \Delta X /N}\; < \epsilon\end{displaymath} (1)


where $\epsilon$ is a small control value; the iterative procedure is stopped only when this condition is met, and $\epsilon=10^{-5}$ is used in most cases.
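A minimal sketch of the complete pseudo-Newton iteration with the convergence control of Eq. (1) could look as follows; the function residuals_and_design, which returns $\xi$ and $B=\partial\xi/\partial X$ at the current elements $X$, is a hypothetical placeholder for the orbit propagation and observation prediction, and none of the names are those of the actual programs.

import numpy as np

def differential_corrections(X0, residuals_and_design, eps=1.0e-5, max_iter=50):
    X = np.array(X0, dtype=float)
    N = X.size
    for _ in range(max_iter):
        xi, B = residuals_and_design(X)     # residuals and design matrix at X
        C = B.T @ B                         # normal matrix
        dX = np.linalg.solve(C, -B.T @ xi)  # solve the normal equations
        X = X + dX
        # convergence control, Eq. (1): ||Delta X|| = sqrt(dX . C dX / N) < eps
        if np.sqrt(dX @ C @ dX / N) < eps:
            return X, np.linalg.inv(C)      # nominal solution and covariance matrix
    raise RuntimeError("differential corrections did not converge")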


