Use of accelerationWrtState in Field version of Holmes Featherstone model's acceleration

Hello,

I think I found two problems with the use of the “optimized” accelerationWrtState method in the Field version of HolmesFeatherstoneAttractionModel#acceleration.

  1. There is a bug in the check that is performed to decide whether to use the faster method or not.
    1. isGradientStateDerivative does not check the partials of the date of the state, which however do impact the partials of the acceleration through the rotation matrix between state frame and ITRF
    2. the gradient of mu is not checked. However, accelerationWrtState uses only its real part.
  2. The check performed on the components of the position of the state is too restrictive. Indeed, it would be possible to call accelerationWrtState with a position vector of the form [x+dx, y+dy, z+dz] as is it the case now, and then compose the output with the actual derivatives of the position.

To demonstrate (1), tests #2 and #3 fail:


  @Test
  void testHolmesFeatherstoneAttractionModelPartials() {
    testHolmesFeatherstoneAttractionModelPartials(false, false);
  }

  @Test
  void testHolmesFeatherstoneAttractionModelPartialsWrtEpoch() {
    testHolmesFeatherstoneAttractionModelPartials(true, false);
  }

  @Test
  void testHolmesFeatherstoneAttractionModelPartialsWrtMu() {
    testHolmesFeatherstoneAttractionModelPartials(false, true);
  }

  void testHolmesFeatherstoneAttractionModelPartials(final boolean wrtDate, final boolean wrtMu) {

    final File orekitData = new File(System.getProperty("user.home") + "/Resources/orekit-data");
    final DataProvider dirCrawler = new DirectoryCrawler(orekitData);
    DataContext.getDefault().getDataProvidersManager().addProvider(dirCrawler);

    final HolmesFeatherstoneAttractionModel model = new HolmesFeatherstoneAttractionModel(
      FramesFactory.getITRF(IERSConventions.IERS_2010, false),
      GravityFieldFactory.getNormalizedProvider(8, 8)
    );

    final int prs = 8;
    final DSFactory factory = new DSFactory(prs, 1);

    final double mu = Constants.EGM96_EARTH_MU;
    final Gradient gMu = wrtMu ? Gradient.variable(prs, 7, 0.0).multiply(1e6).add(mu) : Gradient.constant(prs, mu);
    final DerivativeStructure dsMu = wrtMu ? factory.variable(7, 0.0).multiply(1e6).add(mu) : factory.constant(mu);

    final double dt = 0.0;
    final Gradient gDt = wrtDate ? Gradient.variable(prs, 6, dt) : Gradient.constant(prs, dt);
    final DerivativeStructure dsDt = wrtDate ? factory.variable(6, dt) : factory.constant(dt);

    final Frame frame = FramesFactory.getGCRF();
    final AbsoluteDate date = AbsoluteDate.ARBITRARY_EPOCH;
    final FieldAbsoluteDate<Gradient> gDate = new FieldAbsoluteDate<>(GradientField.getField(prs), date).shiftedBy(gDt);
    final FieldAbsoluteDate<DerivativeStructure> dsDate = new FieldAbsoluteDate<>(factory.getDerivativeField(), date).shiftedBy(dsDt);

    final Orbit orbit = new KeplerianOrbit(7e6, 0.2, 0.7, -1.0, 0.5, 0.7, PositionAngleType.TRUE, frame, date, mu);
    final PVCoordinates pv = orbit.getPVCoordinates();

    final FieldPVCoordinates<Gradient> gPv = new FieldPVCoordinates<>(
      new FieldVector3D<>(
        Gradient.variable(prs, 0, pv.getPosition().getX()),
        Gradient.variable(prs, 1, pv.getPosition().getY()),
        Gradient.variable(prs, 2, pv.getPosition().getZ())
      ),
      new FieldVector3D<>(
        Gradient.variable(prs, 3, pv.getVelocity().getX()),
        Gradient.variable(prs, 4, pv.getVelocity().getY()),
        Gradient.variable(prs, 5, pv.getVelocity().getZ())
      )
    );

    final FieldSpacecraftState<Gradient> gState = new FieldSpacecraftState<>(new FieldCartesianOrbit<>(gPv, frame, gDate, gMu));
    final FieldVector3D<Gradient> gAcc = model.acceleration(gState, new Gradient[]{gMu});

    final FieldPVCoordinates<DerivativeStructure> dsPv = new FieldPVCoordinates<>(
      new FieldVector3D<>(
        factory.variable(0, pv.getPosition().getX()),
        factory.variable(1, pv.getPosition().getY()),
        factory.variable(2, pv.getPosition().getZ())
      ),
      new FieldVector3D<>(
        factory.variable(3, pv.getVelocity().getX()),
        factory.variable(4, pv.getVelocity().getY()),
        factory.variable(5, pv.getVelocity().getZ())
      )
    );

    final FieldSpacecraftState<DerivativeStructure> dsState = new FieldSpacecraftState<>(new FieldCartesianOrbit<>(dsPv, frame, dsDate, dsMu));
    final FieldVector3D<DerivativeStructure> dsAcc = model.acceleration(dsState, new DerivativeStructure[]{dsMu});

    Assertions.assertArrayEquals(dsAcc.toVector3D().toArray(), gAcc.toVector3D().toArray(), Precision.EPSILON);
    Assertions.assertArrayEquals(Arrays.copyOfRange(dsAcc.getX().getAllDerivatives(), 1, prs + 1), gAcc.getX().getGradient(), Precision.EPSILON);
    Assertions.assertArrayEquals(Arrays.copyOfRange(dsAcc.getY().getAllDerivatives(), 1, prs + 1), gAcc.getY().getGradient(), Precision.EPSILON);
    Assertions.assertArrayEquals(Arrays.copyOfRange(dsAcc.getZ().getAllDerivatives(), 1, prs + 1), gAcc.getZ().getGradient(), Precision.EPSILON);
  }

while for (2), something like this works:

    public <T extends CalculusFieldElement<T>> FieldVector3D<T> acceleration2(final FieldSpacecraftState<T> s, final T[] parameters) {

        final T mu = parameters[0];

        // check for faster computation dedicated to derivatives with respect to state
        if (isGradientStateDerivative2(s) && ((Gradient) mu).getAddendum().isZero()) {
            @SuppressWarnings("unchecked")
            final FieldVector3D<Gradient> p = (FieldVector3D<Gradient>) s.getPosition();
            final FieldVector3D<Gradient> redPos = new FieldVector3D<>(
                    Gradient.variable(3, 0, p.getX().getValue()),
                    Gradient.variable(3, 0, p.getY().getValue()),
                    Gradient.variable(3, 0, p.getZ().getValue())
            );
            final FieldVector3D<Gradient> redAcc = accelerationWrtState(s.getDate().toAbsoluteDate(), s.getFrame(), redPos, Gradient.constant(3, mu.getReal()));
            final RealMatrix jacAcc = new Array2DRowRealMatrix(new double[][]{
                    redAcc.getX().getGradient(),
                    redAcc.getY().getGradient(),
                    redAcc.getZ().getGradient()
            });
            final RealMatrix jacPos = new Array2DRowRealMatrix(new double[][]{
                    p.getX().getGradient(),
                    p.getY().getGradient(),
                    p.getZ().getGradient()
            });
            final RealMatrix jac = jacAcc.multiply(jacPos);
            @SuppressWarnings("unchecked")
            final FieldVector3D<T> a = new FieldVector3D<>(
                    (T) new Gradient(redAcc.getX().getValue(), jac.getRow(0)),
                    (T) new Gradient(redAcc.getY().getValue(), jac.getRow(1)),
                    (T) new Gradient(redAcc.getZ().getValue(), jac.getRow(2))
            );
            return a;
        }

        // get the position in body frame
        final FieldAbsoluteDate<T> date             = s.getDate();
        final FieldStaticTransform<T> fromBodyFrame = bodyFrame.getStaticTransformTo(s.getFrame(), date);
        final FieldStaticTransform<T> toBodyFrame   = fromBodyFrame.getInverse();
        final FieldVector3D<T> position             = toBodyFrame.transformPosition(s.getPosition());

        // gradient of the non-central part of the gravity field
        return fromBodyFrame.transformVector(new FieldVector3D<>(gradient(date, position, mu)));

    }

    private <T extends CalculusFieldElement<T>> boolean isGradientStateDerivative2(final FieldSpacecraftState<T> state) {
        return (state.getPosition().getX() instanceof Gradient) && state.getDate().getDate().hasZeroField();
    }

Cheers,
Alberto

Hi Alberto,

so first of all I think it’s worth mentionning that the “optimized” (for performance) piece of code is implicitly for Orekit STM computation system, done in this particular case with a Gradient whose first three independent variables are the variations in position from the state. I actually recently removed on develop some lines that were still using DerivativeStructure at order 1, which has been obsolete for a while.

Anyway, you’re definitely right about the date in (1). There should be a check to see if it is actually a constant (it is already done with StaticTransform for example).

About MU, I’m a bit surprised that only getReal/getValue is used, because I believe that the model parameters derivatives are evaluated via the same call to acceleration… So this might be a bug in itself (needs confirmation). Edit: Ok so upon verification the check should make sure that MU is exactly an independent variable as well.

As for (2), I’m not sure I see what you mean. Could you maybe rephrase?

Cheers,
Romain.

Hi Romain,

What I mean by (2) is the following: consider a FieldSpacecraftState<Gradient> for which the components of the position vector are Gradient with all derivatives potentially different from zero (and the i-th derivative also from one).

When calling acceleration with such a state, the “optimized” implementation is not used because, for example, getX().getGradient() is different from {1.0, 0.0, 0.0, …}.

However, it could still be used if the implementation is modified along the lines of the piece of code I shared above. What it does is similar to what (I assume) the mechanics for the STM computation with a numerical propagator do:

  1. compute the partials of the gradient w.r.t. the current position by initializing a new Gradient for each component of the position vector to be an independent variable
  2. compute the partials of the gradient w.r.t. the position vector retrieved from the spacecraft state by multiplying the two Jacobian matrices

This is what you would do to propagate the STM using the variational equations dotPhi = A*Phi

The use case would be to call this optimized implementation when propagating a FieldSpacecraftState<Gradient> using a FieldNumericalPropagator (assuming the date and mu are “constants”).

Regarding the check on mu, I am not sure that being an independent variable is sufficient. gradientHessian, which is called inside accelerationWrtState, takes mu as double (i.e. its constant part).

Cheers,
Alberto

Yes, but I think that gradientHessian assumes that mu has gradient [0, 0, …, 0, 1], so partial are computed analytically.

Ok so (2) is about generalizing the hack that is already done, that’s what I thought but I wasn’t sure, thanks for the clarification. To be honest, I’m not super keen as the code is already a bit ugly. If you insist, at the very least, if a modification is made to cover more cases, we need to make sure that the performance for the internal case with the STM does not suffer from it.

Cheers,
Romain.

I think you are right, it must be an independent variable at index 3.

I am not familiar with the way variational equations are set up for the computation of the STM, so I might be totally wrong, but wouldn’t be better to “reverse” the hack? What I mean is:

The end goal is to build the matrix A such that the time derivatives of the STM are computed as dotPhi = A*Phi.

Right now, this is done by mapping the identity matrix into a FieldVector3D for the spacecraft state position, evaluating the acceleration as FieldVector3D, and then mapping the Jacobian of the acceleration back into A. This makes sense if the Jacobian is actually computed using AD, but not much if there is an analytical implementation available, as the latter needs to map the partials back into a matrix (as it happens in accelerationWrtState.

So, wouldn’t be more efficient to have methods in ForceModel that return the Jacobian as a matrix that can be directly added to A, and implement a “reverse hack” that uses Gradient for those forces that need actual AD? The Gradient version of the state could still be instantiated only once per evaluation of the ODEs.

Last comment, I started this whole investigation because I found out that when propagating the STM using a matrix harvester the gradient of the gravity potential is evaluated twice at each step: one for the “main” equations, and one for the variational (“secondary”) equations as part of accelerationWrtState.

Now, the latter has basically the same runtime as the regular gradient method without partials. So, I would think that by just taking the value of the gradient for the main ODEs from that computed from the secondary ODEs, it would be possible to almost halve the overall runtime required to account for this force during numerical propagation.

Alberto

From what I’ve been able to gather, historically before AD was available, the variational equations used to be computed analytically in Orekit (and are still for comparison in the tests). I think the rationale was ease of maintenance (despite the non-friendly syntax) compared to very long equations. I’m assuming that the hack in the geopotential was introduced to keep similar performance to the old system. As fas as I am aware, only HolmesFeather[…] and DragForce use tricks like this, all the other forces use pure AD for the variational equations.

An important remark on the main/primary and secondary equations in adaptive-step integrators: the latter are only evaluated when a step has been validated according to the error control. So in general, the secondary equations are not called as many times as the main ones. But generally they are much slower to evaluate because they involve Fields. Edit: upon reflexion I’m not sure, maybe they are evaluated as many times, but still don’t weight in for the step size

Anyway, currently, depending on the forces, STM’s can also have a bottleneck where there is a lot of model parameters, due to manipulating dictionnaries and querying TimeSpanMap. So yeah there is room for improvement I think, thanks for being proactive on this topic!

In the short run, I suggest you open an issue for the wrong checks of (1), which could be fixed with the next patch. And if you wish, create a separate one to somewhat extend the hack in Holmes[…].

Cheers,
Romain.

I think they are evaluated as many times. I tried using a fixed step size integrator with only non-spherical gravity potential 48 by 48, and Moon/Sun as third bodies. No estimated parameters whatsoever.
The runtime spent for the evaluation of the main ODEs is roughly the same as that spent for the evaluation of the secondary ODEs. And both are dominated by the evaluation of the gravity gradient (as expected).

If by “still don’t weight in for the step size” you mean that the error in the integration of the secondary eqs. is not accounted for in the step size estimation, yes you are correct. This is akin to the field integrator, for which the adaptive step is computed based on the error on the real part only.

I will do this. I’ll think a bit more before tackling point (2).

FYI I started a discussion about options to speed up the STM a while ago. The slower forces to include are definitely drag and high-order geopotential. I just implemented the dedicated atmospheric model for drag partials. For the geopotential, another idea (beyond the different order for the derivatives) that I’ve seen in the literature would be an adaptive order based on the position.

First issue opened Checks on date and mu before calling accelerationWrtState in HolmesFeatherstoneAttractionModel (#1866) · Issues · Orekit / Orekit · GitLab