Orekit Caching Feature Behaviour

Dear Orekit Team,

I would like to ask you a few questions about the caching/interpolation feature used internally in Orekit during frame transformations. In particular, I am referring to the chain that is triggered when querying, at a given date, the transform connecting the GCRF to the ITRF based on the IERS conventions.

Here is a small example to support the explanation:

import java.io.File;

import org.orekit.bodies.CelestialBodyFactory;
import org.orekit.data.DataProvidersManager;
import org.orekit.data.DirectoryCrawler;
import org.orekit.errors.OrekitException;
import org.orekit.frames.Frame;
import org.orekit.frames.FramesFactory;
import org.orekit.orbits.KeplerianOrbit;
import org.orekit.orbits.PositionAngle;
import org.orekit.time.AbsoluteDate;
import org.orekit.time.TimeScalesFactory;
import org.orekit.utils.IERSConventions;

public class TestTransformBehaviour {

    public static void main(String[] args) throws OrekitException {

        // load orekit-data from the directory given on the command line
        String orekitDataPath = args[0];
        File orekitData = new File(orekitDataPath);
        DataProvidersManager manager = DataProvidersManager.getInstance();
        manager.addProvider(new DirectoryCrawler(orekitData));

        AbsoluteDate date1 = new AbsoluteDate(2015, 1, 1, 0, 0, 0, TimeScalesFactory.getTAI());
        AbsoluteDate date2 = new AbsoluteDate(2015, 1, 1, 0, 1, 0, TimeScalesFactory.getTAI());

        Frame gcrf = FramesFactory.getGCRF();
        Frame itrf = FramesFactory.getITRF(IERSConventions.IERS_2010, false);

        KeplerianOrbit myOrbit = new KeplerianOrbit(54354, 0, 0, 0, 0, 0, PositionAngle.TRUE,
                                                    gcrf, date1, CelestialBodyFactory.getEarth().getGM());

        System.out.println("Z component at date1: " + myOrbit.getPVCoordinates(date1, itrf).getPosition().getZ());

        System.out.println("Z component at date2: " + myOrbit.getPVCoordinates(date2, itrf).getPosition().getZ());

    }

}

The Orekit version used is v10.0, together with the latest orekit-data that I could find on your website.

In two different runs, I output the Z position vector component [m] in ITRF using two query orders:

  1. First Date1 and then Date2 in the first run
  2. First Date2 and then Date1 in the second run

So basically the order of the sysouts was exchanged between the 2 runs.

The result is the following:
Run 1: z at date1: 79.64028519559314; z at date2: 75.72821368799973
Run 2: z at date1: 79.6402851954891; z at date2: 75.72821368794723

As you can see, they are slightly different. By inspecting the code we have seen that the Transforms provided in the 2 runs are not the same: they have a different orientation.
We inspected the cache of the ShiftingTransformProvider after the 2 runs; you can find the content in the attached file. ShiftingTransformProviderCaches.xlsx (11.1 KB)

We understood that the reason for this difference is that the algorithm sets, as the reference date of the ShiftingTransformProvider cache, the first date queried when the first slot is created. Depending on the subsequently queried dates, the slot may then be extended or not, or a new one may be created.
This means that order matters: querying a date in the future first or a date in the past first yields different results.
I am really interested in this topic because, in the application we are developing, we execute multiple processes in parallel whose queries may occur at different times, not always sorted chronologically. Depending on the load of the machine, one process may run before or after another, producing different results because the cache is populated differently.

Coming to the questions:

  1. Is this the intended behavior of the algorithm, i.e. that depending on how the cache is populated you can obtain different results?

  2. Could it be that, due to the caching, Orekit is intended to be used sequentially rather than in parallel, to avoid the cache being populated differently between 2 runs?

  3. Maybe this has already been thought through and accounted for via the maximum errors that you check in the unit tests of ShiftingTransformProvider and InterpolatingTransformProvider? If that is the case, can we consider those as the maximum errors due to the caching/interpolation feature? If not, how can we compute it?

  4. Does Orekit have an internal way to avoid the caching? Or would this require forking?

  5. I understand that the caching is of course in place to optimize performance; is it possible to find some benchmarks about it? I have seen that here you mention an increase in efficiency, but only for the tidal effects; is this also valid for the other cached entries? And on what scale is the gain? Seconds or minutes?

Hopefully I was clear enough and am not asking something that has already been discussed.

Thank you very much for your support and your time.

Kind Regards,
Leonardo

PS: While debugging the Orekit code I have noticed that in the method of the GenericTimeStampedCache:

private int entryIndex(final AbsoluteDate date, final long dateQuantum)

the date is given as an input but is never used locally; could it be a leftover from a refactoring?

Hi Leonardo,

Welcome to Orekit! Those are some good questions, and well thought through.

Yes. The best accuracy you can obtain with the GCRF to ITRF transformation is a few nrad (a couple of mm at Earth’s surface).[1] Anything beyond that is extra precision without accuracy. Your results seem to be within that tolerance. Depending on digits that have no meaning is probably not correct, but if you have a use case for it I would be interested to know about it. It would be possible to fix the time grid on which the interpolation samples are computed, which would provide more repeatable results, though not any more accurate ones.

Orekit 10.1 introduced the concept of a DataContext that provides a more powerful way to manage frames and time scales. It could be used to create a separate context, and hence cache, for each thread or operation.
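A minimal sketch of that idea, assuming Orekit 10.1+ on the classpath (the orekit-data path is a placeholder):

```java
import java.io.File;

import org.orekit.data.DirectoryCrawler;
import org.orekit.data.LazyLoadedDataContext;
import org.orekit.frames.Frame;
import org.orekit.utils.IERSConventions;

public class PerThreadContext {
    public static void main(String[] args) {
        // Each LazyLoadedDataContext owns its own frames, time scales,
        // and therefore its own transform caches.
        LazyLoadedDataContext context = new LazyLoadedDataContext();
        context.getDataProvidersManager()
               .addProvider(new DirectoryCrawler(new File("/path/to/orekit-data")));

        // Frames created from this context do not share caches with the
        // default context or with contexts used by other threads/operations.
        Frame gcrf = context.getFrames().getGCRF();
        Frame itrf = context.getFrames().getITRF(IERSConventions.IERS_2010, false);
        // ... run each process entirely against its own context ...
    }
}
```

Giving each parallel process its own context would make each one reproducible on its own, since no cache is shared between them.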

[1] https://datacenter.iers.org/data/latestVersion/207_BULLETIN_B207.txt

No. Since Orekit 5 or so it has been able to use multiple threads. All frame transformations should still be within the accuracy limits whether they are computed sequentially or in parallel.

Yes, those are probably good limits. Luc also produced a plot of differences over a few years when he implemented the feature, as well as measured the performance. Not sure where that ended up in the transition to this forum. Found it at [2]. Still can’t find that plot though…

Computing your own comparison is a bit trickier. You should be able to do it by creating your own frames using CIRFProvider, TIRFProvider, and ITRFProvider. Only the first one appears to be cached, though, so perhaps you can get the answer you want by just comparing the CIRFProvider. Maybe you could make a new plot for us?

[2] https://www.orekit.org/mailing-list-archives/orekit-developers/msg00378.html

Yes, but it is not as easy. You would have to create your own frames using the transform providers mentioned above. Please create a feature request on GitLab if you would like this feature. Contributions welcome!

It is an order of magnitude faster, or more. All professional astrodynamics packages I’ve used cache and/or interpolate nutation for this reason. IIRC Luc collected some performance data when he implemented the caching, but I don’t know of any performance data for this produced with the current version of Orekit. Perhaps you could make another nice plot for us? :slight_smile:

I have been thinking for a while about creating a suite of JMH benchmarks for Orekit that get run on our CI server so we can track performance across releases. Perhaps you would like to help? But that is a topic for another thread.
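For reference, such a benchmark could look roughly like the hypothetical sketch below (it assumes JMH and Orekit on the classpath, and orekit-data already registered with the DataProvidersManager):

```java
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.orekit.frames.Frame;
import org.orekit.frames.FramesFactory;
import org.orekit.frames.Transform;
import org.orekit.time.AbsoluteDate;
import org.orekit.time.TimeScalesFactory;
import org.orekit.utils.IERSConventions;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class TransformBenchmark {

    private Frame gcrf;
    private Frame itrf;
    private AbsoluteDate date;

    @Setup
    public void setUp() {
        gcrf = FramesFactory.getGCRF();
        itrf = FramesFactory.getITRF(IERSConventions.IERS_2010, false);
        date = new AbsoluteDate(2015, 1, 1, TimeScalesFactory.getTAI());
    }

    @Benchmark
    public Transform cachedTransform() {
        // after the first call, subsequent dates hit the transform caches
        date = date.shiftedBy(60.0);
        return gcrf.getTransformTo(itrf, date);
    }
}
```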

There is indeed a way to remove all interpolation, which is used in tests. Look at the static method FramesFactory.getNonInterpolatingTransform. It is used for example in the test ITRFEquinoxProviderTest.testNROvsEquinoxNoEOP2010 to check that ITRF frames computed using the equinox paradigm and the Non-Rotating Origin paradigm are within 1.7 microarcseconds of each other. It (or similar code) was used when tuning the interpolation parameters. The idea was that interpolation should introduce errors that are orders of magnitude smaller than the accuracy of the models themselves. In other words, interpolation does not degrade accuracy; it just slightly shifts the frames within the models’ own accuracy.
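As a sketch, the cached and non-interpolating paths could be compared along these lines (assuming orekit-data is already loaded; the reported angle is illustrative, not a guaranteed value):

```java
import org.hipparchus.geometry.euclidean.threed.Rotation;
import org.orekit.frames.Frame;
import org.orekit.frames.FramesFactory;
import org.orekit.frames.Transform;
import org.orekit.time.AbsoluteDate;
import org.orekit.time.TimeScalesFactory;
import org.orekit.utils.IERSConventions;

public class CacheVsDirect {
    public static void main(String[] args) {
        Frame gcrf = FramesFactory.getGCRF();
        Frame itrf = FramesFactory.getITRF(IERSConventions.IERS_2010, false);
        AbsoluteDate date = new AbsoluteDate(2015, 1, 1, TimeScalesFactory.getTAI());

        // regular path: shifting/interpolating providers with their caches
        Transform cached = gcrf.getTransformTo(itrf, date);

        // test-oriented path: bypasses all interpolation and caching
        Transform direct = FramesFactory.getNonInterpolatingTransform(gcrf, itrf, date);

        // angular distance between the two orientations, in radians
        double angle = Rotation.distance(cached.getRotation(), direct.getRotation());
        System.out.println("orientation difference: " + angle + " rad");
    }
}
```

Sweeping the date over a few years and plotting the angle would reproduce the kind of comparison plot mentioned above.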

1 Like

Hi guys,

many thanks for your answers, and thanks for the welcome! I have to say that I am not new to Orekit, as I have been using it for 2 years now, and I also participated in the last Orekit day (btw thanks for organizing it) :slight_smile:

If I am correct, this would mean implementing our own FramesFactory in which the getCIRF method would not rely on the caching/interpolation feature, but rather on the non-interpolating provider method mentioned by @luc.

Thanks for the info about the DataContext; I was not aware of this feature.

Thanks for the info! I had a look at the getNonInterpolatingTransform method; if my understanding is correct, at the end:

  • The real source of the accuracy error is at the level of the EOP, where the corrections are cached, but it would be present anyway, even without caching/interpolation, in the Transform connecting the GCRF to the CIRF.

  • Via the caching/interpolation used at the level of the CIRF, we sacrifice repeatability of the results for performance reasons.

  • The additional errors due to the interpolation and shifting performed in the Interpolating and Shifting providers are within the accuracy of the model, so they do not degrade the results.

  • The only way to completely remove all the caching is to use the getNonInterpolatingTransform method, which also avoids caching the EOP corrections.

  • Via the DataContext we can achieve repeatability, but we still have the accuracy error introduced by caching the corrections in the EOP.

Is it correct? Or am I still missing something else?

I was just surprised not to be able to obtain the same results between two different runs. But if this is the intended behavior, and the difference between two runs is within the accuracy error, we can either decide to accept it or make use of the DataContext as you have suggested.

It seems to be an interesting topic; I would like to know more about it :slight_smile:

Thanks again

Kind Regards,
Leonardo Andreasi

I created a new topic for this: Performance Measurement with JMH

1 Like

Thank you for the info!