clearFactories() method from test class Utils

yannick · September 12, 2018, 12:11pm

Hello,

I am searching for a way to reload model data into Orekit without having to restart my application. At the moment I am mostly concerned with UTC-TAI history, but other files might come up later.

As I mentioned on the old mailing lists about a year ago, this is not as easy as it sounds because the model data is cached inside the factories, so simply registering a new data file is not enough. I recently discovered the method clearFactories() from the test class Utils, which appears to do exactly what I want. However it is part of Orekit’s unit tests, which means it is not accessible for users of the library.

I have a few questions and concerns about using this method in “production” code, maybe someone here could give me a few pointers ?

If I have already requested a timescale from the factory and assigned it to a local variable, will the timescale be updated with the new data even if I do not specifically go through the factory again?
What would happen if this method were called in a multithreaded program? I think the case might have already come up if Orekit unit tests are run in parallel. Do all threads start using the new data, possibly leading to inconsistent computations (because of an unexpected data change in the middle of a computation)? Or do the threads complete their computations with the previous data, with the new data applying only to new computations?

If I decide to follow this route, I will perform more tests to ensure the behavior corresponds to my needs. But this is not so easy to test, so I thought it could be wise to ask first, just to see if this has a chance to work.

Thanks in advance for your help
Yannick

luc · September 12, 2018, 1:22pm

For now, Orekit is clearly designed in a way that does not support this use case. However, it did come up in some discussions (perhaps the one you mentioned), along with the idea of a data context.

The clearFactories method is a hack (and an ugly one) intended for test purposes only. It relies on the Java language reflection API to overwrite private final data. This is not something to be used lightly; it will break code.
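
To illustrate the kind of reflection access involved, here is a minimal sketch. It is not the actual code of the test class Utils: CachingFactory and ReflectiveReset are hypothetical names, and the sketch only empties a private static map instead of reassigning a final field.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;

// Hypothetical factory with a private cache; a stand-in, not an Orekit class.
final class CachingFactory {
    private static final Map<String, String> CACHE = new HashMap<>();

    static String get(String key) {
        // Load on first request, then serve from the cache.
        return CACHE.computeIfAbsent(key, k -> "loaded:" + k);
    }

    static int cacheSize() {
        return CACHE.size();
    }
}

// Hypothetical helper that reaches into another class through reflection.
final class ReflectiveReset {
    // Empties every static Map field declared by the target class, private or not.
    static void clearStaticMaps(Class<?> target) {
        try {
            for (Field field : target.getDeclaredFields()) {
                if (Modifier.isStatic(field.getModifiers())
                        && Map.class.isAssignableFrom(field.getType())) {
                    field.setAccessible(true); // bypass the 'private' modifier
                    ((Map<?, ?>) field.get(null)).clear();
                }
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}

final class ReflectiveResetDemo {
    public static void main(String[] args) {
        CachingFactory.get("utc");
        ReflectiveReset.clearStaticMaps(CachingFactory.class);
        System.out.println(CachingFactory.cacheSize()); // 0
    }
}
```

Nothing in CachingFactory is aware that its cache was emptied from outside, which is why this approach is fragile.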

If I have already requested a timescale from the factory and assigned it to a local variable, will the timescale be updated with the new data even if I do not specifically go through the factory again?

It will reuse old data. If you look at the private UTCScale or UT1Scale classes, you will see their private constructors are called with the timescale data (leap seconds for UTC, EOP history for UT1) and then they store this data. In most cases, Orekit tries to use immutable classes, so this rationale is true almost everywhere.
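
A small sketch of this behaviour, using hypothetical stand-in classes (SketchScale and SketchFactory are not Orekit classes, and the offsets are made-up sample values): the instance obtained before the reload keeps the data it was built with.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable time scale: it copies its data at construction time.
final class SketchScale {
    private final List<Integer> offsets;

    SketchScale(List<Integer> offsets) {
        // Defensive copy, so later changes to the source list are invisible here.
        this.offsets = Collections.unmodifiableList(new ArrayList<>(offsets));
    }

    int latestOffset() {
        return offsets.get(offsets.size() - 1);
    }
}

// Hypothetical caching factory with a reset entry point.
final class SketchFactory {
    private static List<Integer> data = Arrays.asList(35, 36);
    private static SketchScale cached;

    static synchronized SketchScale getScale() {
        if (cached == null) {
            cached = new SketchScale(data);
        }
        return cached;
    }

    // Plays the role of a reset: swap the data and drop the cached instance.
    static synchronized void reload(List<Integer> newData) {
        data = newData;
        cached = null;
    }
}

final class StaleInstanceDemo {
    public static void main(String[] args) {
        SketchScale before = SketchFactory.getScale();
        SketchFactory.reload(Arrays.asList(35, 36, 37));
        SketchScale after = SketchFactory.getScale();
        System.out.println(before.latestOffset()); // 36, the old data
        System.out.println(after.latestOffset());  // 37, the new data
    }
}
```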

what would happen if this method were to be called in a multi-thread program ?

As long as immutable classes have already been built in each thread, everything will be fine. However, another problem is that some data is loaded using lazy evaluation, that is, it is loaded only when needed. So if you configure data loading in thread 1, then launch thread 2 and reconfigure the factories (by using the ugly hack, or using a reset function if we decide to add one after this discussion), and after this reconfiguration thread 1 needs to lazily load, it will use the current factory configuration, which was set up for the other thread! So using the same factory that is reset to hold some kind of context would probably not work. One thing that may work would be to have factories use thread-local caches; then resetting a factory in one thread would not override the configuration that may be needed again by another thread.
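
The thread-local idea could look roughly like the sketch below. ThreadLocalFactory and its methods are hypothetical names for illustration, and the string values stand in for real model data; this is not an existing Orekit API.

```java
// Hypothetical factory whose configuration and lazily built value are per-thread.
final class ThreadLocalFactory {
    private static final ThreadLocal<String> CONFIG =
            ThreadLocal.withInitial(() -> "default-data");
    private static final ThreadLocal<String> CACHE = new ThreadLocal<>();

    // Reconfigures the calling thread only and drops its cached value.
    static void configure(String dataSet) {
        CONFIG.set(dataSet);
        CACHE.remove();
    }

    // Lazy load: the value is built on first use, from this thread's configuration.
    static String get() {
        String value = CACHE.get();
        if (value == null) {
            value = "loaded:" + CONFIG.get();
            CACHE.set(value);
        }
        return value;
    }
}

final class ThreadLocalDemo {
    // Returns what the calling thread and a second thread each end up loading.
    static String[] run() {
        String[] seen = new String[2];
        ThreadLocalFactory.configure("set-A");      // thread 1 configures
        Thread other = new Thread(() -> {
            ThreadLocalFactory.configure("set-B");  // thread 2 reconfigures
            seen[1] = ThreadLocalFactory.get();
        });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        seen[0] = ThreadLocalFactory.get();         // thread 1 loads lazily afterwards
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println(seen[0]); // loaded:set-A
        System.out.println(seen[1]); // loaded:set-B
    }
}
```

The price of this design is that each thread loads and caches its own copy of the data.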

evan.ward · September 12, 2018, 1:59pm

Hi Yannick,

A smaller change to Orekit may be to make the constructor for UTCScale public so the user can provide the specific data they want to use, and then the user can control the specific instance of the UTCScale they want to use (e.g. UTC as it was in 2010 or UTC as it is currently).

For the long term goal I prefer the architecture outlined in [1].

[1] https://www.orekit.org/mailing-list-archives/orekit-developers/msg00085.html

gbonnefille · September 13, 2018, 7:46am

Introducing something that lets the user finely manage the data context is really worth the effort. Nevertheless, I think a simpler API, based on global singletons, should be kept.

This is clearly a real use case for Orekit: letting the user quickly write a piece of code to solve one problem. In such a context, dealing with complex initialization and a huge context object to carry from call to call is not what the user expects.

Ideally, a rewrite should create two APIs: a simple one and a powerful one. libcurl has such a dual API, called easy and multi.
Possibly the beginning of the modularization of Orekit?

evan.ward · September 13, 2018, 1:06pm

I think both can be supported at the same time with only a small difference in API by making the static factories instantiable (TimeScaleFactory, etc.). Existing behavior would be preserved by keeping an instance as a global singleton. The difference in API would be only the constructor which would be used to supply different data sets.

Of course, existing classes in Orekit that use static factories may have to be modified so that they can accept their dependencies via a constructor/method parameter.
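
A rough sketch of what this could look like, with hypothetical names (ScaleFactory and OffsetCalculator are illustrations, not existing Orekit classes, and the offsets are made-up sample values):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical instantiable factory; a global default instance preserves the old usage.
final class ScaleFactory {
    private static final ScaleFactory DEFAULT =
            new ScaleFactory(Arrays.asList(35, 36, 37));

    private final List<Integer> offsets;

    // The only API difference: a constructor that accepts a specific data set.
    ScaleFactory(List<Integer> offsets) {
        this.offsets = Collections.unmodifiableList(new ArrayList<>(offsets));
    }

    static ScaleFactory getDefault() {
        return DEFAULT;
    }

    int latestOffset() {
        return offsets.get(offsets.size() - 1);
    }
}

// Hypothetical consumer that accepts its dependency through a constructor.
final class OffsetCalculator {
    private final ScaleFactory factory;

    // Simple usage: fall back on the global singleton.
    OffsetCalculator() {
        this(ScaleFactory.getDefault());
    }

    // Advanced usage: the caller supplies the data context.
    OffsetCalculator(ScaleFactory factory) {
        this.factory = factory;
    }

    int offset() {
        return factory.latestOffset();
    }
}

final class InstantiableFactoryDemo {
    public static void main(String[] args) {
        ScaleFactory older = new ScaleFactory(Arrays.asList(33, 34));
        System.out.println(new OffsetCalculator().offset());      // 37
        System.out.println(new OffsetCalculator(older).offset()); // 34
    }
}
```

Both calculators can coexist in the same program, each with its own data, and code that never mentions a factory keeps working.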

yannick · September 13, 2018, 3:26pm

Thank you very much for the answers.

For my short-term needs, I think I’ll take my chances with the clearFactories() method (by an ugly copy-paste into my own project to make it accessible). I need it only in a single location, in a piece of code that will not be called often, and always at a time where I can ensure no other threads are running. With a good amount of tests around it, I think it is a relatively safe bet.

Now for long-term refactoring, I agree with @gbonnefille : it is important to keep simple things simple.

I would not mind a dual API. But there is maybe a way to avoid it : maybe the context could be handled in a “state machine” way ? What I have in mind would look like:

contexts could be activated by calling a dedicated static method activateContext(“mycontext”), somewhere in an appropriate class
thread-specific contexts (with thread-specific data caching), so a thread calling activateContext() would not mess up the computations of another thread
a default context, so the user does not need to bother with the whole mechanics if he does not need it.

I have not thought about this much, but at first glance it would seem to work and remain simple to use.
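
As a very rough sketch of the three points above (DataContexts, register and currentData are hypothetical names, only activateContext comes from the description, and strings stand in for real model data):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry of named data contexts with a per-thread active context.
final class DataContexts {
    static final String DEFAULT = "default";

    private static final Map<String, String> REGISTRY = new ConcurrentHashMap<>();
    // Every thread starts on the default context until it activates another one.
    private static final ThreadLocal<String> ACTIVE =
            ThreadLocal.withInitial(() -> DEFAULT);

    static {
        REGISTRY.put(DEFAULT, "default-data");
    }

    static void register(String name, String data) {
        REGISTRY.put(name, data);
    }

    // Switches the context for the calling thread only.
    static void activateContext(String name) {
        if (!REGISTRY.containsKey(name)) {
            throw new IllegalArgumentException("unknown context: " + name);
        }
        ACTIVE.set(name);
    }

    static String currentData() {
        return REGISTRY.get(ACTIVE.get());
    }
}

final class DataContextsDemo {
    // Returns the data seen by the calling thread and by a fresh thread.
    static String[] run() {
        String[] seen = new String[2];
        DataContexts.register("mycontext", "my-data");
        DataContexts.activateContext("mycontext");
        Thread other = new Thread(() -> seen[1] = DataContexts.currentData());
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        seen[0] = DataContexts.currentData();
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println(seen[0]); // my-data
        System.out.println(seen[1]); // default-data
    }
}
```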