Data Context Proposal

I think it would be good to add it.

That would be fine, as it currently breaks the automatic build (see https://ci.orekit.org/job/orekit/job/annotation/). Note that @gbonnefille is currently working on a revamp of the build process to switch from Jenkins to GitLab CI, together with SonarQube (https://sonar.orekit.org/) and Nexus Repository Manager, to get automatic builds of everything (general website, maven site, nightly packages…).

No, I think it would be wiser to delay the release to mid January.

OK, options for integrating into the build are:

  1. Include the plugin in the main Orekit jar. Would require Orekit to be built in two steps: annotation and plugin first, then the main code.
  2. Create a separate jar for the plugin using a different classifier. Similar to the above, except the compiler plugin is not in the runtime jar.
  3. Keep separate Maven projects for the annotation and plugin, as they are now, and move them to a new repository. Adds two new build-time dependencies but zero new run-time dependencies. Adds another git repository.
  4. Make Orekit a multi-module Maven build. Same as above, except everything stays in the same git repository. Would require moving all existing code to a new directory in the repository.

Some of my thoughts on this topic:

  • What is the impact of the annotation and compiler plugin on Orekit’s users? Are these things limited to Orekit’s development, or could a user of Orekit be interested in (or need) to use them?
  • Splitting Orekit’s code base into many modules could be interesting for some other use cases: as a user, I may only want to use time-related computations and/or reference frame computations.
  • Will the changes to the build system introduce incompatibilities, and therefore require a bump of the major part of the version number?

I am always wary of multi-module projects, but I agree that once they are properly set up, they are more powerful and user-friendly.

Multi-modules for new features (like the annotation and plugin) could be done in 10.1, but splitting the main library into orekit-core, orekit-od and such would require a major version. In any case, I guess it will happen sometime in the future, as the library becomes more bloated with each version.

If they don’t want to use it then there should be no impact. The annotation is dropped at run time, so it should not change any run-time behavior. Note that attempting to reference the plugin at run time will throw an exception, because it depends on classes that are part of javac. But if they don’t want to use the plugin, why would they reference it?

If a user doesn’t want to use the default data context then they will want to use the plugin when compiling their own sources. The plugin will print a warning every time they use the default data context. They would add an extra argument when calling javac: -Xplugin:dataContextPlugin ...

I didn’t have a big split in mind. Just put the annotation and plugin in separate modules and leave Orekit as it is. A bigger split is interesting, but that is probably a separate larger discussion.

To use the plugin, an OpenJDK-based javac must be used. The plugin needs access to the internal representation used in the compiler, similar to how @Deprecated works, so it needs a specific compiler. I’ve tested it with OpenJDK 8 and 11, the two LTS versions. Using an annotation processor would use a standard API and so would be compatible with more compilers, but it seems unable to figure out where the annotated elements are used; I needed the more powerful API of a compiler plugin.

How does this impact the user? If the user wants to continue using Orekit as they have, there will be no impact. If the user wants to check where they use the default data context in their own program then they will have to compile it with OpenJDK and add the compiler option I mentioned above. If the user wants to build Orekit they will need to use OpenJDK. (We could create a profile to enable/disable the compiler plugin, which would allow building Orekit with a different compiler without the ability to do any compile-time data context checks.)

So I believe including the plugin will be entirely backward compatible.

Yeah, it is useful when I’m using Hipparchus but I still don’t fully understand how the build works.

I’ve implemented option 1 now, but it will be relatively easy to switch to any of the other options. I’ll push it up to the annotation branch.

I would personally be interested in using the plug-in for my application, since I intend to avoid using the default data-context anywhere. However right now we are using the Oracle JDK. Moving to OpenJDK has been considered, but we have not yet made the step. We will certainly consider it again now that it’s a requirement for this plug-in.

Another thing that I had in mind was to create a set of Checkstyle rules to allow developers to spot undesired uses of the default data-context. However, unless I missed something, it would require a rule for each Orekit method with a data-context in its list of arguments, and would certainly be a pain to maintain.

I understand that it is also possible to create a default data-context that will crash at runtime when it is used? Assuming the application’s test coverage is good enough, this should also be able to spot undesired uses of the default data-context (but as a last resort, since it only works at runtime).

I have begun a prototype integration of this new feature into my application. So far, so good, but I have not been able to do that much yet unfortunately. I have a feeling that the hard part will be ensuring that the default data-context remains unused.

As spotted in a previous thread, I have in mind another library facing a similar issue: libcurl. In order to serve both newbies and experienced users, the library offers two APIs: one exposing all possible settings, and another, much simpler one with many presets. As a downside, the simpler one cannot be used in software with concurrent usage.

In order to offer the ability to forbid usage of the default data-context, I feel that the best solution is to extract this part into a dedicated module (let’s call it orekit-helper or orekit-easy). With such a change, consumers can enforce a prohibition of this module.
But as Luc said, this is a major change.

Is it not possible to add a check looking for the annotation @DefaultDataContext?
But you will certainly miss any indirect calls.

What about using AOP (Aspect-Oriented Programming)?
@DefaultDataContext would have to be declared with RUNTIME retention. Then a user would be able to add AOP (Spring AOP?) to react to any call to code carrying the annotation: emit a warning or throw a RuntimeException.
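To illustrate what RUNTIME retention would enable, here is a small, self-contained sketch. The annotation below is a hypothetical stand-in for Orekit’s @DefaultDataContext (which is not currently retained at run time); with RUNTIME retention, a runtime checker or AOP advice could detect the marker via reflection:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class RuntimeAnnotationCheck {

    // Stand-in for Orekit's @DefaultDataContext, redeclared here with
    // RUNTIME retention so that it survives into the loaded class.
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.METHOD, ElementType.FIELD,
             ElementType.CONSTRUCTOR, ElementType.TYPE})
    public @interface DefaultDataContext {
    }

    // A toy class with one "flagged" method and one "safe" method.
    public static class Dates {
        @DefaultDataContext
        public static String nowUsingDefaultContext() {
            return "now (default context)";
        }

        public static String nowUsingExplicitContext() {
            return "now (explicit context)";
        }
    }

    /** Returns true if the named public method carries the marker annotation. */
    public static boolean usesDefaultContext(Class<?> type, String methodName)
            throws NoSuchMethodException {
        Method m = type.getMethod(methodName);
        return m.isAnnotationPresent(DefaultDataContext.class);
    }

    public static void main(String[] args) throws Exception {
        // An AOP interceptor could perform exactly this check before each call.
        System.out.println(usesDefaultContext(Dates.class, "nowUsingDefaultContext"));   // true
        System.out.println(usesDefaultContext(Dates.class, "nowUsingExplicitContext"));  // false
    }
}
```

An aspect (e.g. with Spring AOP) would do the same `isAnnotationPresent` check in its pointcut or advice and then log a warning or throw, instead of printing.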

For Orekit the change seems quite small.

For the user, the cost is the process required to introduce such logic into their framework.

Depends on the requirements.

The compiler plugin will probably work on the Oracle JDK. It is my understanding that the Oracle JDK is mostly OpenJDK with some rebranding and extra bits. Could you try it out? Let me know if it doesn’t work and we can probably get it working fairly quickly. Is the paid version of the Oracle JDK different from the free version on their website? If so, I can download it for testing.

I didn’t realize that was possible. Can Checkstyle also spot undesirable class, field, and constructor access? With the @DefaultDataContext annotation it would be fairly easy to generate a list of annotated fields, methods, constructors, and classes. That is about all a standard-conforming annotation processor can do.

Yes, but because Orekit uses the default data context to initialize static fields in common classes, it will always crash even if you don’t use the default data context in your code. For example, the static fields in AbsoluteDate prevent the class from initializing without a functional default data context. Looking to the future (11.0), I think it makes sense to remove static fields that use the default data context, but that is worth discussing as it would make code that used those static fields more verbose. Guilhem makes a good suggestion of moving the static fields to another class/module.

I’ve tried it out with a couple of my applications and it was not as bad as converting Orekit’s code. The hardest part I’ve found is converting enums like IERSConventions that used the default data context.

I’ve played a bit with the annotation and the compiler plugin. Everything seems to work fine with my Oracle 9 JDK. Those are really great features, I think it is important to expose them to the users if possible. I would really like to be able to audit my code with them.

If I understand the annotation properly, there is an assumption that all Orekit low-level code that makes use of the default context must be manually annotated? Then the compiler plugin helps to ensure that the annotation has been propagated properly up the call tree? If that is correct, it will be very important to maintain proper annotation of Orekit low-level code as the library evolves, otherwise the compiler plugin could miss something.

We have used Checkstyle in my team to spot “anti-patterns” using custom regex rules. It is very useful for guiding newcomers. But I fear that even if the annotation can generate a list of “forbidden” items, converting this list to Checkstyle rules and keeping them up to date will be a bit of a chore. The compiler plugin, if accessible to library users, seems to provide the same functionality in a much more elegant way.

Does this mean that the compiler plugin will issue warnings even if my code does not use the default data context (except indirectly, when those fields are initialized)? I’m still working on a prototype (not as much as I’d like to, unfortunately), but it’s not yet ready to check that. On a side note, I’m struggling a bit trying to reuse the parser from TAIUTCDatFilesLoader with a file that does not match the expected naming convention. I’ll probably figure it out in the end, but I think it shows that an example of a non-standard DataContext in the documentation (or a tutorial) would probably help users make use of this new feature. It adds a layer of complexity on top of the existing DataLoader/DataProvider system, which was already non-trivial.

This is a great idea, and seems fairly easy to do! However, it should catch exactly the same code as the compiler plugin, except at runtime. The more I think about this, the more I believe that the compiler plugin is the best solution (at least for me) and that runtime checks would be redundant. But it would probably not hurt to change the annotation to RUNTIME retention so that users of the library can make use of it in this way?

You are fully right: the compiler warning is a really good solution.

But one lesson I learned during many years of helping development teams is: « real people are lazy! »

Concerning compiler warnings: people don’t read them, or they don’t (want to) understand the meaning of the messages. And they have a good reason to do so: they have to deliver something by a given date.

Here we are talking about a possible bug that is really difficult to fix: the software will not break, but some results can be affected by a misconfiguration under some conditions. This type of bug is really perverse, as it can survive testing and occur only in production. Why? Because during tests we will probably use a DataContext that is not really different from the default one (when creating a test, we are not trying to cover Orekit features, just testing our new features). Since the contexts are similar, we are unable to detect a mix of the injected DataContext and the default one. So the very first day we exercise our feature against a DataContext different from the default is in production.

That is why I believe having a runtime checker could be useful for users.

The AOP solution is one of them. But like the compiler plugin, it requires that the development team be aware of such a possible error and want to protect against it. But remember: « real people are lazy! »

The ideal solution is, IMHO, something where it becomes impossible to mix the injected DataContext and the default one. For example, splitting the code so that I am not able to use both, or at least am made really aware of any usage of the default. Another example is offering control of the default values: I am able to inject a DataContext as the default, or the default is automatically redirected to the current DataContext. Or anything else that avoids mixing them.

Please be assured that I know all of this requires a lot of work. I don’t want to minimize the value of the work already done: it is really great work, and the compiler plugin is also a really good solution. Many thanks for thinking of this aspect.
I just want to share my experience. I hope this message can help you to find, in the future, a solution for the runtime aspect.

Great! I’m glad it works. As Orekit builds now on the annotation branch, the plugin and annotation are included in the Orekit jar. We could also split them out into separate Maven artifacts by switching Orekit to a multi-module Maven build, if that is desirable.

Yes, it works just like @Deprecated. When any code that is not annotated with @DefaultDataContext uses any code that is annotated with @DefaultDataContext then a warning is printed. It also has the same limitations as @Deprecated, namely it can only use information that is available at compile time so it is possible to write code that uses the default data context without generating a warning. This mainly means it has difficulty with inheritance where some methods use the default data context and others do not. For example:

AbsoluteDate date = ...;
Object o = date;
o.toString();

The method AbsoluteDate.toString() uses the default data context, but the above snippet won’t trigger a warning because when toString() is called the compiler doesn’t know the runtime type of o (and can’t know it for more complex cases). This is probably fairly common in practice, with code like System.out.print(date) or "" + date where toString() is called through polymorphism, so the compiler can’t know which implementation of toString() will actually be called. Another example is DataProvider.feed(Pattern, DataLoader, DataProvidersManager), where the default implementation uses the default data context to maintain backward compatibility, but all the implementations of DataProvider in Orekit override the default implementation with one that does not use the default data context.

In summary the compiler plugin is a useful tool that will catch most uses of the default data context, but it can’t catch all because of the power of the Java language. To ensure the default data context is not used a runtime checker is also needed but can’t be implemented until Orekit 11.0.

Yes, running mvn compile (on the annotation branch) on Orekit runs the plugin on Orekit as well. It currently only prints a warning for the DataProvider.feed() method noted above. In other words Orekit is annotated as well as it can be.

No. For example, AbsoluteDate.J2000_EPOCH is annotated with @DefaultDataContext. As soon as your program references AbsoluteDate for the first time J2000_EPOCH will be initialized using the default data context. The compiler plugin will only emit a warning if your code references J2000_EPOCH.
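The underlying JVM behavior can be demonstrated without Orekit: a class’s static initializer runs the first time the class is used, even if the flagged field itself is never read. The class below is a hypothetical stand-in for AbsoluteDate, logging when its static field is computed:

```java
public class StaticInitDemo {

    // Records when the nested class's static initializer runs.
    static final StringBuilder LOG = new StringBuilder();

    // Stand-in for AbsoluteDate: its static field is computed when the
    // class is first initialized, not when the field is actually read.
    static class Epochs {
        static final String J2000 = compute();

        private static String compute() {
            LOG.append("initialized;");
            return "2000-01-01T11:58:55.816";
        }

        static String somethingUnrelated() {
            return "unrelated";
        }
    }

    public static void main(String[] args) {
        // No member of Epochs has been touched yet: nothing initialized.
        System.out.println("before: " + LOG);
        // Calling ANY member of Epochs triggers the static initializer,
        // even though J2000 itself is never read here.
        Epochs.somethingUnrelated();
        System.out.println("after: " + LOG);
    }
}
```

This is why merely loading code that touches such a class exercises the default data context, while the compiler plugin only warns when your own code references the annotated field directly.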

More and better documentation is always helpful. 🙂 @luc made many updates to the docs already. contexts.md provides some good pointers. Perhaps adding a tutorial or an example in the Javadoc?

The easiest transition would probably be to use DataContext myContext = new LazyLoadedDataContext() and then configure the DataLoaders and DataProviders as you did before. Another way, now possible, is:

String path = "/path/to/leap/second/file";
try (InputStream in = Files.newInputStream(Paths.get(path))) {
    List<OffsetModel> leapSeconds = new TAIUTCDatFilesLoader.Parser().parse(in, path);
}

Then leapSeconds can be used as the first argument to TimeScales.of(...).

I’m not familiar with AOP but theoretically it should be able to catch some cases that the compiler cannot, as described above. It would introduce some overhead though which would impact performance but it could still be useful as part of a test suite.

Agreed. I appreciate you sharing your valuable lessons learned. My preferred solution is creating an ExceptionalDataContext that always throws a runtime exception, and letting the user make it the default using DataContext.setDefault(...). Implementing that in the 10.0 series would break backward compatibility, so a full implementation will have to wait until 11.0, when we can remove (or move) the static fields that use the default data context.

A partial solution may be available now if the user references AbsoluteDate and Propagator, and then sets the ExceptionalDataContext as the default. The default data context could still be used through e.g. AbsoluteDate.J2000_EPOCH but any new attempts to use the default data context would fail. Some example code to implement this idea, still needs to be tested:

public void setExceptionalDataContext() {
    setExceptionalDataContext(null, null);
}

// reference AbsoluteDate and Propagator so their static fields are initialized.
private void setExceptionalDataContext(AbsoluteDate d, Propagator p) {
    DataContext.setDefault(new ExceptionalDataContext());
}
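Stripped of the Orekit types, the throwing-context idea looks roughly like this. Note that DataContext here is a minimal hypothetical stand-in for Orekit’s interface (two arbitrary accessors), not the real API:

```java
// Orekit-independent sketch: a context whose every accessor throws, so any
// accidental use of the "default" context fails fast at run time.
public class ExceptionalContextSketch {

    /** Minimal stand-in for Orekit's DataContext interface. */
    public interface DataContext {
        Object getTimeScales();
        Object getFrames();
    }

    /** Every method throws: reaching this context at run time is a bug. */
    public static class ExceptionalDataContext implements DataContext {
        @Override
        public Object getTimeScales() {
            throw new IllegalStateException("default data context used");
        }

        @Override
        public Object getFrames() {
            throw new IllegalStateException("default data context used");
        }
    }

    public static void main(String[] args) {
        DataContext context = new ExceptionalDataContext();
        try {
            context.getTimeScales();
            System.out.println("no exception");
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

In the real library the class would implement Orekit’s actual DataContext interface and be installed with DataContext.setDefault(...), as in the snippet above.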

I have set up a tutorial for multiple data context usage.
Could someone review it? I recall that the tutorials are now in a separate git repository: https://gitlab.orekit.org/orekit/orekit-tutorials

Do you think we could merge the data-context and annotation branches back to develop and prepare the 10.1 release?

Hi @luc,

I’ve read your tutorial, I find it very good. It remains simple and yet provides interesting insight into DataContext details.

I have spotted a couple of typos: I have a patch ready, but I do not have commit rights to the tutorials repo. Could I get them, please? Otherwise, I can provide the patch file.

In the process of reviewing this tutorial, I’ve spotted a minor issue with the Tutorials repo. I have opened the issue in gitlab. Once a consensus on the branching model is reached, the fix will be trivial whatever we choose.

It seems someone already gave you write access to orekit-tutorials before me. Could you
check if it works?

My bad, it seems this was a git configuration issue on my side. I’ve now been able to push the new develop branch to the tutorials repo, so my commit rights work. I’ll push the commit for the typos later today. Sorry to have bothered you for nothing.

I can merge them all into the develop branch.

Nice tutorial! The code, fine for a tutorial, would create performance issues if used in production. Every time context.getFrames() is called it creates a new set of frames. It would be a memory hog and could hurt performance, since the equality shortcut in frame.getTransform(...) would not be used. I created CompositeDataContext to have a convenient way to address this issue. Again, not an issue for a tutorial, but I would be nervous if someone pasted that into production code.

Also, the loop output should be indented so it is treated as preformatted.

Thanks for writing the tutorial!

I fixed this on the new develop branch of the tutorials repo, along with a couple other typos.

I did not touch the tutorial code however, so your other points are still applicable.