I would stop short of calling this a feature request, but I do think it is worth flagging a use case for OREKIT to active contributors: concurrent computation of completely unrelated tasks. We use OREKIT in an operational system. It is an excellent library, but the main problem we face is compute time.
Has anyone assessed the overhead of the thread-safety measures in the frame-caching code? I ask because, if they add up to something significant due to the high hit rate, I wonder whether alternative models might be useful and injectable according to need. We have not tried this due to time pressure, but others may have assessed it and found it either impactful or negligible.
We do currently use OREKIT in a multi-threaded environment, but only because that is the easiest way to deliver computation requests and responses over the network. In other words, we do not use threads because we want to safely share OREKIT internal information between them; we use them simply to deliver workload. We could therefore instead spawn a standalone process for each computation request so that it can run in a non-thread-safe mode, develop thread-local solutions to achieve thread safety, or even remove mutable singletons altogether.
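To make the thread-local option a bit more concrete, here is a minimal sketch using only the JDK. It is not Orekit code: `ThreadLocalCacheSketch`, its `loader` and the `loadCount()` counter are hypothetical names made up for illustration. Each thread gets its own private map, so lookups need no locking, at the cost of every thread loading and holding its own copy of the cached values.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongFunction;

/** Hypothetical per-thread cache: no locking on lookups, one private map per thread. */
final class ThreadLocalCacheSketch<V> {

    /** Each thread lazily receives its own HashMap the first time it calls get(). */
    private final ThreadLocal<Map<Long, V>> perThread = ThreadLocal.withInitial(HashMap::new);

    /** Computes a value on a cache miss (assumed to be a pure function of the key). */
    private final LongFunction<V> loader;

    /** Counts cache misses across all threads, only so the sketch can be observed. */
    private final AtomicInteger loads = new AtomicInteger();

    ThreadLocalCacheSketch(LongFunction<V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value for the calling thread, loading it on first use. */
    V get(long key) {
        return perThread.get().computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    /** Total number of loads performed so far, summed over all threads. */
    int loadCount() {
        return loads.get();
    }
}
```

With two worker threads each asking for the same key several times, the loader runs once per thread, so `loadCount()` ends up at 2. In a thread pool the per-thread maps live as long as the pooled threads do, which is something to keep in mind for memory use.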
I am not aware of accurate benchmarks about the overhead due to our attempts to preserve thread-safety. There has been a very recent remark (yesterday) here about the time scales.
From my experience, intuition about bottlenecks is often insufficient, so we really need to set up full-blown instrumentation and monitoring tools before we can draw any conclusions, and this of course will depend on the use case. For performance analysis, we used to rely on YourKit; Maxime recently pointed me to VisualVM too, as I don’t have access to a YourKit licence anymore. I don’t know if these tools provide information about multi-thread bottlenecks or if they are accurate enough to pinpoint the code we added for thread safety.
Starting a separate process will likely incur more overhead than starting a new thread. Threads in Orekit share loaded auxiliary data. A separate process would have to load it all first.
You could try using the Preloaded instead of the LazyLoaded variants of the DataContext. Synchronization is much simpler when all the data is loaded into memory, which may improve run time at the cost of more memory and longer start-up times. See e.g.
There is some performance monitoring of Orekit frame transforms on GitLab.
The DataContext changes do indeed open up improvements that we already make use of.
Starting the process or thread is not the concern for me in this particular use case. For all of my use cases, I am able to make the heart of the problem “embarrassingly parallel”, meaning if any synchronisation is needed I can manage it outside the high revisit-rate loops. The focus of my question was really in the use of thread safety measures deep inside tight loop algorithms and whether objective measures have already been obtained within the large and growing Orekit community.
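By “embarrassingly parallel” I mean something along the lines of the following minimal sketch, which uses only the JDK. The class `IndependentTasksSketch` and its `compute` workload are hypothetical placeholders, not Orekit code. Each task touches only its own local state inside the tight loop, and the only synchronisation point is collecting the results once at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical layout of unrelated tasks that share nothing inside their loops. */
final class IndependentTasksSketch {

    /** Placeholder workload: a tight loop that touches only task-local state. */
    static double compute(int taskId, int steps) {
        double acc = 0.0; // local to this call, never visible to another thread
        for (int i = 1; i <= steps; i++) {
            acc += taskId / (double) i;
        }
        return acc;
    }

    /** Runs every task on a fixed pool and gathers the results in task order. */
    static double[] runAll(int tasks, int steps, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Double>> futures = new ArrayList<>();
            for (int t = 0; t < tasks; t++) {
                final int id = t;
                Callable<Double> job = () -> compute(id, steps);
                futures.add(pool.submit(job));
            }
            double[] results = new double[tasks];
            for (int t = 0; t < tasks; t++) {
                // Future.get() is the only place where threads synchronise.
                results[t] = futures.get(t).get();
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("a task failed or was interrupted", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

For example, `IndependentTasksSketch.runAll(4, 1000, 2)` gives the same four values as calling `compute` sequentially for task ids 0 to 3. Any locking that happens inside the library calls made from `compute` is then pure overhead for this kind of workload, which is what my question is about.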
I agree with @luc that intuition alone is not enough. We have seen (in VisualVM) that protected operations in the frame-caching functionality have a high hit rate, so it is at least an area of interest. Even if we choose a single-threaded approach to eliminate contention, the synchronisation and memory barrier operations are still there, and unless the JVM is somehow clever enough to optimise them away, they are slow compared to simple maths operations and so could be significant inside the loop of an algorithm.
Of course that is just a hunch and so it comes down to gathering evidence comparing the runtime with the thread-safety hacked out.
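As a rough idea of the kind of comparison I have in mind, here is a minimal single-threaded sketch using only the JDK. It is not Orekit code and does not reproduce the actual frame cache: `SyncOverheadSketch`, `plainGet` and `lockedGet` are hypothetical names, and the map of square roots is just a stand-in for a cache that is hit on every loop iteration. Hand-rolled timings like this are only indicative, since the JIT can sometimes elide or cheapen uncontended locks; a proper harness such as JMH would be the safer way to get numbers one can trust.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntToDoubleFunction;

/** Hypothetical stand-in for a cache that is hit on every loop iteration. */
final class SyncOverheadSketch {

    private final Map<Integer, Double> cache = new HashMap<>();

    SyncOverheadSketch(int size) {
        for (int i = 0; i < size; i++) {
            cache.put(i, Math.sqrt(i));
        }
    }

    /** Cache hit with no locking at all. */
    double plainGet(int key) {
        return cache.get(key);
    }

    /** Same cache hit behind a monitor, as a thread-safe cache might do. */
    synchronized double lockedGet(int key) {
        return cache.get(key);
    }

    /** Sums many lookups so the loop has an observable result. */
    static double sumLookups(IntToDoubleFunction lookup, int size, int iterations) {
        double sum = 0.0;
        for (int i = 0; i < iterations; i++) {
            sum += lookup.applyAsDouble(i % size);
        }
        return sum;
    }

    public static void main(String[] args) {
        final int size = 1024;
        final int iterations = 5_000_000;
        SyncOverheadSketch s = new SyncOverheadSketch(size);
        for (int round = 0; round < 5; round++) { // early rounds mostly warm up the JIT
            long t0 = System.nanoTime();
            double plain = sumLookups(s::plainGet, size, iterations);
            long t1 = System.nanoTime();
            double locked = sumLookups(s::lockedGet, size, iterations);
            long t2 = System.nanoTime();
            System.out.printf("round %d: plain %d ms, locked %d ms (sums %.1f / %.1f)%n",
                              round, (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, plain, locked);
        }
    }
}
```

Both variants return identical sums, so any difference in the printed timings comes from the locking alone. The real test would of course be the same before/after comparison on an actual Orekit workload.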
Thanks for your time, and thanks for the good work you all put in on Orekit!