On the use of (Field)DataDictionary and DoubleArrayDictionary

Serrof · September 19, 2025, 10:21am

Hi folks,

Recently I discovered performance issues when numerically propagating the STM and model parameters Jacobian with “a lot” of forces (even 10 was bad).

The problem was mostly due to creating/modifying/querying DataDictionary and DoubleArrayDictionary. These operations were the bottleneck, taking the majority of CPU time.

So in patch 13.1.1 I did my best to fix this without breaking APIs. Often, I simply had to use Java's HashMap, which was more efficient than these Orekit-native classes in this use case.

So my question is: why do we reimplement everything there? Why not just use HashMap?

Cheers,
Romain.

luc · September 19, 2025, 12:16pm

When they were created, they had a smaller overhead than a standard map (see commit f53d163f).

Before that, we did use standard maps (see commit 4f9656a).

Did something change that increased the overhead, or is it simply that, as stated in their javadoc, they are intended for a small number of keys?
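
To make the "small number of keys" trade-off concrete, here is a minimal sketch of a list-backed dictionary next to a standard map. It is an illustration only: `LinearDictionary` and `DictionaryComparison` are hypothetical names, the `"param-i"` keys are made up, and this is not the Orekit code. The point is that every `put` and `get` on the list-backed version walks the entries one by one, so the cost per call grows with the number of keys, whereas a hash map pays a roughly constant hashing cost.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical list-backed dictionary: every lookup scans the entries in order. */
final class LinearDictionary {

    /** One key/value pair. */
    private static final class Entry {
        final String key;
        double[] value;

        Entry(final String key, final double[] value) {
            this.key = key;
            this.value = value.clone();
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Insert or overwrite: the key is searched linearly first. */
    void put(final String key, final double[] value) {
        for (final Entry e : entries) {
            if (e.key.equals(key)) {
                e.value = value.clone();
                return;
            }
        }
        entries.add(new Entry(key, value));
    }

    /** Lookup: up to size() string comparisons; returns null if absent. */
    double[] get(final String key) {
        for (final Entry e : entries) {
            if (e.key.equals(key)) {
                return e.value.clone();
            }
        }
        return null;
    }

    int size() {
        return entries.size();
    }
}

/** Fills both containers with the same made-up parameter names. */
final class DictionaryComparison {
    static void fill(final LinearDictionary linear, final Map<String, double[]> hashed, final int n) {
        for (int i = 0; i < n; i++) {
            final double[] partials = new double[] { i, 2.0 * i, 3.0 * i };
            linear.put("param-" + i, partials);
            hashed.put("param-" + i, partials.clone());
        }
    }
}
```

Both containers return the same values for the same keys; only the cost profile differs. With one or two keys the scan is very cheap and avoids hashing entirely, which is presumably why it looked attractive originally.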

Serrof · September 19, 2025, 4:45pm

They might be faster than HashMap for a few entries, I don’t know, but I doubt the difference would be as noticeable.

What I know for sure is that when you have repeated propagations with 10-20+ model parameters in your partial derivatives, getting/writing/copying/modifying these dictionaries was the bottleneck, and it no longer is after switching to HashMap wherever possible without breaking APIs. I’m talking about a factor of 3 in total running time, more or less.
Btw you might say that such a use case is far-fetched but it’s not, I’m simply trying to use Orekit for a direct shooting method.

So I’m thinking: why maintain in-house dictionaries, with the same features as HashMap, when we could just delegate everywhere?
In practice we could still deprecate them first to soften the upgrade pain for users.

Cheers,
Romain.

luc · September 19, 2025, 5:32pm

I would really like to see a benchmark with the use case of one additional state only (STM), which is a fairly frequent one.

Could we have an interface for this, and, as SpacecraftState is indeed immutable, could we decide at build time to use either the internal implementation for few keys or a hash map implementation for many keys?
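
A rough sketch of what such an interface could look like, with the implementation picked once at build time. Everything here is an assumption for discussion: `ReadOnlyDictionary`, `ArrayBacked`, `HashBacked` and the `THRESHOLD` value are hypothetical names and numbers, not existing Orekit API, and the cut-over point would have to come from a real benchmark.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical read-only view over named double arrays. */
interface ReadOnlyDictionary {

    /** Assumed cut-over point between the two implementations (to be benchmarked). */
    int THRESHOLD = 4;

    /** Returns a copy of the stored array, or null if the key is absent. */
    double[] get(String key);

    int size();

    /** Chooses the implementation once, from the number of keys known at build time. */
    static ReadOnlyDictionary of(final Map<String, double[]> source) {
        return source.size() <= THRESHOLD ? new ArrayBacked(source) : new HashBacked(source);
    }
}

/** Few keys: parallel arrays and a linear scan, no hashing. */
final class ArrayBacked implements ReadOnlyDictionary {
    private final String[] keys;
    private final double[][] values;

    ArrayBacked(final Map<String, double[]> source) {
        keys = new String[source.size()];
        values = new double[source.size()][];
        int i = 0;
        for (final Map.Entry<String, double[]> e : source.entrySet()) {
            keys[i] = e.getKey();
            values[i] = e.getValue().clone();
            i++;
        }
    }

    @Override
    public double[] get(final String key) {
        for (int i = 0; i < keys.length; i++) {
            if (keys[i].equals(key)) {
                return values[i].clone();
            }
        }
        return null;
    }

    @Override
    public int size() {
        return keys.length;
    }
}

/** Many keys: delegate to a HashMap holding defensive copies. */
final class HashBacked implements ReadOnlyDictionary {
    private final Map<String, double[]> map;

    HashBacked(final Map<String, double[]> source) {
        final Map<String, double[]> copy = new HashMap<>();
        for (final Map.Entry<String, double[]> e : source.entrySet()) {
            copy.put(e.getKey(), e.getValue().clone());
        }
        this.map = copy;
    }

    @Override
    public double[] get(final String key) {
        final double[] value = map.get(key);
        return value == null ? null : value.clone();
    }

    @Override
    public int size() {
        return map.size();
    }
}
```

Because the container is never modified after construction, callers only see the interface and the choice of backing structure stays an internal detail that can be tuned later.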

Serrof · September 19, 2025, 6:19pm

Sure, let’s dig a bit more before acting.

My feeling tho is that when you only have a few additional states/derivatives, you don’t query them very often, so I don’t know if you would see a difference at a high level. On the other hand, the more states you have, the more time you’re going to spend trying to find which one is which, and it can even become your bottleneck.
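
To put a number on "finding which one is which", here is a small counting sketch rather than a timing benchmark. It assumes a plain list of keys searched front to back (the `LookupCost` class and the `"state-i"` names are hypothetical) and counts how many `equals` calls it takes to look up every key once.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper counting key comparisons in a linear-scan lookup. */
final class LookupCost {

    /** Number of equals() calls needed to find each of n keys once, scanning from the front. */
    static long comparisonsForFullSweep(final int n) {
        final List<String> keys = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            keys.add("state-" + i);
        }
        long comparisons = 0;
        for (final String wanted : keys) {
            for (final String candidate : keys) {
                comparisons++;
                if (candidate.equals(wanted)) {
                    break; // found, stop scanning
                }
            }
        }
        return comparisons;
    }
}
```

The count is n(n+1)/2: 1 comparison for a single state, 55 for 10 states and 210 for 20, against n hash lookups for a map. This says nothing about wall-clock time for very small n, where the scan may well win, so it doesn't replace a proper benchmark.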

Cheers,
Romain