Multiprocessing and the Orekit Python wrapper

Hello everyone!

I am trying to combine multiprocessing (using the Python multiprocessing library) and Orekit.
I am using the function starmap to apply the function myfunction to each element of a list of argument tuples, args_list:

pool = multiprocessing.Pool(nb_processes)
pool.starmap(myfunction, args_list)
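For reference, starmap unpacks each tuple in args_list as positional arguments; it behaves like itertools.starmap, but distributes the calls across the pool's processes. A minimal illustration with a placeholder function (not the actual Orekit code):

```python
import itertools

def my_function(a, b):
    # stands in for the real Orekit computation
    return a + b

args_list = [(1, 2), (3, 4), (5, 6)]

# Pool.starmap(my_function, args_list) computes exactly this, in parallel:
results = list(itertools.starmap(my_function, args_list))
print(results)  # [3, 7, 11]
```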

To test this, I began with a simple case: only one process (nb_processes = 1). This should be equivalent to the situation before I implemented multiprocessing (at that time, my code worked perfectly).
When a new process is created, everything goes well until the line
earth_fixed = FramesFactory.getITRF(IERSConventions.IERS_2010, True)
where the execution hangs (no error, it just hangs indefinitely).

My first guess was that the process has trouble accessing orekit-data.
I tried adding

orekit.initVM()
setup_orekit_curdir("data/orekit-data-master")

at the beginning of myfunction so that Orekit is properly initialized in that process. It does not solve the problem.
I also tried adding another line that uses orekit-data before the getITRF line:

utc = TimeScalesFactory.getUTC()
earth_fixed = FramesFactory.getITRF(IERSConventions.IERS_2010, True)

Surprisingly, the utc line executes without any problem, and then the ITRF line hangs indefinitely.

I am running out of ideas for solving this problem, so I would be glad to get help on this.

Thank you in advance for the time spent on my problem.
Regards,
Thomas

Hi @ThomasJ,

Welcome to the forum!

I’m surprised you have a problem with the frames. They should be thread-safe; however, I don’t know if that is equivalent to “process-safe”.

I suggest you instantiate a different DataContext for each of your processes so you don’t have concurrent access to the data between the processes.
Here is a link to this in the technical documentation.
And a small tutorial (in Java…).

I hope this will help. Feel free to come back to us with a runnable piece of code if you still have some issues.

Regards,
Maxime

Hi @MaximeJ,

Thank you very much for your answer.

If I understand well, your idea is to create a DataContext object for each process and use it to retrieve the frames.
I tried, but it seems that it does not solve the problem. Maybe I am missing something.
I wrote a very simple script. It prints 1 and 2 … but not 3!
To run it, you just need the orekit-data-master folder in the same folder as the script.

import multiprocessing as mp
import orekit
from orekit.pyhelpers import setup_orekit_curdir
from org.orekit.data import DataContext
from org.orekit.frames import FramesFactory
from org.orekit.utils import IERSConventions


def main():

    # Starting the Java virtual machine to run Orekit
    orekit.initVM()
    setup_orekit_curdir("orekit-data-master")

    # Trying to get ITRF in the main process
    print(FramesFactory.getITRF(IERSConventions.IERS_2010, True))

    # Multi processing
    pool = mp.Pool(1)
    args_list = [(0,)]
    pool.starmap(my_function, args_list)


def my_function(i):
    data_context = DataContext.getDefault()
    print(1)
    utc = data_context.getTimeScales().getUTC()
    print(2)
    itrf = data_context.getFrames().getITRF(IERSConventions.IERS_2010, True)
    print(3) # Not printed !!


if __name__ == "__main__":
    main()

Regards,
Thomas

Hi again @ThomasJ,

I’m not sure what’s happening.

What I meant in my first message was to change the “default” DataContext with a specific one for each process.
So, replace:

data_context = DataContext.getDefault()

With something like

from org.orekit.data import DirectoryCrawler, LazyLoadedDataContext
from java.io import File

# orekit_dir: path to your orekit-data folder
datafile = File(str(orekit_dir))
data_context = LazyLoadedDataContext()
data_context.getDataProvidersManager().addProvider(DirectoryCrawler(datafile))

This assumes you’re using a directory for the orekit-data rather than a zip file (I suggest a directory; the zip file will be slower).

But now that I’ve seen your code, I guess that’s not the problem. I don’t know why it can access the time scales but not the frames.
Are you sure you have the “Earth Orientation Parameters” in your data folder?
Can you replace the ITRF with:

gcrf = data_context.getFrames().getGCRF()

and see if it still freezes?

Regards,
Maxime

Hi @MaximeJ,

I modified my_function like this:

def my_function(i):
    data_path = File("./orekit-data-master")
    data_context = LazyLoadedDataContext()
    data_context.getDataProvidersManager().addProvider(DirectoryCrawler(data_path))
    print(1)
    utc = data_context.getTimeScales().getUTC()
    print(2)
    gcrf = data_context.getFrames().getGCRF()
    print(3)
    itrf = data_context.getFrames().getITRF(IERSConventions.IERS_2010, True)
    print(4) # Not printed !!

It prints 1, 2 and 3 … but still not 4! It freezes.
Seems like getGCRF() is working.

Besides, you may notice that I included a getITRF() in my main() function, which behaves correctly. So my orekit-data-master folder contains all the data necessary for ITRF.

Again, thank you for your help.
Regards,
Thomas

Hi @ThomasJ,

Yes, sorry, I missed that line.

:worried: Sorry to read that. It seems the problem is quite complex.
Is there a way to properly debug with starmap and see where the code stalls?
(I’m a bit out of ideas…)

Regards,
Maxime

Hello again @MaximeJ,

I forgot to tell you that I am working on Ubuntu 22.04.2 (Linux 5.15.0).

I tried on a different OS (Windows 10) and I have no problem. The code runs correctly!

Do you have any explanation for that?
Maybe a guess from @petrus.hyvonen? :grin:

Thanks for your time,
Regards,
Thomas

Hi there,

Are your two Python environments equivalent in terms of library versions (check with pip freeze, for example)? If not, try aligning the non-working one with the other using a requirements file.

Cheers,
Romain.

Hello again,

Oh, this is “good” news, at least it shows it must be a configuration / OS issue and not an issue with Orekit data management.

Hi,

Very hard to see where the problem could be. I have mostly used threads, but note that there you need to call AttachCurrentThread() to get JVM access. Not completely sure why you don’t seem to need that here (try it!). See for example: Python wrapper and orekit.initVM()

Not sure if different Java versions could have an influence; I typically use OpenJDK 8, which is what is installed with the default conda orekit package.

Hi !

Thanks to all of you for replying to this topic !

Are your two Python environments equivalent in terms of library versions (check with pip freeze, for example)? If not, try aligning the non-working one with the other using a requirements file.

@Serrof Yes, they are. I use Python 3.10, and pip freeze gives the following output in both cases:

orekit==11.3.2
orekit-stubs==0.1

Very hard to see where the problem could be. I have mostly used threads, but note that there you need to call AttachCurrentThread() to get JVM access. Not completely sure why you don’t seem to need that here (try it!). See for example: Python wrapper and orekit.initVM()

Not sure if different Java versions could have an influence; I typically use OpenJDK 8, which is what is installed with the default conda orekit package.

@petrus.hyvonen I can’t pass the output of initVM() to the parallelized function because this object is not picklable, so I can’t call AttachCurrentThread() on it. I am not very familiar with multithreading and multiprocessing, so I don’t know if there is a way around this.
I also use OpenJDK 8.

I will consider switching to multithreading instead of multiprocessing or write my whole code in Java.

Regards,
Thomas

Hi,

You don’t need to transfer the initVM() value; this is handled by the library. Use:

import orekit
orekit.getVMEnv().attachCurrentThread()

Not sure it will help with your issue, but it’s worth a try.

Hi @petrus.hyvonen ,

It does not solve the issue, but thanks for your help!

Regards,
Thomas

Hi there,

I don’t know if this is what you want to do but I know @abulte used the library called rich to parallelize runs of Orekit in Python.

Romain.

I’ve also been banging my head against multi-CPU use over the last couple of days. What I’ve learned:

  • Multithreading only benefits I/O-bound tasks, like propagating many objects and writing the state to a file or database at each time step. Threads share the same memory space, and you can “attach” the VM as mentioned above.

  • Multiprocessing helps with CPU-bound tasks, but each process has an isolated memory space and requires that all data transferred in/out can be serialized (pickled). So you need to spin up a dedicated VM for each subprocess on init.
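The pickling constraint can be demonstrated without Orekit at all: anything pickle cannot serialize by reference (an inline lambda here, a JVM-backed handle in the Orekit case) fails the same way. This is a generic illustration, not Orekit-specific:

```python
import pickle

# Plain Python data crosses process boundaries without trouble:
payload = {"epoch": "2023-01-01T00:00:00Z", "sma_m": 7.0e6}
assert pickle.loads(pickle.dumps(payload)) == payload

# Objects pickle cannot serialize (lambdas, locks, JVM-backed handles)
# raise an error, which is why they cannot be sent to a worker process:
try:
    pickle.dumps(lambda x: x)
    picklable = True
except (pickle.PicklingError, AttributeError, TypeError):
    picklable = False

print(picklable)  # False
```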

@ThomasJ This worked for me:

import os

from multiprocessing import Pool, current_process
from pprint import pprint


def task(*args, **kwargs) -> str:
    process = current_process()
    print(f"Start: {process.name} > {args}")

    from org.orekit.utils import IERSConventions
    from org.orekit.frames import FramesFactory

    fname = getattr(IERSConventions, args[0])
    frame = FramesFactory.getITRF(fname, True)

    print(f"Done: {process.name} > {frame}")
    return "[ok]"


def pool_init():
    import orekit
    from orekit.pyhelpers import setup_orekit_curdir

    vm = orekit.initVM()
    fdir = os.path.dirname(__file__)
    setup_orekit_curdir(os.path.join(fdir, "data", "orekit-data.zip"))


if __name__ == "__main__":

    workers = 4
    work_init = pool_init  # starts orekit vm
    work_task = task  # task with reg in/out
    work_units = ["IERS_2010", "IERS_2003", "IERS_1996"]  # list of things to do

    with Pool(workers, initializer=work_init) as pool:
        results = pool.map(work_task, work_units)

    pprint(results)

The result should give you:

Start: SpawnPoolWorker-2 > ('IERS_2010',)
Start: SpawnPoolWorker-4 > ('IERS_2003',)
Start: SpawnPoolWorker-1 > ('IERS_1996',)
Done: SpawnPoolWorker-2 > CIO/2010-based ITRF simple EOP
Done: SpawnPoolWorker-4 > CIO/2003-based ITRF simple EOP
Done: SpawnPoolWorker-1 > CIO/1996-based ITRF simple EOP
['[ok]', '[ok]', '[ok]']
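One thing worth noting in the output above: the worker names say SpawnPoolWorker, meaning the pool used the spawn start method. Windows (and recent macOS) default to spawn, while Linux defaults to fork, and forking a process that already hosts a running JVM is a classic source of deadlocks. If the hang only appears on Linux, forcing spawn there may be worth trying (a hedged guess rather than a confirmed diagnosis):

```python
import multiprocessing as mp
import os

def worker(i):
    # in the real code, orekit.initVM() + setup_orekit_curdir(...) go here
    return (i, os.getpid())

if __name__ == "__main__":
    # "spawn" starts each worker as a fresh interpreter instead of forking
    # the parent (and its JVM threads/locks), matching Windows behaviour.
    ctx = mp.get_context("spawn")
    with ctx.Pool(2) as pool:
        results = pool.starmap(worker, [(0,), (1,)])
    print(results)
```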

Hello Sydra,

Thanks for your message.

In the end, I managed to solve my issue by using the mpi4py library instead of multiprocessing; it is quite simple to use.

Regards,
Thomas

Hi @ThomasJ!
Thanks for your post here; I’ve been stuck on this problem for a while, and I just successfully debugged this method using multiprocessing. What’s the difference in implementation with mpi4py? Is there any difference in efficiency between the two methods?

mpi4py looks super interesting, thank you for sharing! From what I can tell it does not require a separate Java VM for each process, which should be a lot more efficient.

Hello !

Sorry for replying so late.

@AstroL
I am far from being an expert on the topic, but from my little experience, here is the main difference I see:

multiprocessing is a Python library that allows you to divide a task across several processes and provides some tools for communication between the processes.

MPI allows you to do the same thing, but it is not specific to Python. When you run your Python script with MPI, you must specify the number of processes you want to use (mpiexec -n 4 python myscript.py, for instance). The script is read by each process independently. mpi4py is a Python library that allows you to control MPI from your Python script (communication) and to have different behaviour depending on the process number (rank), using if rank == i: ... for instance.
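A minimal sketch of that rank-based pattern (assuming mpi4py is installed; it falls back to a single serial “rank” when it isn’t, and would normally be launched with mpiexec -n 4 python myscript.py):

```python
# Launch with: mpiexec -n 4 python myscript.py
try:
    from mpi4py import MPI
    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
except ImportError:
    rank, size = 0, 1  # serial fallback when mpi4py is not available

args_list = [("sat_a",), ("sat_b",), ("sat_c",), ("sat_d",), ("sat_e",)]

# Round-robin split: each process takes every size-th work item.
my_work = args_list[rank::size]

for args in my_work:
    # each process would initialize its own JVM and run Orekit here
    print(f"rank {rank}/{size} processing {args}")
```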

@Sydra
I am glad to know that it is interesting to you!

From what I can tell it does not require a separate Java VM for each process, which should be a lot more efficient.

Are you sure about that? I think you do need a separate Java VM for each process. Can you give more details on what makes you think that?

The documentation for mpi4py is very good. You can check the tutorials; they will help you understand how it works.

I admittedly haven’t had time to try this yet. For context, I’m using Orekit as the engine of a space-communications network simulation. As the number of spacecraft/locations goes up, it becomes computationally challenging.

Multithreading didn’t help, and using multiprocessing on an 8-core machine was actually slower than running a single process due to the overhead of serializing the state data to pass it between processes.

I haven’t tried GPU yet (via CUDA), and MPI is still on my list too. I’ll report back when I’ve had a chance to play with it…