Endless loop in a multi-handler case

larahue · September 18, 2020, 4:04pm

Good afternoon @ the Orekit Community,

I have some troubles using detectors and handlers in a propagation.
More precisely, I would like to use several handler (Handler1, Handler2, Handler3) to perform distinct tasks. All handlers shall perform their tasks at the same time, and they are therefore associated with the same type of detector (here: an ApsideDetector).
From my understanding, it is not possible to associate one single detector with several handlers, am I correct ?
The solution I have chosen is to create 3 ApsideDetector : one for each handler.
My handlers can return either Action.RESET_STATE, Action.RESET_EVENT or Action.CONTINUE.
It seems that after a certain time of propagation, we loose control of the propagation (like an endless loop ? I lose the thread executor). Could it be possible that two handlers - active at the same propagation step - reset each other’s calls ? Did this kind of problem already happen to one of you ?

We are still digging to find the exact root cause in the deep code of propagation … but we have no more clue than these several handlers interferring while resetting states.

Just for information, after reading this post (AbstractIntegrator.acceptStep endless looping in eventLoop) - by the way : great feature “your topic is similar to” ! - I downloaded the latest 10.2 orekit version. However, it had no effect on my problem.

Thanks in advance,

Lara

evan.ward · September 18, 2020, 5:18pm

Hi Lara,

Good questions.

In a narrow technical sense, correct. In a broader sense that single detector can take many actions including calling other detectors. E.g. a MultiplexerHandler could be an effective way to handle your case. The MultiplexerHandler would then decide in which order the other handlers are called and which Action is returned. This gives you full control of the ordering and the Action.

In most cases this is probably not the solution you want as it introduces non-deterministic behavior to your application. The ordering of events is only defined up to the tolerance specified as described in [1]. So with simultaneous events Orekit/Hipparchus will chose an arbitrary order for the events. Think of it as similar to a race condition in concurrent programming. It is easy to envision a scenario where only one of the handlers would be called at an event because it resets the state, or where all three will be called, or many other combinations. Given the actions you’re using it would be easy for the RESET_STATE and RESET_EVENTS to trigger each other repeatedly unless you’re careful to avoid it. Perhaps there is a bug here with earliestTimeConsidered not being considered in evaluateStep(...) i.e. for RESET_* actions and step boundaries.

Could you provide code to reproduce the issue, preferably as a JUnit test case? These kinds of event detection issues are hard to figure out so I need to be able to run the code with a debugger attached to see exactly what is happening. Also we’ll need a test case to include in our test suite so we can make sure the bug stays fixed.

Regards,
Evan

[1] org.hipparchus.ode.events (Hipparchus 1.4 API)

Mikael · September 21, 2020, 7:21am

Hi,
Using several detectors that detect the same event is usually a bad idea from our experience. Depending on the event handling RESET_STATE/RESET_EVENT, the integrator can fall in specific cases. hard to debug. I don’t know about the MultiplexerHandler but it looks like a good candidate to handle this. The good way is one handler that can call other classes (or handlers). Fewer detectors make the propagation faster, and lower the risk of bad interactions.

larahue · September 21, 2020, 3:36pm

Hi all,
Thank you for your answers

Yes, I will probably create a new class that will implement EventHandler<ApsideDetector> and will work like a MultiplexerHandler, it seems much more advantageous (computation time, control of the handlers’ orders)

In the meantime, I tried to reproduce the bug. During debug, I noticed that if I change two of my handlers (any 2 handlers out of the 3 developed) by the ContinueOnEvent action, the bug still remains (however, it seems to disappear with 3 detectors working with the action ContinueOnEvent)

Therefore, I have reproduced the bug with the following case :

2 apside detectors are associated with the action ContinueOnEvent
1 apside detector is associated with the handler ApsideCounterOrekit

You can provoke the bug by running BugOrekitTest.
The bug appears at orbit #21 : it seems as if we enter an endless loop (maybe in AbstractIntegrator line 331 ?)

Thanks,
Lara

ApsideCounterOrekit.txt (1.8 KB) BugOrekitTest.txt (1.8 KB)

evan.ward · September 22, 2020, 6:22pm

Thanks for the test case. I’ll try it out, though I might not have a chance to get to it for a while.

Regards,
Evan

larahue · September 23, 2020, 3:47pm

Hi all,
Another question, as I am looking to prioritize the actions returned by the handlers associated with the ApsideDetector :
Will an ACTION.RESET_STATE implies an ACTION.RESET_EVENT ? or is it 2 handlers with distinct behaviours ?
The idea behind this question is to have a ‘generic’ function, that would return the most conservative action, among a list of actions (that are the actions associated with all the handler that triggered at time t)

Regards
Lara

evan.ward · September 24, 2020, 12:09pm

Hi Lara,

The RESET_ actions do form a hierarchy. RESET_EVENTS causes all event detectors to be checked. RESET_DERIVATIVES causes the propagator to discard all computed derivatives after the specified time, which forces an integrator step boundary at that time and also necessarily rechecks all event detectors. RESET_STATE does all the above plus sets the integrator’s state to a new value.

Regards,
Evan

larahue · September 25, 2020, 8:49am

Hi Evan,

Thank you very much for your support ! All of these reflexions lead me to one question :
I am wondering if in orekit it would be possible to associate one detector with a list of handlers instead of one single handler ? that would avoid the situation with N identical detectors being associated with N different handlers. Indeed, the different actions return by the list of handler could be prioritized by : STOP > RESET STATE > RESET DERIVATIVE > RESET EVENT > CONTINUE (if my understanding is correct). This priority order could also be overwritten by the developper.

Regards
Lara

evan.ward · September 25, 2020, 12:08pm

That sounds like a useful idea, but instead of modifying AbstractDetector I think it would be better to create an EventHandler that calls the other handlers, similar to the MultiplexingHandler idea we discussed earlier. Contributions welcome.

larahue · October 7, 2020, 11:15am

Hi Evan,
I would like to contribute to Orekit about this topic (me or @yannick who works with me).
Here is what we propose, in the main lines:

The creation of one interface AbstractMultiHandlersDetector, very similarly to AbstractDetector.
AbstractMultiHandlersDetector implements EventDetector:

public abstract class AbstractMultiHandlersDetector
    <T extends AbstractMultiHandlersDetector<T>> 
    implements EventDetector

The attributes of AbstractMultiHandlersDetector are identical to the attributes of AbstractDetector, except for:

    /** Default handler for event overrides. */
    private final EventHandler<? super T> handler;

which becomes:

/** Default list of handlers for event overrides. */
private final List<EventHandler<? super T>> handlers;

/** List of handlers whose Action returned is RESET_STATE. */
private List<EventHandler<? super T>> resetStateHandlers;

All methods of the AbstractMultiHandlersDetector are identical to the methods of AbstractDetector, except for:

withHandler -> withHandlers

public T withHandler(final EventHandler<? super T> newHandler) {
            return reate(getMaxCheckInterval(), getThreshold(), getMaxIterationCount(), newHandler);        
}

which becomes:

public T withHandlers(List<EventHandler<? super T>> newHandlers) {
            return reate(getMaxCheckInterval(), getThreshold(), getMaxIterationCount(), newHandlers);        
}

eventOccured -> takes into account priorization :
STOP > RESET_STATE > RESET_DERIVATIVE > RESET_EVENT > CONTINUE

public Action eventOccurred(final SpacecraftState s, final boolean increasing) {

    Map<EventHandler, Action> actions = getHandlers().stream()
        .collect(Collectors.toMap(
            Function.identity(), 
            handler -> handler.eventOccurred(s, (T) this, increasing)));

    if (actions.containsValue(Action.STOP)) {
        return Action.STOP;
    }

    if (actions.containsValue(Action.RESET_STATE)) {
        resetStateHandlers = actions.entrySet().stream()
            .filter(entry -> Action.RESET_STATE.equals(entry.getValue()))
            .map(Map.Entry::getKey).collect(Collectors.toList());
        return Action.RESET_STATE;
    }

    if (actions.containsValue(Action.RESET_DERIVATIVES)) {
        return Action.RESET_DERIVATIVES;
    }

    if (actions.containsValue(Action.RESET_EVENTS)) {
        return Action.RESET_EVENTS;
    }
    
    return Action.CONTINUE;
}

resetState -> take into account all handlers that returned the action RESET_STATE:

public SpacecraftState resetState(final SpacecraftState oldState) {

    SpacecraftState newState = oldState;

    for (EventHandler handler : resetStateHandlers) {
        newState = handler.resetState((T) this, newState);
    }

    return newState;
}

We are interested about Orekit’s feedback on such modification (if you would have done it differently ?)
Best,

Lara

evan.ward · October 7, 2020, 12:36pm

Hi Lara,

Excellent!

Good implementation of the logic for what you’re trying to do. A couple questions for you:

Why create a new type of EventDetector instead of EventHandler? The EventHandler interface is much simpler to implement than EventDetector. The methods that contain the new logic eventOccured(...) and resetState(...) are both present in the EventHandler interface.

How is the order determined in resetState(...)? As it is it appears the order handler.resetState(...) is called is the order in which the handlers were added. Seems like as good an order as any. Just have to make sure it is documented in the actual implementation because order will matter.

To some extent using this class will break the existing rules of event detection. Current rules state there is a total order on all events occurring, but that is not true for this class. Consider to EventHandlers A and B where A returns RESET_STATE and resets the state such that B does not occur if A occurs before B. If A and B are detected by separate EventDetectors then A would occur and B would not. With this class A would start occurring when its eventOccurred(...) method is called, then B would occur, then A would finish occurring when its resetState(...) method is called. B never sees the reset state from A and there is no longer a total order as B occurs neither before nor after A. It sounds like this behavior is what you want, so just need to document how using this class is different from using EventHandlers in separate detectors.

larahue · October 7, 2020, 3:52pm

Thank you Evan for your answer,

Yes,you are right : the EventHandler interface seems much simpler to implement than EventDetector. Moreover, I would presume that if we want all the detectors that implements EventDetector to be able to use such MultipleEventHandler, it is better to implement EventHandler than EventDetector (since double heritage is not possible)

The class MultipleEventHandler would have two attributes only:

/** Default list of handlers for event overrides. */
private final List<EventHandler<? super T>> handlers = new ArrayList<>();

/** List of handlers whose Action returned is RESET_STATE. */
    private List<EventHandler<? super T>> resetStateHandlers = new ArrayList<>();

And it would override the three methods : init(...), eventOcurred(...) and resetState(...) as detailed before.

However, I personally think that is it dangerous to let the user use several detectors that detect the same event without warning him.We loss quite a lot of time debugging this problem. Maybe it could be good to add documentation in the EventDetector &/or AbstractDetector class to warn to user from this usage, and to advice him to use the MultipleHandler. What do you think?

Regarding the order choice for resetState, it is purely arbitrary. Yes, such information is very important for the user, so I will document it.

evan.ward:

To some extent using this class will break the existing rules of event detection. Current rules state there is a total order on all events occurring, but that is not true for this class. Consider to EventHandlers A and B where A returns RESET_STATE and resets the state such that B does not occur if A occurs before B. If A and B are detected by separate EventDetectors then A would occur and B would not. With this class A would start occurring when its eventOccurred(…) method is called, then B would occur, then A would finish occurring when its resetState(…) method is called. B never sees the reset state from A and there is no longer a total order as B occurs neither before nor after A. It sounds like this behavior is what you want, so just need to document how using this class is different from using EventHandlers in separate detectors.

And finally, yes, your analysis is correct and it is the behaviour we want (A occurs, B occurs then the action resetState(...) of A occurs). If this is okay for you, we can start the development !

One last though : the endless loop in our problem is the while loop in AbstractIntegrator, line 331.
It is Hipparcus, not Orekit, so I cannot add this to my contribution, this while loop could have a maximumIterationNumber criteria, in order to avoid such endless loop ?

evan.ward · October 7, 2020, 4:25pm

Great idea.

Plan looks good to me. Thanks for you interest in making Orekit better!

Seems like a good idea. It will take some thought as to what the right limit is. The backing up feature is needed to ensure a total order on the events and that they alternate between increasing and decreasing events. There likely exist some strange g functions where the algorithm would need to back up many times to correctly handle them. There is also likely a limit to how many times the algorithm will need to back up to process practical g functions. I’m a bit hesitant to set a limit since it might arbitrarily break some code. My approach was to make that code correct, but it seems that has not worked especially well given this bug and another recent similar one. So it seems worthwhile to add another layer of defense against bugs causing infinite loops.

yannick · October 27, 2020, 4:48pm

Hi all,

I have taken the liberty to push the work of my colleague @larahue on the branch multi_handler-727 and to create the corresponding issue in Gitlab, I hope it’s okay.

The contribution seems fine to me, but it would be great if an Orekit developer with more experience could review it, maybe @evan.ward ? I have prepared a merge request for this review.

Kind regards
Yannick