Unscented filter with non-zero process noise

@markrutten
Thank you very much for the clarifications. (I am still learning about the UKF.)

No worries @niluj.

Those are the equations that are implemented in Hipparchus. But I’m confident they’re wrong!

Hi!

Thank you @markrutten for the clarifications, your comment is right.
@niluj’s comment is also right. Hipparchus doesn’t apply Eq. 5.87, but we have seen both implementations in the literature.

When we started implementing the UKF in Hipparchus, we saw both algorithms: (1) performing the UT only at the beginning of the step (the current implementation) and (2) performing the UT both at the beginning of the step and again just before the update, using the predicted state. We decided to follow (1) because it was easy to implement for a first version and for an internship :slight_smile: . The difference between the two flows is sketched just below.
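
To make the difference concrete, here is a rough sketch of the two flows using Hipparchus linear-algebra types. It is an illustration only: the helper methods (drawSigmaPoints, propagate, measure, weightedMean, weightedCovariance) are hypothetical placeholders, not the real Hipparchus or Orekit API.

    import org.hipparchus.linear.RealMatrix;
    import org.hipparchus.linear.RealVector;

    /**
     * Sketch of the two UKF flows, for illustration only.
     * All helper methods are hypothetical placeholders.
     */
    public abstract class UkfFlowSketch {

        /** Draw 2n+1 sigma points from a mean and covariance. */
        protected abstract RealVector[] drawSigmaPoints(RealVector mean, RealMatrix cov);

        /** Propagate one sigma point through the process model. */
        protected abstract RealVector propagate(RealVector state);

        /** Map one sigma point to measurement space. */
        protected abstract RealVector measure(RealVector state);

        /** Weighted sample mean of a set of sigma points. */
        protected abstract RealVector weightedMean(RealVector[] points);

        /** Weighted sample covariance of a set of sigma points. */
        protected abstract RealMatrix weightedCovariance(RealVector[] points, RealVector mean);

        /** Option (1): a single UT per step (current implementation). */
        public RealVector[] measurementSigmaPointsOption1(RealVector x, RealMatrix p) {
            final RealVector[] chi = drawSigmaPoints(x, p);
            final RealVector[] predicted = new RealVector[chi.length];
            for (int i = 0; i < chi.length; i++) {
                predicted[i] = propagate(chi[i]);
            }
            // Q is added to the predicted covariance for the state update,
            // but the measurement prediction reuses these same points,
            // so the process noise never reaches the measurement side.
            final RealVector[] gamma = new RealVector[predicted.length];
            for (int i = 0; i < predicted.length; i++) {
                gamma[i] = measure(predicted[i]);
            }
            return gamma;
        }

        /** Option (2): a second UT just before the update. */
        public RealVector[] measurementSigmaPointsOption2(RealVector x, RealMatrix p, RealMatrix q) {
            final RealVector[] chi = drawSigmaPoints(x, p);
            final RealVector[] predicted = new RealVector[chi.length];
            for (int i = 0; i < chi.length; i++) {
                predicted[i] = propagate(chi[i]);
            }
            final RealVector xPred = weightedMean(predicted);
            final RealMatrix pPred = weightedCovariance(predicted, xPred).add(q);
            // Re-sample from the Q-inflated predicted covariance, so the
            // process noise propagates into the predicted measurements.
            final RealVector[] resampled = drawSigmaPoints(xPred, pPred);
            final RealVector[] gamma = new RealVector[resampled.length];
            for (int i = 0; i < resampled.length; i++) {
                gamma[i] = measure(resampled[i]);
            }
            return gamma;
        }
    }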

@markrutten your remark is important because we also wanted to implement version (2), as you propose. However, as we implemented the filter during an internship, time constraints didn’t allow us to implement version (2). We would be very happy to have your contribution for version (2). I just have one question: do you think that offering the choice between (1) and (2) is possible? Using a boolean or something else. The default value of this boolean could be consistent with (2).

Best regards,
Bryan

This is all very confusing but surely the textbook would be correct?

I agree @Not_A_Baysean_Fan! But I think that the Sarkka book is correct and the book that @niluj refers to is wrong.

I’m not sure how to make a convincing argument, but I can start with some maths. I’m going to try to show that the innovations covariance is wrong with @bcazabonne’s option (1). This relies on us agreeing that the KF and the UKF give the same results for a linear problem (which they must). Using the notation from above, simplify to a linear problem and compare the innovations covariance produced by option (1) with the one calculated using a standard Kalman filter.
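
In symbols, writing $F$ for the state-transition matrix, $H$ for the measurement matrix, $P$ for the prior covariance, $Q$ for the process noise and $R$ for the measurement noise (my shorthand for the quantities above), the two innovations covariances are:

    % Option (1): sigma points are drawn once from (x, P) and pushed through
    % the dynamics, so Q never reaches the measurement sigma points:
    S_{(1)} = H F P F^{\top} H^{\top} + R

    % Standard Kalman filter, whose predicted covariance is
    % P^{-} = F P F^{\top} + Q:
    S_{\mathrm{KF}} = H (F P F^{\top} + Q) H^{\top} + R
                    = H F P F^{\top} H^{\top} + H Q H^{\top} + R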

Comparing the two expressions, we can see that the process noise is missing from the option (1) result: the $H Q H^{\top}$ term never appears. The two will agree if there’s zero process noise, but this way of implementing the UKF is wrong if there’s non-zero process noise.

@markrutten

I think we disagree. Not about which algorithm is right or wrong, but about how to say things.

My concern is not about which algorithm is right or wrong. In fact, I find Algorithm (2) much more interesting than (1). My concern is to have in Hipparchus as many of the possibilities found in the literature as possible.

For instance, [1] and [2] present both implementations. Neither of these references accuses either algorithm of being wrong; they just present both. Therefore, I find it important, for a library that aims to offer its users as many possibilities as possible, to have both.

Again, I prefer Algorithm (2) too! I’m just asking about two possibilities instead of just one in the code :slight_smile:

[1] James Woodburn, Vincent Coppola, “Analysis of Relative Merits of Unscented and Extended Kalman Filters in Orbit Determination”, AAS Paper 19-781, 2019.

[2] Van der Merwe, Rudolph, and Eric A. Wan, “The Square-Root Unscented Kalman Filter for State and Parameter-Estimation”, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Vol. 6, IEEE, 2001.

Regards,
Bryan

I am forgetting my manners. Welcome to the Orekit forum @Not_A_Baysean_Fan

Thanks for your patience @bcazabonne.

As you say, that first reference states “In the general UKF formulation, the transformed state samples may then be used directly in measurement processing or formed into a sample mean and covariance, where the new covariance is resampled prior to measurement processing. Inclusion of process noise as described in Equation 24 dictates that our UKF implementation use the latter strategy of re-sampling an updated state-error covariance.”

Your second reference is one of the most cited UKF papers, but I’m confident that there’s a typo in their formulation (both the standard and the square-root) as I showed above. That’s really frustrating!

I’m reluctant to implement something that I know is wrong.

I have changed the code (to option 2) ready for a pull request, but need to do some more work to understand the tests (the radar test isn’t matching perfectly). I’ll give option 1/2 some more thought …

Apologies for all of the posts …

The reference below is a more comprehensive paper by Wan and van der Merwe. In the additive form of the UKF (Table 7.3, page 233) they are more careful about distinguishing between the sigma points used for state prediction and those used to calculate the measurement sigma points (starred vs non-starred symbols). They have a non-standard way of augmenting the sigma point set, but the algorithm notes say “alternatively, redraw a new set of sigma points that incorporate the additive process noise”. They don’t explicitly say that the algorithms in the square-root paper have a typo, but these more thorough descriptions are clearly different in that respect.

Wan, Eric A., and Rudolph van der Merwe, “The Unscented Kalman Filter”, in Kalman Filtering and Neural Networks, edited by Simon Haykin, John Wiley and Sons, 2001.

The “radar” test references the filterpy Python package. The latest release of filterpy (from the end of 2018) contains the same problem with the UKF! They’ve fixed it in the source code on GitHub in this change, but again, that’s really frustrating!

Hi!

Thanks for the analysis.

The fact that the second paper by Wan and van der Merwe also presents both versions of the algorithm confirms my wish that both versions be available in Hipparchus.

Best regards,
Bryan

Haha. Sorry @bcazabonne I’m not doing a very good job of explaining this.

There is only one UKF algorithm described in the Wan and van der Merwe chapter that I attached. It was published the same year as their square-root paper (the one you attached), but the book chapter’s description of the UKF is more thorough and, importantly, different from the square-root paper (i.e. it doesn’t contain the mistakes).

I was trying to use that as an argument for the fact that there is only one UKF, but unfortunately sometimes even the experts make mistakes when describing it.

Hi

Sorry for the delay… I tried to contact Dr. Wan to get additional information about the two algorithms. Unfortunately, I didn’t get any answer.

The paper presents two algorithms: one without process noise addition, which performs only one unscented transform, and another with process noise addition, which performs two unscented transforms.

Currently, Hipparchus’ UKF is a mix of the two (i.e., process noise addition but only one unscented transform). Having a mixed version is not a correct implementation, but having both proper versions is an interesting option.

Regards,
Bryan

Hi,

I just went through this forum discussion and through the references you shared. From what I read, I understand the following:

There are two versions/options of the UKF for inclusion of process noise. Woodburn [1] writes:
“Literature on the UKF offers two options for the inclusion of process noise: a purely additive approach where a process noise matrix is added to the covariance determined from definitive propagation, or augmentation of the state vector to include process noise states. The latter option, while preferred by some authors since it avoids a re-sampling operation to create new sigma points after the covariance is updated, appears to only be applicable to cases where process noise can be applied in a purely additive sense at the end of a time update without concern for the method of uncertainty evolution over the time update interval. The state vector augmentation approach does not alter the trajectories of the sigma points, it merely utilizes the mechanics of the unscented transform to affect the state error covariance to avoid an additive operation that necessitates re-sampling of the sigma points.”

I believe these two versions correspond to the two algorithms presented in Table 7.2 and Table 7.3 of the paper by Wan and van der Merwe [2] shared by @markrutten.

The algorithm in Table 7.2 requires augmentation of the state: “the state is redefined as the concatenation of the original state and noise variables” (p.230 of [2]):
    x^{a}_{k} = [\, x_{k}^{\top} \quad v_{k}^{\top} \quad n_{k}^{\top} \,]^{\top}
where x is the original state, v is the process noise and n is the observation noise.
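
For completeness, and as I read Table 7.2 (my notation, not a quote from [2]), the sigma points are then drawn from the augmented mean and a block-diagonal augmented covariance, so the noise terms ride through the prediction alongside the state:

    P^{a} = \begin{bmatrix} P & 0 & 0 \\ 0 & Q & 0 \\ 0 & 0 & R \end{bmatrix}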

The algorithm in Table 7.3 does not require this state augmentation, but instead requires augmenting or updating the sigma points after the predict step before doing the measurement update. This can be done by either adding sigma points (Eq 7.56 of [2]) or redrawing a complete new set of sigma points (footnote 6 on p.233 of [2]).
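
As I read it, the two alternatives look roughly as follows, with $\chi^{*}$ the propagated sigma points, $\chi^{*}_{0}$ the central one, $\gamma$ the sigma-point scaling factor, and $\hat{x}^{-}$, $P^{-}$ the predicted mean and covariance including $Q$ (my notation, paraphrasing [2]):

    % Adding sigma points (Eq 7.56 of [2]): keep the propagated set and
    % append points that carry the process noise:
    \chi = [\; \chi^{*} \quad \chi^{*}_{0} + \gamma\sqrt{Q} \quad \chi^{*}_{0} - \gamma\sqrt{Q} \;]

    % Redrawing (footnote 6 of [2]): discard the propagated set and draw a
    % complete new one from the predicted distribution:
    \chi = [\; \hat{x}^{-} \quad \hat{x}^{-} + \gamma\sqrt{P^{-}} \quad \hat{x}^{-} - \gamma\sqrt{P^{-}} \;]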

[1] James Woodburn, Vincent Coppola, “Analysis of Relative Merits of Unscented and Extended Kalman Filters in Orbit Determination”, AAS Paper 19-781, 2019.
[2] Wan, Eric A., and Rudolph van der Merwe, “The Unscented Kalman Filter”, in Kalman Filtering and Neural Networks, edited by Simon Haykin, John Wiley and Sons, 2001.

I have in the past implemented the UKF wrongly myself: I neither augmented the state, nor augmented/redrew the sigma points. I realised this when I saw that the innovation covariance did not include the process noise, which is exactly @markrutten’s point, if I understand correctly.

Best,
David

Thank you @dgondelach for your reply.

I’m still startled by the ambiguity that the presence of these two algorithms can cause.

I agree 100% that the Table 7.3 algorithm is more interesting. The fact that both algorithms appear in the literature encouraged me to integrate both possibilities in Hipparchus, in order to be as generic as possible.

Since your experience and @markrutten’s have shown that we should only have the Table 7.3 algorithm, I’m fine with that.

Regards,
Bryan

Thanks @bcazabonne and @dgondelach for your time exploring this. It is confusing … as David explained, both algorithms (Tables 7.2 and 7.3) introduce process noise, but in different ways.

Hi everyone,

I was looking at the pull requests in Hipparchus and found this one, which references this discussion. The request was, however, opened while this discussion was still ongoing.

Is there any consensus or conclusion on this? Should we merge the PR as is or should it be updated?

I think that the changes in my pull request (or something similar) are necessary. “Consensus” might be an exaggeration, but I interpreted the last post by @bcazabonne as reluctant agreement?

I would just like to see the impact on Orekit and on the unit tests.

The Orekit unit tests are fine. Apologies for not making a pull request with the changes … I was waiting for the change in Hipparchus to happen first.

I’m a bit surprised that I didn’t need to change too many acceptance thresholds. I guess that the process noise used is very small so the differences in implementation don’t influence the results?

There was one issue in “UnscentedKalmanModelTest”, where the state covariance was not positive-definite after the first measurement update. I changed the initial state covariance to be diagonal and the test ran with only a small change to some test tolerances.

The Orekit code changes aren’t huge. The way that I’ve implemented the “getPredictedMeasurements” method in “UnscentedKalmanModel” looks like this:

    /** {@inheritDoc} */
    @Override
    public RealVector[] getPredictedMeasurements(RealVector[] predictedSigmaPoints, MeasurementDecorator measurement) {

        // Observed measurement from decorator
        final ObservedMeasurement<?> observedMeasurement = measurement.getObservedMeasurement();

        // Initialize arrays of predicted measurements
        final RealVector[] predictedMeasurements = new RealVector[predictedSigmaPoints.length];

        // Initialize the relevant states used for measurement estimation
        final SpacecraftState[][] statesForMeasurementEstimation = new SpacecraftState[predictedSigmaPoints.length][builders.size()];

        // Loop on builders
        for (int k = 0; k < builders.size(); k++) {

            // Convert sigma points to spacecraft states
            for (int i = 0; i < predictedSigmaPoints.length; ++i) {
                RealVector predictedSigmaPoint = predictedSigmaPoints[i]
                        .getSubVector(orbitsStartColumns[k], orbitsEndColumns[k] - orbitsStartColumns[k]);
                Orbit orbit = orbitTypes[k].mapArrayToOrbit(predictedSigmaPoint.toArray(), null, angleTypes[k],
                        observedMeasurement.getDate(), builders.get(k).getMu(), builders.get(k).getFrame());
                statesForMeasurementEstimation[i][k] = new SpacecraftState(orbit);
            }
        }

        // Loop on sigma points to predict measurements
        for (int i = 0; i < predictedSigmaPoints.length; i++) {
            final EstimatedMeasurement<?> estimated = observedMeasurement.estimate(currentMeasurementNumber, currentMeasurementNumber,
                    KalmanEstimatorUtil.filterRelevant(observedMeasurement,
                            statesForMeasurementEstimation[i]));
            predictedMeasurements[i] = new ArrayRealVector(estimated.getEstimatedValue());
        }

        return predictedMeasurements;
    }