Possible bug in reduced chi-square metric computation

Hello,

When performing batch least-squares estimation and inspecting the LeastSquaresOptimizer.Optimum returned by Orekit, we noticed that the reduced chi-square metric returned by the getReducedChiSquare(n) method (inherited from org.hipparchus.optim.nonlinear.vector.leastsquares.LeastSquaresProblem.Evaluation) differed from our own analysis of the residuals. After some checks, it appears that getReducedChiSquare(n) uses the total number of measurements to compute the RCS, whereas only the “used” measurements should be considered for this metric.
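To illustrate the issue numerically, here is a minimal sketch in plain Java (no Orekit or Hipparchus dependency; the method name and layout are illustrative, not the actual library code) of the reduced chi-square formula, RCS = chi² / (observationSize − numberOfFittedParameters). If observationSize counts every measurement ever fed to the estimator, the denominator is too large once some measurements are rejected:

```java
// Toy sketch of the reduced chi-square computation. Residuals are
// assumed to be already weighted (r_i / sigma_i), as in Hipparchus.
public class ReducedChiSquareSketch {

    public static double reducedChiSquare(double[] residuals, int numParams) {
        double chi2 = 0.0;
        for (double r : residuals) {
            chi2 += r * r;
        }
        // Denominator: number of scalar residuals minus fitted parameters.
        return chi2 / (residuals.length - numParams);
    }

    public static void main(String[] args) {
        double[] all  = {1.0, -1.0, 2.0, 5.0}; // includes an outlier later rejected
        double[] used = {1.0, -1.0, 2.0};      // only the accepted measurements

        // Using all 4 residuals: chi2 = 31, denominator = 4 - 2 = 2
        System.out.println(reducedChiSquare(all, 2));   // prints 15.5
        // Using only accepted residuals: chi2 = 6, denominator = 3 - 2 = 1
        System.out.println(reducedChiSquare(used, 2));  // prints 6.0
    }
}
```

The two results differ both in the numerator (the outlier's residual) and in the denominator, which is the discrepancy described above.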

Could you confirm if this is the case?

Thank you very much for your time

Upping this so that hopefully somebody takes a look.

Do you mean AbstractEvaluation at line 99 should use a reduced number of observations?

What exactly do you mean by “used measurements”?

Hi Luc,

Yes, I mean that the evaluation at line 99 that you mention should use a different number of observations. By “used measurements” I mean the following: during the batch least-squares process, observations are discarded across iterations for many possible reasons. Thus, the getReducedChiSquare method should compute its value with the currently accepted observations only, not counting the rejected ones.

According to our tests, the returned value includes all observations, both accepted and discarded, in the denominator. The reason could be that the variable observationSize is set at the beginning of the batch least-squares process (when no observation has been rejected yet) and is not updated as the iterations go on.

Thanks for taking a look at it.


Hi,

Looking a bit closer, I think the fix should be in Orekit, since that is where the notion of rejected measurements is introduced. So BatchLSEstimator could have its own implementation of LeastSquaresProblem.Evaluation that filters out the discarded data. Does that make sense?
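As a hedged sketch of that idea, the snippet below models the proposal with a toy Evaluation interface and a decorator that drops the residuals of rejected measurements. These are not the real Hipparchus or Orekit classes (the actual LeastSquaresProblem.Evaluation interface is larger); all names here are illustrative:

```java
import java.util.Arrays;

// Toy model of "an Evaluation that filters out discarded data".
public class FilteredEvaluationSketch {

    public interface Evaluation {
        double[] getResiduals();
        int getObservationSize();

        default double getReducedChiSquare(int numParams) {
            double chi2 = Arrays.stream(getResiduals()).map(r -> r * r).sum();
            return chi2 / (getObservationSize() - numParams);
        }
    }

    // Decorator keeping only the residuals of accepted measurements,
    // so both numerator and denominator ignore the rejected ones.
    public static class FilteredEvaluation implements Evaluation {
        private final double[] accepted;

        public FilteredEvaluation(Evaluation inner, boolean[] rejected) {
            double[] all = inner.getResiduals();
            int n = 0;
            for (boolean r : rejected) {
                if (!r) n++;
            }
            accepted = new double[n];
            int k = 0;
            for (int i = 0; i < all.length; i++) {
                if (!rejected[i]) accepted[k++] = all[i];
            }
        }

        public double[] getResiduals() { return accepted; }
        public int getObservationSize() { return accepted.length; }
    }

    public static void main(String[] args) {
        double[] residuals = {1.0, -2.0, 10.0};      // last one is an outlier
        boolean[] rejected = {false, false, true};

        Evaluation raw = new Evaluation() {
            public double[] getResiduals() { return residuals; }
            public int getObservationSize() { return residuals.length; }
        };
        Evaluation filtered = new FilteredEvaluation(raw, rejected);

        System.out.println(raw.getReducedChiSquare(1));      // 105 / 2 = 52.5
        System.out.println(filtered.getReducedChiSquare(1)); // 5 / 1 = 5.0
    }
}
```

The decorator approach has the advantage that the Hipparchus-level computation stays untouched; only the view of the residual vector handed to it changes.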

Cheers,
Romain.

I think you are right, but I am not sure whether the original post refers to Orekit or Hipparchus.

Hi to both,

First, thank you very much for taking the time to look at this post. Second, it’s true that I originally opened the post for Hipparchus because that is where the low-level function computing the reduced chi-square can be found. But it’s also true that it perhaps makes more sense for the fix to be made at the upper computation level, in Orekit rather than Hipparchus. We are using Orekit, which in turn calls Hipparchus.

Hi,

I believe this would solve the issue:

What do you think?

Cheers,
Romain.


Hi Romain,

Thanks a lot for the help. To be honest, I’m not a Java expert (we are working with the Orekit Python wrapper), so I cannot fully confirm that your merge request solves the issue. I don’t directly see where the new class takes observationSize into account, but it’s probably because you define that variable with super(), making it available to the methods inside optimumIn (could that be?). In any case, the tests seem to show the fix is working, so I agree with the MR.

Again, thanks a lot for your time

Regards,

Alejandro Cano

Yes, I’m using a Hipparchus abstract class that already takes the total number of measurements as input, which here accounts for the rejected ones.
What version of the Python wrapper are you using? If it’s the JPype one, you could replace the .jar as explained here. I could maybe provide you with it if you can’t compile Java on your side.

Cheers,
Romain.

Hi Romain,

The observationSize should be the number of “scalar” observations. Thus, each accepted measurement should count n times, where n is its dimension.

For example, in the test you added, the observationSize should be (10 PV measurements) × (6 scalar measurements per PV) = 60. Cf. the observationSize of the Hipparchus optimum, which is 66, not 11.
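The counting rule above can be sketched as follows (a toy standalone method, not the actual Orekit code; the example values mirror the test described in this thread, assuming 11 PV measurements of dimension 6 with one of them rejected):

```java
import java.util.Arrays;

public class ObservationSizeSketch {

    // observationSize counts scalar components: each accepted measurement
    // contributes its dimension (e.g. 6 for a PV measurement: 3 position
    // components + 3 velocity components). Rejected ones contribute nothing.
    public static int observationSize(int[] dimensions, boolean[] rejected) {
        int size = 0;
        for (int i = 0; i < dimensions.length; i++) {
            if (!rejected[i]) {
                size += dimensions[i];
            }
        }
        return size;
    }

    public static void main(String[] args) {
        int[] dims = new int[11];
        Arrays.fill(dims, 6);                 // 11 PV measurements, dimension 6 each
        boolean[] rejected = new boolean[11];
        rejected[10] = true;                  // one measurement discarded

        System.out.println(observationSize(dims, rejected));        // 10 * 6 = 60
        System.out.println(observationSize(dims, new boolean[11])); // 11 * 6 = 66
    }
}
```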

Cheers,

Alberto

Hi Alberto,

Fair enough, I’ll update the count accordingly.

Btw, I think we’ve got a case for a patch release in the very near future.

Cheers,
Romain.

Hi Romain, thanks for the clarifications. We are not using the JPype Python wrapper yet, but we are planning to move to it soon. In any case, no need to rush a hot-fix of the bug for us; we already have a workaround in place and prefer to wait for the official fix and JPype.

Thank you very much,

Best regards,

Alejandro Cano

Hi Alberto,

You are completely right: the observationSize should count every “measurement”, not only every “observation” (which can be composed of multiple measurements, such as the PV coordinates). Thanks for noticing it in Romain’s bug fix.

Best regards,

Alejandro Cano