CDM parser normalizes strings

Hi Orekit community,
I’m using the CDMParser to parse CDM files, and I need to get the raw content of the entries.
However, during token processing the strings are normalized by the getContentAsNormalizedString method of the ParseToken class,
which replaces every ‘_’ with a space and collapses several consecutive spaces into a single space.
Is there a way to make the parser use the getRawContent method instead, so that strings are not normalized?
How can I implement what I need without modifying Orekit itself? Is it possible?
It would be better if the parser could be configured to choose how the entries are fetched. What do you think?
Thanks in advance!
Best Regards,

Hi @spaceseven,

Welcome to the Orekit community!

Unfortunately, I don’t think so… but maybe @luc would know better about the ParseToken class.

You’re probably right, it would be better for users to have the choice. But I’m afraid it will require a lot of work to implement this.
It would be a very different implementation, since ParseToken naturally converts the parsed strings into higher-level objects (enums, List, FrameFacade, etc.).

Best regards,

I understand the need, but the CCSDS standards explicitly state that this normalization is mandatory.
For example, for the CDM, it is in a paragraph of CCSDS 508.0-B-1, which reads:
"In value fields that are text, an underscore shall be equivalent to a single blank. Individual blanks shall be retained (shall be significant), but multiple contiguous blanks shall be equivalent to a single blank."

The same statement is present in all CCSDS KVN formats; it is a common rule.
There is currently no way in Orekit to bypass this rule, and bypassing it would in any case open a can of worms, as the file would not be parsable by other readers that comply with the standard.
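
To make the normalization rule quoted above concrete, here is a minimal standalone Java sketch. It is not Orekit's code: the class name KvnNormalization and the method normalize are hypothetical names chosen for illustration, and the sketch only mirrors the behavior described in this thread (underscores become blanks, runs of blanks collapse to one).

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Orekit: illustrates the CCSDS KVN text rule.
public class KvnNormalization {

    // Matches two or more contiguous blanks.
    private static final Pattern MULTIPLE_BLANKS = Pattern.compile(" {2,}");

    // Replace every underscore with a blank, then collapse runs of blanks to one.
    public static String normalize(final String raw) {
        if (raw == null) {
            return null;
        }
        return MULTIPLE_BLANKS.matcher(raw.replace('_', ' ')).replaceAll(" ");
    }

    public static void main(final String[] args) {
        // Two raw spellings that the rule treats as the same value.
        final String a = "SATELLITE__A   B_C";
        final String b = "SATELLITE A B C";
        System.out.println(normalize(a));                      // SATELLITE A B C
        System.out.println(normalize(a).equals(normalize(b))); // true
    }
}
```

Under this rule, code that compares text values can normalize both sides first, so that spellings differing only in underscores or repeated blanks compare as equal.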

Hi @MaximeJ and @luc,
Thanks for the clarifications. I will adapt my code according to the instructions detailed in the highlighted section.
Thanks again for the useful support.
Best regards,