MSAFE files URL no longer valid

cmasson · October 20, 2023, 12:35pm

Hi everyone,

It seems the MSAFE files URL (https://www.nasa.gov/sites/default/files/atoms/files) in the orekit-data repository update script is no longer valid: I now get a 404 error.
The file for July 2023 in the orekit-data repository is empty, which seems to confirm that the problem is not just on my end.

I was able to find the file for September on this page, which links to https://www.nasa.gov/wp-content/uploads/2019/04/sep2023f10-prd.txt, but I couldn’t find the previous files on the server. The naming convention has also changed (the _prd became -prd).

Does anyone know of another URL which still has the archive?

Cheers,
Clément

luc · October 20, 2023, 7:21pm

I just asked NASA about this. We will see their answer.

luc · October 27, 2023, 9:33am

No answer from NASA yet, but I tracked down the missing files; they seem to have switched back and forth between two URLs.
As you mentioned, the file names have changed. For compatibility with the current Orekit version, I have renamed the September 2023 file to the old naming convention (with ‘_’), but changed the script so it will now download files with the new name. I have also made an update in Orekit for this: commit 0b675.
Another change was that the September file was published with old Mac line endings (i.e. CR instead of DOS CRLF or Unix LF); I have adapted the script so it converts to Unix line endings on the fly.
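
For illustration, the line-ending conversion described above could be sketched in Python as follows. This is not the code used in the update script; the function name is hypothetical.

```python
def to_unix_line_endings(data: bytes) -> bytes:
    """Convert DOS (CRLF) and old Mac (lone CR) line endings to Unix (LF)."""
    # Replace CRLF first, so that a DOS line ending does not turn into two LFs.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

Working on bytes rather than text avoids any dependence on the platform's own newline translation.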
We will see what happens when the October file is published…

cmasson · October 27, 2023, 9:41am

Hi Luc,

Thanks for the update! I have just had another look at the website and found the page for the archived forecasts. I don’t know how I missed it the other day; it’s in big bright letters.

However, it also has links to future files (which return a 404), so I guess the script should be able to check for that? We’ll see what MSFC says.

Cheers,
Clément

luc · October 27, 2023, 9:43am

Yes, the script checks the retrieved file. If we get a 404 page, it does not match the expected header, so we delete it and stop the retrieval loop.
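
A minimal Python sketch of this kind of check, not the actual script code: the function name is hypothetical, and the "TABLE 3" marker is the line-1 check mentioned later in this thread.

```python
import os

EXPECTED_MARKER = "TABLE 3"  # line-1 marker quoted later in this thread


def keep_if_valid(path: str) -> bool:
    """Return True if the file starts with the expected header.

    Otherwise (for example when the server returned a 404 HTML page),
    delete the file and return False so the caller can stop its loop.
    """
    with open(path, "r", encoding="ascii", errors="replace") as f:
        first_line = f.readline()
    if EXPECTED_MARKER in first_line:
        return True
    os.remove(path)
    return False
```

A caller would typically break out of its month-by-month retrieval loop the first time this returns False.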

adestefano · January 8, 2024, 11:36pm

Hi all,

I am one of the NASA contacts for the MSAFE files and webpage, so I thought I’d comment here for anyone who was wondering what has happened to our webpage over the past several months. Over this last summer or so we were forced into moving to a new webpage, as the NASA site domain was getting a make-over. There were a lot of moving parts into getting access to our new webpage, although we did get help with migrating most of our content over.

I was able to update both our main page and the archive page today, see:
https://www.nasa.gov/solar-cycle-progression-and-forecast/
https://www.nasa.gov/solar-cycle-progression-and-forecast/archived-forecast/

Some annoying inconsistencies you will find:

All/most of the “older” data has a prepended “~/wp-content/uploads/2019/04/”
There’s a few that are completely strange, like “~/sites/default/files/atoms/files/aug2023f10_prd.txt”
Currently, when I upload the files, the format seems to be “~/wp-content/uploads/YYYY/MM/mmmYYYY*-*.txt”. The first year and month are set at the moment when I upload them, and the second month and year are those of the actual solar forecast.
The new web server system automatically converts any underscore to hyphen. I don’t know why. I asked the NASA web devs if this can be changed or if we can change the URL once the file is uploaded (currently we can’t). TBD on this.
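
To make the pattern above concrete, here is a small Python sketch that builds such a URL. The base URL and the "f10-prd" suffix are taken from links quoted earlier in this thread; the function and parameter names are hypothetical, and the pattern may not hold for every file.

```python
MONTH_ABBREVIATIONS = ["jan", "feb", "mar", "apr", "may", "jun",
                       "jul", "aug", "sep", "oct", "nov", "dec"]
BASE_URL = "https://www.nasa.gov/wp-content/uploads"


def msafe_upload_url(upload_year: int, upload_month: int,
                     forecast_year: int, forecast_month: int,
                     suffix: str = "f10-prd") -> str:
    """Build a URL of the form .../uploads/YYYY/MM/mmmYYYY<suffix>.txt.

    The first year/month pair is the upload date, the second one is the
    month and year of the forecast itself.
    """
    mmm = MONTH_ABBREVIATIONS[forecast_month - 1]
    return (f"{BASE_URL}/{upload_year:04d}/{upload_month:02d}/"
            f"{mmm}{forecast_year:04d}{suffix}.txt")
```

For example, `msafe_upload_url(2019, 4, 2023, 9)` reproduces the September 2023 link quoted at the top of the thread.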

I thought of an alternative approach that I can provide if you deem it useful. On the archive webpage, I could create a list of URLs for each matrix of archived data. That way you could have the list(s) without having to know what magical format the system decided to give us.

Hope that helps, and I will be waiting for any feedback,
Anthony

luc · January 9, 2024, 11:12am

Hi @adestefano welcome!

I am updating the update.sh script so it handles everything automatically. It already manages to locate the file in either the /uploads/ or the /atoms/ URLs, and makes several attempts to guess the year/month in the former case, starting with the same date as the file content, but also trying up to 4 months in the past and the future (properly wrapping around year boundaries). This would probably work for a while, even if you need to upload some files in advance or later, as you needed to do for the last ones.
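
As an illustration only, the search over upload dates could look like the Python sketch below. The function names are hypothetical, and the order in which past and future months are tried is an assumption, not something taken from update.sh.

```python
def shifted_year_month(year: int, month: int, offset: int) -> tuple:
    """Shift a (year, month) pair by offset months, wrapping across years."""
    index = year * 12 + (month - 1) + offset
    return index // 12, index % 12 + 1


def candidate_upload_dates(year: int, month: int, max_offset: int = 4) -> list:
    """List the (year, month) upload dates to try for a given forecast date.

    Starts with the forecast's own date, then alternates between past and
    future months (ASSUMED order) up to max_offset months away.
    """
    candidates = [(year, month)]
    for k in range(1, max_offset + 1):
        candidates.append(shifted_year_month(year, month, -k))
        candidates.append(shifted_year_month(year, month, k))
    return candidates
```

Counting months from year zero keeps the wrap-around logic to a single integer division, which avoids special cases for December and January.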

I still had two problems: the link to dec2023f10-prd.txt in the archived forecast page points to a dec2023ssn-prd.txt file, not an f10 file. It also seems the files for upcoming months (February 2024) already exist in the proper format but correspond to last year’s files. This is not much of a problem, as I can check the content and delete them, just as I delete the error 404 page when I retrieve it. I guess you need to prepare files for a whole line when you start a new year, so don’t bother about that. Only the dec2023f10-prd.txt file is important for me now.

I’ll push the updated script probably this evening, European time.

Thanks for the explanations about URL changes.

luc · January 10, 2024, 3:11pm

I succeeded in updating all files up to January 2024 which was just published. So now there is no gap in our convenience repository.
What is more important is that the update.sh script we also provide has been adapted a lot to take care of all URL and file name changes. It is also able to retrieve files that were not uploaded in the month they refer to (for example, the file for September 2023 was uploaded in January 2024), and we also added another check on file content. Before this change, we only checked that line 1 contained “TABLE 3”, but now we also check that line 14 corresponds to the current month (the rationale is that there is a 7-line header and each file starts with data for the 6 months prior to the current month, so line 14 is the first line for the current month).
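
The two-step content check could be sketched in Python as below. This is not the update.sh code. The layout of a data line (a decimal year followed by an upper-case three-letter month name) is an assumption made for this example, and the function name is hypothetical.

```python
import re

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

# ASSUMPTION: a data line starts with something like "2024.0417  JAN ...".
DATA_LINE = re.compile(r"^\s*(\d{4})\.\d+\s+([A-Z]{3})\b")


def looks_like_msafe_file(text: str, year: int, month: int) -> bool:
    """Check line 1 for "TABLE 3" and line 14 for the expected month."""
    lines = text.splitlines()
    if len(lines) < 14 or "TABLE 3" not in lines[0]:
        return False
    # 7 header lines + 6 months of earlier data: line 14 is the current month.
    match = DATA_LINE.match(lines[13])
    if match is None:
        return False
    return int(match.group(1)) == year and match.group(2) == MONTHS[month - 1]
```

A check like this rejects both a 404 page (no "TABLE 3" on line 1) and a stale file from the previous year (wrong year on line 14).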

Thanks to @adestefano for providing all the data files!

adestefano · January 10, 2024, 5:32pm

Thanks, @luc !

On the hyperlinks to the new year and upcoming months, what I can do is have a URL that should point you to the 404 page so it doesn’t erroneously point you to the previous year’s data. If I convert those future hyperlinks to just text, I then have to convert them back later when that month occurs, and it makes the formatting not so easy. I am learning that HTML is very picky about how you specify spaces, and the font that they have doesn’t have equal-width characters.

arletty · February 20, 2024, 12:27pm

@luc Hello! Three sites do not work for me now:

Works: usno_ser7_url=https://maia.usno.navy.mil/ser7

Does not work: iers_rapid_url=https://datacenter.iers.org/data

Does not work (no such path): msafe_url_uploads=https://www.nasa.gov/wp-content/uploads

Does not work (no such path): msafe_url_atoms=https://www3.nasa.gov/sites/default/files/atoms/files

Works: cssi_url=ftp://ftp.agi.com/pub/DynamicEarthData

luc · February 20, 2024, 12:50pm

iers_rapid_url and msafe_url_uploads worked for me today.
Note that when retrieving MSAFE data, several different URLs and naming patterns are tried, and the download loop in update.sh stops when all URLs fail to download one file (which typically happens if it has not been published yet), so it is normal that the script outputs several error messages at the end.
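
The behaviour described above can be sketched in Python as follows. This is a simplified illustration, not the update.sh implementation: `fetch` and `urls_for` are hypothetical callables supplied by the caller, and `fetch` is assumed to return None when a download fails.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def first_successful(urls: Iterable[str],
                     fetch: Callable[[str], Optional[str]]) -> Optional[str]:
    """Try each candidate URL in turn and return the first content obtained."""
    for url in urls:
        content = fetch(url)  # every failed attempt may log an error
        if content is not None:
            return content
    return None


def retrieve_until_missing(months: Iterable[str],
                           urls_for: Callable[[str], Iterable[str]],
                           fetch: Callable[[str], Optional[str]]
                           ) -> List[Tuple[str, str]]:
    """Retrieve files month by month, stopping at the first month for
    which every candidate URL fails (typically a file not yet published)."""
    retrieved = []
    for month in months:
        content = first_successful(urls_for(month), fetch)
        if content is None:
            break
        retrieved.append((month, content))
    return retrieved
```

With this structure, a burst of failed attempts for the last month is the normal way for the loop to end rather than a sign of a broken configuration.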

arletty · February 22, 2024, 1:16pm

My link also works, but it displays the “404” page:

luc · February 22, 2024, 6:07pm

Yes, but this is handled by the script.
It loops until it gets a 404 error.