Efficient way to load DEM tiles in Rugged Python

yzokras · August 1, 2019, 2:27pm

Dear all,

I am working on using DEMs for direct location in Rugged in Python.

I use the srtm python library to download and load SRTM tiles: https://anaconda.org/conda-forge/srtm.py

See below the source code of the PythonTileUpdater subclass I’ve written. When the updateTile method is called, it creates a tile of size 1 degree latitude and longitude centered on the latitude,longitude passed as argument (so 0.5 deg on each side of the center).

I decided for this tile size because of our satellite’s swath and because the SRTM tiles are 1 degree big too (although my code does not follow the SRTM tiles because the longitude/latitude center is not an integer, but the srtm library does not care and loads from several SRTM files).

However, I cannot use the full resolution of the SRTM data (1 arcsec, so 3600 * 3600 points per tile) because it takes way too long. Right now the SRTM data is resampled to 100 * 100 points per tile. It takes approximately 15 seconds to perform direct location on 600 pictures (with 4 sensor points each, the four corners of the picture) spread over one half-orbit. I tried with 1000 * 1000 points per tile, but after 10 minutes it was not finished yet.

What could I do to make the tile loading more efficient? It seems hard to avoid the double loop on row/columns… Are tiles reloaded at each direct location, or only when the border of the tile is reached?

Thank you
Cheers
Clément

import srtm
import numpy as np
from org.orekit.rugged.raster import PythonTileUpdater

class DEMTileUpdater(PythonTileUpdater):
    def __init__(self):
        super(DEMTileUpdater, self).__init__()
        self.n_lat = 101  # Number of rows in tile 
        self.n_lon = 101  # Number of columns in tile
        self.tile_size_lat = float(np.deg2rad(1.0))  # Tile size in radians
        self.tile_size_lon = float(np.deg2rad(1.0))  # Tile size in radians
        self.delta_lat = self.tile_size_lat / self.n_lat  # row sample size in radians
        self.delta_lon = self.tile_size_lon / self.n_lon  # column sample size in radians
        self.elevation_data = srtm.get_data()  # srtm object to get srtm data
        
    def updateTile(self, latitude, longitude, tile):
        min_latitude = latitude - self.tile_size_lat / 2.0
        min_longitude = longitude - self.tile_size_lon / 2.0
        
        tile.setGeometry(min_latitude,  # minLatitude
                         min_longitude,  # minLongitude
                         self.delta_lat,  # latitudeStep
                         self.delta_lon,  # longitudeStep
                         self.n_lat,  # latitudeRows 
                         self.n_lon)  # longitudeColumns
        
        for i in range(0,self.n_lat):
            for j in range(0,self.n_lon):
                """
                Get elevation from srtm reader
                """
                elevation = self.elevation_data.get_elevation(np.rad2deg(min_latitude + i*self.delta_lat),
                                                              np.rad2deg(min_longitude + j*self.delta_lon))
                if elevation is None:
                    elevation = 0.0
                tile.setElevation(i, j, float(elevation))

Which is then instantiated and given to the Rugged builder:

from org.orekit.rugged.api import RuggedBuilder, AlgorithmId, EllipsoidId, BodyRotatingFrameId, InertialFrameId
from org.orekit.rugged.errors import RuggedException
from org.orekit.utils import CartesianDerivativesFilter, AngularDerivativesFilter
from org.orekit.rugged.raster import TileUpdater

ruggedBuilder = RuggedBuilder()
ruggedBuilder.setAlgorithm(AlgorithmId.DUVENHAGE)
ruggedBuilder.setDigitalElevationModel(DEMTileUpdater(), 8)
ruggedBuilder.setEllipsoid(EllipsoidId.WGS84, 
                           BodyRotatingFrameId.ITRF)
ruggedBuilder.setTimeSpan(absolute_date_init, 
                          absolute_date_end, 
                          0.01, 
                          5.0 / lineSensorTop.getRate(0.0))
ruggedBuilder.setTrajectory(InertialFrameId.TOD,
    satellitePVList, 4, CartesianDerivativesFilter.USE_P,
    satelliteQList,  4,  AngularDerivativesFilter.USE_R)
ruggedBuilder.addLineSensor(lineSensorTop)
ruggedBuilder.addLineSensor(lineSensorBottom)
rugged = ruggedBuilder.build()

And then the direct location:

for date in dates_tm:
    absolute_date = datetime_to_absolutedate(date)
    
    losUl = lineSensorTop.getLOS(absolute_date, 0)
    ulPoint = rugged.directLocation(absolute_date, sensorPositionTop, losUl)

luc · August 1, 2019, 6:34pm

I don’t know about the Python wrapper, but here are some data from the underlying Rugged library.

The tiles are lazily loaded, i.e. when location computation needs the elevation at some location and no tiles cover this, then the appropriate tile is loaded. Several tiles are kept in memory up to a user-provided number. At least 4 tiles in memory are needed to deal with the swath passing over a corner, but a classical setting is to keep 8 tiles in memory. When the maximum number is exhausted and a new tile must be loaded, the Least Recently Used (LRU) tile is evicted. Look at the setDigitalElevationModel method in RuggedBuilder, and in particular the newMaxCachedTiles parameter which we suggest to set at 8 to avoid reloading tiles when approaching corners.

On a very old laptop (about 5 years old and not a very powerful one), we achieved about 90000 direct locations per seconds in a classical mono-thread application. Rugged is used operationally for the Sentinel 2 Image Processing Facility which produces 1.7 tera bytes of data (per satellite I think) every day. So for 600 images with only 4 points, it should take… 0.03 seconds, not 15 seconds. The rate you achieve seems really low to me.

Tiles loading should not be the bottleneck for classical applications. As an example for Sentinel 2 what have very large pixels (10m or 60m depending on the wavelength), a 1° tile contains either 123 millions sensor pixels or 3 millions sensor pixels, which means on average a new tile is loaded once every several millions direct locations. The loading time is dwarfed by the direct location computation, and this computation is fast as I said before.

petrus.hyvonen · August 4, 2019, 3:25pm

Hi,

It would be interesting to run some benchmarks on comparing pure java with the python-java integration. I have not done so, but this is an example where I think the penatly could be quite high - when there are many calls between java and python for an action. It sounds a lot penalty, cannot say if it is reasonable but for pure java (as well as for pure python) you have some optimizations (JIT for java I think) that can really reduce the overhead of frequent calls. Is the lazy loader “hand-made” or a feature of java - in case the latter I could imagine that the gateway forces it to evaluate, don’t know.

Let us know if you come to some insight

luc · August 4, 2019, 8:25pm

The lazy holder is just simple custom code in the library, nothing fancy. It is located
in class TileCache which holds an array of Tiles and performs the LRU/eviction algorithm
and calls the user provided TileUpdater to load a new tile when needed.

We know for certain that at least one other closed-source interface between Python and Rugged exists and is used, without any adaptation on the Rugged side. It was developed using your Python/Orekit wrapper as the example, so is probably quite similar to what you set up

@yzokras, in your example code, how is the Get elevation from srtm reader part implemented? Doest it reopend the fle for each single DEM cell or not? We recommend having the underlying reader load the DEM tile in one temporary storage and then looping over the storage to call tile.setElevation to transfer this into Rugged and then de-allocate the temporary storage.