Extend GDAL PAM metadata model to a more general QGIS PAM model

Issue #898 resolved
Andreas Janz created an issue

Problem

Currently we restrict metadata handling to GDAL raster. Metadata IO is implemented using the GDAL API:

On the dataset level we use:

value = gdal.Dataset.GetMetadataItem(key, domain)
valueDict = gdal.Dataset.GetMetadata(domain)
valueDictDict = {domain: gdal.Dataset.GetMetadata(domain) for domain in gdal.Dataset.GetMetadataDomainList()}

gdal.Dataset.SetMetadataItem(key, value, domain)
gdal.Dataset.SetMetadata(valueDict, domain)
for domain, valueDict in valueDictDict.items(): gdal.Dataset.SetMetadata(valueDict, domain)

On the band level we use:

value = gdal.Band.GetMetadataItem(key, domain)
valueDict = gdal.Band.GetMetadata(domain)
valueDictDict = {domain: gdal.Band.GetMetadata(domain) for domain in gdal.Band.GetMetadataDomainList()}

gdal.Band.SetMetadataItem(key, value, domain)
gdal.Band.SetMetadata(valueDict, domain)
for domain, valueDict in valueDictDict.items(): gdal.Band.SetMetadata(valueDict, domain)

This is fully sufficient for data processing, but it can be limiting in GUI applications, where we usually work with QgsRasterLayer objects rather than gdal.Dataset objects.

Problem: the QgsRasterLayer class can't access GDAL metadata directly, so we need to re-open the layer source via gdal.Open. For read access this is usually fine, but writing new metadata items to the GDAL PAM aux.xml file fails, because QGIS overwrites the aux.xml file again when the QgsRasterLayer object is closed.

Proposed solution

We introduce another level of abstraction on top of GDAL PAM, using QGIS custom properties. Let's call it QGIS PAM.

On the dataset level we use:

value = layer.customProperty(f'QGISPAM/dataset/{domain}/{key}')

layer.setCustomProperty(f'QGISPAM/dataset/{domain}/{key}', value)

On the band level we use:

value = layer.customProperty(f'QGISPAM/band/{bandNo}/{domain}/{key}')

layer.setCustomProperty(f'QGISPAM/band/{bandNo}/{domain}/{key}', value)
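
The key layout above could be produced by one small helper, so that callers never assemble the strings by hand. This is only a sketch: `qgis_pam_key` is a hypothetical name, not part of QGIS or the EnMAP-Box API, and the empty string is assumed to stand for the default domain, as in GDAL PAM.

```python
from typing import Optional


def qgis_pam_key(key: str, domain: str = '', band_no: Optional[int] = None) -> str:
    """Build a QGIS PAM custom-property key (hypothetical helper).

    Without band_no the key addresses the dataset level,
    with band_no it addresses the given band.
    """
    if band_no is None:
        return f'QGISPAM/dataset/{domain}/{key}'
    return f'QGISPAM/band/{band_no}/{domain}/{key}'


print(qgis_pam_key('wavelength_units', 'ENVI'))  # QGISPAM/dataset/ENVI/wavelength_units
print(qgis_pam_key('wavelength', band_no=42))    # QGISPAM/band/42//wavelength
```

Note that the default domain leaves an empty path segment (the double slash) in the key.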

An application that queries metadata from a raster source should use the following priorities:

  1. Check QGIS PAM first.

  2. If not found, check GDAL PAM afterwards.

  3. If still not found, use a suitable fallback value.

Or in short: QGIS PAM shadows GDAL PAM!
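
The lookup order can be sketched with plain dictionaries. Everything here is an assumption made for illustration: `lookup_metadata` is a hypothetical function, `qgis_pam` stands in for the layer's custom properties, and `gdal_pam` stands in for the GDAL metadata, keyed by `(band_no, domain, key)` with `band_no=None` for the dataset level.

```python
from typing import Any, Dict, Optional, Tuple

GdalKey = Tuple[Optional[int], str, str]


def lookup_metadata(
        qgis_pam: Dict[str, Any], gdal_pam: Dict[GdalKey, Any],
        key: str, domain: str = '', band_no: Optional[int] = None, fallback: Any = None
) -> Any:
    """Return a metadata item, letting QGIS PAM shadow GDAL PAM."""
    scope = 'dataset' if band_no is None else f'band/{band_no}'
    qgis_key = f'QGISPAM/{scope}/{domain}/{key}'

    # 1. QGIS PAM wins; membership is tested, so a stored None also shadows
    if qgis_key in qgis_pam:
        return qgis_pam[qgis_key]

    # 2. otherwise ask GDAL PAM
    gdal_key = (band_no, domain, key)
    if gdal_key in gdal_pam:
        return gdal_pam[gdal_key]

    # 3. nothing found anywhere
    return fallback


gdal_pam = {(None, '', 'wavelength_units'): 'Micrometers'}
qgis_pam = {}
print(lookup_metadata(qgis_pam, gdal_pam, 'wavelength_units'))  # Micrometers

qgis_pam['QGISPAM/dataset//wavelength_units'] = 'Nanometers'
print(lookup_metadata(qgis_pam, gdal_pam, 'wavelength_units'))  # Nanometers

print(lookup_metadata(qgis_pam, gdal_pam, 'fwhm', fallback='n/a'))  # n/a
```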

If you need to set metadata in a processing algorithm: set it to GDAL PAM, so that GDAL can read it later!

If you need to set metadata in a GUI application: set it to QGIS PAM, so that it can be stored in the QML style file.

This process can be tedious and should be encapsulated in a utility class or method.
E.g. for EnMAP-Box processing, I will implement this in the RasterReader class:

from enmapboxprocessing.rasterreader import RasterReader
reader = RasterReader(layer)
reader.metadataItem(key, domain, bandNo)

Use case: handling wavelength information

Wavelength information handling is an important task in EnMAP-Box processing and GUI applications.

Currently there is no way to consistently set missing wavelength information on an already existing QgsRasterLayer, because changes to the GDAL PAM may get overwritten again by QGIS.

With the proposed method we can solve this problem, by setting wavelength information via:

defaultDomain = ''  # as in GDAL PAM
layer.setCustomProperty(f'QGISPAM/band/1/{defaultDomain}/wavelength', 460)
layer.setCustomProperty(f'QGISPAM/band/1/{defaultDomain}/wavelength_units', 'Nanometers')
...
layer.setCustomProperty(f'QGISPAM/band/177/{defaultDomain}/wavelength', 2409)
layer.setCustomProperty(f'QGISPAM/band/177/{defaultDomain}/wavelength_units', 'Nanometers')

Subsequent queries will now find proper wavelength information at the highest level (i.e. in the QGIS custom properties), e.g.:

reader = RasterReader(layer)
wavelength = reader.wavelength(bandNo)

Performance advantages (caching complex-to-parse information at QgsRasterLayer level)

The proposed method can also be used to improve metadata handling performance by implementing intelligent caching for more complex metadata sources, like wavelength information.

Extracting correct wavelength information from the GDAL PAM is a more complicated process. The information can be stored at GDAL Dataset level as a list of values, or as individual values at Band level. Wavelength units can also be specified at both levels. Running through all the possibilities may take some hundreds of milliseconds. For processing algorithms this is not an issue, but for interactive GUI applications it may be.
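
To illustrate why there are several cases to run through, here is a rough sketch of such a resolution on plain dictionaries. It is not the RasterReader implementation. The function names, the brace-wrapped list format (`'{0.46, 0.465}'`), the unit table and the preference for band-level values over dataset-level values are all assumptions made for this example.

```python
from typing import Dict, List, Optional

# conversion factors to nanometers (assumed unit names, lower-cased)
TO_NANOMETERS = {'nanometers': 1.0, 'micrometers': 1000.0}


def parse_value_list(text: str) -> List[float]:
    """Parse a list stored as one string, e.g. '{0.46, 0.465}' (assumed format)."""
    return [float(v) for v in text.strip().strip('{}').split(',') if v.strip()]


def resolve_wavelength(
        band_no: int, band_items: Dict[int, Dict[str, str]], dataset_items: Dict[str, str]
) -> Optional[float]:
    """Return the wavelength of band_no in nanometers, or None if unresolved."""
    band = band_items.get(band_no, {})

    # units may live at band level or at dataset level
    units = band.get('wavelength_units') or dataset_items.get('wavelength_units')
    if units is None or units.lower() not in TO_NANOMETERS:
        return None

    if 'wavelength' in band:  # individual value at band level
        value = float(band['wavelength'])
    elif 'wavelength' in dataset_items:  # list of values at dataset level
        values = parse_value_list(dataset_items['wavelength'])
        if not 1 <= band_no <= len(values):
            return None
        value = values[band_no - 1]
    else:
        return None

    return value * TO_NANOMETERS[units.lower()]


dataset_items = {'wavelength': '{0.46, 0.465, 0.47}', 'wavelength_units': 'Micrometers'}
band_items = {1: {'wavelength': '460', 'wavelength_units': 'Nanometers'}}
print(resolve_wavelength(1, band_items, dataset_items))  # band-level value is used
print(resolve_wavelength(3, band_items, dataset_items))  # falls back to the dataset-level list
```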

If layer wavelength information is required again and again, a good approach would be to read the information once and cache the result by storing it (redundantly) inside the QGIS custom properties.

Consecutive queries can use the prepared wavelength information directly!
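
A minimal sketch of this read-once-then-cache idea, again with a plain dictionary standing in for the layer's custom properties. `CachingWavelengthReader` and `slow_lookup` are hypothetical names. The real lookup would go through the GDAL PAM.

```python
from typing import Callable, Dict, List, Tuple


class CachingWavelengthReader:
    """Caches wavelength lookups under QGIS PAM style keys (illustrative only)."""

    def __init__(self, custom_properties: Dict[str, object],
                 slow_lookup: Callable[[int], Tuple[float, str]]):
        self.custom_properties = custom_properties  # stand-in for layer custom properties
        self.slow_lookup = slow_lookup  # expensive resolution: band_no -> (value, units)

    def wavelength(self, band_no: int) -> Tuple[float, str]:
        value_key = f'QGISPAM/band/{band_no}//wavelength'
        units_key = f'QGISPAM/band/{band_no}//wavelength_units'
        if value_key not in self.custom_properties:
            # first query: resolve once and store the result redundantly
            value, units = self.slow_lookup(band_no)
            self.custom_properties[value_key] = value
            self.custom_properties[units_key] = units
        return self.custom_properties[value_key], self.custom_properties[units_key]


calls: List[int] = []


def slow_lookup(band_no: int) -> Tuple[float, str]:
    calls.append(band_no)  # record how often the expensive path runs
    return 0.685, 'Micrometers'


properties: Dict[str, object] = {}
reader = CachingWavelengthReader(properties, slow_lookup)
first = reader.wavelength(42)
second = reader.wavelength(42)
print(first == second, calls)  # True [42]
```

The second query never reaches `slow_lookup`, which is the whole point of storing the result at layer level.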

Metadata widget in the QGIS Layer Properties dialog

To make it really nice, we should have a general QGIS/GDAL PAM Metadata widget, comparable to:

The new metadata dialog should be fully editable, with all metadata changes saved to QGIS PAM. This way we do not alter the original GDAL files, but only store changes in memory on the QgsRasterLayer object. Metadata can be stored permanently by saving it to a QML sidecar file.

Comments (8)

  1. Andreas Janz reporter

    Some usage examples showing improved metadata handling using the RasterReader class. Note that the reader is now able to set metadata items. Those will always be stored as layer properties and do not alter the GDAL items, but effectively shadow them.

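    from unittest import TestCase

    from qgis.core import QgsRasterLayer

    from enmapboxprocessing.rasterreader import RasterReader

    # `enmap` is assumed to be the path of the EnMAP example raster

    class TestQgisPam(TestCase):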
        # test QGIS PAM metadata handling (see #898)
    
        def test(self):
            layer = QgsRasterLayer(enmap)
            reader = RasterReader(layer)
    
            # make sure wavelength units exist in GDAL PAM
            key = 'wavelength_units'
            self.assertEqual('Micrometers', reader.gdalDataset.GetMetadataItem(key))
    
            # query wavelength units
            self.assertEqual('Micrometers', reader.metadataItem(key))
    
            # shadow GDAL PAM items
            reader.setMetadataItem(key, 'Nanometers')  # stores item in QGIS PAM
            self.assertEqual('Nanometers', reader.metadataItem(key))
    
            # shadow with None to effectively mask existing GDAL items
            reader.setMetadataItem(key, None)
            self.assertIsNone(reader.metadataItem(key))
    
            # ignoring QGIS PAM shadowing may be useful in some cases
            self.assertEqual('Micrometers', reader.metadataItem(key, ignoreQgisPam=True))
    
            # remove QGIS PAM item
            reader.removeMetadataItem(key)
            self.assertEqual('Micrometers', reader.metadataItem(key))  # ignoreQgisPam not required anymore
    

  2. Andreas Janz reporter

    This also applies to whole domains:

    def test_domain(self):
        layer = QgsRasterLayer(enmap)
        reader = RasterReader(layer)
    
        # make sure the ENVI domain exists
        domain = 'ENVI'
        domainItems = ['bands', 'band_names', 'byte_order', 'coordinate_system_string', 'data_ignore_value', 'data_type', 'default_bands', 'description', 'file_type', 'fwhm', 'header_offset', 'interleave', 'lines', 'samples', 'sensor_type', 'wavelength', 'wavelength_units', 'y_start', 'z_plot_titles']
        self.assertEqual(domainItems, list(reader.gdalDataset.GetMetadata(domain).keys()))
    
        # query domain
        self.assertEqual(
            domainItems,
            list(reader.metadataDomain(domain).keys())
        )
    
        # shadow GDAL PAM item in domain
        key = 'wavelength_units'
        reader.setMetadataItem(key, 'TEST', domain)
        self.assertEqual('TEST', reader.metadataDomain(domain)[key])
    
        # shadow with None to effectively mask existing GDAL items
        reader.setMetadataItem(key, None, domain)
        self.assertIsNone(reader.metadataDomain(domain)[key])
    
        # ignoring QGIS PAM shadowing may be useful in some cases
        self.assertEqual('Micrometers', reader.metadataDomain(domain, ignoreQgisPam=True)[key])
    
        # remove QGIS PAM domain
        reader.removeMetadataDomain(domain)
        self.assertEqual('Micrometers', reader.metadataDomain(domain)[key])  # ignoreQgisPam not required anymore
    

  3. Andreas Janz reporter

    Usage example for efficient wavelength handling:

        def test_wavelength(self):
    
            layer = QgsRasterLayer(enmap)
            reader = RasterReader(layer)
    
            # make sure GDAL PAM wavelengths are available
            wavelengths = [float(reader.gdalBand(bandNo).GetMetadataItem('wavelength'))
                          for bandNo in range(1, layer.bandCount() + 1)]
            self.assertEqual(
                [0.46, 0.465, 0.47, 0.475, 0.479, 0.484, 0.489, 0.494, 0.499, 0.503, 0.508, 0.513, 0.518, 0.523, 0.528, 0.533, 0.538, 0.543, 0.549, 0.554, 0.559, 0.565, 0.57, 0.575, 0.581, 0.587, 0.592, 0.598, 0.604, 0.61, 0.616, 0.622, 0.628, 0.634, 0.64, 0.646, 0.653, 0.659, 0.665, 0.672, 0.679, 0.685, 0.692, 0.699, 0.706, 0.713, 0.72, 0.727, 0.734, 0.741, 0.749, 0.756, 0.763, 0.771, 0.778, 0.786, 0.793, 0.801, 0.809, 0.817, 0.824, 0.832, 0.84, 0.848, 0.856, 0.864, 0.872, 0.88, 0.888, 0.896, 0.915, 0.924, 0.934, 0.944, 0.955, 0.965, 0.975, 0.986, 0.997, 1.007, 1.018, 1.029, 1.04, 1.051, 1.063, 1.074, 1.086, 1.097, 1.109, 1.12, 1.132, 1.144, 1.155, 1.167, 1.179, 1.191, 1.203, 1.215, 1.227, 1.239, 1.251, 1.263, 1.275, 1.287, 1.299, 1.311, 1.323, 1.522, 1.534, 1.545, 1.557, 1.568, 1.579, 1.59, 1.601, 1.612, 1.624, 1.634, 1.645, 1.656, 1.667, 1.678, 1.689, 1.699, 1.71, 1.721, 1.731, 1.742, 1.752, 1.763, 1.773, 1.783, 2.044, 2.053, 2.062, 2.071, 2.08, 2.089, 2.098, 2.107, 2.115, 2.124, 2.133, 2.141, 2.15, 2.159, 2.167, 2.176, 2.184, 2.193, 2.201, 2.21, 2.218, 2.226, 2.234, 2.243, 2.251, 2.259, 2.267, 2.275, 2.283, 2.292, 2.3, 2.308, 2.315, 2.323, 2.331, 2.339, 2.347, 2.355, 2.363, 2.37, 2.378, 2.386, 2.393, 2.401, 2.409],
                wavelengths
            )
    
            # query wavelength via the RasterReader.wavelength()
            wavelength = reader.wavelength(bandNo=42)  # returns Nanometers by default
            self.assertEqual(685.0, wavelength)
    
            # check that we properly cached the wavelength
            defaultDomain = ''
            self.assertEqual(0.685, layer.customProperty(f'QGISPAM/band/42/{defaultDomain}/wavelength'))
            self.assertEqual('Micrometers', layer.customProperty(f'QGISPAM/band/42/{defaultDomain}/wavelength_units'))
    
