hubflow fails to read ENVI style wavelength and wavelength_units

Issue #531 resolved
Benjamin Jakimow created an issue

How to replicate:

  1. Create a tif with wavelength and wavelength units set according to https://www.harrisgeospatial.com/docs/enviheaderfiles.html
  2. Read it with a hubflow Raster object and call metadataWavelength()

Example

unit test added to enmapboxtesting/test_hubflow.py : test_wavelength

import unittest
import os
from osgeo import gdal
import numpy as np
from enmapbox.testing import TestCase, TestObjects

class HUBFlowTests(TestCase):
    def test_wavelength(self):

        # 1. define raster image with ENVI-like metadata
        root = self.createTestOutputDirectory() / 'hubflowtests'
        os.makedirs(root, exist_ok=True)
        pathImg = root / 'testimag.tif'
        # create a dataset with 5 bands, 10 lines and 15 samples
        ds = TestObjects.createRasterDataset(path=pathImg, ns=15, nl=10, nb=5, wlu='nm')
        self.assertIsInstance(ds, gdal.Dataset)

        from enmapbox.gui.utils import parseWavelength
        wl, wlu = parseWavelength(ds)

        enviWL = f"{{{','.join([str(v) for v in wl])}}}"

        print(f'Wavelength: {enviWL}')
        print(f'Wavelength units: {wlu}')

        ds.SetMetadataItem('wavelength', enviWL, 'ENVI')
        ds.SetMetadataItem('wavelength units', wlu, 'ENVI')
        ds.FlushCache()
        del ds

        # 2. read metadata with GDAL
        ds = gdal.Open(pathImg.as_posix())
        # retrieval of Wavelength and Wavelength Unit according to
        # ENVI Headerfile Definition: https://www.harrisgeospatial.com/docs/enviheaderfiles.html
        self.assertEqual(ds.GetMetadataItem('wavelength', 'ENVI'), enviWL)
        self.assertEqual(ds.GetMetadataItem('wavelength units', 'ENVI'), wlu)

        # 3. read metadata with hubflow
        from hubflow.core import Raster
        hubflowRaster = Raster(pathImg.as_posix())

        # this is the call that fails in hubflow
        hubflowWL = hubflowRaster.metadataWavelength()

        for v1, v2 in zip(wl, hubflowWL):
            self.assertEqual(v1, v2)





if __name__ == '__main__':
    unittest.main()

Comments (8)

  1. Andreas Janz

    You should not use 'wavelength units' with a space, but with an underscore: 'wavelength_units'

    In your test case the space is preserved, but in real life, GDAL will replace whitespace in metadata keys with underscores.

    For example, have a look at the aux.xml file from the testdata that was created by GDAL:

  2. Benjamin Jakimow reporter

    In real life, the unit test passes the line:

    self.assertEqual(ds.GetMetadataItem('wavelength units', 'ENVI'), wlu)

    and users don’t need to know how metadata is stored in specific raster formats.
    As GDAL works with the metadata name as used in the ENVI documentation, i.e. without underscore, I don’t see why the EnMAP-Box API shouldn’t as well?

  3. Andreas Janz

    As I showed, GDAL will alter your key and replace the ' ' with a '_'. This is more of a GDAL bug, I guess?

  4. Benjamin Jakimow reporter

    GDAL allows defining and reading metadata tag names that include spaces. The ENVI metadata standard does so as well.
    So I don’t see why developers should avoid spaces just for the sake of the hubflow API when the underlying GDAL API supports them well.

    Why not simply match on “wavelength[_ ]units” (regex) instead of “wavelength_units”?

  5. Andreas Janz

    I gave you an example, where GDAL messes up the metadata keys (screenshot above). Why isn’t that relevant for the discussion?

  6. Andreas Janz

    Why not simply match on “wavelength[_ ]units” (regex) instead of “wavelength_units”?

    Ah ok, that makes sense!

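The regex match proposed in comment 4 and accepted in comment 6 could look roughly like the following. This is a minimal sketch, not the actual hubflow fix: the helper names find_wavelength_units and parse_envi_list are hypothetical, the case-insensitive match is an assumption, and the input is assumed to be a plain dict of ENVI-domain metadata such as the one returned by ds.GetMetadata('ENVI').

```python
import re

# Accept both spellings of the key: 'wavelength units' and 'wavelength_units'.
# Case-insensitive matching is an assumption, not something the thread settles.
_WLU_KEY = re.compile(r'^wavelength[_ ]units$', re.IGNORECASE)


def find_wavelength_units(metadata):
    """Hypothetical helper: return the wavelength units from a metadata dict.

    Returns the value of the first key that matches either spelling,
    or None if no such key exists.
    """
    for key, value in metadata.items():
        if _WLU_KEY.match(key):
            return value
    return None


def parse_envi_list(text):
    """Hypothetical helper: parse an ENVI list string like '{1.0,2.0}' into floats."""
    inner = text.strip().strip('{}')
    return [float(v) for v in inner.split(',') if v.strip()]


if __name__ == '__main__':
    # Both key spellings resolve to the same value.
    print(find_wavelength_units({'wavelength units': 'nm'}))
    print(find_wavelength_units({'wavelength_units': 'nm'}))
    print(parse_envi_list('{400.0,500.0,600.0}'))
```

Matching on the key pattern rather than a fixed string means the lookup works whether the space in the key was preserved or replaced with an underscore, so callers do not need to know which form ended up in the file.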