xarray IO error for some specific VIIRS scenes

Issue #2 resolved
Bich Tran created an issue

As you can see in log.txt (in the attached zip file), the VIIRSL1 downloader works for all other scenes, but for VNP02IMG.A2022333.1124.002.2022333203424.nc it returns this error (after several tries):

ValueError: did not find a match in any of xarray's currently installed IO backends ['netcdf4', 'scipy', 'rasterio']. Consider explicitly selecting one of the installed engines via the engine parameter, or installing additional IO dependencies, see:
https://docs.xarray.dev/en/stable/getting-started-guide/installing.html
https://docs.xarray.dev/en/stable/user-guide/io.html)..)
2024-05-24 11:50:04,301 ERROR:
Traceback (most recent call last):
File "/home/user/miniforge3/envs/pywapor/lib/python3.11/site-packages/pywapor/collect/downloader.py", line 129, in collect_sources
x = dler(**args)
^^^^^^^^^^^^
File "/home/user/miniforge3/envs/pywapor/lib/python3.11/site-packages/pywapor/collect/product/VIIRSL1.py", line 531, in download
combine_unprojected_data(nc02_file, ncqa_file, lut_file, unproj_fn)
File "/home/user/miniforge3/envs/pywapor/lib/python3.11/site-packages/pywapor/collect/product/VIIRSL1.py", line 266, in combine_unprojected_data
ds_ = xr.open_dataset(nc02_file, mask_and_scale=False, group = "observation_data",engine='netcdf4')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/miniforge3/envs/pywapor/lib/python3.11/site-packages/xarray/backends/api.py", line 553, in open_dataset
engine = plugins.guess_engine(filename_or_obj)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/miniforge3/envs/pywapor/lib/python3.11/site-packages/xarray/backends/plugins.py", line 197, in guess_engine
raise ValueError(error_msg)
ValueError: did not find a match in any of xarray's currently installed IO backends ['netcdf4', 'scipy', 'rasterio']. Consider explicitly selecting one of the installed engines via the engine parameter, or installing additional IO dependencies, see:
https://docs.xarray.dev/en/stable/getting-started-guide/installing.html
https://docs.xarray.dev/en/stable/user-guide/io.html

Following the advice in the error message, I tried adding an explicit engine parameter (netcdf4) in VIIRSL1.py, but it did not solve the issue.

I also updated the xarray installation following the guide in Installation (xarray.dev), but that didn’t work either.

python -m pip install "xarray[io]"

conda install -c conda-forge xarray dask netCDF4 bottleneck pydap cftime iris

I found out that the file that the script is trying to open is this one, and it is empty (0 kB): ./VIIRSL1/VNP02IMG.A2022333.1124.002.2022333203424_data_lut.nc (in attached zip file)

It seems this scene is corrupted, and the failure to process it interrupted the whole download step, which makes it impossible to proceed to the next step.
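
For anyone who wants to check their own download folder for similar empty files, here is a minimal sketch. It is not part of pywapor; the helper name `find_empty_files` and the `*.nc` pattern are assumptions made for illustration only.

```python
from pathlib import Path


def find_empty_files(folder, pattern="*.nc"):
    """Return the sorted paths below `folder` that match `pattern` and are 0 bytes.

    Hypothetical helper, not a pywapor function.
    """
    return sorted(
        p for p in Path(folder).rglob(pattern)
        if p.is_file() and p.stat().st_size == 0
    )


if __name__ == "__main__":
    # "./VIIRSL1" is the folder named in this report; adjust as needed.
    for path in find_empty_files("./VIIRSL1"):
        print(f"empty file: {path}")
```

A 0-byte file contains no netCDF header at all, which would be consistent with no backend being able to open it regardless of the engine setting.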

pywapor version 3.5.2

Comments (7)

  1. Bich Tran reporter

    Related question: In the log file, you can see the number of relevant VNP02IMG.v2 scenes was 260 for DOWNLOADER, and 253 for PRE_SE_ROOT. Why is that so?

    (Reported in #3)

  2. bert.coerver

    Thanks for letting me know. From your report I’d indeed say that it’s basically a corrupted file on the NASA OPeNDAP server. I’ll verify that and open a ticket on their forum if that’s the case.

    I’ll also add some kind of check for this in the pywapor VIIRS code so that it doesn’t interrupt the entire workflow but just skips this scene.
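
A rough sketch of what such a check could look like, under stated assumptions: this is not the actual pywapor code, and `process_scenes` and `process_one` are hypothetical names. It catches `ValueError` because that is what xarray raised in the traceback above; catching `OSError` as well is an assumption about how an unreadable file might otherwise surface.

```python
import logging

log = logging.getLogger(__name__)


def process_scenes(scenes, process_one):
    """Run `process_one` on every scene, skipping the ones that fail.

    Hypothetical helper: `process_one` stands in for whatever function
    downloads and combines a single scene.
    """
    done, skipped = [], []
    for scene in scenes:
        try:
            done.append(process_one(scene))
        except (ValueError, OSError) as exc:
            # Log the failure and continue with the remaining scenes
            # instead of aborting the whole collection step.
            log.warning("Skipping scene %r: %s", scene, exc)
            skipped.append(scene)
    return done, skipped
```

Returning the skipped scenes separately lets the caller report them at the end of the run, so a single bad scene is visible without stopping everything else.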

  3. bert.coerver

    Not sure how, but for me it seems to work fine:

    2024-05-28 13:38:16,667      INFO: --> Found 504 VNP03IMG.v2 scenes.
    2024-05-28 13:38:20,909      INFO: --> Found 504 CLDMSK_L2_VIIRS_SNPP.v1 scenes.
    2024-05-28 13:38:25,743      INFO: --> Found 504 VNP02IMG.v2 scenes.
    2024-05-28 13:38:25,744      INFO:     --> Filtering VNP02IMG.v2 scenes with {'day_night_flag': 'DAY'}.
    2024-05-28 13:43:21,178      INFO:     --> Found 251 relevant VNP02IMG.v2 scenes.
    2024-05-28 13:43:21,198      INFO: --> Collecting 251 VIIRS scenes.
    2024-05-28 13:43:21,203      INFO: --> 0 scenes already downloaded, 251 remaining.
    2024-05-28 13:45:14,855      INFO:     --> (217/251) Processing 'VNP02IMG.A2022333.1124.002.2022333203424.nc'.
    2024-05-28 13:45:30,569      INFO:         --> Downloading VNP03IMG.A2022333.1124.002.2022333200714.nc.
    2024-05-28 13:50:41,517      INFO:         --> Determining indices of AOI inside scene.
    2024-05-28 13:51:01,501      INFO:             --> Found 1,280 pixels inside AOI.
    2024-05-28 13:51:11,191      INFO:         --> Saving latitudes.
    2024-05-28 13:51:13,271      INFO:             > peak-memory-usage: 33.8KB, execution-time: 0:00:02.078912.
    2024-05-28 13:51:13,272      INFO:             > chunksize|dimsize: [number_of_lines: 32|32, number_of_pixels: 40|40], crs: None
    2024-05-28 13:51:18,584      INFO:         --> Saving longitudes.
    2024-05-28 13:51:20,622      INFO:             > peak-memory-usage: 35.6KB, execution-time: 0:00:02.037268.
    2024-05-28 13:51:20,624      INFO:             > chunksize|dimsize: [number_of_lines: 32|32, number_of_pixels: 40|40], crs: None
    2024-05-28 13:51:27,128      INFO:         --> Downloading VNP02IMG.A2022333.1124.002.2022333203424.nc for AOI.
    2024-05-28 13:52:11,236      INFO:         --> Downloading CLDMSK_L2_VIIRS_SNPP.A2022333.1124.001.2022333234426.nc for AOI.
    2024-05-28 13:52:28,135      INFO:         --> Combining data.
    2024-05-28 13:52:30,173      INFO:             > peak-memory-usage: 78.2KB, execution-time: 0:00:02.037428.
    2024-05-28 13:52:30,175      INFO:             > chunksize|dimsize: [number_of_lines: 32|32, number_of_pixels: 40|40], crs: None
    

    When you retried the run, did you delete the corrupt, 0-byte VNP02IMG.A2022333.1124.002.2022333203424_data_lut.nc file? It will just reuse that file if it’s there.
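
To illustrate the reuse problem, here is a minimal sketch of a guard that only treats a cached file as reusable when it is non-empty and removes a 0-byte leftover so that it gets downloaded again. The function name `usable_cache` is hypothetical and this is not how pywapor currently decides whether to reuse a file.

```python
from pathlib import Path


def usable_cache(path):
    """Return True if `path` exists and is non-empty.

    A 0-byte file is treated as a failed earlier download: it is deleted
    and False is returned, so the caller downloads it again.
    Hypothetical helper, not a pywapor function.
    """
    p = Path(path)
    if not p.is_file():
        return False
    if p.stat().st_size == 0:
        p.unlink()  # remove the empty leftover so it is not reused
        return False
    return True
```

A size check only catches completely empty files; a truncated but non-empty download would still pass and would need a stricter test, such as actually opening the file.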

  4. Bich Tran reporter

    I hadn’t deleted the file before. I just re-ran after deleting it, and the file has now been downloaded again.
