xarray IO error for some specific VIIRS scenes
As the log.txt (in the attached zip file) shows, the VIIRSL1 downloader works for all scenes except VNP02IMG.A2022333.1124.002.2022333203424.nc, for which it returns this error (on several tries):
ValueError: did not find a match in any of xarray's currently installed IO backends ['netcdf4', 'scipy', 'rasterio']. Consider explicitly selecting one of the installed engines via the engine parameter, or installing additional IO dependencies, see:
https://docs.xarray.dev/en/stable/getting-started-guide/installing.html
https://docs.xarray.dev/en/stable/user-guide/io.html
2024-05-24 11:50:04,301 ERROR:
Traceback (most recent call last):
File "/home/user/miniforge3/envs/pywapor/lib/python3.11/site-packages/pywapor/collect/downloader.py", line 129, in collect_sources
x = dler(**args)
^^^^^^^^^^^^
File "/home/user/miniforge3/envs/pywapor/lib/python3.11/site-packages/pywapor/collect/product/VIIRSL1.py", line 531, in download
combine_unprojected_data(nc02_file, ncqa_file, lut_file, unproj_fn)
File "/home/user/miniforge3/envs/pywapor/lib/python3.11/site-packages/pywapor/collect/product/VIIRSL1.py", line 266, in combine_unprojected_data
ds_ = xr.open_dataset(nc02_file, mask_and_scale=False, group = "observation_data",engine='netcdf4')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/miniforge3/envs/pywapor/lib/python3.11/site-packages/xarray/backends/api.py", line 553, in open_dataset
engine = plugins.guess_engine(filename_or_obj)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/miniforge3/envs/pywapor/lib/python3.11/site-packages/xarray/backends/plugins.py", line 197, in guess_engine
raise ValueError(error_msg)
ValueError: did not find a match in any of xarray's currently installed IO backends ['netcdf4', 'scipy', 'rasterio']. Consider explicitly selecting one of the installed engines via the engine parameter, or installing additional IO dependencies, see:
https://docs.xarray.dev/en/stable/getting-started-guide/installing.html
https://docs.xarray.dev/en/stable/user-guide/io.html
Following the advice in the error message, I tried adding an explicit engine parameter (netcdf4) in VIIRSL1.py, but it did not solve the issue.
I also updated the xarray installation following the installation guide (xarray.dev); that didn’t work either.
python -m pip install "xarray[io]"
conda install -c conda-forge xarray dask netCDF4 bottleneck pydap cftime iris
I found that the file the script is trying to open is this one, and it is empty (0 kB): ./VIIRSL1/VNP02IMG.A2022333.1124.002.2022333203424_data_lut.nc (in the attached zip file)
It seems this scene is corrupted, and the failure to process it interrupted the whole download step, which makes it impossible to proceed to the next step.
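A quick way to spot such files before re-running is to scan the download folder for zero-byte NetCDF files. The following is a minimal sketch using only the standard library; the function name find_empty_files and the "*.nc" pattern are illustrative assumptions, not part of pywapor.

```python
from pathlib import Path


def find_empty_files(folder, pattern="*.nc"):
    """Return all files under `folder` matching `pattern` that are 0 bytes.

    Hypothetical helper for illustration; pywapor does not provide it.
    """
    return sorted(
        p for p in Path(folder).rglob(pattern)
        if p.is_file() and p.stat().st_size == 0
    )
```

Calling find_empty_files("./VIIRSL1") would list the empty _data_lut.nc file described above, assuming it sits somewhere under that folder.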
pywapor version 3.5.2
Comments (7)
-
reporter -
Thanks for letting me know. From your report I’d indeed say that it’s basically a corrupted file on the NASA OPeNDAP server; I’ll verify that and open a ticket on their forum if that’s the case.
I’ll also add some kind of check for this in the pywapor VIIRS code so that it won’t interrupt the entire workflow but just skips this scene.
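The "skip this scene" idea could look roughly like the sketch below. It is not the check that was added to pywapor; process_scenes and process_one are hypothetical names, and the choice of caught exceptions is an assumption based on the ValueError in the traceback above.

```python
import logging

log = logging.getLogger(__name__)


def process_scenes(scenes, process_one):
    """Run `process_one` on each scene; log and skip scenes that fail.

    Returns (done, skipped) so the caller can report what was left out.
    """
    done, skipped = [], []
    for scene in scenes:
        try:
            process_one(scene)
        except (ValueError, OSError) as exc:
            # ValueError is what xarray raised in the traceback above;
            # OSError is included as an assumption for unreadable files.
            log.warning("Skipping scene %s: %s", scene, exc)
            skipped.append(scene)
        else:
            done.append(scene)
    return done, skipped
```

With this pattern a single corrupt scene ends up in the skipped list while the remaining scenes are still processed.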
-
- changed status to open
-
Not sure how, but for me it seems to work fine:
2024-05-28 13:38:16,667 INFO: --> Found 504 VNP03IMG.v2 scenes.
2024-05-28 13:38:20,909 INFO: --> Found 504 CLDMSK_L2_VIIRS_SNPP.v1 scenes.
2024-05-28 13:38:25,743 INFO: --> Found 504 VNP02IMG.v2 scenes.
2024-05-28 13:38:25,744 INFO: --> Filtering VNP02IMG.v2 scenes with {'day_night_flag': 'DAY'}.
2024-05-28 13:43:21,178 INFO: --> Found 251 relevant VNP02IMG.v2 scenes.
2024-05-28 13:43:21,198 INFO: --> Collecting 251 VIIRS scenes.
2024-05-28 13:43:21,203 INFO: --> 0 scenes already downloaded, 251 remaining.
2024-05-28 13:45:14,855 INFO: --> (217/251) Processing 'VNP02IMG.A2022333.1124.002.2022333203424.nc'.
2024-05-28 13:45:30,569 INFO: --> Downloading VNP03IMG.A2022333.1124.002.2022333200714.nc.
2024-05-28 13:50:41,517 INFO: --> Determining indices of AOI inside scene.
2024-05-28 13:51:01,501 INFO: --> Found 1,280 pixels inside AOI.
2024-05-28 13:51:11,191 INFO: --> Saving latitudes.
2024-05-28 13:51:13,271 INFO: > peak-memory-usage: 33.8KB, execution-time: 0:00:02.078912.
2024-05-28 13:51:13,272 INFO: > chunksize|dimsize: [number_of_lines: 32|32, number_of_pixels: 40|40], crs: None
2024-05-28 13:51:18,584 INFO: --> Saving longitudes.
2024-05-28 13:51:20,622 INFO: > peak-memory-usage: 35.6KB, execution-time: 0:00:02.037268.
2024-05-28 13:51:20,624 INFO: > chunksize|dimsize: [number_of_lines: 32|32, number_of_pixels: 40|40], crs: None
2024-05-28 13:51:27,128 INFO: --> Downloading VNP02IMG.A2022333.1124.002.2022333203424.nc for AOI.
2024-05-28 13:52:11,236 INFO: --> Downloading CLDMSK_L2_VIIRS_SNPP.A2022333.1124.001.2022333234426.nc for AOI.
2024-05-28 13:52:28,135 INFO: --> Combining data.
2024-05-28 13:52:30,173 INFO: > peak-memory-usage: 78.2KB, execution-time: 0:00:02.037428.
2024-05-28 13:52:30,175 INFO: > chunksize|dimsize: [number_of_lines: 32|32, number_of_pixels: 40|40], crs: None
When you retried, did you delete the corrupt, 0-byte VNP02IMG.A2022333.1124.002.2022333203424_data_lut.nc file? If it’s still there, it will simply be reused.
-
Added extra log information for when a file is being re-used instead of re-downloaded: https://bitbucket.org/cioapps/pywapor/commits/546fe7ae52bcf6f217145294debea6138535dc1d
-
reporter I hadn’t deleted the file before. I just re-ran after deleting it, and the file has now been downloaded again.
-
reporter - changed status to resolved
Delete data files that xarray failed to open before re-running.
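The resolution above can be automated along these lines. This is a sketch under stated assumptions: remove_unreadable and try_open are hypothetical names, the caller decides what "opening" means, and the broad except is a deliberate simplification that treats any failure as "unreadable".

```python
from pathlib import Path


def remove_unreadable(paths, try_open):
    """Delete every file in `paths` for which `try_open(path)` raises.

    `try_open` is a caller-supplied callable (hypothetical interface).
    Returns the list of deleted paths.
    """
    removed = []
    for path in map(Path, paths):
        try:
            try_open(path)
        except Exception:
            # Any failure counts as unreadable in this sketch; delete the
            # file so that the next run downloads it again.
            path.unlink(missing_ok=True)
            removed.append(path)
    return removed
```

With xarray, try_open could be something like lambda p: xr.open_dataset(p).close(), so that a 0-byte file triggers the same ValueError as in the traceback and is removed before the next run.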
Related question: in the log file, the number of relevant VNP02IMG.v2 scenes was 260 for DOWNLOADER and 253 for PRE_SE_ROOT. Why is that so? (Reported in #3)