Understanding the WaPOR Pipeline

WaPOR database and levels

Each dataset (also called ‘level’) is defined by a unique region of interest and a specific spatial resolution. Table 1 specifies the resolution and area covered by the different levels.

The main differences (besides the changes in inputs and methodology) between version 2 and version 3 are:
- Level 1 data expanded to global coverage
- Level 2 data expanded to the original L1 extent (Africa and the Near East)
- Soil moisture was added as a data component

Table 1: Spatial resolution and Regions of Interest of the different datasets (levels)

Level | Dataset | Resolution | Region of Interest
Level 1 | RET | 0.3125° longitude by 0.25° latitude (approx. 28km) | Global
Level 1 | PCP | 0.05° (approx. 5km) | Global (from 50 degrees South to 50 degrees North)
Level 1 | E, T, I, AETI, NPP, RSM, TBP, WP | 300m | Global (from 50 degrees South to 80 degrees North)
Level 2 | E, T, I, AETI, NPP, RSM, TBP, WP | 100m | Africa and Near East, Pakistan, Colombia and Sri Lanka
Level 3 | E, T, I, AETI, NPP, RSM, TBP, WP | 20m | Selected irrigation schemes and rainfed areas

Current L3 areas are located in Algeria, Egypt, Ethiopia (2 areas), Iraq (2 areas), Jordan (2 areas), Libya (3 areas), Lebanon, Kenya (3 areas), Mali, Mozambique (2 areas), Morocco, Pakistan (2 areas), Palestine, Sudan, Rwanda (3 areas), Senegal, Tunisia (2 areas) and Sri Lanka.
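
The approximate ground distances quoted in Table 1 for the coarse Level 1 grids follow from the length of one degree of arc (roughly 111 km). The sketch below reproduces that arithmetic; the constant and the function name are illustrative assumptions and are not part of the WaPOR code.

```python
import math

# Approximate length of one degree of arc at the equator (assumed constant).
KM_PER_DEGREE = 111.32


def cell_size_km(deg_lat, deg_lon, latitude=0.0):
    """Return the approximate (north-south, east-west) cell size in km.

    The east-west size shrinks with the cosine of the latitude.
    """
    north_south = deg_lat * KM_PER_DEGREE
    east_west = deg_lon * KM_PER_DEGREE * math.cos(math.radians(latitude))
    return north_south, east_west


# RET grid: 0.25 degrees latitude by 0.3125 degrees longitude
ret_ns, ret_ew = cell_size_km(0.25, 0.3125)
# PCP grid: 0.05 degrees in both directions
pcp_ns, pcp_ew = cell_size_km(0.05, 0.05)
```

With these assumptions the north-south size of an RET cell is close to 28 km and a PCP cell is between 5 and 6 km across at the equator.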

All data is provided in Universal Transverse Mercator (UTM) projection to ensure optimal spatial resolution.

WaPOR data components

Table 2 lists the data components available in the WaPOR database. Most data components (Water Productivity, Evaporation, Transpiration, Interception, Net Primary Production, Total Biomass Production, Soil Moisture and the quality layers) are produced at 20m (L3), 100m (L2) and 300m (L1). At L1, Reference Evapotranspiration and Precipitation have a lower spatial resolution than the other data components and are produced daily. Additional quality data layers can be applied by the user to add value to the WaPOR data components, or to inform the user about the quality of input data.

Table 2: Overview of the WaPOR data components available in version 3, per level, with temporal (24=daily, D=dekadal, M=monthly, S=seasonal, Y=annual) and spatial resolutions specified

Data components Level 1 (300m) Level 2 (100m) Level 3 (20m)
Gross Biomass Water Productivity (GBWP) Y Y Y
Evaporation (E) D - Y D - Y D - Y
Transpiration (T) D - Y D – Y D – Y
Interception (I) D - Y D – Y D – Y
Actual Evapotranspiration and Interception (AETI) D - M -Y D - M - Y D - M - Y
Soil Moisture (RSM) D D D
Net Primary Production (NPP) D - M D – M D – M
Total Biomass Production (TBP) Y Y Y
Reference Evapotranspiration (RET) (~0.25°) 24 – D – M – Y
Precipitation (0.05°) 24 – D – M - Y
Quality of NDVI D D D
Quality of LST D

Technical approach

To produce the data components, various inputs, such as satellite and meteorological data, are combined and applied in different algorithms, including the soil moisture and ETLook models (Bastiaanssen et al., 2012).

In most cases, external input data is first used to calculate intermediate data components. The intermediate data components are used to standardise the processing chain, converting external data sources into the standardised input data required for the production of data components. An example is the NDVI, which is used as input to produce the Evaporation, Transpiration, Interception, Land Cover Classification, Net Primary Production and Phenology data components. Figure 1 shows a flow diagram of the relationships between the required data components and the intermediate data components. Input data sources are not shown in the flow diagram, as the processing chain should remain independent of specific input data sources. This independence means that a data source can be replaced easily if its delivery is delayed or it is unexpectedly discontinued, with minimal disruption to the processing chains of the data components.

DatabaseOverview_v6a.png

Figure 1: Flow chart showing the relationship between the requested data components and the intermediate data components. White boxes represent the requested data components to be produced for the FAO WaPOR database. The grey boxes represent intermediate data components that convert external data into standardised input. Green outlines represent data components that are derived solely from other (intermediate) data components. Orange outlines represent data components that require external data sources that are not shown in the flow chart. External data sources are discussed in the section Data Sources. The flow chart represents all resolution levels, except that the Land cover box only refers to the subnational level.

Regarding spatial and temporal resolution, please note that:
1. The method to produce the data components is independent of spatial resolution. Each pixel is considered a closed system in relation to adjacent pixels. Although in reality exchange of energy and matter takes place between adjacent pixels, these exchanges are considered negligible when considering the spatial and temporal resolution of the datasets. Therefore, all variables referred to in the methodology description can be interpreted as a point representing the average for the area covered by the pixel, whether at 300m (Level 1), 100m (Level 2) or 20m (Level 3) resolution.
2. When the temporal resolution of data components that are combined varies (e.g. daily, dekadal, seasonal, annual), the component with the highest temporal resolution will determine the output temporal resolution. For example, when dekadal NDVI is combined with daily weather data, processing takes place on a daily basis followed by an aggregation to dekadal values at the end. This ensures that information is retained at the highest level of detail for as long as possible during processing.
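
The second point can be illustrated with a minimal sketch: a dekadal input is held constant within each dekad, combined with a daily input on a daily basis, and the daily result is aggregated to dekads at the end. The dekad definition (days 1-10, 11-20 and 21 to month end) is the standard one; the function names, the input values and the simple multiplication are placeholders for illustration and do not represent the ETLook calculations.

```python
from collections import defaultdict
from datetime import date, timedelta


def dekad_index(day):
    """Return (year, month, dekad); dekad 3 runs from day 21 to month end."""
    return day.year, day.month, min((day.day - 1) // 10 + 1, 3)


def aggregate_to_dekads(daily_values, combine=sum):
    """Group {date: value} by dekad and combine each group.

    Use sum for fluxes; pass a mean function for state variables.
    """
    groups = defaultdict(list)
    for day, value in daily_values.items():
        groups[dekad_index(day)].append(value)
    return {key: combine(values) for key, values in sorted(groups.items())}


# Hypothetical inputs: dekadal NDVI and daily weather (made-up values).
ndvi_by_dekad = {(2023, 1, 1): 0.30, (2023, 1, 2): 0.35, (2023, 1, 3): 0.40}
start = date(2023, 1, 1)
daily_weather = {start + timedelta(days=i): 4.0 for i in range(31)}

# Step 1: process at the highest temporal resolution (daily).
# The product below is a placeholder, not a physical model.
daily_result = {
    day: weather * ndvi_by_dekad[dekad_index(day)]
    for day, weather in daily_weather.items()
}

# Step 2: aggregate to dekadal values only at the end.
dekadal_result = aggregate_to_dekads(daily_result)
```

Because January has 31 days, the third dekad in this example aggregates 11 daily values while the first two aggregate 10 each.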

Data components produced in near real time (NRT) are released approximately 3 days after the end of a dekad and should be seen as preliminary data components. A higher quality version of the data component is produced and delivered after 6 dekads have passed. This final version of the dekadal dataset has a higher quality because (1) gap filling and interpolation processes, where needed, have been based on more data observations, and (2) some of the inputs are of higher quality (e.g. the meteorological data of (Ag)ERA5). This implies that other temporal aggregations (monthly, seasonal, annual), and layers that depend on those, are updated as well. In practice, this means that a final annual aggregation of the most recent full calendar year can only be produced after the end of February. Likewise, the final monthly aggregation of the most recent calendar month can only be produced 2 full months later.
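
The release timing described above can be worked through with a short date calculation. This is a sketch under stated assumptions: "6 dekads have passed" is read as the end of the sixth dekad following the one in question, the 3-day lag is taken literally, and all function names are hypothetical.

```python
import calendar
from datetime import date, timedelta


def dekad_end(year, month, dekad):
    """Last calendar day of a dekad (1: day 10, 2: day 20, 3: month end)."""
    if dekad in (1, 2):
        return date(year, month, dekad * 10)
    return date(year, month, calendar.monthrange(year, month)[1])


def add_dekads(year, month, dekad, n):
    """Shift a (year, month, dekad) triple forward by n dekads (36 per year)."""
    index = year * 36 + (month - 1) * 3 + (dekad - 1) + n
    year, rest = divmod(index, 36)
    return year, rest // 3 + 1, rest % 3 + 1


def nrt_release(year, month, dekad, lag_days=3):
    """Approximate release date of the preliminary (NRT) dekadal layer."""
    return dekad_end(year, month, dekad) + timedelta(days=lag_days)


def final_possible_after(year, month, dekad, n_dekads=6):
    """Date on which n_dekads further dekads have fully elapsed."""
    return dekad_end(*add_dekads(year, month, dekad, n_dekads))


# Last dekad of 2023: the annual aggregation depends on its final version.
preliminary = nrt_release(2023, 12, 3)
final = final_possible_after(2023, 12, 3)
```

For the last dekad of December, six further dekads end on the last day of February, which is consistent with the statement that the final annual aggregation can only be produced after the end of February.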

The processing structure, based on the production of intermediate data components, was designed because it has the following advantages:
1. Flexibility and adaptability are ensured. NDVI and weather data, for example, can be obtained from many different sources. External data sources can be changed easily by defining standardised inputs in the form of the intermediate data components.
2. Different approaches to the pre-processing of external data sources can easily be incorporated without changing the overall processing structure of the data components.
3. Consistency between data components is higher with the use of common standardised inputs. This is important as many data components are closely related to each other, e.g. biomass production and Evaporation, Transpiration, Interception.
4. All input data is converted to the required resolution prior to the processing of the data components.
5. Improved processing efficiency is ensured, as the intermediate data components are produced only once and are used as input in various data components.
6. Quality checks can be done on the intermediate data components. In fact, two data layers are delivered that contain information on the quality of the remote sensing observations used to produce the intermediate data components NDVI and Land Surface Temperature.
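
The first and fifth advantages can be sketched in a few lines: each external source gets its own small adapter that produces the same standardised intermediate component, and everything downstream consumes only that component. The NDVI formula is the standard one; the adapters, the scale factor, the dictionary used as a stand-in for a raster and the downstream placeholder are all hypothetical and are not taken from the WaPOR code.

```python
from typing import Dict

# Stand-in for a raster: pixel id -> value.
Grid = Dict[str, float]


def ndvi_from_reflectance(red: Grid, nir: Grid) -> Grid:
    """Adapter for a source that delivers red and near-infrared reflectance."""
    return {p: (nir[p] - red[p]) / (nir[p] + red[p]) for p in red}


def ndvi_from_scaled(scaled: Grid, scale: float = 0.0001) -> Grid:
    """Adapter for a source that delivers NDVI as scaled integers."""
    return {p: value * scale for p, value in scaled.items()}


def downstream_components(ndvi: Grid) -> Dict[str, Grid]:
    """Placeholder for the components that share one intermediate NDVI."""
    return {"component_a": dict(ndvi), "component_b": dict(ndvi)}


# Either source yields the same standardised input, computed once.
ndvi_a = ndvi_from_reflectance({"px1": 0.1}, {"px1": 0.5})
ndvi_b = ndvi_from_scaled({"px1": 6667})

outputs = downstream_components(ndvi_a)
```

Swapping one source for the other changes only which adapter is called; the downstream function is untouched.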

The input data undergoes one of the following actions:
• The input data is pre-processed to the correct format to be used directly in the processing chain, for example gridding of weather data or resampling of a lower resolution raster input data to a higher resolution, or
• The input data of different spatial resolutions are combined to resample lower resolution raster input data to a higher resolution (e.g. thermal sharpening to produce high resolution LST or the use of digital elevation data to increase the detail of air temperature data), or
• The input data is used in several pre-processing steps to produce intermediate data components that will be used to calculate the final biomass production data components. An example is the production of NDVI and fAPAR which involves outlier detection, smoothing and gap-filling of the input data time series.
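
Two of the simplest pre-processing actions mentioned above, resampling a lower resolution raster to a higher resolution and gap-filling a time series, can be sketched as follows. These are generic textbook versions (nearest-neighbour resampling and linear interpolation) chosen for illustration; they are not the thermal sharpening, outlier detection or smoothing methods used in the WaPOR processing chain, and the function names are hypothetical.

```python
def upsample_nearest(grid, factor):
    """Repeat every cell factor x factor times (nearest-neighbour)."""
    return [
        [value for value in row for _ in range(factor)]
        for row in grid
        for _ in range(factor)
    ]


def fill_gaps_linear(series):
    """Fill None values by linear interpolation between valid neighbours.

    Gaps at the start or end take the nearest valid value.
    """
    valid = [i for i, value in enumerate(series) if value is not None]
    if not valid:
        raise ValueError("series contains no valid observations")
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        before = [j for j in valid if j < i]
        after = [j for j in valid if j > i]
        if before and after:
            lo, hi = before[-1], after[0]
            weight = (i - lo) / (hi - lo)
            filled[i] = series[lo] + weight * (series[hi] - series[lo])
        else:
            filled[i] = series[before[-1]] if before else series[after[0]]
    return filled


coarse = [[1, 2], [3, 4]]
fine = upsample_nearest(coarse, 2)

# Made-up dekadal NDVI series with missing observations.
ndvi_series = [0.2, None, 0.4, None, None, 0.7]
ndvi_filled = fill_gaps_linear(ndvi_series)
```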

Table 3 provides an overview of the intermediate data components, the source and type of the input data, and the application of the intermediate data component per level. Table 4 lists the static layers required.

Table 3: Variables required to calculate the WaPOR data components

Term Intermediate data components Data product Type of input L1 L2 L3
Tmin Minimum air Temperature GEOS-5, ERA5, AgERA5 Model RET,E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM
Tmax Maximum air Temperature GEOS-5, ERA5, AgERA5 Model RET,E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM
Ta Air Temperature GEOS-5, ERA5, AgERA5 Model RET,E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM
Φ Specific Humidity GEOS-5, ERA5, AgERA5 Model RET,E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM
u_obs Wind speed GEOS-5, ERA5, AgERA5 Model RET,E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM
φ, λ Latitude, Longitude Copernicus DEM Static RET,E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM
z Elevation Copernicus DEM Static RET,E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM
Slope Copernicus DEM Static RET,E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM
α Aspect Copernicus DEM Static RET,E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM
Rs Solar radiation GEOS-5 surface incident shortwave flux Model RET, E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM
Rs Solar radiation MSG shortwave radiation Model E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM
P Precipitation CHIRPS v2, CHIRP Model PCP
P Precipitation IMERG Model I I I
NDVI Normalized Difference Vegetation Index Landsat 7 ETM+, 8 OLI, 9 OLI-2 Sensor E,T,I,NPP,TBP,RSM, NDVI_QUAL
NDVI Normalized Difference Vegetation Index Sentinel-2 Sensor E,T,I,NPP,TBP,PHE, RSM, NDVI_QUAL E,T,I,NPP,TBP,RSM, NDVI_QUAL
fAPAR Fraction absorbed photosynthetic active radiation Landsat 7 ETM+, 8 OLI, 9 OLI-2 Sensor E,T,I,NPP,TBP,RSM
fAPAR Fraction absorbed photosynthetic active radiation Sentinel-2 Sensor E,T,I,NPP,TBP, RSM E,T,I,NPP,TBP
Ts Land Surface Temperature VIIRS Sensor E,T,I, RSM,LST_QUAL E,T,I, RSM,LST_QUAL E,T,I, RSM, LST_QUAL
Ts Land Surface Temperature Landsat 7 ETM+, 8 TIRS, 9 TIRS-2 Sensor E,T,I, RSM, LST_QUAL
α0 Surface albedo Landsat 7 ETM+, 8 OLI, 9 OLI-2 Sensor E,T,I,NPP,TBP,RSM
α0 Surface albedo Sentinel-2 Sensor E,T,I,NPP,TBP,RSM E,T,I,NPP,TBP,RSM
LUE Light use efficiency WorldCover Static NPP NPP NPP
SMS Soil moisture Stress RSM Model E,T,I,NPP,TBP E,T,I,NPP,TBP E,T,I,NPP,TBP

Table 4: Static layers required to calculate the WaPOR data components

Term Static layer
lwslope Longwave radiation slope
lwoffset Longwave radiation offset
∆Ta,year Yearly air temperature amplitude
Topt,year Yearly air temperature optimum
rsoil,min Minimum soil resistance
rcanopy,min Minimum stomatal resistance
zobs Observation height

Code repository

The code repository contains the core physical functions used to calculate evapotranspiration (ETLook) and NPP-related outputs. This repository was written by Henk Pelgrum (eLEAF) and Rutger Kassies (eLEAF). The NPP-related methodology was developed by VITO.

Table 5: Overview of the modules in the repository and their functionality

Module | Content | Required input modules
Biomass | all functions related to the biomass production and Net Primary Production data components | Constants
Clear Sky Radiation | all functions related to the calculation of (instantaneous) clear sky radiation. Most of these functions are based upon Šúri and Hofierka (2004). | -
Constants | constants such as conversions and reference values | -
Evapotranspiration | all functions related to the calculation of evapotranspiration | Constants, Unstable
Leaf | all functions related to estimating vegetation cover; these functions only work on an instantaneous basis |
Meteo | all functions related to meteorological variables. Functions can be used for instantaneous and daily calculations | Constants
Neutral | Neutral atmosphere | -
Radiation | all functions related to radiation |
Resistance | calculates (atmospheric) canopy and soil resistance | -
Roughness | all functions related to surface roughness | -
Soil Moisture | all functions related to soil moisture data components | Constants, Unstable
Solar Radiation | all functions related to solar radiation | Constants
Stress | contains all the (plant) stress functions (radiation, moisture, temperature, vapour pressure deficit) | -
Unstable | Unstable atmosphere | Constants
Indices | calculates the indices (features) for thermal sharpening |
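
The "Required input modules" column of Table 5 defines a small dependency graph. As a sketch, a valid load order can be derived with the standard library `graphlib` module (Python 3.9 or later). The dictionary below transcribes the table under two assumptions: Evapotranspiration and Soil Moisture are read as requiring both Constants and Unstable, and the modules whose requirements are not stated in the table (Leaf, Radiation, Indices) are left out. The keys are the module names as printed in the table, not necessarily the Python import names used in the repository.

```python
from graphlib import TopologicalSorter

# Module -> modules it requires, transcribed from Table 5 (see assumptions).
REQUIRES = {
    "Biomass": {"Constants"},
    "Clear Sky Radiation": set(),
    "Constants": set(),
    "Evapotranspiration": {"Constants", "Unstable"},
    "Meteo": {"Constants"},
    "Neutral": set(),
    "Resistance": set(),
    "Roughness": set(),
    "Soil Moisture": {"Constants", "Unstable"},
    "Solar Radiation": {"Constants"},
    "Stress": set(),
    "Unstable": {"Constants"},
}

# static_order() yields each module only after all of its requirements.
order = list(TopologicalSorter(REQUIRES).static_order())
```

In any order produced this way, Constants precedes Unstable, which in turn precedes Evapotranspiration and Soil Moisture.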
