Data Format Definition
Concept of Storing Raster Data
The EnMAP-Box stores raster data in a flat binary format with metadata in an associated text file. The structure of the text file corresponds to ENVI type headers. This way, an easy interchange of image data is possible between the EnMAP-Box and several other software applications, while the simple structure allows for easy programmatic implementation.
The format separates data from metadata: the raster data is stored as a flat byte stream in a binary file, usually in band sequential order. An associated ASCII-encoded header file contains all information necessary to read the data file. The header can easily be extended, e.g. with user-defined values.
The following section gives a short introduction to this file format and how it is used by the EnMAP-Box.
Naming Convention for Raster Data Files
The header file associated with a binary data file must meet one of the following conditions:
- It has the file name of the binary file plus the extension “.hdr”, or
- It has the file name of the binary file, where the last part starting with a dot is replaced by “.hdr”
If a header file meets the first condition, the EnMAP-Box ignores any header file that meets only the second condition. The following table gives examples of file names that conform to or violate this convention.
Table 1: Examples for names of raster files
File Encoding | File Names | Filename as shown in FileList |
---|---|---|
Matching file names | | |
Binary, Text | Biomass, Biomass.hdr | Biomass |
Binary, Text | Biomass.sample1, Biomass.sample1.hdr | Biomass.sample1 |
Binary, Text | Biomass.sample1, Biomass.hdr | Biomass.sample1 |
Binary, Text | Biomass.hdr, Biomass.hdr.hdr | Biomass.hdr |
Binary, Binary, Text | Biomass.sample1, Biomass.sample2, Biomass.hdr | Biomass.sample1, Biomass.sample2 |
Binary, Text, Text | Biomass.sample1, Biomass.hdr, Biomass.sample1.hdr | Biomass.sample1 |
Non-matching file names | | |
Binary, Text | Biomass.hdr, Biomass | |
Text, Text | Biomass, Biomass.hdr | |
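The two naming conditions and their precedence can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the EnMAP-Box implementation; the function name `find_header` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_header(data_file: str) -> Optional[Path]:
    """Return the header path for data_file, or None if no header matches."""
    data_path = Path(data_file)

    # Condition 1: full file name plus ".hdr". It takes precedence.
    appended = data_path.with_name(data_path.name + ".hdr")
    if appended.is_file():
        return appended

    # Condition 2: the last part starting with a dot is replaced by ".hdr".
    if data_path.suffix:
        replaced = data_path.with_suffix(".hdr")
        # A file cannot serve as its own header (e.g. data file "Biomass.hdr").
        if replaced != data_path and replaced.is_file():
            return replaced

    return None
```

With the files Biomass.sample1, Biomass.hdr and Biomass.sample1.hdr in one folder, the sketch returns Biomass.sample1.hdr for Biomass.sample1, which matches the last row of the matching examples in Table 1.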
Binary Data File
On physical drives the binary data is stored as a flat binary stream. The order of pixel value positions in this stream depends on the selected type of data interleave. Three types are supported: Band Sequential (BSQ), Band Interleaved by Line (BIL) and Band Interleaved by Pixel (BIP).
Figure 1 gives an example of a raster data set that consists of nine pixels. Each pixel has a numeric value for each of the three bands, represented by the colors red, green and blue. The shading highlights a pixel's position within a row, i.e. the number of its column. Since uninterrupted read and write operations on physical storage are usually faster than interrupted ones, choosing the interleave to match the access pattern can speed up processing and reduce memory requirements.
Figure 1: Illustration of a raster image with nine pixels and three bands.
With the BSQ interleave, pixel values are stored sequentially in order of the pixel column, then the pixel row, then the band. This leads to a byte stream where all pixel values of one band can be read at once, as shown in Figure 2.
Figure 2: Byte stream of pixel values according to the used interleave.
The BIP interleave results in a byte stream where the pixel values are stored in order of the band first, followed by the column and the line. This allows the fastest access to the full spectral profile of a single pixel. The BIL interleave is a compromise between BSQ and BIP. The pixel values are stored in order of their column, band and row (or line). This method can be advantageous for applications that require the full spectral information of a complete image line.
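The three storage orders can be expressed as index formulas. The following Python sketch computes the position of a single pixel value in the stream; the function name `value_index` and the use of zero-based indices are assumptions made for this illustration.

```python
def value_index(interleave, line, sample, band, lines, samples, bands):
    """Return the zero-based position of one pixel value in the data stream.

    All indices are zero-based. Multiply the result by the number of
    bytes per value to obtain the byte offset in the binary file.
    """
    if interleave == "bsq":
        # Band varies slowest, then line; sample (column) varies fastest.
        return (band * lines + line) * samples + sample
    if interleave == "bil":
        # Line varies slowest, then band; sample (column) varies fastest.
        return (line * bands + band) * samples + sample
    if interleave == "bip":
        # Line varies slowest, then sample; band varies fastest.
        return (line * samples + sample) * bands + band
    raise ValueError("unknown interleave: " + repr(interleave))
```

For the nine-pixel, three-band image of Figure 1, the value of the first band at the second line and third column sits at position 5 in BSQ, 11 in BIL and 15 in BIP.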
Header File Information
The header file is an ASCII coded text file that describes the data stored within a binary data file. It always starts with the text ENVI in the first row. The following rows contain the required and non-required meta information tags. A single tag can be structured as:
1. <tag name> = <tag value>
or
2. <tag name> = {<tag value1>,<tag value2>,…, <tag value n>}
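Both tag forms can be read with a small parser. The sketch below is a minimal illustration in Python and not the parser used by the EnMAP-Box; the function name `parse_envi_header` is hypothetical, all values are kept as strings, and tag names are converted to lower case for convenience.

```python
def parse_envi_header(text):
    """Parse ENVI header text into a dict of tag name -> value.

    Single values are returned as strings, brace-enclosed values as
    lists of strings. List values may span several rows.
    """
    rows = text.strip().splitlines()
    if not rows or rows[0].strip() != "ENVI":
        raise ValueError("an ENVI header must start with the row 'ENVI'")

    tags = {}
    i = 1
    while i < len(rows):
        row = rows[i]
        i += 1
        if "=" not in row:
            continue  # skip blank rows or rows without a tag
        # Split at the first "=" only; values such as units=Meters stay intact.
        name, value = row.split("=", 1)
        name, value = name.strip().lower(), value.strip()
        # Collect continuation rows until the closing brace is found.
        while value.startswith("{") and "}" not in value and i < len(rows):
            value += " " + rows[i].strip()
            i += 1
        if value.startswith("{"):
            inner = value[1:].split("}", 1)[0]
            tags[name] = [v.strip() for v in inner.split(",") if v.strip()]
        else:
            tags[name] = value
    return tags
```

Numeric tags such as samples or data type still need to be converted with `int()` or `float()` by the caller.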
Depending on the type of file and its usage in the EnMAP-Box different tags are required. Table 2 lists the tags that are necessary for all files in ENVI File Format that are supported and used by the EnMAP-Box. The tags in Table 3 are not obligatory but helpful for the “daily work” with the EnMAP-Box.
Table 2: Obligatory meta tags in an ENVI File Format header file
Meta tag | Description / Tag Value |
---|---|
ENVI | First row of the header file |
samples | Number of samples / columns / pixels in a row |
lines | Number of lines / rows / pixels in a column |
bands | Number of bands / layers / spectral dimension |
data type | IDL data type: 1 : byte (1 byte, from 0 to 255); 2 : integer (2 bytes, from -2^15 to 2^15-1); 3 : long integer (4 bytes, from -2^31 to 2^31-1); 4 : float (4 bytes); 5 : double (8 bytes); 12 : unsigned integer (2 bytes, from 0 to 2^16-1); 13 : unsigned long integer (4 bytes, from 0 to 2^32-1); 14 : 64-bit long integer (8 bytes, from -2^63 to 2^63-1); 15 : unsigned 64-bit long integer (8 bytes, from 0 to 2^64-1) |
interleave | Data storage order / interleave type: bsq : band sequential; bil : band interleaved by line; bip : band interleaved by pixel |
byte order | Byte order: 0 : little endian; 1 : big endian |
file type | One of ENVI Standard, ENVI Classification or ENVI Spectral Library. Unknown or other types are implicitly mapped to ENVI Standard. |
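The data type and byte order tags determine how the bytes of a single value are decoded. As a sketch, the data type codes of Table 2 can be mapped to format characters of Python's standard `struct` module. The names `STRUCT_CODES`, `value_format` and `expected_file_size` are hypothetical and chosen for this example only.

```python
import struct

# ENVI "data type" code -> struct format character with the same size.
STRUCT_CODES = {
    1: "B",   # byte, 1 byte
    2: "h",   # integer, 2 bytes
    3: "i",   # long integer, 4 bytes
    4: "f",   # float, 4 bytes
    5: "d",   # double, 8 bytes
    12: "H",  # unsigned integer, 2 bytes
    13: "I",  # unsigned long integer, 4 bytes
    14: "q",  # 64-bit long integer, 8 bytes
    15: "Q",  # unsigned 64-bit long integer, 8 bytes
}


def value_format(data_type, byte_order):
    """Return the struct format for one value, e.g. '<h' for data type 2."""
    prefix = "<" if byte_order == 0 else ">"  # 0: little endian, 1: big endian
    return prefix + STRUCT_CODES[data_type]


def expected_file_size(samples, lines, bands, data_type):
    """Return the size in bytes of a binary file without embedded header."""
    bytes_per_value = struct.calcsize(value_format(data_type, 0))
    return samples * lines * bands * bytes_per_value
```

An image with 300 samples, 300 lines, 114 bands and data type 2 should therefore occupy 300 * 300 * 114 * 2 = 20520000 bytes, which is a quick consistency check between a header and its data file.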
Table 3: Optional meta tags
Meta tag | Description / Tag Value |
---|---|
map info | Geographic coordinate information of the raster image using the format: map info = {reference, pixel x, pixel y, pixel easting, pixel northing, x-size of pixel, y-size of pixel, projection zone, North or South, Datum, size unit} Example: map info = {UTM, 1, 1, 390749.250, 5820819.800, 3.5, 3.5, 33, North, WGS-84, units=Meters} Providing this tag allows the EnMAP-Box to link the representation of different images. |
data ignore value | A pixel value that is given to undefined, invalid or masked pixels. |
description | General file description |
band names | Description for each single band |
default bands | Default bands to display using the RGB color scheme: default bands = {<band R>, <band G>, <band B>} |
wavelength | List of wavelengths, number of elements must be equal to the number of bands (tag wavelength units must be set) |
wavelength units | Physical unit of spectral values, usually Nanometers or Micrometers |
fwhm | Full width at half maximum values for each band. (tag wavelength units must be set) |
data gain values | Gain values for each band |
data offset values | Data offset value for each band |
spectra names | Names for each spectrum in a Spectral Library |
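To illustrate how the map info values relate pixels to map coordinates, the following Python sketch converts a pixel position into easting and northing. It rests on assumptions that this page does not state: pixel positions are counted from 1, the reference coordinate refers to the reference pixel position itself, and the image is oriented north-up so that northing decreases with increasing row number. The function name `pixel_to_map` is hypothetical.

```python
def pixel_to_map(map_info, column, row):
    """Convert a 1-based pixel position to (easting, northing).

    map_info is the list of tag values in the order given in Table 3:
    reference, pixel x, pixel y, pixel easting, pixel northing,
    x-size of pixel, y-size of pixel, ...
    Assumes a north-up image (northing decreases with the row number).
    """
    ref_x, ref_y = float(map_info[1]), float(map_info[2])
    easting, northing = float(map_info[3]), float(map_info[4])
    x_size, y_size = float(map_info[5]), float(map_info[6])

    x = easting + (column - ref_x) * x_size
    y = northing - (row - ref_y) * y_size
    return x, y
```

With the example tag from Table 3, the position (column 1, row 1) maps to the reference coordinate itself, and each further column adds 3.5 meters to the easting.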
Files used by EnMAP-Box
Standard Image Files
Standard image files can contain multiple bands of categorical or continuous values. They use the meta tag file type = ENVI Standard, which is also used implicitly if the value of file type is unknown or unspecified.
Example header for file type ENVI Standard: Hymap_Berlin-A_Image.hdr
ENVI
samples = 300
lines = 300
bands = 114
data type = 2
interleave = bsq
byte order = 0
file type = envi standard
default bands = { 26, 73, 15}
map info = { UTM, 1.000, 1.000, 390749.250, 5820819.800,
3.5000000000e+000,3.5000000000e+000,
33, North, WGS-84, units=Meters}
wavelength units = micrometers
fwhm = {0.015000000, 0.015000000, ... , 0.017000000}
wavelength = {0.45200000, 0.46440000, ... , 2.4546000}
Classification Files
These files store categorical data values, e.g. classification results. They use the meta tag file type = ENVI Classification. The raster image may contain pixel values from 0 to the number of classes minus one, because the value of classes counts the class unclassified as well.
The value zero is reserved for unclassified pixels. For instance, a reference data set might set all pixels to zero that are not used to validate a classification. The header of a file of this type must provide all meta tags listed in Table 2 and Table 4.
Table 4: Meta tags additionally required for Classification Files
Meta tag | Description / Tag Value |
---|---|
file type | file type = ENVI Classification |
classes | The number of classes including the class unclassified |
class lookup | Specification of color representation. Each class is assigned to a specific RGB value |
class names | A name for each class including class unclassified |
Example for file type ENVI Classification:
Hymap_Berlin-A_Classification-Training-Sample.hdr
ENVI
samples = 300
lines = 300
bands = 1
file type = ENVI Classification
data type = 1
interleave = bsq
classes = 6
class lookup = {
0, 0, 0,
0, 255, 0,
255, 0, 0,
255, 255, 0,
0, 255, 255,
0, 0, 255 }
class names = {
Unclassified, vegetation, built-up, impervious, soil, water}
byte order = 0
map info = {UTM, 1.000, 1.000, 390749.250, 5820819.800,
3.5000000000e+000, 3.5000000000e+000, 33, North, WGS-84,
units=Meters}
band names = {Classification Band}
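The class lookup tag is a flat list of RGB values, three per class, in the same order as class names. A short Python sketch shows how the two tags can be combined into one table indexed by pixel value; the function name `class_table` is hypothetical.

```python
def class_table(class_names, class_lookup):
    """Map each class value to a (name, (R, G, B)) pair.

    class_lookup is the flat list of RGB values from the header,
    three consecutive values per class.
    """
    values = [int(v) for v in class_lookup]
    if len(values) != 3 * len(class_names):
        raise ValueError("class lookup needs three values per class name")
    return {
        index: (name, tuple(values[3 * index:3 * index + 3]))
        for index, name in enumerate(class_names)
    }
```

Applied to the example header above, pixel value 0 maps to Unclassified with color (0, 0, 0) and pixel value 2 maps to built-up with color (255, 0, 0).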
Regression Files
Regression files store continuous numerical values (in contrast to classification files), as is the case for many regression references and estimation results.
The header of a regression file must provide the data ignore value tag, e.g. data ignore value = -1. This value marks unlabeled pixels and distinguishes valid pixels from pixels that were masked out. Even if every pixel of an image is valid, the data ignore value tells other routines which value they can use for masked pixels. The header of a file of this type must provide all meta tags listed in Table 2 and Table 5.
Table 5: Meta tags additionally required for Regression Files
Meta tag | Description / Tag Value |
---|---|
file type | file type = ENVI Standard |
data ignore value | The value of masked pixels |
bands | bands = 1 |
Example for Regression image with file type ENVI Standard
Hymap_Berlin-B_Regression-GroundTruth.hdr
ENVI
samples = 300
lines = 300
bands = 1
file type = ENVI Standard (IN TESTDATA STILL REGRESSION)
data type = 4
interleave = bsq
byte order = 0
map info = {UTM, 1.000, 1.000, 385271.750, 5821155.750, 1.0500010500e+001, 1.0500010500e+001, 33, North, WGS-84, units=Meters}
data ignore value = -1
band names = {Regression Band}
Masks
Mask files can be used to exclude pixels from certain processes. This restricts operations to regions of interest and reduces computational costs.
Any raster file that is described by the meta tags given in Table 2 can be used as a mask file. By default, all pixels with a value of zero mark a position that is masked and ignored during a specific operation. This can be changed by setting the mask value explicitly with data ignore value = <your mask value>.
Table 6: Meta tags additionally required for Mask Files
Meta tag | Description / Tag Value |
---|---|
file type | Any file type, even classification and regression files |
bands | bands = 1 |
data ignore value | Explicit value of masked pixels. If not defined a value of zero is used by default. |
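The masking rule can be sketched in Python as follows. The example works on flat lists of pixel values for brevity; the function name `apply_mask` is hypothetical, and the parameter `mask_value` stands for the data ignore value of the mask file, with zero as the default described above.

```python
def apply_mask(values, mask, mask_value=0):
    """Return the values whose mask pixel is not equal to mask_value.

    values and mask are sequences of equal length that hold the pixel
    values of one band of the image and of the mask file.
    """
    if len(values) != len(mask):
        raise ValueError("image and mask must have the same number of pixels")
    return [v for v, m in zip(values, mask) if m != mask_value]
```

For example, `apply_mask([0.2, 0.5, 0.9], [1, 0, 1])` keeps the first and the third value, while passing `mask_value=1` keeps only the second one.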
Spectral Libraries
Spectral Libraries are used to store spectra without a spatial context. ENVI Spectral Library files store each spectral profile in a separate image line. Therefore the number of lines is lines = <number of profiles> and the number of samples is samples = <number of wavebands>. Furthermore bands = 1 and interleave = bsq.
This specification means that even when using a BSQ interleave in the header the spectra are physically stored in BIP format. The EnMAP-Box uses this by permuting the header information in the following way: interleave = bip, bands = <number of wavebands>, samples = 1, lines = <number of profiles>. By doing so, EnMAP-Box applications can handle Spectral Libraries like normal images, e.g. for parameterizing a supervised classifier.
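The permutation of the header information can be written down in a few lines. The Python sketch below only illustrates the mapping described above and is not the EnMAP-Box code; the function name `library_as_image` is hypothetical, and the header is assumed to be available as a dict with lower-case tag names and integer dimensions.

```python
def library_as_image(tags):
    """Return header tags that describe a Spectral Library as a BIP image."""
    if tags.get("file type", "").lower() != "envi spectral library":
        raise ValueError("not an ENVI Spectral Library header")

    image = dict(tags)                # leave the original header untouched
    image["interleave"] = "bip"       # spectra are physically stored pixel by pixel
    image["bands"] = tags["samples"]  # one band per waveband
    image["samples"] = 1              # one profile per image line
    # "lines" already holds the number of profiles and stays unchanged.
    return image
```

The binary file itself is not modified: a library with 5 profiles and 235 wavebands is simply read as an image with 5 lines, 1 sample and 235 bands.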
Table 7: Meta tags additionally required for Spectral Libraries
Meta tag | Description / Tag Value |
---|---|
file type | file type = ENVI Spectral Library |
interleave | interleave = bsq |
bands | bands = 1 |
lines | Number of spectral profiles |
samples | Number of wavebands |
Example for file type ENVI Spectral Library:
Spectral Library with five spectra, each with values for 235 channels.
ENVI
description = {Spectral Library Example}
samples = 235
lines = 5
bands = 1
header offset = 0
file type = ENVI Spectral Library
data type = 5
interleave = bsq
byte order = 0
reflectance scale factor = 1.000000
band names = { Spectral Library}
spectra names = {
Spectrum1, Spectrum2, Spectrum3,
Spectrum4, Spectrum5}
Wavelength units = Nanometers
wavelength = {
423.709991, 429.450012, 434.910004, 440.179993, 445.299988,
450.320007, . . .
}