A field (stored in a cf.Field object) is a container for a data array (stored in a cf.Data object) and its metadata. The metadata comprise properties that describe the physical nature of the data and a coordinate system (called a domain, stored in a cf.Domain object) that describes the position of each element of the data array.
It is structured in exactly the same way as a field construct defined by the CF data model.
The field’s domain may contain dimensions and dimension coordinates, auxiliary coordinates and cell measures (which themselves contain data arrays and properties to describe them and are stored in cf.DimensionCoordinate, cf.AuxiliaryCoordinate and cf.CellMeasure objects respectively) and transforms (stored in cf.Transform objects) which provide geo-locating metadata for the coordinates.
As in the CF data model, all components of a field are optional.
The structure may be exposed with three different levels of detail.
The built-in repr function returns a short, one-line description of the field:
>>> type(f)
<class 'cf.field.Field'>
>>> print repr(f)
<CF Field: air_temperature(time(12), latitude(64), longitude(128)) K>
>>> f
<CF Field: air_temperature(time(12), latitude(64), longitude(128)) K>
This gives the identity of the field (air_temperature), the identities and sizes of its data array dimensions (time, latitude and longitude with sizes 12, 64 and 128 respectively) and the units of the field’s data array (K).
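To make the layout of the one-line description concrete, the following standalone sketch assembles the same string from an identity, an ordered list of dimension names and sizes, and a units string. It does not use the cf package; the function name one_line_description is hypothetical and exists only for illustration.

```python
def one_line_description(identity, dimensions, units):
    """Format a one-line summary in the style shown above.

    `dimensions` is an ordered list of (name, size) pairs.
    This helper is hypothetical and is not part of the cf package.
    """
    dims = ", ".join("{0}({1})".format(name, size) for name, size in dimensions)
    return "<CF Field: {0}({1}) {2}>".format(identity, dims, units)


summary = one_line_description(
    "air_temperature",
    [("time", 12), ("latitude", 64), ("longitude", 128)],
    "K",
)
print(summary)
```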
The built-in str function returns the same information as the one-line output, along with short descriptions of the field’s other components:
>>> print f
air_temperature field summary
-----------------------------
Data : air_temperature(time(12), latitude(64), longitude(128)) K
Cell methods : time: mean (interval: 1.0 month)
Dimensions : time(12) = [ 450-11-16 00:00:00, ..., 451-10-16 12:00:00] noleap calendar
: latitude(64) = [-87.8638000488, ..., 87.8638000488] degrees_north
: longitude(128) = [0.0, ..., 357.1875] degrees_east
: height(1) = [2.0] m
This shows that the field has a cell method and four dimension coordinates, one of which (height) is a coordinate for a size 1 dimension that is not a dimension of the field’s data array. The units and first and last values of the coordinates’ data arrays are given and relative time values are translated into strings.
The field’s dump method (or the cf.dump function) also returns each component’s properties, as well as the first and last values of the field’s data array:
>>> print f.dump()
======================
Field: air_temperature
======================
Dimensions
height(1)
latitude(64)
longitude(128)
time(12)
Data(time(12), latitude(64), longitude(128)) = [[[236.512756348, ..., 256.93371582]]] K
cell_methods = time: mean (interval: 1.0 month)
experiment_id = 'pre-industrial control experiment'
long_name = 'Surface Air Temperature'
standard_name = 'air_temperature'
title = 'model output prepared for IPCC AR4'
Dimension coordinate: time
Data(time(12)) = [ 450-11-16 00:00:00, ..., 451-10-16 12:00:00] noleap calendar
Bounds(time(12), 2) = [[ 450-11-01 00:00:00, ..., 451-11-01 00:00:00]] noleap calendar
axis = 'T'
long_name = 'time'
standard_name = 'time'
Dimension coordinate: latitude
Data(latitude(64)) = [-87.8638000488, ..., 87.8638000488] degrees_north
Bounds(latitude(64), 2) = [[-90.0, ..., 90.0]] degrees_north
axis = 'Y'
long_name = 'latitude'
standard_name = 'latitude'
Dimension coordinate: longitude
Data(longitude(128)) = [0.0, ..., 357.1875] degrees_east
Bounds(longitude(128), 2) = [[-1.40625, ..., 358.59375]] degrees_east
axis = 'X'
long_name = 'longitude'
standard_name = 'longitude'
Dimension coordinate: height
Data(height(1)) = [2.0] m
axis = 'Z'
long_name = 'height'
positive = 'up'
standard_name = 'height'
Most CF properties are stored as familiar Python objects (str, int, float, tuple, list, numpy.ndarray, etc.):
>>> f.standard_name
'air_temperature'
>>> f._FillValue
1e+20
>>> f.valid_range
(-50.0, 50.0)
>>> f.flag_values
array([0, 1, 2, 4], dtype=int8)
There are some CF properties which have their own class:
Property | Class | Description |
---|---|---|
cell_methods | cf.CellMethods | The characteristics that are represented by cell values |
>>> f.cell_methods
<CF CellMethods: time: mean (interval: 1.0 month)>
Some attributes store metadata other than CF properties and require their own class:
Attribute | Class | Description |
---|---|---|
Flags | cf.Flags | The self describing CF flag values, meanings and masks |
Units | cf.Units | The units of the data array |
domain | cf.Domain | The field’s domain |
>>> f.Flags
<CF Flags: values=[0 1 2], masks=[0 2 2], meanings=['low' 'medium' 'high']>
>>> f.Units
<CF Units: days since 1860-1-1 calendar=360_day>
>>> f.domain
<CF Domain: (110, 106, 1, 19)>
The cf.Units object may be accessed through the field’s units and calendar CF properties and the cf.Flags object may be accessed through the field’s flag_values, flag_meanings and flag_masks CF properties:
>>> f.calendar = 'noleap'
>>> f.flag_values = ['a', 'b', 'c']
The cf.Units and cf.Flags objects may also be manipulated directly, which automatically adjusts the relevant CF properties:
>>> f.Units
<CF Units: 'm'>
>>> f.units
'm'
>>> f.Units *= 1000
>>> f.Units
<CF Units: '1000 m'>
>>> f.units
'1000 m'
>>> f.Units.units = '10 m'
>>> f.units
'10 m'
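One way to picture how a property and a helper object can stay in step is a property that delegates to the object. The sketch below is a standalone illustration of that pattern only; the classes SketchUnits and SketchField are hypothetical and say nothing about how cf.Field or cf.Units are actually implemented.

```python
class SketchUnits(object):
    """Hypothetical stand-in for a units object; it only holds a string."""

    def __init__(self, units=None):
        self.units = units


class SketchField(object):
    """Hypothetical field whose `units` property delegates to `Units`."""

    def __init__(self, units=None):
        self.Units = SketchUnits(units)

    @property
    def units(self):
        # Reading the property looks up the value on the Units object
        return self.Units.units

    @units.setter
    def units(self, value):
        # Writing the property updates the Units object in place
        self.Units.units = value


g = SketchField('m')
g.Units.units = '10 m'      # change made through the object
via_property = g.units      # seen through the property
g.units = 'km'              # change made through the property
via_object = g.Units.units  # seen through the object
```

Because there is a single stored value, neither route can leave the other out of date.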
Other attributes used commonly (but not reserved) are:
Attribute | Description |
---|---|
file | The name of the file the field was read from |
id | An identifier for the field in the absence of a standard name. This may be used for ascertaining if two fields are aggregatable or combinable. |
ncvar | The netCDF variable name of the field |
>>> f.file
'/home/me/file.nc'
>>> f.id
'data_123'
>>> f.ncvar
'tas'
A field’s data array is stored in the Data attribute as a cf.Data object:
>>> type(f.Data)
<class 'cf.data.Data'>
The cf.Data object:
A field has attributes which give information about its data array. These are analogous to their numpy counterparts with the same name.
Field attribute | Description | Numpy counterpart |
---|---|---|
size | Number of elements in the data array | numpy.ndarray.size |
shape | Tuple of the data array’s dimension sizes | numpy.ndarray.shape |
ndim | Number of dimensions in the data array | numpy.ndarray.ndim |
dtype | Numpy data type of the data array | numpy.ndarray.dtype |
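The relationship between these attributes can be shown without any array library: ndim is the length of shape, and size is the product of its entries. The helper below is a hypothetical standalone sketch, not part of cf or numpy.

```python
from functools import reduce
import operator


def describe_shape(shape):
    """Return (ndim, size) for a tuple of dimension sizes.

    Hypothetical helper: ndim is the number of dimensions and size is
    the product of the dimension sizes (1 for an empty shape).
    """
    ndim = len(shape)
    size = reduce(operator.mul, shape, 1)
    return ndim, size


ndim, size = describe_shape((12, 73, 96))
print(ndim, size)
```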
The data array’s mask may be retrieved with the field’s mask attribute. The mask is returned as a field with a boolean data array:
>>> f
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>
>>> m = f.mask
>>> m
<CF Field: mask(time(12), latitude(73), longitude(96))>
>>> m.dtype
dtype('bool')
If the field contains no missing data then a mask of False values is still returned.
A domain completely describes the field’s coordinate system.
It contains the dimension constructs, auxiliary coordinate constructs, transform constructs and cell measure constructs defined by the CF data model.
A field’s domain is stored in its domain attribute, the value of which is a cf.Domain object.
The domain is a dictionary-like object whose key/value pairs identify and store the coordinate and cell measure constructs which describe it.
The dimension sizes of the domain are given by the domain’s dimension_sizes attribute:
>>> f.domain.dimension_sizes
{'dim1': 19, 'dim0': 12, 'dim2': 73, 'dim3': 96}
Keys are dimension identifiers ('dimN') and values are integers giving the size of each dimension.
The N part of each key identifier is replaced by an arbitrary integer greater than or equal to zero, the only restriction being that the resulting identifier is not already in use. No meaning should be inferred from the integer part of the identifiers, which need not include zero nor be consecutive (although these will commonly be the case).
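The only rule for choosing an identifier is that it must not already be in use. A minimal sketch of one possible strategy, taking the lowest free integer, is given here; the function new_identifier is hypothetical and other choices of integer would be equally valid.

```python
def new_identifier(existing, prefix='dim'):
    """Return an identifier such as 'dim4' that is not a key of `existing`.

    Hypothetical helper: it picks the lowest unused non-negative integer,
    although any unused integer would satisfy the rule described above.
    """
    n = 0
    while '{0}{1}'.format(prefix, n) in existing:
        n += 1
    return '{0}{1}'.format(prefix, n)


dimension_sizes = {'dim1': 19, 'dim0': 12, 'dim2': 73, 'dim3': 96}
key = new_identifier(dimension_sizes)

# Identifiers need not be consecutive, so gaps may be reused
sparse = new_identifier({'dim0': 1, 'dim5': 2})
```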
The domain’s key/value pairs identify and store its coordinate and cell measure constructs.
Keys are dimension, auxiliary coordinate and cell measure identifiers ('dimN', 'auxN' and 'cmN' respectively) and values are cf.DimensionCoordinate, cf.AuxiliaryCoordinate and cf.CellMeasure objects as appropriate:
>>> f.domain['dim0']
<CF Coordinate: time(12)>
>>> f.domain['dim2']
<CF Coordinate: latitude(73)>
>>> f.domain['aux0']
<CF Coordinate: forecast_time(12)>
The dimensions of each of these components, and of the field’s data array, are stored as ordered lists in the dimensions attribute:
>>> f.domain.dimensions
{'data': ['dim0', 'dim1', 'dim2', 'dim3'],
'aux0': ['dim0'],
'dim0': ['dim0'],
'dim1': ['dim1'],
'dim2': ['dim2'],
'dim3': ['dim3']}
Keys are dimension coordinate identifiers ('dimN'), auxiliary coordinate identifiers ('auxN') and cell measure construct identifiers ('cmN'), and values are lists of dimension identifiers ('dimN'), stating the dimensions, in order, of the construct concerned. The dimension identifiers must all exist as keys to the dimension_sizes dictionary.
The special key 'data' stores the ordered list of dimension identifiers ('dimN') relating to a containing field’s data array.
The N part of each key identifier should be replaced by an arbitrary integer greater than or equal to zero, the only restriction being that the resulting identifier is not already in use. No meaning should be inferred from the integer part of the identifiers, which need not include zero nor be consecutive (although these will commonly be the case).
Note
The field’s data array may contain fewer size 1 dimensions than its domain.
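The consistency rules between the two dictionaries can be checked with plain Python. The sketch below uses ordinary dicts shaped like the examples above, plus a hypothetical size 1 dimension 'dim4' that belongs to the domain but not to the data array. All three helper functions are illustrative assumptions and are not part of cf.

```python
# Plain dictionaries shaped like the examples above; 'dim4' is a
# hypothetical size 1 dimension added for illustration.
dimension_sizes = {'dim0': 12, 'dim1': 19, 'dim2': 73, 'dim3': 96, 'dim4': 1}
dimensions = {
    'data': ['dim0', 'dim1', 'dim2', 'dim3'],
    'aux0': ['dim0'],
    'dim0': ['dim0'],
    'dim1': ['dim1'],
    'dim2': ['dim2'],
    'dim3': ['dim3'],
    'dim4': ['dim4'],
}


def undefined_dimensions(dimensions, dimension_sizes):
    """Return the dimension identifiers that are used but have no size."""
    used = set(dim for dims in dimensions.values() for dim in dims)
    return sorted(used - set(dimension_sizes))


def data_shape(dimensions, dimension_sizes):
    """Return the data array shape implied by the 'data' entry."""
    return tuple(dimension_sizes[dim] for dim in dimensions['data'])


def absent_from_data(dimensions, dimension_sizes):
    """Return the domain dimensions that the data array does not span."""
    return sorted(set(dimension_sizes) - set(dimensions['data']))


missing = undefined_dimensions(dimensions, dimension_sizes)
shape = data_shape(dimensions, dimension_sizes)
extra = absent_from_data(dimensions, dimension_sizes)
```

Here every identifier has a size, the implied shape is (12, 19, 73, 96), and the only dimension missing from the data array has size 1, as the note requires.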
Transform constructs are stored in the transforms attribute, which is a dictionary-like object containing cf.Transform objects:
>>> f.domain.transforms
{'trans0': <CF Transform: atmosphere_sigma_coordinate>,
'trans1': <CF Transform: rotated_latitude_longitude>}
Keys are transform identifiers ('transN') and values are cf.Transform objects.
The N part of each key identifier should be replaced by an arbitrary integer greater than or equal to zero, the only restriction being that the resulting identifier is not already in use. No meaning should be inferred from the integer part of the identifiers, which need not include zero nor be consecutive (although these will commonly be the case).
A transform may be associated with any number of the domain’s coordinates via their transform attributes.
A cf.FieldList object is an ordered sequence of fields analogous to a built-in Python list.
It has all of the Python list-like methods (__contains__, __getitem__, __setitem__, __len__, __delitem__, append, count, extend, index, insert, pop, remove, reverse), which behave as expected. For example:
>>> fl
[<CF Field: x_wind(grid_latitude(110), grid_longitude(106)) m s-1>,
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>]
>>> type(fl)
<class 'cf.field.FieldList'>
>>> len(fl)
2
>>> for f in fl:
... print repr(f)
...
<CF Field: x_wind(grid_latitude(110), grid_longitude(106)) m s-1>
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>
>>> for f in fl[::-1]:
... print repr(f)
...
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>
<CF Field: x_wind(grid_latitude(110), grid_longitude(106)) m s-1>
>>> f = fl[0]
>>> type(f)
<class 'cf.field.Field'>
>>> f in fl
True
>>> f = fl.pop()
>>> f
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>
>>> fl
[<CF Field: x_wind(grid_latitude(110), grid_longitude(106)) m s-1>]
In some contexts, whether an object is a field or a field list is not known and does not matter. So to avoid ungainly type testing, some aspects of the cf.FieldList interface are shared by a cf.Field and vice versa.
Just as it is straightforward to iterate over the fields in a field list, a field will behave like a single-element field list in iterative and indexing contexts:
>>> f
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>
>>> f is f[0]
True
>>> f is f[-1]
True
>>> f is f[slice(0, 1)]
True
>>> f is f[slice(0, None, -1)]
True
>>> for g in f:
... print repr(g)
...
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>
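The single-element behaviour can be imitated with a small class whose indexing and iteration hand back the object itself. This is a standalone sketch under stated assumptions: SketchField is hypothetical, and returning an empty list for a slice that selects nothing is a choice made here, not documented cf behaviour.

```python
class SketchField(object):
    """Hypothetical object that behaves like a one-element sequence."""

    def __init__(self, name):
        self.name = name

    def __len__(self):
        return 1

    def __iter__(self):
        # Iterating yields the object itself exactly once
        yield self

    def __getitem__(self, index):
        # Index a one-element tuple so that out-of-range integers
        # raise IndexError just as they would for a real sequence
        selected = (self,)[index]
        if isinstance(index, slice):
            # Assumption of this sketch: an empty selection gives []
            return self if selected else []
        return selected


g = SketchField('air_temperature')
checks = [
    g[0] is g,
    g[-1] is g,
    g[slice(0, 1)] is g,
    g[slice(0, None, -1)] is g,
]
items = [h for h in g]
```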
Any attribute or method belonging to a field may be used on a field list and will be applied independently to each element:
>>> fl.ndim
[2, 3]
>>> fl.subspace[..., 0]
[<CF Field: x_wind(grid_latitude(110), grid_longitude(1)) m s-1>,
<CF Field: air_temperature(time(12), latitude(73), longitude(1)) K>]
>>> fl **= 2
>>> for f in fl:
... f.long_name = f.standard_name + '**2'
...
>>> fl
[<CF Field: long_name:x_wind**2(grid_latitude(110), grid_longitude(1)) m2 s-2>,
<CF Field: long_name:air_temperature**2(time(12), latitude(73), longitude(1)) K2>]
>>> fl.squeeze('longitude')
[<CF Field: long_name:x_wind**2(grid_latitude(110)) m2 s-2>,
<CF Field: long_name:air_temperature**2(time(12), latitude(73)) K2>]
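Applying an attribute or method to every element can be pictured as a list subclass that forwards unknown attribute lookups to its items. The sketch below shows that general pattern only; SketchFieldList, SketchField and the describe method are hypothetical names, and this is not a description of how cf.FieldList is implemented.

```python
class SketchField(object):
    """Hypothetical element type with one attribute and one method."""

    def __init__(self, standard_name, shape):
        self.standard_name = standard_name
        self.shape = shape

    @property
    def ndim(self):
        return len(self.shape)

    def describe(self, sep=' '):
        return '{0}{1}{2}d'.format(self.standard_name, sep, self.ndim)


class SketchFieldList(list):
    """Hypothetical list that applies attribute access to each element."""

    def __getattr__(self, name):
        # Only reached when normal lookup fails, so list methods such
        # as append and pop keep their usual behaviour
        values = [getattr(item, name) for item in self]
        if values and all(callable(v) for v in values):
            def broadcast(*args, **kwargs):
                # Call the method on each element and collect the results
                return [v(*args, **kwargs) for v in values]
            return broadcast
        return values


fl = SketchFieldList([
    SketchField('x_wind', (110, 106)),
    SketchField('air_temperature', (12, 73, 96)),
])
ndims = fl.ndim
labels = fl.describe(sep=':')
```

Using __getattr__ rather than __getattribute__ keeps the built-in list interface intact, since it is consulted only for names the list itself does not define.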
CF properties may be changed to a common value with the setprop method:
>>> fl.setprop('comment', 'my data')
>>> fl.comment
['my data', 'my data']
>>> fl.setprop('foo', 'bar')
>>> fl.getprop('foo')
['bar', 'bar']
Changes tailored to each individual field in the list need to be carried out in a loop:
>>> long_names = ('square of x wind', 'square of temperature')
>>> for f, value in zip(fl, long_names):
... f.long_name = value
>>> for f in fl:
... f.long_name = 'square of ' + f.long_name
Footnotes
[1] | Arrays that may have missing or invalid entries |