Structure and contents of a cf.Field object

A field (stored in a cf.Field object) is a container for a data array (stored in a cf.Data object) and metadata comprising properties to describe the physical nature of the data and a coordinate system (called a domain, stored in a cf.Domain object), which describes the positions of each element of the data array.

It is structured in exactly the same way as a field construct defined by the CF data model.

The field’s domain may contain dimensions and dimension coordinates, auxiliary coordinates and cell measures (which themselves contain data arrays and properties to describe them and are stored in cf.DimensionCoordinate, cf.AuxiliaryCoordinate and cf.CellMeasure objects respectively) and transforms (stored in cf.Transform objects) which provide geo-locating metadata for the coordinates.

As in the CF data model, all components of a field are optional.
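The containment described above can be pictured with a minimal sketch. The class names SketchField and SketchDomain are hypothetical stand-ins invented for illustration; they are not the cf classes and only mirror the idea that a field holds an optional data array, optional properties and an optional domain.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SketchDomain:
    """Hypothetical stand-in for a domain: the coordinate system."""
    dimension_sizes: Dict[str, int] = field(default_factory=dict)
    coordinates: Dict[str, List[float]] = field(default_factory=dict)


@dataclass
class SketchField:
    """Hypothetical stand-in for a field; every component is optional."""
    data: Optional[List[Any]] = None
    properties: Dict[str, Any] = field(default_factory=dict)
    domain: Optional[SketchDomain] = None


# A field with no components at all is still a valid container.
empty = SketchField()

# A field with a data array, descriptive properties and a domain.
full = SketchField(
    data=[271.3, 272.9, 274.1],
    properties={"standard_name": "air_temperature", "units": "K"},
    domain=SketchDomain(
        dimension_sizes={"dim0": 3},
        coordinates={"dim0": [0.0, 1.0, 2.0]},
    ),
)
```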

Example

The structure may be exposed with three different levels of detail.

The built-in repr function returns a short, one-line description of the field:

>>> type(f)
<class 'cf.field.Field'>
>>> print repr(f)
<CF Field: air_temperature(time(12), latitude(64), longitude(128)) K>
>>> f
<CF Field: air_temperature(time(12), latitude(64), longitude(128)) K>

This gives the identity of the field (air_temperature), the identities and sizes of its data array dimensions (time, latitude and longitude with sizes 12, 64 and 128 respectively) and the units of the field’s data array (K).
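As a rough illustration of how such a one-line description is composed from its three parts, the following sketch rebuilds the string shown above. The function one_line_summary is hypothetical and is not part of the cf package.

```python
def one_line_summary(identity, dimensions, units):
    """Build a one-line description from an identity, (name, size) pairs and units."""
    dims = ", ".join("{}({})".format(name, size) for name, size in dimensions)
    return "<CF Field: {}({}) {}>".format(identity, dims, units)


summary = one_line_summary(
    "air_temperature",
    [("time", 12), ("latitude", 64), ("longitude", 128)],
    "K",
)
print(summary)
# <CF Field: air_temperature(time(12), latitude(64), longitude(128)) K>
```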

The built-in str function returns the same information as the one-line output, along with short descriptions of the field’s other components:

>>> print f
air_temperature field summary
-----------------------------
Data            : air_temperature(time(12), latitude(64), longitude(128)) K
Cell methods    : time: mean (interval: 1.0 month)
Dimensions      : time(12) = [ 450-11-16 00:00:00, ...,  451-10-16 12:00:00] noleap calendar
                : latitude(64) = [-87.8638000488, ..., 87.8638000488] degrees_north
                : longitude(128) = [0.0, ..., 357.1875] degrees_east
                : height(1) = [2.0] m

This shows that the field has a cell method and four dimension coordinates, one of which (height) is a coordinate for a size 1 dimension that is not a dimension of the field’s data array. The units and first and last values of the coordinates’ data arrays are given and relative time values are translated into strings.

The field’s dump method (or the cf.dump function) also returns each component’s properties, as well as the first and last values of the field’s data array:

>>> print f.dump()
======================
Field: air_temperature
======================
Dimensions
    height(1)
    latitude(64)
    longitude(128)
    time(12)

Data(time(12), latitude(64), longitude(128)) = [[[236.512756348, ..., 256.93371582]]] K
cell_methods = time: mean (interval: 1.0 month)

experiment_id = 'pre-industrial control experiment'
long_name = 'Surface Air Temperature'
standard_name = 'air_temperature'
title = 'model output prepared for IPCC AR4'

Dimension coordinate: time
    Data(time(12)) = [ 450-11-16 00:00:00, ...,  451-10-16 12:00:00] noleap calendar
    Bounds(time(12), 2) = [[ 450-11-01 00:00:00, ...,  451-11-01 00:00:00]] noleap calendar
    axis = 'T'
    long_name = 'time'
    standard_name = 'time'

Dimension coordinate: latitude
    Data(latitude(64)) = [-87.8638000488, ..., 87.8638000488] degrees_north
    Bounds(latitude(64), 2) = [[-90.0, ..., 90.0]] degrees_north
    axis = 'Y'
    long_name = 'latitude'
    standard_name = 'latitude'

Dimension coordinate: longitude
    Data(longitude(128)) = [0.0, ..., 357.1875] degrees_east
    Bounds(longitude(128), 2) = [[-1.40625, ..., 358.59375]] degrees_east
    axis = 'X'
    long_name = 'longitude'
    standard_name = 'longitude'

Dimension coordinate: height
    Data(height(1)) = [2.0] m
    axis = 'Z'
    long_name = 'height'
    positive = 'up'
    standard_name = 'height'

CF properties and attributes

Most CF properties are stored as familiar Python objects (str, int, float, tuple, list, numpy.ndarray, etc.):

>>> f.standard_name
'air_temperature'
>>> f._FillValue
1e+20
>>> f.valid_range
(-50.0, 50.0)
>>> f.flag_values
array([0, 1, 2, 4], dtype=int8)

Some CF properties have their own class:

Property      Class           Description
cell_methods  cf.CellMethods  The characteristics that are represented by cell values

>>> f.cell_methods
<CF CellMethods: time: mean (interval: 1.0 month)>

Some attributes store metadata other than CF properties and require their own class:

Attribute  Class      Description
Flags      cf.Flags   The self-describing CF flag values, meanings and masks
Units      cf.Units   The units of the data array
domain     cf.Domain  The field’s domain

>>> f.Flags
<CF Flags: values=[0 1 2], masks=[0 2 2], meanings=['low' 'medium' 'high']>
>>> f.Units
<CF Units: days since 1860-1-1 calendar=360_day>
>>> f.domain
<CF Domain: (110, 106, 1, 19)>

The cf.Units object may be accessed through the field’s units and calendar CF properties and the cf.Flags object may be accessed through the field’s flag_values, flag_meanings and flag_masks CF properties:

>>> f.calendar = 'noleap'
>>> f.flag_values = ['a', 'b', 'c']

The cf.Units and cf.Flags objects may also be manipulated directly, which automatically adjusts the relevant CF properties:

>>> f.Units
<CF Units: 'm'>
>>> f.units
'm'
>>> f.Units *= 1000
>>> f.Units
<CF Units: '1000 m'>
>>> f.units
'1000 m'
>>> f.Units.units = '10 m'
>>> f.units
'10 m'
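One way to picture this two-way link is a property that reads from and writes to the units object. The sketch below is a simplified illustration under that assumption, not the cf implementation: SketchUnits and SketchField are hypothetical names, and multiplying only prefixes a scale factor to the units string without any real unit algebra.

```python
class SketchUnits:
    """Hypothetical, much-simplified units object holding a units string."""

    def __init__(self, units):
        self.units = units

    def __imul__(self, factor):
        # Prefix the scale factor, as in 'm' -> '1000 m'.
        # No unit algebra is attempted in this sketch.
        self.units = "{} {}".format(factor, self.units)
        return self


class SketchField:
    """Hypothetical field whose 'units' property delegates to 'Units'."""

    def __init__(self, units):
        self.Units = SketchUnits(units)

    @property
    def units(self):
        return self.Units.units

    @units.setter
    def units(self, value):
        self.Units.units = value


f = SketchField("m")
f.Units *= 1000          # changes the string seen through f.units
scaled = f.units         # '1000 m'
f.Units.units = "10 m"   # direct manipulation is also visible
direct = f.units         # '10 m'
f.units = "km"           # setting the property updates the Units object
```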

Other attributes used commonly (but not reserved) are:

Attribute  Description
file       The name of the file the field was read from
id         An identifier for the field in the absence of a standard name. This may be used for ascertaining if two fields are aggregatable or combinable.
ncvar      The netCDF variable name of the field

>>> f.file
'/home/me/file.nc'
>>> f.id
'data_123'
>>> f.ncvar
'tas'

Data array

A field’s data array is stored in the Data attribute as a cf.Data object:

>>> type(f.Data)
<class 'cf.data.Data'>

The cf.Data object:

  • Contains an N-dimensional array with many similarities to a numpy array.
  • Contains the units of the array elements.
  • Uses LAMA functionality to store and operate on arrays which are larger than the available memory.
  • Supports masked arrays [1], regardless of whether or not it was initialized with a masked array.

Attributes

A field has attributes which give information about its data array. These are analogous to their numpy counterparts with the same name.

Field attribute  Description                                 Numpy counterpart
size             Number of elements in the data array        numpy.ndarray.size
shape            Tuple of the data array’s dimension sizes   numpy.ndarray.shape
ndim             Number of dimensions in the data array      numpy.ndarray.ndim
dtype            Numpy data type of the data array           numpy.ndarray.dtype
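For reference, the numpy counterparts behave as follows on a plain numpy array. The array shape is taken from the air_temperature example earlier on this page; the float32 data type is an assumption chosen only for illustration.

```python
import numpy

# A plain numpy array with the shape of the example field's data array.
array = numpy.zeros((12, 64, 128), dtype="float32")

print(array.size)   # 98304
print(array.shape)  # (12, 64, 128)
print(array.ndim)   # 3
print(array.dtype)  # float32
```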

Data mask

The data array’s mask may be retrieved with the field’s mask attribute. The mask is returned as a field with a boolean data array:

>>> f
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>
>>> m = f.mask
>>> m
<CF Field: mask(time(12), latitude(73), longitude(96))>
>>> m.dtype
dtype('bool')

If the field contains no missing data then a mask of False values is still returned.
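The same behaviour can be seen with numpy masked arrays, which serve here only as an analogy for the field's mask attribute. numpy.ma.getmaskarray always returns a full boolean array, even when nothing is masked. The data values and the 1e20 missing-value marker below are made up for the example.

```python
import numpy

# Mask every element equal to the missing-value marker 1e20.
data = numpy.ma.masked_values([280.0, 1e20, 275.5], 1e20)
mask = numpy.ma.getmaskarray(data)

# With no missing data, a mask of False values is still returned.
no_missing = numpy.ma.array([1.0, 2.0, 3.0])
full_mask = numpy.ma.getmaskarray(no_missing)

print(mask.tolist())       # [False, True, False]
print(full_mask.tolist())  # [False, False, False]
```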

Domain structure

A domain completely describes the field’s coordinate system.

It contains the dimension constructs, auxiliary coordinate constructs, transform constructs and cell measure constructs defined by the CF data model.

A field’s domain is stored in its domain attribute, the value of which is a cf.Domain object.

The domain is a dictionary-like object whose key/value pairs identify and store the coordinate and cell measure constructs which describe it.

Dimensionality

The dimension sizes of the domain are given by the domain’s dimension_sizes attribute:

>>> f.domain.dimension_sizes
{'dim1': 19, 'dim0': 12, 'dim2': 73, 'dim3': 96}

Keys are dimension identifiers ('dimN') and values are integers giving the size of each dimension.

The N part of each key identifier is replaced by an arbitrary integer greater than or equal to zero, the only restriction being that the resulting identifier is not already in use. No meaning should be inferred from the integer part of the identifiers, which need not include zero nor be consecutive (although these will commonly be the case).
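A simple way to satisfy the "not already in use" restriction is to try successive integers until a free identifier is found. The helper new_identifier below is hypothetical and only illustrates one possible choice; any unused integer would be equally valid.

```python
def new_identifier(prefix, existing):
    """Return prefix + N for the smallest N >= 0 not already used as a key."""
    n = 0
    while "{}{}".format(prefix, n) in existing:
        n += 1
    return "{}{}".format(prefix, n)


dimension_sizes = {"dim1": 19, "dim0": 12, "dim2": 73, "dim3": 96}
key = new_identifier("dim", dimension_sizes)

# Identifiers need not be consecutive, so gaps may be reused.
sparse = new_identifier("dim", {"dim0": 12, "dim5": 1})

print(key)     # dim4
print(sparse)  # dim1
```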

Components

The domain’s key/value pairs identify and store its coordinate and cell measure constructs.

Keys are dimension coordinate, auxiliary coordinate and cell measure identifiers ('dimN', 'auxN' and 'cmN' respectively) and values are cf.DimensionCoordinate, cf.AuxiliaryCoordinate and cf.CellMeasure objects as appropriate:

>>> f.domain['dim0']
<CF Coordinate: time(12)>
>>> f.domain['dim2']
<CF Coordinate: latitude(73)>
>>> f.domain['aux0']
<CF Coordinate: forecast_time(12)>

The dimensions of each of these components, and of the field’s data array, are stored as ordered lists in the dimensions attribute:

>>> f.domain.dimensions
{'data': ['dim0', 'dim1', 'dim2', 'dim3'],
 'aux0': ['dim0'],
 'dim0': ['dim0'],
 'dim1': ['dim1'],
 'dim2': ['dim2'],
 'dim3': ['dim3']}

Keys are dimension coordinate identifiers ('dimN'), auxiliary coordinate identifiers ('auxN') and cell measure construct identifiers ('cmN'), and values are lists of dimension identifiers ('dimN'), stating the dimensions, in order, of the construct concerned. The dimension identifiers must all exist as keys to the dimension_sizes dictionary.

The special key 'data' stores the ordered list of dimension identifiers ('dimN') relating to a containing field’s data array.
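Taken together, the two dictionaries determine the shape of every construct. The sketch below uses plain Python dictionaries holding the values shown above rather than a real cf.Domain object, and construct_shape is a hypothetical helper.

```python
dimension_sizes = {"dim1": 19, "dim0": 12, "dim2": 73, "dim3": 96}
dimensions = {
    "data": ["dim0", "dim1", "dim2", "dim3"],
    "aux0": ["dim0"],
    "dim0": ["dim0"],
    "dim1": ["dim1"],
    "dim2": ["dim2"],
    "dim3": ["dim3"],
}


def construct_shape(key):
    """Look up the size of each dimension of a construct, in order."""
    return tuple(dimension_sizes[dim] for dim in dimensions[key])


# Every dimension identifier used must be a key of dimension_sizes.
used = {dim for dims in dimensions.values() for dim in dims}
unknown = used - set(dimension_sizes)

print(construct_shape("data"))  # (12, 19, 73, 96)
print(construct_shape("aux0"))  # (12,)
```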

The N part of each key identifier should be replaced by an arbitrary integer greater than or equal to zero, the only restriction being that the resulting identifier is not already in use. No meaning should be inferred from the integer part of the identifiers, which need not include zero nor be consecutive (although these will commonly be the case).

Note

The field’s data array may contain fewer size 1 dimensions than its domain.

Transform constructs are stored in the transforms attribute, which is a dictionary-like object containing cf.Transform objects:

>>> f.domain.transforms
{'trans0': <CF Transform: atmosphere_sigma_coordinate>,
 'trans1': <CF Transform: rotated_latitude_longitude>}

Keys are transform identifiers ('transN') and values are cf.Transform objects.

The N part of each key identifier should be replaced by an arbitrary integer greater than or equal to zero, the only restriction being that the resulting identifier is not already in use. No meaning should be inferred from the integer part of the identifiers, which need not include zero nor be consecutive (although these will commonly be the case).

A transform may be associated with any number of the domain’s coordinates via their transform attributes.

Field list

A cf.FieldList object is an ordered sequence of fields analogous to a built-in Python list.

It has all of the python list-like methods (__contains__, __getitem__, __setitem__, __len__, __delitem__, append, count, extend, index, insert, pop, remove, reverse), which behave as expected. For example:

>>> fl
[<CF Field: x_wind(grid_latitude(110), grid_longitude(106)) m s-1>,
 <CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>]
>>> type(fl)
<class 'cf.field.FieldList'>
>>> len(fl)
2
>>> for f in fl:
...     print repr(f)
...
<CF Field: x_wind(grid_latitude(110), grid_longitude(106)) m s-1>
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>
>>> for f in fl[::-1]:
...     print repr(f)
...
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>
<CF Field: x_wind(grid_latitude(110), grid_longitude(106)) m s-1>
>>> f = fl[0]
>>> type(f)
<class 'cf.field.Field'>
>>> f in fl
True
>>> f = fl.pop()
>>> f
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>
>>> fl
[<CF Field: x_wind(grid_latitude(110), grid_longitude(106)) m s-1>]

Field versus field list

In some contexts, whether an object is a field or a field list is not known and does not matter. So to avoid ungainly type testing, some aspects of the cf.FieldList interface are shared by a cf.Field and vice versa.

Looping

Just as it is straightforward to iterate over the fields in a field list, a field will behave like a single-element field list in iterative and indexing contexts:

>>> f
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>
>>> f is f[0]
True
>>> f is f[-1]
True
>>> f is f[slice(0, 1)]
True
>>> f is f[slice(0, None, -1)]
True
>>> for g in f:
...     print repr(g)
...
<CF Field: air_temperature(time(12), latitude(73), longitude(96)) K>
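This list-like behaviour can be imitated by delegating indexing to a one-element list. The sketch below is an illustration of the idea rather than the cf implementation. SketchField is a hypothetical class, and returning an empty list for a slice that selects nothing is an assumption, since the text above does not say what happens in that case.

```python
class SketchField:
    """Hypothetical field that behaves like a single-element list of itself."""

    def __init__(self, name):
        self.name = name

    def __len__(self):
        return 1

    def __getitem__(self, index):
        # Index a one-element list so that the usual list rules apply,
        # including IndexError for out-of-range integer indices.
        selected = [self][index]
        if isinstance(index, slice):
            # Assumption: a slice selecting the element returns the field
            # itself, and an empty selection returns an empty list.
            return self if selected else []
        return selected

    def __iter__(self):
        yield self


f = SketchField("air_temperature")
looped = [g for g in f]  # a single pass that yields the field itself
```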

Attributes and methods

Any attribute or method belonging to a field may be used on a field list and will be applied independently to each element:

>>> fl.ndim
[2, 3]
>>> fl.subspace[..., 0]
[<CF Field: x_wind(grid_latitude(110), grid_longitude(1)) m s-1>,
 <CF Field: air_temperature(time(12), latitude(73), longitude(1)) K>]
>>> fl **= 2
>>> for f in fl:
...     f.long_name = f.standard_name + '**2'
...
>>> fl
[<CF Field: long_name:x_wind**2(grid_latitude(110), grid_longitude(1)) m2 s-2>,
 <CF Field: long_name:air_temperature**2(time(12), latitude(73), longitude(1)) K2>]
>>> fl.squeeze('longitude')
[<CF Field: long_name:x_wind**2(grid_latitude(110)) m2 s-2>,
 <CF Field: long_name:air_temperature**2(time(12), latitude(73)) K2>]

CF properties may be changed to a common value with the setprop method:

>>> fl.setprop('comment', 'my data')
>>> fl.comment
['my data', 'my data']
>>> fl.setprop('foo', 'bar')
>>> fl.getprop('foo')
['bar', 'bar']
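Applying an attribute or method to each element independently can be sketched with a list subclass that forwards unknown attribute lookups to its elements. SketchField and SketchFieldList are hypothetical classes written for illustration; how the cf package achieves this is not stated here.

```python
class SketchField:
    """Hypothetical field storing its properties as instance attributes."""

    def __init__(self, **properties):
        self.__dict__.update(properties)

    def setprop(self, name, value):
        setattr(self, name, value)

    def getprop(self, name):
        return getattr(self, name)


class SketchFieldList(list):
    """Hypothetical list that forwards attribute access to each element."""

    def __getattr__(self, name):
        # Only reached when normal list attribute lookup fails.
        if name.startswith("__"):
            raise AttributeError(name)
        values = [getattr(f, name) for f in self]
        if values and all(callable(v) for v in values):
            # Methods: return a callable that calls each element's method.
            def call_each(*args, **kwargs):
                return [v(*args, **kwargs) for v in values]
            return call_each
        # Plain attributes: return one value per element.
        return values


fl = SketchFieldList([SketchField(ndim=2), SketchField(ndim=3)])
ndims = fl.ndim                    # [2, 3]
fl.setprop("comment", "my data")   # applied to every element
comments = fl.comment              # ['my data', 'my data']
```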

Changes tailored to each individual field in the list need to be carried out in a loop:

>>> long_names = ('square of x wind', 'square of temperature')
>>> for f, value in zip(fl, long_names):
...     f.long_name = value
...
>>> for f in fl:
...     f.long_name = 'square of ' + f.long_name
...

Footnotes

[1] Arrays that may have missing or invalid entries