This module reads and creates NetCDF files. Files are accessed through the netcdf_file object, and the data read from or written to a file is held in netcdf_variable objects. Attributes are exposed as member variables of the netcdf_file and netcdf_variable objects.


NetCDF is a self-describing binary data format: each file carries metadata that describes the dimensions and variables it contains. More details about NetCDF files can be found here. A NetCDF data structure has three main sections:

  1. Dimensions
  2. Variables
  3. Attributes

The dimensions section records the name and length of each dimension used by the variables. Each variable then indicates which dimensions it uses and carries any attributes, such as data units, along with the variable's data values. It is good practice to include a variable with the same name as each dimension to provide the values for that axis. Lastly, the attributes section contains additional information, such as the name of the file creator or the instrument used to collect the data.
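To make the relationship between the three sections concrete, here is a minimal sketch that models them as plain Python dictionaries. This is only an illustration: the layout, the `check_shapes` helper, and the sample names and values are hypothetical and do not reflect how the module or the file format stores anything internally.

```python
# Hypothetical in-memory model of the three NetCDF sections.
dataset = {
    # 1. Dimensions: name -> length
    'dimensions': {'time': 10},
    # 2. Variables: each names its dimensions, carries attributes and data
    'variables': {
        # A variable named after its dimension supplies the axis values.
        'time': {
            'dimensions': ('time',),
            'attributes': {'units': 'days since 2008-01-01'},
            'data': list(range(10)),
        },
    },
    # 3. Attributes: file-level information
    'attributes': {'history': 'Created for a test'},
}


def check_shapes(ds):
    """Return the names of variables whose data size disagrees with their dimensions."""
    bad = []
    for name, var in ds['variables'].items():
        expected = 1
        for dim in var['dimensions']:
            expected *= ds['dimensions'][dim]
        if len(var['data']) != expected:
            bad.append(name)
    return bad


# Variables that share a name with a dimension act as coordinate axes.
coordinate_vars = [n for n in dataset['variables'] if n in dataset['dimensions']]
```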

When writing data to a NetCDF file, you often need to designate a 'record dimension': the unbounded dimension along which a variable can grow. For example, a temperature variable may have dimensions of latitude, longitude and time. If one wants to add more temperature data to the NetCDF file as time progresses, then the temperature variable should have the time dimension flagged as the record dimension.
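A minimal sketch of a record dimension follows. It assumes the module is importable as `pupynere` (falling back to `scipy.io.netcdf_file`, which is derived from it and shares the API) and that, as in the Scientific.IO.NetCDF API, passing `None` as the length to `createDimension` marks the dimension as unlimited. The file name, variable name, and values are illustrative, and the variable is reduced to a single time axis for brevity.

```python
try:
    from pupynere import netcdf_file
except ImportError:
    # Assumption: scipy's netcdf_file is used as a stand-in with the same API.
    from scipy.io import netcdf_file

f = netcdf_file('temperature.nc', 'w')
f.createDimension('time', None)      # None marks the record (unlimited) dimension
temp = f.createVariable('temperature', 'f', ('time',))
temp.units = 'degC'

# Each assignment past the current end appends a record along 'time'.
temp[0] = 20.5
temp[1] = 21.0
temp[2] = 21.5

n_records = temp.shape[0]            # number of records written so far
del temp                             # drop the variable before closing the file
f.close()
```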

This module implements the Scientific.IO.NetCDF API to read and create NetCDF files. The same API is also used in the PyNIO and pynetcdf modules, allowing these modules to be used interchangeably when working with NetCDF files. The major advantage of this module over other modules is that it doesn't require the code to be linked to the NetCDF libraries.

In addition, the NetCDF file header records the position of each variable's data in the file, so data can be accessed efficiently without loading unnecessary parts of the file into memory. For the same purpose, this module uses the mmap module to create NumPy arrays mapped to the data on disk.
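The idea of reading an offset from a header and then mapping only the bytes that are needed can be sketched with the standard library alone. The file layout below (a `TOY1` magic string, a data offset, and a value count) is a made-up stand-in, not the real NetCDF header, and the helper names are hypothetical; the module itself exposes the mapped bytes as NumPy arrays rather than unpacking them with `struct`.

```python
import mmap
import os
import struct
import tempfile

# Toy header: 4-byte magic, data offset, number of big-endian int32 values.
HEADER = struct.Struct('>4sII')


def write_toy_file(path, values, padding=1024):
    """Write a header, some unrelated filler bytes, then the values."""
    offset = HEADER.size + padding
    with open(path, 'wb') as fp:
        fp.write(HEADER.pack(b'TOY1', offset, len(values)))
        fp.write(b'\x00' * padding)  # stand-in for other data in the file
        fp.write(struct.pack('>%di' % len(values), *values))


def read_slice(path, start, stop):
    """Return values[start:stop], touching only the header and that slice."""
    with open(path, 'rb') as fp:
        mm = mmap.mmap(fp.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            magic, offset, count = HEADER.unpack_from(mm, 0)
            stop = min(stop, count)
            start = min(max(start, 0), stop)
            n = stop - start
            # Jump straight to the requested values using the stored offset.
            return list(struct.unpack_from('>%di' % n, mm, offset + 4 * start))
        finally:
            mm.close()


path = os.path.join(tempfile.mkdtemp(), 'toy.bin')
write_toy_file(path, list(range(100)))
print(read_slice(path, 10, 15))  # [10, 11, 12, 13, 14]
```

Because the file is memory-mapped, the operating system pages in only the regions that are actually read, which is the same reason the module can open large files cheaply.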


To create a NetCDF file:
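A minimal sketch of that workflow, assuming the module is importable as `pupynere` (with `scipy.io.netcdf_file`, which is derived from it and shares the API, as a fallback). The file name `simple.nc` and the attribute values are illustrative.

```python
try:
    from pupynere import netcdf_file
except ImportError:
    # Assumption: scipy's netcdf_file is used as a stand-in with the same API.
    from scipy.io import netcdf_file

f = netcdf_file('simple.nc', 'w')
f.history = 'Created for a test'         # global attribute
f.createDimension('time', 10)            # fixed-length dimension
time = f.createVariable('time', 'i', ('time',))
time[:] = range(10)                      # fill the variable's data in place
time.units = 'days since 2008-01-01'     # variable attribute
f.close()
```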

Note the assignment of range(10) to time[:]. Assigning to the slice sets the data inside the variable object; a plain assignment of range(10) to time would instead rebind the name and leave the variable in the file untouched.

To read the NetCDF file we just created:
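A minimal sketch of reading the file back, under the same assumptions as before: the module is importable as `pupynere` (or `scipy.io.netcdf_file` as a stand-in), and the file is named `simple.nc` with a `history` attribute and a `time` variable. The `describe` and `as_text` helpers are hypothetical names introduced for this example.

```python
import os

try:
    from pupynere import netcdf_file
except ImportError:
    # Assumption: scipy's netcdf_file is used as a stand-in with the same API.
    from scipy.io import netcdf_file


def as_text(value):
    """Attributes may come back as bytes; decode them for display."""
    return value.decode() if isinstance(value, bytes) else value


def describe(path):
    """Return (history, time units, time shape, last time value) for a file."""
    f = netcdf_file(path, 'r')
    history = as_text(f.history)         # global attribute
    time = f.variables['time']           # netcdf_variable object
    units = as_text(time.units)          # variable attribute
    shape = time.shape
    last = int(time[-1])                 # index the variable like an array
    del time                             # release the mapped data before closing
    f.close()
    return history, units, shape, last


if os.path.exists('simple.nc'):
    print(describe('simple.nc'))
```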