(part of the AQMEII-NA_N2O family of projects)


Uses bash to drive NCL and R code to

  1. process GFED-3.1 N2O emission inventories for the model year=2008:
    1. retrieve GFED inputs, gridded globally @ resolution=(0.5°x0.5°), units=gN2O/m^2/mo
      1. monthly emissions: extracted from here, in GFED native text format
      2. daily fractions: retrieved from here (may only be visible from FTP client) as compressed netCDF
      3. 3-hourly fractions: retrieved from here (may only be visible from FTP client) as compressed netCDF
    2. convert individual monthly *.txt files to a single annual netCDF file with
      • 12 timesteps
      • units=molN2O/mo, computed using gridcell areas. Mass rates are more regriddable than the GFED-native flux rates.
    3. convert individual daily-fraction netCDF files to a single annual netCDF with 366 timesteps
    4. convert individual 3-hourly-fraction netCDF files to a single annual netCDF with 12 timesteps (3-hourly-fractions are delivered by month)
  2. 2D-regrid the annualized netCDF "bricks" from global/unprojected to a projected subdomain (AQMEII-NA).
    1. visualize the global monthlies
    2. 2D-regrid monthlies, dailies, and 3-hourlies
    3. visualize the AQMEII-NA monthlies
  3. create hourly emissions over subdomain, suitable for injecting into a CMAQ run over AQMEII-NA.
    1. compute hourly emissions over subdomain, adapting these instructions to produce emissions with units=molN2O/s
    2. create CMAQ-style emissions files (e.g., this, when gunzipped) containing emissions for every hour in 2008 (the model year).
  4. check extent to which mass is conserved from global input to regional output.
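
The unit handling in steps 1.2 and 3.1 can be sketched for a single gridcell as follows. This is only an illustration in bash+awk, not the project's NCL code: the function names, the spherical-earth radius, and the sample flux and fraction values are all assumptions made for the example. It assumes the hourly rate is the monthly mass times the daily fraction times the 3-hourly fraction, spread evenly over the 3-hour window.

```shell
#!/usr/bin/env bash
# Sketch only: names and constants are illustrative assumptions,
# not taken from the project's NCL/R code.

EARTH_RADIUS_M=6371000   # assumed mean spherical radius (m)
N2O_G_PER_MOL=44.0128    # molar mass of N2O (g/mol)

# Area (m^2) of a lat-lon cell on a sphere:
#   A = R^2 * dlon[rad] * (sin(lat_north) - sin(lat_south))
# args: south-edge latitude (deg), cell height (deg), cell width (deg)
cell_area_m2() {
  awk -v s="$1" -v dlat="$2" -v dlon="$3" -v r="${EARTH_RADIUS_M}" 'BEGIN {
    d2r = atan2(0, -1) / 180   # degrees -> radians (atan2(0,-1) == pi)
    printf "%.6e\n", r * r * (dlon * d2r) * (sin((s + dlat) * d2r) - sin(s * d2r))
  }'
}

# Monthly flux (gN2O/m^2/mo) -> monthly mass (molN2O/mo) for one cell.
# args: flux, cell area (m^2)
flux_to_mol_per_month() {
  awk -v f="$1" -v a="$2" -v m="${N2O_G_PER_MOL}" 'BEGIN {
    printf "%.6e\n", f * a / m
  }'
}

# Mean rate (molN2O/s) over one 3-hour window:
#   monthly mass * daily fraction * 3-hourly fraction / (3 h * 3600 s/h)
# args: monthly mass (mol), daily fraction, 3-hourly fraction
mol_per_second() {
  awk -v mo="$1" -v fd="$2" -v f3="$3" 'BEGIN {
    printf "%.6e\n", mo * fd * f3 / (3 * 3600)
  }'
}

# Made-up sample values for one 0.5x0.5 degree cell with south edge at 40N.
area=$(cell_area_m2 40.0 0.5 0.5)
monthly=$(flux_to_mol_per_month 0.001 "${area}")
rate=$(mol_per_second "${monthly}" 0.05 0.25)
echo "area=${area} m^2, monthly=${monthly} mol/mo, rate=${rate} mol/s"
```

Converting to a mass per cell before regridding means the regridded values can simply be summed to check totals, which is what the conservation check in step 4 relies on.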

Currently does not provide a clean or general-purpose (much less packaged) solution, but merely shows how to do these tasks using

  • bash (tested with version=3.2.25)
  • NCL (tested with version=6.1.2)
  • R (tested with version=3.0.0) and packages including

Visualizations of datasets are provided

  • statistically (summary statistics of datasets are generated and presented in console)
  • by plots (saved as PDF, displayed by launching the user-configured PDF viewer)


To run this code,

  1. git clone this repo. (See commandlines on the project homepage, where you probably are now.)
  2. cd to its working directory (where you cloned it to).
  3. Set up your applications and paths.
    1. Download a copy of these bash utilities to the working directory.
    2. Open the file in an editor! You will probably need to edit its functions setup_paths and setup_apps to make it work on your platform. Notably you will want to point it to your PDF viewer and NCL and R executables.
    3. You may also want to open the driver (bash) script in an editor and take a look. It should run Out Of The Box, but you might need to tweak something there. In the worst case, you could hardcode your paths and apps in the driver.
    4. Once you've got it working, you may want to fork it. If so, you can automate running your changes with uber_driver.sh (changing that as needed, too).
  4. Run the driver:
       $ ./GFED_driver.sh
     This will download inputs, then run ...
    1. a set of NCL scripts to consolidate multiple GFED inputs into more tractable files (as described in steps 1.2-1.4 above)
    2. vis_regrid_vis.r to visualize the monthly global emissions, regrid them to AQMEII-NA, and visualize the output
    3. make_hourlies.ncl to generate hourly output (e.g., this) suitable for input to CMAQ. Note it also includes code to view output in VERDI for data visualization (and, more to the point, verification of one's IOAPI-correctness!)
    4. check_conservation.ncl to check conservation of mass from input to output. Given that the output spatial domain is significantly smaller than the input (global), it merely reports the fraction of mass (as mol) in output vs input. Current output includes

          GFED monthly N2O (mol) over globe from ./GFED-3.1_2008_N2O_monthly_emissions.nc:
          |obs|  =3.1104e+06
          min    =0
          q1     =0
          med    =0
          mean   =5.443e+03
          q3     =0
          max    =4.440e+07
          sum    =1.693e+10
          GFED monthly N2O (mol) over AQMEII-NA from ./GFED-3.1_2008_N2O_monthly_emissions_regrid.nc:
          |obs|  =1.64689e+06
          min    =0
          q1     =0
          med    =0
          mean   =2.337e+03
          q3     =0
          max    =1.797e+07
          sum    =3.848e+09
          AQMEII-NA monthly N2O/global monthly N2O==2.273e-01
          [aggregating hourly emissions]
          Is N2O conserved from input to output? units=mol N2O
          (note (US land area)/(earth land area) ~= 6.15e-02)
              input      input     output     output           
             global        NAs  AQMEII-NA        NAs     out/in
           1.69e+10          0   3.53e+09          0   2.09e-01
          AQMEII-NA N2O (mol) from output hourly emissions:
          |obs|  =1.20552e+09
          min    =0.000e+00
          q1     =0.000e+00
          med    =0.000e+00
          mean   =3.103e+00
          q3     =0.000e+00
          max    =1.079e+06
          sum    =3.530e+09

check_conservation.ncl can be a long-running process, even on relatively high-performance hardware. On terrae, its first phase (aggregation of hourly emissions) can take 3-5 hr wallclock; its second phase (summarizing the aggregate) has been found to run 15-90 min wallclock.
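
The ratios in the report can be recomputed quickly from the three sums alone, without rerunning the long aggregation. The following is a minimal bash+awk sketch, assuming the sums are copied by hand from the sample output above; the variable names and the ratio function are hypothetical and not part of check_conservation.ncl.

```shell
#!/usr/bin/env bash
# Sums copied from the sample output above (units=mol N2O).
GLOBAL_MONTHLY_SUM=1.693e+10     # GFED monthly, global
REGIONAL_MONTHLY_SUM=3.848e+09   # GFED monthly, regridded to AQMEII-NA
REGIONAL_HOURLY_SUM=3.530e+09    # aggregated hourly output over AQMEII-NA

# Print numerator/denominator in the report's exponent format.
# Prints NaN and returns nonzero if the denominator is zero.
ratio() {
  awk -v n="$1" -v d="$2" 'BEGIN {
    if (d == 0) { print "NaN"; exit 1 }
    printf "%.3e\n", n / d
  }'
}

echo "regridded monthly / global monthly = $(ratio "${REGIONAL_MONTHLY_SUM}" "${GLOBAL_MONTHLY_SUM}")"
echo "hourly output / global monthly     = $(ratio "${REGIONAL_HOURLY_SUM}" "${GLOBAL_MONTHLY_SUM}")"
echo "hourly output / regridded monthly  = $(ratio "${REGIONAL_HOURLY_SUM}" "${REGIONAL_MONTHLY_SUM}")"
```

The last ratio is the one that matters for conservation within the subdomain: anything below 1 is mass lost between the regridded monthlies and the hourly output.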


  1. Retest with newest regrid_utils! Currently, repo_diff.sh shows the following local diffs:
    • get_filepath_from_template.ncl
    • IOAPI.ncl
    • string.ncl
  2. Move all these TODOs to issue tracker.
  3. (major) Investigate 8% loss of mass from monthly regrids to hourly regrids.
  4. check_raw_fractions.ncl: Check the claim in ftp://gfed3:dailyandhourly@zea.ess.uci.edu/GFEDv3.1/Readme.pdf that "Over a day at each grid cell location, the sum of the eight 3-hourly fire fractions should be equal to 1.0."
  5. check_coord_vars.ncl: Should also check monotonicity of coordvars? or is that guaranteed by netCDF?
  6. Handle {name, path to} VERDI in bash_utilities.sh.
  7. *.sh: use bash booleans à la N2O_integration_driver.sh.
  8. regrid_utils: Make commandline netCDF.stats.to.stdout.r take files beginning with numerals, e.g.
    • (-) Rscript ./netCDF.stats.to.stdout.r netcdf.fp=5yravg_20111219_pure_emis.nc data.var.name=N2O # fails
    • (+) Rscript ./netCDF.stats.to.stdout.r netcdf.fp=yravg_20111219_pure_emis.nc data.var.name=N2O # works
  9. Create common project for regrid_resources à la regrid_utils, so I don't hafta hunt down which resource is in which project.
  10. All regrids: how to nudge off/onshore as required? e.g., soil or burning emissions should never be offshore, marine emissions should never be onshore.
  11. All regrid maps: add Caribbean islands (esp Bahamas! for offshore burning), Canadian provinces, Mexican states.
  12. NCL: complain to ncl-talk about "unsupported extensions," e.g., .ncf and <null/> (e.g., MCIP output).
  13. R: determine why <- assignment is occasionally required in calls to visualize.*(...).
  14. Fully document platform versions (e.g., linux, compilers, bash, NCL, R).
  15. Test on
    • tlrPanP5 (which now has R package=ncdf4, but readAsciiTable of input .txt's is very slow compared to terrae)
    • HPCC (once problem with ncdf4 on amad1 is debugged: in process with JOB and KMF)
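
The sum-to-1.0 check in item 4 could be prototyped outside NCL along these lines. This is a rough bash+awk sketch under a loud assumption: the 3-hourly fractions have already been dumped from netCDF to text, one line per gridcell with eight whitespace-separated values. That text layout, the function name check_fraction_sums, and the default tolerance are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical input on stdin: one line per gridcell, eight whitespace-separated
# 3-hourly fractions. (The real GFED fractions are netCDF; this assumes they
# have been dumped to text first.)

# Report every line whose eight fractions do not sum to 1.0 within a tolerance.
# Returns nonzero if any line fails.
# arg: tolerance (optional, default 1e-6)
check_fraction_sums() {
  awk -v tol="${1:-1e-6}" '
    NF != 8 {
      printf "line %d: expected 8 fields, got %d\n", NR, NF
      bad = 1
      next
    }
    {
      sum = 0
      for (i = 1; i <= NF; i++) sum += $i
      diff = sum - 1.0
      if (diff < 0) diff = -diff
      if (diff > tol) {
        printf "line %d: sum=%.6f\n", NR, sum
        bad = 1
      }
    }
    END { exit (bad ? 1 : 0) }
  '
}

# Made-up sample: the first two cells pass, the third does not.
printf '%s\n' \
  '0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125' \
  '0.0 0.0 0.1 0.3 0.4 0.2 0.0 0.0' \
  '0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2' \
| check_fraction_sums 1e-6 || echo "some cells fail the sum-to-1 check"
```

Note that this sketch would also flag any cell whose fractions are all zero. Whether such cells should count as failures is part of what the TODO needs to settle.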