Commits

Chris Mutel committed 6cecb1e

Change _filename to filename, and update basically all docs

  • Parent commits 45f594f

Files changed (17)

 0.11 ()
 =======
 
+**bw2-uptodate.py is required for this update**
+
 Upgrades to updates
 -------------------
 
 Smaller changes
 ---------------
 
-- BREAKING CHANGE: The abbreviate mess
+- BREAKING CHANGE: The filenames for LCIA methods are now derived from the MD5 of the name. This breaks all method abbreviations.
+- BREAKING CHANGE: The filename and filepath attributes in SerializedDict and subclasses moved from ``_filename`` and ``_filepath`` to ``filename`` and ``filepath``.
 - BREAKING CHANGE: ``register`` for all data stores now takes any keyword arguments. There are no required or positional arguments.
 - BREAKING CHANGE: Database.process() doesn't raise an AssertionError for empty databases
 - FEATURE: Database.process() writes a geomapping processed array (linking activity IDs to locations), in addition to normal matrix arrays.
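
The MD5-derived filenames mentioned in the changelog can be illustrated with a minimal sketch. The helper name ``method_filename``, the example method name, and the choice to join the name tuple with dashes are assumptions made for illustration; the actual hashing code in bw2data is not shown in this commit view.

```python
import hashlib


def method_filename(name):
    """Hypothetical helper: derive a stable filename stem from a method name tuple."""
    # Assumption: the tuple elements are joined with a dash before hashing.
    joined = "-".join(name)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Hypothetical method name, used only as an example.
name = ("example method", "example category")
stem = method_filename(name)
print(stem)  # a string of 32 hexadecimal characters
```

Because the hash depends only on the name, the same method always maps to the same file, which is why switching the hashing function invalidates previously stored abbreviations.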

bw2data/_config.py

                 "preferences.json"), "w") as f:
             json.dump(self.p, f, indent=2)
 
+    @property
+    def biosphere(self):
+        if not hasattr(self, "p"):
+            self.load_preferences()
+        return self.p.get("biosphere_database", u"biosphere")
+
+    @property
+    def global_location(self):
+        if not hasattr(self, "p"):
+            self.load_preferences()
+        return self.p.get("global_location", u"GLO")
+
     def get_home_directory(self, path=None):
         """Get data directory, trying in order:
 
         """Return `dir` in Unicode"""
         return self.dir.decode('utf-8')
 
-    @property
-    def biosphere(self):
-        if not hasattr(self, "p"):
-            self.load_preferences()
-        return self.p.get("biosphere_database", u"biosphere")
-
-    @property
-    def global_location(self):
-        if not hasattr(self, "p"):
-            self.load_preferences()
-        return self.p.get("global_location", u"GLO")
-
-
 config = Config()
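
The two properties moved in this hunk load preferences lazily and fall back to a default value. A minimal sketch of that pattern follows; the ``Config`` class here is a stand-in written for illustration (its ``load_preferences`` just creates an empty dictionary), not the bw2data class.

```python
class Config:
    """Stand-in for illustration only; not the bw2data Config class."""

    def load_preferences(self):
        # Assumption: the real method reads preferences.json from disk.
        # Here an empty dictionary is enough to show the fallback.
        self.p = {}

    @property
    def biosphere(self):
        # Load preferences on first access, then fall back to a default.
        if not hasattr(self, "p"):
            self.load_preferences()
        return self.p.get("biosphere_database", "biosphere")


config = Config()
default_name = config.biosphere            # no preference set yet
config.p["biosphere_database"] = "my biosphere"
custom_name = config.biosphere             # preference overrides the default
```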

bw2data/data_store.py

 
 
 class DataStore(object):
+    """Base class for all Brightway2 data stores. Subclasses should define:
+
+        * **metadata**: A :ref:`serialized-dict` instance, e.g. ``databases`` or ``methods``. The custom is that each type of data store has a new metadata store, so the data store ``Foo`` would have a metadata store ``foos``.
+        * **dtype_fields**: A list of fields to construct a NumPy structured array, e.g. ``[('foo', np.int), ('bar', np.float)]``.
+        * **validator**: A data validator. Optional. See bw2data.validate.
+
+    """
     validator = None
     metadata = None
     dtype_fields = None
 
     @property
     def filename(self):
+        """Can be overwritten in cases where the filename is not the name"""
         return self.name
 
     def register(self, **kwargs):
         del self.metadata[self.name]
 
     def assert_registered(self):
+        """Raise ``UnknownObject`` if not yet registered"""
         if self.name not in self.metadata:
             raise UnknownObject(u"%s is not yet registered" % self)
 
 
     @property
     def dtype(self):
+        """Get custom dtype fields plus generic uncertainty fields"""
         return self.dtype_fields + self.base_uncertainty_fields
 
     def copy(self, name):
-        """Make a copy of this object. Takes new name as argument."""
+        """Make a copy of this object. Takes new name as argument. Returns the new object."""
         assert name not in self.metadata, u"%s already exists" % name
         new_obj = self.__class__(name)
         new_obj.register(**self.metadata[self.name])
     File data is saved in ``mapping.pickle``.
 
     This dictionary does not support setting items directly; instead, use the ``add`` method to add multiple keys."""
-    _filename = "mapping.pickle"
+    filename = "mapping.pickle"
 
     def add(self, keys):
         """Add a set of keys. These keys can already be in the mapping; only new keys will be added.
     File data is stored in ``geomapping.pickle``.
 
     This dictionary does not support setting items directly; instead, use the ``add`` method to add multiple keys."""
-    _filename = "geomapping.pickle"
+    filename = "geomapping.pickle"
 
     def __init__(self, *args, **kwargs):
         super(GeoMapping, self).__init__(*args, **kwargs)
 
 class Databases(SerializedDict):
     """A dictionary for database metadata. This class includes methods to manage database versions. File data is saved in ``databases.json``."""
-    _filename = "databases.json"
+    filename = "databases.json"
 
     def increment_version(self, database, number=None):
         """Increment the ``database`` version. Returns the new version."""
 
 class Methods(CompoundJSONDict):
     """A dictionary for method metadata. File data is saved in ``methods.json``."""
-    _filename = "methods.json"
+    filename = "methods.json"
 
     def __unicode__(self):
         return u"Brightway2 methods metadata with %i objects" % len(
 
 class WeightingMeta(Methods):
     """A dictionary for weighting metadata. File data is saved in ``weightings.json``."""
-    _filename = "weightings.json"
+    filename = "weightings.json"
 
 
 class NormalizationMeta(Methods):
     """A dictionary for normalization metadata. File data is saved in ``normalizations.json``."""
-    _filename = "normalizations.json"
+    filename = "normalizations.json"
 
 
 mapping = Mapping()

bw2data/serialization.py

 
     Upon instantiation, the serialized dictionary is read from disk."""
     def __init__(self):
-        if not getattr(self, "_filename"):
+        if not getattr(self, "filename", None):
             raise NotImplementedError("SerializedDict must be subclassed, and the filename must be set.")
-        self._filepath = os.path.join(config.dir, self._filename)
+        self.filepath = os.path.join(config.dir, self.filename)
         self.load()
 
     def load(self):
             * *filepath* (str, optional): Provide an alternate filepath (e.g. for backup).
 
         """
-        JsonWrapper.dump(self.pack(self.data), filepath or self._filepath)
+        JsonWrapper.dump(self.pack(self.data), filepath or self.filepath)
 
     def deserialize(self):
         """Load the serialized data. Can be replaced with other serialization formats."""
-        return self.unpack(JsonWrapper.load(self._filepath))
+        return self.unpack(JsonWrapper.load(self.filepath))
 
     def pack(self, data):
         """Transform the data, if necessary. Needed because JSON must have strings as dictionary keys."""
     def backup(self):
         """Write a backup version of the data to the ``backups`` directory."""
         filepath = os.path.join(config.dir, "backups",
-            self._filename + ".%s.backup" % int(time()))
+            self.filename + ".%s.backup" % int(time()))
         self.serialize(filepath)
 
 
 class PickledDict(SerializedDict):
     """Subclass of ``SerializedDict`` that uses the pickle format instead of JSON."""
     def serialize(self):
-        with open(self._filepath, "wb") as f:
+        with open(self.filepath, "wb") as f:
             pickle.dump(self.pack(self.data), f,
                 protocol=pickle.HIGHEST_PROTOCOL)
 
     def deserialize(self):
-        return self.unpack(pickle.load(open(self._filepath, "rb")))
+        return self.unpack(pickle.load(open(self.filepath, "rb")))
 
 
 class CompoundJSONDict(SerializedDict):
-    """Subclass of ``SerializedDict`` that allows tuples as dictionary keys."""
+    """Subclass of ``SerializedDict`` that allows tuples as dictionary keys (not allowed in JSON)."""
     def pack(self, data):
         """Transform the dictionary to a list because JSON can't handle tuples as keys"""
         return [(k, v) for k, v in data.iteritems()]
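
The round trip that ``pack`` performs for tuple keys can be sketched as below. This is a standalone illustration in Python 3 with free functions; the ``unpack`` shown here is an assumption about the inverse step, since its body does not appear in this hunk.

```python
import json


def pack(data):
    # JSON objects only allow string keys, so store (key, value) pairs in a list.
    return [[list(key), value] for key, value in data.items()]


def unpack(pairs):
    # Assumed inverse of pack: JSON returns lists, so turn each key back into a tuple.
    return {tuple(key): value for key, value in pairs}


data = {("biosphere", "an emission"): {"unit": "kg"}}
text = json.dumps(pack(data))
restored = unpack(json.loads(text))
```

After the round trip, ``restored`` compares equal to ``data``, including the tuple key.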

bw2data/tests/data_store.py

 
 
 class Metadata(SerializedDict):
-    _filename = "mock-meta.json"
+    filename = "mock-meta.json"
 
 metadata = Metadata()
 

bw2data/tests/ia.py

 
 
 class Metadata(CompoundJSONDict):
-    _filename = "mock-meta.json"
+    filename = "mock-meta.json"
 
 metadata = Metadata()
 

bw2data/updates.py

 
     @staticmethod
     def reprocess_all_methods():
+        """Change name hashing function from random characters (!?) to MD5 hash. Need to update abbreviations and rewrite all data."""
         print "Updating all LCIA methods"
 
         widgets = [
+Data stores
+***********
+
+.. _datastore:
+
+DataStore
+=========
+
+.. autoclass:: bw2data.DataStore
+    :members:
+    :inherited-members:
+
+.. _ia-datastore:
+
+ImpactAssessmentDataStore
+=========================
+
+.. autoclass:: bw2data.ia_data_store.ImpactAssessmentDataStore
+    :members:
+    :inherited-members:
+

docs/database.rst

-Database
-********
-
-.. _database:
-
-.. autoclass:: bw2data.Database
-    :members:
+Impact Assessment data stores
+*****************************
+
+.. _method:
+
+Method
+======
+
+.. autoclass:: bw2data.Method
+    :members:
+    :inherited-members:
+
+.. _normalization:
+
+Normalization
+=============
+
+.. autoclass:: bw2data.Normalization
+    :members:
+    :inherited-members:
+
+.. _weighting:
+
+Weighting
+=========
+
+.. autoclass:: bw2data.Weighting
+    :members:
+    :inherited-members:
-.. bw2data documentation master file, created by
-   sphinx-quickstart on Thu Nov 29 22:50:48 2012.
-   You can adapt this file completely to your liking, but it should at least
-   contain the root `toctree` directive.
-
 Brightway2-data
 ===============
 
-This is the technical documentation for Brightway2-data, part of the `Brightway2 <http://brightwaylca.org>`_ life cycle assessment calculation framework. The following online resources are available:
+This is the documentation for Brightway2-data, part of the `Brightway2 <http://brightwaylca.org>`_ life cycle assessment framework.
+
+Surprisingly enough, Brightway2-data (abbreviated to bw2data in code) is the package that manages the different types of data in Brightway2. In general, Brightway2-data can save, load, process, validate, import, and export different kinds of data. It also includes code to set up the data directory, query datasets, and normalize units.
+
+This page of the documentation covers the basic concepts in Brightway2-data. Querying, and the import and export of data in different formats, are documented in separate sections.
+
+.. toctree::
+   :maxdepth: 1
+
+   querying
+   io
+
+Other resources
+---------------
+
+The following online resources are available:
 
 * `Source code <https://bitbucket.org/cmutel/brightway2-data>`_
 * `Documentation on Read the Docs <http://bw2data.readthedocs.org>`_
-* `Test coverage <http://coverage.brightwaylca.org/data/index.html>`_
+* `Test coverage report <http://coverage.brightwaylca.org/data/index.html>`_
+
+Configuration
+=============
+
+The first thing Brightway2 needs is to know where it can save data and log files. This directory location, in addition to a number of other configuration variables, is managed by the :ref:`configuration` object.
+
+The ``config`` object stores the Brightway2 directory, and can also change it and create new directories. It also records whether it is running on Windows or inside an IPython shell.
+
+The ``config`` object also stores user preferences. User preferences include things like the default number of Monte Carlo iterations to run, but it is just a dictionary, and can be added to as desired.
+
+.. warning:: Preferences are not saved automatically - you must call ``config.save_preferences()``.
+
+Data and metadata
+=================
+
+.. note:: For more detailed information, see tutorial XX: defining a new matrix.
+
+The building blocks in Brightway2 data are the **data store** and the **metadata store**. The difference between the two can be easily explained in the example of LCI databases:
+
+    * The data store object, :ref:`database`, has the actual activity data for each database.
+    * The metadata store, :ref:`databases`, has information about the database, like the format it is in, its version number, and what other databases it links to.
+
+Both the data and metadata objects *store* data, and provide easy ways to save and load data.
+
+Metadata stores
+---------------
+
+The base class for metadata is :ref:`serialized-dict`, which is basically a normal dictionary that can be easily saved or loaded (i.e. serialized) to or from a `JSON <http://en.wikipedia.org/wiki/JSON>`_ file. These files can be easily edited in a normal text editor.
+
+Brightway2-data defines the following metadata stores:
+
+    * :ref:`databases`: LCI databases
+    * :ref:`methods`: LCIA methods (characterization factors)
+    * :ref:`normalizations`: LCIA normalization factors
+    * :ref:`weightings`: LCIA weighting factors
+
+There are no required fields of metadata for any metadata stores, though some fields may be added automatically by subclasses.
+
+Metadata stores are just dictionaries that can be easily serialized - they are not associated with a specific data store, and it is possible to use metadata stores without a data store, or with multiple data stores.
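
As a rough sketch of what "a dictionary that can be easily serialized" means in practice, the stand-in class below loads itself from a JSON file on instantiation and writes itself back on request. ``JSONDict``, ``Peppers``, and ``flush`` are hypothetical names for this illustration and are not the ``SerializedDict`` API; a temporary directory takes the place of the Brightway2 data directory.

```python
import json
import os
import tempfile


class JSONDict:
    """Illustrative stand-in; not bw2data.serialization.SerializedDict."""
    filename = None

    def __init__(self, dirpath):
        if not self.filename:
            raise NotImplementedError("Subclass and set filename")
        self.filepath = os.path.join(dirpath, self.filename)
        self.data = {}
        # Read existing data from disk, if any, on instantiation.
        if os.path.exists(self.filepath):
            with open(self.filepath) as f:
                self.data = json.load(f)

    def flush(self):
        # Write the dictionary as human-editable JSON.
        with open(self.filepath, "w") as f:
            json.dump(self.data, f, indent=2)


class Peppers(JSONDict):
    filename = "sweet-peppers.json"


dirpath = tempfile.mkdtemp()        # stands in for the data directory
peppers = Peppers(dirpath)
peppers.data["bell"] = {"colour": "red"}
peppers.flush()
reloaded = Peppers(dirpath)         # a fresh instance sees the saved data
```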
+
+Metadata should be singletons
+-----------------------------
+
+Metadata stores follow the `singleton pattern <http://en.wikipedia.org/wiki/Singleton_pattern>`_, though this is not enforced. Each metadata dictionary should only exist once, to avoid having multiple conflicting versions. The normal pattern is to instantiate each class in the same file as the class definition:
+
+.. code-block:: python
+
+    class MyObjects(bw2data.serialization.SerializedDict):
+        filename = "sweet-peppers.json"
+
+    myobjects = MyObjects()
+
+Data stores
+-----------
+
+.. note:: See also tutorial XX: manipulating databases and tutorial XX: defining a new matrix.
+
+The base class for data stores is :ref:`datastore`. Each data store subclass defines a schema for its data. The normal methods provided by a data store are:
+
+    * **write(data)**: Write data to disk
+    * **load**: Load data from disk
+    * **register**: Register object with metadata store
+    * **deregister**: Remove object from metadata store
+    * **copy(name)**: Create a new object with name ``name``
+    * **backup**: Write backup of data
+    * **validate(data)**: Validate data using this object's validator
+
+Data store objects are instantiated with the object name, e.g. ``DataStore("name goes here")``.
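
The relationship between a data store and its metadata store can be sketched with a toy class. ``ToyStore`` and the plain ``metadata`` dictionary are stand-ins for illustration, not the bw2data ``DataStore``; only ``register``, ``deregister``, and ``copy`` are modelled, and the ``format`` keyword is an arbitrary example.

```python
metadata = {}  # stand-in for a metadata store such as ``databases``


class ToyStore:
    """Illustrative stand-in; not bw2data.DataStore."""

    def __init__(self, name):
        self.name = name

    def register(self, **kwargs):
        # Any keyword arguments become this object's metadata.
        metadata[self.name] = kwargs

    def deregister(self):
        del metadata[self.name]

    def copy(self, name):
        # The new object is registered with the same metadata as the original.
        assert name not in metadata, "%s already exists" % name
        new_obj = self.__class__(name)
        new_obj.register(**metadata[self.name])
        return new_obj


store = ToyStore("name goes here")
store.register(format="example")
duplicate = store.copy("a copy")
store.deregister()
```

After these calls only ``"a copy"`` remains in ``metadata``, carrying the metadata it inherited from the original object.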
+
+Brightway2-data defines the following data stores:
+
+    * :ref:`database`
+    * :ref:`method`
+    * :ref:`weighting`
+    * :ref:`normalization`
+
+Document and processed data
+===========================
+
+The basic form of Brightway2 data is *semi-structured* - there are some requirements, and some conventions, but a lot of flexibility. This type of database is often called a *document database*. However, to construct matrices efficiently from these data documents, a *processing* step is required.
+
+Processing data
+---------------
+
+*Processing data* converts document data to a binary form tailored for creating matrices (a NumPy array). All extraneous information is removed, and only the numeric values needed are retained. Put another way, *processing* transforms semi-structured data documents into a highly structured binary form for calculations.
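
The idea of processing can be sketched without Brightway2 at all. The example below uses the standard library's ``struct`` module as a stand-in for a NumPy structured array, so that it has no dependencies; the document keys (``input``, ``output``, ``amount``) and the row layout are assumptions chosen for illustration, not the bw2data schema.

```python
import struct

# Semi-structured documents: required numeric values plus arbitrary extras.
documents = [
    {"input": 1, "output": 2, "amount": 0.5, "comment": "free text"},
    {"input": 2, "output": 2, "amount": 1.0, "unit": "kg"},
]

# Fixed row layout: two unsigned 32-bit integers and one 64-bit float.
ROW = struct.Struct("<IId")


def process(docs):
    # Keep only the numeric fields needed to build a matrix; drop the rest.
    return b"".join(ROW.pack(d["input"], d["output"], d["amount"]) for d in docs)


blob = process(documents)
rows = [ROW.unpack_from(blob, i * ROW.size) for i in range(len(documents))]
```

Each document becomes one fixed-width row, and fields such as ``comment`` or ``unit`` do not survive processing.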
+
+Uncertainty distributions
+-------------------------
+
+Uncertainty distributions are modeled using *parameter arrays* from `stats_arrays <https://bitbucket.org/cmutel/stats_arrays>`_, which has its own `extensive documentation <http://stats-arrays.readthedocs.org/en/latest/>`_.
+
+The idea of parameter arrays is to have a common format for defining different uncertainty distributions. Parameter arrays are stored as NumPy `structured or record arrays <http://docs.scipy.org/doc/numpy/reference/generated/numpy.recarray.html#numpy.recarray>`_. The fields that define an uncertainty distribution are:
+
+    * uncertainty type
+    * loc (short for location)
+    * scale
+    * shape
+    * minimum
+    * maximum
+    * negative
+
+In document data, these fields are stored in an *uncertainty dictionary*, e.g.:
+
+.. code-block:: python
+
+    {
+        'uncertainty type': NormalUncertainty.id,
+        'loc': 0.5,
+        'scale': 0.2,
+        'minimum': 0  # Acts as bounds; prevent negative values
+    }
+
+Default values will be provided if not directly specified.
+
+.. note:: If there is no uncertainty, then a simple number can also be provided. It will be converted automatically to an uncertainty dictionary.
+
+During processing, the uncertainty dictionaries are converted to rows in a NumPy array.
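
The conversion from a plain number or a partial uncertainty dictionary to a full row can be sketched as follows. The helper names, the ``None`` defaults, the string markers for the uncertainty type, and the choice to store a plain number under ``loc`` are all assumptions for illustration; the real defaults and type ids come from stats_arrays and are not shown here.

```python
FIELDS = ("uncertainty type", "loc", "scale", "shape",
          "minimum", "maximum", "negative")

NO_UNCERTAINTY = "no uncertainty"  # hypothetical marker, not a stats_arrays id


def as_uncertainty_dict(value):
    """Hypothetical helper: fill in every field, using None as the default."""
    filled = dict.fromkeys(FIELDS)
    if isinstance(value, dict):
        filled.update(value)
    else:
        # Assumption: a plain number is stored as ``loc`` with no uncertainty.
        filled.update({"uncertainty type": NO_UNCERTAINTY, "loc": float(value)})
    return filled


def to_row(value):
    # One row per value, with the fields in a fixed order.
    filled = as_uncertainty_dict(value)
    return tuple(filled[field] for field in FIELDS)


plain_row = to_row(0.5)
dict_row = to_row({"uncertainty type": "normal", "loc": 0.5,
                   "scale": 0.2, "minimum": 0})
```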
+
+Mappings
+--------
+
+Sometimes, important data can't be stored as a numeric value. For example, the location of an inventory activity is important for regionalization, but is given by a text string, not an integer. In this case, we use :ref:`serialized-dict` to store mappings between objects and integer indices. Brightway2-data uses two such mappings:
+
+    * :ref:`mapping`: Maps inventory objects (activities, biosphere flows, and anything else that would appear in a supply chain graph) to indices
+    * :ref:`geomapping`: Maps locations (both inventory and regionalized impact assessment) to indices
+
+Mappings are also singletons. Items are added using ``.add(keys)``, and removed using ``.delete(keys)``.
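
The behaviour of ``.add(keys)`` and ``.delete(keys)`` can be sketched with a toy class. ``ToyMapping`` is a stand-in for illustration, not the bw2data ``Mapping``; starting the indices at 1 and continuing from the largest existing index are assumptions made here.

```python
class ToyMapping:
    """Illustrative stand-in; not bw2data.meta.Mapping."""

    def __init__(self):
        self.data = {}

    def add(self, keys):
        # Only new keys get an index; existing keys keep theirs.
        index = max(self.data.values(), default=0)
        for key in keys:
            if key not in self.data:
                index += 1
                self.data[key] = index

    def delete(self, keys):
        for key in keys:
            del self.data[key]


mapping = ToyMapping()
mapping.add([("biosphere", "an emission"), ("db", "activity")])
mapping.add([("db", "activity"), ("db", "another activity")])
first_index = mapping.data[("db", "activity")]
mapping.delete([("db", "activity")])
```

Adding a key that is already present leaves its index unchanged, so indices stored in processed arrays stay valid across repeated calls to ``add``.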
+
+Development
+===========
+
+.. note:: See also the Brightway2 `documentation on contributing <http://brightway2.readthedocs.org/en/latest/contributing.html>`_.
 
 Running tests
 -------------

docs/inventory.rst

+Inventory data stores
+*********************
+
+.. _database:
+
+Database
+========
+
+.. autoclass:: bw2data.Database
+    :members:
 
 .. note:: only **imports** are supported.
 
-.. autoclass:: bw2data.io.EcospoldImporter
+.. autoclass:: bw2data.io.Ecospold1Importer
     :members:
 
 .. autoclass:: bw2data.io.EcospoldImpactAssessmentImporter

docs/metadata.rst

 Metadata
 ********
 
+Base classes for metadata
+=========================
+
+.. _serialized-dict:
+
+Serialized Dictionary
+---------------------
+
+.. autoclass:: bw2data.serialization.SerializedDict
+    :members:
+
+.. _compound-json:
+
+Compound JSON dictionary
+------------------------
+
+JSON hash tables don't support keys like ``("biosphere", "an emission")``, so the ``pack`` and ``unpack`` methods are used to transform data from Python to JSON and back.
+
+.. autoclass:: bw2data.serialization.CompoundJSONDict
+    :members:
+    :inherited-members:
+
+.. _pickled-dict:
+
+Pickled Dictionary
+------------------
+
+.. autoclass:: bw2data.serialization.PickledDict
+    :members:
+    :inherited-members:
+
+Metadata stores
+===============
+
 .. _databases:
 
+databases
+---------
+
 .. autoclass:: bw2data.meta.Databases
     :members:
     :inherited-members:
 
 .. _methods:
 
+methods
+-------
+
 .. autoclass:: bw2data.meta.Methods
     :members:
     :inherited-members:
 
+.. _normalizations:
+
+normalizations
+--------------
+
+.. autoclass:: bw2data.meta.NormalizationMeta
+    :members:
+    :inherited-members:
+
+.. _weightings:
+
+weightings
+----------
+
+.. autoclass:: bw2data.meta.WeightingMeta
+    :members:
+    :inherited-members:
+
+Mappings
+========
+
 .. _mapping:
 
+mapping
+-------
+
 .. autoclass:: bw2data.meta.Mapping
     :members:
     :inherited-members:
 
 .. _geomapping:
 
+geomapping
+----------
+
 .. autoclass:: bw2data.meta.GeoMapping
     :members:
     :inherited-members:

docs/method.rst

-Impact Assessment
-*****************
-
-.. _method:
-
-Method
-======
-
-.. autoclass:: bw2data.Method
-    :members:
-    :inherited-members:
-
-Base IA data store
-==================
-
-.. autoclass:: bw2data.ia_data_store.ImpactAssessmentDataStore
-    :members:

docs/technical.rst

-Technical guide
-***************
-
-Modular structure
-=================
-
-Brightway2 is a framework for life cycle assessment, and consists of several packages. You only need to install or understand the components that are of interest to you. Splitting components allow for a clean separation of concerns, as each package has a limited focus, and makes testing and documenting each package easier and cleaner.
-
-This guide has technical details for the ``bw2data`` package. Each separate package also has its own documentation.
-
-The current components of Brightway2 are:
-
-* bw2data: This package provides data handling, querying, and import/export functionality.
-* bw2calc: The LCA calculators. Normal LCA, several varieties of Monte Carlo LCA (including parallel Monte Carlo using all the cores on your computer), Latin Hypercubic sampling, and graph traversal.
-* bw2analyzer: Functions for analyzing the results of LCA calculations, including contribution and sensitivity analysis.
-* bw2ui: Two different user interfaces for Brightway2. Brightway2 is pure Python, and can be used by other programs or in an ipython notebook. For people who aren't as comfortable programming in Python, this packages provides a command line interface for work in the terminal, and a web interface.
-
-Reference
-=========
+Technical documentation
+***********************
 
 .. toctree::
    :maxdepth: 2
 
    configuration
    metadata
-   database
-   method
+   data
+   inventory
+   ia
    io
    utils