BRAINFormat

The LBNL BRAINFormat library specifies a general framework for data format standardization and implements a novel file format for the management and storage of neuroscience data. The library provides a number of core modules that can be used to specify and implement scientific application formats in general. Based on these components, the library implements the LBNL BRAIN file format. Important advantages and features of the format and library include:

  • Easy-to-use: User-friendly design and object-oriented Python file API
  • Formal Specification: Every component of the format has a formal specification (JSON) that is included both in the library and in the files themselves
  • Verifiable: The library supports verification of format compliance of complete files and components of files
  • Modular: Managed objects allow semantic components of the format to be specified as self-contained units
  • Extensible: New components can be easily added to the format while different components can be designed independently
  • Reusable: Existing components can be nested and extended through inheritance, and the library provides a number of base building blocks.
  • Data Annotation: Reusable modules for annotating data subsets are available which support searching, filtering, and merging of annotations and organization of annotations into collections.
  • Data Relationships: Supports modeling of complex semantic and structural relationships between data objects (groups, datasets, files) via the novel concept of relationship attributes.
  • Supports self-contained as well as modular file storage: All data can be stored in a single HDF5 file or individual managed object containers can be stored in separate files that can be accessed via external links directly from the main HDF5 file.
  • Application-independent design concepts & application-oriented modules: The library provides a number of core modules that can be used to define arbitrary application file formats based on the concept of managed objects. Building on this concept, the library then defines the application-oriented BRAIN file format.
  • Portable, Scalable, and Self-describing: Built on top of HDF5 using best practices.
  • Detailed developer and user documentation
  • Open Source
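
To make the ideas of formal specification and verification more concrete, here is a minimal, self-contained sketch in Python. It does not use the library's API. The specification layout (the attributes, datasets, and groups keys) and the check_compliance helper are hypothetical and were chosen only for illustration, and a nested dict stands in for the contents of an HDF5 group.

```python
import json

# Hypothetical JSON specification for one managed object (illustration only;
# this is NOT the actual BRAINFormat specification layout).
SPEC_JSON = """
{
  "attributes": ["sampling_rate"],
  "datasets": ["raw_data"],
  "groups": {
    "annotations": {"attributes": [], "datasets": ["labels"], "groups": {}}
  }
}
"""


def check_compliance(node, spec, path="/"):
    """Return a list of problems found; an empty list means compliant."""
    problems = []
    for name in spec.get("attributes", []):
        if name not in node.get("attributes", {}):
            problems.append(f"{path}: missing attribute '{name}'")
    for name in spec.get("datasets", []):
        if name not in node.get("datasets", {}):
            problems.append(f"{path}: missing dataset '{name}'")
    for name, sub_spec in spec.get("groups", {}).items():
        child = node.get("groups", {}).get(name)
        if child is None:
            problems.append(f"{path}: missing group '{name}'")
        else:
            # Nested components are verified against their own sub-specification.
            problems.extend(check_compliance(child, sub_spec, f"{path}{name}/"))
    return problems


spec = json.loads(SPEC_JSON)

# Nested dicts standing in for the contents of an HDF5 group.
good = {
    "attributes": {"sampling_rate": 10},
    "datasets": {"raw_data": [[0.0] * 5] * 2},
    "groups": {
        "annotations": {"attributes": {}, "datasets": {"labels": []}, "groups": {}}
    },
}
bad = {"attributes": {}, "datasets": {"raw_data": []}, "groups": {}}

print(check_compliance(good, spec))
print(check_compliance(bad, spec))
```

Because each nested group is checked against its own sub-specification, a component can be verified on its own or as part of a complete file, which mirrors the modular design described above.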

brainformat_data_overview.png

Release Notes

This is an early-stage release, and the library is under active development. The library already provides a broad range of advanced features; however, the format may still change as we adapt it to the growing needs of the neuroscience community.

Documentation

  • User documentation

    • To get started, we recommend the detailed user tutorial provided here.
    • An introduction to the more advanced topic of relationship attributes for modeling of semantic and structural relationships between data objects is available here.
    • The tutorials are also available as interactive IPython notebooks as part of the brain/examples module. You can run and explore the tutorials locally using IPython's notebook feature.
  • Developer Documentation

    • Pre-built versions of the documentation are provided on a regular basis as downloads on the Downloads page: here. (PDF) (HTML zipped)
    • The sources of the developer documentation are available in the docs/ folder and are written using Sphinx. Assuming that Sphinx is installed, you can build the most current version of the documentation in a variety of formats using make. For example, to build the documentation in HTML form, execute make html in the docs/ folder. Other variants can be built in the same fashion, e.g., make latexpdf builds the PDF version. The documentation is built in the docs/built folder.

Local Installation

  • Download:
    • You can download distributions, which include just the core sources, via the Bitbucket download page. Distribution files are named BrainFormat-*.zip. Simply unzip the file and follow the installation instructions below. This is the recommended option for standard users.
    • You can also clone the full repo. This is recommended for developers who intend to contribute to the repo.
  • Installation:
    • You can install the library using the provided setup.py script, e.g., via python setup.py install
    • The library is written in pure Python. If you just want to test the library without installing it on your system, then simply set the PYTHONPATH to the main folder of the checked-out version of the repo (where the folder brain is located) and the library should be good to go.
  • Required libraries:
    • h5py and numpy
    • The brain.readers.htkcollection module also requires scipy for reading .mat files. NOTE: scipy is not installed by default since most users of the library will not need the HTK readers.

For example, a simple installation may look like this:

wget -O BrainFormat-0.1a.zip https://bitbucket.org/oruebel/brainformat/downloads/BrainFormat-0.1a.zip
unzip BrainFormat-0.1a.zip
cd BrainFormat-0.1a/
python setup.py install
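
After installing, it can be useful to confirm that the required and optional dependencies listed above can be found by Python. The following is a small sketch using only the standard library. The find_missing helper is a hypothetical name introduced here and is not part of the library.

```python
import importlib.util

REQUIRED = ("h5py", "numpy")  # required by the library
OPTIONAL = ("scipy",)         # only needed by brain.readers.htkcollection


def find_missing(module_names):
    """Return the names from module_names that cannot be located for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


print("missing required:", find_missing(REQUIRED))
print("missing optional:", find_missing(OPTIONAL))
```

The same helper can be called with ("brain",) to check whether the library itself is visible, for example after setting PYTHONPATH instead of running setup.py.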

Installation at NERSC

  • The library is installed at NERSC at /project/projectdirs/m2043/brainformat
  • A module file for using the library at NERSC is available. Simply execute:
module use /global/project/projectdirs/m2043/brainformat/modulefiles
module load brainformat

Alternatively you can also call:

source /project/projectdirs/m2043/brainformat/setup_environment

which simply executes the module use/load calls shown above. Afterwards you should be able to use the module in Python. Here is some simple test code:

# Import the classes used below
from brain.dataformat.brainformat import BrainDataFile, BrainDataECoG

# Create a new file. This will initialize all required components of the file.
f = BrainDataFile.create(parent_object='testfile.h5', mode='a')
# Get the data/internal group where we want to create a neural dataset
g = f.data().internal()
# Create a group for storing ECoG data (again, everything needed is initialized here)
d1 = BrainDataECoG.create(parent_object=g, ephys_data_shape=(2, 5), ephys_data_type='float64', sampling_rate=10, start_time=2)
# We can check whether the file and submodules are compliant with the format via
f.check_format_compliance()
d1.check_format_compliance()
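
Since the format is built on top of HDF5, a file such as testfile.h5 can also be inspected with plain h5py, independently of the BRAINFormat API. The sketch below lists all groups and datasets in a file. The list_hdf5_objects helper is a hypothetical name introduced here, and the example assumes that the file is no longer open for writing in another session.

```python
import os

import h5py


def list_hdf5_objects(filename):
    """Return a sorted list of (path, kind) tuples for all objects in the file."""
    found = []

    def visitor(name, obj):
        # visititems passes the path relative to the root, without a leading "/".
        if isinstance(obj, h5py.Dataset):
            kind = "dataset"
        elif isinstance(obj, h5py.Group):
            kind = "group"
        else:
            kind = "other"
        found.append(("/" + name, kind))

    with h5py.File(filename, "r") as h5file:
        h5file.visititems(visitor)
    return sorted(found)


# 'testfile.h5' is the file created by the test code above; skip if it is absent.
if os.path.exists("testfile.h5"):
    for path, kind in list_hdf5_objects("testfile.h5"):
        print(kind, path)
```

Note that visititems does not follow soft or external links, so managed objects stored in separate files will not appear in this listing.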

Contribution guidelines

  • For details on legal/licence implications for contributing to this repository see the licence.txt file included with the repo and copyright notice below.
  • Please use the online issue tracker to view and report bugs, enhancements, and proposals.
  • The library uses sphinx style for documentation. All classes, modules and functions should be fully documented before submission to the repo.
  • Implementation should adhere to basic PEP8 coding style, which can be checked, e.g., with PyLint.
  • Readers for external file formats should be placed in the /brain/readers module.
  • New tools for interaction with the data should be placed in the /brain/tools module.
  • New general-purpose HDF5 managed object modules and file format modules should become a submodule of /brain/dataformat.
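
As an illustration of the documentation and style guidelines above, here is a sketch of a function with a Sphinx-style docstring. The function itself (seconds_to_sample_index) is hypothetical and is not part of the library; it only shows the kind of field list (:param:, :type:, :returns:, :raises:) that Sphinx can render.

```python
def seconds_to_sample_index(time, sampling_rate, start_time=0.0):
    """
    Convert a time in seconds to the index of the corresponding sample.

    :param time: Time of interest in seconds.
    :type time: float
    :param sampling_rate: Number of samples per second. Must be positive.
    :type sampling_rate: float
    :param start_time: Time in seconds at which the first sample was recorded.
    :type start_time: float

    :returns: Zero-based index of the sample, obtained by truncating
        ``(time - start_time) * sampling_rate`` to an integer.
    :rtype: int

    :raises ValueError: If ``sampling_rate`` is not positive or ``time`` lies
        before ``start_time``.
    """
    if sampling_rate <= 0:
        raise ValueError("sampling_rate must be positive")
    if time < start_time:
        raise ValueError("time must not be earlier than start_time")
    return int((time - start_time) * sampling_rate)


print(seconds_to_sample_index(3.0, sampling_rate=10, start_time=2))
```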

Contact

Citing BRAINformat

O. Ruebel, Prabhat, P. Denes, D. Conant, E. Chang, and K. Bouchard, "BRAINformat: A Data Standardization Framework for Neuroscience Data," in bioRxiv, Cold Spring Harbor Labs Journals, August 2015. DOI: 10.1101/024521.

BRAINFormat Copyright (c) 2014, 2015, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy). All rights reserved.

If you have questions about your rights to use or distribute this software, please contact Berkeley Lab's Innovation & Partnerships Office at IPO@lbl.gov referring to "BrainFormat (LBNL Ref 2015-020)."

NOTICE. This software was developed under funding from the U.S. Department of Energy. As such, the U.S. Government has been granted for itself and others acting on its behalf a paid-up, nonexclusive, irrevocable, worldwide license in the Software to reproduce, prepare derivative works, and perform publicly and display publicly. Beginning five (5) years after the date permission to assert copyright is obtained from the U.S. Department of Energy, and subject to any subsequent five (5) year renewals, the U.S. Government is granted for itself and others acting on its behalf a paid-up, nonexclusive, irrevocable, worldwide license in the Software to reproduce, prepare derivative works, distribute copies to the public, perform publicly and display publicly, and to permit others to do so.