Python Setup

This meta-project collects all of the python tools I typically use. It also serves as a fairly minimal example of setting up a package that pip can install and of specifying its dependencies.

Summary

There are two recommended ways of installing python:

Enthought Python Distribution:
The Enthought Python Distribution (EPD), provided by Enthought, is a complete python installation including NumPy, SciPy, matplotlib, and many other useful tools. These all come packaged in a single working distribution that can be installed at once. There is both a free version and a professional version; the latter is free for academic use and includes some additional tools for analysis and visualization.
Anaconda:
An alternative complete python installation is provided through Anaconda from Continuum Analytics, developed by many of the people who developed the Enthought Python Distribution. There are also free and professional versions, with the latter being free for academic use. One caveat is that Anaconda does not work on as many systems (it will not work on my 32-bit Mac, for example). As a plus, it includes some cutting-edge high-performance tools like Numba.

My suggestion is to install one (or both) of these, then create appropriate virtual environments for managing any additional packages you need to install. In this way, you keep your system python clean, have access to the latest tools in a complete distribution (which is also kept clean), and can play with various package combinations. This is important, for example, if you want to make sure you understand all of the dependencies of your code.

Note that Anaconda has its own environment manager, Conda, which it uses instead of virtualenv, so the setups are different. We describe both flavours here.
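
Because the two setups differ, it can help to check which kind of environment a given interpreter is actually running in. The following is a minimal sketch and not part of this project: the function name environment_kind is hypothetical, and it relies on two conventions, namely that a virtualenv changes sys.prefix relative to the base installation (older releases also set sys.real_prefix) and that Conda keeps a conda-meta directory inside each of its environments.

```python
import os
import sys


def environment_kind(prefix=None, base_prefix=None, has_real_prefix=None):
    """Classify an interpreter as 'virtualenv', 'conda', or 'plain'.

    The arguments are optional overrides (useful for testing); by default
    the values are read from the running interpreter.
    """
    if prefix is None:
        prefix = sys.prefix
    if base_prefix is None:
        # sys.base_prefix only exists on Python 3.3+.
        base_prefix = getattr(sys, "base_prefix", prefix)
    if has_real_prefix is None:
        # Older virtualenv releases set sys.real_prefix instead.
        has_real_prefix = hasattr(sys, "real_prefix")

    if has_real_prefix or base_prefix != prefix:
        return "virtualenv"
    if os.path.isdir(os.path.join(prefix, "conda-meta")):
        return "conda"
    return "plain"
```

Calling environment_kind() with no arguments from an activated environment reports how that interpreter was set up.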

Quick Start

Enthought Python Distribution

If you are impatient and courageous, here is the executive summary based on the Enthought Python Distribution:

  • Install Enthought Python Distribution (or Anaconda), git, and GSL.

  • Install virtualenv (and pip, which is not provided by Enthought Python Distribution):

    sudo easy_install pip
    sudo pip install virtualenv
    

    or download virtualenv.py and replace virtualenv with python virtualenv.py below if you want to keep your base python installation pure.

  • Create the virtual environments and set up some aliases:

    virtualenv --system-site-packages --distribute ~/.python_environments/epd
    virtualenv --no-site-packages --distribute ~/.python_environments/clean
    virtualenv -p /usr/bin/python --system-site-packages --distribute \
               ~/.python_environments/sys
    virtualenv -p ~/usr/apps/anaconda/Current/bin/python \
               --system-site-packages --distribute \
               ~/.python_environments/anaconda
    
    cat >> ~/.bashrc <<EOF
    alias v.epd=". ~/.python_environments/epd/bin/activate"
    alias v.sys=". ~/.python_environments/sys/bin/activate"
    alias v.clean=". ~/.python_environments/clean/bin/activate"
    v.epd
    EOF
    
  • Install Mercurial:

    pip install mercurial
    
  • If on a Mac, then fix pythonw:

    mkdir -p ~/src/python/git
    cd ~/src/python/git
    #git clone http://github.com/gldnspud/virtualenv-pythonw-osx.git
    git clone http://github.com/nicholsn/virtualenv-pythonw-osx.git
    cd virtualenv-pythonw-osx
    deactivate; v.epd        # Make sure you use the appropriate virtualenv
    python install_pythonw.py /Users/mforbes/.python_environments/epd
    
  • Activate your desired virtual environment and choose the set of requirements to install:

    v.epd
    pip install -r all.txt
    

Anaconda

I install Anaconda in /data/apps/anaconda/1.3.1 which I symlink to /data/apps/anaconda/current. Add /data/apps/anaconda/current/bin to your path. One can use Conda to manage the equivalents of virtual environments, but for now I am just using a "global" environment. I needed to do the following to get to a working state:

conda update anaconda conda ipython pip sympy numexpr
conda pip ipdb winpdb zope.interface mercurial
conda pip psutil memory_profiler
conda pip scikits.bvp1lg theano
conda pip pp
conda pip

Requirements

Here is a list of the various requirements. These are all disjoint, so you can pick and choose.

doc.txt :
Various documentation tools like Sphinx and associated packages. I use this for both my code documentation and for things like my website.
emacs.txt :
Various tools for setting up my development environment (I use emacs) including checking tools.
debug.txt :
Debugging tools, including remote debuggers.
profile.txt :
Profiling tools for optimizing code.
testing.txt :
Testing tools including code coverage.
vc.txt :
Version control tools like mercurial and extensions.
misc.txt :
Odds and ends.
mmf.txt :
My source packages for projects. These will be installed as source distributions.
all.txt :
All of the above.
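
One way an aggregate file such as all.txt can be written is with pip's -r include lines, which pull in other requirement files. Whether all.txt here is actually built that way is an assumption. The sketch below (the helper name flatten_requirements is hypothetical) only illustrates how such nested files resolve to a flat list of requirements.

```python
import os


def flatten_requirements(path, _seen=None):
    """Return requirement lines from `path`, following `-r other.txt` includes.

    Included paths are resolved relative to the including file. Comments and
    blank lines are skipped, and each file is read at most once.
    """
    seen = set() if _seen is None else _seen
    path = os.path.abspath(path)
    if path in seen:
        return []
    seen.add(path)

    requirements = []
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#"):
                continue
            if line.startswith("-r ") or line.startswith("--requirement "):
                included = line.split(None, 1)[1]
                included = os.path.join(os.path.dirname(path), included)
                requirements.extend(flatten_requirements(included, seen))
            else:
                requirements.append(line)
    return requirements
```

This handles only the spaced forms "-r file" and "--requirement file"; other option lines are passed through unchanged.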

Here are some additional requirement files:

EPD.txt :
The list of requirements frozen from a fresh EPD install.
freeze.txt :
Snapshot of my system, produced by running pip freeze > freeze.txt.
bleeding-edge.txt :

Installs NumPy, SciPy, and matplotlib from source. Note: this does not work out of the box; for some reason pip fails to install some compiled libraries. (The NumPy install will look fine, but SciPy will then fail.) Here is a discussion. To deal with this, first use pip to install the development version of NumPy, which downloads the source. Then go into the source directory and run python setup.py install --prefix=/path/to/virtualenv, i.e.:

pip install --upgrade -r bleeding-edge.txt
cd ~/.python_environments/epd/src/numpy
python setup.py install --prefix="$HOME/.python_environments/epd"
mac.txt :
Specific packages for Macs.

Details

To use it do the following:

  1. Install a version of python. Many systems have a version preinstalled, so this step is optional. However, if you plan to do serious development, I strongly recommend installing the Enthought Python Distribution (EPD). There is a free version and an almost full-featured version that is free for academic use; you can also pay for a commercial version and receive support. The EPD is very complete and just works on most common platforms. Make sure you can run the version of python you desire.

    If you install the EPD, then it will typically add something like the following to your ~/.bash_login or ~/.profile files:

    # Setting PATH for EPD-7.3-2
    # The orginal version is saved in .bash_login.pysave
    PATH="/Library/Frameworks/Python.framework/Versions/Current/bin:${PATH}"
    export PATH
    
    MKL_NUM_THREADS=1
    export MKL_NUM_THREADS
    

    (If you want to use a multithreaded version of numpy, you will need to change the value of MKL_NUM_THREADS. See this discussion.)

  2. Create a virtualenv. This will allow you to install new packages in a controlled manner that will not mess with the system version (or the EPD version). You can create multiple virtual environments for different projects or associated with different versions of python. Again, this is highly recommended. There are several ways of doing this.

    Note

    Methods 1) and 2) will install virtualenv to the location specified by the current version of python. This means that you might need root access, and it will slightly "muck up" your pristine system install. This is generally not a problem, but if it bothers you, see method 3).

    1. If you have pip (the new python packaging system), then you can use it to install virtualenv as follows:

      pip install virtualenv
      
    2. If you do not have pip, you might have easy_install:

      easy_install virtualenv
      
    3. If you do not want to muck up your system version of python at all, then you can simply download the file virtualenv.py. In the commands that follow, replace virtualenv with python virtualenv.py.

  3. Set up a virtual environment for your work. You can have many different environments, so you will need to choose a meaningful name. I use "epd" for the EPD version of python, "sys" for the system version of python, and "clean" for a version using EPD but without the site-packages:

    virtualenv --system-site-packages --distribute ~/.python_environments/epd
    virtualenv --no-site-packages --distribute ~/.python_environments/clean
    virtualenv -p /usr/bin/python --system-site-packages --distribute \
               ~/.python_environments/sys
    

    Once this virtualenv is activated, installing packages with pip will place all of the installed files in the ~/.python_environments/epd directory. (You can change this to any convenient location.) The --system-site-packages option gives the virtualenv access to the system libraries (in my case, all of the EPD goodies). If you want to test a system for deployment, making sure that it does not have any external dependencies, then use the --no-site-packages option instead. Run virtualenv --help for more information.

  4. Add some aliases to help you activate virtualenv sessions. I include the following in my .bashrc file:

    # Some virtualenv related macros
    alias v.epd=". ~/.python_environments/epd/bin/activate"
    alias v.sys=". ~/.python_environments/sys/bin/activate"
    alias v.clean=". ~/.python_environments/clean/bin/activate"
    v.epd
    

    You can activate your chosen environment with one of the commands v.epd, v.clean, or v.sys. The default activation script will insert "(epd)" etc. into your prompt:

    ~ mforbes$ v.epd
    (epd)~ mforbes$ v.sys
    (sys)~ mforbes$ deactivate
    ~ mforbes$
    

    To get out of the environments, just type deactivate as shown above.

    Note

    If you have an older version of IPython (pre 0.13), then you may need to call ipython from a function like this:

    # Remap ipython if VIRTUAL_ENV is defined
    function ipython {
      if [ -n "${VIRTUAL_ENV}" -a -x "${VIRTUAL_ENV}/bin/python" ]; then
        START_IPYTHON='\
          import sys; \
          from IPython.frontend.terminal.ipapp import launch_new_instance;\
          sys.exit(launch_new_instance())'
         "${VIRTUAL_ENV}/bin/python" -c "${START_IPYTHON}" "$@"
       else
         command ipython "$@"
       fi
    }
    

    This works around the fact that older versions of IPython were not virtualenv aware. The recommended solution is still to install IPython in the virtualenv using pip install ipython, but then you will need one in each environment. As of IPython 0.13, this support is included. (See this PR.)

    If you have not used IPython before, then you should have a look. It has some fantastic features like %paste and the IPython notebook interface.

  5. Install mercurial. You may already have this (try hg --version). If not, either install a native distribution (which might have some GUI tools) or install with:

    pip install mercurial
    
  6. Install git. This may not be as easy, but some packages are only available from github.

  7. On Mac OS X you may need to install pythonw for some GUI applications (like RunSnakeRun). You can do this using this solution:

    mkdir -p ~/src/python/git
    cd ~/src/python/git
    git clone http://github.com/gldnspud/virtualenv-pythonw-osx.git
    cd virtualenv-pythonw-osx
    python install_pythonw.py /Users/mforbes/.python_environments/epd
    

    You will have to do this in each virtualenv you want to use the GUI apps from.

  8. Non-python prerequisites. These need to be installed outside of the python environment for some of the required libraries to work.

  9. Install various requirements as follows:

    pip install -r requirements/all.txt
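
Several of the steps above depend on command-line tools being available (pip, virtualenv, hg, git). A small check like the following can confirm that they are on the PATH before running the install. This is only a sketch: missing_tools is a hypothetical helper, and it uses shutil.which, which requires Python 3.3 or later.

```python
import shutil


def missing_tools(tools=("pip", "virtualenv", "hg", "git")):
    """Return the names in `tools` that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All tools found.")
```

Run it from inside the activated virtualenv so that the PATH being checked is the one the install will use.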
    

Using pip

Here are some notes about using pip that I did not find obvious.

Version Control

It is clear from the documentation about requirements that you can specify version-controlled repositories with pip; however, the exact syntax for specifying revisions etc. is not so clear. Examining the source shows that you can specify revisions, tags, etc. as follows:

# Get the "tip"
hg+http://bitbucket.org/mforbes/pymmf#egg=pymmf

# Get the revision with tag "v1.0" or at the tip of branch "v1.0"
hg+https://bitbucket.org/mforbes/pymmf@v1.0#egg=pymmf

# Get the specified revision exactly
hg+https://bitbucket.org/mforbes/pymmf@633be89a#egg=pymmf

What appears after the "@" sign is any valid revision (for mercurial, see hg help revision for various options). Unfortunately, I see no way of specifying something like ">=1.1" or ">=633be89a" (i.e. a descendant of a particular revision). (See issue 782.)
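
To make the pieces of these requirement lines explicit, here is a small illustrative parser that splits one into the version control system, the repository URL, the revision after the "@" sign, and the egg name. It is a sketch for reading the syntax only, not pip's own implementation, and the function name parse_vcs_requirement is hypothetical.

```python
from urllib.parse import urlsplit


def parse_vcs_requirement(line):
    """Split 'vcs+URL[@revision]#egg=name' into its parts."""
    vcs, _, url = line.partition("+")
    parts = urlsplit(url)

    # The egg name lives in the URL fragment, e.g. '#egg=pymmf'.
    egg = None
    for item in parts.fragment.split("&"):
        key, _, value = item.partition("=")
        if key == "egg":
            egg = value

    # The optional revision follows '@' in the path part of the URL.
    path, _, revision = parts.path.partition("@")
    repo = f"{parts.scheme}://{parts.netloc}{path}"
    return {"vcs": vcs, "repo": repo, "revision": revision or None, "egg": egg}


if __name__ == "__main__":
    print(parse_vcs_requirement(
        "hg+https://bitbucket.org/mforbes/pymmf@v1.0#egg=pymmf"))
```

A line without an "@" part, such as the "tip" example above, yields a revision of None.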

Using the MKL

The EPD is built using the Intel MKL. Here are some instructions on how to compile your own version of NumPy and SciPy with the MKL.

  • Checkout the source code:

    pip install --no-install -e git+http://github.com/numpy/numpy#egg=numpy-dev
    pip install --no-install -e git+http://github.com/scipy/scipy#egg=scipy-dev
    
  • Setup the environment to use the Intel compilers:

    . /usr/local/bin/intel64.sh
    . /opt/intel/Compiler/11.1/069/mkl/tools/environment/mklvarsem64t.sh
    
  • Edit the site.cfg file in the NumPy source directory. I am not sure exactly which libraries to include. See these discussions:

    cd ~/.python_environments/epd/src/numpy
    cp site.cfg.example site.cfg
    vi site.cfg
    

    Here is what I used:

    [mkl]
    library_dirs = /opt/intel/Compiler/11.1/069/mkl/lib/em64t/
    include_dirs = /opt/intel/Compiler/11.1/069/mkl/include
    lapack_libs = mkl_lapack95_lp64
    mkl_libs = mkl_def, mkl_intel_lp64, mkl_intel_thread, mkl_core, mkl_mc
    

    I also needed to modify numpy/distutils/intelccompiler.py as follows:

         cc_args = "-fPIC"
         def __init__ (self, verbose=0, dry_run=0, force=0):
             UnixCCompiler.__init__ (self, verbose,dry_run, force)
    -        self.cc_exe = 'icc -m64 -fPIC'
    +        self.cc_exe = 'icc -O3 -g -openmp -m64 -fPIC'
             compiler = self.cc_exe
             self.set_executables(compiler=compiler,
                                  compiler_so=compiler,
    
  • Build both NumPy and SciPy with the following:

    cd ~/.python_environments/epd/src/numpy
    python setup.py config --compiler=intelem --fcompiler=intelem\
                build_clib --compiler=intelem --fcompiler=intelem\
                build_ext --compiler=intelem --fcompiler=intelem\
                install
    cd ~/.python_environments/epd/src/scipy
    
  • Run and check the build configuration:

    $ python -c "import numpy;print numpy.__file__;print numpy.show_config()"
    /phys/users/mforbes/.python_environments/epd/lib/python2.7/site-packages/numpy/__init__.pyc
    lapack_opt_info:
        libraries = ['mkl_lapack95_lp64', 'mkl_def', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'pthread']
        library_dirs = ['/opt/intel/Compiler/11.1/069/mkl/lib/em64t/']
        define_macros = [('SCIPY_MKL_H', None)]
        include_dirs = ['/opt/intel/Compiler/11.1/069/mkl/include']
    blas_opt_info:
        libraries = ['mkl_def', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'pthread']
        library_dirs = ['/opt/intel/Compiler/11.1/069/mkl/lib/em64t/']
        define_macros = [('SCIPY_MKL_H', None)]
        include_dirs = ['/opt/intel/Compiler/11.1/069/mkl/include']
    lapack_mkl_info:
        libraries = ['mkl_lapack95_lp64', 'mkl_def', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'pthread']
        library_dirs = ['/opt/intel/Compiler/11.1/069/mkl/lib/em64t/']
        define_macros = [('SCIPY_MKL_H', None)]
        include_dirs = ['/opt/intel/Compiler/11.1/069/mkl/include']
    blas_mkl_info:
        libraries = ['mkl_def', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'pthread']
        library_dirs = ['/opt/intel/Compiler/11.1/069/mkl/lib/em64t/']
        define_macros = [('SCIPY_MKL_H', None)]
        include_dirs = ['/opt/intel/Compiler/11.1/069/mkl/include']
    mkl_info:
        libraries = ['mkl_def', 'mkl_intel_lp64', 'mkl_intel_thread', 'mkl_core', 'mkl_mc', 'pthread']
        library_dirs = ['/opt/intel/Compiler/11.1/069/mkl/lib/em64t/']
        define_macros = [('SCIPY_MKL_H', None)]
        include_dirs = ['/opt/intel/Compiler/11.1/069/mkl/include']
    None
    

    Note

    You will need to set up the environment to run with the MKL libraries. The EPD avoids this by distributing the libraries. I suggest that you add the following to the activation script:

    cat >> ~/.python_environments/epd/bin/activate <<EOF
    
    # This adds the MKL libraries to the path for use with my custom numpy
    # and scipy builds.
    . /usr/local/bin/intel64.sh
    . /opt/intel/Compiler/11.1/069/mkl/tools/environment/mklvarsem64t.sh
    EOF
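
After sourcing the Intel scripts, a rough way to check whether the MKL libraries named in site.cfg can be located is to ask ctypes for them. This is a sketch under assumptions: missing_libraries is a hypothetical helper, and ctypes.util.find_library follows platform-specific search rules that do not always match the dynamic loader, so a "missing" result is a hint rather than proof.

```python
from ctypes.util import find_library

# Library names taken from the mkl_libs line of the site.cfg shown above.
MKL_LIBRARIES = ("mkl_def", "mkl_intel_lp64", "mkl_intel_thread",
                 "mkl_core", "mkl_mc")


def missing_libraries(names=MKL_LIBRARIES):
    """Return the library names that ctypes cannot locate on this system."""
    return [name for name in names if find_library(name) is None]


if __name__ == "__main__":
    print("Not located:", missing_libraries())
```

The more direct test remains importing numpy in the activated environment and inspecting numpy.show_config() as shown above.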
    

See also:

Other Software

This section describes various other pieces of software that I use that interact with python.

pyaudio

pyaudio is a python interface to the PortAudio library for generating sounds and sound files. To do real-time sound generation, one really needs the non-blocking interface (otherwise, the delay between blocking calls will affect the signal in a manner that is difficult to compensate for). Unfortunately, the default builds require Mac OS X 10.7 or higher.

reStructuredText

I like to write my local documentation in reStructuredText (such as this file). As I often use math, I make the default role :math:`` and use MathJax. Here is an example:

.. default-role:: math

Now I can type math like this: `E=mc^2` or in an equation like this

.. math::
   \int_0^1 e^{x}\,dx = e - 1

Note

Now I can type math like this: \(E=mc^2\) or in an equation like this

\begin{equation*} \int_0^1 e^{x}\,dx = e - 1 \end{equation*}

In order to work offline, I install MathJax locally using IPython as described here:

from IPython.external.mathjax import install_mathjax
install_mathjax()

This installs it in ~/.python_environments/epd/lib/python2.7/site-packages/IPython/frontend/html/notebook/static/mathjax which can be used locally. I symlink it to ~/.mathjax, but you must still find a way to load the MathJax script from your HTML. One way is with the .. raw:: html directive:

.. raw:: html

   <script type="text/javascript"
    src="/Users/mforbes/.mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
   </script>
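
Since the script path differs from machine to machine, the directive can also be generated rather than typed by hand. The following is a minimal sketch (mathjax_directive is a hypothetical helper, not something provided by docutils or IPython) that produces the same text as above for a given MathJax directory.

```python
def mathjax_directive(mathjax_dir, config="TeX-AMS-MML_HTMLorMML"):
    """Return a reStructuredText raw-html block that loads a local MathJax."""
    src = "{}/MathJax.js?config={}".format(mathjax_dir.rstrip("/"), config)
    return (
        ".. raw:: html\n"
        "\n"
        '   <script type="text/javascript"\n'
        '    src="{}">\n'
        "   </script>\n"
    ).format(src)


if __name__ == "__main__":
    print(mathjax_directive("/Users/mforbes/.mathjax"))
```

The output can be pasted at the top of a document or prepended by whatever script builds the HTML.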

Profiling

This page has a great discussion of line and memory profiling:

Emacs

I use Emacs as my principal editor and like to have access to syntax highlighting, auto-completion, etc. Thus, I typically install the following packages, but installing them is not completely straightforward.

Pymacs

Pymacs allows Emacs to access the python interpreter and is used by Ropemacs to provide some nice features like code checking. The source appears not to be pip installable, so you must download it and run make as follows:

git clone http://github.com/pinard/Pymacs.git
cd Pymacs
make
pip install -e .

Anaconda

Anaconda provides a very nice python system, especially with the Conda package management tool, but there are a few problems:

  1. No installation for 32-bit Mac OS X systems. (No longer an issue for me since I finally have a 64 bit machine.)
  2. No Mayavi. This means that I must maintain an EPD 32-bit installation as well (with all my required packages) in order to visualize.

Creating Packages

As an example, here we create a Conda package for installing FFTW and related software. We start with a fresh Anaconda installation. The following command would show any installed files that are not managed by Conda:

$ conda package --untracked
prefix: /data/apps/anaconda/1.3.1

Now we manually install FFTW etc.:

cd ~/src
wget http://www.fftw.org/fftw-3.3.3.tar.gz
wget http://www.fftw.org/fftw-3.3.3.tar.gz.md5sum
md5 fftw-3.3.3.tar.gz           # Check that this is okay
tar -zxvf fftw-3.3.3.tar.gz
cd fftw-3.3.3

# Build and install the single, double, long-double
# and quad-precision versions
PREFIX=/data/apps/anaconda/current/
for opt in " " "--enable-sse2 --enable-single" \
               "--enable-long-double" "--enable-quad-precision"; do
  ./configure --prefix="${PREFIX}"\
              --enable-threads\
              --enable-shared\
              $opt
  make -j8 install
done

# Note: this needs a patch to work on Mac OS X
# https://code.google.com/p/anfft/issues/detail?id=4
export FFTW_PATH=/data/apps/anaconda/current/lib/
pip install --upgrade anfft pyfftw

The newly installed files are untracked:

$ conda package --untracked
prefix: /data/apps/anaconda/1.3.1
bin/fftw-wisdom
...
include/fftw3.f
...
lib/libfftw3.3.dylib
...
lib/pkgconfig/fftw3.pc
...
lib/python2.7/site-packages/Mako-0.7.3-py2.7.egg-info/PKG-INFO
...
lib/python2.7/site-packages/anfft-0.2-py2.7.egg-info/PKG-INFO
...
lib/python2.7/site-packages/pyFFTW-0.9.0-py2.7.egg-info/PKG-INFO
...
lib/python2.7/site-packages/pyfftw/__init__.py
...
share/info/fftw3.info
...
share/man/man1/fftw-wisdom-to-conf.1
...
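
With a long listing it can be easier to see what was added by counting untracked files per top-level directory. The sketch below assumes the output format shown above (a "prefix:" line followed by one relative path per line); summarize_untracked is a hypothetical helper, not a Conda command.

```python
from collections import Counter


def summarize_untracked(output):
    """Count untracked paths by their top-level directory (bin, lib, ...)."""
    counts = Counter()
    for line in output.splitlines():
        line = line.strip()
        # Skip blanks, the elided '...' markers, and the 'prefix:' header.
        if not line or line == "..." or line.startswith("prefix:"):
            continue
        counts[line.split("/", 1)[0]] += 1
    return counts


if __name__ == "__main__":
    import sys
    # Example: conda package --untracked | python summarize.py
    for top, n in sorted(summarize_untracked(sys.stdin.read()).items()):
        print(top, n)
```

The untracked files listed above are what the next step packages.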

The untracked files can be bundled into a new package that can later be installed directly:

$ conda package --pkg-name=fftw --pkg-version=3.3.3
prefix: /data/apps/anaconda/1.3.1
Number of files: 82
fftw-3.3.3-py27_0.tar.bz2 created successfully
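
Before bundling, it may be worth confirming that the build loop above really installed all four precision variants. FFTW names its libraries libfftw3 (double), libfftw3f (single), libfftw3l (long double), and libfftw3q (quad). The sketch below looks for them under a prefix; installed_fftw_precisions is a hypothetical helper, and the lib subdirectory is assumed from the listing above.

```python
import glob
import os

# Library stem for each precision that the build loop configures.
FFTW_LIBRARIES = {
    "double": "libfftw3",
    "single": "libfftw3f",
    "long-double": "libfftw3l",
    "quad": "libfftw3q",
}


def installed_fftw_precisions(prefix):
    """Return the sorted precisions whose FFTW library exists in prefix/lib."""
    lib_dir = os.path.join(prefix, "lib")
    found = []
    for precision, stem in FFTW_LIBRARIES.items():
        # Match e.g. libfftw3.3.dylib, libfftw3.so, or libfftw3.a.
        if glob.glob(os.path.join(lib_dir, stem + ".*")):
            found.append(precision)
    return sorted(found)


if __name__ == "__main__":
    print(installed_fftw_precisions("/data/apps/anaconda/current"))
```

A precision missing from the result suggests that the corresponding configure and make pass failed.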

Problems

I had problems installing a virtual environment with Anaconda. Don't do this! Use Conda instead.
