-OleFileIO_PL is a Python module to read Microsoft OLE2 files (also called
-Structured Storage or Compound Document File Format), such as Microsoft Office
-documents, Image Composer and FlashPix files, Outlook messages, etc.
+`OleFileIO\_PL <http://www.decalage.info/python/olefileio>`_ is a Python
+module to read `Microsoft OLE2 files (also called Structured Storage,
+Compound File Binary Format or Compound Document File
+such as Microsoft Office documents, Image Composer and FlashPix files,
-This is an improved version of the OleFileIO module from PIL, the excellent
-Python Imaging Library v1.1.6, created and maintained by Fredrik Lundh.
+This is an improved version of the OleFileIO module from
+`PIL <http://www.pythonware.com/products/pil/index.htm>`_, the excellent
+Python Imaging Library, created and maintained by Fredrik Lundh. The API
+is still compatible with PIL, but I have improved the internal
+implementation significantly, with bugfixes and a more robust design.
-The API is still compatible with PIL, but the internal implementation has been
-improved a lot, with bugfixes and a more robust design. As far as I know, this
-module is the most complete and robust Python implementation to read MS OLE2
-files, portable on several OSes.
+As far as I know, this module is now the most complete and robust Python
+implementation to read MS OLE2 files, portable on several operating
+systems. (Please tell me if you know of other similar Python modules.)
WARNING: THIS IS (STILL) WORK IN PROGRESS.
+Main improvements over PIL version:
+- Better compatibility with Python 2.4 up to 2.7
+- Support for files larger than 6.8MB
+- Robust: many checks to detect malformed files
+- Added setup.py and install.bat to ease installation
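The 6.8MB figure in the list above is not explained in this README. One plausible reading, stated here as an assumption rather than the author's explanation, is that it is the largest file addressable through the 109 FAT sector slots stored in the OLE2 header itself, with 512-byte sectors, before additional DIFAT sectors are needed. A minimal sketch of that arithmetic follows; the function name is hypothetical and is not part of OleFileIO_PL.

```python
# Back-of-the-envelope check of the ~6.8MB limit.
# Assumption: the limit comes from using only the FAT sector list
# stored in the OLE2 header, without following extra DIFAT sectors.
HEADER_DIFAT_ENTRIES = 109   # FAT sector slots available in the header
SECTOR_ID_SIZE = 4           # each sector index is a 32-bit integer


def max_size_without_difat(sector_size=512):
    """Bytes addressable using only the FAT sectors listed in the header."""
    entries_per_fat_sector = sector_size // SECTOR_ID_SIZE
    addressable_sectors = HEADER_DIFAT_ENTRIES * entries_per_fat_sector
    return addressable_sectors * sector_size


limit = max_size_without_difat()
print("%d bytes (about %.1f MB)" % (limit, limit / (1024.0 * 1024.0)))
# prints: 7143424 bytes (about 6.8 MB)
```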
-- on Windows, launch install.bat
-- on other systems, launch: setup.py install
+- 2012-02-17 v0.22: fixed issues #7 (bug in getproperties) and #2
+- 2011-10-20: code hosted on bitbucket to ease contributions and bug
+- 2010-01-24 v0.21: fixed support for big-endian CPUs, such as PowerPC
+- 2009-12-11 v0.20: small bugfix in OleFileIO.open when filename is not
+- 2009-12-10 v0.19: fixed support for 64-bit platforms (thanks to Ben
+  G. and Martijn for reporting the bug)
+- see changelog in source code for more info.
+The archive is available on `the project
-See main at the end of the module, and also docstrings.
+See sample code at the end of the module, and also docstrings.
+Here are a few examples:
+    import OleFileIO_PL
+    # Test if a file is an OLE container:
+    assert OleFileIO_PL.isOleFile('myfile.doc')
+    # Open the OLE file:
+    ole = OleFileIO_PL.OleFileIO('myfile.doc')
+    # Test if known streams/storages exist:
+    if ole.exists('worddocument'):
+        print "This is a Word document."
+        print "size :", ole.get_size('worddocument')
+        if ole.exists('macros/vba'):
+            print "This document seems to contain VBA macros."
+    # Extract the "Pictures" stream from a PPT file:
+    if ole.exists('Pictures'):
+        pics = ole.openstream('Pictures')
+        # Binary mode avoids newline translation on Windows:
+        f = open('Pictures.bin', 'wb')
+        f.write(pics.read())
+        f.close()
+It can also be used as a script from the command line to display the
+structure of an OLE file, for example:
+ OleFileIO_PL.py myfile.doc
+A real-life example: `using OleFileIO\_PL for malware analysis and
+The code is available in `a Mercurial repository on
+bitbucket <https://bitbucket.org/decalage/olefileio_pl>`_. You may use
+it to submit enhancements or to report any issue.
+If you would like to help us improve this module, or simply provide
+feedback, you may also send an e-mail to decalage(at)laposte.net. You
+- test this module on different platforms / Python versions
+- improve documentation, code samples, docstrings
+- write unittest test cases
+- provide tricky malformed files
+To report a bug, for example a normal file which is not parsed
+correctly, please use the `issue reporting
+or send an e-mail with an attachment containing the debugging output of
+To do so, run the following command:
+ OleFileIO_PL.py -d -c file >debug.txt
+OleFileIO\_PL is open-source.
+OleFileIO\_PL changes are Copyright (c) 2005-2012 by Philippe Lagadec.
+The Python Imaging Library (PIL) is
+- Copyright (c) 1997-2005 by Secret Labs AB
+- Copyright (c) 1995-2005 by Fredrik Lundh
+By obtaining, using, and/or copying this software and/or its associated
+documentation, you agree that you have read, understood, and will comply
+with the following terms and conditions:
+Permission to use, copy, modify, and distribute this software and its
+associated documentation for any purpose and without fee is hereby
+granted, provided that the above copyright notice appears in all copies,
+and that both that copyright notice and this permission notice appear in
+supporting documentation, and that the name of Secret Labs AB or the
+author not be used in advertising or publicity pertaining to
+distribution of the software without specific, written prior permission.
+SECRET LABS AB AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO
+THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
+FITNESS. IN NO EVENT SHALL SECRET LABS AB OR THE AUTHOR BE LIABLE FOR
+ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER
+RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF
+CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
+CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.