PEP: 416
Title: Add a frozendict builtin type
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>
Status: Rejected
Type: Standards Track
Content-Type: text/x-rst
Created: 29-February-2012
Python-Version: 3.3


Rejection Notice
================

I'm rejecting this PEP.  A number of reasons (not exhaustive):

 * According to Raymond Hettinger, use of frozendict is low.  Those
   that do use it tend to use it as a hint only, such as declaring
   global or class-level "constants": they aren't really immutable,
   since anyone can still assign to the name.
 * There are existing idioms for avoiding mutable default values.
 * The potential of optimizing code using frozendict in PyPy is
   unsure; a lot of other things would have to change first.  The same
   holds for compile-time lookups in general.
 * Multiple threads can agree by convention not to mutate a shared
   dict, there's no great need for enforcement.  Multiple processes
   can't share dicts.
 * Adding a security sandbox written in Python, even with a limited
   scope, is frowned upon by many, due to the inherent difficulty with
   ever proving that the sandbox is actually secure.  Because of this
   we won't be adding one to the stdlib any time soon, so this use
   case falls outside the scope of a PEP.

On the other hand, exposing the existing read-only dict proxy as a
built-in type sounds good to me.  (It would need to be changed to
allow calling the constructor.)  GvR.

**Update** (2012-04-15): A new ``MappingProxyType`` type was added to the types
module of Python 3.3.
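
As a minimal sketch of the type mentioned in the update, the following wraps an
ordinary dict in ``types.MappingProxyType``. The ``settings`` and ``config``
names are illustrative only::

    import types

    # Illustrative names; any mapping can be wrapped.
    settings = {"debug": False, "retries": 3}
    config = types.MappingProxyType(settings)

    # Reads behave like a normal mapping.
    retries = config["retries"]

    # Writes through the proxy are refused with a TypeError.
    try:
        config["retries"] = 5
    except TypeError:
        write_refused = True
    else:
        write_refused = False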


Abstract
========

Add a new frozendict builtin type.


Rationale
=========

A frozendict is a read-only mapping: a key cannot be added or removed, and a
key is always mapped to the same value. However, frozendict values are not
required to be hashable. A frozendict is hashable if and only if all of its
values are hashable.

Use cases:

 * Immutable global variable like a default configuration.
 * Default value of a function parameter. Avoid the issue of mutable default
   arguments.
 * Implement a cache: frozendict can be used to store function keywords.
   frozendict can be used as a key of a mapping or as a member of a set.
 * frozendict avoids the need for a lock when the frozendict is shared
   by multiple threads or processes, especially a hashable frozendict. It would
   also help to prevent coroutines (generators + greenlets) from modifying the
   global state.
 * frozendict lookup can be done at compile time instead of runtime because the
   mapping is read-only. frozendict can be used instead of a preprocessor to
   remove conditional code at compilation, like code specific to a debug build.
 * frozendict helps to implement read-only object proxies for security modules.
   For example, it would be possible to use frozendict type for __builtins__
   mapping or type.__dict__. This is possible because frozendict is compatible
   with the PyDict C API.
 * frozendict avoids the need for a read-only proxy in some cases. frozendict is
   faster than a proxy because getting an item from a frozendict is a direct
   lookup, whereas a proxy requires a function call.
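
The cache use case above can be sketched today with ``frozenset(kw.items())``
standing in for the proposed frozendict as the cache key. The ``cached``
decorator and the ``area`` function are hypothetical names, and the sketch
assumes that every keyword value is hashable::

    import functools

    def cached(func):
        """Cache results keyed on keyword arguments (illustrative helper)."""
        cache = {}

        @functools.wraps(func)
        def wrapper(**kw):
            # A frozenset of items is hashable and ignores keyword order,
            # which is what a hashable frozendict key would provide.
            key = frozenset(kw.items())
            if key not in cache:
                cache[key] = func(**kw)
            return cache[key]

        wrapper.cache = cache
        return wrapper

    calls = []

    @cached
    def area(width=1, height=1):
        calls.append((width, height))
        return width * height

    first = area(width=2, height=3)
    second = area(height=3, width=2)   # same key, served from the cache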


Constraints
===========

 * frozendict has to implement the Mapping abstract base class
 * frozendict keys and values can be unorderable
 * a frozendict is hashable if all keys and values are hashable
 * frozendict hash does not depend on the item creation order


Implementation
==============

 * Add a PyFrozenDictObject structure based on PyDictObject with an extra
   "Py_hash_t hash;" field
 * frozendict.__hash__() is implemented using hash(frozenset(self.items())) and
   caches the result in its private hash attribute
 * Register frozendict as a collections.abc.Mapping
 * frozendict can be used with PyDict_GetItem(), but PyDict_SetItem() and
   PyDict_DelItem() raise a TypeError
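
The hashing scheme above can be sketched in pure Python. This is not the C
implementation proposed here: ``FrozenDictSketch`` is a hypothetical name, and
the sketch only mirrors the ``hash(frozenset(self.items()))`` rule with a
cached result::

    from collections.abc import Mapping

    class FrozenDictSketch(Mapping):
        """Read-only mapping with a cached, order-independent hash."""

        __slots__ = ("_data", "_hash")

        def __init__(self, *args, **kw):
            self._data = dict(*args, **kw)
            self._hash = None

        def __getitem__(self, key):
            return self._data[key]

        def __iter__(self):
            return iter(self._data)

        def __len__(self):
            return len(self._data)

        def __hash__(self):
            # Raises TypeError if any value is unhashable.
            if self._hash is None:
                self._hash = hash(frozenset(self._data.items()))
            return self._hash

    a = FrozenDictSketch(x=1, y=2)
    b = FrozenDictSketch([("y", 2), ("x", 1)])   # same items, other order

Like the pure Python recipes listed under "Existing implementations", this
sketch is not truly read-only, because the internal ``_data`` dict remains
reachable from Python code.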


Recipe: hashable dict
======================

To ensure that a frozendict is hashable, values can be checked
before creating the frozendict::

    def hashabledict(*args, **kw):
        # build the items once, accepting the same arguments as dict()
        items = dict(*args, **kw)
        # ensure that all values are hashable
        for value in items.values():
            if isinstance(value, (int, str, bytes, float, frozenset, complex)):
                # skip computing the hash (which may be slow) for builtin
                # types known to be hashable for any value
                continue
            # raises a TypeError if the value is not hashable
            hash(value)
            # don't check the key: frozendict already checks the key
        return frozendict(items)


Objections
==========

*namedtuple may fit the requirements of a frozendict.*

A namedtuple is not a mapping, it does not implement the Mapping abstract base
class.
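
This can be checked directly. A short sketch, where ``Point`` is an
illustrative type::

    from collections import namedtuple
    from collections.abc import Mapping, Sequence

    Point = namedtuple("Point", ["x", "y"])   # illustrative type
    p = Point(1, 2)

    # A namedtuple is a tuple subclass: a sequence, not a mapping.
    is_mapping = isinstance(p, Mapping)
    is_sequence = isinstance(p, Sequence)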

*"frozendict can be implemented in Python using descriptors" and "frozendict
just needs to be practically constant."*

If frozendict is used to harden Python (security purpose), it must be
implemented in C. A type implemented in C is also faster.

*PEP 351 was rejected.*

PEP 351 tries to freeze an object and so may convert a mutable object to an
immutable object (using a different type). frozendict doesn't convert anything:
hash(frozendict) raises a TypeError if a value is not hashable. Freezing an
object is not the purpose of this PEP.


Alternative: dictproxy
======================

Python has a builtin dictproxy type used by the type.__dict__ getter descriptor.
This type is not public. dictproxy is a read-only view of a dictionary, but it
is not a read-only mapping.  If the underlying dictionary is modified, the
change is visible through the dictproxy.
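
These view semantics can be observed with ``types.MappingProxyType``, the
public form of this type added in Python 3.3 (see the update in the rejection
notice). The variable names are illustrative::

    import types

    underlying = {"a": 1}
    view = types.MappingProxyType(underlying)

    # The proxy itself rejects writes, but it follows the underlying dict.
    underlying["b"] = 2
    seen_through_view = dict(view)

    # The __dict__ of a class is exposed through the same kind of proxy.
    class_dict_is_proxy = isinstance(int.__dict__, types.MappingProxyType)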

dictproxy can be created using ctypes and the Python C API, see for example the
`make dictproxy object via ctypes.pythonapi and type() (Python recipe 576540)`_
by Ikkei Shimomura. The recipe contains a test checking that a dictproxy is
"mutable" (modify the dictionary linked to the dictproxy).

However dictproxy can be useful in some cases, where its mutable property is
not an issue, to avoid a copy of the dictionary.


Existing implementations
========================

Whitelist approach.

 * `Implementing an Immutable Dictionary (Python recipe 498072)
   <http://code.activestate.com/recipes/498072/>`_ by Aristotelis Mikropoulos.
   Similar to frozendict except that it is not truly read-only: it is possible
   to access its private internal dict.  It does not implement __hash__ and
   has an implementation issue: it is possible to call __init__() again to
   modify the mapping.
 * PyWebmail contains an ImmutableDict type: `webmail.utils.ImmutableDict
   <http://pywebmail.cvs.sourceforge.net/viewvc/pywebmail/webmail/webmail/utils/ImmutableDict.py?revision=1.2&view=markup>`_.
   It is hashable if keys and values are hashable. It is not truly read-only:
   its internal dict is a public attribute.
 * remember project: `remember.dicts.FrozenDict
   <https://bitbucket.org/mikegraham/remember/src/tip/remember/dicts.py>`_.
   It is used to implement a cache: FrozenDict is used to store function callbacks.
   FrozenDict may be hashable. It has an extra supply_dict() class method to
   create a FrozenDict from a dict without copying the dict: store the dict as
   the internal dict. Implementation issue: __init__() can be called to modify
   the mapping and the hash may differ depending on item creation order. The
   mapping is not truly read-only: the internal dict is accessible in Python.


Blacklist approach: inherit from dict and override write methods to raise an
exception. It is not truly read-only: it is still possible to call dict methods
on such "frozen dictionary" to modify it.

 * brownie: `brownie.datastructures.ImmutableDict
   <https://github.com/DasIch/brownie/blob/HEAD/brownie/datastructures/mappings.py>`_.
   It is hashable if keys and values are hashable. werkzeug project has the
   same code: `werkzeug.datastructures.ImmutableDict
   <https://github.com/mitsuhiko/werkzeug/blob/master/werkzeug/datastructures.py>`_.
   ImmutableDict is used for global constant (configuration options). The Flask
   project uses ImmutableDict of werkzeug for its default configuration.
 * SQLAlchemy project: `sqlalchemy.util.immutabledict
   <http://hg.sqlalchemy.org/sqlalchemy/file/tip/lib/sqlalchemy/util/_collections.py>`_.
   It is not hashable and has an extra method: union(). immutabledict is used
   for the default value of parameter of some functions expecting a mapping.
   Example: mapper_args=immutabledict() in SqlSoup.map().
 * `Frozen dictionaries (Python recipe 414283) <http://code.activestate.com/recipes/414283/>`_
   by Oren Tirosh. It is hashable if keys and values are hashable. Included in
   the following projects:

   * lingospot: `frozendict/frozendict.py
     <http://code.google.com/p/lingospot/source/browse/trunk/frozendict/frozendict.py>`_
   * factor-graphics: frozendict type in `python/fglib/util_ext_frozendict.py
     <https://github.com/ih/factor-graphics/blob/41006fb71a09377445cc140489da5ce8eeb9c8b1/python/fglib/util_ext_frozendict.py>`_

 * The gsakkis-utils project written by George Sakkis includes a frozendict
   type: `datastructs.frozendict
   <http://code.google.com/p/gsakkis-utils/source/browse/trunk/datastructs/frozendict.py>`_
 * characters: `scripts/python/frozendict.py
   <https://github.com/JasonGross/characters/blob/15a2af5f7861cd33a0dbce70f1569cda74e9a1e3/scripts/python/frozendict.py#L1>`_.
   It is hashable. __init__() sets __init__ to None.
 * Old NLTK (1.x): `nltk.util.frozendict
   <http://nltk.googlecode.com/svn/trunk/nltk-old/src/nltk/util.py>`_. Keys and
   values must be hashable. __init__() can be called twice to modify the
   mapping. frozendict is used to "freeze" an object.
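
The weakness of the blacklist approach can be sketched as follows.
``BlacklistFrozenDict`` is a hypothetical class written for this illustration,
not code taken from any of the projects above::

    class BlacklistFrozenDict(dict):
        """Hypothetical dict subclass that blocks the usual write methods."""

        def _readonly(self, *args, **kw):
            raise TypeError("mapping is read-only")

        __setitem__ = __delitem__ = _readonly
        clear = pop = popitem = setdefault = update = _readonly

    d = BlacklistFrozenDict(a=1)

    # The overridden method refuses ordinary item assignment.
    try:
        d["b"] = 2
    except TypeError:
        blocked = True
    else:
        blocked = False

    # Calling the base class method directly bypasses the override.
    dict.__setitem__(d, "b", 2)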

Hashable dict: inherit from dict and just add an __hash__ method.

 * `pypy.rpython.lltypesystem.lltype.frozendict
   <https://bitbucket.org/pypy/pypy/src/1f49987cc2fe/pypy/rpython/lltypesystem/lltype.py#cl-86>`_.
   It is hashable but does not prevent modification of the mapping.
 * factor-graphics: hashabledict type in `python/fglib/util_ext_frozendict.py
   <https://github.com/ih/factor-graphics/blob/41006fb71a09377445cc140489da5ce8eeb9c8b1/python/fglib/util_ext_frozendict.py>`_
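
A minimal sketch of why a hashable but mutable dict is fragile as a key.
``HashableDict`` is a hypothetical class written for this illustration, not
code taken from the projects above::

    class HashableDict(dict):
        """Hypothetical dict subclass that only adds __hash__."""

        def __hash__(self):
            return hash(frozenset(self.items()))

    key = HashableDict(a=1)
    table = {key: "value"}

    hash_before = hash(key)
    key["b"] = 2              # mutation is still allowed
    hash_after = hash(key)    # recomputed from the new contents

    # The entry in ``table`` was stored under the old hash, so a lookup
    # with the mutated key is no longer reliable.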


Links
=====

 * `Issue #14162: PEP 416: Add a builtin frozendict type
   <http://bugs.python.org/issue14162>`_
 * PEP 412: Key-Sharing Dictionary
   (`issue #13903 <http://bugs.python.org/issue13903>`_)
 * PEP 351: The freeze protocol
 * `The case for immutable dictionaries; and the central misunderstanding of
   PEP 351 <http://www.cs.toronto.edu/~tijmen/programming/immutableDictionaries.html>`_
 * `make dictproxy object via ctypes.pythonapi and type() (Python recipe
   576540) <http://code.activestate.com/recipes/576540/>`_ by Ikkei Shimomura.
 * Python security modules implementing read-only object proxies using a C
   extension:

   * `pysandbox <https://github.com/haypo/pysandbox/>`_
   * `mxProxy <http://www.egenix.com/products/python/mxBase/mxProxy/>`_
   * `zope.proxy <http://pypi.python.org/pypi/zope.proxy>`_
   * `zope.security <http://pypi.python.org/pypi/zope.security>`_


Copyright
=========

This document has been placed in the public domain.