
:mod:`multiprocessing` --- Process-based "threading" interface


:mod:`multiprocessing` is a package that supports spawning processes using an API similar to the :mod:`threading` module. The :mod:`multiprocessing` package offers both local and remote concurrency, effectively side-stepping the :term:`Global Interpreter Lock` by using subprocesses instead of threads. Due to this, the :mod:`multiprocessing` module allows the programmer to fully leverage multiple processors on a given machine. It runs on both Unix and Windows.


Some of this package's functionality requires a functioning shared semaphore implementation on the host operating system. Without one, the :mod:`multiprocessing.synchronize` module will be disabled, and attempts to import it will result in an :exc:`ImportError`. See :issue:`3770` for additional information.


Functionality within this package requires that the __main__ module be importable by the children. This is covered in :ref:`multiprocessing-programming`, but it is worth pointing out here. It means that some examples, such as the :class:`multiprocessing.Pool` examples, will not work in the interactive interpreter. For example:

>>> from multiprocessing import Pool
>>> p = Pool(5)
>>> def f(x):
...     return x*x
>>> p.map(f, [1,2,3])
Process PoolWorker-1:
Process PoolWorker-2:
Process PoolWorker-3:
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
AttributeError: 'module' object has no attribute 'f'
AttributeError: 'module' object has no attribute 'f'
AttributeError: 'module' object has no attribute 'f'

(If you try this it will actually output three full tracebacks interleaved in a semi-random fashion, and then you may have to stop the master process somehow.)

The :class:`Process` class

In :mod:`multiprocessing`, processes are spawned by creating a :class:`Process` object and then calling its :meth:`~Process.start` method. :class:`Process` follows the API of :class:`threading.Thread`. A trivial example of a multiprocess program is

from multiprocessing import Process

def f(name):
    print 'hello', name

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

To show the individual process IDs involved, here is an expanded example:

from multiprocessing import Process
import os

def info(title):
    print title
    print 'module name:', __name__
    print 'parent process:', os.getppid()
    print 'process id:', os.getpid()

def f(name):
    info('function f')
    print 'hello', name

if __name__ == '__main__':
    info('main line')
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

For an explanation of why (on Windows) the if __name__ == '__main__' part is necessary, see :ref:`multiprocessing-programming`.

Exchanging objects between processes

:mod:`multiprocessing` supports two types of communication channel between processes: queues and pipes.


The :class:`Queue` class is a near clone of :class:`Queue.Queue`. For example:

from multiprocessing import Process, Queue

def f(q):
    q.put([42, None, 'hello'])

if __name__ == '__main__':
    q = Queue()
    p = Process(target=f, args=(q,))
    p.start()
    print q.get()    # prints "[42, None, 'hello']"
    p.join()

Queues are thread and process safe.


The :func:`Pipe` function returns a pair of connection objects connected by a pipe which by default is duplex (two-way). For example:

from multiprocessing import Process, Pipe

def f(conn):
    conn.send([42, None, 'hello'])
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    print parent_conn.recv()   # prints "[42, None, 'hello']"
    p.join()

The two connection objects returned by :func:`Pipe` represent the two ends of the pipe. Each connection object has :meth:`~Connection.send` and :meth:`~Connection.recv` methods (among others). Note that data in a pipe may become corrupted if two processes (or threads) try to read from or write to the same end of the pipe at the same time. Of course there is no risk of corruption from processes using different ends of the pipe at the same time.

Synchronization between processes

:mod:`multiprocessing` contains equivalents of all the synchronization primitives from :mod:`threading`. For instance one can use a lock to ensure that only one process prints to standard output at a time:

from multiprocessing import Process, Lock

def f(l, i):
    l.acquire()
    print 'hello world', i
    l.release()

if __name__ == '__main__':
    lock = Lock()

    for num in range(10):
        Process(target=f, args=(lock, num)).start()

Without the lock, output from the different processes is liable to get all mixed up.

Sharing state between processes

As mentioned above, when doing concurrent programming it is usually best to avoid using shared state as far as possible. This is particularly true when using multiple processes.

However, if you really do need to use some shared data then :mod:`multiprocessing` provides a couple of ways of doing so.

Shared memory

Data can be stored in a shared memory map using :class:`Value` or :class:`Array`. For example, the following code

from multiprocessing import Process, Value, Array

def f(n, a):
    n.value = 3.1415927
    for i in range(len(a)):
        a[i] = -a[i]

if __name__ == '__main__':
    num = Value('d', 0.0)
    arr = Array('i', range(10))

    p = Process(target=f, args=(num, arr))
    p.start()
    p.join()

    print num.value
    print arr[:]

will print

3.1415927
[0, -1, -2, -3, -4, -5, -6, -7, -8, -9]

The 'd' and 'i' arguments used when creating num and arr are typecodes of the kind used by the :mod:`array` module: 'd' indicates a double precision float and 'i' indicates a signed integer. These shared objects will be process and thread-safe.

For more flexibility in using shared memory one can use the :mod:`multiprocessing.sharedctypes` module which supports the creation of arbitrary ctypes objects allocated from shared memory.

Server process

A manager object returned by :func:`Manager` controls a server process which holds Python objects and allows other processes to manipulate them using proxies.

A manager returned by :func:`Manager` will support types :class:`list`, :class:`dict`, :class:`Namespace`, :class:`Lock`, :class:`RLock`, :class:`Semaphore`, :class:`BoundedSemaphore`, :class:`Condition`, :class:`Event`, :class:`Queue`, :class:`Value` and :class:`Array`. For example,

from multiprocessing import Process, Manager

def f(d, l):
    d[1] = '1'
    d['2'] = 2
    d[0.25] = None
    l.reverse()

if __name__ == '__main__':
    manager = Manager()

    d = manager.dict()
    l = manager.list(range(10))

    p = Process(target=f, args=(d, l))
    p.start()
    p.join()

    print d
    print l

will print

{0.25: None, 1: '1', '2': 2}
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

Server process managers are more flexible than using shared memory objects because they can be made to support arbitrary object types. Also, a single manager can be shared by processes on different computers over a network. They are, however, slower than using shared memory.

Using a pool of workers

The :class:`~multiprocessing.pool.Pool` class represents a pool of worker processes. It has methods which allow tasks to be offloaded to the worker processes in a few different ways.

For example:

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)              # start 4 worker processes
    result = pool.apply_async(f, [10])    # evaluate "f(10)" asynchronously
    print result.get(timeout=1)           # prints "100" unless your computer is *very* slow
    print pool.map(f, range(10))          # prints "[0, 1, 4,..., 81]"


The :mod:`multiprocessing` package mostly replicates the API of the :mod:`threading` module.

:class:`Process` and exceptions

Process objects represent activity that is run in a separate process. The :class:`Process` class has equivalents of all the methods of :class:`threading.Thread`.

The constructor should always be called with keyword arguments. group should always be None; it exists solely for compatibility with :class:`threading.Thread`. target is the callable object to be invoked by the :meth:`run()` method. It defaults to None, meaning nothing is called. name is the process name. By default, a unique name is constructed of the form 'Process-N1:N2:...:Nk' where N1,N2,...,Nk is a sequence of integers whose length is determined by the generation of the process. args is the argument tuple for the target invocation. kwargs is a dictionary of keyword arguments for the target invocation. By default, no arguments are passed to target.

If a subclass overrides the constructor, it must make sure it invokes the base class constructor (:meth:`Process.__init__`) before doing anything else to the process.

In addition to the :class:`threading.Thread` API, :class:`Process` objects also support the following attributes and methods:

Note that the :meth:`start`, :meth:`join` and :meth:`is_alive` methods and the :attr:`exitcode` attribute should only be used by the process that created the process object.

Example usage of some of the methods of :class:`Process`:
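A minimal sketch of the lifecycle methods, assuming :func:`time.sleep` as a stand-in target so that nothing beyond the standard library is needed. The helper name ``lifecycle`` and the process name ``'sleeper'`` are hypothetical, chosen only for illustration; the code is written to run under both Python 2.6+ and Python 3.

```python
import time
from multiprocessing import Process

def lifecycle():
    """Start a short-lived child and record its state at each stage."""
    p = Process(target=time.sleep, args=(0.5,), name='sleeper')
    before = (p.is_alive(), p.exitcode)   # not started yet
    p.start()
    running = p.is_alive()                # child should still be sleeping
    p.join()                              # wait for the child to exit
    after = (p.is_alive(), p.exitcode)    # exitcode 0 means a clean exit
    return p.name, before, running, after

if __name__ == '__main__':
    print(lifecycle())
```

Before :meth:`start` is called the process is not alive and its exit code is None; after :meth:`join` returns, the exit code of a child that finished normally is 0.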

Pipes and Queues

When using multiple processes, one generally uses message passing for communication between processes and avoids having to use any synchronization primitives like locks.

For passing messages one can use :func:`Pipe` (for a connection between two processes) or a queue (which allows multiple producers and consumers).

The :class:`Queue` and :class:`JoinableQueue` types are multi-producer, multi-consumer FIFO queues modelled on the :class:`Queue.Queue` class in the standard library. They differ in that :class:`Queue` lacks the :meth:`~Queue.Queue.task_done` and :meth:`~Queue.Queue.join` methods introduced into Python 2.5's :class:`Queue.Queue` class.

If you use :class:`JoinableQueue` then you must call :meth:`JoinableQueue.task_done` for each task removed from the queue or else the semaphore used to count the number of unfinished tasks may eventually overflow, raising an exception.
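The pairing of every ``get()`` with one ``task_done()`` can be sketched as follows. To keep the sketch self-contained the consumer is a thread rather than a child process; the same pattern applies when the consumer is a :class:`Process`. The names ``consume`` and ``run`` and the ``None`` sentinel are assumptions made for this illustration.

```python
import threading
from multiprocessing import JoinableQueue

def consume(q, results):
    """Take items until a None sentinel, acknowledging every item."""
    while True:
        item = q.get()
        try:
            if item is None:
                break
            results.append(item * item)
        finally:
            q.task_done()   # exactly one task_done() per get()

def run(items):
    q = JoinableQueue()
    results = []
    t = threading.Thread(target=consume, args=(q, results))
    t.start()
    for item in items:
        q.put(item)
    q.join()        # returns once every item put so far is acknowledged
    q.put(None)     # tell the consumer to stop
    q.join()
    t.join()
    return results

if __name__ == '__main__':
    print(run([1, 2, 3]))
```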

Note that one can also create a shared queue by using a manager object -- see :ref:`multiprocessing-managers`.


:mod:`multiprocessing` uses the usual :exc:`Queue.Empty` and :exc:`Queue.Full` exceptions to signal a timeout. They are not available in the :mod:`multiprocessing` namespace so you need to import them from :mod:`Queue`.
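A short sketch of importing and catching these exceptions. The try/except import is an assumption added so the snippet runs on both Python 2 (module :mod:`Queue`) and Python 3 (where the module is named ``queue``); ``demo`` is a hypothetical helper name.

```python
from multiprocessing import Queue
try:
    from Queue import Empty, Full    # Python 2 module name
except ImportError:
    from queue import Empty, Full    # Python 3 renamed the module

def demo():
    q = Queue(maxsize=1)
    q.put('first')
    try:
        q.put('second', block=False)     # queue already holds one item
        overflow = False
    except Full:
        overflow = True
    item = q.get(timeout=1)              # waits up to a second for 'first'
    try:
        q.get(timeout=0.1)               # nothing left, so this times out
        drained = False
    except Empty:
        drained = True
    return overflow, item, drained

if __name__ == '__main__':
    print(demo())
```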


If a process is killed using :meth:`Process.terminate` or :func:`os.kill` while it is trying to use a :class:`Queue`, then the data in the queue is likely to become corrupted. This may cause any other process to get an exception when it tries to use the queue later on.


As mentioned above, if a child process has put items on a queue (and it has not used :meth:`JoinableQueue.cancel_join_thread`), then that process will not terminate until all buffered items have been flushed to the pipe.

This means that if you try joining that process you may get a deadlock unless you are sure that all items which have been put on the queue have been consumed. Similarly, if the child process is non-daemonic then the parent process may hang on exit when it tries to join all its non-daemonic children.

Note that a queue created using a manager does not have this issue. See :ref:`multiprocessing-programming`.

For an example of the usage of queues for interprocess communication see :ref:`multiprocessing-examples`.

Returns a process shared queue implemented using a pipe and a few locks/semaphores. When a process first puts an item on the queue a feeder thread is started which transfers objects from a buffer into the pipe.

The usual :exc:`Queue.Empty` and :exc:`Queue.Full` exceptions from the standard library's :mod:`Queue` module are raised to signal timeouts.

:class:`Queue` implements all the methods of :class:`Queue.Queue` except for :meth:`~Queue.Queue.task_done` and :meth:`~Queue.Queue.join`.

:class:`multiprocessing.Queue` has a few additional methods not found in :class:`Queue.Queue`. These methods are usually unnecessary for most code:

:class:`JoinableQueue`, a :class:`Queue` subclass, is a queue which additionally has :meth:`task_done` and :meth:`join` methods.

Connection Objects

Connection objects allow the sending and receiving of picklable objects or strings. They can be thought of as message oriented connected sockets.

Connection objects are usually created using :func:`Pipe` -- see also :ref:`multiprocessing-listeners-clients`.

For example:
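The following single-process sketch exercises both ends of a pipe. It sends a picklable object, then raw bytes, then the contents of an array; the variable names are illustrative only.

```python
from array import array
from multiprocessing import Pipe

a, b = Pipe()                       # duplex by default

a.send([1, 'hello', None])          # any picklable object
received = b.recv()

b.send_bytes(b'thank you')          # raw bytes, no pickling
reply = a.recv_bytes()

src = array('i', range(5))
dst = array('i', [0] * 10)
a.send_bytes(src)                   # an object exposing the buffer interface
count = b.recv_bytes_into(dst)      # number of bytes written into dst

a.close()
b.close()
```

Here ``count`` equals ``len(src) * src.itemsize``, and only the first five slots of ``dst`` are overwritten.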


The :meth:`Connection.recv` method automatically unpickles the data it receives, which can be a security risk unless you can trust the process which sent the message.

Therefore, unless the connection object was produced using :func:`Pipe` you should only use the :meth:`~Connection.recv` and :meth:`~Connection.send` methods after performing some sort of authentication. See :ref:`multiprocessing-auth-keys`.


If a process is killed while it is trying to read or write to a pipe then the data in the pipe is likely to become corrupted, because it may become impossible to be sure where the message boundaries lie.

Synchronization primitives

Generally synchronization primitives are not as necessary in a multiprocess program as they are in a multithreaded program. See the documentation for the :mod:`threading` module.

Note that one can also create synchronization primitives by using a manager object -- see :ref:`multiprocessing-managers`.

A bounded semaphore object: a clone of :class:`threading.BoundedSemaphore`.

(On Mac OS X, this is indistinguishable from :class:`Semaphore` because sem_getvalue() is not implemented on that platform).

A condition variable: a clone of :class:`threading.Condition`.

If lock is specified then it should be a :class:`Lock` or :class:`RLock` object from :mod:`multiprocessing`.

A clone of :class:`threading.Event`. Its :meth:`wait` method returns the state of the internal semaphore on exit, so it will always return True except if a timeout is given and the operation times out.

A non-recursive lock object: a clone of :class:`threading.Lock`.

A recursive lock object: a clone of :class:`threading.RLock`.

A semaphore object: a clone of :class:`threading.Semaphore`.


The :meth:`acquire` method of :class:`BoundedSemaphore`, :class:`Lock`, :class:`RLock` and :class:`Semaphore` has a timeout parameter not supported by the equivalents in :mod:`threading`. The signature is acquire(block=True, timeout=None) with keyword parameters being acceptable. If block is True and timeout is not None then it specifies a timeout in seconds. If block is False then timeout is ignored.

On Mac OS X, sem_timedwait is unsupported, so calling acquire() with a timeout will emulate that function's behavior using a sleeping loop.
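A minimal sketch of the timeout behaviour, using a single process that tries to take a non-recursive lock it already holds. The arguments are passed positionally in the order given above (block, then timeout); the 0.2 second timeout is an arbitrary choice.

```python
import time
from multiprocessing import Lock

lock = Lock()
held = lock.acquire()                     # uncontended: returns True at once

start = time.time()
timed_out = not lock.acquire(True, 0.2)   # block=True, timeout=0.2 seconds
waited = time.time() - start

non_blocking = lock.acquire(False)        # block=False: returns immediately
lock.release()
```

The second call gives up after roughly the requested timeout and returns False, and the non-blocking call returns False without waiting.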


If the SIGINT signal generated by Ctrl-C arrives while the main thread is blocked by a call to :meth:`BoundedSemaphore.acquire`, :meth:`Lock.acquire`, :meth:`RLock.acquire`, :meth:`Semaphore.acquire`, :meth:`Condition.acquire` or :meth:`Condition.wait` then the call will be immediately interrupted and :exc:`KeyboardInterrupt` will be raised.

This differs from the behaviour of :mod:`threading` where SIGINT will be ignored while the equivalent blocking calls are in progress.

Shared :mod:`ctypes` Objects

It is possible to create shared objects using shared memory which can be inherited by child processes.

The :mod:`multiprocessing.sharedctypes` module

The :mod:`multiprocessing.sharedctypes` module provides functions for allocating :mod:`ctypes` objects from shared memory which can be inherited by child processes.


Although it is possible to store a pointer in shared memory, remember that it will refer to a location in the address space of a specific process. The pointer is quite likely to be invalid in the context of a second process, and trying to dereference it from the second process may cause a crash.

The table below compares the syntax for creating shared ctypes objects from shared memory with the normal ctypes syntax. (In the table MyStruct is some subclass of :class:`ctypes.Structure`.)

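==================== ========================== ===========================
ctypes               sharedctypes using type    sharedctypes using typecode
==================== ========================== ===========================
c_double(2.4)        RawValue(c_double, 2.4)    RawValue('d', 2.4)
MyStruct(4, 6)       RawValue(MyStruct, 4, 6)
(c_short * 7)()      RawArray(c_short, 7)       RawArray('h', 7)
(c_int * 3)(9, 2, 8) RawArray(c_int, (9, 2, 8)) RawArray('i', (9, 2, 8))
==================== ========================== ===========================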
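The rows of the table can be tried out directly. In this sketch ``MyStruct`` is a hypothetical two-field structure standing in for the one mentioned above; its field names are assumptions.

```python
from ctypes import Structure, c_double, c_int, c_short
from multiprocessing.sharedctypes import RawArray, RawValue

class MyStruct(Structure):
    # hypothetical structure; the field names are illustrative only
    _fields_ = [('a', c_int), ('b', c_int)]

by_type = RawValue(c_double, 2.4)       # type given as a ctypes type
by_code = RawValue('d', 2.4)            # same type given as a typecode
struct = RawValue(MyStruct, 4, 6)       # extra arguments go to the constructor
zeros = RawArray(c_short, 7)            # an integer gives a zeroed array
filled = RawArray('i', (9, 2, 8))       # a sequence initialises the array
```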

Below is an example where a number of ctypes objects are modified by a child process:

from multiprocessing import Process, Lock
from multiprocessing.sharedctypes import Value, Array
from ctypes import Structure, c_double

class Point(Structure):
    _fields_ = [('x', c_double), ('y', c_double)]

def modify(n, x, s, A):
    n.value **= 2
    x.value **= 2
    s.value = s.value.upper()
    for a in A:
        a.x **= 2
        a.y **= 2

if __name__ == '__main__':
    lock = Lock()

    n = Value('i', 7)
    x = Value(c_double, 1.0/3.0, lock=False)
    s = Array('c', 'hello world', lock=lock)
    A = Array(Point, [(1.875,-6.25), (-5.75,2.0), (2.375,9.5)], lock=lock)

    p = Process(target=modify, args=(n, x, s, A))
    p.start()
    p.join()

    print n.value
    print x.value
    print s.value
    print [(a.x, a.y) for a in A]

The results printed are

49
0.1111111111111111
HELLO WORLD
[(3.515625, 39.0625), (33.0625, 4.0), (5.640625, 90.25)]


Managers provide a way to create data which can be shared between different processes. A manager object controls a server process which manages shared objects. Other processes can access the shared objects by using proxies.

Manager processes will be shut down as soon as they are garbage collected or their parent process exits. The manager classes are defined in the :mod:`multiprocessing.managers` module:

Create a BaseManager object.

Once created one should call :meth:`start` or get_server().serve_forever() to ensure that the manager object refers to a started manager process.

address is the address on which the manager process listens for new connections. If address is None then an arbitrary one is chosen.

authkey is the authentication key which will be used to check the validity of incoming connections to the server process. If authkey is None then current_process().authkey is used. Otherwise authkey is used and it must be a string.

:class:`BaseManager` instances also have one read-only property:

A subclass of :class:`BaseManager` which can be used for the synchronization of processes. Objects of this type are returned by :func:`multiprocessing.Manager`.

It also supports creation of shared lists and dictionaries.


Modifications to mutable values or items in dict and list proxies will not be propagated through the manager, because the proxy has no way of knowing when its values or items are modified. To modify such an item, you can re-assign the modified object to the container proxy:

# create a list proxy and append a mutable object (a dictionary)
lproxy = manager.list()
lproxy.append({})
# now mutate the dictionary
d = lproxy[0]
d['a'] = 1
d['b'] = 2
# at this point, the changes to d are not yet synced, but by
# reassigning the dictionary, the proxy is notified of the change
lproxy[0] = d

Namespace objects

A namespace object has no public methods, but does have writable attributes. Its representation shows the values of its attributes.

However, when using a proxy for a namespace object, an attribute beginning with '_' will be an attribute of the proxy and not an attribute of the referent:
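A small sketch of this behaviour, assuming a manager started with :func:`Manager`. The helper name ``namespace_demo`` and the attribute names are hypothetical.

```python
from multiprocessing import Manager

def namespace_demo():
    manager = Manager()
    try:
        ns = manager.Namespace()
        ns.x = 10
        ns.y = 'hello'
        ns._z = 12.3          # stored on the proxy, not on the referent
        return str(ns), ns.x, ns._z
    finally:
        manager.shutdown()    # stop the server process explicitly

if __name__ == '__main__':
    print(namespace_demo())
```

The string form of the namespace lists ``x`` and ``y`` but not ``_z``, because ``_z`` never reached the object held by the manager.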

Customized managers

To create one's own manager, one creates a subclass of :class:`BaseManager` and uses the :meth:`~BaseManager.register` classmethod to register new types or callables with the manager class. For example:

from multiprocessing.managers import BaseManager

class MathsClass(object):
    def add(self, x, y):
        return x + y
    def mul(self, x, y):
        return x * y

class MyManager(BaseManager):
    pass

MyManager.register('Maths', MathsClass)

if __name__ == '__main__':
    manager = MyManager()
    manager.start()
    maths = manager.Maths()
    print maths.add(4, 3)         # prints 7
    print maths.mul(7, 8)         # prints 56

Using a remote manager

It is possible to run a manager server on one machine and have clients use it from other machines (assuming that the firewalls involved allow it).

Running the following commands creates a server for a single shared queue which remote clients can access:

>>> from multiprocessing.managers import BaseManager
>>> import Queue
>>> queue = Queue.Queue()
>>> class QueueManager(BaseManager): pass
>>> QueueManager.register('get_queue', callable=lambda:queue)
>>> m = QueueManager(address=('', 50000), authkey='abracadabra')
>>> s = m.get_server()
>>> s.serve_forever()

One client can access the server as follows:

>>> from multiprocessing.managers import BaseManager
>>> class QueueManager(BaseManager): pass
>>> QueueManager.register('get_queue')
>>> m = QueueManager(address=('', 50000), authkey='abracadabra')
>>> m.connect()
>>> queue = m.get_queue()
>>> queue.put('hello')

Another client can also use it:

>>> from multiprocessing.managers import BaseManager
>>> class QueueManager(BaseManager): pass
>>> QueueManager.register('get_queue')
>>> m = QueueManager(address=('', 50000), authkey='abracadabra')
>>> m.connect()
>>> queue = m.get_queue()
>>> queue.get()

Local processes can also access that queue, using the code from above on the client to access it remotely:

>>> from multiprocessing import Process, Queue
>>> from multiprocessing.managers import BaseManager
>>> class Worker(Process):
...     def __init__(self, q):
...         self.q = q
...         super(Worker, self).__init__()
...     def run(self):
...         self.q.put('local hello')
>>> queue = Queue()
>>> w = Worker(queue)
>>> w.start()
>>> class QueueManager(BaseManager): pass
>>> QueueManager.register('get_queue', callable=lambda: queue)
>>> m = QueueManager(address=('', 50000), authkey='abracadabra')
>>> s = m.get_server()
>>> s.serve_forever()

Proxy Objects

A proxy is an object which refers to a shared object which lives (presumably) in a different process. The shared object is said to be the referent of the proxy. Multiple proxy objects may have the same referent.

A proxy object has methods which invoke corresponding methods of its referent (although not every method of the referent will necessarily be available through the proxy). A proxy can usually be used in most of the same ways that its referent can:

Notice that applying :func:`str` to a proxy will return the representation of the referent, whereas applying :func:`repr` will return the representation of the proxy.

An important feature of proxy objects is that they are picklable so they can be passed between processes. Note, however, that if a proxy is sent to the corresponding manager's process then unpickling it will produce the referent itself. This means, for example, that one shared object can contain a second:


The proxy types in :mod:`multiprocessing` do nothing to support comparisons by value. So, for instance, we have:

One should just use a copy of the referent instead when making comparisons.
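The proxy behaviour described in this section (method forwarding, :func:`str` versus :func:`repr`, and comparison) can be sketched with a list proxy. The helper name ``proxy_demo`` and the dictionary keys are assumptions made for illustration.

```python
from multiprocessing import Manager

def proxy_demo():
    manager = Manager()
    try:
        shared = manager.list([1, 2, 3])
        shared.append(4)                       # forwarded to the referent
        return {
            'str': str(shared),                # representation of the referent
            'repr_is_proxy': 'ListProxy' in repr(shared),
            'equal_by_value': shared == [1, 2, 3, 4],
            'copy_equal': shared[:] == [1, 2, 3, 4],   # compare a copy instead
        }
    finally:
        manager.shutdown()

if __name__ == '__main__':
    print(proxy_demo())
```

Comparing the proxy itself with a list yields False, while comparing a copy obtained by slicing yields True.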

Proxy objects are instances of subclasses of :class:`BaseProxy`.


A proxy object uses a weakref callback so that when it gets garbage collected it deregisters itself from the manager which owns its referent.

A shared object gets deleted from the manager process when there are no longer any proxies referring to it.

Process Pools

One can create a pool of processes which will carry out tasks submitted to it with the :class:`Pool` class.

A process pool object which controls a pool of worker processes to which jobs can be submitted. It supports asynchronous results with timeouts and callbacks and has a parallel map implementation.

processes is the number of worker processes to use. If processes is None then the number returned by :func:`cpu_count` is used. If initializer is not None then each worker process will call initializer(*initargs) when it starts.


Worker processes within a :class:`Pool` typically live for the complete duration of the Pool's work queue. A frequent pattern found in other systems (such as Apache, mod_wsgi, etc.) to free resources held by workers is to allow a worker within a pool to complete only a set amount of work before exiting, being cleaned up, and being replaced by a newly spawned process. The maxtasksperchild argument to :class:`Pool` exposes this ability to the end user.

The class of the result returned by :meth:`Pool.apply_async` and :meth:`Pool.map_async`.

The following example demonstrates the use of a pool:

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)              # start 4 worker processes

    result = pool.apply_async(f, (10,))    # evaluate "f(10)" asynchronously
    print result.get(timeout=1)           # prints "100" unless your computer is *very* slow

    print pool.map(f, range(10))          # prints "[0, 1, 4,..., 81]"

    it = pool.imap(f, range(10))
    print it.next()                       # prints "0"
    print it.next()                       # prints "1"
    print it.next(timeout=1)              # prints "4" unless your computer is *very* slow

    import time
    result = pool.apply_async(time.sleep, (10,))
    print result.get(timeout=1)           # raises TimeoutError

Listeners and Clients

Usually message passing between processes is done using queues or by using :class:`Connection` objects returned by :func:`Pipe`.

However, the :mod:`multiprocessing.connection` module allows some extra flexibility. It basically gives a high level message oriented API for dealing with sockets or Windows named pipes, and also has support for digest authentication using the :mod:`hmac` module.

A wrapper for a bound socket or Windows named pipe which is 'listening' for connections.

address is the address to be used by the bound socket or named pipe of the listener object.


If an address of '0.0.0.0' is used, the address will not be a connectable end point on Windows. If you require a connectable end-point, you should use '127.0.0.1'.

family is the type of socket (or named pipe) to use. This can be one of the strings 'AF_INET' (for a TCP socket), 'AF_UNIX' (for a Unix domain socket) or 'AF_PIPE' (for a Windows named pipe). Of these only the first is guaranteed to be available. If family is None then the family is inferred from the format of address. If address is also None then a default is chosen. This default is the family which is assumed to be the fastest available. See :ref:`multiprocessing-address-formats`. Note that if family is 'AF_UNIX' and address is None then the socket will be created in a private temporary directory created using :func:`tempfile.mkstemp`.

If the listener object uses a socket then backlog (1 by default) is passed to the :meth:`listen` method of the socket once it has been bound.

If authenticate is True (False by default) or authkey is not None then digest authentication is used.

If authkey is a string then it will be used as the authentication key; otherwise it must be None.

If authkey is None and authenticate is True then current_process().authkey is used as the authentication key. If authkey is None and authenticate is False then no authentication is done. If authentication fails then :exc:`AuthenticationError` is raised. See :ref:`multiprocessing-auth-keys`.

Listener objects have the following read-only properties:

The module defines two exceptions:


The following server code creates a listener which uses 'secret password' as an authentication key. It then waits for a connection and sends some data to the client:

from multiprocessing.connection import Listener
from array import array

address = ('localhost', 6000)     # family is deduced to be 'AF_INET'
listener = Listener(address, authkey='secret password')

conn = listener.accept()
print 'connection accepted from', listener.last_accepted

conn.send([2.25, None, 'junk', float])

conn.send_bytes('hello')

conn.send_bytes(array('i', [42, 1729]))

conn.close()
listener.close()


The following code connects to the server and receives some data from the server:

from multiprocessing.connection import Client
from array import array

address = ('localhost', 6000)
conn = Client(address, authkey='secret password')

print conn.recv()                 # => [2.25, None, 'junk', float]

print conn.recv_bytes()            # => 'hello'

arr = array('i', [0, 0, 0, 0, 0])
print conn.recv_bytes_into(arr)     # => 8
print arr                         # => array('i', [42, 1729, 0, 0, 0])

conn.close()


Address Formats

Note that any string beginning with two backslashes is assumed by default to be an 'AF_PIPE' address rather than an 'AF_UNIX' address.
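The inference rule can be sketched as a small function. ``infer_family`` is a hypothetical helper written for this illustration, not part of :mod:`multiprocessing.connection`, and it assumes that tuples denote 'AF_INET' addresses and other strings denote 'AF_UNIX' addresses.

```python
def infer_family(address):
    """Guess the family from the shape of *address* (illustrative only)."""
    if isinstance(address, tuple):
        return 'AF_INET'                 # (hostname, port)
    if isinstance(address, str) and address.startswith('\\\\'):
        return 'AF_PIPE'                 # two leading backslashes
    if isinstance(address, str):
        return 'AF_UNIX'                 # any other string
    raise ValueError('unrecognised address: %r' % (address,))

if __name__ == '__main__':
    print(infer_family(('localhost', 6000)))
```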

Authentication keys

When one uses :meth:`Connection.recv`, the data received is automatically unpickled. Unfortunately unpickling data from an untrusted source is a security risk. Therefore :class:`Listener` and :func:`Client` use the :mod:`hmac` module to provide digest authentication.

An authentication key is a string which can be thought of as a password: once a connection is established both ends will demand proof that the other knows the authentication key. (Demonstrating that both ends are using the same key does not involve sending the key over the connection.)

If authentication is requested but no authentication key is specified then the return value of current_process().authkey is used (see :class:`~multiprocessing.Process`). This value will be automatically inherited by any :class:`~multiprocessing.Process` object that the current process creates. This means that (by default) all processes of a multi-process program will share a single authentication key which can be used when setting up connections between themselves.

Suitable authentication keys can also be generated by using :func:`os.urandom`.
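The sketch below generates a key with :func:`os.urandom` and illustrates the general idea of proving knowledge of a key without sending it. It is not the message format used by :class:`Listener` and :func:`Client`; the helper names, the 32-byte key length and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import hmac
import os

def make_authkey(nbytes=32):
    """Return a random key; 32 bytes is an arbitrary choice."""
    return os.urandom(nbytes)

def answer(key, challenge):
    """Digest that only a holder of *key* can compute for *challenge*."""
    return hmac.new(key, challenge, hashlib.sha256).digest()

def verify(key, challenge, response):
    """Check a response without the key ever crossing the connection."""
    return hmac.compare_digest(answer(key, challenge), response)

if __name__ == '__main__':
    key = make_authkey()
    challenge = os.urandom(20)            # sent in the clear by one end
    response = answer(key, challenge)     # computed by the other end
    print(verify(key, challenge, response))
```

Only the random challenge and the digest travel between the two ends, so an eavesdropper does not learn the key itself.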


Some support for logging is available. Note, however, that the :mod:`logging` package does not use process shared locks so it is possible (depending on the handler type) for messages from different processes to get mixed up.

Below is an example session with logging turned on:

>>> import multiprocessing, logging
>>> logger = multiprocessing.log_to_stderr()
>>> logger.setLevel(logging.INFO)
>>> logger.warning('doomed')
[WARNING/MainProcess] doomed
>>> m = multiprocessing.Manager()
[INFO/SyncManager-...] child process calling
[INFO/SyncManager-...] created temp directory /.../pymp-...
[INFO/SyncManager-...] manager serving at '/.../listener-...'
>>> del m
[INFO/MainProcess] sending shutdown message to manager
[INFO/SyncManager-...] manager exiting with exitcode 0

In addition to having these two logging functions, the multiprocessing module also exposes two additional logging level attributes. These are :const:`SUBWARNING` and :const:`SUBDEBUG`. The table below illustrates where these fit in the normal level hierarchy.

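============== =============
Level          Numeric value
============== =============
SUBWARNING     25
SUBDEBUG       5
============== =============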

For a full table of logging levels, see the :mod:`logging` module.
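As a quick check of where the extra levels sit, the following sketch sorts them alongside three standard levels from :mod:`logging`. The variable names are illustrative only.

```python
import logging
import multiprocessing

levels = sorted([
    (logging.DEBUG, 'DEBUG'),
    (logging.INFO, 'INFO'),
    (logging.WARNING, 'WARNING'),
    (multiprocessing.SUBDEBUG, 'SUBDEBUG'),
    (multiprocessing.SUBWARNING, 'SUBWARNING'),
])
names = [name for _, name in levels]   # lowest level first

logger = multiprocessing.get_logger()
logger.setLevel(multiprocessing.SUBDEBUG)
enabled = logger.isEnabledFor(multiprocessing.SUBDEBUG)
```

:const:`SUBDEBUG` sorts below ``DEBUG``, and :const:`SUBWARNING` falls between ``INFO`` and ``WARNING``.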

These additional logging levels are used primarily for certain debug messages within the multiprocessing module. Below is the same example as above, except with :const:`SUBDEBUG` enabled:

>>> import multiprocessing, logging
>>> logger = multiprocessing.log_to_stderr()
>>> logger.setLevel(multiprocessing.SUBDEBUG)
>>> logger.warning('doomed')
[WARNING/MainProcess] doomed
>>> m = multiprocessing.Manager()
[INFO/SyncManager-...] child process calling
[INFO/SyncManager-...] created temp directory /.../pymp-...
[INFO/SyncManager-...] manager serving at '/.../pymp-djGBXN/listener-...'
>>> del m
[SUBDEBUG/MainProcess] finalizer calling ...
[INFO/MainProcess] sending shutdown message to manager
[DEBUG/SyncManager-...] manager received shutdown message
[SUBDEBUG/SyncManager-...] calling <Finalize object, callback=unlink, ...
[SUBDEBUG/SyncManager-...] finalizer calling <built-in function unlink> ...
[SUBDEBUG/SyncManager-...] calling <Finalize object, dead>
[SUBDEBUG/SyncManager-...] finalizer calling <function rmtree at 0x5aa730> ...
[INFO/SyncManager-...] manager exiting with exitcode 0

The :mod:`multiprocessing.dummy` module

:mod:`multiprocessing.dummy` replicates the API of :mod:`multiprocessing` but is no more than a wrapper around the :mod:`threading` module.
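A minimal sketch of the thread-backed pool, assuming the same :class:`Pool` interface as the process-based one. The helper names ``square`` and ``squares`` and the default of four workers are arbitrary choices for illustration.

```python
from multiprocessing.dummy import Pool   # same interface, backed by threads

def square(x):
    return x * x

def squares(values, workers=4):
    pool = Pool(workers)
    try:
        return pool.map(square, values)   # results keep the input order
    finally:
        pool.close()
        pool.join()

if __name__ == '__main__':
    print(squares(range(10)))
```

Because the workers are threads in the same process, this variant can be handy for I/O-bound work or for quickly testing code written against the pool API.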

Programming guidelines

There are certain guidelines and idioms which should be adhered to when using :mod:`multiprocessing`.

All platforms

Avoid shared state

As far as possible one should try to avoid shifting large amounts of data between processes.

It is probably best to stick to using queues or pipes for communication between processes rather than using the lower level synchronization primitives from the :mod:`threading` module.
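As an illustration of message passing rather than shared state, here is a small sketch using a pipe. The ``worker`` function and the data sent are made up for the example.

```python
from multiprocessing import Process, Pipe

def worker(conn):
    # Receive one task, send back the result, then close this end.
    numbers = conn.recv()
    conn.send(sum(numbers))
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=worker, args=(child_conn,))
    p.start()
    parent_conn.send([1, 2, 3])   # the task travels through the pipe
    print(parent_conn.recv())     # so does the result
    p.join()
```

Only a small list and a single number cross the process boundary; neither process touches the other's state.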


Picklability

Ensure that the arguments to the methods of proxies are picklable.

Thread safety of proxies

Do not use a proxy object from more than one thread unless you protect it with a lock.

(There is never a problem with different processes using the same proxy.)

Joining zombie processes

On Unix, when a process finishes but has not been joined, it becomes a zombie. There should never be very many, because each time a new process starts (or :func:`active_children` is called) all completed processes which have not yet been joined will be joined. Calling a finished process's :meth:`Process.is_alive` will also join the process. Even so, it is probably good practice to explicitly join all the processes that you start.
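One way to follow this advice is to keep a list of every process started and join each in turn. The ``work`` function here is a placeholder.

```python
from multiprocessing import Process

def work(n):
    # Placeholder task; a real worker would do something useful here.
    return sum(range(n))

if __name__ == '__main__':
    procs = [Process(target=work, args=(1000,)) for _ in range(4)]
    for p in procs:
        p.start()
    # Join every process that was started so none is left as a zombie.
    for p in procs:
        p.join()
    print([p.exitcode for p in procs])
```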

Better to inherit than pickle/unpickle

On Windows many types from :mod:`multiprocessing` need to be picklable so that child processes can use them. However, one should generally avoid sending shared objects to other processes using pipes or queues. Instead you should arrange the program so that a process which needs access to a shared resource created elsewhere can inherit it from an ancestor process.

Avoid terminating processes

Using the :meth:`Process.terminate` method to stop a process is liable to cause any shared resources (such as locks, semaphores, pipes and queues) currently being used by the process to become broken or unavailable to other processes.

Therefore it is probably best to only consider using :meth:`Process.terminate` on processes which never use any shared resources.

Joining processes that use queues

Bear in mind that a process that has put items in a queue will wait before terminating until all the buffered items are fed by the "feeder" thread to the underlying pipe. (The child process can call the :meth:`Queue.cancel_join_thread` method of the queue to avoid this behaviour.)

This means that whenever you use a queue you need to make sure that all items which have been put on the queue will eventually be removed before the process is joined. Otherwise you cannot be sure that processes which have put items on the queue will terminate. Remember also that non-daemonic processes will automatically be joined.

An example which will deadlock is the following:

from multiprocessing import Process, Queue

def f(q):
    q.put('X' * 1000000)

if __name__ == '__main__':
    queue = Queue()
    p = Process(target=f, args=(queue,))
    p.start()
    p.join()                    # this deadlocks
    obj = queue.get()

A fix here would be to swap the last two lines round (or simply remove the p.join() line).
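Spelled out, one corrected version might read as follows. The item is taken off the queue before the child is joined, so the child's feeder thread can finish writing to the pipe.

```python
from multiprocessing import Process, Queue

def f(q):
    q.put('X' * 1000000)

if __name__ == '__main__':
    queue = Queue()
    p = Process(target=f, args=(queue,))
    p.start()
    obj = queue.get()   # drain the queue first ...
    p.join()            # ... then the child can exit and be joined
    print(len(obj))
```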

Explicitly pass resources to child processes

On Unix a child process can make use of a shared resource created in a parent process using a global resource. However, it is better to pass the object as an argument to the constructor for the child process.

Apart from making the code (potentially) compatible with Windows this also ensures that as long as the child process is still alive the object will not be garbage collected in the parent process. This might be important if some resource is freed when the object is garbage collected in the parent process.

So for instance

from multiprocessing import Process, Lock

def f():
    # ... do something using the global "lock" ...
    pass

if __name__ == '__main__':
    lock = Lock()
    for i in range(10):
        Process(target=f).start()

should be rewritten as

from multiprocessing import Process, Lock

def f(l):
    # ... do something using "l" ...
    pass

if __name__ == '__main__':
    lock = Lock()
    for i in range(10):
        Process(target=f, args=(lock,)).start()

Beware of replacing :data:`sys.stdin` with a "file-like object"

:mod:`multiprocessing` originally unconditionally called:

os.close(sys.stdin.fileno())

in the :meth:`multiprocessing.Process._bootstrap` method, which resulted in issues with processes-in-processes. This has been changed to:

sys.stdin.close()
sys.stdin = open(os.devnull)

This solves the fundamental issue of processes colliding with each other, which resulted in a bad file descriptor error, but it introduces a potential danger to applications which replace :data:`sys.stdin` with a "file-like object" with output buffering. The danger is that if multiple processes call :meth:`close` on this file-like object, the same data could be flushed to the object multiple times, resulting in corruption.

If you write a file-like object and implement your own caching, you can make it fork-safe by storing the pid whenever you append to the cache, and discarding the cache when the pid changes. For example:

def cache(self):
    pid = os.getpid()
    if pid != self._pid:
        self._pid = pid
        self._cache = []
    return self._cache
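The fragment above shows only the accessor. A fuller sketch, assuming a hypothetical ``ForkSafeBuffer`` class (the class name and its ``write`` and ``pending`` methods are illustrative and not part of :mod:`multiprocessing`), might look like this:

```python
import os

class ForkSafeBuffer(object):
    """Hypothetical file-like object whose cache is discarded after a fork."""

    def __init__(self):
        self._pid = os.getpid()
        self._cache = []

    @property
    def cache(self):
        # A changed pid means this is a forked child: drop inherited data.
        pid = os.getpid()
        if pid != self._pid:
            self._pid = pid
            self._cache = []
        return self._cache

    def write(self, data):
        self.cache.append(data)

    def pending(self):
        # Data buffered by the current process only.
        return ''.join(self.cache)
```

Every access goes through the ``cache`` property, so a child never sees, and never flushes, data that was buffered by its parent.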

For more information, see :issue:`5155`, :issue:`5313` and :issue:`5331`.


Windows

Since Windows lacks :func:`os.fork` it has a few extra restrictions:

More picklability

Ensure that all arguments to :meth:`Process.__init__` are picklable. This means, in particular, that bound or unbound methods cannot be used directly as the target argument on Windows --- just define a function and use that instead.

Also, if you subclass :class:`Process` then make sure that instances will be picklable when the :meth:`Process.start` method is called.
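Following the advice to "define a function and use that instead", a sketch might wrap the method call in a module-level function. The ``Greeter`` class and ``run_greeter`` function are invented for the example.

```python
from multiprocessing import Process

class Greeter(object):
    def __init__(self, name):
        self.name = name

    def greet(self):
        return 'hello ' + self.name

def run_greeter(name):
    # Module-level function used as the target; the object is built in
    # the child, so only the plain string argument has to be pickled.
    print(Greeter(name).greet())

if __name__ == '__main__':
    p = Process(target=run_greeter, args=('world',))
    p.start()
    p.join()
```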

Global variables

Bear in mind that if code run in a child process tries to access a global variable, then the value it sees (if any) may not be the same as the value in the parent process at the time that :meth:`Process.start` was called.

However, global variables which are just module level constants cause no problems.
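A hedged illustration of the safer pattern: a value computed at run time is handed to the child as an argument instead of being read from a global. The names ``THRESHOLD``, ``check`` and ``limit`` are made up for the example.

```python
from multiprocessing import Process, Queue

THRESHOLD = 10   # module-level constant: safe to read in any process

def check(value, limit, out):
    # "limit" arrives as an argument, so the child does not depend on
    # whatever a mutable global happened to hold in the parent.
    out.put(value > limit)

if __name__ == '__main__':
    limit = THRESHOLD * 2          # computed at run time in the parent
    out = Queue()
    p = Process(target=check, args=(25, limit, out))
    p.start()
    print(out.get())
    p.join()
```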

Safe importing of main module

Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such as starting a new process).

For example, under Windows running the following module would fail with a :exc:`RuntimeError`:

from multiprocessing import Process

def foo():
    print('hello')

p = Process(target=foo)
p.start()

Instead one should protect the "entry point" of the program by using ``if __name__ == '__main__':`` as follows:

from multiprocessing import Process, freeze_support

def foo():
    print('hello')

if __name__ == '__main__':
    freeze_support()
    p = Process(target=foo)
    p.start()

(The freeze_support() line can be omitted if the program will be run normally instead of frozen.)

This allows the newly spawned Python interpreter to safely import the module and then run the module's foo() function.

Similar restrictions apply if a pool or manager is created in the main module.
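For a pool, the same protection could be sketched like this. The ``square`` function is a placeholder.

```python
from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == '__main__':
    # Creating the pool inside the guard keeps it from being created
    # again when a child process imports this module.
    pool = Pool(2)
    try:
        print(pool.map(square, [1, 2, 3]))
    finally:
        pool.close()
        pool.join()
```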


Examples

Demonstration of how to create and use customized managers and proxies:

Using :class:`Pool`:

Synchronization types like locks, conditions and queues:

An example showing how to use queues to feed tasks to a collection of worker processes and collect the results:

An example of how a pool of worker processes can each run a :class:`BaseHTTPServer.HTTPServer` instance while sharing a single listening socket:

Some simple benchmarks comparing :mod:`multiprocessing` with :mod:`threading`: