This is the root directory for the gthread repository.

The gthread directory contains the gthread package. It can be used as-is.

What is gthread?
----------------

gthread is a package containing several "greened" standard library module
implementations, as well as some other auxiliary concurrency primitives.

Threads created using gthread modules are "green" threads that exist entirely in userspace. Each green thread belongs to a single native thread. The first green thread in a native thread is created implicitly and is the main green thread (gthread.threading._MainThread).

Each native thread has a single green thread for scheduling and polling, called the
Loop. The Loop for each native thread is created the first time it is required. The
Loop manages green thread deadlines, IO event interest, timeouts and green thread
readiness, and switches to green threads as they become ready. Green threads switch
back to the Loop when they cannot proceed, for example when sleeping or waiting
on locks or sockets. Switching to and from the Loop is mostly done
internally by gthread primitives.
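
The scheduling cycle described above can be sketched with plain Python
generators. This is an illustration only, not gthread's implementation: the
Loop class here, its spawn and run methods, and the worker function are
hypothetical names, and IO polling and locks are left out. Each generator plays
the role of a green thread and yields a delay when it wants to sleep.

```python
import heapq
import time
from collections import deque


class Loop:
    """Toy scheduler (hypothetical, not gthread's Loop): one native thread,
    many generator-based "green threads"."""

    def __init__(self):
        self.ready = deque()   # green threads that can run now
        self.sleeping = []     # heap of (deadline, seq, green thread)
        self._seq = 0          # tie-breaker so the heap never compares generators

    def spawn(self, gen):
        self.ready.append(gen)

    def run(self):
        while self.ready or self.sleeping:
            now = time.monotonic()
            # Wake every green thread whose deadline has passed.
            while self.sleeping and self.sleeping[0][0] <= now:
                self.ready.append(heapq.heappop(self.sleeping)[2])
            if not self.ready:
                # Nothing is runnable: block until the earliest deadline.
                time.sleep(self.sleeping[0][0] - now)
                continue
            gen = self.ready.popleft()
            try:
                delay = next(gen)   # switch into the green thread
            except StopIteration:
                continue            # the green thread finished
            # The green thread yielded a delay: park it until its deadline.
            self._seq += 1
            heapq.heappush(self.sleeping, (now + delay, self._seq, gen))


def worker(name, delay, log):
    for step in range(2):
        log.append((name, step))
        yield delay   # "sleep": switch back to the loop


log = []
loop = Loop()
loop.spawn(worker("a", 0.01, log))
loop.spawn(worker("b", 0.03, log))
loop.run()
print(log)
```

Both workers make progress on the same native thread, interleaved by their
deadlines. A real implementation would also switch full call stacks rather than
relying on generators, so that a green thread can block from inside nested
function calls.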

Why green threading?
--------------------

Native threading is very heavyweight for use in a 1:1 scheme with blocking
code. Typically thread counts are limited to several thousand. The
scheduling costs for managing large numbers of threads for the sole purpose
of concurrency are prohibitively high. In addition, Python's GIL is
restrictive and exhibits contention issues when there are lots of
simultaneous wake-ups as the GIL competes with the native OS scheduler.

An alternative to threading is event polling. This suffers from a lack of
support on Windows and, in the naive language implementation, requires that
code be inverted so that events can be handled by a dispatcher. The
fragmented code required to do this is difficult to read, convoluted, and
highly dispatcher-implementation specific. It also makes debugging very
difficult.
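
As an illustration of the inversion described above, here is a minimal sketch
using Python's standard selectors module (not gthread). The on_readable
callback and the received list are hypothetical names chosen for the example.
The logic "read until the peer closes" is split between a callback and the
dispatch loop instead of being written as one straight-line function.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
received = []


def on_readable(sock):
    # Called by the dispatcher each time the socket has data (or EOF).
    data = sock.recv(1024)
    if data:
        received.append(data)
    else:
        # Empty read means the peer closed: stop watching this socket.
        sel.unregister(sock)
        sock.close()


a, b = socket.socketpair()
b.setblocking(False)
sel.register(b, selectors.EVENT_READ, on_readable)

a.sendall(b"ping")
a.close()

# The dispatcher: it owns the control flow and calls back into our code.
while sel.get_map():
    for key, _ in sel.select(timeout=1):
        key.data(key.fileobj)

sel.close()
```

Any state that a blocking version would keep in local variables has to live
outside the callback (here, the received list), which is part of what makes
this style harder to read and debug.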

Green threading is the reuse of a single native thread for multiple call stacks.

Advantages:
 * The Python GIL prevents Python code in different native threads from executing simultaneously. By putting many contexts onto a single native thread, there is no contention for the GIL.
 * The level of program concurrency achievable is much higher. Context count is limited by memory availability and context switching performance.
 * OS scheduler load is reduced since context switching is done in user space.

Disadvantages:
 * Native threading primitives will block the underlying native thread. OS scheduling preemption does not help, since the OS is not aware of the other contexts waiting on that thread.
 * "Green" forms of many primitives must be made that operate within userspace and without kernel intervention.
 * Many libraries are implemented with the assumption that context switching is at the kernel level. Blocking calls to these libraries must be done carefully.
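
The first disadvantage can be seen in a small sketch, again using generators
as stand-ins for green threads rather than gthread itself. The run_all, ticker
and blocker names are hypothetical. When one context calls the native
time.sleep, the whole native thread stops, so the other context cannot run
until the call returns.

```python
import time
from collections import deque


def run_all(tasks):
    """Round-robin over generator-based contexts on one native thread."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue
        ready.append(task)


def ticker(ticks):
    for _ in range(3):
        ticks.append(time.monotonic())
        yield   # cooperative switch back to the scheduler


def blocker(seconds):
    # Native blocking call: nothing else on this native thread can run.
    time.sleep(seconds)
    yield


ticks = []
run_all([ticker(ticks), blocker(0.05)])
gap = ticks[1] - ticks[0]
print("delay between first two ticks: %.3fs" % gap)
```

The ticker does no blocking of its own, yet its second tick is delayed by
roughly the full duration of the other context's native sleep. A "green" sleep
would instead hand control back to the scheduler until the deadline.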