Source

python-peps / pep-3119.txt

Full commit
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
PEP: 3119
Title: Introducing Abstract Base Classes
Version: $Revision$
Last-Modified: $Date$
Author: Guido van Rossum <guido@python.org>, Talin <talin@acm.org>
Status: Final
Type: Standards Track
Content-Type: text/x-rst
Created: 18-Apr-2007
Post-History: 26-Apr-2007, 11-May-2007


Abstract
========

This is a proposal to add Abstract Base Class (ABC) support to Python
3000.  It proposes:

* A way to overload ``isinstance()`` and ``issubclass()``.

* A new module ``abc`` which serves as an "ABC support framework".  It
  defines a metaclass for use with ABCs and a decorator that can be
  used to define abstract methods.

* Specific ABCs for containers and iterators, to be added to the
  collections module.

Much of the thinking that went into the proposal is not about the
specific mechanism of ABCs, as contrasted with Interfaces or Generic
Functions (GFs), but about clarifying philosophical issues like "what
makes a set", "what makes a mapping" and "what makes a sequence".

There's also a companion PEP 3141, which defines ABCs for numeric
types.


Acknowledgements
----------------

Talin wrote the Rationale below [1]_ as well as most of the section on
ABCs vs. Interfaces.  For that alone he deserves co-authorship.  The
rest of the PEP uses "I" referring to the first author.


Rationale
=========

In the domain of object-oriented programming, the usage patterns for
interacting with an object can be divided into two basic categories,
which are 'invocation' and 'inspection'.

Invocation means interacting with an object by invoking its methods.
Usually this is combined with polymorphism, so that invoking a given
method may run different code depending on the type of an object.

Inspection means the ability for external code (outside of the
object's methods) to examine the type or properties of that object,
and make decisions on how to treat that object based on that
information.

Both usage patterns serve the same general end, which is to be able to
support the processing of diverse and potentially novel objects in a
uniform way, but at the same time allowing processing decisions to be
customized for each different type of object.

In classical OOP theory, invocation is the preferred usage pattern,
and inspection is actively discouraged, being considered a relic of an
earlier, procedural programming style.  However, in practice this view
is simply too dogmatic and inflexible, and leads to a kind of design
rigidity that is very much at odds with the dynamic nature of a
language like Python.

In particular, there is often a need to process objects in a way that
wasn't anticipated by the creator of the object class.  It is not
always the best solution to build in to every object methods that
satisfy the needs of every possible user of that object.  Moreover,
there are many powerful dispatch philosophies that are in direct
contrast to the classic OOP requirement of behavior being strictly
encapsulated within an object, examples being rule or pattern-match
driven logic.

On the other hand, one of the criticisms of inspection by classic
OOP theorists is the lack of formalisms and the ad hoc nature of what
is being inspected.  In a language such as Python, in which almost any
aspect of an object can be reflected and directly accessed by external
code, there are many different ways to test whether an object conforms
to a particular protocol or not.  For example, if asking 'is this
object a mutable sequence container?', one can look for a base class
of 'list', or one can look for a method named '__getitem__'.  But note
that although these tests may seem obvious, neither of them are
correct, as one generates false negatives, and the other false
positives.

The generally agreed-upon remedy is to standardize the tests, and
group them into a formal arrangement.  This is most easily done by
associating with each class a set of standard testable properties,
either via the inheritance mechanism or some other means.  Each test
carries with it a set of promises: it contains a promise about the
general behavior of the class, and a promise as to what other class
methods will be available.

This PEP proposes a particular strategy for organizing these tests
known as Abstract Base Classes, or ABC.  ABCs are simply Python
classes that are added into an object's inheritance tree to signal
certain features of that object to an external inspector.  Tests are
done using ``isinstance()``, and the presence of a particular ABC
means that the test has passed.

In addition, the ABCs define a minimal set of methods that establish
the characteristic behavior of the type.  Code that discriminates
objects based on their ABC type can trust that those methods will
always be present.  Each of these methods are accompanied by an
generalized abstract semantic definition that is described in the
documentation for the ABC.  These standard semantic definitions are
not enforced, but are strongly recommended.

Like all other things in Python, these promises are in the nature of a
gentlemen's agreement, which in this case means that while the
language does enforce some of the promises made in the ABC, it is up
to the implementer of the concrete class to insure that the remaining
ones are kept.


Specification
=============

The specification follows the categories listed in the abstract:

* A way to overload ``isinstance()`` and ``issubclass()``.

* A new module ``abc`` which serves as an "ABC support framework".  It
  defines a metaclass for use with ABCs and a decorator that can be
  used to define abstract methods.

* Specific ABCs for containers and iterators, to be added to the
  collections module.


Overloading ``isinstance()`` and ``issubclass()``
-------------------------------------------------

During the development of this PEP and of its companion, PEP 3141, we
repeatedly faced the choice between standardizing more, fine-grained
ABCs or fewer, course-grained ones.  For example, at one stage, PEP
3141 introduced the following stack of base classes used for complex
numbers: MonoidUnderPlus, AdditiveGroup, Ring, Field, Complex (each
derived from the previous).  And the discussion mentioned several
other algebraic categorizations that were left out: Algebraic,
Transcendental, and IntegralDomain, and PrincipalIdealDomain.  In
earlier versions of the current PEP, we considered the use cases for
separate classes like Set, ComposableSet, MutableSet, HashableSet,
MutableComposableSet, HashableComposableSet.

The dilemma here is that we'd rather have fewer ABCs, but then what
should a user do who needs a less refined ABC?  Consider e.g. the
plight of a mathematician who wants to define his own kind of
Transcendental numbers, but also wants float and int to be considered
Transcendental.  PEP 3141 originally proposed to patch float.__bases__
for that purpose, but there are some good reasons to keep the built-in
types immutable (for one, they are shared between all Python
interpreters running in the same address space, as is used by
mod_python [16]_).

Another example would be someone who wants to define a generic
function (PEP 3124) for any sequence that has an ``append()`` method.
The ``Sequence`` ABC (see below) doesn't promise the ``append()``
method, while ``MutableSequence`` requires not only ``append()`` but
also various other mutating methods.

To solve these and similar dilemmas, the next section will propose a
metaclass for use with ABCs that will allow us to add an ABC as a
"virtual base class" (not the same concept as in C++) to any class,
including to another ABC.  This allows the standard library to define
ABCs ``Sequence`` and ``MutableSequence`` and register these as
virtual base classes for built-in types like ``basestring``, ``tuple``
and ``list``, so that for example the following conditions are all
true::

    isinstance([], Sequence)
    issubclass(list, Sequence)
    issubclass(list, MutableSequence)
    isinstance((), Sequence)
    not issubclass(tuple, MutableSequence)
    isinstance("", Sequence)
    issubclass(bytearray, MutableSequence)

The primary mechanism proposed here is to allow overloading the
built-in functions ``isinstance()`` and ``issubclass()``.  The
overloading works as follows: The call ``isinstance(x, C)`` first
checks whether ``C.__instancecheck__`` exists, and if so, calls
``C.__instancecheck__(x)`` instead of its normal implementation.
Similarly, the call ``issubclass(D, C)`` first checks whether
``C.__subclasscheck__`` exists, and if so, calls
``C.__subclasscheck__(D)`` instead of its normal implementation.

Note that the magic names are not ``__isinstance__`` and
``__issubclass__``; this is because the reversal of the arguments
could cause confusion, especially for the ``issubclass()`` overloader.

A prototype implementation of this is given in [12]_.

Here is an example with (naively simple) implementations of
``__instancecheck__`` and ``__subclasscheck__``::

    class ABCMeta(type):

        def __instancecheck__(cls, inst):
            """Implement isinstance(inst, cls)."""
            return any(cls.__subclasscheck__(c)
                       for c in {type(inst), inst.__class__})

        def __subclasscheck__(cls, sub):
            """Implement issubclass(sub, cls)."""
            candidates = cls.__dict__.get("__subclass__", set()) | {cls}
            return any(c in candidates for c in sub.mro())

    class Sequence(metaclass=ABCMeta):
        __subclass__ = {list, tuple}

    assert issubclass(list, Sequence)
    assert issubclass(tuple, Sequence)

    class AppendableSequence(Sequence):
        __subclass__ = {list}

    assert issubclass(list, AppendableSequence)
    assert isinstance([], AppendableSequence)

    assert not issubclass(tuple, AppendableSequence)
    assert not isinstance((), AppendableSequence)

The next section proposes a full-fledged implementation.


The ``abc`` Module: an ABC Support Framework
--------------------------------------------

The new standard library module ``abc``, written in pure Python,
serves as an ABC support framework.  It defines a metaclass
``ABCMeta`` and decorators ``@abstractmethod`` and
``@abstractproperty``.  A sample implementation is given by [13]_.

The ``ABCMeta`` class overrides ``__instancecheck__`` and
``__subclasscheck__`` and defines a ``register`` method.  The
``register`` method takes one argument, which must be a class; after
the call ``B.register(C)``, the call ``issubclass(C, B)`` will return
True, by virtue of ``B.__subclasscheck__(C)`` returning True.
Also, ``isinstance(x, B)`` is equivalent to ``issubclass(x.__class__,
B) or issubclass(type(x), B)``.  (It is possible ``type(x)`` and
``x.__class__`` are not the same object, e.g. when x is a proxy
object.)

These methods are intended to be be called on classes whose metaclass
is (derived from) ``ABCMeta``; for example::

    from abc import ABCMeta

    class MyABC(metaclass=ABCMeta):
        pass

    MyABC.register(tuple)

    assert issubclass(tuple, MyABC)
    assert isinstance((), MyABC)

The last two asserts are equivalent to the following two::

    assert MyABC.__subclasscheck__(tuple)
    assert MyABC.__instancecheck__(())

Of course, you can also directly subclass MyABC::

    class MyClass(MyABC):
        pass

    assert issubclass(MyClass, MyABC)
    assert isinstance(MyClass(), MyABC)

Also, of course, a tuple is not a ``MyClass``::

    assert not issubclass(tuple, MyClass)
    assert not isinstance((), MyClass)

You can register another class as a subclass of ``MyClass``::

    MyClass.register(list)

    assert issubclass(list, MyClass)
    assert issubclass(list, MyABC)

You can also register another ABC::

    class AnotherClass(metaclass=ABCMeta):
        pass

    AnotherClass.register(basestring)

    MyClass.register(AnotherClass)

    assert isinstance(str, MyABC)

That last assert requires tracing the following superclass-subclass
relationships::

    MyABC -> MyClass (using regular subclassing)
    MyClass -> AnotherClass (using registration)
    AnotherClass -> basestring (using registration)
    basestring -> str (using regular subclassing)

The ``abc`` module also defines a new decorator, ``@abstractmethod``,
to be used to declare abstract methods.  A class containing at least
one method declared with this decorator that hasn't been overridden
yet cannot be instantiated.  Such methods may be called from the
overriding method in the subclass (using ``super`` or direct
invocation).  For example::

    from abc import ABCMeta, abstractmethod

    class A(metaclass=ABCMeta):
        @abstractmethod
        def foo(self): pass

    A()  # raises TypeError

    class B(A):
        pass

    B()  # raises TypeError

    class C(A):
        def foo(self): print(42)

    C()  # works

**Note:** The ``@abstractmethod`` decorator should only be used
inside a class body, and only for classes whose metaclass is (derived
from) ``ABCMeta``.  Dynamically adding abstract methods to a class, or
attempting to modify the abstraction status of a method or class once
it is created, are not supported.  The ``@abstractmethod`` only
affects subclasses derived using regular inheritance; "virtual
subclasses" registered with the ``register()`` method are not affected.

**Implementation:** The ``@abstractmethod`` decorator sets the
function attribute ``__isabstractmethod__`` to the value ``True``.
The ``ABCMeta.__new__`` method computes the type attribute
``__abstractmethods__`` as the set of all method names that have an
``__isabstractmethod__`` attribute whose value is true.  It does this
by combining the ``__abstractmethods__`` attributes of the base
classes, adding the names of all methods in the new class dict that
have a true ``__isabstractmethod__`` attribute, and removing the names
of all methods in the new class dict that don't have a true
``__isabstractmethod__`` attribute.  If the resulting
``__abstractmethods__`` set is non-empty, the class is considered
abstract, and attempts to instantiate it will raise ``TypeError``.
(If this were implemented in CPython, an internal flag
``Py_TPFLAGS_ABSTRACT`` could be used to speed up this check [6]_.)

**Discussion:** Unlike Java's abstract methods or C++'s pure abstract
methods, abstract methods as defined here may have an implementation.
This implementation can be called via the ``super`` mechanism from the
class that overrides it.  This could be useful as an end-point for a
super-call in framework using cooperative multiple-inheritance [7]_,
[8]_.

A second decorator, ``@abstractproperty``, is defined in order to
define abstract data attributes.  Its implementation is a subclass of
the built-in ``property`` class that adds an ``__isabstractmethod__``
attribute::

    class abstractproperty(property):
        __isabstractmethod__ = True

It can be used in two ways::

    class C(metaclass=ABCMeta):

        # A read-only property:

        @abstractproperty
        def readonly(self):
            return self.__x

        # A read-write property (cannot use decorator syntax):

        def getx(self):
            return self.__x
        def setx(self, value):
            self.__x = value
        x = abstractproperty(getx, setx)

Similar to abstract methods, a subclass inheriting an abstract
property (declared using either the decorator syntax or the longer
form) cannot be instantiated unless it overrides that abstract
property with a concrete property.


ABCs for Containers and Iterators
---------------------------------

The ``collections`` module will define ABCs necessary and sufficient
to work with sets, mappings, sequences, and some helper types such as
iterators and dictionary views.  All ABCs have the above-mentioned
``ABCMeta`` as their metaclass.

The ABCs provide implementations of their abstract methods that are
technically valid but fairly useless; e.g. ``__hash__`` returns 0, and
``__iter__`` returns an empty iterator.  In general, the abstract
methods represent the behavior of an empty container of the indicated
type.

Some ABCs also provide concrete (i.e. non-abstract) methods; for
example, the ``Iterator`` class has an ``__iter__`` method returning
itself, fulfilling an important invariant of iterators (which in
Python 2 has to be implemented anew by each iterator class).  These
ABCs can be considered "mix-in" classes.

No ABCs defined in the PEP override ``__init__``, ``__new__``,
``__str__`` or ``__repr__``.  Defining a standard constructor
signature would unnecessarily constrain custom container types, for
example Patricia trees or gdbm files.  Defining a specific string
representation for a collection is similarly left up to individual
implementations.

**Note:** There are no ABCs for ordering operations (``__lt__``,
``__le__``, ``__ge__``, ``__gt__``).  Defining these in a base class
(abstract or not) runs into problems with the accepted type for the
second operand.  For example, if class ``Ordering`` defined
``__lt__``, one would assume that for any ``Ordering`` instances ``x``
and ``y``, ``x < y`` would be defined (even if it just defines a
partial ordering).  But this cannot be the case: If both ``list`` and
``str`` derived from ``Ordering``, this would imply that ``[1, 2] <
(1, 2)`` should be defined (and presumably return False), while in
fact (in Python 3000!)  such "mixed-mode comparisons" operations are
explicitly forbidden and raise ``TypeError``.  See PEP 3100 and [14]_
for more information.  (This is a special case of a more general issue
with operations that take another argument of the same type).


One Trick Ponies
''''''''''''''''

These abstract classes represent single methods like ``__iter__`` or
``__len__``.

``Hashable``
    The base class for classes defining ``__hash__``.  The
    ``__hash__`` method should return an integer.  The abstract
    ``__hash__`` method always returns 0, which is a valid (albeit
    inefficient) implementation.  **Invariant:** If classes ``C1`` and
    ``C2`` both derive from ``Hashable``, the condition ``o1 == o2``
    must imply ``hash(o1) == hash(o2)`` for all instances ``o1`` of
    ``C1`` and all instances ``o2`` of ``C2``.  In other words, two
    objects should never compare equal if they have different hash
    values.

    Another constraint is that hashable objects, once created, should
    never change their value (as compared by ``==``) or their hash
    value.  If a class cannot guarantee this, it should not derive
    from ``Hashable``; if it cannot guarantee this for certain
    instances, ``__hash__`` for those instances should raise a
    ``TypeError`` exception.

    **Note:** being an instance of this class does not imply that an
    object is immutable; e.g. a tuple containing a list as a member is
    not immutable; its ``__hash__`` method raises ``TypeError``.
    (This is because it recursively tries to compute the hash of each
    member; if a member is unhashable it raises ``TypeError``.)

``Iterable``
    The base class for classes defining ``__iter__``.  The
    ``__iter__`` method should always return an instance of
    ``Iterator`` (see below).  The abstract ``__iter__`` method
    returns an empty iterator.

``Iterator``
    The base class for classes defining ``__next__``.  This derives
    from ``Iterable``.  The abstract ``__next__`` method raises
    ``StopIteration``.  The concrete ``__iter__`` method returns
    ``self``.  Note the distinction between ``Iterable`` and
    ``Iterator``: an ``Iterable`` can be iterated over, i.e. supports
    the ``__iter__`` methods; an ``Iterator`` is what the built-in
    function ``iter()`` returns, i.e. supports the ``__next__``
    method.

``Sized``
    The base class for classes defining ``__len__``.  The ``__len__``
    method should return an ``Integer`` (see "Numbers" below) >= 0.
    The abstract ``__len__`` method returns 0.  **Invariant:** If a
    class ``C`` derives from ``Sized`` as well as from ``Iterable``,
    the invariant ``sum(1 for x in c) == len(c)`` should hold for any
    instance ``c`` of ``C``.

``Container``
    The base class for classes defining ``__contains__``.  The
    ``__contains__`` method should return a ``bool``.  The abstract
    ``__contains__`` method returns ``False``.  **Invariant:** If a
    class ``C`` derives from ``Container`` as well as from
    ``Iterable``, then ``(x in c for x in c)`` should be a generator
    yielding only True values for any instance ``c`` of ``C``.

**Open issues:** Conceivably, instead of using the ABCMeta metaclass,
these classes could override ``__instancecheck__`` and
``__subclasscheck__`` to check for the presence of the applicable
special method; for example::

    class Sized(metaclass=ABCMeta):
        @abstractmethod
        def __hash__(self):
            return 0
        @classmethod
        def __instancecheck__(cls, x):
            return hasattr(x, "__len__")
        @classmethod
        def __subclasscheck__(cls, C):
            return hasattr(C, "__bases__") and hasattr(C, "__len__")

This has the advantage of not requiring explicit registration.
However, the semantics are hard to get exactly right given the confusing
semantics of instance attributes vs. class attributes, and that a
class is an instance of its metaclass; the check for ``__bases__`` is
only an approximation of the desired semantics.  **Strawman:** Let's
do it, but let's arrange it in such a way that the registration API
also works.


Sets
''''

These abstract classes represent read-only sets and mutable sets.  The
most fundamental set operation is the membership test, written as ``x
in s`` and implemented by ``s.__contains__(x)``.  This operation is
already defined by the ``Container`` class defined above.  Therefore,
we define a set as a sized, iterable container for which certain
invariants from mathematical set theory hold.

The built-in type ``set`` derives from ``MutableSet``.  The built-in
type ``frozenset`` derives from ``Set`` and ``Hashable``.

``Set``
    This is a sized, iterable container, i.e., a subclass of
    ``Sized``, ``Iterable`` and ``Container``.  Not every subclass of
    those three classes is a set though!  Sets have the additional
    invariant that each element occurs only once (as can be determined
    by iteration), and in addition sets define concrete operators that
    implement the inequality operations as subclass/superclass tests.
    In general, the invariants for finite sets in mathematics
    hold. [11]_

    Sets with different implementations can be compared safely,
    (usually) efficiently and correctly using the mathematical
    definitions of the subclass/superclass operations for finite sets.
    The ordering operations have concrete implementations; subclasses
    may override these for speed but should maintain the semantics.
    Because ``Set`` derives from ``Sized``, ``__eq__`` may take a
    shortcut and return ``False`` immediately if two sets of unequal
    length are compared.  Similarly, ``__le__`` may return ``False``
    immediately if the first set has more members than the second set.
    Note that set inclusion implements only a partial ordering;
    e.g. ``{1, 2}`` and ``{1, 3}`` are not ordered (all three of
    ``<``, ``==`` and ``>`` return ``False`` for these arguments).
    Sets cannot be ordered relative to mappings or sequences, but they
    can be compared to those for equality (and then they always
    compare unequal).

    This class also defines concrete operators to compute union,
    intersection, symmetric and asymmetric difference, respectively
    ``__or__``, ``__and__``, ``__xor__`` and ``__sub__``.  These
    operators should return instances of ``Set``.  The default
    implementations call the overridable class method
    ``_from_iterable()`` with an iterable argument.  This factory
    method's default implementation returns a ``frozenset`` instance;
    it may be overridden to return another appropriate ``Set``
    subclass.

    Finally, this class defines a concrete method ``_hash`` which
    computes the hash value from the elements.  Hashable subclasses of
    ``Set`` can implement ``__hash__`` by calling ``_hash`` or they
    can reimplement the same algorithm more efficiently; but the
    algorithm implemented should be the same.  Currently the algorithm
    is fully specified only by the source code [15]_.

    **Note:** the ``issubset`` and ``issuperset`` methods found on the
    set type in Python 2 are not supported, as these are mostly just
    aliases for ``__le__`` and ``__ge__``.

``MutableSet``
    This is a subclass of ``Set`` implementing additional operations
    to add and remove elements.  The supported methods have the
    semantics known from the ``set`` type in Python 2 (except for
    ``discard``, which is modeled after Java):

    ``.add(x)``
        Abstract method returning a ``bool`` that adds the element
        ``x`` if it isn't already in the set.  It should return
        ``True`` if ``x`` was added, ``False`` if it was already
        there. The abstract implementation raises
        ``NotImplementedError``.

    ``.discard(x)``
        Abstract method returning a ``bool`` that removes the element
        ``x`` if present.  It should return ``True`` if the element
        was present and ``False`` if it wasn't.  The abstract
        implementation raises ``NotImplementedError``.

    ``.pop()``
        Concrete method that removes and returns an arbitrary item.
        If the set is empty, it raises ``KeyError``.  The default
        implementation removes the first item returned by the set's
        iterator.

    ``.toggle(x)``
        Concrete method returning a ``bool`` that adds x to the set if
        it wasn't there, but removes it if it was there.  It should
        return ``True`` if ``x`` was added, ``False`` if it was
        removed.

    ``.clear()``
        Concrete method that empties the set.  The default
        implementation repeatedly calls ``self.pop()`` until
        ``KeyError`` is caught.  (**Note:** this is likely much slower
        than simply creating a new set, even if an implementation
        overrides it with a faster approach; but in some cases object
        identity is important.)

    This also supports the in-place mutating operations ``|=``,
    ``&=``, ``^=``, ``-=``.  These are concrete methods whose right
    operand can be an arbitrary ``Iterable``, except for ``&=``, whose
    right operand must be a ``Container``.  This ABC does not provide
    the named methods present on the built-in concrete ``set`` type
    that perform (almost) the same operations.


Mappings
''''''''

These abstract classes represent read-only mappings and mutable
mappings.  The ``Mapping`` class represents the most common read-only
mapping API.

The built-in type ``dict`` derives from ``MutableMapping``.

``Mapping``
    A subclass of ``Container``, ``Iterable`` and ``Sized``.  The keys
    of a mapping naturally form a set.  The (key, value) pairs (which
    must be tuples) are also referred to as items.  The items also
    form a set.  Methods:

    ``.__getitem__(key)``
        Abstract method that returns the value corresponding to
        ``key``, or raises ``KeyError``.  The implementation always
        raises ``KeyError``.

    ``.get(key, default=None)``
        Concrete method returning ``self[key]`` if this does not raise
        ``KeyError``, and the ``default`` value if it does.

    ``.__contains__(key)``
        Concrete method returning ``True`` if ``self[key]`` does not
        raise ``KeyError``, and ``False`` if it does.

    ``.__len__()``
        Abstract method returning the number of distinct keys (i.e.,
        the length of the key set).

    ``.__iter__()``
        Abstract method returning each key in the key set exactly once.

    ``.keys()``
        Concrete method returning the key set as a ``Set``.  The
        default concrete implementation returns a "view" on the key
        set (meaning if the underlying mapping is modified, the view's
        value changes correspondingly); subclasses are not required to
        return a view but they should return a ``Set``.

    ``.items()``
        Concrete method returning the items as a ``Set``.  The default
        concrete implementation returns a "view" on the item set;
        subclasses are not required to return a view but they should
        return a ``Set``.

    ``.values()``
        Concrete method returning the values as a sized, iterable
        container (not a set!).  The default concrete implementation
        returns a "view" on the values of the mapping; subclasses are
        not required to return a view but they should return a sized,
        iterable container.

    The following invariants should hold for any mapping ``m``::

        len(m.values()) == len(m.keys()) == len(m.items()) == len(m)
        [value for value in m.values()] == [m[key] for key in m.keys()]
        [item for item in m.items()] == [(key, m[key]) for key in m.keys()]

    i.e. iterating over the items, keys and values should return
    results in the same order.

``MutableMapping``
    A subclass of ``Mapping`` that also implements some standard
    mutating methods.  Abstract methods include ``__setitem__``,
    ``__delitem__``.  Concrete methods include ``pop``, ``popitem``,
    ``clear``, ``update``.  **Note:** ``setdefault`` is *not* included.
    **Open issues:** Write out the specs for the methods.


Sequences
'''''''''

These abstract classes represent read-only sequences and mutable
sequences.

The built-in ``list`` and ``bytes`` types derive from
``MutableSequence``.  The built-in ``tuple`` and ``str`` types derive
from ``Sequence`` and ``Hashable``.

``Sequence``
    A subclass of ``Iterable``, ``Sized``, ``Container``.  It
    defines a new abstract method ``__getitem__`` that has a somewhat
    complicated signature: when called with an integer, it returns an
    element of the sequence or raises ``IndexError``; when called with
    a ``slice`` object, it returns another ``Sequence``.  The concrete
    ``__iter__`` method iterates over the elements using
    ``__getitem__`` with integer arguments 0, 1, and so on, until
    ``IndexError`` is raised.  The length should be equal to the
    number of values returned by the iterator.

    **Open issues:** Other candidate methods, which can all have
    default concrete implementations that only depend on ``__len__``
    and ``__getitem__`` with an integer argument: ``__reversed__``,
    ``index``, ``count``, ``__add__``, ``__mul__``.

``MutableSequence``
    A subclass of ``Sequence`` adding some standard mutating methods.
    Abstract mutating methods: ``__setitem__`` (for integer indices as
    well as slices), ``__delitem__`` (ditto), ``insert``.  Concrete
    mutating methods: ``append``, ``reverse``, ``extend``, ``pop``,
    ``remove``.  Concrete mutating operators: ``+=``, ``*=`` (these
    mutate the object in place).  **Note:** this does not define
    ``sort()`` -- that is only required to exist on genuine ``list``
    instances.


Strings
-------

Python 3000 will likely have at least two built-in string types: byte
strings (``bytes``), deriving from ``MutableSequence``, and (Unicode)
character strings (``str``), deriving from ``Sequence`` and
``Hashable``.

**Open issues:** define the base interfaces for these so alternative
implementations and subclasses know what they are in for.  This may be
the subject of a new PEP or PEPs (PEP 358 should be co-opted for the
``bytes`` type).


ABCs vs. Alternatives
=====================

In this section I will attempt to compare and contrast ABCs to other
approaches that have been proposed.


ABCs vs. Duck Typing
--------------------

Does the introduction of ABCs mean the end of Duck Typing?  I don't
think so.  Python will not require that a class derives from
``BasicMapping`` or ``Sequence`` when it defines a ``__getitem__``
method, nor will the ``x[y]`` syntax require that ``x`` is an instance
of either ABC.  You will still be able to assign any "file-like"
object to ``sys.stdout``, as long as it has a ``write`` method.

Of course, there will be some carrots to encourage users to derive
from the appropriate base classes; these vary from default
implementations for certain functionality to an improved ability to
distinguish between mappings and sequences.  But there are no sticks.
If ``hasattr(x, "__len__")`` works for you, great!  ABCs are intended to
solve problems that don't have a good solution at all in Python 2,
such as distinguishing between mappings and sequences.


ABCs vs. Generic Functions
--------------------------

ABCs are compatible with Generic Functions (GFs).  For example, my own
Generic Functions implementation [4]_ uses the classes (types) of the
arguments as the dispatch key, allowing derived classes to override
base classes.  Since (from Python's perspective) ABCs are quite
ordinary classes, using an ABC in the default implementation for a GF
can be quite appropriate.  For example, if I have an overloaded
``prettyprint`` function, it would make total sense to define
pretty-printing of sets like this::

    @prettyprint.register(Set)
    def pp_set(s):
        return "{" + ... + "}"  # Details left as an exercise

and implementations for specific subclasses of Set could be added
easily.

I believe ABCs also won't present any problems for RuleDispatch,
Phillip Eby's GF implementation in PEAK [5]_.

Of course, GF proponents might claim that GFs (and concrete, or
implementation, classes) are all you need.  But even they will not
deny the usefulness of inheritance; and one can easily consider the
ABCs proposed in this PEP as optional implementation base classes;
there is no requirement that all user-defined mappings derive from
``BasicMapping``.


ABCs vs. Interfaces
-------------------

ABCs are not intrinsically incompatible with Interfaces, but there is
considerable overlap.  For now, I'll leave it to proponents of
Interfaces to explain why Interfaces are better.  I expect that much
of the work that went into e.g. defining the various shades of
"mapping-ness" and the nomenclature could easily be adapted for a
proposal to use Interfaces instead of ABCs.

"Interfaces" in this context refers to a set of proposals for
additional metadata elements attached to a class which are not part of
the regular class hierarchy, but do allow for certain types of
inheritance testing.

Such metadata would be designed, at least in some proposals, so as to
be easily mutable by an application, allowing application writers to
override the normal classification of an object.

The drawback to this idea of attaching mutable metadata to a class is
that classes are shared state, and mutating them may lead to conflicts
of intent.  Additionally, the need to override the classification of
an object can be done more cleanly using generic functions: In the
simplest case, one can define a "category membership" generic function
that simply returns False in the base implementation, and then provide
overrides that return True for any classes of interest.


References
==========

.. [1] An Introduction to ABC's, by Talin
   (http://mail.python.org/pipermail/python-3000/2007-April/006614.html)

.. [2] Incomplete implementation prototype, by GvR
   (http://svn.python.org/view/sandbox/trunk/abc/)

.. [3] Possible Python 3K Class Tree?, wiki page created by Bill Janssen
   (http://wiki.python.org/moin/AbstractBaseClasses)

.. [4] Generic Functions implementation, by GvR
   (http://svn.python.org/view/sandbox/trunk/overload/)

.. [5] Charming Python: Scaling a new PEAK, by David Mertz
   (http://www-128.ibm.com/developerworks/library/l-cppeak2/)

.. [6] Implementation of @abstractmethod
   (http://python.org/sf/1706989)

.. [7] Unifying types and classes in Python 2.2, by GvR
   (http://www.python.org/download/releases/2.2.3/descrintro/)

.. [8] Putting Metaclasses to Work: A New Dimension in Object-Oriented
   Programming, by Ira R. Forman and Scott H. Danforth
   (http://www.amazon.com/gp/product/0201433052)

.. [9] Partial order, in Wikipedia
   (http://en.wikipedia.org/wiki/Partial_order)

.. [10] Total order, in Wikipedia
   (http://en.wikipedia.org/wiki/Total_order)

.. [11] Finite set, in Wikipedia
   (http://en.wikipedia.org/wiki/Finite_set)

.. [12] Make isinstance/issubclass overloadable
   (http://python.org/sf/1708353)

.. [13] ABCMeta sample implementation
   (http://svn.python.org/view/sandbox/trunk/abc/xyz.py)

.. [14] python-dev email ("Comparing heterogeneous types")
   http://mail.python.org/pipermail/python-dev/2004-June/045111.html

.. [15] Function ``frozenset_hash()`` in Object/setobject.c
   (http://svn.python.org/view/python/trunk/Objects/setobject.c)

.. [16] Multiple interpreters in mod_python
   (http://www.modpython.org/live/current/doc-html/pyapi-interps.html)


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: