Commits

committed fb3ef3e

Finish off binary data section. First draft of the update complete

Doc/library/stdtypes.rst

` `
` `
` To clarify the above rules, here's some example Python code,`
`-equivalent to the builtin hash, for computing the hash of a rational`
`+equivalent to the built-in hash, for computing the hash of a rational`
` number, :class:`float`, or :class:`complex`::`
` `
` `
` `
` The only operation that immutable sequence types generally implement that is`
` not also implemented by mutable sequence types is support for the :func:`hash``
`-builtin.`
`+built-in.`
` `
` This support allows immutable sequences, such as :class:`tuple` instances, to`
` be used as :class:`dict` keys and stored in :class:`set` and :class:`frozenset``
` * Using a pair of square brackets to denote the empty list: ``[]```
` * Using square brackets, separating items with commas: ``[a]``, ``[a, b, c]```
` * Using a list comprehension: ``[x for x in iterable]```
`-* Using the :func:`list` builtin: ``list()`` or ``list(iterable)```
`-`
`-Many other operations also produce lists, including the :func:`sorted` builtin.`
`+* Using the :func:`list` built-in: ``list()`` or ``list(iterable)```
`+`
`+Many other operations also produce lists, including the :func:`sorted` built-in.`
` `
` Lists implement all of the :ref:`common <typesseq-common>` and`
` :ref:`mutable <typesseq-mutable>` sequence operations. Lists also provide the`
` `
` Tuples are immutable sequences, typically used to store collections of`
` heterogeneous data (such as the 2-tuples produced by the :func:`enumerate``
`-builtin). Tuples are also used for cases where an immutable sequence of`
`+built-in). Tuples are also used for cases where an immutable sequence of`
` homogeneous data is needed (such as allowing storage in a :class:`set` or`
` :class:`dict` instance).`
` `
` * Using a pair of parentheses to denote the empty tuple: ``()```
` * Using a trailing comma for a singleton tuple: ``a,`` or ``(a,)```
` * Separating items with commas: ``a, b, c`` or ``(a, b, c)```
`-* Using the :func:`tuple` builtin: ``tuple()`` or ``tuple(iterable)```
`+* Using the :func:`tuple` built-in: ``tuple()`` or ``tuple(iterable)```
` `
` Note that the parentheses are optional (except in the empty tuple case, or`
` when needed to avoid syntactic ambiguity). It is actually the comma which`
` `
` The :class:`range` type represents an immutable sequence of numbers and is`
` commonly used for looping a specific number of times. Instances are created`
`-using the :func:`range` builtin.`
`+using the :func:`range` built-in.`
` `
` For positive indices with results between the defined ``start`` and ``stop```
` values, integers within the range are determined by the formula:`
` Text Sequence Type --- :class:`str``
` ===================================`
` `
`-.. TODO: clean up this section based on the restructure`
`-`
` .. index::`
`    object: string`
`    object: bytes`
` including supported escape sequences, and the ``r`` ("raw") prefix that`
` disables most escape sequence processing.`
` `
`-There is no mutable string type, but :class:`io.StringIO` can be used to`
`-efficiently construct strings from multiple fragments.`
`+Strings may also be created from other objects with the :func:`str` built-in.`
`+`
`+Since there is no separate "character" type, indexing a string produces`
`+strings of length 1. That is, for a non-empty string *s*, ``s[0] == s[0:1]``.`
`+`
`+There is also no mutable string type, but :meth:`str.join` or`
`+:class:`io.StringIO` can be used to efficiently construct strings from`
`+multiple fragments.`
` `
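Illustrative note, not part of the patch: a minimal sketch of the two claims in the added text above, namely that indexing a string yields a length-1 string and that `str.join` or `io.StringIO` can assemble a string from fragments. The variable names are made up for the example.

```python
import io

s = "spam"
# Indexing and single-item slicing both give a string of length 1.
assert s[0] == s[0:1] == "s"
assert isinstance(s[0], str) and len(s[0]) == 1

fragments = ["built", "-", "in"]

# Option 1: collect the fragments and join them once at the end.
joined = "".join(fragments)

# Option 2: write the fragments into an in-memory text buffer.
buf = io.StringIO()
for fragment in fragments:
    buf.write(fragment)
streamed = buf.getvalue()

print(joined, streamed)
```

Both approaches avoid repeated `+=` on an immutable string, which is the usual motivation for mentioning them.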
` `
` .. _string-methods:`
` Binary Sequence Types --- :class:`bytes`, :class:`bytearray`, :class:`memoryview``
` =================================================================================`
` `
`-.. TODO: clean up this section based on the restructure`
`-`
`-`
`-Bytes and bytearray objects contain single bytes -- the former is immutable`
`-while the latter is a mutable sequence.  Bytes objects can be constructed the`
`-constructor, :func:`bytes`, and from literals; use a ``b`` prefix with normal`
`-string syntax: ``b'xyzzy'``.  To construct byte arrays, use the`
`-:func:`bytearray` function.`
`-`
`-While string objects are sequences of characters (represented by strings of`
`-length 1), bytes and bytearray objects are sequences of *integers* (between 0`
`-and 255), representing the ASCII value of single bytes.  That means that for`
`-a bytes or bytearray object *b*, ``b[0]`` will be an integer, while`
`-``b[0:1]`` will be a bytes or bytearray object of length 1.  The`
`-representation of bytes objects uses the literal format (``b'...'``) since it`
`-is generally more useful than e.g. ``bytes([50, 19, 100])``.  You can always`
`-convert a bytes object into a list of integers using ``list(b)``.`
`-`
`-Also, while in previous Python versions, byte strings and Unicode strings`
`-could be exchanged for each other rather freely (barring encoding issues),`
`-strings and bytes are now completely separate concepts.  There's no implicit`
`-en-/decoding if you pass an object of the wrong type.  A string always`
`-compares unequal to a bytes or bytearray object.`
`-`
`+.. index::`
`+   object: bytes`
`+   object: bytearray`
`+   object: memoryview`
`+   module: array`
`+`
`+The core built-in types for manipulating binary data are :class:`bytes` and`
`+:class:`bytearray`. They are supported by :class:`memoryview` which uses`
`+the buffer protocol to access the memory of other binary objects without`
`+needing to make a copy.`
`+`
`+The :mod:`array` module supports efficient storage of basic data types like`
`+32-bit integers and IEEE 754 double-precision floating point values.`
`+`
`+.. _typebytes:`
`+`
`+Bytes`
`+-----`
`+`
`+.. index:: object: bytes`
`+`
`+Bytes objects are immutable sequences of single bytes. Since many major`
`+binary protocols are based on the ASCII text encoding, bytes objects offer`
`+several methods that are only valid when working with ASCII compatible`
`+data and are closely related to string objects in a variety of other ways.`
`+`
`+Firstly, the syntax for bytes literals is largely the same as that for string`
`+literals, except that a ``b`` prefix is added:`
`+`
`+* Single quotes: ``b'still allows embedded "double" quotes'```
`+* Double quotes: ``b"still allows embedded 'single' quotes"```
`+* Triple quoted: ``b'''3 single quotes'''``, ``b"""3 double quotes"""```
`+`
`+Only ASCII characters are permitted in bytes literals (regardless of the`
`+declared source code encoding). Any binary values over 127 must be entered`
`+into bytes literals using the appropriate escape sequence.`
`+`
`+As with string literals, bytes literals may also use an ``r`` prefix to disable`
`+processing of escape sequences. See :ref:`strings` for more about the various`
`+forms of bytes literal, including supported escape sequences.`
`+`
`+While bytes literals and representations are based on ASCII text, bytes`
`+objects actually behave like immutable sequences of integers, with each`
`+value in the sequence restricted such that ``0 <= x < 256`` (attempts to`
`+violate this restriction will trigger :exc:`ValueError`). This is done`
`+deliberately to emphasise that while many binary formats include ASCII based`
`+elements and can be usefully manipulated with some text-oriented algorithms,`
`+this is not generally the case for arbitrary binary data (blindly applying`
`+text processing algorithms to binary data formats that are not ASCII`
`+compatible will usually lead to data corruption).`
`+`
`+In addition to the literal forms, bytes objects can be created in a number of`
`+other ways:`
`+`
`+* A zero-filled bytes object of a specified length: ``bytes(10)```
`+* From an iterable of integers: ``bytes(range(20))```
`+* Copying existing binary data via the buffer protocol:  ``bytes(obj)```
`+`
`+Since bytes objects are sequences of integers, for a bytes object *b*,`
`+``b[0]`` will be an integer, while ``b[0:1]`` will be a bytes object of`
`+length 1.  (This contrasts with text strings, where both indexing and`
`+slicing will produce a string of length 1.)`
`+`
`+The representation of bytes objects uses the literal format (``b'...'``)`
`+since it is often more useful than e.g. ``bytes([46, 46, 46])``.  You can`
`+always convert a bytes object into a list of integers using ``list(b)``.`
`+`
`+Note for Python 2.x users: In the Python 2.x series, a variety of implicit`
`+conversions between 8-bit strings (the closest thing 2.x offers to a built-in`
`+binary data type) and Unicode strings were permitted. This was a backwards`
`+compatibility workaround to account for the fact that Python originally only`
`+supported 8-bit text, and Unicode text was a later addition. In Python 3.x,`
`+those implicit conversions are gone - conversions between 8-bit binary data`
`+and Unicode text must be explicit, and bytes and string objects will always`
`+compare unequal.`
`+`
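Illustrative note, not part of the patch: a short sketch of the creation forms and the integer-sequence behaviour described in the added Bytes section. The names are hypothetical; the values mirror the examples used in the text.

```python
# The creation forms listed in the added text.
zeros = bytes(10)                      # ten zero bytes
counted = bytes(range(20))             # from an iterable of integers
copied = bytes(bytearray(b"Hi!"))      # copy via the buffer protocol

b = b"..."
first = b[0]        # indexing gives an integer
head = b[0:1]       # slicing gives a bytes object of length 1
as_list = list(b)   # every item as an integer

# Values outside 0 <= x < 256 are rejected.
try:
    bytes([256])
except ValueError as exc:
    error = exc

print(first, head, as_list)
```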
`+`
`+.. _typebytearray:`
`+`
`+Bytearray Objects`
`+-----------------`
` `
` .. index:: object: bytearray`
` `
`-List and bytearray objects support additional operations that allow in-place`
`-modification of the object.  Other mutable sequence types (when added to the`
`-language) should also support these operations.  Strings and tuples are`
`-immutable sequence types: such objects cannot be modified once created. The`
`-following operations are defined on mutable sequence types (where *x* is an`
`-arbitrary object).`
`-`
`-Note that while lists allow their items to be of any type, bytearray object`
`-"items" are all integers in the range 0 <= x < 256.`
`+:class:`bytearray` objects are a mutable counterpart to :class:`bytes``
`+objects. There is no dedicated literal syntax for bytearray objects; instead`
`+they are always created by calling the constructor:`
`+`
`+* Creating an empty instance: ``bytearray()```
`+* Creating a zero-filled instance with a given length: ``bytearray(10)```
`+* From an iterable of integers: ``bytearray(range(20))```
`+* Copying existing binary data via the buffer protocol:  ``bytearray(b'Hi!')```
`+`
`+As bytearray objects are mutable, they support the`
`+:ref:`mutable <typesseq-mutable>` sequence operations in addition to the`
`+common bytes and bytearray operations described in :ref:`bytes-methods`.`
` `
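Illustrative note, not part of the patch: a minimal sketch of the mutable sequence operations that the added Bytearray text refers to. The variable names are made up for the example.

```python
data = bytearray(b"Hi!")

data[0] = ord("h")      # item assignment takes an integer in range(256)
data.append(33)         # 33 is the ASCII code for "!"
data.extend(b"??")      # extend from any iterable of integers
del data[-1]            # in-place deletion

# An immutable snapshot can be taken at any point.
frozen = bytes(data)

print(data, frozen)
```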
` `
` .. _bytes-methods:`
` `
`-Bytes and Byte Array Methods`
`-----------------------------`
`+Bytes and Bytearray Operations`
`+------------------------------`
` `
` .. index:: pair: bytes; methods`
`            pair: bytearray; methods`
` `
`-Bytes and bytearray objects, being "strings of bytes", have all methods found on`
`-strings, with the exception of :func:`encode`, :func:`format` and`
`-:func:`isidentifier`, which do not make sense with these types.  For converting`
`-the objects to strings, they have a :func:`decode` method.`
`-`
`-Wherever one of these methods needs to interpret the bytes as characters`
`-(e.g. the :func:`is...` methods), the ASCII character set is assumed.`
`-`
`-.. versionadded:: 3.3`
`-   The functions :func:`count`, :func:`find`, :func:`index`,`
`-   :func:`rfind` and :func:`rindex` have additional semantics compared to`
`-   the corresponding string functions: They also accept an integer in`
`-   range 0 to 255 (a byte) as their first argument.`
`+Both bytes and bytearray objects support the :ref:`common <typesseq-common>``
`+sequence operations. They interoperate not just with operands of the same`
`+type, but with any object that supports the`
`+:ref:`buffer protocol <bufferobjects>`. Due to this flexibility, they can be`
`+freely mixed in operations without causing errors. However, the return type`
`+of the result may depend on the order of operands.`
`+`
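Illustrative note, not part of the patch: a small sketch of the "return type may depend on the order of operands" remark, as observed in CPython. The result of concatenation takes the type of the left operand here.

```python
left_bytes = b"ab" + bytearray(b"cd")       # bytes on the left
left_bytearray = bytearray(b"ab") + b"cd"   # bytearray on the left
with_view = b"ab" + memoryview(b"cd")       # any buffer-protocol object works

print(type(left_bytes).__name__, type(left_bytearray).__name__)
```

The contents compare equal in every case; only the container type differs.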
`+Due to the common use of ASCII text as the basis for binary protocols, bytes`
`+and bytearray objects provide almost all methods found on text strings, with`
`+the exceptions of:`
`+`
`+* :meth:`str.encode` (which converts text strings to bytes objects)`
`+* :meth:`str.format` and :meth:`str.format_map` (which are used to format`
`+  text for display to users)`
`+* :meth:`str.isidentifier`, :meth:`str.isnumeric`, :meth:`str.isdecimal`,`
`+  :meth:`str.isprintable` (which are used to check various properties of`
`+  text strings which are not typically applicable to binary protocols).`
`+`
`+All other string methods are supported, although sometimes with slight`
`+differences in functionality and semantics (as described below).`
` `
` .. note::`
` `
`    The methods on bytes and bytearray objects don't accept strings as their`
`    arguments, just as the methods on strings don't accept bytes as their`
`-   arguments.  For example, you have to write ::`
`+   arguments.  For example, you have to write::`
` `
`       a = "abc"`
`       b = a.replace("a", "f")`
` `
`-   and ::`
`+   and::`
` `
`       a = b"abc"`
`       b = a.replace(b"a", b"f")`
` `
`+Whenever a bytes or bytearray method needs to interpret the bytes as`
`+characters (e.g. the :meth:`is...` methods, :meth:`split`, :meth:`strip`),`
`+the ASCII character set is assumed (text strings use Unicode semantics).`
`+`
`+.. note::`
`+   Using these ASCII based methods to manipulate binary data that is not`
`+   stored in an ASCII based format may lead to data corruption.`
`+`
`+The search operations (:keyword:`in`, :meth:`count`, :meth:`find`,`
`+:meth:`index`, :meth:`rfind` and :meth:`rindex`) all accept integers`
`+in the range 0 to 255 as well as bytes and bytearray sequences.`
`+`
`+.. versionchanged:: 3.3`
`+   All of the search methods accept an integer in range 0 to 255 (a byte) as`
`+   their first argument, not just containment testing.`
`+`
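Illustrative note, not part of the patch: a sketch of the search operations taking either an integer or a bytes-like subsequence, assuming Python 3.3 or later as stated in the ``versionchanged`` note. The sample payload is made up.

```python
payload = b"GET /index.html"
slash = ord("/")                         # 47

contains_int = slash in payload          # containment with an integer
contains_seq = b"index" in payload       # containment with a subsequence

pos_int = payload.find(slash)            # integer argument
pos_seq = payload.find(b"/")             # bytes argument, same position
last_t = payload.rindex(ord("t"))        # searches from the right
t_count = payload.count(bytearray(b"t")) # bytearray argument also accepted

print(pos_int, pos_seq, last_t, t_count)
```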
`+`
`+Each bytes and bytearray instance provides a :meth:`decode` convenience`
`+method that is the inverse of :meth:`str.encode`:`
` `
` .. method:: bytes.decode(encoding="utf-8", errors="strict")`
`             bytearray.decode(encoding="utf-8", errors="strict")`
`    .. versionchanged:: 3.1`
`       Added support for keyword arguments.`
` `
`-`
`-The bytes and bytearray types have an additional class method:`
`+Since 2 hexadecimal digits correspond precisely to a single byte, hexadecimal`
`+numbers are a commonly used format for describing binary data. Accordingly,`
`+the bytes and bytearray types have an additional class method to read data in`
`+that format:`
` `
` .. classmethod:: bytes.fromhex(string)`
`                  bytearray.fromhex(string)`
`    decoding the given string object.  The string must contain two hexadecimal`
`    digits per byte, spaces are ignored.`
` `
`-   >>> bytes.fromhex('f0 f1f2  ')`
`-   b'\xf0\xf1\xf2'`
`+   >>> bytes.fromhex('2Ef0 F1f2  ')`
`+   b'.\xf0\xf1\xf2'`
` `
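Illustrative note, not part of the patch: a brief sketch tying together ``decode`` as the inverse of ``str.encode`` and the ``fromhex`` class method on both types. The sample text is an arbitrary choice for the example.

```python
text = "café"
encoded = text.encode("utf-8")       # str -> bytes
decoded = encoded.decode("utf-8")    # bytes -> str, the inverse step

# Hex digits are case-insensitive and spaces are ignored.
raw = bytes.fromhex("2Ef0 F1f2  ")
mutable_raw = bytearray.fromhex("2ef0f1f2")

print(encoded, decoded, raw)
```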
` `
` The maketrans and translate methods differ in semantics from the versions`
` `
` .. _typememoryview:`
` `
`-memoryview type`
`----------------`
`+Memory Views`
`+------------`
` `
` :class:`memoryview` objects allow Python code to access the internal data`
` of an object that supports the :ref:`buffer protocol <bufferobjects>` without`