Overview

python-split

Functions to split and partition sequences.

Installation

pip install split

Usage

All functions in this module return iterators, and consume input lazily. In the examples below, the results are forced using list and dict.

Chunks of equal size

To partition a sequence into chunks of equal size, use chop:

>>> from split import chop
>>> list(chop(3, range(10)))
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]

If truncate=True keyword argument is given, then sequence length is truncated to a multiple of chunk size, and all chunks have the same size:

>>> list(chop(3, range(10), truncate=True))
[[0, 1, 2], [3, 4, 5], [6, 7, 8]]

Subsequences by a predicate

To split a sequence into two by a given predicate, use partition:

>>> from split import partition
>>> def odd(x): return x%2
>>> map(list, partition(odd, range(5)))
[[1, 3], [0, 2, 4]]

For more general partitioning, use groupby:

>>> [(k, list(i)) for k,i in groupby(lambda x: x%3, range(7))]
[(0, [0, 3, 6]), (1, [1, 4]), (2, [2, 5])]

This function is different from itertools.groupby: it returns only one subsequence iterator per predicate value. Its return value can be converted into dictionary.

When working with very long sequences, consider using predicate_values keyword argument to avoid scanning the entire sequence. For example:

>>> longseq = xrange(int(1e9))
>>> pred = lambda x: x%3
>>> dict(groupby(pred, longseq, predicate_values=(0,1,2)))
{0: <generator object subsequence at 0x301b7d0>,
 1: <generator object subsequence at 0x301b780>,
 2: <generator object subsequence at 0x301b730>}

Breaking on separators

To break a sequence into chunks on some separators, use split. For example, breaking on zero elements:

>>> list(split(0, [1,2,3,0,4,5,0,0,6]))
[[1, 2, 3], [4, 5], [], [6]]

You can use a function as a predicate too:

>>> list(split(lambda x: x==5, range(10)))
[[0, 1, 2, 3, 4], [6, 7, 8, 9]]