scalar types should not be iterable.

Issue #10 resolved

Yichao Yu created an issue 2014-05-17

The following code should fail, but on PyPy it actually prints six empty lists.

```
import numpy as np
print(list(np.int8(17)))
print(list(np.int16(17)))
print(list(np.int32(17)))
print(list(np.int64(17)))
print(list(np.float32(17)))
print(list(np.float64(17)))
```

Comments (11)

Yichao Yu reporter
Tested with PyPy 2.4.0-alpha0 from the Arch Linux official repo and numpy master.
- 2014-05-17T23:48:02+00:00
Yichao Yu reporter
Actually, after looking at the np.generic class, I don't really understand why it is not iterable in CPython, since it has the __getitem__ method. Is there some trick in CPython (at the C level?) to make an object with __getitem__ not iterable?
- 2014-06-23T22:27:05+00:00
lilydjwg
@yuyichao Quoting from the doc: "object must be a collection object which supports the iteration protocol (the __iter__() method), or it must support the sequence protocol (the __getitem__() method with integer arguments starting at 0)." As these types raise IndexErrors with index 0, they are not iterables.
- 2014-06-24T03:42:17+00:00
Yichao Yu reporter
@lilydjwg Raising IndexError with index 0 does not make them non-iterable. The following code runs fine on all Python versions I can find (CPython/PyPy, 2/3):
```
class A:
    def __getitem__(self, key):
        raise IndexError

list(A())
```
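For context, here is a minimal sketch of the fallback that makes `list(A())` succeed. When a class defines only `__getitem__`, `iter()` calls it with 0, 1, 2, ... and stops at the first IndexError, so raising at index 0 just yields an empty sequence. The `Squares` class is a made-up illustration and is not part of numpy.

```python
class Squares:
    """Hypothetical class that defines only __getitem__ (no __iter__)."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        # The fallback iterator probes index 0, 1, 2, ... until IndexError.
        if index >= self.n:
            raise IndexError(index)
        return index * index


class A:
    def __getitem__(self, key):
        raise IndexError


squares = list(Squares(4))  # [0, 1, 4, 9]
empty = list(A())           # [] because IndexError is raised at index 0
print(squares, empty)
```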
- 2014-06-24T03:50:21+00:00
Yichao Yu reporter
P.S. str object in python2 does not have __iter__ method but a 0-length str is still iterable.
- 2014-06-24T03:51:43+00:00
lilydjwg
Oops, I was wrong.

In PyObject_GetIter, if __iter__ is not defined, it calls PySequence_Check, which then checks the .tp_as_sequence field of the type. This is NULL for numpy.generic (it has the .tp_as_mapping field to provide __getitem__).
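The decision described above can be modelled roughly in Python. This is only a toy sketch: `FakeType`, `sequence_check` and `get_iter_kind` are hypothetical names invented for illustration, and the fields merely mirror the C slots mentioned in this comment. It is not the real CPython code.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FakeType:
    """Toy stand-in for a C type object (hypothetical, for illustration)."""
    name: str
    tp_iter: Optional[Callable] = None       # filled in when __iter__ exists
    sq_item: Optional[Callable] = None       # models tp_as_sequence->sq_item
    mp_subscript: Optional[Callable] = None  # models tp_as_mapping->mp_subscript


def sequence_check(t):
    # Models PySequence_Check: only the sequence slot is consulted.
    return t.sq_item is not None


def get_iter_kind(t):
    # Models the branch taken in PyObject_GetIter; returns a label only.
    if t.tp_iter is not None:
        return "tp_iter"
    if sequence_check(t):
        return "sequence iterator"
    raise TypeError("'%s' object is not iterable" % t.name)


# A class with __getitem__ reachable through the sequence slot.
python_class = FakeType("A", sq_item=operator.getitem)
# numpy.generic as described above: __getitem__ only via the mapping slot.
numpy_generic = FakeType("numpy.generic", mp_subscript=operator.getitem)

print(get_iter_kind(python_class))
try:
    get_iter_kind(numpy_generic)
except TypeError as exc:
    print(exc)
```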
- 2014-06-24T05:37:52+00:00
Yichao Yu reporter
So it is indeed a feature of the CPython C API. The best workaround I can think of so far is to add the following method to numpy.generic:
```
    @classmethod
    def __iter__(cls):
        raise TypeError("'%s.%s' object is not iterable" %
                        (cls.__module__, cls.__name__))
```
However, this will make isinstance(numpy.int32(1), collections.Iterable) True......
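A minimal sketch of that trade-off, using a hypothetical stand-in class `Scalar` instead of the real numpy.generic. It assumes a modern Python, where the ABC is spelled `collections.abc.Iterable`.

```python
from collections.abc import Iterable  # collections.Iterable on older Pythons


class Scalar:
    """Hypothetical stand-in for numpy.generic with the workaround applied."""

    def __getitem__(self, key):
        raise IndexError("invalid index to scalar variable.")

    @classmethod
    def __iter__(cls):
        raise TypeError("'%s.%s' object is not iterable" %
                        (cls.__module__, cls.__name__))


value = Scalar()

try:
    list(value)
    iteration_failed = False
except TypeError:
    iteration_failed = True

# The ABC only checks that __iter__ exists, not that it works.
looks_iterable = isinstance(value, Iterable)
print(iteration_failed, looks_iterable)
```

Iteration fails with TypeError as intended, yet the isinstance check still reports the object as iterable.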
- 2014-06-24T06:42:16+00:00
lilydjwg
In Python code, when a class that defines __getitem__ is created, type_call in Objects/typeobject.c is called. The address of the as_sequence member of a PyHeapTypeObject is assigned to the class's tp_as_sequence field. The PySequenceMethods struct it points to is initially all zeros, so tp_as_sequence->sq_item is NULL. Then update_one_slot (called from fixup_slot_dispatchers, called from type_new as the type's tp_new field, called from type_call) checks whether __getitem__ is defined. If it is, it assigns the slot_sq_item function to tp_as_sequence->sq_item, which makes PySequence_Check return True.
- 2014-06-24T06:58:18+00:00
Armin Rigo
There is no way in Python, according to the language spec, to have an object with __getitem__ which is not iterable. I suppose that numpy implements that by obscure hacking at the C level. Yichao's is the only workaround. I don't think that collections.Iterable is a big blocker...
- 2014-07-04T12:19:31+00:00
Yichao Yu reporter
LOL.

I guess being collections.Iterable is indeed fine since it is already confusing enough and an object with __getitem__ is technically "iterable" anyway....
- 2014-07-04T23:01:28+00:00
Yichao Yu reporter
- changed status to resolved
Fixed
- 2014-07-08T03:54:51+00:00

Assignee: –

Type: bug

Priority: critical

Status: resolved

Votes: 1

Watchers: 3