Commits

Hakan Ardo committed 3594446 Merge

hg merge

  • Participants
  • Parent commits a6f23c0, 753627d

Comments (0)

Files changed (61)

File lib-python/modified-2.7/heapq.py

+# -*- coding: latin-1 -*-
+
+"""Heap queue algorithm (a.k.a. priority queue).
+
+Heaps are arrays for which a[k] <= a[2*k+1] and a[k] <= a[2*k+2] for
+all k, counting elements from 0.  For the sake of comparison,
+non-existing elements are considered to be infinite.  The interesting
+property of a heap is that a[0] is always its smallest element.
+
+Usage:
+
+heap = []            # creates an empty heap
+heappush(heap, item) # pushes a new item on the heap
+item = heappop(heap) # pops the smallest item from the heap
+item = heap[0]       # smallest item on the heap without popping it
+heapify(x)           # transforms list into a heap, in-place, in linear time
+item = heapreplace(heap, item) # pops and returns smallest item, and adds
+                               # new item; the heap size is unchanged
+
+Our API differs from textbook heap algorithms as follows:
+
+- We use 0-based indexing.  This makes the relationship between the
+  index for a node and the indexes for its children slightly less
+  obvious, but is more suitable since Python uses 0-based indexing.
+
+- Our heappop() method returns the smallest item, not the largest.
+
+These two make it possible to view the heap as a regular Python list
+without surprises: heap[0] is the smallest item, and heap.sort()
+maintains the heap invariant!
+"""
+
+# Original code by Kevin O'Connor, augmented by Tim Peters and Raymond Hettinger
+
+__about__ = """Heap queues
+
+[explanation by Fran�ois Pinard]
+
+Heaps are arrays for which a[k] <= a[2*k+1] and a[k] <= a[2*k+2] for
+all k, counting elements from 0.  For the sake of comparison,
+non-existing elements are considered to be infinite.  The interesting
+property of a heap is that a[0] is always its smallest element.
+
+The strange invariant above is meant to be an efficient memory
+representation for a tournament.  The numbers below are `k', not a[k]:
+
+                                   0
+
+                  1                                 2
+
+          3               4                5               6
+
+      7       8       9       10      11      12      13      14
+
+    15 16   17 18   19 20   21 22   23 24   25 26   27 28   29 30
+
+
+In the tree above, each cell `k' is topping `2*k+1' and `2*k+2'.  In
+an usual binary tournament we see in sports, each cell is the winner
+over the two cells it tops, and we can trace the winner down the tree
+to see all opponents s/he had.  However, in many computer applications
+of such tournaments, we do not need to trace the history of a winner.
+To be more memory efficient, when a winner is promoted, we try to
+replace it by something else at a lower level, and the rule becomes
+that a cell and the two cells it tops contain three different items,
+but the top cell "wins" over the two topped cells.
+
+If this heap invariant is protected at all time, index 0 is clearly
+the overall winner.  The simplest algorithmic way to remove it and
+find the "next" winner is to move some loser (let's say cell 30 in the
+diagram above) into the 0 position, and then percolate this new 0 down
+the tree, exchanging values, until the invariant is re-established.
+This is clearly logarithmic on the total number of items in the tree.
+By iterating over all items, you get an O(n ln n) sort.
+
+A nice feature of this sort is that you can efficiently insert new
+items while the sort is going on, provided that the inserted items are
+not "better" than the last 0'th element you extracted.  This is
+especially useful in simulation contexts, where the tree holds all
+incoming events, and the "win" condition means the smallest scheduled
+time.  When an event schedule other events for execution, they are
+scheduled into the future, so they can easily go into the heap.  So, a
+heap is a good structure for implementing schedulers (this is what I
+used for my MIDI sequencer :-).
+
+Various structures for implementing schedulers have been extensively
+studied, and heaps are good for this, as they are reasonably speedy,
+the speed is almost constant, and the worst case is not much different
+than the average case.  However, there are other representations which
+are more efficient overall, yet the worst cases might be terrible.
+
+Heaps are also very useful in big disk sorts.  You most probably all
+know that a big sort implies producing "runs" (which are pre-sorted
+sequences, which size is usually related to the amount of CPU memory),
+followed by a merging passes for these runs, which merging is often
+very cleverly organised[1].  It is very important that the initial
+sort produces the longest runs possible.  Tournaments are a good way
+to that.  If, using all the memory available to hold a tournament, you
+replace and percolate items that happen to fit the current run, you'll
+produce runs which are twice the size of the memory for random input,
+and much better for input fuzzily ordered.
+
+Moreover, if you output the 0'th item on disk and get an input which
+may not fit in the current tournament (because the value "wins" over
+the last output value), it cannot fit in the heap, so the size of the
+heap decreases.  The freed memory could be cleverly reused immediately
+for progressively building a second heap, which grows at exactly the
+same rate the first heap is melting.  When the first heap completely
+vanishes, you switch heaps and start a new run.  Clever and quite
+effective!
+
+In a word, heaps are useful memory structures to know.  I use them in
+a few applications, and I think it is good to keep a `heap' module
+around. :-)
+
+--------------------
+[1] The disk balancing algorithms which are current, nowadays, are
+more annoying than clever, and this is a consequence of the seeking
+capabilities of the disks.  On devices which cannot seek, like big
+tape drives, the story was quite different, and one had to be very
+clever to ensure (far in advance) that each tape movement will be the
+most effective possible (that is, will best participate at
+"progressing" the merge).  Some tapes were even able to read
+backwards, and this was also used to avoid the rewinding time.
+Believe me, real good tape sorts were quite spectacular to watch!
+From all times, sorting has always been a Great Art! :-)
+"""
+
+__all__ = ['heappush', 'heappop', 'heapify', 'heapreplace', 'merge',
+           'nlargest', 'nsmallest', 'heappushpop']
+
+from itertools import islice, repeat, count, imap, izip, tee, chain
+from operator import itemgetter
+import bisect
+
+def heappush(heap, item):
+    """Push item onto heap, maintaining the heap invariant."""
+    heap.append(item)
+    _siftdown(heap, 0, len(heap)-1)
+
+def heappop(heap):
+    """Pop the smallest item off the heap, maintaining the heap invariant."""
+    lastelt = heap.pop()    # raises appropriate IndexError if heap is empty
+    if heap:
+        returnitem = heap[0]
+        heap[0] = lastelt
+        _siftup(heap, 0)
+    else:
+        returnitem = lastelt
+    return returnitem
+
+def heapreplace(heap, item):
+    """Pop and return the current smallest value, and add the new item.
+
+    This is more efficient than heappop() followed by heappush(), and can be
+    more appropriate when using a fixed-size heap.  Note that the value
+    returned may be larger than item!  That constrains reasonable uses of
+    this routine unless written as part of a conditional replacement:
+
+        if item > heap[0]:
+            item = heapreplace(heap, item)
+    """
+    returnitem = heap[0]    # raises appropriate IndexError if heap is empty
+    heap[0] = item
+    _siftup(heap, 0)
+    return returnitem
+
+def heappushpop(heap, item):
+    """Fast version of a heappush followed by a heappop."""
+    if heap and heap[0] < item:
+        item, heap[0] = heap[0], item
+        _siftup(heap, 0)
+    return item
+
+def heapify(x):
+    """Transform list into a heap, in-place, in O(len(heap)) time."""
+    n = len(x)
+    # Transform bottom-up.  The largest index there's any point to looking at
+    # is the largest with a child index in-range, so must have 2*i + 1 < n,
+    # or i < (n-1)/2.  If n is even = 2*j, this is (2*j-1)/2 = j-1/2 so
+    # j-1 is the largest, which is n//2 - 1.  If n is odd = 2*j+1, this is
+    # (2*j+1-1)/2 = j so j-1 is the largest, and that's again n//2-1.
+    for i in reversed(xrange(n//2)):
+        _siftup(x, i)
+
+def nlargest(n, iterable):
+    """Find the n largest elements in a dataset.
+
+    Equivalent to:  sorted(iterable, reverse=True)[:n]
+    """
+    if n < 0: # for consistency with the c impl
+        return []
+    it = iter(iterable)
+    result = list(islice(it, n))
+    if not result:
+        return result
+    heapify(result)
+    _heappushpop = heappushpop
+    for elem in it:
+        _heappushpop(result, elem)
+    result.sort(reverse=True)
+    return result
+
+def nsmallest(n, iterable):
+    """Find the n smallest elements in a dataset.
+
+    Equivalent to:  sorted(iterable)[:n]
+    """
+    if n < 0: # for consistency with the c impl
+        return []
+    if hasattr(iterable, '__len__') and n * 10 <= len(iterable):
+        # For smaller values of n, the bisect method is faster than a minheap.
+        # It is also memory efficient, consuming only n elements of space.
+        it = iter(iterable)
+        result = sorted(islice(it, 0, n))
+        if not result:
+            return result
+        insort = bisect.insort
+        pop = result.pop
+        los = result[-1]    # los --> Largest of the nsmallest
+        for elem in it:
+            if los <= elem:
+                continue
+            insort(result, elem)
+            pop()
+            los = result[-1]
+        return result
+    # An alternative approach manifests the whole iterable in memory but
+    # saves comparisons by heapifying all at once.  Also, saves time
+    # over bisect.insort() which has O(n) data movement time for every
+    # insertion.  Finding the n smallest of an m length iterable requires
+    #    O(m) + O(n log m) comparisons.
+    h = list(iterable)
+    heapify(h)
+    return map(heappop, repeat(h, min(n, len(h))))
+
+# 'heap' is a heap at all indices >= startpos, except possibly for pos.  pos
+# is the index of a leaf with a possibly out-of-order value.  Restore the
+# heap invariant.
+def _siftdown(heap, startpos, pos):
+    newitem = heap[pos]
+    # Follow the path to the root, moving parents down until finding a place
+    # newitem fits.
+    while pos > startpos:
+        parentpos = (pos - 1) >> 1
+        parent = heap[parentpos]
+        if newitem < parent:
+            heap[pos] = parent
+            pos = parentpos
+            continue
+        break
+    heap[pos] = newitem
+
+# The child indices of heap index pos are already heaps, and we want to make
+# a heap at index pos too.  We do this by bubbling the smaller child of
+# pos up (and so on with that child's children, etc) until hitting a leaf,
+# then using _siftdown to move the oddball originally at index pos into place.
+#
+# We *could* break out of the loop as soon as we find a pos where newitem <=
+# both its children, but turns out that's not a good idea, and despite that
+# many books write the algorithm that way.  During a heap pop, the last array
+# element is sifted in, and that tends to be large, so that comparing it
+# against values starting from the root usually doesn't pay (= usually doesn't
+# get us out of the loop early).  See Knuth, Volume 3, where this is
+# explained and quantified in an exercise.
+#
+# Cutting the # of comparisons is important, since these routines have no
+# way to extract "the priority" from an array element, so that intelligence
+# is likely to be hiding in custom __cmp__ methods, or in array elements
+# storing (priority, record) tuples.  Comparisons are thus potentially
+# expensive.
+#
+# On random arrays of length 1000, making this change cut the number of
+# comparisons made by heapify() a little, and those made by exhaustive
+# heappop() a lot, in accord with theory.  Here are typical results from 3
+# runs (3 just to demonstrate how small the variance is):
+#
+# Compares needed by heapify     Compares needed by 1000 heappops
+# --------------------------     --------------------------------
+# 1837 cut to 1663               14996 cut to 8680
+# 1855 cut to 1659               14966 cut to 8678
+# 1847 cut to 1660               15024 cut to 8703
+#
+# Building the heap by using heappush() 1000 times instead required
+# 2198, 2148, and 2219 compares:  heapify() is more efficient, when
+# you can use it.
+#
+# The total compares needed by list.sort() on the same lists were 8627,
+# 8627, and 8632 (this should be compared to the sum of heapify() and
+# heappop() compares):  list.sort() is (unsurprisingly!) more efficient
+# for sorting.
+
+def _siftup(heap, pos):
+    endpos = len(heap)
+    startpos = pos
+    newitem = heap[pos]
+    # Bubble up the smaller child until hitting a leaf.
+    childpos = 2*pos + 1    # leftmost child position
+    while childpos < endpos:
+        # Set childpos to index of smaller child.
+        rightpos = childpos + 1
+        if rightpos < endpos and not heap[childpos] < heap[rightpos]:
+            childpos = rightpos
+        # Move the smaller child up.
+        heap[pos] = heap[childpos]
+        pos = childpos
+        childpos = 2*pos + 1
+    # The leaf at pos is empty now.  Put newitem there, and bubble it up
+    # to its final resting place (by sifting its parents down).
+    heap[pos] = newitem
+    _siftdown(heap, startpos, pos)
+
+# If available, use C implementation
+try:
+    from _heapq import *
+except ImportError:
+    pass
+
+def merge(*iterables):
+    '''Merge multiple sorted inputs into a single sorted output.
+
+    Similar to sorted(itertools.chain(*iterables)) but returns a generator,
+    does not pull the data into memory all at once, and assumes that each of
+    the input streams is already sorted (smallest to largest).
+
+    >>> list(merge([1,3,5,7], [0,2,4,8], [5,10,15,20], [], [25]))
+    [0, 1, 2, 3, 4, 5, 5, 7, 8, 10, 15, 20, 25]
+
+    '''
+    _heappop, _heapreplace, _StopIteration = heappop, heapreplace, StopIteration
+
+    h = []
+    h_append = h.append
+    for itnum, it in enumerate(map(iter, iterables)):
+        try:
+            next = it.next
+            h_append([next(), itnum, next])
+        except _StopIteration:
+            pass
+    heapify(h)
+
+    while 1:
+        try:
+            while 1:
+                v, itnum, next = s = h[0]   # raises IndexError when h is empty
+                yield v
+                s[0] = next()               # raises StopIteration when exhausted
+                _heapreplace(h, s)          # restore heap condition
+        except _StopIteration:
+            _heappop(h)                     # remove empty iterator
+        except IndexError:
+            return
+
+# Extend the implementations of nsmallest and nlargest to use a key= argument
+_nsmallest = nsmallest
+def nsmallest(n, iterable, key=None):
+    """Find the n smallest elements in a dataset.
+
+    Equivalent to:  sorted(iterable, key=key)[:n]
+    """
+    # Short-cut for n==1 is to use min() when len(iterable)>0
+    if n == 1:
+        it = iter(iterable)
+        head = list(islice(it, 1))
+        if not head:
+            return []
+        if key is None:
+            return [min(chain(head, it))]
+        return [min(chain(head, it), key=key)]
+
+    # When n>=size, it's faster to use sort()
+    try:
+        size = len(iterable)
+    except (TypeError, AttributeError):
+        pass
+    else:
+        if n >= size:
+            return sorted(iterable, key=key)[:n]
+
+    # When key is none, use simpler decoration
+    if key is None:
+        it = izip(iterable, count())                        # decorate
+        result = _nsmallest(n, it)
+        return map(itemgetter(0), result)                   # undecorate
+
+    # General case, slowest method
+    in1, in2 = tee(iterable)
+    it = izip(imap(key, in1), count(), in2)                 # decorate
+    result = _nsmallest(n, it)
+    return map(itemgetter(2), result)                       # undecorate
+
+_nlargest = nlargest
+def nlargest(n, iterable, key=None):
+    """Find the n largest elements in a dataset.
+
+    Equivalent to:  sorted(iterable, key=key, reverse=True)[:n]
+    """
+
+    # Short-cut for n==1 is to use max() when len(iterable)>0
+    if n == 1:
+        it = iter(iterable)
+        head = list(islice(it, 1))
+        if not head:
+            return []
+        if key is None:
+            return [max(chain(head, it))]
+        return [max(chain(head, it), key=key)]
+
+    # When n>=size, it's faster to use sort()
+    try:
+        size = len(iterable)
+    except (TypeError, AttributeError):
+        pass
+    else:
+        if n >= size:
+            return sorted(iterable, key=key, reverse=True)[:n]
+
+    # When key is none, use simpler decoration
+    if key is None:
+        it = izip(iterable, count(0,-1))                    # decorate
+        result = _nlargest(n, it)
+        return map(itemgetter(0), result)                   # undecorate
+
+    # General case, slowest method
+    in1, in2 = tee(iterable)
+    it = izip(imap(key, in1), count(0,-1), in2)             # decorate
+    result = _nlargest(n, it)
+    return map(itemgetter(2), result)                       # undecorate
+
+if __name__ == "__main__":
+    # Simple sanity test
+    heap = []
+    data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
+    for item in data:
+        heappush(heap, item)
+    sort = []
+    while heap:
+        sort.append(heappop(heap))
+    print sort
+
+    import doctest
+    doctest.testmod()

File lib-python/modified-2.7/test/test_heapq.py

         self.assertFalse(sys.modules['heapq'] is self.module)
         self.assertTrue(hasattr(self.module.heapify, 'func_code'))
 
+    def test_islice_protection(self):
+        m = self.module
+        self.assertFalse(m.nsmallest(-1, [1]))
+        self.assertFalse(m.nlargest(-1, [1]))
+
 
 class TestHeapC(TestHeap):
     module = c_heapq

File pypy/doc/project-ideas.rst

 projects, or anything else in PyPy, pop up on IRC or write to us on the
 `mailing list`_.
 
+Make big integers faster
+-------------------------
+
+PyPy's implementation of the Python ``long`` type is slower than CPython's.
+Find out why and optimize them.
+
 Numpy improvements
 ------------------
 

File pypy/jit/backend/llsupport/descr.py

     def is_float_field(self):
         return self.fielddescr.is_float_field()
 
+    def sort_key(self):
+        return self.fielddescr.sort_key()
+
     def repr_of_descr(self):
         return '<InteriorFieldDescr %s>' % self.fielddescr.repr_of_descr()
 

File pypy/jit/backend/llsupport/regalloc.py

File contents unchanged.

File pypy/jit/backend/test/test_ll_random.py

             v, S = from_[i][:2]
             if not isinstance(S, type):
                 continue
-            if (isinstance(S, lltype.Array) and
-                isinstance(S.OF, lltype.Struct) == array_of_structs):
+            if ((isinstance(S, lltype.Array) and
+                 isinstance(S.OF, lltype.Struct)) == array_of_structs):
                 ptrvars.append((v, S))
         return ptrvars
 
                     dic[fieldname] = getattr(p, fieldname)
         else:
             assert isinstance(S, lltype.Array)
-            for i in range(len(p)):
-                dic[i] = p[i]
+            if isinstance(S.OF, lltype.Struct):
+                for i in range(len(p)):
+                    item = p[i]
+                    s1 = {}
+                    for fieldname in S.OF._names:
+                        s1[fieldname] = getattr(item, fieldname)
+                    dic[i] = s1
+            else:
+                for i in range(len(p)):
+                    dic[i] = p[i]
         return dic
 
     def print_loop_prebuilt(self, names, writevar, s):
                                          array_of_structs=True)
         array = v.getref(lltype.Ptr(A))
         v_index = builder.get_index(len(array), r)
-        names = A.OF._names
-        if names[0] == 'parent':
-            names = names[1:]
-        name = r.choice(names)
+        name = r.choice(A.OF._names)
         descr = builder.cpu.interiorfielddescrof(A, name)
         descr._random_info = 'cpu.interiorfielddescrof(%s, %r)' % (A.OF._name,
                                                                    name)
                 break
         builder.do(self.opnum, [v, w], descr)
 
-class SetInteriorFieldOperation(GetFieldOperation):
+class SetInteriorFieldOperation(GetInteriorFieldOperation):
     def produce_into(self, builder, r):
-        import pdb
-        pdb.set_trace()
-        v, descr, TYPE = self.field_descr(builder, r)
+        v, v_index, descr, TYPE = self.field_descr(builder, r)
         while True:
             if r.random() < 0.3:
                 w = ConstInt(r.random_integer())
                 w = r.choice(builder.intvars)
             if rffi.cast(lltype.Signed, rffi.cast(TYPE, w.value)) == w.value:
                 break
-        builder.do(self.opnum, [v, w], descr)
+        builder.do(self.opnum, [v, v_index, w], descr)
 
 class NewOperation(test_random.AbstractOperation):
     def size_descr(self, builder, S):
     OPERATIONS.append(GetFieldOperation(rop.GETFIELD_GC))
     OPERATIONS.append(GetInteriorFieldOperation(rop.GETINTERIORFIELD_GC))
     OPERATIONS.append(SetFieldOperation(rop.SETFIELD_GC))
-    #OPERATIONS.append(SetInteriorFieldOperation(rop.SETINTERIORFIELD_GC))
+    OPERATIONS.append(SetInteriorFieldOperation(rop.SETINTERIORFIELD_GC))
     OPERATIONS.append(NewOperation(rop.NEW))
     OPERATIONS.append(NewOperation(rop.NEW_WITH_VTABLE))
 

File pypy/jit/backend/test/test_random.py

             for name, value in fields.items():
                 if isinstance(name, str):
                     setattr(container, name, value)
+                elif isinstance(value, dict):
+                    item = container.getitem(name)
+                    for key1, value1 in value.items():
+                        setattr(item, key1, value1)
                 else:
                     container.setitem(name, value)
 

File pypy/jit/backend/x86/assembler.py

     genop_getarrayitem_gc_pure = genop_getarrayitem_gc
     genop_getarrayitem_raw = genop_getarrayitem_gc
 
+    def _get_interiorfield_addr(self, temp_loc, index_loc, itemsize_loc,
+                                base_loc, ofs_loc):
+        assert isinstance(itemsize_loc, ImmedLoc)
+        if isinstance(index_loc, ImmedLoc):
+            temp_loc = imm(index_loc.value * itemsize_loc.value)
+        else:
+            # XXX should not use IMUL in most cases
+            assert isinstance(temp_loc, RegLoc)
+            assert isinstance(index_loc, RegLoc)
+            self.mc.IMUL_rri(temp_loc.value, index_loc.value,
+                             itemsize_loc.value)
+        assert isinstance(ofs_loc, ImmedLoc)
+        return AddressLoc(base_loc, temp_loc, 0, ofs_loc.value)
+
     def genop_getinteriorfield_gc(self, op, arglocs, resloc):
-        base_loc, ofs_loc, itemsize_loc, fieldsize_loc, index_loc, sign_loc = arglocs
-        # XXX should not use IMUL in most cases
-        self.mc.IMUL(index_loc, itemsize_loc)
-        src_addr = AddressLoc(base_loc, index_loc, 0, ofs_loc.value)
+        (base_loc, ofs_loc, itemsize_loc, fieldsize_loc,
+            index_loc, sign_loc) = arglocs
+        src_addr = self._get_interiorfield_addr(resloc, index_loc,
+                                                itemsize_loc, base_loc,
+                                                ofs_loc)
         self.load_from_mem(resloc, src_addr, fieldsize_loc, sign_loc)
 
 
         self.save_into_mem(dest_addr, value_loc, size_loc)
 
     def genop_discard_setinteriorfield_gc(self, op, arglocs):
-        base_loc, ofs_loc, itemsize_loc, fieldsize_loc, index_loc, value_loc = arglocs
-        # XXX should not use IMUL in most cases
-        self.mc.IMUL(index_loc, itemsize_loc)
-        dest_addr = AddressLoc(base_loc, index_loc, 0, ofs_loc.value)
+        (base_loc, ofs_loc, itemsize_loc, fieldsize_loc,
+            index_loc, temp_loc, value_loc) = arglocs
+        dest_addr = self._get_interiorfield_addr(temp_loc, index_loc,
+                                                 itemsize_loc, base_loc,
+                                                 ofs_loc)
         self.save_into_mem(dest_addr, value_loc, fieldsize_loc)
 
     def genop_discard_setarrayitem_gc(self, op, arglocs):

File pypy/jit/backend/x86/regalloc.py

         t = self._unpack_interiorfielddescr(op.getdescr())
         ofs, itemsize, fieldsize, _ = t
         args = op.getarglist()
-        tmpvar = TempBox()
-        base_loc = self.rm.make_sure_var_in_reg(op.getarg(0), args)
-        index_loc = self.rm.force_result_in_reg(tmpvar, op.getarg(1),
-                                                args)
-        # we're free to modify index now
-        value_loc = self.make_sure_var_in_reg(op.getarg(2), args)
-        self.possibly_free_vars(args)
-        self.rm.possibly_free_var(tmpvar)
+        if fieldsize.value == 1:
+            need_lower_byte = True
+        else:
+            need_lower_byte = False
+        box_base, box_index, box_value = args
+        base_loc = self.rm.make_sure_var_in_reg(box_base, args)
+        index_loc = self.rm.make_sure_var_in_reg(box_index, args)
+        value_loc = self.make_sure_var_in_reg(box_value, args,
+                                              need_lower_byte=need_lower_byte)
+        # If 'index_loc' is not an immediate, then we need a 'temp_loc' that
+        # is a register whose value will be destroyed.  It's fine to destroy
+        # the same register as 'index_loc', but not the other ones.
+        self.rm.possibly_free_var(box_index)
+        if not isinstance(index_loc, ImmedLoc):
+            tempvar = TempBox()
+            temp_loc = self.rm.force_allocate_reg(tempvar, [box_base,
+                                                            box_value])
+            self.rm.possibly_free_var(tempvar)
+        else:
+            temp_loc = None
+        self.rm.possibly_free_var(box_base)
+        self.possibly_free_var(box_value)
         self.PerformDiscard(op, [base_loc, ofs, itemsize, fieldsize,
-                                 index_loc, value_loc])
+                                 index_loc, temp_loc, value_loc])
 
     def consider_strsetitem(self, op):
         args = op.getarglist()
         else:
             sign_loc = imm0
         args = op.getarglist()
-        tmpvar = TempBox()
         base_loc = self.rm.make_sure_var_in_reg(op.getarg(0), args)
-        index_loc = self.rm.force_result_in_reg(tmpvar, op.getarg(1),
-                                                args)
-        self.rm.possibly_free_vars_for_op(op)
-        self.rm.possibly_free_var(tmpvar)
-        result_loc = self.force_allocate_reg(op.result)
+        index_loc = self.rm.make_sure_var_in_reg(op.getarg(1), args)
+        # 'base' and 'index' are put in two registers (or one if 'index'
+        # is an immediate).  'result' can be in the same register as
+        # 'index' but must be in a different register than 'base'.
+        self.rm.possibly_free_var(op.getarg(1))
+        result_loc = self.force_allocate_reg(op.result, [op.getarg(0)])
+        self.rm.possibly_free_var(op.getarg(0))
         self.Perform(op, [base_loc, ofs, itemsize, fieldsize,
                           index_loc, sign_loc], result_loc)
 

File pypy/jit/codewriter/jtransform.py

         if self._is_gc(op.args[0]):
             return op
 
+    def rewrite_op_cast_opaque_ptr(self, op):
+        # None causes the result of this op to get aliased to op.args[0]
+        return [SpaceOperation('mark_opaque_ptr', op.args, None), None]
+
     def rewrite_op_force_cast(self, op):
         v_arg = op.args[0]
         v_result = op.result

File pypy/jit/codewriter/test/test_jtransform.py

                         varoftype(lltype.Signed))
     tr = Transformer(None, None)
     raises(NotImplementedError, tr.rewrite_operation, op)
+
+def test_cast_opaque_ptr():
+    S = lltype.GcStruct("S", ("x", lltype.Signed))
+    v1 = varoftype(lltype.Ptr(S))
+    v2 = varoftype(lltype.Ptr(rclass.OBJECT))
+
+    op = SpaceOperation('cast_opaque_ptr', [v1], v2)
+    tr = Transformer()
+    [op1, op2] = tr.rewrite_operation(op)
+    assert op1.opname == 'mark_opaque_ptr'
+    assert op1.args == [v1]
+    assert op1.result is None
+    assert op2 is None

File pypy/jit/metainterp/blackhole.py

     @arguments("r", "r", returns="i")
     def bhimpl_instance_ptr_ne(a, b):
         return a != b
-    @arguments("r", returns="r")
-    def bhimpl_cast_opaque_ptr(a):
-        return a
     @arguments("r", returns="i")
     def bhimpl_cast_ptr_to_int(a):
         i = lltype.cast_ptr_to_int(a)
         ll_assert((i & 1) == 1, "bhimpl_cast_int_to_ptr: not an odd int")
         return lltype.cast_int_to_ptr(llmemory.GCREF, i)
 
+    @arguments("r")
+    def bhimpl_mark_opaque_ptr(a):
+        pass
+
     @arguments("i", returns="i")
     def bhimpl_int_copy(a):
         return a

File pypy/jit/metainterp/heapcache.py

         self.clear_caches(opnum, descr, argboxes)
 
     def mark_escaped(self, opnum, argboxes):
-        idx = 0
         if opnum == rop.SETFIELD_GC:
             assert len(argboxes) == 2
             box, valuebox = argboxes
                 self.dependencies.setdefault(box, []).append(valuebox)
             else:
                 self._escape(valuebox)
-        # GETFIELD_GC doesn't escape it's argument
-        elif opnum != rop.GETFIELD_GC:
+        elif opnum == rop.SETARRAYITEM_GC:
+            assert len(argboxes) == 3
+            box, indexbox, valuebox = argboxes
+            if self.is_unescaped(box) and self.is_unescaped(valuebox):
+                self.dependencies.setdefault(box, []).append(valuebox)
+            else:
+                self._escape(valuebox)
+        # GETFIELD_GC, MARK_OPAQUE_PTR, PTR_EQ, and PTR_NE don't escape their
+        # arguments
+        elif (opnum != rop.GETFIELD_GC and
+              opnum != rop.MARK_OPAQUE_PTR and
+              opnum != rop.PTR_EQ and
+              opnum != rop.PTR_NE):
+            idx = 0
             for box in argboxes:
                 # setarrayitem_gc don't escape its first argument
                 if not (idx == 0 and opnum in [rop.SETARRAYITEM_GC]):
                 self._escape(dep)
 
     def clear_caches(self, opnum, descr, argboxes):
-        if opnum == rop.SETFIELD_GC:
-            return
-        if opnum == rop.SETARRAYITEM_GC:
-            return
-        if opnum == rop.SETFIELD_RAW:
-            return
-        if opnum == rop.SETARRAYITEM_RAW:
+        if (opnum == rop.SETFIELD_GC or
+            opnum == rop.SETARRAYITEM_GC or
+            opnum == rop.SETFIELD_RAW or
+            opnum == rop.SETARRAYITEM_RAW or
+            opnum == rop.SETINTERIORFIELD_GC or
+            opnum == rop.COPYSTRCONTENT or
+            opnum == rop.COPYUNICODECONTENT):
             return
         if rop._OVF_FIRST <= opnum <= rop._OVF_LAST:
             return
         if opnum == rop.CALL or opnum == rop.CALL_LOOPINVARIANT:
             effectinfo = descr.get_extra_info()
             ef = effectinfo.extraeffect
-            if ef == effectinfo.EF_LOOPINVARIANT or \
-               ef == effectinfo.EF_ELIDABLE_CANNOT_RAISE or \
-               ef == effectinfo.EF_ELIDABLE_CAN_RAISE:
+            if (ef == effectinfo.EF_LOOPINVARIANT or
+                ef == effectinfo.EF_ELIDABLE_CANNOT_RAISE or
+                ef == effectinfo.EF_ELIDABLE_CAN_RAISE):
                 return
             # A special case for ll_arraycopy, because it is so common, and its
             # effects are so well defined.

File pypy/jit/metainterp/history.py

     def view(self, **kwds):
         pass
 
+    def clear(self):
+        pass
+
 class Stats(object):
     """For tests."""
 
         self.aborted_keys = []
         self.invalidated_token_numbers = set()
 
+    def clear(self):
+        del self.loops[:]
+        del self.locations[:]
+        del self.aborted_keys[:]
+        self.invalidated_token_numbers.clear()
+        self.compiled_count = 0
+        self.enter_count = 0
+        self.aborted_count = 0
+
     def set_history(self, history):
         self.operations = history.operations
 

File pypy/jit/metainterp/optimizeopt/optimizer.py

     def setfield(self, ofs, value):
         raise NotImplementedError
 
+    def getlength(self):
+        raise NotImplementedError
+
     def getitem(self, index):
         raise NotImplementedError
 
-    def getlength(self):
+    def setitem(self, index, value):
         raise NotImplementedError
 
-    def setitem(self, index, value):
+    def getinteriorfield(self, index, ofs, default):
+        raise NotImplementedError
+
+    def setinteriorfield(self, index, ofs, value):
         raise NotImplementedError
 
 
             return self.optimizer.optpure.has_pure_result(opnum, args, descr)
         return False
 
-    def get_pure_result(self, key):    
+    def get_pure_result(self, key):
         if self.optimizer.optpure:
             return self.optimizer.optpure.get_pure_result(key)
         return None
-    
+
     def setup(self):
         pass
 
 
     def replace_op(self, old_op, new_op):
         # XXX: Do we want to cache indexes to prevent search?
-        i = len(self._newoperations) 
+        i = len(self._newoperations)
         while i > 0:
             i -= 1
             if self._newoperations[i] is old_op:

File pypy/jit/metainterp/optimizeopt/rewrite.py

                                         args = [op.getarg(0), ConstInt(highest_bit(val))])
         self.emit_operation(op)
 
-    def optimize_CAST_OPAQUE_PTR(self, op):
+    def optimize_MARK_OPAQUE_PTR(self, op):
         value = self.getvalue(op.getarg(0))
         self.optimizer.opaque_pointers[value] = True
-        self.make_equal_to(op.result, value)
 
     def optimize_CAST_PTR_TO_INT(self, op):
         self.pure(rop.CAST_INT_TO_PTR, [op.result], op.getarg(0))

File pypy/jit/metainterp/optimizeopt/simplify.py

         #     but it's a bit hard to implement robustly if heap.py is also run
         pass
 
-    optimize_CAST_OPAQUE_PTR = optimize_VIRTUAL_REF
+    def optimize_MARK_OPAQUE_PTR(self, op):
+        pass
 
 
 dispatch_opt = make_dispatcher_method(OptSimplify, 'optimize_',

File pypy/jit/metainterp/optimizeopt/test/test_optimizebasic.py

         """
         self.optimize_loop(ops, expected)
 
-
     def test_virtual_constant_isnonnull(self):
         ops = """
         [i0]
         """
         self.optimize_loop(ops, expected)
 
+    def test_virtual_array_of_struct(self):
+        ops = """
+        [f0, f1, f2, f3]
+        p0 = new_array(2, descr=complexarraydescr)
+        setinteriorfield_gc(p0, 0, f0, descr=complexrealdescr)
+        setinteriorfield_gc(p0, 0, f1, descr=compleximagdescr)
+        setinteriorfield_gc(p0, 1, f2, descr=complexrealdescr)
+        setinteriorfield_gc(p0, 1, f3, descr=compleximagdescr)
+        f4 = getinteriorfield_gc(p0, 0, descr=complexrealdescr)
+        f5 = getinteriorfield_gc(p0, 1, descr=complexrealdescr)
+        f6 = float_mul(f4, f5)
+        f7 = getinteriorfield_gc(p0, 0, descr=compleximagdescr)
+        f8 = getinteriorfield_gc(p0, 1, descr=compleximagdescr)
+        f9 = float_mul(f7, f8)
+        f10 = float_add(f6, f9)
+        finish(f10)
+        """
+        expected = """
+        [f0, f1, f2, f3]
+        f4 = float_mul(f0, f2)
+        f5 = float_mul(f1, f3)
+        f6 = float_add(f4, f5)
+        finish(f6)
+        """
+        self.optimize_loop(ops, expected)
+
+    def test_virtual_array_of_struct_forced(self):
+        ops = """
+        [f0, f1]
+        p0 = new_array(1, descr=complexarraydescr)
+        setinteriorfield_gc(p0, 0, f0, descr=complexrealdescr)
+        setinteriorfield_gc(p0, 0, f1, descr=compleximagdescr)
+        f2 = getinteriorfield_gc(p0, 0, descr=complexrealdescr)
+        f3 = getinteriorfield_gc(p0, 0, descr=compleximagdescr)
+        f4 = float_mul(f2, f3)
+        i0 = escape(f4, p0)
+        finish(i0)
+        """
+        expected = """
+        [f0, f1]
+        f2 = float_mul(f0, f1)
+        p0 = new_array(1, descr=complexarraydescr)
+        setinteriorfield_gc(p0, 0, f0, descr=complexrealdescr)
+        setinteriorfield_gc(p0, 0, f1, descr=compleximagdescr)
+        i0 = escape(f2, p0)
+        finish(i0)
+        """
+        self.optimize_loop(ops, expected)
+
     def test_nonvirtual_1(self):
         ops = """
         [i]
         class FakeCallInfoCollection:
             def callinfo_for_oopspec(self, oopspecindex):
                 calldescrtype = type(LLtypeMixin.strequaldescr)
+                effectinfotype = type(LLtypeMixin.strequaldescr.get_extra_info())
                 for value in LLtypeMixin.__dict__.values():
                     if isinstance(value, calldescrtype):
                         extra = value.get_extra_info()
-                        if extra and extra.oopspecindex == oopspecindex:
+                        if (extra and isinstance(extra, effectinfotype) and
+                            extra.oopspecindex == oopspecindex):
                             # returns 0 for 'func' in this test
                             return value, 0
                 raise AssertionError("not found: oopspecindex=%d" %

File pypy/jit/metainterp/optimizeopt/test/test_optimizeopt.py

         class FakeCallInfoCollection:
             def callinfo_for_oopspec(self, oopspecindex):
                 calldescrtype = type(LLtypeMixin.strequaldescr)
+                effectinfotype = type(LLtypeMixin.strequaldescr.get_extra_info())
                 for value in LLtypeMixin.__dict__.values():
                     if isinstance(value, calldescrtype):
                         extra = value.get_extra_info()
-                        if extra and extra.oopspecindex == oopspecindex:
+                        if (extra and isinstance(extra, effectinfotype) and
+                            extra.oopspecindex == oopspecindex):
                             # returns 0 for 'func' in this test
                             return value, 0
                 raise AssertionError("not found: oopspecindex=%d" %

File pypy/jit/metainterp/optimizeopt/test/test_util.py

              EffectInfo([], [arraydescr], [], [arraydescr],
                         oopspecindex=EffectInfo.OS_ARRAYCOPY))
 
+
+    # array of structs (complex data)
+    complexarray = lltype.GcArray(
+        lltype.Struct("complex",
+            ("real", lltype.Float),
+            ("imag", lltype.Float),
+        )
+    )
+    complexarraydescr = cpu.arraydescrof(complexarray)
+    complexrealdescr = cpu.interiorfielddescrof(complexarray, "real")
+    compleximagdescr = cpu.interiorfielddescrof(complexarray, "imag")
+
     for _name, _os in [
         ('strconcatdescr',               'OS_STR_CONCAT'),
         ('strslicedescr',                'OS_STR_SLICE'),
 ##    def get_class_of_box(self, box):
 ##        root = box.getref(ootype.ROOT)
 ##        return ootype.classof(root)
-    
+
 ##    cpu = runner.OOtypeCPU(None)
 ##    NODE = ootype.Instance('NODE', ootype.ROOT, {})
 ##    NODE._add_fields({'value': ootype.Signed,

File pypy/jit/metainterp/optimizeopt/virtualize.py

     def _make_virtual(self, modifier):
         return modifier.make_varray(self.arraydescr)
 
+class VArrayStructValue(AbstractVirtualValue):
+    def __init__(self, arraydescr, size, keybox, source_op=None):
+        AbstractVirtualValue.__init__(self, keybox, source_op)
+        self.arraydescr = arraydescr
+        self._items = [{} for _ in xrange(size)]
+
+    def getlength(self):
+        return len(self._items)
+
+    def getinteriorfield(self, index, ofs, default):
+        return self._items[index].get(ofs, default)
+
+    def setinteriorfield(self, index, ofs, itemvalue):
+        assert isinstance(itemvalue, optimizer.OptValue)
+        self._items[index][ofs] = itemvalue
+
+    def _really_force(self, optforce):
+        assert self.source_op is not None
+        if not we_are_translated():
+            self.source_op.name = 'FORCE ' + self.source_op.name
+        optforce.emit_operation(self.source_op)
+        self.box = box = self.source_op.result
+        for index in range(len(self._items)):
+            for descr, value in self._items[index].iteritems():
+                subbox = value.force_box(optforce)
+                op = ResOperation(rop.SETINTERIORFIELD_GC,
+                    [box, ConstInt(index), subbox], None, descr=descr
+                )
+                optforce.emit_operation(op)
+
+    def _get_list_of_descrs(self):
+        descrs = []
+        for item in self._items:
+            item_descrs = item.keys()
+            sort_descrs(item_descrs)
+            descrs.append(item_descrs)
+        return descrs
+
+    def get_args_for_fail(self, modifier):
+        if self.box is None and not modifier.already_seen_virtual(self.keybox):
+            itemdescrs = self._get_list_of_descrs()
+            itemboxes = []
+            for i in range(len(self._items)):
+                for descr in itemdescrs[i]:
+                    itemboxes.append(self._items[i][descr].get_key_box())
+            modifier.register_virtual_fields(self.keybox, itemboxes)
+            for i in range(len(self._items)):
+                for descr in itemdescrs[i]:
+                    self._items[i][descr].get_args_for_fail(modifier)
+
+    def force_at_end_of_preamble(self, already_forced, optforce):
+        if self in already_forced:
+            return self
+        already_forced[self] = self
+        for index in range(len(self._items)):
+            for descr in self._items[index].keys():
+                self._items[index][descr] = self._items[index][descr].force_at_end_of_preamble(already_forced, optforce)
+        return self
+
+    def _make_virtual(self, modifier):
+        return modifier.make_varraystruct(self.arraydescr, self._get_list_of_descrs())
+
+
 class OptVirtualize(optimizer.Optimization):
     "Virtualize objects until they escape."
 
         return vvalue
 
     def make_varray(self, arraydescr, size, box, source_op=None):
-        constvalue = self.new_const_item(arraydescr)
-        vvalue = VArrayValue(arraydescr, constvalue, size, box, source_op)
+        if arraydescr.is_array_of_structs():
+            vvalue = VArrayStructValue(arraydescr, size, box, source_op)
+        else:
+            constvalue = self.new_const_item(arraydescr)
+            vvalue = VArrayValue(arraydescr, constvalue, size, box, source_op)
         self.make_equal_to(box, vvalue)
         return vvalue
 
 
     def optimize_NEW_ARRAY(self, op):
         sizebox = self.get_constant_box(op.getarg(0))
-        # For now we can't make arrays of structs virtual.
-        if sizebox is not None and not op.getdescr().is_array_of_structs():
+        if sizebox is not None:
             # if the original 'op' did not have a ConstInt as argument,
             # build a new one with the ConstInt argument
             if not isinstance(op.getarg(0), ConstInt):
         value.ensure_nonnull()
         self.emit_operation(op)
 
+    def optimize_GETINTERIORFIELD_GC(self, op):
+        value = self.getvalue(op.getarg(0))
+        if value.is_virtual():
+            indexbox = self.get_constant_box(op.getarg(1))
+            if indexbox is not None:
+                descr = op.getdescr()
+                fieldvalue = value.getinteriorfield(
+                    indexbox.getint(), descr, None
+                )
+                if fieldvalue is None:
+                    fieldvalue = self.new_const(descr)
+                self.make_equal_to(op.result, fieldvalue)
+                return
+        value.ensure_nonnull()
+        self.emit_operation(op)
+
+    def optimize_SETINTERIORFIELD_GC(self, op):
+        value = self.getvalue(op.getarg(0))
+        if value.is_virtual():
+            indexbox = self.get_constant_box(op.getarg(1))
+            if indexbox is not None:
+                value.setinteriorfield(
+                    indexbox.getint(), op.getdescr(), self.getvalue(op.getarg(2))
+                )
+                return
+        value.ensure_nonnull()
+        self.emit_operation(op)
+
 
 dispatch_opt = make_dispatcher_method(OptVirtualize, 'optimize_',
         default=OptVirtualize.emit_operation)

File pypy/jit/metainterp/optimizeopt/virtualstate.py

 
 class AbstractVirtualStateInfo(resume.AbstractVirtualInfo):
     position = -1
-    
+
     def generalization_of(self, other, renum, bad):
         raise NotImplementedError
 
                 s.debug_print(indent + "    ", seen, bad)
         else:
             debug_print(indent + "    ...")
-                
+
 
     def debug_header(self, indent):
         raise NotImplementedError
             bad[self] = True
             bad[other] = True
             return False
+
+        assert isinstance(other, AbstractVirtualStructStateInfo)
         assert len(self.fielddescrs) == len(self.fieldstate)
         assert len(other.fielddescrs) == len(other.fieldstate)
         if len(self.fielddescrs) != len(other.fielddescrs):
             bad[self] = True
             bad[other] = True
             return False
-        
+
         for i in range(len(self.fielddescrs)):
             if other.fielddescrs[i] is not self.fielddescrs[i]:
                 bad[self] = True
     def _enum(self, virtual_state):
         for s in self.fieldstate:
             s.enum(virtual_state)
-        
-        
+
+
 class VirtualStateInfo(AbstractVirtualStructStateInfo):
     def __init__(self, known_class, fielddescrs):
         AbstractVirtualStructStateInfo.__init__(self, fielddescrs)
 
     def debug_header(self, indent):
         debug_print(indent + 'VirtualStateInfo(%d):' % self.position)
-        
+
 class VStructStateInfo(AbstractVirtualStructStateInfo):
     def __init__(self, typedescr, fielddescrs):
         AbstractVirtualStructStateInfo.__init__(self, fielddescrs)
         self.typedescr = typedescr
 
-    def _generalization_of(self, other):        
+    def _generalization_of(self, other):
         if not isinstance(other, VStructStateInfo):
             return False
         if self.typedescr is not other.typedescr:
 
     def debug_header(self, indent):
         debug_print(indent + 'VStructStateInfo(%d):' % self.position)
-        
+
 class VArrayStateInfo(AbstractVirtualStateInfo):
     def __init__(self, arraydescr):
         self.arraydescr = arraydescr
             bad[other] = True
             return False
         renum[self.position] = other.position
-        if not isinstance(other, VArrayStateInfo):
-            bad[self] = True
-            bad[other] = True
-            return False
-        if self.arraydescr is not other.arraydescr:
+        if not self._generalization_of(other):
             bad[self] = True
             bad[other] = True
             return False
                 return False
         return True
 
+    def _generalization_of(self, other):
+        return (isinstance(other, VArrayStateInfo) and
+            self.arraydescr is other.arraydescr)
+
     def enum_forced_boxes(self, boxes, value, optimizer):
         assert isinstance(value, virtualize.VArrayValue)
         assert value.is_virtual()
 
     def debug_header(self, indent):
         debug_print(indent + 'VArrayStateInfo(%d):' % self.position)
-            
-        
+
+class VArrayStructStateInfo(AbstractVirtualStateInfo):
+    def __init__(self, arraydescr, fielddescrs):
+        self.arraydescr = arraydescr
+        self.fielddescrs = fielddescrs
+
+    def generalization_of(self, other, renum, bad):
+        assert self.position != -1
+        if self.position in renum:
+            if renum[self.position] == other.position:
+                return True
+            bad[self] = True
+            bad[other] = True
+            return False
+        renum[self.position] = other.position
+        if not self._generalization_of(other):
+            bad[self] = True
+            bad[other] = True
+            return False
+
+        assert isinstance(other, VArrayStructStateInfo)
+        if len(self.fielddescrs) != len(other.fielddescrs):
+            bad[self] = True
+            bad[other] = True
+            return False
+
+        p = 0
+        for i in range(len(self.fielddescrs)):
+            if len(self.fielddescrs[i]) != len(other.fielddescrs[i]):
+                bad[self] = True
+                bad[other] = True
+                return False
+            for j in range(len(self.fielddescrs[i])):
+                if self.fielddescrs[i][j] is not other.fielddescrs[i][j]:
+                    bad[self] = True
+                    bad[other] = True
+                    return False
+                if not self.fieldstate[p].generalization_of(other.fieldstate[p],
+                                                            renum, bad):
+                    bad[self] = True
+                    bad[other] = True
+                    return False
+                p += 1
+        return True
+
+    def _generalization_of(self, other):
+        return (isinstance(other, VArrayStructStateInfo) and
+            self.arraydescr is other.arraydescr)
+
+    def _enum(self, virtual_state):
+        for s in self.fieldstate:
+            s.enum(virtual_state)
+
+    def enum_forced_boxes(self, boxes, value, optimizer):
+        assert isinstance(value, virtualize.VArrayStructValue)
+        assert value.is_virtual()
+        p = 0
+        for i in range(len(self.fielddescrs)):
+            for j in range(len(self.fielddescrs[i])):
+                v = value._items[i][self.fielddescrs[i][j]]
+                s = self.fieldstate[p]
+                if s.position > self.position:
+                    s.enum_forced_boxes(boxes, v, optimizer)
+                p += 1
+
+    def debug_header(self, indent):
+        debug_print(indent + 'VArrayStructStateInfo(%d):' % self.position)
+
+
 class NotVirtualStateInfo(AbstractVirtualStateInfo):
     def __init__(self, value):
         self.known_class = value.known_class
             op = ResOperation(rop.GUARD_CLASS, [box, self.known_class], None)
             extra_guards.append(op)
             return
-        
+
         if self.level == LEVEL_NONNULL and \
                other.level == LEVEL_UNKNOWN and \
                isinstance(box, BoxPtr) and \
             op = ResOperation(rop.GUARD_NONNULL, [box], None)
             extra_guards.append(op)
             return
-        
+
         if self.level == LEVEL_UNKNOWN and \
                other.level == LEVEL_UNKNOWN and \
                isinstance(box, BoxInt) and \
                     op = ResOperation(rop.GUARD_TRUE, [res], None)
                     extra_guards.append(op)
             return
-        
+
         # Remaining cases are probably not interesting
         raise InvalidLoop
         if self.level == LEVEL_CONSTANT:
     def enum_forced_boxes(self, boxes, value, optimizer):
         if self.level == LEVEL_CONSTANT:
             return
-        assert 0 <= self.position_in_notvirtuals 
+        assert 0 <= self.position_in_notvirtuals
         boxes[self.position_in_notvirtuals] = value.force_box(optimizer)
 
     def _enum(self, virtual_state):
         lb = ''
         if self.lenbound:
             lb = ', ' + self.lenbound.bound.__repr__()
-        
+
         debug_print(indent + mark + 'NotVirtualInfo(%d' % self.position +
                     ', ' + l + ', ' + self.intbound.__repr__() + lb + ')')
 
                 return False
         return True
 
-    def generate_guards(self, other, args, cpu, extra_guards):        
+    def generate_guards(self, other, args, cpu, extra_guards):
         assert len(self.state) == len(other.state) == len(args)
         renum = {}
         for i in range(len(self.state)):
                     inputargs.append(box)
 
         assert None not in inputargs
-            
+
         return inputargs
 
     def debug_print(self, hdr='', bad=None):
 
     def register_virtual_fields(self, keybox, fieldboxes):
         self.fieldboxes[keybox] = fieldboxes
-        
+
     def already_seen_virtual(self, keybox):
         return keybox in self.fieldboxes
 
     def make_varray(self, arraydescr):
         return VArrayStateInfo(arraydescr)
 
+    def make_varraystruct(self, arraydescr, fielddescrs):
+        return VArrayStructStateInfo(arraydescr, fielddescrs)
+
 class BoxNotProducable(Exception):
     pass
 
             else: # Low priority
                 lo -= 1
         return alts
-            
+
     def renamed(self, box):
         if box in self.rename:
             return self.rename[box]
         return box
-    
+
     def add_to_short(self, box, op):
         if op:
             op = op.clone()
             self.optimizer.make_equal_to(newbox, value)
         else:
             self.short_boxes[box] = op
-        
+
     def produce_short_preamble_box(self, box):
         if box in self.short_boxes:
-            return 
+            return
         if isinstance(box, Const):
-            return 
+            return
         if box in self.potential_ops:
             ops = self.prioritized_alternatives(box)
             produced_one = False
             else:
                 debug_print(logops.repr_of_arg(box) + ': None')
         debug_stop('jit-short-boxes')
-        
+
     def operations(self):
         if not we_are_translated(): # For tests
             ops = self.short_boxes.values()
         if not isinstance(oldbox, Const) and newbox not in self.short_boxes:
             self.short_boxes[newbox] = self.short_boxes[oldbox]
         self.aliases[newbox] = oldbox
-        
+
     def original(self, box):
         while box in self.aliases:
             box = self.aliases[box]

File pypy/jit/metainterp/optimizeopt/vstring.py

             for value in self._chars:
                 value.get_args_for_fail(modifier)
 
-    def FIXME_enum_forced_boxes(self, boxes, already_seen):
-        key = self.get_key_box()
-        if key in already_seen:
-            return
-        already_seen[key] = None
-        if self.box is None:
-            for box in self._chars:
-                box.enum_forced_boxes(boxes, already_seen)
-        else:
-            boxes.append(self.box)
-
     def _make_virtual(self, modifier):
         return modifier.make_vstrplain(self.mode is mode_unicode)
 
             self.left.get_args_for_fail(modifier)
             self.right.get_args_for_fail(modifier)
 
-    def FIXME_enum_forced_boxes(self, boxes, already_seen):
-        key = self.get_key_box()
-        if key in already_seen:
-            return
-        already_seen[key] = None
-        if self.box is None:
-            self.left.enum_forced_boxes(boxes, already_seen)
-            self.right.enum_forced_boxes(boxes, already_seen)
-            self.lengthbox = None
-        else:
-            boxes.append(self.box)
-
     def _make_virtual(self, modifier):
         return modifier.make_vstrconcat(self.mode is mode_unicode)
 
             self.vstart.get_args_for_fail(modifier)
             self.vlength.get_args_for_fail(modifier)
 
-    def FIXME_enum_forced_boxes(self, boxes, already_seen):
-        key = self.get_key_box()
-        if key in already_seen:
-            return
-        already_seen[key] = None
-        if self.box is None:
-            self.vstr.enum_forced_boxes(boxes, already_seen)
-            self.vstart.enum_forced_boxes(boxes, already_seen)
-            self.vlength.enum_forced_boxes(boxes, already_seen)
-        else:
-            boxes.append(self.box)
-
     def _make_virtual(self, modifier):
         return modifier.make_vstrslice(self.mode is mode_unicode)
 

File pypy/jit/metainterp/pyjitpl.py

         return self.execute(rop.PTR_EQ, box, history.CONST_NULL)
 
     @arguments("box")
-    def opimpl_cast_opaque_ptr(self, box):
-        return self.execute(rop.CAST_OPAQUE_PTR, box)
+    def opimpl_mark_opaque_ptr(self, box):
+        return self.execute(rop.MARK_OPAQUE_PTR, box)
 
     @arguments("box")
     def _opimpl_any_return(self, box):

File pypy/jit/metainterp/resoperation.py

     'PTR_NE/2b',
     'INSTANCE_PTR_EQ/2b',
     'INSTANCE_PTR_NE/2b',
-    'CAST_OPAQUE_PTR/1b',
     #
     'ARRAYLEN_GC/1d',
     'STRLEN/1',
     'FORCE_TOKEN/0',
     'VIRTUAL_REF/2',         # removed before it's passed to the backend
     'READ_TIMESTAMP/0',
+    'MARK_OPAQUE_PTR/1b',
     '_NOSIDEEFFECT_LAST', # ----- end of no_side_effect operations -----
 
     'SETARRAYITEM_GC/3d',

File pypy/jit/metainterp/resume.py

         self.numberings = {}
         self.cached_boxes = {}
         self.cached_virtuals = {}
-    
+
         self.nvirtuals = 0
         self.nvholes = 0
         self.nvreused = 0
     def make_varray(self, arraydescr):
         return VArrayInfo(arraydescr)
 
+    def make_varraystruct(self, arraydescr, fielddescrs):
+        return VArrayStructInfo(arraydescr, fielddescrs)
+
     def make_vstrplain(self, is_unicode=False):
         if is_unicode:
             return VUniPlainInfo()
                 virtuals[num] = vinfo
 
         if self._invalidation_needed(len(liveboxes), nholes):
-            memo.clear_box_virtual_numbers()           
+            memo.clear_box_virtual_numbers()
 
     def _invalidation_needed(self, nliveboxes, nholes):
         memo = self.memo
 
     def debug_prints(self):
         raise NotImplementedError
-        
+
 class AbstractVirtualStructInfo(AbstractVirtualInfo):
     def __init__(self, fielddescrs):
         self.fielddescrs = fielddescrs
         for i in self.fieldnums:
             debug_print("\t\t", str(untag(i)))
 
+
+class VArrayStructInfo(AbstractVirtualInfo):
+    def __init__(self, arraydescr, fielddescrs):
+        self.arraydescr = arraydescr
+        self.fielddescrs = fielddescrs
+
+    def debug_prints(self):
+        debug_print("\tvarraystructinfo", self.arraydescr)
+        for i in self.fieldnums:
+            debug_print("\t\t", str(untag(i)))
+
+    @specialize.argtype(1)
+    def allocate(self, decoder, index):
+        array = decoder.allocate_array(self.arraydescr, len(self.fielddescrs))
+        decoder.virtuals_cache[index] = array
+        p = 0
+        for i in range(len(self.fielddescrs)):
+            for j in range(len(self.fielddescrs[i])):
+                decoder.setinteriorfield(i, self.fielddescrs[i][j], array, self.fieldnums[p])
+                p += 1
+        return array
+
+
 class VStrPlainInfo(AbstractVirtualInfo):
     """Stands for the string made out of the characters of all fieldnums."""
 
         self.metainterp.execute_and_record(rop.SETFIELD_GC, descr,
                                            structbox, fieldbox)
 
+    def setinteriorfield(self, index, descr, array, fieldnum):
+        if descr.is_pointer_field():
+            kind = REF
+        elif descr.is_float_field():
+            kind = FLOAT
+        else:
+            kind = INT
+        fieldbox = self.decode_box(fieldnum, kind)
+        self.metainterp.execute_and_record(rop.SETINTERIORFIELD_GC, descr,
+                                           array, ConstInt(index), fieldbox)
+
     def setarrayitem_int(self, arraydescr, arraybox, index, fieldnum):
         self._setarrayitem(arraydescr, arraybox, index, fieldnum, INT)
 
             newvalue = self.decode_int(fieldnum)
             self.cpu.bh_setfield_gc_i(struct, descr, newvalue)
 
+    def setinteriorfield(self, index, descr, array, fieldnum):
+        if descr.is_pointer_field():
+            newvalue = self.decode_ref(fieldnum)
+            self.cpu.bh_setinteriorfield_gc_r(array, index, descr, newvalue)
+        elif descr.is_float_field():
+            newvalue = self.decode_float(fieldnum)
+            self.cpu.bh_setinteriorfield_gc_f(array, index, descr, newvalue)
+        else:
+            newvalue = self.decode_int(fieldnum)
+            self.cpu.bh_setinteriorfield_gc_i(array, index, descr, newvalue)
+
     def setarrayitem_int(self, arraydescr, array, index, fieldnum):
         newvalue = self.decode_int(fieldnum)
         self.cpu.bh_setarrayitem_gc_i(arraydescr, array, index, newvalue)

File pypy/jit/metainterp/test/test_ajit.py

 from pypy.jit.metainterp.typesystem import LLTypeHelper, OOTypeHelper
 from pypy.jit.metainterp.warmspot import get_stats
 from pypy.jit.metainterp.warmstate import set_future_value
+from pypy.rlib import rerased
 from pypy.rlib.jit import (JitDriver, we_are_jitted, hint, dont_look_inside,
     loop_invariant, elidable, promote, jit_debug, assert_green,
     AssertGreenFailed, unroll_safe, current_trace_length, look_inside_iff,
             d = None
             while n > 0:
                 myjitdriver.jit_merge_point(n=n, d=d)
-                d = {}
+                d = {"q": 1}
                 if n % 2:
                     d["k"] = n
                 else:
                     d["z"] = n
-                n -= len(d)
+                n -= len(d) - d["q"]
             return n
         res = self.meta_interp(f, [10])
         assert res == 0
 
+    def test_virtual_dict_constant_keys(self):
+        myjitdriver = JitDriver(greens = [], reds = ["n"])
+        def g(d):
+            return d["key"] - 1
+
+        def f(n):
+            while n > 0:
+                myjitdriver.jit_merge_point(n=n)
+                n = g({"key": n})
+            return n
+
+        res = self.meta_interp(f, [10])
+        assert res == 0
+        self.check_loops({"int_sub": 1, "int_gt": 1, "guard_true": 1, "jump": 1})
+
+    def test_virtual_opaque_ptr(self):
+        myjitdriver = JitDriver(greens = [], reds = ["n"])
+        erase, unerase = rerased.new_erasing_pair("x")
+        @look_inside_iff(lambda x: isvirtual(x))
+        def g(x):
+            return x[0]
+        def f(n):
+            while n > 0:
+                myjitdriver.jit_merge_point(n=n)
+                x = []
+                y = erase(x)
+                z = unerase(y)
+                z.append(1)
+                n -= g(z)
+            return n
+        res = self.meta_interp(f, [10])
+        assert res == 0
+        self.check_loops({"int_sub": 1, "int_gt": 1, "guard_true": 1, "jump": 1})
+
+    def test_virtual_opaque_dict(self):
+        myjitdriver = JitDriver(greens = [], reds = ["n"])
+        erase, unerase = rerased.new_erasing_pair("x")
+        @look_inside_iff(lambda x: isvirtual(x))
+        def g(x):
+            return x[0]["key"] - 1
+        def f(n):
+            while n > 0:
+                myjitdriver.jit_merge_point(n=n)
+                x = [{}]
+                x[0]["key"] = n
+                x[0]["other key"] = n
+                y = erase(x)
+                z = unerase(y)
+                n = g(x)
+            return n
+        res = self.meta_interp(f, [10])
+        assert res == 0
+        self.check_loops({"int_sub": 1, "int_gt": 1, "guard_true": 1, "jump": 1})
+
 
 
 class TestLLtype(BaseLLtypeTests, LLJitMixin):
         res = self.meta_interp(main, [False, 100, True], taggedpointers=True)
 
     def test_rerased(self):
-        from pypy.rlib.rerased import erase_int, unerase_int, new_erasing_pair
-        eraseX, uneraseX = new_erasing_pair("X")
+        eraseX, uneraseX = rerased.new_erasing_pair("X")
         #
         class X:
             def __init__(self, a, b):
                 e = eraseX(X(i, j))
             else:
                 try:
-                    e = erase_int(i)
+                    e = rerased.erase_int(i)
                 except OverflowError:
                     return -42
             if j & 1:
                 x = uneraseX(e)
                 return x.a - x.b
             else:
-                return unerase_int(e)
+                return rerased.unerase_int(e)
         #
         x = self.interp_operations(f, [-128, 0], taggedpointers=True)
         assert x == -128

File pypy/jit/metainterp/test/test_heapcache.py

         assert h.is_unescaped(box1)
         h.invalidate_caches(rop.SETARRAYITEM_GC, None, [box2, index1, box1])
         assert not h.is_unescaped(box1)
+
+        h = HeapCache()
+        h.new_array(box1, lengthbox1)
+        h.new(box2)
+        assert h.is_unescaped(box1)
+        assert h.is_unescaped(box2)
+        h.invalidate_caches(rop.SETARRAYITEM_GC, None, [box1, lengthbox2, box2])
+        assert h.is_unescaped(box1)
+        assert h.is_unescaped(box2)
+        h.invalidate_caches(
+            rop.CALL, FakeCallDescr(FakeEffektinfo.EF_RANDOM_EFFECTS), [box1]
+        )
+        assert not h.is_unescaped(box1)
+        assert not h.is_unescaped(box2)

File pypy/jit/metainterp/test/test_tracingopts.py

 from pypy.jit.metainterp.test.support import LLJitMixin
 from pypy.rlib import jit
 from pypy.rlib.rarithmetic import ovfcheck
+from pypy.rlib.rstring import StringBuilder
 
 import py
 
         assert res == 4
         self.check_operations_history(int_add_ovf=0)
         res = self.interp_operations(fn, [sys.maxint])
-        assert res == 12
+        assert res == 12
+
+    def test_copy_str_content(self):
+        def fn(n):
+            a = StringBuilder()
+            x = [1]
+            a.append("hello world")
+            return x[0]
+        res = self.interp_operations(fn, [0])
+        assert res == 1
+        self.check_operations_history(getarrayitem_gc=0, getarrayitem_gc_pure=0 )

File pypy/jit/metainterp/warmspot.py

     clear_tcache()
     return jittify_and_run(interp, graph, args, backendopt=backendopt, **kwds)
 
-def jittify_and_run(interp, graph, args, repeat=1,
+def jittify_and_run(interp, graph, args, repeat=1, graph_and_interp_only=False,
                     backendopt=False, trace_limit=sys.maxint,
                     inline=False, loop_longevity=0, retrace_limit=5,
                     function_threshold=4,
         jd.warmstate.set_param_max_retrace_guards(max_retrace_guards)
         jd.warmstate.set_param_enable_opts(enable_opts)
     warmrunnerdesc.finish()
+    if graph_and_interp_only:
+        return interp, graph
     res = interp.eval_graph(graph, args)
     if not kwds.get('translate_support_code', False):
         warmrunnerdesc.metainterp_sd.profiler.finish()
 def get_stats():
     return pyjitpl._warmrunnerdesc.stats
 
+def reset_stats():
+    pyjitpl._warmrunnerdesc.stats.clear()
+
 def get_translator():
     return pyjitpl._warmrunnerdesc.translator
 

File pypy/module/micronumpy/__init__.py

         'empty': 'interp_numarray.zeros',
         'ones':