Armin Rigo committed 5791a9b Merge

merge heads

Files changed (13)

blog/draft/arm-status-update.rst

+ARM Backend News Update
+=======================
+
+Starting with the good news, we finally merged the ``arm-backend-2`` branch
+into PyPy's main development line. As described in previous_ posts_ the main
+goal of this branch was to add support for ARM processors to PyPy's JIT.  As a
+general byproduct PyPy should now do a better job supporting non-x86
+architectures such as ARM and the in-progress support for PPC64.
+
+On ARM, the JIT requires an ARMv7 processor or newer with a VFP unit targeting
+the ARM application profile. Although this sounds like a strong restriction,
+most of the ARM processors used in mobile devices and development boards are
+ARMv7 (sadly the Raspberry Pi isn't) or newer. Also these are the same
+requirements as those of the Ubuntu ARM port. The non-JIT version might support
+previous architecture versions, but without the JIT it will be slow.
+
+
+Floating Point Support
+----------------------
+
+Support for a floating point unit is optional for ARM processor vendors.
+Because of this there are different calling conventions, which differ in
+whether they require a floating point unit and in how they treat floats.
+The **Procedure Call Standard for the ARM Architecture**
+(`PDF`_) describes in the *base procedure call standard* how parameters are
+passed in processor registers and on the stack when calling a function.
+
+When adding floating points to the mix there are two incompatible procedure
+call standards and three ways of handling floats. Usually they are referred to
+as *soft-float*, *softfp* and *hard-float*. The first two use the core
+registers to pass floating point arguments and do not make any assumptions
+about a floating point unit. The first uses a software based
+float-implementation, while the second can use a floating point unit. The
+third, which is incompatible with the other two, requires a floating point
+unit and uses the coprocessor registers to pass floating point arguments to
+calls. A detailed comparison can be found `here`_.
+
+At the time we started implementing the float support in the ARM backend of the
+JIT, the soft-float calling conventions were the most commonly supported ones
+by most GNU/Linux distributions, so we decided to implement that one first.
+This means that we have to copy floating point values from the VFP to core
+registers and the stack when generating code for a call that involves floating
+point values. Because the soft- and hard-float calling conventions are
+incompatible, PyPy for ARM currently will only work on systems built using
+soft-float.  By now more and more GNU/Linux distributions for ARM are
+supporting hard-floats.  In PyPy there is basic support in the JIT backend for
+the hard-float calling convention, but we seem to have hit an issue with
+ctypes/libffi on ARM that is preventing us from running our tests against the
+hard-float implementation.
+
+
+Testing and Infrastructure
+--------------------------
+
+By now we have an infrastructure that allows us to create cross-translated
+binaries for ARM and to run tests on them. Currently we compile binaries in a
+32-bit Ubuntu 12.04 environment using scratchbox2_ to encapsulate the
+cross-compiler calls. The results can be downloaded and tested from our
+`nightly build server`_. Some documentation on how to cross-translate is
+available in the `PyPy docs`_.
+
+We also have some hardware to run the subset of the PyPy test suite relevant to
+the ARM JIT backend and to run the test suite that tests the translated ARM
+binaries. The nightly tests are run on a Beagleboard-xM_ and an i.MX53_
+versatile board (kindly provided by Michael Foord). Both boards run the ARM
+port of `Ubuntu 12.04 Precise Pangolin`_. The current results for the different
+builders can be seen on the `PyPy buildbot`_. As can be seen there are still
+some issues to be fixed, but we are getting there.
+
+
+Open Topics
+-----------
+In a previous post we mentioned a set of open topics regarding PyPy's ARM
+support. Here is an update on these topics:
+
+Done:
+
+* We are looking for a better way to translate PyPy for ARM than the one
+  described above. I am not sure if there currently is hardware with enough
+  memory to directly translate PyPy on an ARM based system; this would require
+  between 1.5 and 2 GB of memory. A fully QEMU based approach could also work,
+  instead of Scratchbox2 that uses QEMU under the hood.  *The scratchbox2 based
+  approach has given the best results so far. QEMU has proven too unstable
+  to be used as a base for the translation, and the qemu-arm emulation is very
+  slow compared to cross-translating.*
+* Test the JIT on different hardware.
+  *As mentioned we are running nightly tests on a Beagleboard-xM and an i.MX53 board.*
+* Continuous integration: We are looking for a way to run the PyPy test suite
+  to make sure everything works as expected on ARM; here QEMU also might
+  provide an alternative.
+  *As stated above this is now working. We explored using qemu-arm and a
+  chroot to run tests. Although faster than our boards, this was very unstable
+  and crashed randomly, making it unusable for running tests on a regular
+  basis. A fully emulated approach using QEMU might still be worth trying.*
+* Improve the tools, i.e. integrate with jitviewer_.
+
+Long-term open topics/projects for ARM:
+
+* Review the machine code the JIT generates on ARM to see if the
+  instruction selection makes sense for ARM.
+* Build a version that runs on Android.
+* Experiment with the JIT settings to find the optimal thresholds for ARM.
+  This is still open.
+* A long term plan would be to port the backend to the ARMv5 ISA and improve the
+  support for systems without a floating point unit. This would require
+  implementing the ISA, creating different code paths and improving the
+  instruction selection depending on the target architecture.
+
+While we continue to fix the remaining issues you can get a nightly version to
+try PyPy on ARM.
+
+
+.. _Beagleboard-xM: http://beagleboard.org/hardware-xm
+.. _`PDF`: http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042d/IHI0042D_aapcs.pdf
+.. _`PyPy buildbot`: http://buildbot.pypy.org/summary?branch=%3Ctrunk%3E&category=linux-armel
+.. _`PyPy docs`: https://bitbucket.org/pypy/pypy/src/default/pypy/doc/arm.rst
+.. _`Ubuntu 12.04 Precise Pangolin`: https://wiki.ubuntu.com/ARM 
+.. _`here`: http://wiki.debian.org/ArmHardFloatPort/VfpComparison
+.. _`nightly build server`: http://buildbot.pypy.org/nightly/trunk/
+.. _`scratchbox2`: http://maemo.gitorious.org/scratchbox2
+.. _i.MX53: http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=IMX53QSB
+.. _jitviewer: https://bitbucket.org/pypy/jitviewer
+.. _posts: http://morepypy.blogspot.de/2012/02/almost-there-pypys-arm-backend_01.html
+.. _previous: http://morepypy.blogspot.de/2011/01/jit-backend-for-arm-processors.html
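
Editor's illustration for the Floating Point Support section above, not part of the commit: under the base procedure call standard, a 64-bit double travels through 32-bit core registers (or the stack) as two 32-bit words instead of staying in a VFP register. The sketch below only shows that bit-level split in plain Python. The helper names `split_double` and `join_double` are hypothetical, and the code says nothing about how PyPy's backend actually emits these moves.

```python
import struct


def split_double(value):
    """Return the (low, high) 32-bit words of a 64-bit IEEE 754 double."""
    # Pack as a little-endian double, then reinterpret the same 8 bytes
    # as two unsigned 32-bit integers.
    low, high = struct.unpack('<II', struct.pack('<d', value))
    return low, high


def join_double(low, high):
    """Rebuild the double from its two 32-bit words (the callee's view)."""
    return struct.unpack('<d', struct.pack('<II', low, high))[0]


low, high = split_double(1.0)
print(hex(low), hex(high))
print(join_double(low, high))
```

The round trip is lossless, so the cost of the convention lies in the extra register moves around each call, not in precision.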

talk/dls2012/demo/analytics.py

+
+from reloader import ReloadHack
+from io import view
+from background import Background
+from foreground import foreground
+from detect import find_objects
+
+class Tracker(ReloadHack):
+    def __init__(self):
+        self.background = Background()
+
+    def update(self, frame):
+        self.background.update(frame)
+        fg = foreground(frame, self.background.image)
+        #view(self.background.image)
+        find_objects(fg)
+        view(255 * fg)
+

talk/dls2012/demo/background.py

+
+from reloader import ReloadHack
+
+class Background(ReloadHack):
+    def __init__(self):
+        self.fcnt = 0
+        self.image = 0
+
+    def update(self, frame):
+        self.fcnt += 1
+        alfa = self.fcnt/(self.fcnt + 1.0)
+        self.image = alfa * self.image + (1 - alfa) * frame
+
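
Editor's illustration, not part of the commit: with `alfa = fcnt / (fcnt + 1.0)` the update in `Background` is a cumulative mean in which the initial `image = 0` counts as one extra sample, so after `k` frames the estimate equals `sum(frames) / (k + 1)`. The sketch below checks this with plain floats standing in for images. The class name `RunningMean` is hypothetical.

```python
class RunningMean(object):
    """Scalar stand-in for Background: same update rule, floats not images."""

    def __init__(self):
        self.fcnt = 0
        self.image = 0.0

    def update(self, frame):
        self.fcnt += 1
        # The weight of the old estimate grows with the number of frames seen.
        alfa = self.fcnt / (self.fcnt + 1.0)
        self.image = alfa * self.image + (1 - alfa) * frame


frames = [10.0, 20.0, 30.0, 40.0]
bg = RunningMean()
for frame in frames:
    bg.update(frame)

# The initial zero estimate acts as one additional sample.
expected = sum(frames) / (len(frames) + 1)
print(bg.image, expected)
```

Because the effective window grows with every frame, early frames are not forgotten the way they are with a fixed `alfa` such as 0.9.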

talk/dls2012/demo/demo.avi

Binary file added.

talk/dls2012/demo/demo.py

+from subprocess import call, Popen
+import os, time, re
+
+class Vim(object):
+    def __init__(self, *args):
+        self.servername = 'DEMO_%d_%d' % (os.getpid(), id(self))
+        call(['gvim', '--servername', self.servername] + list(args))
+
+    def send(self, cmd):
+        call(['gvim', '--servername', self.servername, '--remote-send', cmd])
+
+    def type(self, cmd, delay=0.05):
+        for c in re.findall('[^<>]|<.*?>', cmd):
+            self.send(c)
+            time.sleep(delay)
+
+    def __del__(self):
+        self.send('<ESC>:q!<CR>')
+
+def pause(msg=''):
+    print
+    print msg
+    raw_input('Press ENTER')
+
+def demo():
+    with open('analytics.py', 'w') as fd:
+        print >>fd, """
+from reloader import ReloadHack
+from io import view
+
+class Tracker(ReloadHack):
+    def update(self, frame):
+        view(frame)
+"""
+    runner = Popen(['pypy', 'run.py', 'demo.avi'])
+    vim = Vim('analytics.py')
+
+    pause("We're looking at the input and output of this Tracker object that\n" +
+          "currently simply returns its input. Let's modify it to make some\n" +
+          "simple contrast adjustment.")
+    vim.send('7gg$')
+    vim.type('i * 2<ESC>:w<CR>', 0.2)
+
+    pause("Now let's create a new class that estimates a background image\n" +
+          "using a sliding mean.")
+    with open('background.py', 'w') as fd:
+        print >>fd, """
+from reloader import ReloadHack
+
+class Background(ReloadHack):
+    def __init__(self):
+        self.image = 0
+
+    def update(self, frame):
+        alfa = 0.9
+        self.image = alfa * self.image + (1 - alfa) * frame
+"""    
+    vim.send(':e background.py<CR>')
+
+    pause("Then, update Tracker to use the background estimator.")
+    vim.send(':e analytics.py<CR>')
+    vim.type('4ggifrom background import Background<CR><ESC>')
+    vim.type('7ggOdef __init__(self):<CR>self.background = Background()<CR><ESC>')
+    vim.type('11ggddOself.background.update(frame)<CR>view(self.background.image)<ESC>:w<CR>')
+
+    pause("The moving objects are turned into ghosts. We need to increase the\n" +
+          "window size to get a better estimate.")
+    vim.send(':e background.py<CR>9gg')
+    vim.type('A9<ESC>:w<CR>', 0.2)
+
+    pause("We're still getting ghosts. Let's try to increase it even more.")
+    vim.type('A9<ESC>:w<CR>', 0.2)
+
+    pause("Now it's taking forever to converge. Let's make the window size\n" + 
+          "depend on the number of frames observed.")
+    vim.type('6ggOself.fcnt = 0<ESC>')
+    vim.type('10ggOself.fcnt += 1<ESC>')
+    vim.type('11ggA<BS><BS><BS><BS><BS>self.fcnt/(self.fcnt + 1.0)<ESC>:w<CR>')
+
+    pause("That's better. Now, let's create a new function performing\n" + 
+          "background subtraction,")
+    with open('foreground.py', 'w') as fd:
+        print >>fd, """
+from reloader import autoreload
+
+@autoreload
+def foreground(img, bkg):
+    return ((bkg - img) ** 2) > 40
+"""
+    vim.send(':e foreground.py<CR>')
+
+    pause("and update Tracker to call it using the estimated background.")
+    vim.send(':e analytics.py<CR>')
+    vim.type('4ggofrom foreground import foreground<ESC>')
+    vim.type('12ggofg = foreground(frame, self.background.image)<ESC>')
+    vim.type('jI#<ESC>oview(255 * fg)<ESC>:w<CR>')
+
+    pause("Wait a bit for the background to converge.")
+
+    pause("That's a bit noisy. We'll have to increase the threshold a bit.")
+    vim.send(':e foreground.py<CR>')
+    vim.type('6ggA<BS><BS>100<ESC>:w<CR>', 0.2)
+
+    pause("Still a bit noisy, let's increase it even more.")
+    vim.type('6ggA<BS><BS><BS>200<ESC>:w<CR>', 0.2)
+
+    pause("That's all!")
+
+    runner.kill()
+if __name__ == '__main__':
+    demo()
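
Editor's illustration, not part of the commit: `Vim.type` relies on the pattern `'[^<>]|<.*?>'` to break a command string into single characters and whole `<...>` key names, so that each token can be sent to gvim separately with a delay in between. The sketch below shows only the tokenizing step. The function name `split_keys` is hypothetical.

```python
import re


def split_keys(cmd):
    """Split a Vim key string into single characters and <KEY> tokens."""
    # Either one character that is not an angle bracket, or the shortest
    # run that starts with '<' and ends with '>'.
    return re.findall('[^<>]|<.*?>', cmd)


print(split_keys('i * 2<ESC>:w<CR>'))
# ['i', ' ', '*', ' ', '2', '<ESC>', ':', 'w', '<CR>']
```

Keeping `<ESC>` and `<CR>` intact matters because `--remote-send` interprets them as key names; sent one character at a time they would be typed literally.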

talk/dls2012/demo/detect.py

+from reloader import autoreload
+from io import view
+
+def morph(fg, r, fn):
+    res = fg.new()
+    for y in xrange(fg.height):
+        for x in xrange(fg.width):
+            #res[x, y] = max(fg[x+dx, y+dy] 
+            #                for dx in xrange(-r, r+1) 
+            #                for dy in xrange(-r, r+1))
+            res[x, y] = fg[x, y]
+            for dx in xrange(-r, r+1):
+                for dy in xrange(-r, r+1):
+                    res[x, y] = fn(res[x, y], fg[x+dx, y+dy])
+    return res
+
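+# Separable variant: a horizontal pass followed by a vertical pass.
+# This definition replaces the naive 2-D version above.
+def morph(fg, r, fn):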
+    xres = fg.new()
+    for y in xrange(fg.height):
+        for x in xrange(fg.width):
+            xres[x, y] = fg[x, y]
+            for dx in xrange(-r, r+1):
+                xres[x, y] = fn(xres[x, y], fg[x+dx, y])
+    res = fg.new()
+    for y in xrange(fg.height):
+        for x in xrange(fg.width):
+            res[x, y] = xres[x, y]
+            for dy in xrange(-r, r+1):
+                res[x, y] = fn(res[x, y], xres[x, y+dy])
+    return res
+
+def erode(fg, r=1):
+    return morph(fg, r, min)
+
+def dilate(fg, r=1):
+    return morph(fg, r, max)
+
+@autoreload
+def find_objects(fg):
+    seg = erode(dilate(fg, 3), 4)
+    view(255*seg, 'd')
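
Editor's illustration, not part of the commit: the two `morph` definitions above compute the same result for `min` and `max` over a square window, but the second does it as a horizontal pass followed by a vertical pass. That reads about `2 * (2r + 1)` pixels per output pixel instead of `(2r + 1) ** 2`. The sketch below checks the equivalence on nested lists. The helper names are hypothetical, and out-of-range pixels read as 0 as in `Image.__getitem__`.

```python
def get(img, x, y):
    """Read a pixel, treating everything outside the image as 0."""
    if 0 <= y < len(img) and 0 <= x < len(img[0]):
        return img[y][x]
    return 0


def morph_naive(img, r, fn):
    """Fold fn over the full (2r+1) x (2r+1) window around each pixel."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = img[y][x]
            for dx in range(-r, r + 1):
                for dy in range(-r, r + 1):
                    acc = fn(acc, get(img, x + dx, y + dy))
            out[y][x] = acc
    return out


def morph_separable(img, r, fn):
    """Same result in two 1-D passes: along rows, then along columns."""
    h, w = len(img), len(img[0])
    rows = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = img[y][x]
            for dx in range(-r, r + 1):
                acc = fn(acc, get(img, x + dx, y))
            rows[y][x] = acc
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = rows[y][x]
            for dy in range(-r, r + 1):
                acc = fn(acc, get(rows, x, y + dy))
            out[y][x] = acc
    return out


img = [[0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]]
dilated = morph_separable(img, 1, max)
for row in dilated:
    print(row)
```

The split works because `min` and `max` are associative, commutative and idempotent, so folding over rows first and columns second visits the same set of pixels as the square window.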

talk/dls2012/demo/foreground.py

+
+from reloader import autoreload
+
+@autoreload
+def foreground(img, bkg):
+    return ((bkg - img) ** 2) > 200
+

talk/dls2012/demo/image.py

+from array import array
+
+def binop(op):
+    def f(a, b):
+        if not isinstance(a, Image):
+            a = ConstantImage(b.width, b.height, a)
+        if not isinstance(b, Image):
+            b = ConstantImage(a.width, a.height, b)
+
+        out = a.new(typecode='d')
+        for y in xrange(a.height):
+            for x in xrange(a.width):
+                out[x, y] = op(float(a[x, y]), float(b[x, y]))
+
+        return out
+    return f
+
+class Image(object):
+    def __init__(self, w, h, typecode='d', data=None):
+        self.width = w
+        self.height = h
+        self.typecode = typecode
+        if data is None:
+            self.data = array(typecode, [0]) * (w*h)
+        else:
+            self.data = data
+
+    def new(self, w=None, h=None, typecode=None):
+        if w is None:
+            w = self.width
+        if h is None:
+            h = self.height
+        if typecode is None:
+            typecode = self.typecode
+        return Image(w, h, typecode)
+
+    def __getitem__(self, (x, y)):
+        if 0 <= x < self.width and 0 <= y < self.height:
+            return self.data[y * self.width + x]
+        return 0
+
+    def __setitem__(self, (x, y), value):
+        if 0 <= x < self.width and 0 <= y < self.height:
+            self.data[y * self.width + x] = value
+
+    __add__ = binop(float.__add__)
+    __sub__ = binop(float.__sub__)
+    __mul__ = binop(float.__mul__)
+    __div__ = binop(float.__div__)
+    __pow__ = binop(float.__pow__)
+
+    __radd__ = binop(float.__radd__)
+    __rsub__ = binop(float.__rsub__)
+    __rmul__ = binop(float.__rmul__)
+    __rdiv__ = binop(float.__rdiv__)
+    __rpow__ = binop(float.__rpow__)
+
+    __lt__ = binop(float.__lt__)
+    __le__ = binop(float.__le__)
+    __eq__ = binop(float.__eq__)
+    __ne__ = binop(float.__ne__)
+    __gt__ = binop(float.__gt__)
+    __ge__ = binop(float.__ge__)
+
+    def __nonzero__(self):
+        return all(self.data)
+
+class ConstantImage(Image):
+    def __init__(self, w, h, value):
+        self.width = w
+        self.height = h
+        self.value = value
+
+    def __getitem__(self, (x, y)):
+        return self.value
+
+    def __setitem__(self, (x, y), value):
+        raise TypeError('ConstantImage does not support item assignment')
+
+
+def test_image():
+    img = Image(10, 20)
+    img[3, 4] = 7
+    assert img[3, 4] == 7
+    img[1, 2] = 42
+    assert img[1, 2] == 42
+
+    img2 = img + img
+    assert img2[1, 2] == 84
+    assert img2[3, 4] == 14
+
+    img += 1
+    assert img[2, 1] == 1
+    assert img[3, 4] == 8
+    assert img + img == 2 * img == img * 2
+    assert not (2 * img == 3 * img)
+    
+

talk/dls2012/demo/io.py

+from subprocess import Popen, PIPE, STDOUT
+import os, re
+from image import Image
+from array import array
+
+def mplayer(filename='tv://', options=()):
+    cmd = Popen(['mplayer', '-vo', 'null', '-ao', 'null',
+                 '-frames', '1'] + list(options) + [filename],
+                stdin=PIPE, stdout=PIPE, stderr=PIPE)
+    (out, err) = cmd.communicate()
+    for line in (out + err).split('\n'):
+        m = re.search('(VIDEO|VO): .*? (\d+)x(\d+)', line)
+        if m:
+            width, height = int(m.group(2)), int(m.group(3))
+            break
+    else:
+        raise IOError
+    fmt = 'y8'
+
+    mplayer = Popen(['mencoder', '-o', '-',
+                                 '-ovc', 'raw', '-of', 'rawvideo',
+                                 '-vf', 'format=' + fmt,
+                                 '-nosound', '-really-quiet',
+                    ] + list(options) + [filename],
+                    stdin=PIPE, stdout=PIPE, stderr=PIPE)
+    while True:
+        try:
+            data = array('B')
+            data.fromfile(mplayer.stdout, width*height)
+            img = Image(width, height, 'B', data)
+        except EOFError:
+            raise StopIteration
+        yield img
+
+        
+
+class MplayerViewer(object):
+    def __init__(self):
+        self.width = self.height = None
+    def view(self, img):
+        if img.data.typecode != 'B':
+            out = img.new(typecode='B')
+            for y in xrange(img.height):
+                for x in xrange(img.width):
+                    out[x, y] = int(min(max(img[x, y], 0), 255))
+            img = out
+        if not self.width:
+            w, h = img.width, img.height
+            self.mplayer = Popen(['mplayer', '-', '-benchmark',
+                                  '-demuxer', 'rawvideo',
+                                 '-rawvideo', 'w=%d:h=%d:format=y8' % (w, h),
+                                 '-really-quiet'],
+                                 stdin=PIPE, stdout=PIPE, stderr=PIPE)
+            
+            self.width = img.width
+            self.height = img.height
+        assert self.width == img.width
+        assert self.height == img.height
+        img.data.tofile(self.mplayer.stdin)
+
+viewers = {}
+def view(img, name='default'):
+    try:
+        viewer = viewers[name]
+    except KeyError:
+        viewer = viewers[name] = MplayerViewer()
+    viewer.view(img)
+    

talk/dls2012/demo/process.py

+from reloader import ReloadHack
+from fgbg import background
+
+@ReloadHack
+def process(video):
+    bkg = background(video)
+    for img in bkg:
+        yield img 
+
+

talk/dls2012/demo/reloader.py

+import os, sys, time, traceback
+
+class ReloadHack(object):
+    def __new__(cls, *new_args, **new_kwargs):
+        class Wrapper(object):
+            module = sys.modules[cls.__module__]
+            filename = module.__file__
+            if filename.endswith('.pyc'):
+                filename = filename[:-1]
+            name = cls.__name__
+            mtime = -1
+
+            def update(self, *args, **kwargs):
+                while True:
+                    while True:
+                        try:
+                            mtime = os.stat(self.filename).st_mtime
+                        except OSError:
+                            pass
+                        else:
+                            try:
+                                if mtime > self.mtime:
+                                    self.mtime = mtime
+                                    reload(self.module)
+                                    cls = getattr(self.module, self.name)
+                                    self.obj = object.__new__(cls)
+                                    self.obj.__init__(*new_args, **new_kwargs)
+                                    self.halted = False
+                            except Exception as e:
+                                print
+                                traceback.print_exc()
+                                self.halted = True
+                            else:
+                                if not self.halted:
+                                    break
+                    try:
+                        return self.obj.update(*args, **kwargs)
+                    except Exception as e:
+                        print
+                        traceback.print_exc()
+                        self.halted = True
+
+            def __getattr__(self, name):
+                return getattr(self.obj, name)
+
+        return Wrapper()
+
+def autoreload(fn):
+    module = sys.modules[fn.__module__]
+    filename = module.__file__
+    if filename.endswith('.pyc'):
+        filename = filename[:-1]
+    name = fn.__name__
+    
+    def wrapper(*args, **kwargs):
+        halted = False
+        while True:
+            while True:
+                try:
+                    mtime = os.stat(filename).st_mtime
+                except OSError:
+                    pass
+                else:
+                    try:
+                        if mtime > wrapper.last_mtime:
+                            wrapper.last_mtime = mtime
+                            reload(module)
+                            wrapper.fn = getattr(module, name).fn
+                            halted = False
+                    except Exception as e:
+                        print
+                        traceback.print_exc()
+                    else:
+                        if not halted:
+                            break
+            try:
+                return wrapper.fn(*args, **kwargs)
+            except Exception as e:
+                print
+                traceback.print_exc()
+                halted = True
+    wrapper.fn = fn
+    wrapper.last_mtime = -1
+
+    return wrapper
+
+

talk/dls2012/demo/run.py

+from io import mplayer, view
+from analytics import Tracker
+import sys
+
+if len(sys.argv) > 1:
+    fn = sys.argv[1]
+else:
+    fn = 'tv://'
+
+tracker = Tracker()
+while True:
+    for img in mplayer(fn):
+        view(img, 'Input')
+        tracker.update(img)
+
+

talk/dls2012/demo/view.py

+from io import mplayer, view
+import sys
+
+for img in mplayer(sys.argv[1]):
+    view(img)
+