Issues

Issue #1771 resolved

3x slower than CPython on alphabet soup challenge script

William Edwards
created an issue

The program is just a quick-and-dirty script that does a bit of random swapping. It plays a game called "alfabetsoep" ("alphabet soup"), which was published in the Dutch programming magazine Luminis Conversing Worlds, July 2013, no. 8.

When it runs, it performs random swaps on the alphabet. It is unlikely to make any improvement for a while, so after 1 minute it tries another starting shuffle and prints a log line showing how many random swaps it has evaluated.

On my MacBook (2.7 GHz i7) it manages just over 3000 swaps per minute using CPython 2.7.

Using pypy-2.3-osx64 it manages only 1000 swaps per minute.
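For reference, the script's inner loop is essentially a random-swap hill climb over alphabet permutations. A minimal sketch of that shape (in Python 3 syntax; the `score()` function here is a hypothetical placeholder counting fixed points, standing in for the magazine's actual scoring rules):

```python
import random
import time

ALPHABET = list("abcdefghijklmnopqrstuvwxyz")

def score(perm):
    # Hypothetical placeholder: count letters that sit in their
    # original position. The real script scores a permutation
    # against the "alfabetsoep" puzzle rules instead.
    return sum(1 for i, c in enumerate(perm) if c == ALPHABET[i])

def hill_climb(time_limit=1.0):
    """Swap two random letters; keep the swap if the score does not
    drop. The real script restarts from a fresh shuffle each minute;
    this sketch does a single climb within one time limit."""
    deadline = time.time() + time_limit
    perm = ALPHABET[:]
    random.shuffle(perm)
    current = score(perm)
    best = current
    swaps = 0
    while time.time() < deadline:
        i, j = random.randrange(26), random.randrange(26)
        perm[i], perm[j] = perm[j], perm[i]
        new = score(perm)
        swaps += 1
        if new >= current:
            current = new
            best = max(best, current)
        else:
            perm[i], perm[j] = perm[j], perm[i]  # undo the swap
    return best, swaps
```

The `swaps` counter corresponds to the number in the log line: each iteration evaluates one random swap.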

Comments (9)

  1. Armin Rigo

    This is 3.3x slower on PyPy than on CPython:

    import time
    
    
    prebuilt = set([str(n) for n in range(10**7)])
    
    t1 = time.time()
    for i in range(5):
        foo = set(['Foo'])
        foo |= prebuilt
    print time.time() - t1
    

    This is 6.4x slower:

    import time
    
    prebuilt = {}
    for i in range(10**7):
        prebuilt[str(i)] = -12.34
    
    t1 = time.time()
    for i in range(5):
        foo = {'Foo': 213.1}
        foo.update(prebuilt)
    print time.time() - t1
    
  2. Armin Rigo

The slow-downs above have been fixed (there were several sources; see 0a347de43469 7f9dca73c7b6 711f53c92504). PyPy now runs 1.73x slower than CPython on both my examples, which is ok (PyPy is known to be a bit slower than CPython on non-jittable code, particularly code that is heavy on large lists or dicts).

It doesn't seem to have improved alphabet_soup.2.py, though.

  3. Armin Rigo

This example is now twice as slow as it was in PyPy 2.3.1:

    import time
    
    prebuilt = set([str(n) for n in range(10**7)])
    prebuilt2 = set([str(n*5//6) for n in range(10**7)])
    
    t1 = time.time()
    for i in range(5):
        foo = set()
        foo |= prebuilt
        foo |= prebuilt2
    print time.time() - t1
    
  4. Armin Rigo

Oh, by the way: thanks for reporting this; it shows where we need to improve large dictionary updates and set unions. In case you are wondering why I would be happy to "only" reduce the overhead from 3x to 1.5x: this kind of code cannot possibly gain any speed-up with PyPy, because it spends most of its time in big set operations. PyPy can (and will) JIT the Python code, but that's not where most of the time is spent here.
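The point above can be checked directly: nearly all of the wall-clock time goes into the C-level set union, not into the interpreted Python around it, so a tracing JIT has little left to accelerate. A small sketch (sizes scaled down from the earlier benchmarks, Python 3 syntax):

```python
import time

prebuilt = {str(n) for n in range(10**6)}

# Time the whole loop, and separately time just the union.
t1 = time.time()
union_time = 0.0
for i in range(5):
    foo = set()
    t2 = time.time()
    foo |= prebuilt          # one C-level operation on a large set
    union_time += time.time() - t2
elapsed = time.time() - t1

# The union accounts for almost all of the elapsed time; the
# surrounding interpreted Python is negligible.
print("union fraction: %.2f" % (union_time / elapsed))
```

On either interpreter the printed fraction is close to 1.0, which is why the fix has to come from the set implementation itself rather than from the JIT.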

  5. Armin Rigo

Thanks for the link to the talk! I'm not completely sure from just looking at it, but it seems to be a partial redo of Psyco, the Python compiler I wrote around 2001: at least the basic-block handling reminds me a lot of it. By now, though, I think that although the results were good, V8 and SpiderMonkey for JavaScript and PyPy for Python have all shown that a different approach gives even better results...
