Build Linux wheels with manylinux

Issue #295 resolved
Thomas Kluyver
created an issue

This will probably take a bit of fiddling, but now that Linux wheels can be uploaded to PyPI, we should use the manylinux docker images, and the auditwheel tool to create Linux wheels (in addition to Windows and OS X wheels - see issue #222 for that).

Comments (20)

  1. Thomas Kluyver reporter

    I have got this sort of working in the manylinux-wheels branch in my copy of the repo. The wheels build, and I can install one onto my regular system and play the 'aliens' example OK. But there are a bunch of weird failures and segfaults in the tests, both when they run in the docker container, and on my local machine. I guess the old versions of the libraries that are bundled might be responsible, but I'm not sure how to go about debugging it.

  2. Thomas Kluyver reporter

    I made a bit more progress with this by building the SDL libraries from source rather than relying on outdated RPM packages. However, I'm still seeing several failures with libpng and libjpeg when I try to install the wheels on my own system:

    ======================================================================
    ERROR: ImageModuleTest.test_save_colorkey
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/takluyver/miniconda3/envs/pygame-whl-test/lib/python3.5/site-packages/pygame/tests/image_test.py", line 276, in test_save_colorkey
        s2 = pygame.image.load(temp_filename)
    pygame.error: Failed loading libpng.so.3: libpng.so.3: cannot open shared object file: No such file or directory
    

    I think something is loading libraries dynamically, and the auditwheel tool can't fix where it looks for those libraries.

    There are also a couple of other failures that look unrelated:

    ======================================================================
    ERROR: all_tests_for (pygame.tests.freetype_test.AllTestCases)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "test/pygame.tests.freetype_test.py", line 1, in all_tests_for
    subprocess completely failed with return code of 0
    cmd:          ['/home/takluyver/miniconda3/envs/pygame-whl-test/bin/python3', '-m', 'pygame.tests.test_utils.test_runner', 'pygame.tests.freetype_test', '--exclude', 'interactive,subprocess_ignore,python3_ignore', '--timings', '1']
    test_env:     environ({'XDG_SESSION_DESKTOP': 'gnome', 'IM_CONFIG_PHASE': '1', 'LOADED_CONFIG': '1', 'SHELL': '/bin/bash', 'LOADED_RC_FILES': 'False,False', 'TERM': 'xterm-256color', 'ANSIBLE_NOCOWS': '1', 'DESKTOP_SESSION': 'gnome', 'CONDA_DEFAULT_ENV': 'pygame-whl-test', 'DISPLAY': ':1', 'MANDATORY_PATH': '/usr/share/gconf/gnome.mandatory.path', 'QT_ACCESSIBILITY': '1', 'VTE_VERSION': '4205', 'LOGNAME': 'takluyver', 'XMODIFIERS': '@im=ibus', 'SDL_VIDEO_X11_WMCLASS': '-m', 'XDG_DATA_DIRS': '/usr/share/gnome:/usr/local/share/:/usr/share/:/var/lib/snapd/desktop', 'XDG_SEAT': 'seat0', 'AUTO_CD': '1', 'CLUTTER_IM_MODULE': 'xim', 'CONDA_ENV_PATH': '/home/takluyver/miniconda3/envs/pygame-whl-test', 'DBUS_SESSION_BUS_ADDRESS': 'unix:abstract=/tmp/dbus-7pseMk2PZR', 'PWD': '/home/takluyver/Code/pygame/manylinux-build', 'GDMSESSION': 'gnome', 'PATH': '/home/takluyver/miniconda3/envs/pygame-whl-test/bin:/home/takluyver/.local/bin:/home/takluyver/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin', 'SESSION': 'gnome', 'XDG_RUNTIME_DIR': '/run/user/1000', 'XDG_CURRENT_DESKTOP': 'GNOME', 'XDG_SESSION_TYPE': 'x11', 'XDG_CONFIG_DIRS': '/etc/xdg/xdg-gnome:/usr/share/upstart/xdg:/etc/xdg', 'XAUTHORITY': '/run/user/1000/gdm/Xauthority', 'UPSTART_SESSION': 'unix:abstract=/com/ubuntu/upstart-session/1000/1324', 'WINDOWID': '25165830', 'QT_LINUX_ACCESSIBILITY_ALWAYS_ON': '1', 'XONSH_INTERACTIVE': 'True', 'USERNAME': 'takluyver', 'GJS_DEBUG_TOPICS': 'JS ERROR;JS LOG', 'XDG_SESSION_ID': '1', 'GJS_DEBUG_OUTPUT': 'stderr', '_': '/home/takluyver/miniconda3/envs/pygame-whl-test/bin/python3', 'USER': 'takluyver', 'INSTANCE': '', 'GPG_AGENT_INFO': '/home/takluyver/.gnupg/S.gpg-agent:0:1', 'SHLVL': '2', 'QT_IM_MODULE': 'ibus', 'GTK_MODULES': 'gail:atk-bridge', 'DEFAULTS_PATH': '/usr/share/gconf/gnome.default.path', 'SESSION_MANAGER': 'local/minion:@/tmp/.ICE-unix/1540,unix/minion:/tmp/.ICE-unix/1540', 'LANG': 'en_GB.UTF-8', 'SESSIONTYPE': 
'gnome-session', 'XONSH_LOGIN': '1', 'QT4_IM_MODULE': 'xim', 'BASH_COMPLETIONS': '/usr/share/bash-completion:/usr/share/bash-completion/completions', 'GNOME_KEYRING_PID': '', 'GNOME_KEYRING_CONTROL': '', 'XONSH_VERSION': '0.3.0', 'XDG_MENU_PREFIX': 'gnome-', 'WINDOWPATH': '2', 'OLDPWD': '/home/takluyver/Code/pygame/manylinux-build/SDL_mixer-1.2.12', 'XDG_VTNR': '2', 'SHELL_TYPE': 'best', 'JOB': 'dbus', 'LANGUAGE': 'en_GB:en', 'GNOME_DESKTOP_SESSION_ID': 'this-is-deprecated', 'HOME': '/home/takluyver', 'GTK_IM_MODULE': 'ibus', 'SSH_AUTH_SOCK': '/run/user/1000/keyring/ssh', 'PROMPT': '{hostname}:{GREEN}{short_cwd}{NO_COLOR}$ '})
    working_dir:  /tmp/tmpao_3eelk
    return (top 5 lines):
    loading pygame.tests.freetype_test
    
    
    ======================================================================
    FAIL: FontTypeTest.test_metrics
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/takluyver/miniconda3/envs/pygame-whl-test/lib/python3.5/site-packages/pygame/tests/font_test.py", line 263, in test_metrics
        self.assert_(um[0] is not None)
    AssertionError
    

    I'm shelving this for now, but anyone who's interested is welcome to pick up my branch (linked above).

  3. René Dudfield

    We had to update the rpmforge repo.

    For SDL_image and similar libraries, we are trying --disable-sdl-dlopen to get past those errors. Although SDL links against these libraries, it still tries to dlopen the local copies of the libs, which have a different, incompatible ABI.

  4. René Dudfield

    It's mostly working now.

    --disable-sdl-dlopen didn't work. However, there were some configure options to tell SDL_image not to load the JPEG and PNG libraries dynamically.

    Another issue is that auditwheel didn't link them quite correctly: it gave the bundled libraries some sort of weird name with a hash in it, so we had to manually add symlinks named libjpeg.so.62 etc.
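
    To see which bundled libraries have been renamed, one can list the hashed .so files inside a built wheel. This is only a rough sketch: the function name vendored_libraries is hypothetical, and the regular expression assumes names of the form "<name>-<hex hash>.so[.version]", which the thread does not spell out.

```python
import re
import zipfile

# Assumed naming pattern "<name>-<hex hash>.so[.version]"; the exact hash
# length used by auditwheel is not stated in this thread.
HASHED_SO = re.compile(
    r"^(?P<stem>.+)-(?P<hash>[0-9a-f]{6,})\.so(?P<version>(\.\d+)*)$"
)


def vendored_libraries(wheel_path):
    """Map each hashed .so file name in a wheel to its un-hashed form."""
    found = {}
    # A wheel is a zip archive, so the standard zipfile module can read it.
    with zipfile.ZipFile(wheel_path) as wheel:
        for member in wheel.namelist():
            base = member.rsplit("/", 1)[-1]
            match = HASHED_SO.match(base)
            if match:
                found[base] = "{stem}.so{version}".format(**match.groupdict())
    return found
```

    Calling vendored_libraries on a repaired wheel should give a dictionary whose keys are the names actually shipped and whose values are the names a dlopen call without the hash would be looking for.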

  5. Thomas Kluyver reporter

    Great! I imagine that @Nathaniel Smith might be interested to know that auditwheel didn't quite work by itself in this case.

    So will we be able to do a 1.9.2 release with Linux, Mac and Windows wheels, in addition to an sdist on PyPI? :-)

  6. René Dudfield

    I think using the same .so names that Linux uses would be more resilient, e.g. libjpeg.so.62 in this case, especially for cases where dlopen is used rather than 'static dynamic linking' (sorry, I don't know what the term is exactly). But maybe that is quite rare, I'm not sure exactly. I only know of SDL that does this really.

    Yes! I'm quite happy that we will be able to pip install pygame most everywhere (after first upgrading pip). Well, there is still more work to do... but almost there :)

  7. Nathaniel Smith

    The reason auditwheel puts a hash in the filename is that we found empirically that this was necessary to avoid breaking things. We don't do this just for fun :-)

    Specifically, the problem is that there is an "optimization" in the Linux shared library loading code: when it goes to look for "libjpeg.so.62", the first thing it does is check whether libjpeg.so.62 has ever been loaded in this process before, and if it has, it skips the search and returns the cached copy.

    So think about what happens if two extension modules both ship their own copy of libjpeg.so.62, or maybe one module ships it and another is linked against the version provided by the system package manager. The names collide, and suddenly instead of using the libjpeg.so.62 that shipped with your package, you might find yourself using the one that goes with the other package. You might get lucky and find that this works just as well... But then again, maybe not.

    In general, this will work so long as all files named "libjpeg.so.62" have exactly the same compiled in features and are 100% forwards and backwards compatible. For the specific case of libjpeg that might even be true, I dunno. But it's definitely not true in general, as we learned when auditwheel first started out and didn't yet have the library renaming feature :-).

    That said: it sounds like what you're doing now is letting auditwheel copy in and rename the libraries, and then adding symlinks to them, and then the only thing you do with those symlinks is to call dlopen and pass in an explicit path (i.e. the argument to dlopen always has a slash in it?). If all those things are true then you might be OK, because of some details about how the shared loader computes the key for the cache table (specifically I think it is clever enough to use the mangled hashed name, even though you access it through the symlink).

    If you can control the string that SDL passes to dlopen (I assume you must be able to, in order to get it to look in the right directory?), then it might be simpler and more reliable all around to either have it look up the hashed name directly, or, if keeping track of the hashes is difficult, we could teach auditwheel to use some more predictable mangling scheme, like "libjpeg-vendored-for-pygame.so.66".
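
    As a rough illustration of the caching behaviour described above (not code from this thread), loading the same library by name twice in one process can be observed from Python with ctypes. The helper name loaded_twice_is_same_object is hypothetical, and the sketch relies on the _handle attribute of ctypes.CDLL, which holds the loader's handle for the library. It only shows that a second load by name yields the already-loaded object; it does not reproduce a real name collision between two different builds.

```python
import ctypes
import ctypes.util


def loaded_twice_is_same_object(library_name):
    """Load a shared library twice by name and compare the loader handles."""
    first = ctypes.CDLL(library_name)
    second = ctypes.CDLL(library_name)
    # Equal handles mean the second load reused the object that was
    # already loaded in this process rather than loading a fresh copy.
    return first._handle == second._handle


if __name__ == "__main__":
    # find_library may return None on some systems, so guard against it.
    libc_name = ctypes.util.find_library("c")
    if libc_name is not None:
        print(libc_name, loaded_twice_is_same_object(libc_name))
```

    If two packages each shipped a different build under the same name, the second one to load could end up with the first one's copy in the same way, which is the situation the hashed names are meant to avoid.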

  8. René Dudfield

    Aha! Thanks for the explanation Nathan. Also, thanks for auditwheel!

    Luckily SDL is pretty nice and allows a configuration where it doesn't try to dlopen the libraries. We're using that, and no symlinks now.

    I've started documenting and automating the build process some more so it is reproducible, using vagrant so people can also do it from win/mac/etc. Perhaps we can bake a docker image afterwards so that we don't need to build all the dependencies each time.

    Repoforge is down, so I found a mirror and modified the script as shown below.

    #!/bin/bash
    # Install the rpmforge release package from a mirror.
    set -e -x
    
    # Import the repository signing key shipped with the build scripts.
    rpm --import /io/manylinux-build/RPM-GPG-KEY.dag.txt
    
    # Pick the release RPM that matches the container's architecture.
    if [ "$(uname -i)" = "x86_64" ]; then
        RPMFORGE_FILE="rpmforge-release-0.5.3-1.el5.rf.x86_64.rpm"
        RPMFORGE_URL="https://repoforge.cu.be/redhat/el5/en/x86_64/dag/RPMS/rpmforge-release-0.5.3-1.el5.rf.x86_64.rpm"
    else
        RPMFORGE_FILE="rpmforge-release-0.5.3-1.el5.rf.i386.rpm"
        RPMFORGE_URL="https://repoforge.cu.be/redhat/el5/en/i386/dag/RPMS/rpmforge-release-0.5.3-1.el5.rf.i386.rpm"
    fi
    
    # Original download location, replaced by the mirror URL above:
    # wget http://pkgs.repoforge.org/rpmforge-release/${RPMFORGE_FILE}
    wget "${RPMFORGE_URL}"
    rpm -i "${RPMFORGE_FILE}"
    
  9. Thomas Kluyver reporter

    > Perhaps we can bake a docker image afterwards so that we don't need to build all the dependencies each time.

    That's definitely my intention; I've been getting to grips with docker as I've worked on this (and some other unrelated stuff), and now I feel I understand it well enough to do that, whereas when I started, I was just throwing stuff in the shell script.

  10. René Dudfield

    Ah ok :) Well, I'll have to buy Robert a {beverage_of_choice} too some day.

    To test this we need to run the scripts on systems other than the CentOS one we build on. For now I'm just manually testing in the vagrant VM using some virtual envs. I guess we could also use docker to test on a few different distros.

    A future improvement would be to use the homebrew scripts as inspiration. For example, they fetch the fluidsynth, mikmod, and webp dependencies.

    There is also still a font test failing unfortunately.

    ======================================================================
    FAIL: FontTypeTest.test_metrics
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/opt/python/cp34-cp34m/lib/python3.4/site-packages/pygame/tests/font_test.py", line 263, in test_metrics
        self.assert_(um[0] is not None)
    AssertionError: None
    

    There is also a glibc memory error with test_music. I added a tag for that and exclude the test for now.

    We also need to fix the x86/x64 handling so that auditwheel repair only runs on the wheels built for the correct architecture.

    I see some sysfont related errors when running the tests on the vagrant VM I created, probably because the font related parts are not installed. It should not error out in this case, however.
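
    One possible way to pick out the right wheels is to filter on the platform tag at the end of the file name before passing them to auditwheel repair. This is only a sketch: the function name wheels_for_this_machine is hypothetical, and it assumes the wheel file names end in "linux_<machine>.whl", as the manylinux1_x86_64 name used elsewhere in this thread does.

```python
import platform


def wheels_for_this_machine(wheel_names, machine=None):
    """Keep only the wheels whose platform tag matches the given machine."""
    # platform.machine() reports e.g. "x86_64" or "i686" for the running system.
    machine = machine or platform.machine()
    suffix = "linux_{}.whl".format(machine)
    return [name for name in wheel_names if name.endswith(suffix)]
```

    The suffix check matches both plain linux_x86_64 and manylinux1_x86_64 tags, so it could be applied either before or after the repair step.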

    Traceback (most recent call last):
      File "/home/ubuntu/anenv/lib/python3.5/site-packages/pygame/tests/font_test.py", line 115, in test_match_font_all_exist
        path = pygame_font.match_font(font)
      File "/home/ubuntu/anenv/lib/python3.5/site-packages/pygame/sysfont.py", line 353, in match_font
        for name in allnames.split(','):
    AttributeError: 'NoneType' object has no attribute 'split'
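
    The traceback shows allnames being None when split is called. A defensive guard could look roughly like the sketch below. The helper split_font_names is hypothetical and is not pygame's actual code or fix; it only shows treating a missing name string as "no fonts" instead of raising.

```python
def split_font_names(allnames):
    """Split a comma-separated font-name string; None or '' means no fonts."""
    if not allnames:
        return []
    # Strip whitespace and drop empty entries such as those from "a,,b".
    return [name.strip() for name in allnames.split(",") if name.strip()]
```

    With a guard like this, a loop over the names simply runs zero times when no font names are available.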
    

    The commands below aren't meant to be copy-pasted in as they are. Perhaps they can be worked into a script later.

    # Download many megabytes of ubuntu.
    vagrant init ubuntu/xenial64
    vagrant up
    vagrant ssh
    
    # now we are on the vagrant ubuntu host
    # We set up docker following these instructions for ubuntu-xenial
    # https://docs.docker.com/engine/installation/linux/ubuntulinux/
    sudo apt-get install apt-transport-https ca-certificates
    sudo apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
    # Add the docker apt repository entry to this file (root is needed to save it).
    sudo vi /etc/apt/sources.list.d/docker.list
    sudo apt-get update
    sudo apt-get purge lxc-docker
    apt-cache policy docker-engine
    sudo apt-get install docker-engine
    
    # Now edit /etc/hosts so it has a first line with the hostname ubuntu-xenial in it.
    # Otherwise docker does not start.
    # 127.0.0.1 localhost ubuntu-xenial
    # makes a /etc/hosts.bak in case something breaks.
    sudo sed -i".bak" '/127.0.0.1 localhost/s/$/ ubuntu-xenial/' /etc/hosts
    
    # We should have been in our python package clone root directory before we ran vagrant ssh
    cd /vagrant
    
    # install auditwheel to create the wheel.
    # https://github.com/pypa/auditwheel
    #sudo apt install python3-pip
    #pip3 install auditwheel
    
    # We need to be able to run docker as the ubuntu user.
    sudo usermod -aG docker ubuntu
    sudo usermod -aG docker $USER
    
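    # Now log out of vagrant (run "exit"); the next two commands are run
    # on the host, and the new login picks up the docker group change.
    #exit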
    
    vagrant reload
    vagrant ssh
    
    # now we can start docker.
    sudo service docker start
    
    cd /vagrant/manylinux-build
    make
    
    # Now perhaps the whl files build correctly.
    ls -la wheelhouse
    
    export SDL_AUDIODRIVER=disk
    export SDL_VIDEODRIVER=dummy
    
    python3.5 -m venv anenv35
    . ./anenv35/bin/activate
    pip install wheelhouse/pygame-*cp35-cp35m-manylinux1_x86_64.whl
    python -m pygame.tests --exclude opengl,music
    

    There is a pull request https://bitbucket.org/pygame/pygame/pull-requests/68 with the changes I made.

  11. Thomas Kluyver reporter

    That looks good; I've been working in parallel to make base docker images, so we don't have to download and compile SDL & portmidi every time we build wheels - see pull request #69 (which is just the changes in my repo against the manylinux-wheels branch of this repo).

  12. René Dudfield

    Nice! (I'm illume on there). Note I was using the manylinux-wheels branch of this repo for pull request 68, so it seems there are conflicts now.

    I've uploaded the Linux x64 wheels I made to PyPI.

  13. Thomas Kluyver reporter

    Yeah, I think the conflicts are pretty trivial, but it will require a manual merge. I'll have a go tomorrow if you don't get to it before then.

    I've added you to the pygame org on Docker hub.

  14. Thomas Kluyver reporter

    I have fixed the failing font test by building a newer version of freetype into the base images (and consequently a newer version of libpng). They pass the tests inside the docker container, and the wheels I've tested (64 bit, Py 3.5/2.7) pass all the tests when I run them on my machine.
