compatibility / PyStemmer

  1. A straightforward installation of PyStemmer does not work; the details are below. However:
  2. The Snowball stemmer can be used in PyPy through an NLTK installation (tested against pypy-1.6.1-dev0).
>>>> import nltk
>>>> stemmer = nltk.SnowballStemmer('german')
>>>> stemmer.stem('lehrerinnen')
u'lehrerinn'

NLTK is easy to install on PyPy with a tiny patch; see the NLTK compatibility entry.
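The workaround above can be wrapped so that calling code does not depend on which backend is available. This is only a minimal sketch: `load_stem_function` and its `importer` parameter are hypothetical names introduced here, not part of PyStemmer or NLTK. It assumes PyStemmer's `Stemmer.Stemmer(language).stemWord` interface and the `nltk.SnowballStemmer` class used in the session above, and it assumes that a broken PyStemmer build surfaces as an `ImportError`.

```python
import importlib


def load_stem_function(language, importer=importlib.import_module):
    """Return a callable mapping a word to its stem.

    Prefers the PyStemmer C extension and falls back to NLTK's
    Snowball stemmer when PyStemmer cannot be imported.
    `importer` is a hypothetical hook that makes the lookup testable.
    """
    try:
        pystemmer = importer("Stemmer")  # PyStemmer's module name
        return pystemmer.Stemmer(language).stemWord
    except ImportError:
        pass  # e.g. the extension did not build or load under PyPy

    # Raises ImportError if NLTK is missing as well.
    nltk = importer("nltk")
    return nltk.SnowballStemmer(language).stem
```

A caller would then write `stem = load_stem_function('german')` followed by `stem('lehrerinnen')`, regardless of the interpreter in use.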

Why the straightforward installation fails:

/tmp/PyStemmer-1.1.0$ pypy setup.py build
running build
running build_ext
building 'Stemmer' extension
creating build
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/libstemmer_c
creating build/temp.linux-x86_64-2.7/libstemmer_c/src_c
creating build/temp.linux-x86_64-2.7/libstemmer_c/runtime
creating build/temp.linux-x86_64-2.7/libstemmer_c/libstemmer
creating build/temp.linux-x86_64-2.7/src
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/src_c/stem_UTF_8_danish.c -o build/temp.linux-x86_64-2.7/libstemmer_c/src_c/stem_UTF_8_danish.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/src_c/stem_UTF_8_dutch.c -o build/temp.linux-x86_64-2.7/libstemmer_c/src_c/stem_UTF_8_dutch.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/src_c/stem_UTF_8_english.c -o build/temp.linux-x86_64-2.7/libstemmer_c/src_c/stem_UTF_8_english.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/src_c/stem_UTF_8_finnish.c -o build/temp.linux-x86_64-2.7/libstemmer_c/src_c/stem_UTF_8_finnish.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/src_c/stem_UTF_8_french.c -o build/temp.linux-x86_64-2.7/libstemmer_c/src_c/stem_UTF_8_french.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/src_c/stem_UTF_8_german.c -o build/temp.linux-x86_64-2.7/libstemmer_c/src_c/stem_UTF_8_german.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/src_c/stem_UTF_8_hungarian.c -o build/temp.linux-x86_64-2.7/libstemmer_c/src_c/stem_UTF_8_hungarian.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/src_c/stem_UTF_8_italian.c -o build/temp.linux-x86_64-2.7/libstemmer_c/src_c/stem_UTF_8_italian.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/src_c/stem_UTF_8_norwegian.c -o build/temp.linux-x86_64-2.7/libstemmer_c/src_c/stem_UTF_8_norwegian.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/src_c/stem_UTF_8_porter.c -o build/temp.linux-x86_64-2.7/libstemmer_c/src_c/stem_UTF_8_porter.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/src_c/stem_UTF_8_portuguese.c -o build/temp.linux-x86_64-2.7/libstemmer_c/src_c/stem_UTF_8_portuguese.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/src_c/stem_UTF_8_romanian.c -o build/temp.linux-x86_64-2.7/libstemmer_c/src_c/stem_UTF_8_romanian.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/src_c/stem_UTF_8_russian.c -o build/temp.linux-x86_64-2.7/libstemmer_c/src_c/stem_UTF_8_russian.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/src_c/stem_UTF_8_spanish.c -o build/temp.linux-x86_64-2.7/libstemmer_c/src_c/stem_UTF_8_spanish.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/src_c/stem_UTF_8_swedish.c -o build/temp.linux-x86_64-2.7/libstemmer_c/src_c/stem_UTF_8_swedish.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/src_c/stem_UTF_8_turkish.c -o build/temp.linux-x86_64-2.7/libstemmer_c/src_c/stem_UTF_8_turkish.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/runtime/api.c -o build/temp.linux-x86_64-2.7/libstemmer_c/runtime/api.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/runtime/utilities.c -o build/temp.linux-x86_64-2.7/libstemmer_c/runtime/utilities.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c libstemmer_c/libstemmer/libstemmer_utf8.c -o build/temp.linux-x86_64-2.7/libstemmer_c/libstemmer/libstemmer_utf8.o
cc -fPIC -Wimplicit -Isrc -Ilibstemmer_c/include -I/home/vak/pkg/pypy/include -c src/Stemmer.c -o build/temp.linux-x86_64-2.7/src/Stemmer.o
src/Stemmer.c: In function ‘__pyx_f_7Stemmer_algorithms’:
src/Stemmer.c:152:16: warning: assignment from incompatible pointer type
src/Stemmer.c: In function ‘__Pyx_GetException’:
src/Stemmer.c:1136:5: error: ‘PyThreadState’ has no member named ‘exc_type’
src/Stemmer.c:1137:5: error: ‘PyThreadState’ has no member named ‘exc_value’
src/Stemmer.c:1138:5: error: ‘PyThreadState’ has no member named ‘exc_traceback’
src/Stemmer.c:1139:11: error: ‘PyThreadState’ has no member named ‘exc_type’
src/Stemmer.c:1140:11: error: ‘PyThreadState’ has no member named ‘exc_value’
src/Stemmer.c:1141:11: error: ‘PyThreadState’ has no member named ‘exc_traceback’
src/Stemmer.c: In function ‘__Pyx_InitStrings’:
src/Stemmer.c:1156:13: warning: implicit declaration of function ‘PyString_InternInPlace’
error: command 'cc' failed with exit status 1

The build above uses the pre-generated C file. Alternatively, we can install Pyrex with the easy_install that ships in PyPy's bin directory, delete Stemmer.c, and then run

pypy setup.py install

This does succeed. However, the resulting Stemmer module cannot be imported, because pypy-1.6 does not provide PyString_InternInPlace.
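To see which stemming modules actually import under a given interpreter, a small probe can be run with both `python` and `pypy`. This is a sketch only: `probe_module` is a hypothetical helper, and the exact error text reported for the missing symbol depends on the interpreter and platform.

```python
import importlib


def probe_module(name):
    """Try to import `name`.

    Returns (True, None) on success, or (False, error_text) when the
    import raises ImportError.
    """
    try:
        importlib.import_module(name)
    except ImportError as exc:
        return False, str(exc)
    return True, None


if __name__ == "__main__":
    # "Stemmer" is PyStemmer's module name; "nltk" is the fallback.
    for name in ("Stemmer", "nltk"):
        ok, err = probe_module(name)
        status = "importable" if ok else "NOT importable: %s" % err
        print("%-8s %s" % (name, status))
```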
