Using Python package TPOT in parallel on IBM i 7.3

Issue #52 resolved
Clemens Zauchner created an issue

I am using Python 3.6 on IBM i 7.3. The installation of TPOT (using pip3 install TPOT) finished successfully.

Using the package with n_jobs = 1 works without any problem.

from tpot import TPOTClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target,
                                                    train_size=0.75, test_size=0.25)

pipeline_optimizer = TPOTClassifier(generations=5, population_size=20, cv=5,
                                    random_state=42, verbosity=2, n_jobs=1)
pipeline_optimizer.fit(X_train, y_train)
print(pipeline_optimizer.score(X_test, y_test))

Note that there is a warning at the start:

Exception ignored in: <Finalize object, dead>
Traceback (most recent call last):
File "/QOpenSys/pkgs/lib/python3.6/multiprocessing/", line 186, in __call__
res = self._callback(*self._args, **self._kwargs)
File "/QOpenSys/pkgs/lib/python3.6/multiprocessing/", line 87, in _cleanup
FileNotFoundError: [Errno 2] No such file or directory

and at the end:

Exception ignored in: <Finalize object, dead>
Traceback (most recent call last):
File "/QOpenSys/pkgs/lib/python3.6/multiprocessing/", line 186, in __call__
res = self._callback(*self._args, **self._kwargs)
File "/QOpenSys/pkgs/lib/python3.6/multiprocessing/", line 87, in _cleanup
FileNotFoundError: [Errno 2] No such file or directory

Optimising a pipeline in parallel fails with an error, which is attached to the issue. This is the code to reproduce the problem:

from tpot import TPOTClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target,
                                                    train_size=0.75, test_size=0.25)

pipeline_optimizer = TPOTClassifier(generations=5, population_size=20, cv=5,
                                    random_state=42, verbosity=2, n_jobs=-1)
pipeline_optimizer.fit(X_train, y_train)
print(pipeline_optimizer.score(X_test, y_test))
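
Since the warnings above come from the standard-library multiprocessing module rather than from TPOT, it may help to check whether a plain worker pool runs on the same system. A minimal sketch, assuming nothing beyond the standard library (the names square and run_pool_check are made up for illustration):

```python
import multiprocessing


def square(x):
    """Trivial task so that any failure points at the pool, not the work."""
    return x * x


def run_pool_check(n_workers=2):
    """Start a small worker pool and map a trivial function over a few inputs."""
    with multiprocessing.Pool(processes=n_workers) as pool:
        return pool.map(square, range(5))


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block on import.
    print(run_pool_check())
```

If this small script already fails or prints the same Finalize warnings, the problem is likely in the Python installation rather than in TPOT or scikit-learn.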

Comments (14)

  1. Gavin Gan Zhang Account Deactivated

    The following is what I got on an IBM i 7.3 system. It works for me, although with some warnings. One of the warnings shows that I am not using xgboost. Did you get this warning too, or do you already have xgboost working on i? In my experiments, xgboost does not work on i; I am still investigating that. I am not sure whether the issue you hit comes from xgboost or not.

    /qopensys/pkgs/lib/python3.6/site-packages/sklearn/ensemble/ DeprecationWarning: numpy.core.umath_tests is an internal NumPy module and should not be imported. It will be removed in a future NumPy release.
      from numpy.core.umath_tests import inner1d
    Warning: xgboost.XGBClassifier is not available and will not be used by TPOT.
    Generation 1 - Current best internal CV score: 0.9844391821591708              
    Generation 2 - Current best internal CV score: 0.9859404912419489            
    Generation 3 - Current best internal CV score: 0.9859404912419489               
    Generation 4 - Current best internal CV score: 0.9859404912419489               
    Generation 5 - Current best internal CV score: 0.9859404912419489             
    Best pipeline: LogisticRegression(SelectPercentile(PolynomialFeatures(input_matrix, degree=2, include_bias=False, interaction_only=False), percentile=21), C=25.0, dual=False, penalty=l1)
    /qopensys/pkgs/lib/python3.6/site-packages/scipy/stats/ FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
      return np.add.reduce(sorted[indexer] * weights, axis=axis) / sumval

    I ran this code with n_jobs=-1 and with n_jobs=3; both work for me.

  2. Clemens Zauchner reporter

    That’s interesting. I have just tried it again and it still does not work.

    Which versions are you using?

    For me it’s:

    Python 3.6.8
    TPOT 0.10.1
    sklearn 0.20.3
    numpy 1.15.4
    scipy 1.1.0

    And no, I don’t use xgboost on IBM i.

  3. Gavin Gan Zhang Account Deactivated
    bash-4.4$ yum list installed |grep -i "python3\."
    python3.ppc64                 3.6.8-1       @ibm                                
    bash-4.4$ yum list installed |grep -i "scikit"
    python3-scikit-learn.ppc64    0.19.1-6      @ibm                                
    bash-4.4$ yum list installed |grep -i "numpy"
    python3-numpy.ppc64           1.15.4-0      @ibm                                
    bash-4.4$ yum list installed |grep -i "scipy"
    python3-scipy.ppc64           1.1.0-0       @ibm                                
    bash-4.4$ pip3 list|grep -i tpot
    TPOT               0.10.1  

    Here’s mine.

  4. Gavin Gan Zhang Account Deactivated

    BTW, how did you get sklearn 0.20.3 on i? As far as I can see, the RPM only provides version 0.19.1.
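
One way to see which copy of each package Python actually imports (the RPM build or a pip-installed one) is to print the version and file location of every module involved. A minimal sketch, assuming only the standard library; the helper name report_versions is hypothetical and not part of TPOT or scikit-learn:

```python
import importlib


def report_versions(module_names):
    """Return {name: (version, path)} for each module name given."""
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Module is missing from this Python installation.
            report[name] = ("not installed", None)
            continue
        # Not every module defines __version__, so fall back to a placeholder.
        version = str(getattr(module, "__version__", "unknown"))
        path = getattr(module, "__file__", None)
        report[name] = (version, path)
    return report


if __name__ == "__main__":
    packages = ["tpot", "sklearn", "numpy", "scipy"]
    for name, (version, path) in report_versions(packages).items():
        print(name, version, path)
```

The printed path shows which site-packages directory each module was loaded from, which can then be compared against the versions that yum and pip3 report.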

  5. Clemens Zauchner reporter

    What’s the output of this in Python?

    import multiprocessing

  6. Clemens Zauchner reporter

    It’s part of the Python standard library, it ships with Python. It’s also the root cause of the problem I am experiencing.

  7. Gavin Gan Zhang Account Deactivated

    Can you show me the output of the following commands on your system?

    yum list installed |grep -i "python3"
    yum list installed |grep -i "scikit"

  8. Clemens Zauchner reporter

    yum list installed |grep -i "python3"

    python3.ppc64 3.6.8-1 @ibm
    python3-Pillow.ppc64 5.0.0-4 @ibm
    python3-asn1crypto.noarch 0.24.0-0 @ibm
    python3-bcrypt.ppc64 3.1.4-5 @ibm
    python3-cffi.ppc64 1.11.5-2 @ibm
    python3-cryptography.ppc64 2.2.2-2 @ibm
    python3-cycler.noarch 0.10.0-0 @/python3-cycler-0.10.0-0.ibmi7.2.noarch
    python3-dateutil.noarch 2.7.5-0 @ibm
    python3-devel.ppc64 3.6.8-1 @ibm
    python3-ibm_db.ppc64 @ibm
    python3-idna.noarch 2.8-0 @ibm
    python3-itoolkit.ppc64 1.6.0-0 @ibm
    python3-kiwisolver.noarch 1.0.1-0 @/python3-kiwisolver-1.0.1-0.ibmi7.2.noarch
    python3-lxml.ppc64 4.2.1-3 @ibm
    python3-matplotlib.ppc64 3.0.2-0 @/python3-matplotlib-3.0.2-0.ibmi7.2.ppc64
    python3-numpy.ppc64 1.15.4-0 @ibm
    python3-pandas.ppc64 0.22.0-4 @ibm
    python3-pip.noarch 9.0.1-2 @ibm
    python3-pycparser.ppc64 2.19-1 @ibm
    python3-pynacl.ppc64 1.2.1-3 @ibm
    python3-pyparsing.noarch 2.3.1-0 @/python3-pyparsing-2.3.1-0.ibmi7.2.noarch
    python3-pyzmq.ppc64 17.1.2-0 @/python3-pyzmq-17.1.2-0.ibmi7.2.ppc64
    python3-rpm.ppc64 @ibm
    python3-scikit-learn.ppc64 0.19.1-6 @ibm
    python3-scipy.ppc64 1.1.0-0 @ibm
    python3-setuptools.noarch 36.0.1-2 @ibm
    python3-six.noarch 1.10.0-0 @ibm
    python3-tkinter.ppc64 3.6.8-1 @ibm
    python3-wheel.noarch 0.29.0-2 @ibm

    yum list installed |grep -i "scikit"

    python3-scikit-learn.ppc64 0.19.1-6 @ibm

  9. Gavin Gan Zhang Account Deactivated

    Thanks for sharing. I would guess you installed some multiprocess library on your side. Can you share the output of “yum list all|grep mp”? Thanks.

  10. Clemens Zauchner reporter

    I have ‘downgraded’ and uninstalled the newer version of scikit-learn; now it works for me as well.

    And no, I have not installed it on my side. As mentioned, the multiprocessing library is part of the Python standard library, see
