Using Python package TPOT in parallel on IBM i 7.3
I am using Python 3.6 on IBM i 7.3. Installing TPOT (https://pypi.org/project/TPOT/) with pip3 install TPOT finished successfully.
Using the package with n_jobs=1 works without any problem:
from tpot import TPOTClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target,
                                                    train_size=0.75, test_size=0.25)
pipeline_optimizer = TPOTClassifier(generations=5, population_size=20, cv=5,
                                    random_state=42, verbosity=2, n_jobs=1)
pipeline_optimizer.fit(X_train, y_train)
print(pipeline_optimizer.score(X_test, y_test))
pipeline_optimizer.export('tpot_exported_pipeline.py')
Note that there is a warning at the start:
Exception ignored in: <Finalize object, dead>
Traceback (most recent call last):
  File "/QOpenSys/pkgs/lib/python3.6/multiprocessing/util.py", line 186, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/QOpenSys/pkgs/lib/python3.6/multiprocessing/synchronize.py", line 87, in _cleanup
    sem_unlink(name)
FileNotFoundError: [Errno 2] No such file or directory
and at the end:
Exception ignored in: <Finalize object, dead>
Traceback (most recent call last):
  File "/QOpenSys/pkgs/lib/python3.6/multiprocessing/util.py", line 186, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/QOpenSys/pkgs/lib/python3.6/multiprocessing/synchronize.py", line 87, in _cleanup
    sem_unlink(name)
FileNotFoundError: [Errno 2] No such file or directory
Optimising a pipeline in parallel (n_jobs=-1) fails with an error, which is attached to the issue. This is the code to reproduce the problem:
from tpot import TPOTClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target,
                                                    train_size=0.75, test_size=0.25)
pipeline_optimizer = TPOTClassifier(generations=5, population_size=20, cv=5,
                                    random_state=42, verbosity=2, n_jobs=-1)
pipeline_optimizer.fit(X_train, y_train)
print(pipeline_optimizer.score(X_test, y_test))
pipeline_optimizer.export('tpot_exported_pipeline.py')
Comments (14)
-
Account Deactivated -
reporter That’s interesting. I have just tried it again and it still does not work.
Which versions are you using?
For me it’s:
Python 3.6.8
TPOT 0.10.1
sklearn 0.20.3
numpy 1.15.4
scipy 1.1.0
And no, I don’t use xgboost on IBM i.
-
Account Deactivated
bash-4.4$ yum list installed |grep -i "python3\."
python3.ppc64 3.6.8-1 @ibm
bash-4.4$ yum list installed |grep -i "scikit"
python3-scikit-learn.ppc64 0.19.1-6 @ibm
bash-4.4$ yum list installed |grep -i "numpy"
python3-numpy.ppc64 1.15.4-0 @ibm
bash-4.4$ yum list installed |grep -i "scipy"
python3-scipy.ppc64 1.1.0-0 @ibm
bash-4.4$ pip3 list|grep -i tpot
TPOT 0.10.1
Here’s mine.
-
Account Deactivated BTW, how did you get sklearn 0.20.3 on i? I noticed that the RPM version is only 0.19.1.
-
reporter What’s the output of this in Python?
import multiprocessing
multiprocessing.cpu_count()
-
Account Deactivated Looks like I do not have this package on i. How did you install it?
-
reporter It’s part of the Python standard library, it ships with Python. It’s also the root cause of the problem I am experiencing.
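A minimal sketch for checking this from Python, assuming the sem_unlink tracebacks in the issue point at the semaphore support in multiprocessing. The helper name check_multiprocessing is hypothetical and not part of TPOT or the standard library; it only reports the CPU count and whether a lock (which is backed by a semaphore) can be created.

```python
import multiprocessing


def check_multiprocessing():
    """Report basic multiprocessing health (hypothetical diagnostic helper)."""
    info = {"cpu_count": multiprocessing.cpu_count()}
    try:
        # multiprocessing.Lock is backed by a semaphore, so creating
        # and acquiring one exercises the semaphore support.
        lock = multiprocessing.Lock()
        with lock:
            pass
        info["semaphores_ok"] = True
    except (ImportError, OSError) as exc:
        # ImportError: the platform lacks a working sem_open.
        # OSError: the semaphore could not be created or accessed.
        info["semaphores_ok"] = False
        info["error"] = repr(exc)
    return info


if __name__ == "__main__":
    print(check_multiprocessing())
```

If semaphores_ok comes back False, the error entry may help narrow down whether the problem is in the Python build or in a package on top of it.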
-
Account Deactivated Can you show me the output of the following commands on your system?
yum list installed |grep -i "python3"
yum list installed |grep -i "scikit"
-
reporter yum list installed |grep -i "python3"
python3.ppc64 3.6.8-1 @ibm
python3-Pillow.ppc64 5.0.0-4 @ibm
python3-asn1crypto.noarch 0.24.0-0 @ibm
python3-bcrypt.ppc64 3.1.4-5 @ibm
python3-cffi.ppc64 1.11.5-2 @ibm
python3-cryptography.ppc64 2.2.2-2 @ibm
python3-cycler.noarch 0.10.0-0 @/python3-cycler-0.10.0-0.ibmi7.2.noarch
python3-dateutil.noarch 2.7.5-0 @ibm
python3-devel.ppc64 3.6.8-1 @ibm
python3-ibm_db.ppc64 2.0.5.9-0 @ibm
python3-idna.noarch 2.8-0 @ibm
python3-itoolkit.ppc64 1.6.0-0 @ibm
python3-kiwisolver.noarch 1.0.1-0 @/python3-kiwisolver-1.0.1-0.ibmi7.2.noarch
python3-lxml.ppc64 4.2.1-3 @ibm
python3-matplotlib.ppc64 3.0.2-0 @/python3-matplotlib-3.0.2-0.ibmi7.2.ppc64
python3-numpy.ppc64 1.15.4-0 @ibm
python3-pandas.ppc64 0.22.0-4 @ibm
python3-pip.noarch 9.0.1-2 @ibm
python3-pycparser.ppc64 2.19-1 @ibm
python3-pynacl.ppc64 1.2.1-3 @ibm
python3-pyparsing.noarch 2.3.1-0 @/python3-pyparsing-2.3.1-0.ibmi7.2.noarch
python3-pyzmq.ppc64 17.1.2-0 @/python3-pyzmq-17.1.2-0.ibmi7.2.ppc64
python3-rpm.ppc64 4.13.0.1-17 @ibm
python3-scikit-learn.ppc64 0.19.1-6 @ibm
python3-scipy.ppc64 1.1.0-0 @ibm
python3-setuptools.noarch 36.0.1-2 @ibm
python3-six.noarch 1.10.0-0 @ibm
python3-tkinter.ppc64 3.6.8-1 @ibm
python3-wheel.noarch 0.29.0-2 @ibm
yum list installed |grep -i "scikit"
python3-scikit-learn.ppc64 0.19.1-6 @ibm
-
Account Deactivated Thanks for sharing. I would guess you installed some multiprocessing library on your side. Can you share the output of “yum list all|grep mp”? Thanks.
-
reporter I have ‘downgraded’ and uninstalled the newer version of scikit-learn, and now it works for me as well.
And no, I have not installed it on my side. As mentioned, the multiprocessing library is part of the Python standard library, see https://docs.python.org/3/library/
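Since yum reported scikit-learn 0.19.1 while Python was apparently importing 0.20.3, it may help to ask Python itself which versions it loads. The following is only a sketch: report_versions is a hypothetical helper, and it assumes each package exposes a __version__ attribute (tpot, sklearn, numpy and scipy do).

```python
import importlib
import platform


def report_versions(modules=("tpot", "sklearn", "numpy", "scipy")):
    """Return the versions Python actually imports (hypothetical helper)."""
    versions = {"python": platform.python_version()}
    for name in modules:
        try:
            mod = importlib.import_module(name)
            # Fall back to "unknown" if the module has no __version__.
            versions[name] = getattr(mod, "__version__", "unknown")
        except ImportError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(name, version)
```

Comparing this output with yum list installed and pip3 list shows whether a pip-installed package is shadowing the RPM version.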
-
Account Deactivated Glad to hear that it works for you now.
-
reporter I can also confirm that it works in parallel. thanks for the support!
-
reporter - changed status to resolved
The following is what I got on one IBM i 7.3 system. Looks like it works for me, of course with some warnings. From one warning, you may have noticed that I am not using xgboost. Did you get this warning? Or do you already have xgboost working on i? My experimental test of xgboost suggests that it does not work on i. I am still investigating it. I am not sure whether the issue you hit comes from xgboost or not.
The code I am using here is with n_jobs=-1 or n_jobs=3. Both work for me.
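For reference, a sketch of how a negative n_jobs is commonly turned into a worker count. This follows the convention documented by joblib (n_cpus + 1 + n_jobs for negative values); whether TPOT 0.10.1 applies exactly this rule is an assumption, and resolve_n_jobs is a hypothetical name.

```python
import multiprocessing


def resolve_n_jobs(n_jobs, n_cpus=None):
    """Map an n_jobs setting to a worker count (joblib-style convention)."""
    if n_cpus is None:
        n_cpus = multiprocessing.cpu_count()
    if n_jobs == 0:
        raise ValueError("n_jobs == 0 has no meaning")
    if n_jobs < 0:
        # -1 means all CPUs, -2 means all but one, and so on,
        # but never fewer than one worker.
        return max(n_cpus + 1 + n_jobs, 1)
    return n_jobs


if __name__ == "__main__":
    print(resolve_n_jobs(-1))
```

Under this convention, n_jobs=-1 and n_jobs=3 both start several worker processes, so both exercise the multiprocessing code path that n_jobs=1 avoids.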