Error in process parallelization while importing ecoinvent database

Issue #17 resolved
Anonymous created an issue

I just installed brightway2 and I am trying to import the ecoinvent 3.5 database. I'm using the lines from the "how to get started" notebook:

    from brightway2 import *

    # create project
    projects.set_current("ecoinvent-import")

    # load ecoinvent db
    ei35default = SingleOutputEcospold2Importer(
        r"C:\Users\name\Desktop\LCA\resources\ecoinvent v3.5\APOS\datasets",
        "ecoinvent 3.5 APOS"
    )

The code starts and shows a message:

Extracting XML data from 16045 datasets

Shortly after the following error pops up:

    File "C:\Users\user\AppData\Local\Programs\Python\Python36\lib\multiprocessing\process.py", line 105, in start
        self._popen = self._Popen(self)

    RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

The code keeps running, showing the same error over and over again. I let it run for an hour, then I aborted. There seems to be some problem with the parallelization. Any ideas on how to fix it?

I am using Python 3.6 in the PyCharm IDE on Windows.
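
For reference, the idiom named in the error message can be sketched as follows. This is a minimal, generic illustration and not the brightway2 import itself: `extract_one` and `run_import` are hypothetical stand-ins for per-file work and for a function that may start worker processes. On Windows, child processes are spawned and re-import the main module, so any code that starts processes at module level is executed again in every child unless it is protected by the guard.

```python
import multiprocessing


def extract_one(path):
    # Hypothetical stand-in for the per-file work; here it only measures the name.
    return len(path)


def run_import(paths, use_mp=True):
    # Hypothetical driver: process the paths in a pool, or serially if use_mp is False.
    if not use_mp:
        return [extract_one(p) for p in paths]
    with multiprocessing.Pool(processes=2) as pool:
        return pool.map(extract_one, paths)


if __name__ == "__main__":
    # Only the parent process runs this block; spawned children skip it
    # when they re-import the module.
    multiprocessing.freeze_support()  # only needed for frozen executables
    print(run_import(["a.spold", "bb.spold"]))
```

Applied to the script above, this would presumably mean moving the `projects.set_current(...)` and `SingleOutputEcospold2Importer(...)` calls under such a guard when the file is run as a script.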

Comments (4)

  1. Chris Mutel repo owner

    There have been problems with multiprocessing on Windows before, and I thought that the importer defaulted to not using multiprocessing on Windows, but I don't see this code anymore.

    For the time being, you can do this:

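    from bw2io import SingleOutputEcospold2Importer
    from bw2io.extractors.ecospold2 import Ecospold2DataExtractor

    class FixedExtractor:
        """Wrap the default extractor so that multiprocessing is always disabled."""
        @classmethod
        def extract(cls, dirpath, db_name):
            # use_mp=False extracts the datasets in a single process
            return Ecospold2DataExtractor.extract(dirpath, db_name, use_mp=False)


    # Pass the wrapper in place of the default extractor
    ei = SingleOutputEcospold2Importer(
        "/Users/cmutel/Sync/3.5/cutoff/datasets",
        "something",
        extractor=FixedExtractor
    )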
    
  2. Adrian Haas

    I did some quick checks, and it appears that the problem is specific to PyCharm on Windows. From the console, IPython, or Jupyter, Windows uses multiprocessing successfully (all cores busy). PyCharm on Linux also works fine.
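
The Windows default mentioned in the first comment could be approximated in user code along these lines. This is only a sketch under stated assumptions: `default_use_mp` is a hypothetical helper, not part of bw2io, and it turns multiprocessing off for every Windows environment, not only for PyCharm.

```python
import sys


def default_use_mp(platform=sys.platform):
    # Hypothetical helper: disable multiprocessing on Windows, enable it elsewhere.
    # sys.platform is "win32" on Windows, "linux" on Linux, "darwin" on macOS.
    return not platform.startswith("win")
```

The result could then be passed as the `use_mp` argument in the `FixedExtractor` workaround instead of the hard-coded `False`.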
