pipeline runs one ThresholdTask per replicate (only one is needed)
could this be addressed by making ThresholdTask extend a new CombiningTask task class (extending luigi.WrapperTask) which would wrap an inner task class that is not parametrized by rep (so it will only be scheduled once)?
Comments (2)
reporter -
reporter - changed status to resolved
multiple changes
fixes #30 by changing the logic used to decide whether or not a task should be parallelized over replicates: previously the string key of the task was tested to see if it ended in 'threshold', now the top-level task is resolved to a task class name and checked for equality with 'MakeThreshold'
change in Docker image version management: previously inside the Docker image lib5c would always report version "unknown" because versioneer cannot determine the version information since the .git folder is excluded from the image build by the .dockerignore, now the output of "git describe" can be passed to the Dockerfile at build time to pass this through via a folder naming trick
bumped package versions in requirements (notably python-daemon may now be installed at its latest version, previously 2.1.1 was being forced by setup.py)
changed version reporting style in versioneer to match "git describe" output for simplicity
→ <<cset 44104d138f6b>>
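The changeset itself is not shown here, so the following is only a rough sketch of how a folder naming trick of this kind can work. It assumes the version is recovered from a directory named `<prefix><version>`, in the spirit of versioneer's parent-directory fallback. The `version_from_dirname` helper, the `lib5c-` prefix, and the example paths and version string are all hypothetical.

```python
import os


def version_from_dirname(root, prefix="lib5c-"):
    """Recover a version string from a directory named '<prefix><version>'.

    Hypothetical helper: returns "unknown" when the directory name does not
    carry a version, which is the situation described above when .git is
    excluded from the image build.
    """
    # normpath drops any trailing slash so basename returns the last component
    name = os.path.basename(os.path.normpath(root))
    if name.startswith(prefix) and len(name) > len(prefix):
        return name[len(prefix):]
    return "unknown"


# Hypothetical build directories: the first is named after "git describe" output
with_version = version_from_dirname("/build/lib5c-0.5.3-2-g44104d1")
without_version = version_from_dirname("/build/lib5c")
```

Under these assumptions the build only has to copy the source into a directory whose name embeds the "git describe" output passed in at build time.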
MakeThreshold extends JointInnerMixin, so as long as it gets scheduled only once it will trigger scheduling of all upstream tasks in a per replicate fashion
in PipelineTask.requires(), a conditional checks to see if the task key ends in the substring 'threshold', and if it does, then only one of that task is scheduled (instead of one per replicate)
this bug therefore reduces to the failure of that conditional to check the value of the task - it should not depend on the key since that is user-controlled
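A minimal sketch of the two checks, not the actual PipelineTask.requires() code. The `tasks` mapping, the `schedule_count` helper, the replicate names, and the `SomePerReplicateTask` class name are hypothetical; only the "key ends in 'threshold'" test and the 'MakeThreshold' class name come from this issue.

```python
# Hypothetical config: user-chosen task keys mapped to task class names
tasks = {
    "threshold": "MakeThreshold",
    "my_cutoffs": "MakeThreshold",       # same task class, different user key
    "expected": "SomePerReplicateTask",  # hypothetical per-replicate task
}

REPLICATES = ["rep1", "rep2", "rep3"]  # hypothetical replicate names


def is_joint_by_key(key, task_classes):
    """Key-based check: depends on the user-controlled key string."""
    return key.endswith("threshold")


def is_joint_by_class(key, task_classes):
    """Value-based check: resolve the key to a task class name first."""
    return task_classes[key] == "MakeThreshold"


def schedule_count(key, task_classes, is_joint):
    """Number of task instances that would be scheduled for `key`."""
    return 1 if is_joint(key, task_classes) else len(REPLICATES)


# The key-based check misses a MakeThreshold task registered under another key
by_key = schedule_count("my_cutoffs", tasks, is_joint_by_key)
by_class = schedule_count("my_cutoffs", tasks, is_joint_by_class)
```

With the key-based check, the task registered as "my_cutoffs" is scheduled once per replicate even though a single instance is enough; the value-based check schedules it once regardless of the key.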