relion_flex_analyse_mpi

Issue #31 new
L Kater created an issue

Hi, when I use the relion_flex_analyse_mpi tool to do partial signal subtraction after multi-body refinement, I often (though not always) get fewer particle entries in the resulting particles.star files than images in the stacks.

It might be user error, as I am not entirely sure this is how the program was meant to be run. It works fine with relion_flex_analyse; it just takes forever.

For example, if I use

mpiexec -n 5 relion_flex_analyse_mpi --data MultiBody/job285/run_ct4_data.star --model MultiBody/job285/run_ct4_model.star --bodies multibody_1.star --o Subtract/j285_bx250cen/particles --subtract --keep_inside MaskCreate/job110/mask.mrc --boxsize 250 --norm --ctf | tee Subtract/j285_bx250cen/run.out

I will get the following output:

 Reading in data.star file ...
 Reading in model.star file ...
 Initialising bodies ...
 Setting up subtraction masks and projectors ...
 The center of mass of the provided mask is at: -79.2703 -95.2927 47.3996
 Thus, the following arguments to relion_image_handler will bring the mask into the same box as the subtracted particles.
  --shift_x 79.2703 --shift_y 95.2927 --shift_z -47.3996 --new_box 250
 Processing all particles ... 
16.23/51.32 min ..................~~(,_,"> The relion GUI has been idle for more than 3600 seconds, exiting now... 
51.55/51.55 min ............................................................~~(,_,">
--------------------------------------------------------------------------
mpiexec has exited due to process rank 3 with PID 83988 on
node be-cryogpu05 exiting improperly. There are three reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

3. this process called "MPI_Abort" or "orte_abort" and the mca parameter
orte_create_session_dirs is set to false. In this case, the run-time cannot
detect that the abort call was an abnormal termination. Hence, the only
error message you will receive is this one.

This may have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).

You can avoid this message by specifying -quiet on the mpiexec command line.

--------------------------------------------------------------------------

and the following output files (five .mrcs files, but only three .star files, each containing 1/5 of the particles):

particles_body001_mask.mrc
particles_body002_mask.mrc
particles_body003_mask.mrc
particles_000002_subtracted.mrcs
particles_000004_subtracted.mrcs
particles_000004_subtracted.star
particles_000001_subtracted.mrcs
particles_000003_subtracted.mrcs
particles_000003_subtracted.star
particles_000005_subtracted.mrcs
particles_000005_subtracted.star
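
A quick diagnostic sketch (my own check, not a RELION tool; the Subtract/j285_bx250cen paths are taken from the command above) compares the number of particle rows in each *_subtracted.star with the image count of the matching *_subtracted.mrcs stack, and flags stacks whose star file is missing. The image count of an MRC stack is NZ, the third 32-bit integer of the header:

import glob
import os
import struct

def count_star_rows(path):
    # Count the data rows of the particle table: everything after the
    # data_/loop_/_rln header lines that is non-blank.
    rows, in_loop = 0, False
    with open(path) as f:
        for line in f:
            s = line.strip()
            if not in_loop:
                if s and not s.startswith(("data_", "loop_", "_rln", "#")):
                    in_loop = True
                    rows += 1
            elif s:
                rows += 1
    return rows

for mrcs in sorted(glob.glob("Subtract/j285_bx250cen/particles_*_subtracted.mrcs")):
    with open(mrcs, "rb") as f:
        nx, ny, nz = struct.unpack("<3i", f.read(12))  # MRC header starts NX, NY, NZ
    star = mrcs.replace(".mrcs", ".star")
    if os.path.exists(star):
        print(mrcs, nz, "images,", count_star_rows(star), "star entries")
    else:
        print(mrcs, nz, "images, star file MISSING")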

Comments (3)

  1. Takanori Nakane

    It is very strange that particles_000001_subtracted.star and particles_000002_subtracted.star were not generated.

    • Did you run all MPI processes on a single node or over multiple nodes?
    • If you run the non-MPI version on the same dataset, does it always finish cleanly?
  2. L Kater reporter

    All of these were run on one workstation directly from the command line (i.e. no queuing system). I have not tried this many times, since without MPI it takes about 5 h. In every case I tried without MPI (maybe 4 times), it ran through cleanly; in most cases with MPI it produced only a subset of the star files. I can think of only one case where I got all star files, with entries for all particles, while using MPI. If I recall correctly, I still got the error message (mpiexec has exited due to process rank 3 with PID...). Is it supposed to produce so many star files in the first place? It seems it would be more useful if it joined them at the end.

  3. Takanori Nakane

    Each MPI process generates one MRCS and one STAR file; the STAR files are joined at the end, as you suggested. Somehow the program crashes on your computer before joining them. (A manual-join sketch follows below.)

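
Until that is fixed, here is a minimal sketch of the manual join described in comment 3, assuming all per-rank star files come from the same run (and therefore share one column layout); the output name particles_subtracted_joined.star is just a placeholder. It copies the header from the first file and appends the data rows of every file:

import glob

files = sorted(glob.glob("Subtract/j285_bx250cen/particles_*_subtracted.star"))

header_done = False
with open("Subtract/j285_bx250cen/particles_subtracted_joined.star", "w") as out:
    for path in files:
        in_loop = False  # becomes True at the first data row of each file
        with open(path) as f:
            for line in f:
                s = line.strip()
                if not in_loop and s and not s.startswith(("data_", "loop_", "_rln", "#")):
                    in_loop = True
                if in_loop:
                    if s:                # skip trailing blank lines
                        out.write(line)
                elif not header_done:
                    out.write(line)      # write data_/loop_/label header only once
        header_done = True

Note this only recovers runs where all per-rank star files were written; in the run above, ranks 1 and 2 never wrote theirs, so those particles would still be missing.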