Allow opting out of precalculated AB matrices when onthefly_shifts = true
I encountered this problem in v2.1, but it looks like v3 would behave the same.
For sub-tomogram averaging, onthefly_shifts defaults to true, which causes AB matrices to be pre-calculated. Even with small boxes (170) and fine translational sampling, this leads to a very large memory footprint, severely limiting the number of MPI processes I can spawn. However, GPU calculations don't use these matrices and calculate sin & cos values on the fly. I commented out the pre-calculation, and everything seems to be running fine with plenty of processes. Obviously, this will break the failsafe mode because CPU calculations would still need the matrices. But if a data set doesn't cause fallbacks, it's not a problem.
I think it would be great to have an option to disable AB matrices for sub-tomo averaging on GPUs in case a user is sure failsafe mode won't be encountered.
Comments (6)
-
-
"Obviously, this will break the failsafe mode because CPU calculations would still need the matrices. But if a data set doesn't cause fallbacks, it's not a problem."
I don't understand this. When GPU calculation fails, the program simply dies. There is no automatic fallback to the CPU code path.
-
reporter Something like this: 7d228da
Sorry, that's how I imagined failsafe mode worked – fall back to CPU for FP64 calculations. Should have looked into the code.
-
Actually, --do_shifts_onthefly is disabled when --cpu or --gpu.

    if (do_shifts_onthefly && (do_gpu || do_cpu))
    {
        std::cerr << "WARNING: --onthefly_shifts cannot be combined with --cpu or --gpu, setting do_shifts_onthefly to false" << std::endl;
        do_shifts_onthefly = false;
    }

However, this is turned on again later if mymodel.data_dim == 3, which is the problem. So I'd like to fix this as follows. Could you please confirm if this makes sense? I don't work on tomography at all, so I cannot test myself.

diff --git a/src/ml_optimiser.cpp b/src/ml_optimiser.cpp
index 8b9a0b0..22f3d1a 100644
--- a/src/ml_optimiser.cpp
+++ b/src/ml_optimiser.cpp
@@ -1856,7 +1856,7 @@ void MlOptimiser::initialiseGeneral(int rank)
 		// Don't do norm correction for volume averaging at this stage....
 		do_norm_correction = false;
-		if (!((do_helical_refine) && (!ignore_helical_symmetry))) // For 3D helical sub-tomogram averaging, either is OK, so let the user decide
+		if (!((do_helical_refine) && (!ignore_helical_symmetry)) && !(do_cpu || do_gpu)) // For 3D helical sub-tomogram averaging, either is OK, so let the user decide
 			do_shifts_onthefly = true; // save RAM for volume data (storing all shifted versions would take a lot!)
 		if (do_skip_align)
-
reporter Can confirm, this solves the problem!
There is some irony in GPU calculations requiring orders of magnitude less CPU memory than the CPU code path.
-
- changed status to resolved
Fixed in commit 78d2161. Thank you very much for confirmation.
Can you show us the actual change (patch) you want to be incorporated?