Large memory consumption from `wdriftttp` array

Issue #100 resolved
David Dickinson created an issue

The (real valued, i.e. 64 bits per element) wdriftttp array is allocated as allocate (wdriftttp(-ntgrid:ntgrid,ntheta0,naky,negrid,nspec,2))so can become quite large for nonlinear simulations and the memory requirements do not scale with the number of processors (each process will allocate this full array).

The other related drift arrays, such as wdrift, all allocated as distribution function shaped objects, i.e. allocate (wdrift(-ntgrid:ntgrid,2,g_lo%llim_proc:g_lo%ulim_alloc)). This ensures the memory requirements are distributed, so reduce with increasing processor count.

I think we should be able to switch the wdriftttp form to match the wdrift form in order to have distributed memory usage. This will increase the memory usage if running on a single core by a factor nlambda but we would anticipate jobs being run on a low number of cores having relatively low resolution so we wouldn’t expect hardware resource to be an issue in such cases.

There may be more efficient ways to store this information, considering it is only non-zero at theta grid points where there is a totally trapped particle.

Comments (5)

  1. David Dickinson reporter

    Looking at this code it seems there’s possibly one bug with this code already. The array wdriftttp doesn’t depend on lambda but we could potentially assign to this for multiple values of the pitch angle depending on the properties of ittp.

    Currently the code looks something like

           do ig = -ntgrid, ntgrid
              if (ittp(ig) == nlambda+1) cycle
              do il = ittp(ig), nlambda
                 if (forbid(ig, il)) exit
                 !GWH+JAB: should this be calculated only at ittp? or for each totally trapped pitch angle? (Orig logic: there was only one totally trapped pitch angle; now multiple ttp are allowed.                                            
                 wdriftttp(ig,it,ik,ie,is,1) &
                      = wdrift_func(ig,ittp(ig),ie,it,ik,is)*tpdriftknob
                 !CMR:  totally trapped particle drifts also scaled by tpdriftknob                                                                                                                                                                
                 ! add Coriolis drift to magnetic drifts                                                                                                                                                                                          
                 wdriftttp(ig,it,ik,ie,is,2) = wdriftttp(ig,it,ik,ie,is,1) &
                      - wcoriolis_func(ig,il,ie,it,ik,is)
                 wdriftttp(ig,it,ik,ie,is,1) = wdriftttp(ig,it,ik,ie,is,1) &
                      + wcoriolis_func(ig,il,ie,it,ik,is)
              end do
           end do
    

    The line if (ittp(ig) == nlambda+1) cycle skips setting wdriftttp at any theta for which there are no totally trapped particles. This is fine.

    The lines

              do il = ittp(ig), nlambda
                 if (forbid(ig, il)) exit
    

    show that we’re potentially iterating over several totally trapped pitch angles, starting at the one totally trapped at this theta and the exiting once we get to the first pitch angle that is forbidden at this location. Either we can only ever have exactly one pitch angle for which at the theta index ig , ittp(ig) < nlambda + 1 and forbid(ig, il) = .false. or there is a bug here and we only get the Coriolis contribution from the highest non-forbidden pitch angle at this point. This is unlikely to be significant I think as mach is often 0 in which case the Coriolis contribution would be 0 for all pitch angles.

  2. David Dickinson reporter

    Second minor potential bug is that the Coriolis drift contribution is not weighted by tpdriftknob.

  3. Log in to comment