Numerical instability for small ion collision frequencies

Issue #147 new
Michael Hardman created an issue

In the course of attempting a scan in collisionality, I have noticed a very perplexing bug.

I have been performing my simulations of extended modes (nperiod > 20) at zero ion collisionality (vnewk =0 for ions), yet with collisions for electrons ranging from 0 to 100 (going on a log scale). I now wish to reproduce these results including the ion collision frequency with its physical value.

For small collision frequencies of ions (here vnewk = 10^-7), I see a numerical instability develop at large theta. This instability results from the Lorentz part of the collision operator, as can be seen from my tests. The attached files are as follows:

itge.vnewk.1e-05.default.in – The standard problem where I noticed the instability, full collision operator for both electrons and ions

itge.vnewk.1e-05.nocols.in – The ion collision frequency is set to 0, but the electrons retained their collision frequency (Numerical instability disappears)

itge.vnewk.1e-05.test.in – The ion collision frequency is nonzero, and the Lorentz collision operator model is used. Numerical instability appears.

itge.vnewk.1e-05.ediff.in – The ion collision frequency is nonzero, and only the energy diffusion part of the collision operator is used. numerical instability disappears.

The simulations were performed using the branch "ms_pgelres" with commit number

ade578037218496bd9d328a55212e4f0add7529f

Any help appreciated. David listed as he has been looking at the collision operator recently.

Comments (94)

  1. David Dickinson

    Just to confirm this goes away with ei_coll_only = T as would be expected. Increasing ky causes the numerical instability to blow up faster, reducing it prevents the instability from arising (within the duration of the simulation at least). To make the test case cheaper it is possible to set theta0=40 and nperiod=3 – for ky=0.5, shat=0.8, theta0=0.1 we see the mode blow up significantly at around theta=100. (Setting theta0=40 should give us similar kperp2 at theta=0 as at theta=100 with theta0=0.1).

  2. David Dickinson

    The problem also seems to persist with adiabatic electrons and (using PR #431) the instability also arises in the homogeneous problem (i.e. in the absence of sources).

  3. David Dickinson

    Increasing ntheta to 128 seems to avoid the instability in the reduced case. Clearly this is rather expensive to test in the original case but may be worth a go.

  4. David Dickinson

    For ntheta = 46 the instability is there but at ~half the growth rate. It is still there at ntheta=56 but appears to be stable by ntheta=60. It may be interesting to explore the lambda grid generated for these different theta resolutions.

  5. Michael Hardman reporter

    In the meantime I have been looking at bakdiff. The instability goes away for bakdiff = 0.05 – now trying to go lower. Unfortunately I need to have bakdiff = 0 exactly for the electron species in order to do the mass ratio scan.

  6. David Dickinson

    It’s also possible to remove the instability by increasing delt or making fexpr sufficiently negative, although I would anticipate this also changing the ability to capture the interesting physics.

  7. Michael Hardman reporter

    There appears to be an option to set bakdiff separately for each species. This can be achieved when mult_imp = .true. in the dist_fn_knobs namelist. I find the instability persists when bakdiff = 0.025 (for ions), so it seems that one cannot go far below bakdiff = 0.05 for this case. I will try to redo my mass ratio study with these parameters, but is quite unsatisfying to have to hack the ion species this way. I am not sure what consequences that might be of using different backwards differencing for the different species.

    I agree that it would be interesting to look into the different lambda grids for this case. I will report back on what happens to the ntgrid = 64 and ntgrid = 128 examples. Needless to say, these parameters would make production runs very expensive (especially given that these simulations here use the hydrogen mass, and I typically have been using masses larger than the deuterium mass to test the mass ratio expansions).

  8. David Dickinson

    The handling of different bakdiff for different species is not very consistent in the code. For example, in the source term we always use the bakdif of the first species.

    The instability also goes away if the drifts are zeroed out (presumably where the kperp2 dependence comes in). Given bakdif is also a way to remove this I think it is probably in the coefficients calculated in init_invert_rhs where the issue could perhaps arise. These are the main terms which enter the homogeneous problem. Artificially removing the v|| terms from these coefficients has no effect, whilst removing the drift terms from here only does suppress the instability.

    The instability growth rate is also sensitive to the energy grid (increasing negrid increases the growth rate, increasing nesuper at fixed negrid also seems to increase the growth rate).

  9. Michael Hardman reporter

    How do we explain the fact that the numerical instability seems absent for nu_ii (vnewk) = 0? I could believe there is a separate problem with init_invert_rhs and the drifts. For interest, I tried setting cfac = 0 in the collision operator and the instability remained.

    In the orginal “default” case, the instability persists for ntgrid = 64, but has disappeared for ntgrid =128 (55mins on 48 cores!) With the importance of the Lorentz collision operator for the instability, right now I think this pinpoints the lambda grid as the issue.

    Files attached

  10. Michael Hardman reporter

    itge.vnewk.1e-05.default.large – The original default case with ntgrid increased to 128, bakdiff =0 for both species and nonzero collision frequencies for both species. No numerical instability is observed

  11. Michael Hardman reporter

    David, do you still find a numerical instability when nu_ii = 0 in the simplified homogeneous case that you are examining? Is your opinion that the instability arises independently of the collision operator, and that I am just unlucky to see it here? Do I need to test the "default.large" case with increased negrid points?

  12. David Dickinson

    Removing any kperp2 dependence in the collisions module does not remove the instability, yet we know the instability only arises when kx/kperp is large (or at least the growth rate is dependent on this).

    Setting cfac=0 and turning off conservation removes any direct dependence on any of the collisions code on any property of the species other than vnewk and type. In other words no mass dependence enters directly.

    I do not see the instability with zero ion collisions in the simplified system.

    I don’t observe the instability in the simplified case without collisions, but equally I don’t observe the instability with collisions but zero drifts. It’s not uncommon for there to be a numerical instability that arises through the interaction of different terms. In the collisionless homogeneous problem with zero parallel velocity we’re essentially describing something like g.(omega-omega_D) = 0 where omega_D ~ kperp V_D is velocity dependent. I’m more used to thinking about the slab problem where we drop omega_D and keep v||, the equation then describes parallel advection. As the vperp/vpar values on our v-space grid change with theta structures* on our velocity grid become increasingly sheared as the structure advects along theta. This results in the development of small scale structure on the velocity grid. One possibility is that a similar thing is happening here such that we’re developing small scale structure, such that terms like nu * d g/ dlambda can become large even when nu is small. Of course this can’t really explain directly the instability as the description may similarly hold for electrons which have a larger nu and hence such terms may be expected to behave in the same way. One possibility is that using delt=0.1 is damping out any equivalent electron instability due to stepping over the relevant time scale (can note that setting delt=10 allows us to avoid the ion instability, for example). Another possibility is that at higher nu the collisions are sufficient to prevent such small scale structure from developing to any significant extent. It may be interesting to see what the adapative collision frequency algorithm suggests for this problem.

    Setting the electron mass to 1, the electron collision frequency to 1e-7 and the ion collision frequency to zero we find the same instability. Dropping the mass to 0.1 the instability is there but at a reduced growth rate. Dropping the time step from 0.1 to 0.01 in this case increases the growth rate a bit further (indicating we are getting some stabilisation from the time step/implicit scheme). Dropping the time step further only marginally increases the growth rate. If we consider an electron mass of 0.001 then we may expect to see an instability provided the time step is around 0.001 (actually we probably don’t expect a linear dependence, but it gives a rough ballpark to aim for). Of course we would need rather large nstep as well I believe. So far I’ve not found an instability with the mass reduced to 0.01 and delt =0.0001, perhaps this also changes the required vnewk for instability – this seems to be the case as for mass = 0.01, delt = 0.01 and electron vnewk = 1e-9 I do again see instability.

    We’ve also seen numerical instabilities in collisionless electromagnetic cases which only become unstable at sufficiently small time step (see issue #142). Whilst I don’t think these are likely to be directly related, it at least serves to highlight that there can be numerical instabilities lurking in the system which our implicit scheme can suppress in linear simulations but may then arise in NL runs when the time step drops low enough. I have some thoughts for how we may be able to avoid this in NL runs, but it would be nice if we could find a way to eliminate such instabilities from the system to start with.

    *Here by structures I’m considering something like a gaussian blob in theta that is initialised to be constant in velocity-space. Issue #144 discusses a collisionless homogenous periodic slab test case which may help my explanation.

  13. David Dickinson

    As vnewk drops the associated timescale increases. Calculating the relevant timescale and comparing to the time step may help highlight when we are able to resolve this collisional time scale. This time scale may depend on the scale of structures in velocity space as well as vnewk and the velocity space resolution, which could explain why the instability seems to be sensitive to the drifts (growth rate drops as drifts scaled down) as this could be viewed as setting some form of small scale structure formation timescale (crudely speaking).

    The dependence on energy grid resolution may come from that fact that for increasing resolution our minimum and maximum energy values move towards more extreme values. One might imagine that the highest energies are where the velocity space structures become sheared most rapidly, but could be interesting to check this.

  14. David Dickinson

    Zeroing the drifts for trapped particles only seems to be sufficient to remove the instability.

  15. Michael Hardman reporter

    Which knob did you use to remove the trapped particle drifts? This observation is very interesting, because changing the number of ngauss passing particle points did not remove the numerical instability, whereas you showed that increasing ntheta (and hence the number of trapped lambdas) did remove the instability.

  16. David Dickinson

    Ah, if I make the wfb be treated as a passing particle when we scale the drifts then the instability no longer goes away when I zero the “trapped” particle drifts but now goes away when we zero the “passing” particle ones. This strongly suggests that the wfb is a key part of this issue I think.

  17. David Dickinson

    The flag tpdriftknob scales the trapped particle drifts, whilst driftknob scales the passing drifts (and tpdriftknob defaults to driftknob)

  18. Michael Hardman reporter

    And you are using the wfbbc_option to switch between “trapped” and “passing” for wfb?

    I will try this too. It’s usually the wfb..

  19. Michael Hardman reporter

    I see, you have only changed whether or not tpdriftknob affects the wfb or not? The wfbbc_option changes whether or not the wfb is treated as a passing or a trapped particle for the parallel solve. I have tried switching it between the various options and it didn't make a difference for me, hence my interest.

  20. Michael Hardman reporter

    I confirm that using tpdriftknob to zero out the magnetic drifts for the wfb (and only the wfb, i.e. il = ng2 +1) removes the numerical instability in my original case.

  21. Michael Hardman reporter

    Looking at init_wdrift and wdrift_func, why are there no forbid statements here for traps and passing particles, when there are forbid statements for totally trapped particles? Is it possible that some noise (due to 1- lambda B = 0 not being satisfied exactly) is getting into the drifts at the bounce points of the wfb?

  22. David Dickinson

    That could be an issue I suppose. Our v||^2 calculation energy(ie,is)*(1.0 - al(il)*bmag(ig)) in the drift frequency can clearly go negative when considering a trapped particle in the forbidden region. One may argue this is fine provided we don’t then go and use the wdrift values in the forbidden region (which we shouldn’t do), however I can note that we actually put the drift on a mid-point grid (i.e. wdrift(ig) = (wdrift(ig)+wdrift(ig+1))/2 ) which will end up using the drift in the forbidden region to calculate the drift in the allowed region. This may be intentional but has caused me concern in the past. I also note that this means the value wdrift(ntgrid) is not treated consistently as this is not the value at a grid mid-point (however again I think we shouldn’t generally use this).

  23. David Dickinson

    I should also note that the calculation of wdrift is identical for trapped and passing pitch angles, the only difference is which scaling flag is used to scale the drift.

  24. Michael Hardman reporter

    I have just remembered that the wfb does not have any forbidden points by definition. Zeroing out the wdrift at ig = +-ntrgrid does not remove the instability. I shall have to look at the parallel solve -- maybe this business with the midpoint vs gridpoints holds the answer.

  25. David Dickinson

    It may be as mentioned earlier that the drift is just a way to generate velocity space structure to allow the collisional issue to take off. Is it possible that the velocity volume (i.e. integration weight) associated with the wfb is becoming rather small and that this then leads to a poorly conditioned collisional operator? I don’t think this picture would get better with increasing theta (/trapped pitch) resolution.

  26. Michael Hardman reporter

    I will take a look at the integration weights. For the example that I'm looking at now (default and default.large above), increasing ntheta does remove the instability, but I have to go to extreme values.

  27. David Dickinson

    If you use next there’s a small helper dump_grids which will write out the grids and integration weights without running the full code.

  28. David Dickinson

    Taking vn->0 I think that we should find aa=cc=0 and bb=1 in the Lorentz operator matrix elements. This corresponds to betaale=1 , qle=0.0 and c1le=0.0. Forcing these values for these arrays still leads to a blow up of the code. Removing the code which turns off/skips collisions when vnewk <=0 we can see that the numerical instability still arises when vnewk=0 for ions. This seems to imply that the instability is due to the code contained in solfp_lorentz_le_layout (and if we return from this routine immediately then we do not see instability).

  29. Michael Hardman reporter

    I will try to reproduce this, I think I see how to. Do you find numerical instability when you have turned off the drifts for the wfb?

    I am wondering if this problem here is related to the numerical instability for small dt and small ky, although I am struggling to find that test case.

  30. David Dickinson

    To enable the collision operator with zero collision frequency it is necessary to remove

        if (all(abs(vnew(:,1,:)) <= 2.0*epsilon(0.0))) then                                                                                                                                     
            collision_model_switch = collision_model_none                                                                                                                                        
           colls = .false.                                                                                                                                                                      
            return                                                                                                                                                                               
         end if 
    

    and to comment out the if (abs(vnew(ik,1,is)) < 2.0*epsilon(0.0)) cycle line in the solfp routines.

    Yes removing the wfb drift seemed to help remove the instability. I also wonder if it’s related to the small dt and small ky issue, although the issue here arises at large kperp instead.

    Commenting out most of the code so that it basically sets delta=gle ; gle=delta is sufficient for the instability to occur, despite if we simply return (equivalent to gle=gle) then there is no instability. This suggests there is likely some issue in how we are copying the data about (e.g. the logic in what we copy where). Of course just commenting out chunks of code isn’t necessarily a sensible thing to do. If I do this, however and remove if (jend(ig) /= 0) gle(2*jend(ig),ie,ile) = gle(jend(ig),ie,ile) then the instability does go away.

  31. Michael Hardman reporter

    I have confirmed for myself that commenting out the tridiagonal solve (and associated lines of code that try to clean up the fact that there are two vpar =0 points) in solfp_lorentz_standard_layout removes the numerical instability.

    I find this quite worrying, because I spent quite a long time in the subroutine during my DPhil, and I thought that I had fixed it. Obviously only adequately enough for my purposes at the time!

  32. David Dickinson

    To be explicit the reduced code I am using is

        do ile = le_lo%llim_proc, le_lo%ulim_proc
           it = it_idx(le_lo,ile)
           ik = ik_idx(le_lo,ile)
           if (kwork_filter(it,ik)) cycle
           if (ieqzip(it,ik)==0) cycle
           is = is_idx(le_lo,ile)
           ig = ig_idx(le_lo,ile)
    
    !   je  = #physical xi values at location, includes duplicate point at vpar=0                                                                                                                 
    !  je-1 = #physical xi values removing duplicate vpar=0 point                                                                                                                                 
           je=max(2*jend(ig),2*ng2+1)
           nxi_scatt=je-1
           if (jend(ig) == ng2+1 .and.special_wfb_lorentz) then
              nxi_scatt=nxi_scatt-1
           endif
    
           do ie = 1, negrid
           ! LINE TO REMOVE
              gle(je,ie,ile)=0.0d0  ! zero redundant duplicate xi, isign=2 for vpar=0!                                                                                                            
              ! right and left sweeps for tridiagonal solve:                                                                                                                                      
              delta(1) = gle(1,ie,ile)
              do il = 1, nxi_scatt
                 delta(il+1) = gle(il+1,ie,ile) !Usually : gle(il+1,ie,ile) - qle(il+1,ie,ile)*delta(il)                                                                                                                     
              end do
    
              gle(nxi_scatt+1,ie,ile) = delta(nxi_scatt+1)!Usually :delta(nxi_scatt+1)*betaale(nxi_scatt+1,ie,ile)                                                                                                           
              do il = nxi_scatt, 1, -1
                 gle(il,ie,ile) = delta(il) !Usually : (delta(il) - c1le(il,ie,ile)*gle(il+1,ie,ile))*betaale(il,ie,ile)                                                                                    
              end do
           ! LINE TO REMOVE          
              if (jend(ig) /= 0) gle(2*jend(ig),ie,ile) = gle(jend(ig),ie,ile)
    
           end do
        end do
    

    As written, this code leads to numerical instability. Removing the two lines following ! LINE TO REMOVE avoids the instability.

    Some notes:

    1. For this case je=2*jend(ig) everywhere.
    2. Behaviour not significantly different with choice of special_wfb_lorentz → can consider nxi_scatt = je-1
    3. We set gle(je) = 0.0 , we then set gle(nxi_scatt+1) = gle(je-1+1) = delta(nxi_scatt+1) (which is set to 0.0) and finally set gle(2*jend(ig)) = gle(je) = gle(jend(ig)). We set this duplicate point to zero, it can then enter the solve in the loop on line 22 (note the usual form allows this to be non-zero when vnewk >0), it is then set directly on line 26 to something potentially non-zero. We then overwrite this value on line 31 based on the other instance of this duplicate value. It seems strange that the duplicate point enters the right-left sweeps for the tridiagonal solve. I suspect this is why the special_wfb_lorentz flag has been added as this should avoid the duplicate point entering.

  33. Michael Hardman reporter

    I remember being troubled by your point 3) when I first looked at this routine. I think the resolution was that betaa(nxi_scatt+1) = 0 in the initialisation, although I will have to check this. I agree that the tridiagonal has no reason to extend to nxiscatt + 1. When I made some changes to the special_wfb_lorentz option some years ago I found that there was an inconsistency in the initialised matrix of coefficients, at theta = +- pi the length of the coefficients array was short by 1 element.

  34. Michael Hardman reporter

    Indeed, in init_lorentz

    te2 = 2*je-1

    and later

    qle(te2+1,ie,ile) = 0.0
    betaale(te2+1,ie,ile) = 0.0

    meaning that betaale(nxi_scatt+1) =0.

  35. Michael Hardman reporter

    By setting these values before the tri diagonal solve

    ql = 0.
    betaa = 1.
    c1 = 0.

    and commenting out these lines

    !glz(je,ilz)=0.0 ! zero unphysical/duplicate vpar=0 poin

    !if (jend(ig) /= 0) glz(2*jend(ig),ilz) = glz(jend(ig),ilz)

    , And setting vpar_zero_mean = .false. ,I lose the numerical instability.

    Now the code looks like

    !Hack
           ql = 0.
           betaa = 1.
           c1 = 0.
        !if( .not. skip_this_bug) then!MRH bugskip
        ! solve for glz row by row
        do ilz = lz_lo%llim_proc, lz_lo%ulim_proc
    
           ik = ik_idx(lz_lo,ilz)
           it = it_idx(lz_lo,ilz)
           if(kwork_filter(it,ik))cycle
           if (ieqzip(it,ik)==0) cycle
           is = is_idx(lz_lo,ilz)
    !       if (abs(vnew(ik,1,is)) < 2.0*epsilon(0.0)) cycle
           ie = ie_idx(lz_lo,ilz)
           ig = ig_idx(lz_lo,ilz)
    
    !CMRDDGC, 10/2/2014: 
    ! Fixes for wfb treatment below, use same je definition in ALL cases
    !   je  = #physical xi values + 1 
    !       NB +1 above WITH TRAPPED is duplicate xi=vpar=0 point with isign=2
    !          +1 above WITHOUT TRAPPED is entirely unphysical extra grid point
    !  je-1 = #physical xi values removing unphysical/duplicate extra point
           je=max(2*jend(ig),2*ng2+1)
           nxi_scatt=je-1
    
           if (je==2*jend(ig) .and. vpar_zero_mean) then 
           ! MRH if we have trapped particles and hence 2 vpar=0 points on the grid
           ! use the average of both vpar = 0 points in the lorentz diffusion in pitch angle
           ! this turns out to be necessary to suppress numerical instability
           ! when we handle pitch angle scattering physically at theta = +/- pi
           ! by setting special_wfb_lorentz =.false.
            glz(jend(ig),ilz)=(glz(jend(ig),ilz)+glz(2*jend(ig),ilz))/2.0
           endif 
    
           !glz(je,ilz)=0.0  ! zero unphysical/duplicate vpar=0 point
           if (jend(ig) == ng2+1 .and.special_wfb_lorentz) then
    !CMRDDGC:  special_wfb_lorentz = t  => unphysical handling of wfb at bounce pt: 
    !          remove wfb from collisions, reinsert later
    !
    ! first save gwfb for reinsertion later
              gwfb = glz(ng2+1,ilz) 
    ! then remove vpa = 0 point, weight 0: (CMR confused by this comment!)  
              glz(ng2+1:je-2,ilz) = glz(ng2+2:je-1,ilz)
              nxi_scatt=nxi_scatt-1
           endif
    !CMRDDGCend
    
           ! right and left sweeps for tridiagonal solve:
    
           delta(1) = glz(1,ilz)
           do il = 1, nxi_scatt
              delta(il+1) = glz(il+1,ilz) - ql(il+1,ilz)*delta(il)
           end do
    
    !CMR, 5/9/14:
    ! Correction to next line fixes a bug introduced at r2699 which
    ! affected Lorentz scattering operator at wfb bounce point (theta=pi)
    ! with default setting of special_wfb_lorentz=t        
           glz(nxi_scatt+1,ilz) = delta(nxi_scatt+1)*betaa(nxi_scatt+1,ilz)
           do il = nxi_scatt, 1, -1
              glz(il,ilz) = (delta(il) - c1(il,ilz)*glz(il+1,ilz))*betaa(il,ilz)
           end do
    
    !       ! interpolate to obtain glz(vpa = 0) point for wfb
    !       ! and insert this point into glz
           ! interpolation described above mysteriously causing numerical instability
           ! stabilized by using old (pre-collision) value of g for wfb
           if (jend(ig) == ng2+1.and.special_wfb_lorentz) then
              glz(ng2+2:je-1,ilz) = glz(ng2+1:je-2,ilz)
              glz(ng2+1,ilz) = gwfb
           end if
    
    !
    ! bug fixed 4.14.03
    ! was only a problem with collisions but with no trapped particles
    ! because of an array index out of bounds 
    ! Overall, this was a rare bug.
    !
           ! update the duplicate vpar=0 point with the post pitch angle scattering value 
           !if (jend(ig) /= 0) glz(2*jend(ig),ilz) = glz(jend(ig),ilz) 
    
        end do
        !endif ! MRH bugskip
    

  36. Michael Hardman reporter

    I think this just confirms that the original tri diagonal solve works in the trivial case.

    Fundamentally it looks like we are up against the duplicate vpar = 0 point.

  37. Michael Hardman reporter

    I am struggling to see how to make the pitch angle scattering routine work when nu = 0. The matrix coefficients appear to be fine. If there is an error it is likely to be in the limits of the loops.

    As a sanity check, I have checked that the energy diffusion tri diagonal solve has no problems with nu =0.

  38. Michael Hardman reporter

    After examining solfp_lorentz_standard_layout a bit further, I have been able to make the subroutine act as the identity operator on the distribution function. To achieve this I insert the lines 21 and 22 below, and commenting out lines 33 & 78, this ensures that the tri diagonal solve returns the collisionless g unchanged.

    I have not been able to find a solution where only a single vpar = 0 element (il = jend(ig) or il = 2*jend(ig)) participates in the tri-diagonal solve. Fundamentally, the gs2 collisionless algorithm has these two vpar = 0 points on the grid, whereas for the collision operator there should only be one vpar = 0 point. It seems to me that this might be the fundamental issue -- although I do not currently understand why switching off the magnetic drift for the wfb would smooth this over.

        do ilz = lz_lo%llim_proc, lz_lo%ulim_proc
    
           ik = ik_idx(lz_lo,ilz)
           it = it_idx(lz_lo,ilz)
           if(kwork_filter(it,ik))cycle
           if (ieqzip(it,ik)==0) cycle
           is = is_idx(lz_lo,ilz)
    !       if (abs(vnew(ik,1,is)) < 2.0*epsilon(0.0)) cycle
           ie = ie_idx(lz_lo,ilz)
           ig = ig_idx(lz_lo,ilz)
    
    !CMRDDGC, 10/2/2014: 
    ! Fixes for wfb treatment below, use same je definition in ALL cases
    !   je  = #physical xi values + 1 
    !       NB +1 above WITH TRAPPED is duplicate xi=vpar=0 point with isign=2
    !          +1 above WITHOUT TRAPPED is entirely unphysical extra grid point
    !  je-1 = #physical xi values removing unphysical/duplicate extra point
           je=max(2*jend(ig),2*ng2+1)
           nxi_scatt=je-1
           ! MRH Hack
           betaa(:,ilz) = 0.
           betaa(:nxi_scatt+1,ilz) = 1.
    
           if (je==2*jend(ig) .and. vpar_zero_mean) then 
           ! MRH if we have trapped particles and hence 2 vpar=0 points on the grid
           ! use the average of both vpar = 0 points in the lorentz diffusion in pitch angle
           ! this turns out to be necessary to suppress numerical instability
           ! when we handle pitch angle scattering physically at theta = +/- pi
           ! by setting special_wfb_lorentz =.false.
            glz(jend(ig),ilz)=(glz(jend(ig),ilz)+glz(2*jend(ig),ilz))/2.0
           endif 
    
           !glz(je,ilz)=0.0  ! zero unphysical/duplicate vpar=0 point
           if (jend(ig) == ng2+1 .and.special_wfb_lorentz) then
    !CMRDDGC:  special_wfb_lorentz = t  => unphysical handling of wfb at bounce pt: 
    !          remove wfb from collisions, reinsert later
    !
    ! first save gwfb for reinsertion later
              gwfb = glz(ng2+1,ilz) 
    ! then remove vpa = 0 point, weight 0: (CMR confused by this comment!)  
              glz(ng2+1:je-2,ilz) = glz(ng2+2:je-1,ilz)
              nxi_scatt=nxi_scatt-1
           endif
    !CMRDDGCend
    
           ! right and left sweeps for tridiagonal solve:
    
           delta(1) = glz(1,ilz)
           do il = 1, nxi_scatt
              delta(il+1) = glz(il+1,ilz) - ql(il+1,ilz)*delta(il)
           end do
    
    !CMR, 5/9/14:
    ! Correction to next line fixes a bug introduced at r2699 which
    ! affected Lorentz scattering operator at wfb bounce point (theta=pi)
    ! with default setting of special_wfb_lorentz=t        
           glz(nxi_scatt+1,ilz) = delta(nxi_scatt+1)*betaa(nxi_scatt+1,ilz)
           do il = nxi_scatt, 1, -1
              glz(il,ilz) = (delta(il) - c1(il,ilz)*glz(il+1,ilz))*betaa(il,ilz)
           end do
    
    !       ! interpolate to obtain glz(vpa = 0) point for wfb
    !       ! and insert this point into glz
           ! interpolation described above mysteriously causing numerical instability
           ! stabilized by using old (pre-collision) value of g for wfb
           if (jend(ig) == ng2+1.and.special_wfb_lorentz) then
              glz(ng2+2:je-1,ilz) = glz(ng2+1:je-2,ilz)
              glz(ng2+1,ilz) = gwfb
           end if
    
    !
    ! bug fixed 4.14.03
    ! was only a problem with collisions but with no trapped particles
    ! because of an array index out of bounds 
    ! Overall, this was a rare bug.
    !
           ! update the duplicate vpar=0 point with the post pitch angle scattering value 
           !if (jend(ig) /= 0) glz(2*jend(ig),ilz) = glz(jend(ig),ilz) 
    
        end do
    

  39. Michael Hardman reporter

    I am now puzzled by the following observation. Trapped particles have vpar = 0 twice on the grid, thus resulting in the special exceptions in the tri-diagonal solve in the code above. Nonetheless, we do not see numerical instabilities from trapped particles. Since we find that the numerical instability goes away for the cases where the world's fattest banana magnetic drifts are zero, this indicates that the numerical instability is related to the world's fattest banana.

    However, I have already implemented the option for the world's fattest banana to have trapped particle boundary conditions in the parallel solve (wfbbc_option = "trapped"), by using this option with the collision operator still gives numerical instability. Does this mean that the problem is not really in the collision operator, but in the parallel solve?

  40. David Dickinson

    Yes I think I agree that the inclusion of the duplicate point does work out fine, it just doesn’t seem necessary and hence causes me some confusion.

  41. David Dickinson

    Yes I think the issue is that we’re enforcing g(wfb,1) = g(wfb,2) at both ends of the theta grid when a collisionless solve does not (always) enforce this so we’ll find that our solution may experience a discontinuity in the parallel direction. I’ve been able to make the code go unstable by adding the below snippet at the end of timeadv and running without collisions

       do iglo = g_lo%llim_proc, g_lo%ulim_proc
           il = il_idx(g_lo, iglo)
           if(il/=ng2+1) cycle
           gnew(-ntgrid,1,iglo) = gnew(-ntgrid,2,iglo)
           gnew(ntgrid,1,iglo) = gnew(ntgrid,2,iglo)
        end do
    

    I’m not too familiar with your different wfb boundaries but I seem to recall the trapped treatment treats the wfb as bouncing at the ends of the parallel grid rather than at every instance of the inboard side (i.e. (2n+1)pi)?

  42. David Dickinson

    I meant to add:

    We do enforce continuity in the trapped particles already in the collisionless parallel solve (i.e gnew(bounce,1,iglo) = gnew(bounce,2,iglo)) which is a difference from the wfb (at least the traditional treatment).

    Personally I like to consider the wfb as a trapped particle and would like to treat it as a particle which bounces at bmag=bmax.

  43. David Dickinson

    Also we could check that g(wfb, 1) = g(wfb, 2) on entry to collisions. If it doesn’t then we have two possible solutions (as we could pick either of these to use in the collisional solve) which we don’t seem to have a sensible way to pick between currently (i.e. both appear equally valid). If it does then the current collisional code should probably work fine whichever choice we make.

  44. David Dickinson

    I added a few print statements to look at the gnew(-ntgrid,:,iglo_wfbs) and gnew(ntgrid,:,iglo_wfbs) values before and after collisions (for istep = 1).

    I use code like

           print*,"AA",iglo,abs(gnew(-ntgrid,1,iglo)),abs(gnew(-ntgrid,2,iglo))
           print*,"BB",iglo,abs(gnew(ntgrid,1,iglo)),abs(gnew(ntgrid,2,iglo))
    

    before collisions and the same with A->C and B->D after collisions.

    With wfbbc_option = ‘passing’ I get the following

    AA 631 4.0535249035213036E-063 1.0110789079091423E-064
    BB 631 4.0535249035213036E-063 1.0110789079091423E-064

    CC 631 1.9760352895927395E-063 1.9760352895927395E-063
    DD 631 1.9762081094233478E-063 1.9762081094233478E-063

    With wfbbc_option = ‘default’ I get

    AA 631 4.0535249035213036E-063 1.0110789079091423E-064
    BB 631 4.0535249035213036E-063 1.0110789079091423E-064

    CC 631 1.9760352895927395E-063 1.9760352895927395E-063
    DD 631 1.9762081094233478E-063 1.9762081094233478E-063

    With wfbbc_option = ‘trapped’ I get

    AA 631 1.5772046198087082E-066 1.5772046198087082E-066
    BB 631 3.9538474699267821E-063 3.9538474699267821E-063

    CC 631 1.5808356974614790E-066 1.5808356974614790E-066
    DD 631 3.9536767289548769E-063 3.9536767289548769E-063

    We can see that in all cases the collision operator is ensuring that the distribution is single valued at the ends of the parallel domain between the two signs of v||. This is only also true pre-collisions in the trapped case. As noted, the trapped case also blows up in the same way so this difference may not be related to the blow up (although I do think highlights that there is certainly an issue to address with the wfb treatment). I find it slightly surprising that the trapped case shows four orders of magnitude difference in the distribution function at the two ends of the domain, unlike the other cases.

  45. David Dickinson

    Skipping collisions just at the ends of the theta domain doesn’t seem to prevent things from blowing up so perhaps there’s something not quite right away from the wfb “bounce point”.

    If we believe g(wfb) having different values for the two sigma when bmag=bmax then I suppose another option is to apply the collisions twice – once for each of the duplicate wfb v||=0 values. If this doesn’t lead to the same values for g at the other pitch angles (and I can’t see why it would give the same result) then this appears like it must be inconsistent. I suppose what this could be saying is that we consider our wfb points with the two different sigma to have v|| approaching 0 at bmag=bmax from opposite sides and that we are content for there to be a discontinuity across this (which we might hope collisions will smooth out) . In this picture I guess we would then include both v||->0 points for the wfb. This is clearly different from the treatment of trapped particles, but may be acceptable if we view the wfb as passing such that the two different sigmas are only coupled through collisions and potentials. I’m not sure there’s a nice way to implement this if we wanted to.

    I think we probably would benefit from settling down on a single approach to treating the wfb. As I noted earlier my preference is for this to just be a regular trapped particle treatment (we’d probably want to introduce a bounce_point array to work alongside forbid in order to do this nicely). This could raise the issue that the wfb would like to have different values for g from the separate isolated trapped regions. I’m not sure if this can be addressed through use of the homogeneous solution.

  46. Michael Hardman reporter

    Thanks for this helpful exposition! I think we are getting somewhere.

    To answer this point:

    I’m not too familiar with your different wfb boundaries but I seem to recall the trapped treatment treats the wfb as bouncing at the ends of the parallel grid rather than at every instance of the inboard side (i.e. (2n+1)pi)?

    When I last touched this code, wfbbc_option could take three values "passing", "trapped", "mixed", and "default". The "passing" option corresponds to the wfb being treated as a normal passing particle, with zero incoming boundary condition at the ends of the extended data domain. The "trapped" option corresponds to the wfb being treated as an ordinary trapped particle bouncing at B = Bmax (theta = +-pi) points on the grid. The "mixed option" corresponds to the wfb being treated unphysically -- the wfb is treated as a passing particle for interior +-pi points, but bounces like a trapped particle at the +-(2n-1)pi points where n is nperiod. The "default" option pointed to "mixed". In most examples these options gave no difference, although as you have seen in your previous post these boundary conditions indeed enforce different values for the wfb distribution.

    Because the trapped wfb option is already implemented, I think that the wfb can already be treated as a trapped particle consistent with the rest of the trapped particles simply by setting wfbbc_option = "trapped" and using the default collision options (special_wfb_lorentz = .false.). The fact that this still gives numerical instability is deeply troubling to me -- what have I missed?

    I would prefer to view the wfb as a passing particle (with vpar -> 0 at +-pi). This is inconsistent with the current treatment in the collision operator, and I wonder if it would be possible to remove the wfb completely from the code. I note that stella does not include the vpar = 0 point on its grids because it causes numerical issues.

  47. Michael Hardman reporter

    Because the trapped wfb option is already implemented, I think that the wfb can already be treated as a trapped particle consistent with the rest of the trapped particles simply by setting wfbbc_option = "trapped" and using the default collision options (special_wfb_lorentz = .false.)

    Taking this statement at face value, and setting vpar_zero_mean = .false. in the input file, I have tried to test exactly which points in theta have magnetic drifts that cause the wfb to carry a numerical instability.

    To test this, I have made the changes in commits 3a5c2a9, 602adc2, and ad4da15 in branch

    https://bitbucket.org/gyrokinetics/gs2/branch/wfbdriftknobtest

    The addition to the collision operator is to allow the tri-diagonal routines to be used even if vnewk = 0 for both species. The change to the magnetic routine is below. Instead of including the wfb with trapped particles for the purposes of tpdriftknob, we had a separate knob wfbdriftknob, and an integer iwfbfudge which controls (via forbid) how many points have wdrift modified by wdriftknob.

    if ( jend(ig) > 0 .and. jend(ig) <= nlambda .and. il > ng2+1 .and. il <= jend(ig)) then
                 wdrift(ig,1,iglo) = wdrift_func(ig, il, ie, it, ik)*tpdriftknob
    #ifdef LOWFLOW
                 wcurv(ig,iglo) = wcurv_func(ig, it, ik)*tpdriftknob
    #endif
              else if ( jend(ig) > 0 .and. jend(ig) <= nlambda .and. il == ng2+1 .and. forbid(ig,il+iwfbfudge)) then
              !else if ( jend(ig) > 0 .and. jend(ig) <= nlambda .and. il == ng2+1 .and. il <= jend(ig)) then
                 wdrift(ig,1,iglo) = wdrift_func(ig, il, ie, it, ik)*wfbdriftknob
    #ifdef LOWFLOW
                 wcurv(ig,iglo) = wcurv_func(ig, it, ik)*wfbdriftknob
    #endif
    

    Quite spectacularly, if I set wfbdriftknob = 0. and iwfbfudge = 2, I am able to remove the numerical instability for my original test case, for all input values of ntheta. This is true for nu =0 and nu = 2e-07 for ions and 1e-05 for electrons (default case above).

    In conclusion, it looks like the parallel solve has a problem for the first two points on the grid away from Bmin.

  48. Michael Hardman reporter

    Input files showing how to treat the wfb as a trapped particle, and how to remove the numerical instability with the hack in https://bitbucket.org/gyrokinetics/gs2/branch/wfbdriftknobtest . This strongly suggests that the parallel solve is broken for the wfb at the points adjacent to the Bmax point. To compile this branch use Makefiles on branch ms_makefiles (5883dc3) and utils on branch ms_utils (f599234)

  49. Michael Hardman reporter

    Inspection of the eigenmodes reveals that there are unphysical structures at large kperp, if ntheta is too small, even with the hack described in the last two posts. The problem appears to go away for large enough ntheta (> 16 or so). It is not clear if this is because the problem with the wfb propagates beyond points near to Bmax, or if the hack itself is responsible.

  50. David Dickinson

    Just catching up on this. To me, treating the wfb as passing with vpar->0 suggests that the two sigma values correspond to different populations at the wfb bounce point so both should appear in the Lorentz operation.

    I note that else if ( jend(ig) > 0 .and. jend(ig) <= nlambda .and. il == ng2+1 .and. forbid(ig,il+iwfbfudge)) then will potentially result in an out-of-bounds access (there’s no logical short-circuiting in Fortran so we will evaluate every condition in this statement, e.g. if il=nlambda we’ll still evaluate forbid(ig,il+2)). Doesn’t this condition evaluate to true for all theta beyond the bounce point of the wfb+2 pitch – i.e. the region between ntgrid_bounce(ng2+1 + iwfbfudge) and ntgrid, so the points around bmax rather than the points around bmin? I’m probably just misreading this though.

    On the wfbbc_option='trapped' code I’ve just had a look and it appears that we do not set g2 for the wfb in this case, but do for all the other trapped particles. It may turn out that this is not required for the wfb as there are no gaps where g1==0. We also ensure a single valued distribution function at the upper ntgrid bounce point, but we don’t seem to enforce this at any internal bounce point. (We also ensure single valued wfb at theta = -ntgrid through construction). Again, this may work itself out if I think about things further but I just thought I’d check if my initial reading of the code is consistent with your understanding.

  51. Michael Hardman reporter

    Doesn’t this condition evaluate to true for all theta beyond the bounce point of the wfb+2 pitch – i.e. the region between ntgrid_bounce(ng2+1 + iwfbfudge) and ntgrid, so the points around bmax rather than the points around bmin?

    Sorry, I should have written Bmax, where I wrote Bmin above. I am trying to prevent the wfb drifting near Bmax. I have corrected my statements above

  52. Michael Hardman reporter

    I note that else if ( jend(ig) > 0 .and. jend(ig) <= nlambda .and. il == ng2+1 .and. forbid(ig,il+iwfbfudge)) then will potentially result in an out-of-bounds access (there’s no logical short-circuiting in Fortran so we will evaluate every condition in this statement, e.g. if il=nlambda we’ll still evaluate forbid(ig,il+2))

    I am not suggesting that this becomes a permenant fix! I can change il+iwfbfudge to ng2 + 1 + iwfbfudge.

  53. Michael Hardman reporter

    We also ensure a single valued distribution function at the upper ntgrid bounce point, but we don’t seem to enforce this at any internal bounce point.

    Good spot. I would have written this with “box” mode with nperiod = 1 in mind. This should be fixed by solving for wfb between every N pi, N+2 pi

  54. David Dickinson

    Thanks for the responses. The only reason I point out the out-of-bounds access is that this can, in some instances, lead to all sorts of strange behaviour in unrelated parts of the code, it’s only a read access here so probably fine.

  55. Michael Hardman reporter

    It looks like using “box” mode with nperiod = 1 and wfbbc_option = “trapped” removes the instability. Testing further.

  56. Michael Hardman reporter

    Very reduced input files that run quickly on a single core, showing that the numerical instability is removed by using "box" mode with wfb treated as a trapped particle. If one switches from the wfbbc_option = 'trapped' to the wfbbc_option = 'passing' option then the numerical instability returns.

    This supports the need to treat wfb as a trapped particle. However this is problematic outside of "box" mode with nperiod = 1. The reason for this is that theta = N pi points are duplicated in "box" mode, but they are not duplicated in "range" mode. Presumably in range mode we should treat wfb as a passing particle, and modify the collision operator appropriately?

    David, what are your thoughts on this?

  57. David Dickinson

    I agree that if we treat the wfb as passing then the collision operator should be modified.

    Yes in ballooning space we have some extra complication in being able to handle fact that at the wfb bounce point we might expect to find values for the wfb distribution function coming from either side of this bounce point. It would be nice to think of an elegant way to handle this. One (inelegant) approach may be to drop the duplicate sigma point at bounce points as we require the distribution to be identical for both sigma we shouldn’t need to store both. We could then interpret the two sigma as referring to the two sides of the bounce point here. This is hacky, likely to cause significant confusion and would require very different handling to the current passing approach so I’m not particularly in favour of this approach, but it might be one way to quickly hack in an implementation of this for us to test.

    If one could argue that g_wfb(bounce) had to be single valued then our life would be simplified significantly I think, but I’ve yet to think of a good reason this would have to be the case other than arguing that collisions will tend to smooth the distribution function across the trapped-passing boundary so if the barely passing passing distribution is single valued and continuous here then the wfb at either side of bmag=bmax at least can’t be very different from one another.

  58. David Dickinson

    For the trapped wfb treatment I think we would always want to use radau_gauss_grid = F – is this right? Presumably we would expect the potentials to be single valued at the duplicate theta point -- for this to be the case we would expect the velocity integral of the distribution function to have the same value at both of the duplicate points. With Radau-Gauss grids we have non-zero weight for the wfb at its bounce point so we could find different potentials there if the wfb has different distribution function values at the duplicate bounce points.

  59. Michael Hardman reporter

    That might be correct, that argument sounds reasonable. One of the reasons that I introduced the radau_gauss_grid = T option was because the collision operator coefficients are calculated using the particle weights, and if we have zero weight at one point then this prescription will run into trouble. Perhaps this point is fixable.

    What I do not see how to fix is how to resolve the fact that we would need to perform pitch angle scattering at two duplicate theta points (when we only have one on grid). Since we only have the rest of the v|| points once, how do we decide which pitch angle scattering calculation was correct? The one from theta=pi- or theta =pi+?

  60. Michael Hardman reporter

    I was able to find enough time to experiment with my proposed solution of removing the wfb grid point entirely. According to @William Dorland , gs2 was originally written without a wfb grid point. A working version of my branch without the wfb is here https://bitbucket.org/gyrokinetics/gs2/branch/remove-wfb-from-grid

    This branch does not have the numerical instability featured in this thread. The collision operator causes no numerical instability in the limit nu->0, even with bakdiff = 0. I have done some initial sanity checks with some linear simulations close to my original input file here, but no nonlinear runs

    I chose to implement these changes in my branch because 1) I want this feature, along with other features in my branch, and 2) it was relatively quick in a version of the code I was familiar with. I'm sure it will be possible to put this to the more recent version of gs2 in master or next -- however, there may be places in the code that I did not make fully consistent for lack of time, and so I thought this initial branch could act as a talking point for users to play with.

    I would be keen to have feedback from the other developers to see if there is a downside to removing the wfb, and what code I have missed, if any. @Michael Barnes @Joseph Parker

  61. David Dickinson

    I think there isn’t a correct point to choose here but we would have to do and store both. This is certainly not something which has an obvious simple elegant solution (for me at least). I’m warming a little to my earlier suggested horrible hacky approach which doesn’t store the duplicate v||=0 for both signs, freeing up space to store both values of g at the duplicate theta point. This might not be as terrible as I first worried if applied consistently, but I may also be warming to it just because I’ve not been able to think of an alternative.

    I see you’ve added a commit which demonstrates how to remove the wfb entirely. This is a useful option to have and I’m interested to see if this reproduces existing results well whilst avoiding the observed numerical instabilities (both the original collisional issue and also similar in appearance issues such as issue #108).

  62. Michael Hardman reporter

    I am up for a discussion about your approach if that would be useful.

    I should warn that in the commit that I just made the wfb is removed by force, with no option to add it back in with a change of a logical flag. I found it too tiring to accommodate a new option alongside the various wfb special exceptions that already existed -- if need be I can think more carefully about how to get all of these options into one code.

    I would be very enthusiastic about testing the wfb-free grid and code in a variety of circumstances, if you have a list of issues to test. (Hopefully the issues exist also in my old branch!)

  63. David Dickinson

    I strongly agree with your approach of removing the wfb in that branch rather than trying to accommodate every possible option! At this point it’s better to get clearer but less flexible code as it allows us to test the consequences of the decision. I’m happy to look at porting this to work with the current release (as I’d like to use this).

    In terms of issues to test against I think issue #108 is the main one recorded on the Bitbucket tracker which also provides an input file to test against. I’ve seen a similar instability in a number of different cases and I suspect they are all somewhat related. I do wonder if it could possibly help with the issue reported in issue #95.

    I’m currently trying to work on wrapping up our logic about how we decide how to treat various pitch angles into logical functions (see https://bitbucket.org/gyrokinetics/gs2/commits/fcc5ce05261ea9b4ef32841fb1c7be0e6252b3b4?at=feature/add_functions_wrapping_up_logic_about_pitch_angles) for an example – this might make it easier to support a number of different choices as it limits the appearance of such choices to these small logical functions. If nothing else, I think this leads to more readable code.

  64. David Dickinson

    I tested the input file from issue #108 with your branch and unfortunately it seems it still does blow up. Are there any changes to inputs that are needed to make sure this is consistent?

  65. David Dickinson

    I think I can see potential for a more general issue which eliminating the wfb won’t necessarily fix. If we have a magnetic field with internal local maxima (i.e. maxima with bmag < bmax) then we will have a pitch angle which bounces on both sides of this point, as with the wfb. This is likely to be subject to the same instability as is facing the wfb.

  66. Michael Hardman reporter

    I would be wary of new_trap_int and opt_source, so I have just tested the minimal.in of #108 with new_trap_int = .false. and opt_source=.false.. I also find numerical instability. On reflection, I am not surprised. The bug reported in this thread appears to be related to the inconsistency between the treatment of collisions and the collisionless parallel solve, whereas #108 appears to be about collisionless physics only. Can you find the bug in #108 in toroidal geometry? It looks very similar to an instability that I have previously come across in cyclone base case.

  67. David Dickinson

    Yes it was a bit of a long shot, thanks for confirming.

    I believe this does exist in toroidal geometry as well.

  68. David Dickinson

    The only other numerical instability associated with collisions that I’m aware of is one that @Joseph Parker saw with his split collisions code but I think that’s also probably not related to the wfb but rather the change in treatment of collisions generally.

  69. Michael Hardman reporter

    I had not considered internal bounce points. Do you have a recipe for producing such a magnetic field? I have never considered how the parallel solve deals with internally bouncing particles. My current understanding is the trapped particles bounce at ig when forbid(ig+1,il) = T (if v|| > 0) or forbid(ig-1,il) = T (if v|| < 0). For a magnetic field with a local maximum B_loc, how does gs2 treat lambda = 1/B_loc? For this lambda, I would expect that forbid = F at B=B_loc, and so I would expect that the particle passes over B=B_loc. This would mean (as I think you imply) that the particle would have v|| = 0 at B_loc but v|| =0+ and v||= 0- could have different amplitudes.

  70. David Dickinson

    A recipe is as follows

    &theta_grid_knobs
                            equilibrium_option =  "eik"
    /
    
    &theta_grid_eik_knobs
                              beta_prime_input =  -1.00000000000E+00
                                        bishop =  4
                                     equal_arc =  F
                                         iflux =  0
                                          irho =  2
                                      local_eq =  T
                                   s_hat_input =  4.00000000000E+00
                                     writelots =  T
    /
    
    &theta_grid_parameters
                                        akappa =  2.0
                                       akappri =  -0.3
                                       nperiod =  1
                                        ntheta =  128
                                          qinp =  0.5
                                         r_geo =  0.30000000000E+01
                                          rhoc =  0.50000000000E+00
                                          rmaj =  0.30000000000E+01
                                         shift =  0.00000000000E+00
                                           tri =  0.50000000000E+00
                                        tripri =  1.5
    /
    

    This may not be particularly representative of a real tokamak, but I’ve found that these local maxima do appear when heading towards the pedestal of STs at least.

    Yes GS2 uses forbid(ig +/- 1, il) to determine if we’re at an upper or lower bounce point and to then act as required. For a pitch angle bouncing at an internal maximum we would therefore not identify the bounce point and the pitch would be treated as passing at these internal bounce points, only bouncing at the outer bounce points so could have different values based on the sign of velocity at the internal bounce point.

  71. Michael Hardman reporter

    Thanks for the code! I have tried inserting this in my input files, and I see numerical instability regardless of which branch I use. Could I trouble you to supply a whole input file to make sure that I'm not making a mistake?

  72. David Dickinson

    This was not taken from a real case, but just constructed to demonstrate a possible bmag structure.

    Adding the following gives me a vaguely complete input file which doesn’t seem to blow up with current next

    &dist_fn_species_knobs_1
                                        bakdif =  0.00000000000E+00
                                         fexpr =  0.50000000000E+00
    /
    
    &kt_grids_knobs
                                   grid_option =  "single"
    /
    
    &kt_grids_single_parameters
                                           aky =  0.40000000000E+00
    /
    
    &species_knobs
                                         nspec =  1
    /
    
    &species_parameters_1
                                          dens =  0.10000000000E+01
                                         fprim =  0.22000000000E+01
                                          mass =  0.10000000000E+01
                                          temp =  0.10000000000E+01
                                         tprim =  0.69000000000E+01
                                          type =  "default"
                                         vnewk =  0.00000000000E+00
                                             z =  0.10000000000E+01
    /
    
    &knobs
                                        nstep = 1000
    /
    

    Turning on collisions (vnewk =0.1) doesn’t seem to lead to instability either – so it’s not immediately obvious that this demonstrates a problem. Increasing nperiod from 1 to 2 also doesn’t show an issue.

  73. David Dickinson

    Ah I spoke too soon. Setting nperiod=2 and vnewk = 1.0e-7 does eventually go numerically unstable by around t=90

  74. Michael Hardman reporter

    My gs2 output looks like the following…

     r_bar =   0.500000000000000
    
     d beta/d rho =   0.000000000000000E+000 ,   -1.00000000000000
     s_hat    4.77550232732293      ,    4.00000000000000
    sum phinew**2= 0.67E+09 sum aparnew**2= 0.00E+00 sum bparnew**2= 0.00E+00
    sum phinew**2= 0.67E+09 sum aparnew**2= 0.00E+00 sum bparnew**2= 0.00E+00
    ky= 5.00E-01 kx= 0.00E+00 om= 0.00E+00  0.00E+00 omav= 0.00E+00  0.00E+00 phtot= 1.07E+08 theta0= 1.00E-01
    
    sum phinew**2=      NaN sum aparnew**2= 0.00E+00 sum bparnew**2= 0.00E+00
    ky= 5.00E-01 kx= 0.00E+00 om=      NaN       NaN omav=      NaN       NaN phtot=      NaN theta0= 1.00E-01
    

    Trying this with current trunk works if I change ntheta = 130. I suspect a bug in the geometry module in my branches, but not in next.

  75. David Dickinson

    I think there must be some difference in the geometry calculations as I find the “expected” shear is reported as 4.7816880047933630 whilst you find 4.77550232732293 whilst these numbers aren’t very different (and not useful), they’re different enough to demonstrate some variation in the code.

    It turns out I was testing with the wrong setup – I had qinp = 3.5 rather that 0.5. When I use the correct value I also see the fields becoming Nan after the first output. Dropping tripri to 0.5 sees this go away (the mode grows but at a reasonable rate). Slowly increasing tripri I find the appearance of Nan seems to coincide with the appearance of the local maxima. Using tripri=0.777 the simulations behaves reasonably, using tripri=0.778 leads to Nans appearing immediately. Checking diff(bmag[theta>=0]) I can see that this seems to be associated with the appearance of a negative diff value, giving evidence that the issue is indeed to do with local maxima appearing.

    I should also note that for both tripri values I find diff values which are exactly 0 suggesting that we have a flat bmag structure in places.

  76. David Dickinson

    I should note, the Nans appear in the collisionless case with nperiod=1 so the issue here cannot be solely with the trapped treatment interacting with the Lorentz collision operator.

  77. David Dickinson

    Furthermore, it seems that we end up with different nlambda in these two cases – 72 with tripri=0.778 and 74 with tripri=0.777 (with ntheta=130, ngauss=5 I would expect nlambda = 10 + 66 =76 I think). I wonder if gridgen is perhaps struggling with this case but we’ve not triggered one of the test cases which emit a warning.

  78. David Dickinson

    It seems that the nan is appearing because we find that g1(ig, 1)==1.0 for a particular lambda and theta point. When trying to ensure consistency at trapped bounce points we add in a piece of the homogeneous solution proportional to 1/(1-g1(bounce,1)) so here we end up dividing by zero.

  79. David Dickinson

    For this particular point I find forbid(ig+1, il) = T, forbid(ig, il) = F, forbid(ig-1, il) = F, forbid(ig-2, il) = T where ig is the upper bounce point at which we calculate 1/(1-g1(ig,1)). This looks like a stretched out totally trapped particle – there are no points at which v||/=0 in this allowed region.

  80. David Dickinson

    In this case it can be avoided by simply guarding the calculation of beta1 on if (gnew(ig, 1, iglo) /= gnew(ig, 2, iglo)) - in other words only set beta1 /= 0 if we don’t already satisfy the bounce condition.

  81. David Dickinson

    With this small change to the code I’ve been able to reproduce the wfb-like collisional error.

    For reference my input file is

    &dist_fn_species_knobs_1
                                        bakdif =  0.00000000000E+00
                                         fexpr =  0.50000000000E+00
    /
    
    &kt_grids_knobs
                                   grid_option =  "single"
    /
    
    &kt_grids_single_parameters
    aky =  0.40000000000E+00
    theta0 = 40.0
    /
    
    &species_knobs
                                         nspec =  1
    /
    
    &species_parameters_1
                                          dens =  0.10000000000E+01
                                         fprim =  0.22000000000E+01
                                          mass =  0.10000000000E+01
                                          temp =  0.10000000000E+01
                                         tprim =  0.69000000000E+01
                                          type =  "default"
                                         vnewk =  0.0 !10000000000E-06                                                                                                                            
                                             z =  0.10000000000E+01
    /
    
    &knobs
    nstep =1000
    /
    &theta_grid_knobs
                            equilibrium_option =  "eik"
    /
    
    &theta_grid_eik_knobs
                              beta_prime_input =  -1.00000000000E+00
                                        bishop =  4
                                     equal_arc =  F
                                         iflux =  0
                                          irho =  2
                                      local_eq =  T
                                   s_hat_input =  4.00000000000E+00
                                     writelots =  T
    /
    
    &theta_grid_parameters
                                        akappa =  2.0
                                       akappri =  -0.3
                                       nperiod =  1                                                                                                                                            
                                        ntheta =  128              
                                          qinp =  0.5
                                         r_geo =  0.30000000000E+01
                                          rhoc =  0.50000000000E+00
                                          rmaj =  0.30000000000E+01
                                         shift =  0.00000000000E+00
                                           tri =  0.50000000000E+00
                                        tripri = 1.5
    /
    

    Simply switching between vnewk = 0 and vnewk = 1.0e-7 is enough to see the instability. Keeping vnewk = 1.0e-7 but dropping tripri to 0.7 removes the internal local maxima and appears to avoid the instability as shown below. It would probably be useful to identify a narrower range of tripri as in the earlier cases to make the argument that the issue is the introduction of local maxima a bit more convincing.

  82. Michael Hardman reporter

    This is very convincing, but unfortunately I'm not able to get past the original NaNs mentioned above. In dist_fn.fpp, I have tried to take into account that g1 may be 1.0 in the following way (changes on line 8 & 9). Have I misunderstood? Have you tested the above with next, or with my remove-wfb-from-grid branch? What I want to confirm is that the code with the wfb removed definitely has the problem that you feature above. I expect it will!

    if (il >= ng2+1 .and. il <= lmax) then
           beta1 = 0.0
           do ig = ntgr-1, ntgl, -1
              if (ittp(ig) <= il) cycle
              if (forbid(ig,il)) then
                 beta1 = 0.0
                 cycle !CMR: to avoid pointless arithmetic later in loop
              else if (forbid(ig+1,il) .and. (gnew(ig, 1, iglo) /= gnew(ig, 2, iglo))) then
              !else if (forbid(ig+1,il)) then
                 beta1 = (gnew(ig,1,iglo) - gnew(ig,2,iglo))/(1.0 - g1(ig,1))
              end if
              gnew(ig,1,iglo) = gnew(ig,1,iglo) + beta1*g1(ig,1)
              gnew(ig,2,iglo) = gnew(ig,2,iglo) + beta1*g1(ig,2)
           end do
        end if
    

  83. David Dickinson

    This was a test on next (or something branched off next).

    That fix looks appropriate. I wrote it slightly differently, but it should be equivalent I think

              else if (forbid(ig+1,il)) then
                 if (gnew(ig,1,iglo) /= gnew(ig,2,iglo)) then
                    beta1 = (gnew(ig,1,iglo) - gnew(ig,2,iglo))/(1.0 - g1(ig,1))
                 else
                    beta1 = 0.0
                 end if
    
              end if
    

  84. David Dickinson

    Just to note, I also find the NaNs remain in your branch with my version of the “fix” as well, also even for tripri=0.7.

    I've identified that this is still due to the same calculation of beta1 . The relevant il index has a forbidden region just near the in-board side unlike the il which was causing issues for me.

    This remains true if I revert 32b5a708 so is not due to the removal of the wfb (and even if I checkout 6d9a844 and then apply the beta1 fix).

    This also seems to remain true if I relax the equilibrium to a circle so I worry that I must be doing something wrong :

    &theta_grid_knobs
                            equilibrium_option =  "eik"
    /
    
    &theta_grid_eik_knobs
                              beta_prime_input =  0.00000000000E+00
                                        bishop =  4
                                     equal_arc =  F
                                         iflux =  0
                                          irho =  2
                                      local_eq =  T
                                   s_hat_input =  4.00000000000E+00
                                     writelots =  T
    /
    
    &theta_grid_parameters
                                        akappa =  1.0
                                       akappri =  0.0                                                                                                                                     
                                       nperiod =  1                                                                                                                                             
                                        ntheta =  128                                                                                                                                      
                                          qinp =  0.5
                                         r_geo =  0.30000000000E+01
                                          rhoc =  0.50000000000E+00
                                          rmaj =  0.30000000000E+01
                                         shift =  0.00000000000E+00
                                           tri =  0.00000000000E+00
                                        tripri = 0.0                                                                                                                                     
    /
    

  85. Michael Hardman reporter

    This input file works in my branch with the change to limit beta1 being set, above. Perhaps you can use to to work out what’s wrong with your input file?

  86. David Dickinson

    Thanks, it turns out delt is not given a default value in your branch (I think this is something we fixed in the main release a while ago). Setting this to 0.1 prevents the NaNs.

    I’ve reproduced the previous results from the earlier version of your branch – just vnewk>0 tripri=1.5 blows up here

    The problem also persists with your latest version of the branch (i.e. with the wfb removed)

  87. Michael Hardman reporter

    Ok - I think this means we have converged on the need to have particles bouncing from left and right at a single point?

    The other solution would be to make gridgen remove every WFB-like lambda.

  88. David Dickinson

    Yes I think that’s right. It’s possible that there could be similar issues with trapped particles where bounce regions are separated by a single point as well (less clear of an issue but we sometimes set coefficients in the forbidden region as a part of the algorithm and I worry there could be chance for there to be some unintended interaction here).

  89. Log in to comment