Numerical instability for small ion collision frequencies
In the course of attempting a scan in collisionality, I have noticed a very perplexing bug.
I have been performing my simulations of extended modes (nperiod > 20) at zero ion collisionality (vnewk =0 for ions), yet with collisions for electrons ranging from 0 to 100 (going on a log scale). I now wish to reproduce these results including the ion collision frequency with its physical value.
For small collision frequencies of ions (here vnewk = 10^-7), I see a numerical instability develop at large theta. This instability results from the Lorentz part of the collision operator, as can be seen from my tests. The attached files are as follows:
itge.vnewk.1e-05.default.in – The standard problem where I noticed the instability, full collision operator for both electrons and ions
itge.vnewk.1e-05.nocols.in – The ion collision frequency is set to 0, but the electrons retained their collision frequency (Numerical instability disappears)
itge.vnewk.1e-05.test.in – The ion collision frequency is nonzero, and the Lorentz collision operator model is used. Numerical instability appears.
itge.vnewk.1e-05.ediff.in – The ion collision frequency is nonzero, and only the energy diffusion part of the collision operator is used. numerical instability disappears.
The simulations were performed using the branch "ms_pgelres" with commit number
ade578037218496bd9d328a55212e4f0add7529f
Any help appreciated. David listed as he has been looking at the collision operator recently.
Comments (94)
-
-
The problem also seems to persist with adiabatic electrons and (using PR #431) the instability also arises in the homogeneous problem (i.e. in the absence of sources).
-
Increasing ntheta to 128 seems to avoid the instability in the reduced case. Clearly this is rather expensive to test in the original case but may be worth a go.
-
For ntheta = 46 the instability is there but at ~half the growth rate. It is still there at ntheta=56 but appears to be stable by ntheta=60. It may be interesting to explore the lambda grid generated for these different theta resolutions.
-
reporter In the meantime I have been looking at bakdiff. The instability goes away for bakdiff = 0.05 – now trying to go lower. Unfortunately I need to have bakdiff = 0 exactly for the electron species in order to do the mass ratio scan.
-
It’s also possible to remove the instability by increasing delt or making fexpr sufficiently negative, although I would anticipate this also changing the ability to capture the interesting physics.
-
reporter There appears to be an option to set bakdiff separately for each species. This can be achieved when mult_imp = .true. in the dist_fn_knobs namelist. I find the instability persists when bakdiff = 0.025 (for ions), so it seems that one cannot go far below bakdiff = 0.05 for this case. I will try to redo my mass ratio study with these parameters, but is quite unsatisfying to have to hack the ion species this way. I am not sure what consequences that might be of using different backwards differencing for the different species.
I agree that it would be interesting to look into the different lambda grids for this case. I will report back on what happens to the ntgrid = 64 and ntgrid = 128 examples. Needless to say, these parameters would make production runs very expensive (especially given that these simulations here use the hydrogen mass, and I typically have been using masses larger than the deuterium mass to test the mass ratio expansions).
-
The handling of different bakdiff for different species is not very consistent in the code. For example, in the source term we always use the bakdif of the first species.
The instability also goes away if the drifts are zeroed out (presumably where the
kperp2
dependence comes in). Givenbakdif
is also a way to remove this I think it is probably in the coefficients calculated ininit_invert_rhs
where the issue could perhaps arise. These are the main terms which enter the homogeneous problem. Artificially removing the v|| terms from these coefficients has no effect, whilst removing the drift terms from here only does suppress the instability.
The instability growth rate is also sensitive to the energy grid (increasing negrid increases the growth rate, increasing nesuper at fixed negrid also seems to increase the growth rate).
-
reporter How do we explain the fact that the numerical instability seems absent for nu_ii (vnewk) = 0? I could believe there is a separate problem with init_invert_rhs and the drifts. For interest, I tried setting cfac = 0 in the collision operator and the instability remained.
In the orginal “default” case, the instability persists for ntgrid = 64, but has disappeared for ntgrid =128 (55mins on 48 cores!) With the importance of the Lorentz collision operator for the instability, right now I think this pinpoints the lambda grid as the issue.
Files attached
-
reporter - attached itge.vnewk.1e-05.default.large.scrn.out
- attached itge.vnewk.1e-05.default.large.out.nc
- attached itge.vnewk.1e-05.default.large.in
- attached itge.vnewk.1e-05.default.large.ikx.0.iky.0.sources.pdf
- attached itge.vnewk.1e-05.default.large.ikx.0.iky.0.fields.pdf
itge.vnewk.1e-05.default.large – The original default case with ntgrid increased to 128, bakdiff =0 for both species and nonzero collision frequencies for both species. No numerical instability is observed
-
reporter David, do you still find a numerical instability when nu_ii = 0 in the simplified homogeneous case that you are examining? Is your opinion that the instability arises independently of the collision operator, and that I am just unlucky to see it here? Do I need to test the "default.large" case with increased negrid points?
-
Removing any kperp2 dependence in the collisions module does not remove the instability, yet we know the instability only arises when kx/kperp is large (or at least the growth rate is dependent on this).
Setting cfac=0 and turning off conservation removes any direct dependence on any of the collisions code on any property of the species other than vnewk and type. In other words no mass dependence enters directly.
I do not see the instability with zero ion collisions in the simplified system.
I don’t observe the instability in the simplified case without collisions, but equally I don’t observe the instability with collisions but zero drifts. It’s not uncommon for there to be a numerical instability that arises through the interaction of different terms. In the collisionless homogeneous problem with zero parallel velocity we’re essentially describing something like
g.(omega-omega_D) = 0
whereomega_D ~ kperp V_D
is velocity dependent. I’m more used to thinking about the slab problem where we drop omega_D and keep v||, the equation then describes parallel advection. As the vperp/vpar values on our v-space grid change with theta structures* on our velocity grid become increasingly sheared as the structure advects along theta. This results in the development of small scale structure on the velocity grid. One possibility is that a similar thing is happening here such that we’re developing small scale structure, such that terms likenu * d g/ dlambda
can become large even whennu
is small. Of course this can’t really explain directly the instability as the description may similarly hold for electrons which have a largernu
and hence such terms may be expected to behave in the same way. One possibility is that usingdelt=0.1
is damping out any equivalent electron instability due to stepping over the relevant time scale (can note that setting delt=10 allows us to avoid the ion instability, for example). Another possibility is that at highernu
the collisions are sufficient to prevent such small scale structure from developing to any significant extent. It may be interesting to see what the adapative collision frequency algorithm suggests for this problem.Setting the electron mass to 1, the electron collision frequency to 1e-7 and the ion collision frequency to zero we find the same instability. Dropping the mass to 0.1 the instability is there but at a reduced growth rate. Dropping the time step from 0.1 to 0.01 in this case increases the growth rate a bit further (indicating we are getting some stabilisation from the time step/implicit scheme). Dropping the time step further only marginally increases the growth rate. If we consider an electron mass of 0.001 then we may expect to see an instability provided the time step is around 0.001 (actually we probably don’t expect a linear dependence, but it gives a rough ballpark to aim for). Of course we would need rather large nstep as well I believe. So far I’ve not found an instability with the mass reduced to 0.01 and delt =0.0001, perhaps this also changes the required vnewk for instability – this seems to be the case as for mass = 0.01, delt = 0.01 and electron vnewk = 1e-9 I do again see instability.
We’ve also seen numerical instabilities in collisionless electromagnetic cases which only become unstable at sufficiently small time step (see issue #142). Whilst I don’t think these are likely to be directly related, it at least serves to highlight that there can be numerical instabilities lurking in the system which our implicit scheme can suppress in linear simulations but may then arise in NL runs when the time step drops low enough. I have some thoughts for how we may be able to avoid this in NL runs, but it would be nice if we could find a way to eliminate such instabilities from the system to start with.
*Here by structures I’m considering something like a gaussian blob in theta that is initialised to be constant in velocity-space. Issue #144 discusses a collisionless homogenous periodic slab test case which may help my explanation.
-
As vnewk drops the associated timescale increases. Calculating the relevant timescale and comparing to the time step may help highlight when we are able to resolve this collisional time scale. This time scale may depend on the scale of structures in velocity space as well as vnewk and the velocity space resolution, which could explain why the instability seems to be sensitive to the drifts (growth rate drops as drifts scaled down) as this could be viewed as setting some form of small scale structure formation timescale (crudely speaking).
The dependence on energy grid resolution may come from that fact that for increasing resolution our minimum and maximum energy values move towards more extreme values. One might imagine that the highest energies are where the velocity space structures become sheared most rapidly, but could be interesting to check this.
-
Zeroing the drifts for trapped particles only seems to be sufficient to remove the instability.
-
reporter Which knob did you use to remove the trapped particle drifts? This observation is very interesting, because changing the number of ngauss passing particle points did not remove the numerical instability, whereas you showed that increasing ntheta (and hence the number of trapped lambdas) did remove the instability.
-
Ah, if I make the wfb be treated as a passing particle when we scale the drifts then the instability no longer goes away when I zero the “trapped” particle drifts but now goes away when we zero the “passing” particle ones. This strongly suggests that the wfb is a key part of this issue I think.
-
The flag
tpdriftknob
scales the trapped particle drifts, whilstdriftknob
scales the passing drifts (and tpdriftknob defaults to driftknob)
-
reporter And you are using the wfbbc_option to switch between “trapped” and “passing” for wfb?
I will try this too. It’s usually the wfb..
-
I’ve not touched the
wfbbc_option
in this instance . -
reporter I see, you have only changed whether or not tpdriftknob affects the wfb or not? The wfbbc_option changes whether or not the wfb is treated as a passing or a trapped particle for the parallel solve. I have tried switching it between the various options and it didn't make a difference for me, hence my interest.
-
reporter I confirm that using tpdriftknob to zero out the magnetic drifts for the wfb (and only the wfb, i.e. il = ng2 +1) removes the numerical instability in my original case.
-
reporter Looking at init_wdrift and wdrift_func, why are there no forbid statements here for traps and passing particles, when there are forbid statements for totally trapped particles? Is it possible that some noise (due to 1- lambda B = 0 not being satisfied exactly) is getting into the drifts at the bounce points of the wfb?
-
That could be an issue I suppose. Our v||^2 calculation
energy(ie,is)*(1.0 - al(il)*bmag(ig))
in the drift frequency can clearly go negative when considering a trapped particle in the forbidden region. One may argue this is fine provided we don’t then go and use thewdrift
values in the forbidden region (which we shouldn’t do), however I can note that we actually put the drift on a mid-point grid (i.e.wdrift(ig) = (wdrift(ig)+wdrift(ig+1))/2
) which will end up using the drift in the forbidden region to calculate the drift in the allowed region. This may be intentional but has caused me concern in the past. I also note that this means the valuewdrift(ntgrid)
is not treated consistently as this is not the value at a grid mid-point (however again I think we shouldn’t generally use this). -
I should also note that the calculation of
wdrift
is identical for trapped and passing pitch angles, the only difference is which scaling flag is used to scale the drift. -
reporter I have just remembered that the wfb does not have any forbidden points by definition. Zeroing out the wdrift at ig = +-ntrgrid does not remove the instability. I shall have to look at the parallel solve -- maybe this business with the midpoint vs gridpoints holds the answer.
-
It may be as mentioned earlier that the drift is just a way to generate velocity space structure to allow the collisional issue to take off. Is it possible that the velocity volume (i.e. integration weight) associated with the wfb is becoming rather small and that this then leads to a poorly conditioned collisional operator? I don’t think this picture would get better with increasing theta (/trapped pitch) resolution.
-
reporter I will take a look at the integration weights. For the example that I'm looking at now (default and default.large above), increasing ntheta does remove the instability, but I have to go to extreme values.
-
If you use
next
there’s a small helperdump_grids
which will write out the grids and integration weights without running the full code. -
Taking
vn->0
I think that we should findaa=cc=0
andbb=1
in the Lorentz operator matrix elements. This corresponds tobetaale=1
,qle=0.0
andc1le=0.0
. Forcing these values for these arrays still leads to a blow up of the code. Removing the code which turns off/skips collisions when vnewk <=0 we can see that the numerical instability still arises whenvnewk=0
for ions. This seems to imply that the instability is due to the code contained insolfp_lorentz_le_layout
(and if we return from this routine immediately then we do not see instability). -
(the same holds for
use_le_layout=F
) -
reporter I will try to reproduce this, I think I see how to. Do you find numerical instability when you have turned off the drifts for the wfb?
I am wondering if this problem here is related to the numerical instability for small dt and small ky, although I am struggling to find that test case.
-
To enable the collision operator with zero collision frequency it is necessary to remove
if (all(abs(vnew(:,1,:)) <= 2.0*epsilon(0.0))) then collision_model_switch = collision_model_none colls = .false. return end if
and to comment out the
if (abs(vnew(ik,1,is)) < 2.0*epsilon(0.0)) cycle
line in thesolfp
routines.Yes removing the wfb drift seemed to help remove the instability. I also wonder if it’s related to the small dt and small ky issue, although the issue here arises at large kperp instead.
Commenting out most of the code so that it basically sets
delta=gle ; gle=delta
is sufficient for the instability to occur, despite if we simply return (equivalent togle=gle
) then there is no instability. This suggests there is likely some issue in how we are copying the data about (e.g. the logic in what we copy where). Of course just commenting out chunks of code isn’t necessarily a sensible thing to do. If I do this, however and removeif (jend(ig) /= 0) gle(2*jend(ig),ie,ile) = gle(jend(ig),ie,ile)
then the instability does go away.
-
reporter I have confirmed for myself that commenting out the tridiagonal solve (and associated lines of code that try to clean up the fact that there are two vpar =0 points) in solfp_lorentz_standard_layout removes the numerical instability.
I find this quite worrying, because I spent quite a long time in the subroutine during my DPhil, and I thought that I had fixed it. Obviously only adequately enough for my purposes at the time!
-
To be explicit the reduced code I am using is
do ile = le_lo%llim_proc, le_lo%ulim_proc it = it_idx(le_lo,ile) ik = ik_idx(le_lo,ile) if (kwork_filter(it,ik)) cycle if (ieqzip(it,ik)==0) cycle is = is_idx(le_lo,ile) ig = ig_idx(le_lo,ile) ! je = #physical xi values at location, includes duplicate point at vpar=0 ! je-1 = #physical xi values removing duplicate vpar=0 point je=max(2*jend(ig),2*ng2+1) nxi_scatt=je-1 if (jend(ig) == ng2+1 .and.special_wfb_lorentz) then nxi_scatt=nxi_scatt-1 endif do ie = 1, negrid ! LINE TO REMOVE gle(je,ie,ile)=0.0d0 ! zero redundant duplicate xi, isign=2 for vpar=0! ! right and left sweeps for tridiagonal solve: delta(1) = gle(1,ie,ile) do il = 1, nxi_scatt delta(il+1) = gle(il+1,ie,ile) !Usually : gle(il+1,ie,ile) - qle(il+1,ie,ile)*delta(il) end do gle(nxi_scatt+1,ie,ile) = delta(nxi_scatt+1)!Usually :delta(nxi_scatt+1)*betaale(nxi_scatt+1,ie,ile) do il = nxi_scatt, 1, -1 gle(il,ie,ile) = delta(il) !Usually : (delta(il) - c1le(il,ie,ile)*gle(il+1,ie,ile))*betaale(il,ie,ile) end do ! LINE TO REMOVE if (jend(ig) /= 0) gle(2*jend(ig),ie,ile) = gle(jend(ig),ie,ile) end do end do
As written, this code leads to numerical instability. Removing the two lines following
! LINE TO REMOVE
avoids the instability.
Some notes:
- For this case
je=2*jend(ig)
everywhere. - Behaviour not significantly different with choice of
special_wfb_lorentz
→ can considernxi_scatt = je-1
- We set
gle(je) = 0.0
, we then setgle(nxi_scatt+1) = gle(je-1+1) = delta(nxi_scatt+1)
(which is set to 0.0) and finally setgle(2*jend(ig)) = gle(je) = gle(jend(ig))
. We set this duplicate point to zero, it can then enter the solve in the loop on line 22 (note the usual form allows this to be non-zero when vnewk >0), it is then set directly on line 26 to something potentially non-zero. We then overwrite this value on line 31 based on the other instance of this duplicate value. It seems strange that the duplicate point enters the right-left sweeps for the tridiagonal solve. I suspect this is why thespecial_wfb_lorentz
flag has been added as this should avoid the duplicate point entering.
- For this case
-
reporter I remember being troubled by your point 3) when I first looked at this routine. I think the resolution was that betaa(nxi_scatt+1) = 0 in the initialisation, although I will have to check this. I agree that the tridiagonal has no reason to extend to nxiscatt + 1. When I made some changes to the special_wfb_lorentz option some years ago I found that there was an inconsistency in the initialised matrix of coefficients, at theta = +- pi the length of the coefficients array was short by 1 element.
-
reporter Indeed, in init_lorentz
te2 = 2*je-1
and later
qle(te2+1,ie,ile) = 0.0
betaale(te2+1,ie,ile) = 0.0meaning that betaale(nxi_scatt+1) =0.
-
reporter By setting these values before the tri diagonal solve
ql = 0.
betaa = 1.
c1 = 0.and commenting out these lines
!glz(je,ilz)=0.0 ! zero unphysical/duplicate vpar=0 poin
!if (jend(ig) /= 0) glz(2*jend(ig),ilz) = glz(jend(ig),ilz)
, And setting vpar_zero_mean = .false. ,I lose the numerical instability.
Now the code looks like
!Hack ql = 0. betaa = 1. c1 = 0. !if( .not. skip_this_bug) then!MRH bugskip ! solve for glz row by row do ilz = lz_lo%llim_proc, lz_lo%ulim_proc ik = ik_idx(lz_lo,ilz) it = it_idx(lz_lo,ilz) if(kwork_filter(it,ik))cycle if (ieqzip(it,ik)==0) cycle is = is_idx(lz_lo,ilz) ! if (abs(vnew(ik,1,is)) < 2.0*epsilon(0.0)) cycle ie = ie_idx(lz_lo,ilz) ig = ig_idx(lz_lo,ilz) !CMRDDGC, 10/2/2014: ! Fixes for wfb treatment below, use same je definition in ALL cases ! je = #physical xi values + 1 ! NB +1 above WITH TRAPPED is duplicate xi=vpar=0 point with isign=2 ! +1 above WITHOUT TRAPPED is entirely unphysical extra grid point ! je-1 = #physical xi values removing unphysical/duplicate extra point je=max(2*jend(ig),2*ng2+1) nxi_scatt=je-1 if (je==2*jend(ig) .and. vpar_zero_mean) then ! MRH if we have trapped particles and hence 2 vpar=0 points on the grid ! use the average of both vpar = 0 points in the lorentz diffusion in pitch angle ! this turns out to be necessary to suppress numerical instability ! when we handle pitch angle scattering physically at theta = +/- pi ! by setting special_wfb_lorentz =.false. glz(jend(ig),ilz)=(glz(jend(ig),ilz)+glz(2*jend(ig),ilz))/2.0 endif !glz(je,ilz)=0.0 ! zero unphysical/duplicate vpar=0 point if (jend(ig) == ng2+1 .and.special_wfb_lorentz) then !CMRDDGC: special_wfb_lorentz = t => unphysical handling of wfb at bounce pt: ! remove wfb from collisions, reinsert later ! ! first save gwfb for reinsertion later gwfb = glz(ng2+1,ilz) ! then remove vpa = 0 point, weight 0: (CMR confused by this comment!) glz(ng2+1:je-2,ilz) = glz(ng2+2:je-1,ilz) nxi_scatt=nxi_scatt-1 endif !CMRDDGCend ! right and left sweeps for tridiagonal solve: delta(1) = glz(1,ilz) do il = 1, nxi_scatt delta(il+1) = glz(il+1,ilz) - ql(il+1,ilz)*delta(il) end do !CMR, 5/9/14: ! Correction to next line fixes a bug introduced at r2699 which ! affected Lorentz scattering operator at wfb bounce point (theta=pi) ! with default setting of special_wfb_lorentz=t glz(nxi_scatt+1,ilz) = delta(nxi_scatt+1)*betaa(nxi_scatt+1,ilz) do il = nxi_scatt, 1, -1 glz(il,ilz) = (delta(il) - c1(il,ilz)*glz(il+1,ilz))*betaa(il,ilz) end do ! ! interpolate to obtain glz(vpa = 0) point for wfb ! ! and insert this point into glz ! interpolation described above mysteriously causing numerical instability ! stabilized by using old (pre-collision) value of g for wfb if (jend(ig) == ng2+1.and.special_wfb_lorentz) then glz(ng2+2:je-1,ilz) = glz(ng2+1:je-2,ilz) glz(ng2+1,ilz) = gwfb end if ! ! bug fixed 4.14.03 ! was only a problem with collisions but with no trapped particles ! because of an array index out of bounds ! Overall, this was a rare bug. ! ! update the duplicate vpar=0 point with the post pitch angle scattering value !if (jend(ig) /= 0) glz(2*jend(ig),ilz) = glz(jend(ig),ilz) end do !endif ! MRH bugskip
-
reporter I think this just confirms that the original tri diagonal solve works in the trivial case.
Fundamentally it looks like we are up against the duplicate vpar = 0 point.
-
reporter I am struggling to see how to make the pitch angle scattering routine work when nu = 0. The matrix coefficients appear to be fine. If there is an error it is likely to be in the limits of the loops.
As a sanity check, I have checked that the energy diffusion tri diagonal solve has no problems with nu =0.
-
reporter After examining solfp_lorentz_standard_layout a bit further, I have been able to make the subroutine act as the identity operator on the distribution function. To achieve this I insert the lines 21 and 22 below, and commenting out lines 33 & 78, this ensures that the tri diagonal solve returns the collisionless g unchanged.
I have not been able to find a solution where only a single vpar = 0 element (il = jend(ig) or il = 2*jend(ig)) participates in the tri-diagonal solve. Fundamentally, the gs2 collisionless algorithm has these two vpar = 0 points on the grid, whereas for the collision operator there should only be one vpar = 0 point. It seems to me that this might be the fundamental issue -- although I do not currently understand why switching off the magnetic drift for the wfb would smooth this over.
do ilz = lz_lo%llim_proc, lz_lo%ulim_proc ik = ik_idx(lz_lo,ilz) it = it_idx(lz_lo,ilz) if(kwork_filter(it,ik))cycle if (ieqzip(it,ik)==0) cycle is = is_idx(lz_lo,ilz) ! if (abs(vnew(ik,1,is)) < 2.0*epsilon(0.0)) cycle ie = ie_idx(lz_lo,ilz) ig = ig_idx(lz_lo,ilz) !CMRDDGC, 10/2/2014: ! Fixes for wfb treatment below, use same je definition in ALL cases ! je = #physical xi values + 1 ! NB +1 above WITH TRAPPED is duplicate xi=vpar=0 point with isign=2 ! +1 above WITHOUT TRAPPED is entirely unphysical extra grid point ! je-1 = #physical xi values removing unphysical/duplicate extra point je=max(2*jend(ig),2*ng2+1) nxi_scatt=je-1 ! MRH Hack betaa(:,ilz) = 0. betaa(:nxi_scatt+1,ilz) = 1. if (je==2*jend(ig) .and. vpar_zero_mean) then ! MRH if we have trapped particles and hence 2 vpar=0 points on the grid ! use the average of both vpar = 0 points in the lorentz diffusion in pitch angle ! this turns out to be necessary to suppress numerical instability ! when we handle pitch angle scattering physically at theta = +/- pi ! by setting special_wfb_lorentz =.false. glz(jend(ig),ilz)=(glz(jend(ig),ilz)+glz(2*jend(ig),ilz))/2.0 endif !glz(je,ilz)=0.0 ! zero unphysical/duplicate vpar=0 point if (jend(ig) == ng2+1 .and.special_wfb_lorentz) then !CMRDDGC: special_wfb_lorentz = t => unphysical handling of wfb at bounce pt: ! remove wfb from collisions, reinsert later ! ! first save gwfb for reinsertion later gwfb = glz(ng2+1,ilz) ! then remove vpa = 0 point, weight 0: (CMR confused by this comment!) glz(ng2+1:je-2,ilz) = glz(ng2+2:je-1,ilz) nxi_scatt=nxi_scatt-1 endif !CMRDDGCend ! right and left sweeps for tridiagonal solve: delta(1) = glz(1,ilz) do il = 1, nxi_scatt delta(il+1) = glz(il+1,ilz) - ql(il+1,ilz)*delta(il) end do !CMR, 5/9/14: ! Correction to next line fixes a bug introduced at r2699 which ! affected Lorentz scattering operator at wfb bounce point (theta=pi) ! with default setting of special_wfb_lorentz=t glz(nxi_scatt+1,ilz) = delta(nxi_scatt+1)*betaa(nxi_scatt+1,ilz) do il = nxi_scatt, 1, -1 glz(il,ilz) = (delta(il) - c1(il,ilz)*glz(il+1,ilz))*betaa(il,ilz) end do ! ! interpolate to obtain glz(vpa = 0) point for wfb ! ! and insert this point into glz ! interpolation described above mysteriously causing numerical instability ! stabilized by using old (pre-collision) value of g for wfb if (jend(ig) == ng2+1.and.special_wfb_lorentz) then glz(ng2+2:je-1,ilz) = glz(ng2+1:je-2,ilz) glz(ng2+1,ilz) = gwfb end if ! ! bug fixed 4.14.03 ! was only a problem with collisions but with no trapped particles ! because of an array index out of bounds ! Overall, this was a rare bug. ! ! update the duplicate vpar=0 point with the post pitch angle scattering value !if (jend(ig) /= 0) glz(2*jend(ig),ilz) = glz(jend(ig),ilz) end do
-
reporter I am now puzzled by the following observation. Trapped particles have vpar = 0 twice on the grid, thus resulting in the special exceptions in the tri-diagonal solve in the code above. Nonetheless, we do not see numerical instabilities from trapped particles. Since we find that the numerical instability goes away for the cases where the world's fattest banana magnetic drifts are zero, this indicates that the numerical instability is related to the world's fattest banana.
However, I have already implemented the option for the world's fattest banana to have trapped particle boundary conditions in the parallel solve (wfbbc_option = "trapped"), by using this option with the collision operator still gives numerical instability. Does this mean that the problem is not really in the collision operator, but in the parallel solve?
-
Yes I think I agree that the inclusion of the duplicate point does work out fine, it just doesn’t seem necessary and hence causes me some confusion.
-
Yes I think the issue is that we’re enforcing g(wfb,1) = g(wfb,2) at both ends of the theta grid when a collisionless solve does not (always) enforce this so we’ll find that our solution may experience a discontinuity in the parallel direction. I’ve been able to make the code go unstable by adding the below snippet at the end of
timeadv
and running without collisionsdo iglo = g_lo%llim_proc, g_lo%ulim_proc il = il_idx(g_lo, iglo) if(il/=ng2+1) cycle gnew(-ntgrid,1,iglo) = gnew(-ntgrid,2,iglo) gnew(ntgrid,1,iglo) = gnew(ntgrid,2,iglo) end do
I’m not too familiar with your different wfb boundaries but I seem to recall the trapped treatment treats the wfb as bouncing at the ends of the parallel grid rather than at every instance of the inboard side (i.e.
(2n+1)pi
)? -
I meant to add:
We do enforce continuity in the trapped particles already in the collisionless parallel solve (i.e gnew(bounce,1,iglo) = gnew(bounce,2,iglo)) which is a difference from the wfb (at least the traditional treatment).
Personally I like to consider the wfb as a trapped particle and would like to treat it as a particle which bounces at bmag=bmax.
-
Also we could check that g(wfb, 1) = g(wfb, 2) on entry to collisions. If it doesn’t then we have two possible solutions (as we could pick either of these to use in the collisional solve) which we don’t seem to have a sensible way to pick between currently (i.e. both appear equally valid). If it does then the current collisional code should probably work fine whichever choice we make.
-
I added a few print statements to look at the
gnew(-ntgrid,:,iglo_wfbs)
andgnew(ntgrid,:,iglo_wfbs)
values before and after collisions (for istep = 1).I use code like
print*,"AA",iglo,abs(gnew(-ntgrid,1,iglo)),abs(gnew(-ntgrid,2,iglo)) print*,"BB",iglo,abs(gnew(ntgrid,1,iglo)),abs(gnew(ntgrid,2,iglo))
before collisions and the same with A->C and B->D after collisions.
With wfbbc_option = ‘passing’ I get the following
AA 631 4.0535249035213036E-063 1.0110789079091423E-064
BB 631 4.0535249035213036E-063 1.0110789079091423E-064CC 631 1.9760352895927395E-063 1.9760352895927395E-063
DD 631 1.9762081094233478E-063 1.9762081094233478E-063With wfbbc_option = ‘default’ I get
AA 631 4.0535249035213036E-063 1.0110789079091423E-064
BB 631 4.0535249035213036E-063 1.0110789079091423E-064CC 631 1.9760352895927395E-063 1.9760352895927395E-063
DD 631 1.9762081094233478E-063 1.9762081094233478E-063With wfbbc_option = ‘trapped’ I get
AA 631 1.5772046198087082E-066 1.5772046198087082E-066
BB 631 3.9538474699267821E-063 3.9538474699267821E-063CC 631 1.5808356974614790E-066 1.5808356974614790E-066
DD 631 3.9536767289548769E-063 3.9536767289548769E-063
We can see that in all cases the collision operator is ensuring that the distribution is single valued at the ends of the parallel domain between the two signs of v||. This is only also true pre-collisions in the
trapped
case. As noted, the trapped case also blows up in the same way so this difference may not be related to the blow up (although I do think highlights that there is certainly an issue to address with the wfb treatment). I find it slightly surprising that the trapped case shows four orders of magnitude difference in the distribution function at the two ends of the domain, unlike the other cases. -
Skipping collisions just at the ends of the theta domain doesn’t seem to prevent things from blowing up so perhaps there’s something not quite right away from the wfb “bounce point”.
If we believe g(wfb) having different values for the two sigma when bmag=bmax then I suppose another option is to apply the collisions twice – once for each of the duplicate wfb v||=0 values. If this doesn’t lead to the same values for g at the other pitch angles (and I can’t see why it would give the same result) then this appears like it must be inconsistent. I suppose what this could be saying is that we consider our wfb points with the two different sigma to have v|| approaching 0 at bmag=bmax from opposite sides and that we are content for there to be a discontinuity across this (which we might hope collisions will smooth out) . In this picture I guess we would then include both v||->0 points for the wfb. This is clearly different from the treatment of trapped particles, but may be acceptable if we view the wfb as passing such that the two different sigmas are only coupled through collisions and potentials. I’m not sure there’s a nice way to implement this if we wanted to.
I think we probably would benefit from settling down on a single approach to treating the wfb. As I noted earlier my preference is for this to just be a regular trapped particle treatment (we’d probably want to introduce a
bounce_point
array to work alongsideforbid
in order to do this nicely). This could raise the issue that the wfb would like to have different values for g from the separate isolated trapped regions. I’m not sure if this can be addressed through use of the homogeneous solution. -
reporter Thanks for this helpful exposition! I think we are getting somewhere.
To answer this point:
I’m not too familiar with your different wfb boundaries but I seem to recall the trapped treatment treats the wfb as bouncing at the ends of the parallel grid rather than at every instance of the inboard side (i.e.
(2n+1)pi
)?When I last touched this code, wfbbc_option could take three values "passing", "trapped", "mixed", and "default". The "passing" option corresponds to the wfb being treated as a normal passing particle, with zero incoming boundary condition at the ends of the extended data domain. The "trapped" option corresponds to the wfb being treated as an ordinary trapped particle bouncing at B = Bmax (theta = +-pi) points on the grid. The "mixed option" corresponds to the wfb being treated unphysically -- the wfb is treated as a passing particle for interior +-pi points, but bounces like a trapped particle at the +-(2n-1)pi points where n is nperiod. The "default" option pointed to "mixed". In most examples these options gave no difference, although as you have seen in your previous post these boundary conditions indeed enforce different values for the wfb distribution.
Because the trapped wfb option is already implemented, I think that the wfb can already be treated as a trapped particle consistent with the rest of the trapped particles simply by setting wfbbc_option = "trapped" and using the default collision options (special_wfb_lorentz = .false.). The fact that this still gives numerical instability is deeply troubling to me -- what have I missed?
I would prefer to view the wfb as a passing particle (with vpar -> 0 at +-pi). This is inconsistent with the current treatment in the collision operator, and I wonder if it would be possible to remove the wfb completely from the code. I note that stella does not include the vpar = 0 point on its grids because it causes numerical issues.
-
reporter Because the trapped wfb option is already implemented, I think that the wfb can already be treated as a trapped particle consistent with the rest of the trapped particles simply by setting wfbbc_option = "trapped" and using the default collision options (special_wfb_lorentz = .false.)
Taking this statement at face value, and setting vpar_zero_mean = .false. in the input file, I have tried to test exactly which points in theta have magnetic drifts that cause the wfb to carry a numerical instability.
To test this, I have made the changes in commits 3a5c2a9, 602adc2, and ad4da15 in branch
https://bitbucket.org/gyrokinetics/gs2/branch/wfbdriftknobtest
The addition to the collision operator is to allow the tri-diagonal routines to be used even if vnewk = 0 for both species. The change to the magnetic routine is below. Instead of including the wfb with trapped particles for the purposes of tpdriftknob, we had a separate knob wfbdriftknob, and an integer iwfbfudge which controls (via forbid) how many points have wdrift modified by wdriftknob.
if ( jend(ig) > 0 .and. jend(ig) <= nlambda .and. il > ng2+1 .and. il <= jend(ig)) then wdrift(ig,1,iglo) = wdrift_func(ig, il, ie, it, ik)*tpdriftknob #ifdef LOWFLOW wcurv(ig,iglo) = wcurv_func(ig, it, ik)*tpdriftknob #endif else if ( jend(ig) > 0 .and. jend(ig) <= nlambda .and. il == ng2+1 .and. forbid(ig,il+iwfbfudge)) then !else if ( jend(ig) > 0 .and. jend(ig) <= nlambda .and. il == ng2+1 .and. il <= jend(ig)) then wdrift(ig,1,iglo) = wdrift_func(ig, il, ie, it, ik)*wfbdriftknob #ifdef LOWFLOW wcurv(ig,iglo) = wcurv_func(ig, it, ik)*wfbdriftknob #endif
Quite spectacularly, if I set wfbdriftknob = 0. and iwfbfudge = 2, I am able to remove the numerical instability for my original test case, for all input values of ntheta. This is true for nu =0 and nu = 2e-07 for ions and 1e-05 for electrons (default case above).
In conclusion, it looks like the parallel solve has a problem for the first two points on the grid away from Bmin.
-
reporter - attached itge.vnewk.1e-05.withoutNinstability.in
- attached itge.vnewk.1e-05.withNinstability.scrn.out
- attached itge.vnewk.1e-05.withNinstability.in
- attached itge.vnewk.1e-05.withoutNinstability.scrn.out
Input files showing how to treat the wfb as a trapped particle, and how to remove the numerical instability with the hack in https://bitbucket.org/gyrokinetics/gs2/branch/wfbdriftknobtest . This strongly suggests that the parallel solve is broken for the wfb at the points adjacent to the Bmax point. To compile this branch use Makefiles on branch ms_makefiles (5883dc3) and utils on branch ms_utils (f599234)
-
reporter Inspection of the eigenmodes reveals that there are unphysical structures at large kperp, if ntheta is too small, even with the hack described in the last two posts. The problem appears to go away for large enough ntheta (> 16 or so). It is not clear if this is because the problem with the wfb propagates beyond points near to Bmax, or if the hack itself is responsible.
-
Just catching up on this. To me, treating the wfb as passing with vpar->0 suggests that the two sigma values correspond to different populations at the wfb bounce point so both should appear in the Lorentz operation.
I note that
else if ( jend(ig) > 0 .and. jend(ig) <= nlambda .and. il == ng2+1 .and. forbid(ig,il+iwfbfudge)) then
will potentially result in an out-of-bounds access (there’s no logical short-circuiting in Fortran so we will evaluate every condition in this statement, e.g. if il=nlambda we’ll still evaluateforbid(ig,il+2)
). Doesn’t this condition evaluate to true for all theta beyond the bounce point of the wfb+2 pitch – i.e. the region between ntgrid_bounce(ng2+1 + iwfbfudge) and ntgrid, so the points around bmax rather than the points around bmin? I’m probably just misreading this though.On the
wfbbc_option='trapped'
code I’ve just had a look and it appears that we do not setg2
for the wfb in this case, but do for all the other trapped particles. It may turn out that this is not required for the wfb as there are no gaps whereg1==0
. We also ensure a single valued distribution function at the upper ntgrid bounce point, but we don’t seem to enforce this at any internal bounce point. (We also ensure single valued wfb at theta = -ntgrid through construction). Again, this may work itself out if I think about things further but I just thought I’d check if my initial reading of the code is consistent with your understanding.
-
reporter Doesn’t this condition evaluate to true for all theta beyond the bounce point of the wfb+2 pitch – i.e. the region between ntgrid_bounce(ng2+1 + iwfbfudge) and ntgrid, so the points around bmax rather than the points around bmin?
Sorry, I should have written Bmax, where I wrote Bmin above. I am trying to prevent the wfb drifting near Bmax. I have corrected my statements above
-
reporter I note that
else if ( jend(ig) > 0 .and. jend(ig) <= nlambda .and. il == ng2+1 .and. forbid(ig,il+iwfbfudge)) then
will potentially result in an out-of-bounds access (there’s no logical short-circuiting in Fortran so we will evaluate every condition in this statement, e.g. if il=nlambda we’ll still evaluateforbid(ig,il+2)
)I am not suggesting that this becomes a permenant fix! I can change il+iwfbfudge to ng2 + 1 + iwfbfudge.
-
reporter We also ensure a single valued distribution function at the upper ntgrid bounce point, but we don’t seem to enforce this at any internal bounce point.
Good spot. I would have written this with “box” mode with nperiod = 1 in mind. This should be fixed by solving for wfb between every N pi, N+2 pi
-
Thanks for the responses. The only reason I point out the out-of-bounds access is that this can, in some instances, lead to all sorts of strange behaviour in unrelated parts of the code, it’s only a read access here so probably fine.
-
reporter It looks like using “box” mode with nperiod = 1 and wfbbc_option = “trapped” removes the instability. Testing further.
-
That’s exciting, have you checked “box” with other
wfbbc_option
values? -
reporter - attached itge.vnewk.1e-05.switchwfb2.in
- attached itge.vnewk.1e-05.switchwfb2.ikx.0.iky.1.fields.pdf
- attached itge.vnewk.1e-05.switchwfb2.scrn.out
Very reduced input files that run quickly on a single core, showing that the numerical instability is removed by using "box" mode with wfb treated as a trapped particle. If one switches from the wfbbc_option = 'trapped' to the wfbbc_option = 'passing' option then the numerical instability returns.
This supports the need to treat wfb as a trapped particle. However this is problematic outside of "box" mode with nperiod = 1. The reason for this is that theta = N pi points are duplicated in "box" mode, but they are not duplicated in "range" mode. Presumably in range mode we should treat wfb as a passing particle, and modify the collision operator appropriately?
David, what are your thoughts on this?
-
I agree that if we treat the wfb as passing then the collision operator should be modified.
Yes in ballooning space we have some extra complication in being able to handle fact that at the wfb bounce point we might expect to find values for the wfb distribution function coming from either side of this bounce point. It would be nice to think of an elegant way to handle this. One (inelegant) approach may be to drop the duplicate sigma point at bounce points as we require the distribution to be identical for both sigma we shouldn’t need to store both. We could then interpret the two sigma as referring to the two sides of the bounce point here. This is hacky, likely to cause significant confusion and would require very different handling to the current passing approach so I’m not particularly in favour of this approach, but it might be one way to quickly hack in an implementation of this for us to test.
If one could argue that g_wfb(bounce) had to be single valued then our life would be simplified significantly I think, but I’ve yet to think of a good reason this would have to be the case other than arguing that collisions will tend to smooth the distribution function across the trapped-passing boundary so if the barely passing passing distribution is single valued and continuous here then the wfb at either side of bmag=bmax at least can’t be very different from one another.
-
For the trapped wfb treatment I think we would always want to use
radau_gauss_grid = F
– is this right? Presumably we would expect the potentials to be single valued at the duplicate theta point -- for this to be the case we would expect the velocity integral of the distribution function to have the same value at both of the duplicate points. With Radau-Gauss grids we have non-zero weight for the wfb at its bounce point so we could find different potentials there if the wfb has different distribution function values at the duplicate bounce points.
-
reporter That might be correct, that argument sounds reasonable. One of the reasons that I introduced the radau_gauss_grid = T option was because the collision operator coefficients are calculated using the particle weights, and if we have zero weight at one point then this prescription will run into trouble. Perhaps this point is fixable.
What I do not see how to fix is how to resolve the fact that we would need to perform pitch angle scattering at two duplicate theta points (when we only have one on grid). Since we only have the rest of the v|| points once, how do we decide which pitch angle scattering calculation was correct? The one from theta=pi- or theta =pi+?
-
reporter I was able to find enough time to experiment with my proposed solution of removing the wfb grid point entirely. According to @William Dorland , gs2 was originally written without a wfb grid point. A working version of my branch without the wfb is here https://bitbucket.org/gyrokinetics/gs2/branch/remove-wfb-from-grid
This branch does not have the numerical instability featured in this thread. The collision operator causes no numerical instability in the limit nu->0, even with bakdiff = 0. I have done some initial sanity checks with some linear simulations close to my original input file here, but no nonlinear runs
I chose to implement these changes in my branch because 1) I want this feature, along with other features in my branch, and 2) it was relatively quick in a version of the code I was familiar with. I'm sure it will be possible to put this to the more recent version of gs2 in master or next -- however, there may be places in the code that I did not make fully consistent for lack of time, and so I thought this initial branch could act as a talking point for users to play with.
I would be keen to have feedback from the other developers to see if there is a downside to removing the wfb, and what code I have missed, if any. @Michael Barnes @Joseph Parker
-
I think there isn’t a correct point to choose here but we would have to do and store both. This is certainly not something which has an obvious simple elegant solution (for me at least). I’m warming a little to my earlier suggested horrible hacky approach which doesn’t store the duplicate v||=0 for both signs, freeing up space to store both values of g at the duplicate theta point. This might not be as terrible as I first worried if applied consistently, but I may also be warming to it just because I’ve not been able to think of an alternative.
I see you’ve added a commit which demonstrates how to remove the wfb entirely. This is a useful option to have and I’m interested to see if this reproduces existing results well whilst avoiding the observed numerical instabilities (both the original collisional issue and also similar in appearance issues such as issue #108).
-
reporter I am up for a discussion about your approach if that would be useful.
I should warn that in the commit that I just made the wfb is removed by force, with no option to add it back in with a change of a logical flag. I found it too tiring to accommodate a new option alongside the various wfb special exceptions that already existed -- if need be I can think more carefully about how to get all of these options into one code.
I would be very enthusiastic about testing the wfb-free grid and code in a variety of circumstances, if you have a list of issues to test. (Hopefully the issues exist also in my old branch!)
-
I strongly agree with your approach of removing the wfb in that branch rather than trying to accommodate every possible option! At this point it’s better to get clearer but less flexible code as it allows us to test the consequences of the decision. I’m happy to look at porting this to work with the current release (as I’d like to use this).
In terms of issues to test against I think issue #108 is the main one recorded on the Bitbucket tracker which also provides an input file to test against. I’ve seen a similar instability in a number of different cases and I suspect they are all somewhat related. I do wonder if it could possibly help with the issue reported in issue #95.
I’m currently trying to work on wrapping up our logic about how we decide how to treat various pitch angles into logical functions (see https://bitbucket.org/gyrokinetics/gs2/commits/fcc5ce05261ea9b4ef32841fb1c7be0e6252b3b4?at=feature/add_functions_wrapping_up_logic_about_pitch_angles) for an example – this might make it easier to support a number of different choices as it limits the appearance of such choices to these small logical functions. If nothing else, I think this leads to more readable code.
-
I tested the input file from issue #108 with your branch and unfortunately it seems it still does blow up. Are there any changes to inputs that are needed to make sure this is consistent?
-
I think I can see potential for a more general issue which eliminating the wfb won’t necessarily fix. If we have a magnetic field with internal local maxima (i.e. maxima with bmag < bmax) then we will have a pitch angle which bounces on both sides of this point, as with the wfb. This is likely to be subject to the same instability as is facing the wfb.
-
reporter I would be wary of new_trap_int and opt_source, so I have just tested the minimal.in of #108 with new_trap_int = .false. and opt_source=.false.. I also find numerical instability. On reflection, I am not surprised. The bug reported in this thread appears to be related to the inconsistency between the treatment of collisions and the collisionless parallel solve, whereas #108 appears to be about collisionless physics only. Can you find the bug in #108 in toroidal geometry? It looks very similar to an instability that I have previously come across in cyclone base case.
-
Yes it was a bit of a long shot, thanks for confirming.
I believe this does exist in toroidal geometry as well.
-
The only other numerical instability associated with collisions that I’m aware of is one that @Joseph Parker saw with his split collisions code but I think that’s also probably not related to the wfb but rather the change in treatment of collisions generally.
-
reporter I had not considered internal bounce points. Do you have a recipe for producing such a magnetic field? I have never considered how the parallel solve deals with internally bouncing particles. My current understanding is the trapped particles bounce at ig when forbid(ig+1,il) = T (if v|| > 0) or forbid(ig-1,il) = T (if v|| < 0). For a magnetic field with a local maximum B_loc, how does gs2 treat lambda = 1/B_loc? For this lambda, I would expect that forbid = F at B=B_loc, and so I would expect that the particle passes over B=B_loc. This would mean (as I think you imply) that the particle would have v|| = 0 at B_loc but v|| =0+ and v||= 0- could have different amplitudes.
-
A recipe is as follows
&theta_grid_knobs equilibrium_option = "eik" / &theta_grid_eik_knobs beta_prime_input = -1.00000000000E+00 bishop = 4 equal_arc = F iflux = 0 irho = 2 local_eq = T s_hat_input = 4.00000000000E+00 writelots = T / &theta_grid_parameters akappa = 2.0 akappri = -0.3 nperiod = 1 ntheta = 128 qinp = 0.5 r_geo = 0.30000000000E+01 rhoc = 0.50000000000E+00 rmaj = 0.30000000000E+01 shift = 0.00000000000E+00 tri = 0.50000000000E+00 tripri = 1.5 /
This may not be particularly representative of a real tokamak, but I’ve found that these local maxima do appear when heading towards the pedestal of STs at least.
Yes GS2 uses
forbid(ig +/- 1, il)
to determine if we’re at an upper or lower bounce point and to then act as required. For a pitch angle bouncing at an internal maximum we would therefore not identify the bounce point and the pitch would be treated as passing at these internal bounce points, only bouncing at the outer bounce points so could have different values based on the sign of velocity at the internal bounce point. -
reporter Thanks for the code! I have tried inserting this in my input files, and I see numerical instability regardless of which branch I use. Could I trouble you to supply a whole input file to make sure that I'm not making a mistake?
-
This was not taken from a real case, but just constructed to demonstrate a possible bmag structure.
Adding the following gives me a vaguely complete input file which doesn’t seem to blow up with current next
&dist_fn_species_knobs_1 bakdif = 0.00000000000E+00 fexpr = 0.50000000000E+00 / &kt_grids_knobs grid_option = "single" / &kt_grids_single_parameters aky = 0.40000000000E+00 / &species_knobs nspec = 1 / &species_parameters_1 dens = 0.10000000000E+01 fprim = 0.22000000000E+01 mass = 0.10000000000E+01 temp = 0.10000000000E+01 tprim = 0.69000000000E+01 type = "default" vnewk = 0.00000000000E+00 z = 0.10000000000E+01 / &knobs nstep = 1000 /
Turning on collisions (vnewk =0.1) doesn’t seem to lead to instability either – so it’s not immediately obvious that this demonstrates a problem. Increasing nperiod from 1 to 2 also doesn’t show an issue.
-
Ah I spoke too soon. Setting
nperiod=2
andvnewk = 1.0e-7
does eventually go numerically unstable by aroundt=90
-
reporter My gs2 output looks like the following…
r_bar = 0.500000000000000 d beta/d rho = 0.000000000000000E+000 , -1.00000000000000 s_hat 4.77550232732293 , 4.00000000000000 sum phinew**2= 0.67E+09 sum aparnew**2= 0.00E+00 sum bparnew**2= 0.00E+00 sum phinew**2= 0.67E+09 sum aparnew**2= 0.00E+00 sum bparnew**2= 0.00E+00 ky= 5.00E-01 kx= 0.00E+00 om= 0.00E+00 0.00E+00 omav= 0.00E+00 0.00E+00 phtot= 1.07E+08 theta0= 1.00E-01 sum phinew**2= NaN sum aparnew**2= 0.00E+00 sum bparnew**2= 0.00E+00 ky= 5.00E-01 kx= 0.00E+00 om= NaN NaN omav= NaN NaN phtot= NaN theta0= 1.00E-01
Trying this with current trunk works if I change ntheta = 130. I suspect a bug in the geometry module in my branches, but not in next.
-
I think there must be some difference in the geometry calculations as I find the “expected” shear is reported as
4.7816880047933630
whilst you find4.77550232732293
whilst these numbers aren’t very different (and not useful), they’re different enough to demonstrate some variation in the code.It turns out I was testing with the wrong setup – I had qinp = 3.5 rather that 0.5. When I use the correct value I also see the fields becoming Nan after the first output. Dropping
tripri
to 0.5 sees this go away (the mode grows but at a reasonable rate). Slowly increasingtripri
I find the appearance of Nan seems to coincide with the appearance of the local maxima. Usingtripri=0.777
the simulations behaves reasonably, usingtripri=0.778
leads to Nans appearing immediately. Checkingdiff(bmag[theta>=0])
I can see that this seems to be associated with the appearance of a negative diff value, giving evidence that the issue is indeed to do with local maxima appearing.I should also note that for both
tripri
values I find diff values which are exactly 0 suggesting that we have a flat bmag structure in places. -
I should note, the Nans appear in the collisionless case with nperiod=1 so the issue here cannot be solely with the trapped treatment interacting with the Lorentz collision operator.
-
Furthermore, it seems that we end up with different nlambda in these two cases – 72 with tripri=0.778 and 74 with tripri=0.777 (with ntheta=130, ngauss=5 I would expect nlambda = 10 + 66 =76 I think). I wonder if gridgen is perhaps struggling with this case but we’ve not triggered one of the test cases which emit a warning.
-
It seems that the nan is appearing because we find that
g1(ig, 1)==1.0
for a particular lambda and theta point. When trying to ensure consistency at trapped bounce points we add in a piece of the homogeneous solution proportional to1/(1-g1(bounce,1))
so here we end up dividing by zero.
-
For this particular point I find
forbid(ig+1, il) = T, forbid(ig, il) = F, forbid(ig-1, il) = F, forbid(ig-2, il) = T
whereig
is the upper bounce point at which we calculate1/(1-g1(ig,1))
. This looks like a stretched out totally trapped particle – there are no points at whichv||/=0
in this allowed region. -
In this case it can be avoided by simply guarding the calculation of beta1 on
if (gnew(ig, 1, iglo) /= gnew(ig, 2, iglo))
- in other words only set beta1 /= 0 if we don’t already satisfy the bounce condition.
-
With this small change to the code I’ve been able to reproduce the wfb-like collisional error.
For reference my input file is
&dist_fn_species_knobs_1 bakdif = 0.00000000000E+00 fexpr = 0.50000000000E+00 / &kt_grids_knobs grid_option = "single" / &kt_grids_single_parameters aky = 0.40000000000E+00 theta0 = 40.0 / &species_knobs nspec = 1 / &species_parameters_1 dens = 0.10000000000E+01 fprim = 0.22000000000E+01 mass = 0.10000000000E+01 temp = 0.10000000000E+01 tprim = 0.69000000000E+01 type = "default" vnewk = 0.0 !10000000000E-06 z = 0.10000000000E+01 / &knobs nstep =1000 / &theta_grid_knobs equilibrium_option = "eik" / &theta_grid_eik_knobs beta_prime_input = -1.00000000000E+00 bishop = 4 equal_arc = F iflux = 0 irho = 2 local_eq = T s_hat_input = 4.00000000000E+00 writelots = T / &theta_grid_parameters akappa = 2.0 akappri = -0.3 nperiod = 1 ntheta = 128 qinp = 0.5 r_geo = 0.30000000000E+01 rhoc = 0.50000000000E+00 rmaj = 0.30000000000E+01 shift = 0.00000000000E+00 tri = 0.50000000000E+00 tripri = 1.5 /
Simply switching between vnewk = 0 and vnewk = 1.0e-7 is enough to see the instability. Keeping vnewk = 1.0e-7 but dropping tripri to 0.7 removes the internal local maxima and appears to avoid the instability as shown below. It would probably be useful to identify a narrower range of tripri as in the earlier cases to make the argument that the issue is the introduction of local maxima a bit more convincing.
-
Here are the bmag structures for the two tripri values
-
reporter This is very convincing, but unfortunately I'm not able to get past the original NaNs mentioned above. In dist_fn.fpp, I have tried to take into account that g1 may be 1.0 in the following way (changes on line 8 & 9). Have I misunderstood? Have you tested the above with next, or with my remove-wfb-from-grid branch? What I want to confirm is that the code with the wfb removed definitely has the problem that you feature above. I expect it will!
if (il >= ng2+1 .and. il <= lmax) then beta1 = 0.0 do ig = ntgr-1, ntgl, -1 if (ittp(ig) <= il) cycle if (forbid(ig,il)) then beta1 = 0.0 cycle !CMR: to avoid pointless arithmetic later in loop else if (forbid(ig+1,il) .and. (gnew(ig, 1, iglo) /= gnew(ig, 2, iglo))) then !else if (forbid(ig+1,il)) then beta1 = (gnew(ig,1,iglo) - gnew(ig,2,iglo))/(1.0 - g1(ig,1)) end if gnew(ig,1,iglo) = gnew(ig,1,iglo) + beta1*g1(ig,1) gnew(ig,2,iglo) = gnew(ig,2,iglo) + beta1*g1(ig,2) end do end if
-
This was a test on next (or something branched off next).
That fix looks appropriate. I wrote it slightly differently, but it should be equivalent I think
else if (forbid(ig+1,il)) then if (gnew(ig,1,iglo) /= gnew(ig,2,iglo)) then beta1 = (gnew(ig,1,iglo) - gnew(ig,2,iglo))/(1.0 - g1(ig,1)) else beta1 = 0.0 end if end if
-
Just to note, I also find the NaNs remain in your branch with my version of the “fix” as well, also even for
tripri=0.7
.I've identified that this is still due to the same calculation of
beta1
. The relevantil
index has a forbidden region just near the in-board side unlike theil
which was causing issues for me.This remains true if I revert 32b5a708 so is not due to the removal of the wfb (and even if I checkout 6d9a844 and then apply the beta1 fix).
This also seems to remain true if I relax the equilibrium to a circle so I worry that I must be doing something wrong :
&theta_grid_knobs equilibrium_option = "eik" / &theta_grid_eik_knobs beta_prime_input = 0.00000000000E+00 bishop = 4 equal_arc = F iflux = 0 irho = 2 local_eq = T s_hat_input = 4.00000000000E+00 writelots = T / &theta_grid_parameters akappa = 1.0 akappri = 0.0 nperiod = 1 ntheta = 128 qinp = 0.5 r_geo = 0.30000000000E+01 rhoc = 0.50000000000E+00 rmaj = 0.30000000000E+01 shift = 0.00000000000E+00 tri = 0.00000000000E+00 tripri = 0.0 /
-
reporter - attached itge.geo.in
This input file works in my branch with the change to limit beta1 being set, above. Perhaps you can use to to work out what’s wrong with your input file?
-
Thanks, it turns out
delt
is not given a default value in your branch (I think this is something we fixed in the main release a while ago). Setting this to 0.1 prevents the NaNs.I’ve reproduced the previous results from the earlier version of your branch – just vnewk>0 tripri=1.5 blows up here
The problem also persists with your latest version of the branch (i.e. with the wfb removed)
-
reporter Ok - I think this means we have converged on the need to have particles bouncing from left and right at a single point?
The other solution would be to make gridgen remove every WFB-like lambda.
-
Yes I think that’s right. It’s possible that there could be similar issues with trapped particles where bounce regions are separated by a single point as well (less clear of an issue but we sometimes set coefficients in the forbidden region as a part of the algorithm and I worry there could be chance for there to be some unintended interaction here).
-
I’ve created issue #148 to try to distil the general issue into a single post.
-
reporter Thanks! #148 gives a very nice summary.
- Log in to comment
Just to confirm this goes away with
ei_coll_only = T
as would be expected. Increasingky
causes the numerical instability to blow up faster, reducing it prevents the instability from arising (within the duration of the simulation at least). To make the test case cheaper it is possible to set theta0=40 and nperiod=3 – for ky=0.5, shat=0.8, theta0=0.1 we see the mode blow up significantly at around theta=100. (Setting theta0=40 should give us similar kperp2 at theta=0 as at theta=100 with theta0=0.1).