Segfault in XGC-1 on codecamp2017 involving neutrals and limiter

Issue #10 closed
Steve Abbott created an issue

I'm seeing a segfault with the XGC-1 ES build on codecamp2017_pcuf. The relevant routines should be identical to those in codecamp2017. This is blocking integration of the GPU pusher into the main branch, i.e., pull request #101.

The relevant trace on summitdev is:

4 calbdpoints_() git/epsi/XGC1_3/limiter.F90:227
5 neutral2_setup_() git/epsi/XGC1_3/neutral2.F90:455
6 neutral_init_() git/epsi/XGC1_3/setup.F90:3887
7 setup_() git/epsi/XGC1_3/setup.F90:117

Possibly relevant input file flags:

  • sml_neutral=.t.
  • sml_neutral_use_ion_loss=.f.
  • &neu_param ! neutral collision
  • neu_col_mode=2
  • neu_grid_max_psi=1.2D0
  • neu_cx_period=5
  • neu_mode2_period=1000
  • neu_sepfile='sep_c1140613017.dat'
  • neu_limfile='lim_c1140613017.dat'

I have core files too, if we need to debug further. As far as I can tell, the segfault is happening on the ylim access.

Either way, I don't understand why this just started breaking now, and these two calls to calbdpoints look mutually exclusive to me.

call calbdpoints(neu_sep_mtheta,neu_sep_r_file,neu_sep_z_file,neu_sep_mtheta_file-1,neu_sep_r,neu_sep_z)

call calbdpoints(neu_sep_mtheta,lim_org_r,lim_org_z,lim_mdata-1,neu_lim_r,neu_lim_z)

@seunghoeku and @rhager: Does either of you have any idea why this is breaking now? Was one of the removed input options (the ignore_drift bits or the neutral_start_step option) hiding this problem for me? Is my input file using deprecated routines?

Comments (7)

  1. Seung-Hoe Ku

    I do not have any idea at the moment. Does neu_limfile have a reasonable amount of data? For example, is the first line positive, and does it give the number of data points?

  2. Steve Abbott reporter

    @seunghoeku The file is sane; the first line accurately states that there are 55 data points. But I don't think that file is actually getting read. It looks like the only place that reads that is XGC1_3/limiter.F90: limiter_read. But limiter_read only gets called from limiter_setup, which only gets called here:

    File: XGC1_3/setup.F90
    3641:      !limiter setup
    3642:      if (neu_full_grid) then
    3643:         call limiter_setup
    3644:         call check_point('after limiter setup')
    3645:      endif
    

    But:

    File: XGC1_3/setup.F90
    2073:      neu_full_grid = .false. !use neutral_totalf. Supersedes neu_col_mode==2.
    

    ^^ hardcoded, with no entry in a namelist.

    So the limiter file never gets read.

    (edited to correct line numbers: I had some print debugging elsewhere in the file)
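
    The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the XGC-1 code: it assumes lim_org_r and lim_org_z are allocatable module arrays, and limiter_read_stub and limiter_data_ready are hypothetical stand-ins introduced only for illustration. It shows how a guard on allocated() would report the missing data instead of letting a consumer such as calbdpoints touch arrays that were never filled.

```fortran
! Sketch only: names mirror the thread, but the declarations are assumptions.
module lim_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer :: lim_mdata = 0
  real(dp), allocatable :: lim_org_r(:), lim_org_z(:)
contains
  ! Hypothetical stand-in for limiter_read: fills the arrays with
  ! synthetic values instead of reading neu_limfile.
  subroutine limiter_read_stub(n)
    integer, intent(in) :: n
    integer :: i
    lim_mdata = n
    allocate(lim_org_r(n), lim_org_z(n))
    do i = 1, n
      lim_org_r(i) = 1.0_dp + 0.01_dp*i
      lim_org_z(i) = -0.5_dp + 0.02_dp*i
    end do
  end subroutine limiter_read_stub

  ! Hypothetical helper: true only when the limiter data is safe to use.
  logical function limiter_data_ready()
    limiter_data_ready = allocated(lim_org_r) .and. allocated(lim_org_z) &
                         .and. lim_mdata > 1
  end function limiter_data_ready
end module lim_sketch

program check_limiter_data
  use lim_sketch
  implicit none
  logical :: neu_full_grid

  neu_full_grid = .false.   ! hardcoded, as at setup.F90:2073

  ! Same condition as setup.F90:3642: the read is skipped when the flag is false.
  if (neu_full_grid) call limiter_read_stub(55)

  if (limiter_data_ready()) then
    print *, 'limiter data ready, points: ', lim_mdata
  else
    print *, 'limiter data was never read; skipping calbdpoints'
  end if
end program check_limiter_data
```

    With the flag hardcoded to .false., the sketch takes the second branch, which matches the observation that the limiter file is never read before calbdpoints runs.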

  3. Seung-Hoe Ku

    @rmc256 @abbotts1 The neutral code was updated at code camp by Michael. I have added Michael for his advice on this issue.

  4. Michael Churchill

    @abbotts1 @seunghoeku Doing a Blame, looks like Robert changed it from my neu_full_grid==0 to just neu_full_grid. Line 3642 in setup.F90 should be changed back to neu_full_grid==0 or more accurately

    if (.NOT. neu_full_grid) then
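
    For context, the corrected guard can be exercised on its own. This is a hedged sketch rather than the actual patch: limiter_setup_stub, check_point_stub, and the limiter_was_read flag are hypothetical stand-ins for the real routines, and neu_full_grid is assumed to be a logical, as the .false. assignment at setup.F90:2073 suggests.

```fortran
program limiter_guard_sketch
  implicit none
  logical :: neu_full_grid
  logical :: limiter_was_read = .false.   ! hypothetical bookkeeping flag

  neu_full_grid = .false.   ! value hardcoded at setup.F90:2073

  !limiter setup, with the negated condition proposed for setup.F90:3642
  if (.not. neu_full_grid) then
    call limiter_setup_stub()
    call check_point_stub('after limiter setup')
  end if

  ! Anything that later consumes the limiter arrays needs this to hold.
  if (.not. limiter_was_read) error stop 'limiter data missing'
  print *, 'limiter data available before neutral2_setup'

contains

  ! Stand-in for limiter_setup: only records that the read happened.
  subroutine limiter_setup_stub()
    limiter_was_read = .true.
  end subroutine limiter_setup_stub

  ! Stand-in for check_point: prints the label it is given.
  subroutine check_point_stub(msg)
    character(len=*), intent(in) :: msg
    print *, 'check_point: ', msg
  end subroutine check_point_stub

end program limiter_guard_sketch
```

    Restoring the un-negated condition (if (neu_full_grid) then) in this sketch makes it stop at the error stop line, which is the stand-in for the segfault seen in calbdpoints.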
    