The fields divB and gradPhi are currently allocated in Grid_PrepareGrid. This gets called on all grids on all tasks, which leads to memory errors. This PR moves the allocation to Grid_AllocateGrids, which only gets called on the grid's task.
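To illustrate the distinction, here is a minimal sketch (not the actual Enzo code; the struct, members, and method names are simplified stand-ins) of why the allocation belongs in the routine that runs only on the owning task:

```cpp
#include <cassert>

// Hypothetical, stripped-down stand-in for the grid class. The point
// is the ownership guard: PrepareGrid runs on every task for every
// grid and must not allocate, while AllocateGrids checks ownership
// before touching memory.
struct Grid {
  int ProcessorNumber = -1;  // task that owns this grid
  int size = 0;              // number of cells
  float *divB = nullptr;     // divergence of B (Dedner cleaning)
  float *gradPhi = nullptr;  // gradient of the cleaning scalar

  // Called on all tasks for all grids: metadata only, no allocation.
  void PrepareGrid(int owner, int n) {
    ProcessorNumber = owner;
    size = n;
  }

  // Allocation guarded by ownership, so remote tasks never allocate
  // storage for grids they do not hold.
  void AllocateGrids(int MyProcessorNumber) {
    if (MyProcessorNumber != ProcessorNumber)
      return;                      // not our grid: leave fields null
    divB = new float[size]();      // value-initialized to zero
    gradPhi = new float[3 * size]();
  }
};
```

Under this pattern a task that doesn't own the grid keeps null pointers, which is what moving the allocation out of Grid_PrepareGrid achieves.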
Since this changes the allocation sequence, I issued PR 404 which ensures that divB and gradPhi are allocated for all possible initializers.
This PR fails the test suite vs. the tip of enzo-dev. The tests that run work just fine. However, the following tests die with segmentation faults:
Note that MHDZeldovichPancake (the only 1D test that fails) seg faults after 252 time steps, whereas the 2D tests that fail die after either the first or second time steps.
Actually, I have a quick followup on this, since I'm also looking at PR #364. Did you successfully run the test suite, @dcollins4096 ? It's not enabled on your BitBucket account by default, it seems. Does this PR require something from a previous PR in order to run successfully?
That's right, sorry I wasn't clear in the first note:
It seemed like it would be much easier to review if they were two separate PRs, since PR 404 is literally the same change 47 times.
On my desktop, both PR 405 and PR 404 pass the push suite, provided PR 404 is pulled first.
Oh, hilarious. OK, in that case I am not at all concerned about this one. :-) Thanks for clearing this up! (I'll also verify it, but I imagine it's just fine.)
Just following up on this - even with both PR #404 and PR #405 pulled in, I'm still getting the same seg faults. Does it need something else that I'm missing? Maybe also PR #364 ?
Oh, fascinating. I just looked for "Fail" in test_results, but these only threw errors.
I do see that seg fault. I'll dig into it.
Pushed a fix. The tests @bwoshea mentioned now actually pass.
@dcollins4096 is there a specific reason you added the allocations to src/enzo/hydro_rk/Grid_MHDRK2_1stStep.C?
I see how they fix the segfaults, but my first impression is that the allocations there are out of place, i.e., at that point in the code the variables should already have been allocated, which suggests that a fix at an earlier time (in the code flow) would be more appropriate. If you agree, I'd be happy to have a look at where that 'earlier time' might be.
That's an excellent question.
I looked into exactly that, allocating it earlier, which is what prompted PR 404. I spent a bunch of time yesterday looking into all the places that I missed for the allocation (which caused the seg faults), and it became clear that was more effort than it was worth. It turns out that divB and gradPhi aren't filled with anything meaningful and aren't used until they're computed in the RK solver. They don't get interpolated, moved, communicated, or copied from one grid to another. (This implies that they're not persistent between sub-grid timesteps, for what that's worth, which I didn't realize until now.) So it's not really necessary to align the allocation of divB with all the points where BaryonField comes into existence. (There are several of them: I identified two, but there might be more.)
The upshot is that allocating them on demand matches exactly how they're used.
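The on-demand pattern described above can be sketched like this (an illustrative example, not the actual Enzo code; the class and method names here are simplified placeholders for the real grid class and Grid_MHDRK2_1stStep.C):

```cpp
#include <cassert>

// Sketch of allocate-on-demand: the RK solver allocates divB only if
// it finds the pointer null, which covers freshly created subgrids
// that were never passed through an explicit allocation step.
struct MHDGrid {
  int size = 0;
  float *divB = nullptr;

  explicit MHDGrid(int n) : size(n) {}

  void MHDRK2_1stStep() {
    if (divB == nullptr)          // new subgrids arrive unallocated
      divB = new float[size]();   // allocate just before first use
    // ... divB is filled here and consumed by the source terms ...
  }
};
```

Because the field is filled and consumed entirely within the solver, allocating it at the top of the solver is equivalent to allocating it anywhere earlier in the code flow.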
We should discuss if this is the behavior that divB and gradPhi should have, but that's going to require more careful testing.
I'd argue for digging a bit deeper. Rereading Brian's comment that "Note that MHDZeldovichPancake (the only 1D test that fails) seg faults after 252 time steps" indicates that something gets falsely allocated (or pointed to) early on. So it could be worth looking at this now as it might prevent other problems down the road.
The MHDZeldovichPancake doesn't refine until 252 steps. The test failed at that point because the new subgrids didn't have divB allocated. Brian's comment predates my putting the allocate in Grid_MHDRK2_1stStep.C; after I put in the allocate, the tests pass. So you're correct that something was going wrong with the allocation, but the problem was that the field was not getting allocated at all, and that's now fixed.
I did dig deeper, and there is no purpose in having divB exist outside of those routines, if we're going to keep the functionality of that field the same. The fields only get used in Grid_MHD3D, where they're filled, Grid_MHDSourceTerms where they're added to the fluxes, and the destructor, where they're deleted. So there isn't any danger of anything else happening.
We could, though, promote it to a proper BaryonField. This would be, I think, straightforward, though it would possibly change the behavior. It would allow future users who for whatever reason need the field to have it defined before the hydro step.
I'll need to have a closer look at the Dedner AMR structure again. If those fields aren't used beyond the machinery you mention, then it might be a good idea to remove those lines of code from Grid_AllocateGrids and keep them only in hydro_rk/Grid_MHDRK2_1stStep.C (or similar). That way there's only one (clean) place where they are allocated, and not having them as a BaryonField would also highlight that they are not supposed to be touched by other routines.
I think that is the best solution.
I went ahead and made that change. It passes the AMR tests mentioned above, and I'm running the push suite now. If anything breaks, it will be in the AMR Dedner tests.
Thanks for your comments, I appreciate it!
@dcollins4096 just checking - is this ready to be merged now? Also, would you mind merging with the tip of enzo-dev to make sure that it still passes all of the tests?
Just following up on my own comment: I merged this PR with the tip of enzo-dev and ran it against the tip. The push suite passes with flying colors. I'm going to hit approve; @pgrete , could you sign off as well? And maybe @gbryan could take a quick look as well?
I did run the push suite as well, and it passed. I also addressed the two recent comments from @ngoldbaum and @gbryan. Thanks!
Edit to add: I meant I ran the push suite on a locally-merged version, vs. the current tip.
I went through the Dedner machinery in more detail and am now suggesting we talk about this in more detail next week at the workshop.
Depending on whether people actually use UseDivergenceCleaning (not the Dedner method, but the active deprojection method) we could remove those variables completely to clean up the code as they are not required at all (if I haven't missed anything fundamental) for the mixed (hyperbolic/parabolic) Dedner-style divergence cleaning.
They are only required if the EGLM-MHD formulation is used, which (a) never happens (see Grid_MHDSourceTerms.C) given that there's no #define DEDNER_SOURCE and (b) the Dedner paper itself recommends using the GLM-MHD formulation as the additional source terms in the EGLM-MHD formulation cause a "significant loss of conservation".
I'm happy to talk more next week, thanks for looking into it!
I think it makes sense to hold off on doing anything else with this PR until we discuss next week!
Upon further discussion, the divB and gradPhi terms are only used in the MHD source terms, which are fully commented out in the code. So those two fields are allocated and filled, but never used. This is consistent with the statements in the original Dedner et al 2002 paper, which states that these terms are problematic. So we’ve removed the terms completely.
This does not affect the main hyperbolic treatment, which only requires the Phi field, which is treated as a full BaryonField. It also does not affect the Poisson-solver cleaning (`UseDivergenceCleaning`), which generates its own divergence field.
This looks correct to me.
Should we accept and merge?
I also think that it's ready to be merged once Dave confirms that it successfully passes the test suite.
I can confirm that it still compiles with CUDA (even though not with the current week of code for other unrelated reasons - working on those next).
This passes the entire test suite. I’m going to merge!