#390 Merged at 6ded86c
Source repository: enzo-dev-64bit (branch: week-of-code)
Destination repository: enzo-dev (branch: week-of-code)
Author: John Regan
Reviewers
Description

Hi All,

This is the next addition in the port across from my Enzo3.0 fork. I added the photo-detachment of H- to the RT module.

I'm hoping this doesn't interfere with the previous PR. This update is on a different bookmark but maybe that's not sufficient - perhaps a different branch is required?

Comments (40)

  1. John Wise

    Hi @john_regan ,

    This looks great. I was looking for the new 64-bit initialization routines mkPix2xy and mk_xy2pix called in RadiativeTransferInitialize.C, but I couldn't find them in the new RadiativeTransferHealpixRoutines64.{h,C} files. Where are they located now?

    Also, you accidentally included the object file RadiativeTransferHealpixRoutines64.o.

    Thanks,

    John

    1. John Regan author

      Hi @jwise77 ,

      Yes, that's a good question about mkPix2xy and mk_xy2pix. It's quite a while since I did this, but from looking at the source I don't think those arrays are ever actually used. I grepped the source: they do get updated during the initialisation, and they are global, so that's fine, but I couldn't see where they are ever read. I suspect that was my logic in deleting them. Happy to put them back though.

      1. John Wise

        They're used in the 32-bit pix2coord_nest and pix2vec_nest routines, but these are replaced by the routines in the new 64-bit source. I think that with your current changes it will fail to build when precision-32 is used. The best solution would be to remove the initialization of these arrays in RadiativeTransferInitialize.C.

        1. John Regan author

          Yes, that makes sense - I had overlooked the fact that these arrays could be used in the HEALPix source. It could well fail when built in 32-bit mode; I can double-check that. Either way, I'll go ahead and remove the arrays from RadiativeTransferInitialize.C.

          1. John Regan author

            Hi @jwise77,

            I've made those changes now. In order to build in 32-bit mode one of the arguments to the pix2vec_nest64 routine needs to be a double so I've explicitly set that now. Does that break the 32-bit mode logic though?
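
            Roughly, the pattern now looks like the sketch below. This is only an illustration - the stub signature and the SetRayDirection name are my shorthand, not the actual declarations in RadiativeTransferHealpixRoutines64.{h,C}.

            ```cpp
            // Sketch of the call-site fix (stub, not the Enzo source): the vector
            // argument is declared double explicitly, so the call compiles the same
            // way whether Enzo's FLOAT is float (precision-32) or double.
            #include <stdint.h>

            typedef float FLOAT;  // stand-in for Enzo's precision-32 FLOAT

            // Assumed shape of the ported 64-bit HEALPix routine.
            void pix2vec_nest64(int64_t nside, int64_t ipix, double *vec) {
              vec[0] = vec[1] = vec[2] = 0.0;  // the real routine computes the unit vector
            }

            void SetRayDirection(int64_t nside, int64_t ipix, FLOAT dir[3]) {
              double vec[3];                    // explicitly double, independent of FLOAT
              pix2vec_nest64(nside, ipix, vec);
              for (int dim = 0; dim < 3; dim++)
                dir[dim] = (FLOAT) vec[dim];    // narrow back to FLOAT for 32-bit builds
            }
            ```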

  2. Britton Smith

    @john_regan, can you enable pipelines testing for your fork so the test suite can run?

    1. John Regan author

      Hi @brittonsmith

      OK, I think I did that (I clicked enable, a bunch of stuff kicked off, and I forgot all about it - in general I consider that a pretty successful outcome). Unfortunately, it looks like something went wrong. Did I maybe need to set up some environment first?

      1. Britton Smith

        Hi @john_regan, you probably did it right. The tests won't run for the first time until you push an update to the PR. I don't think there is a way of running them manually.

        1. Britton Smith

          I take that back. It did run, there were just a few errors. This may be because the gold standard is in flux at the moment with the latest PRs going through. That should be settled today.

          1. John Regan author

            Ah, I see there were some errors in the RT tests. That might be expected, due to the change in the HEALPix resolution now being employed. I'll take a closer look at the failures and see if I can understand them.

            1. John Regan author

              Hi @brittonsmith

              OK, I had a look at this and ran the yt scripts for one of the failing tests. The scripts produce identical results. I'll update the scripts to the yt-3.x standard while I'm at it, too. I suspect what we are seeing is simply the impact of a higher-resolution HEALPix solver; there is no change to the physics that I can see. Does that make sense to you @jwise77?

              1. John Wise

                Hi, if the results are identical, then I think this is fine. I think the failures are probably being caused by the higher precision of the ray normals, rather than higher resolution, because very high HEALPix levels won't be reached in these test cases.

              2. Britton Smith

                The test suite gold standards are now caught up, so you may want to merge with the main repo one more time as well.

  3. John Regan author

    @brittonsmith @jwise77 I just want to move this on a little. What I found is that when I run the same yt script on both the enzo-dev tip and this fork, I get identical results - for example, the I-front vs. time plot below. I find that the I-front lags behind the analytical result, but the old figure shows it ahead. I think it should lag, right? Some of the other differences come from different binning between the yt-3 and yt-2 scripts, and I can clean some of them up now. @jwise77, how worried should we be that the new figures don't match the old ones?

    1. John Wise

      Hi @john_regan, the numerical solution should only lag behind at early times. It is the correct behavior for the simulated I-front radius to be slightly larger than the Strömgren radius at later times. The updated yt-3 scripts should reproduce results similar to the yt-2 scripts. If you could clean that up, it'd be great.
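
      For reference, the analytic curve in that plot is the classical Strömgren solution for a monochromatic source in a static, uniform hydrogen field (written generically here, without restating the test's exact parameters):

      ```latex
      % I-front radius vs. time, static uniform medium (classical test solution)
      r_{\rm I}(t) = r_{\rm S}\left[1 - e^{-t/t_{\rm rec}}\right]^{1/3},
      \qquad
      r_{\rm S} = \left(\frac{3\,\dot{N}_\gamma}{4\pi\,\alpha_{\rm B}\,n_{\rm H}^2}\right)^{1/3},
      \qquad
      t_{\rm rec} = \frac{1}{\alpha_{\rm B}\,n_{\rm H}}
      ```

      Here Ndot_gamma is the ionizing photon rate, alpha_B the case-B recombination coefficient, and n_H the hydrogen number density. Since r_I(t) approaches r_S from below, a lag is only expected at early times (t up to a few t_rec); at later times the heated gas expands, so the simulated front can legitimately move slightly past r_S.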

      1. John Regan author

        Hi @jwise77

        Hmmm, in that case there may be a problem somewhere, as it looks to me like the I-front is lagging behind for the enzo-dev tip as well.

        1. John Wise

          Confirmed that this already existed in enzo-dev and will have to be fixed in another PR. Approving.

            1. John Wise

              This was a surprise to me. I'm going to track down the problem next week with hg bisect. I think it's critical to fix since it was working in the past.

  4. Brian O'Shea

    Also, just following up on this: I ran the test suite with a few different options, and the results are as follows:

    with --strict=high (passes only with bitwise-identical results), there are 41 test failures, in the PhotonShadowing and PhotonTestAMR tests.

    with --strict=low (passes with a fractional tolerance of 1e-3; bitwise identity not required), there are 32 errors, all in PhotonTestAMR.

    Looking closer at PhotonTestAMR, the density and velocity values are off by up to 10% in a small number of cells. By inspection, there are some subtle differences that are basically impossible to see in projections but which creep up in the slices. Here's an example of one of the more obvious ones.

    Gold standard (x-axis slice, y-velocity, centered at (0.5, 0.5, 0.5)):

    gold_Slice_x_y-velocity.png

    This PR (same quantity, axis, centering):

    PR_Slice_x_y-velocity.png

    I don't feel very strongly about this, but it doesn't necessarily make sense that there are ANY differences at all - shouldn't the integer arithmetic be the same for this PR and for the gold standard, given how few cells there are? @john_regan or @jwise77, can you comment on why this is to be expected?

    1. John Regan author

      Hi @bwoshea

      Thanks for looking into this. I agree that a change from 32-bit HEALPix to 64-bit HEALPix shouldn't make very much, if any, difference in these tests. Is it possible that the bug that is breaking the I-front tests is also creeping in here? What happens if you compare against the current tip of enzo-dev? P.S. I can't see any difference between those figures (but that's probably just me - and I have to scroll up and down to compare!)

      1. John Regan author

        Hi @bwoshea & @jwise77

        I reran the tests with the --radiation=ray flag for speed, and it fails for me in PhotonTestAMR as well, probably in a similar way to what you see, Brian. So it seems the 64-bit transition does cause some failures - it appears subtle differences build up. These are probably the same failures I saw when I initially ran the tests, which then led me to look at the results of the scripts that test the ray tracing. There is no reason to believe the 32-bit results are more or less correct than the 64-bit results, of course. It's interesting that the differences can be as large as 10%.

        I think it's worth sticking with the 64-bit implementation, though. The reason I moved to 64 bit is that in our paper back in 2014, JW and I clearly saw cells at high resolution being completely missed by the 32-bit implementation. Moving to 64 bit improved the situation, although the same problems return at even higher resolution, as expected. Basically, there is a limit to how well the HEALPix solver can "resolve" the high-density AMR mesh.
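
        To make the pixel-count side of this concrete: the number of HEALPix pixels is 12 * Nside^2 = 12 * 4^level, which stops fitting in a signed 32-bit integer at level 14. A standalone illustration (not Enzo code):

        ```cpp
        // Standalone illustration (not Enzo code): HEALPix pixel counts per level.
        // npix = 12 * Nside^2 with Nside = 2^level; a signed 32-bit pixel index
        // overflows once npix exceeds INT32_MAX, which first happens at level 14.
        #include <cstdint>
        #include <cstdio>

        int main() {
          for (int level = 10; level <= 16; level++) {
            int64_t nside = int64_t(1) << level;   // Nside = 2^level
            int64_t npix  = 12 * nside * nside;    // total pixels on the sphere
            std::printf("level %2d: npix = %12lld  fits in int32: %s\n",
                        level, (long long) npix,
                        npix <= INT32_MAX ? "yes" : "no");
          }
          return 0;
        }
        ```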

        1. Brian O'Shea

          Hi @john_regan, take a look at the files in here:

          https://www.dropbox.com/s/gh7zxwtvbx88ybs/diffs.tar.gz?dl=0

          It's a bunch of projections and slices of both the gold standard (the current tip of enzo-dev) and of your PR, centered at (0.5, 0.5, 0.5). It's useful to toggle back and forth - you'll see some tiny differences, particularly around the edges of the irradiated region. Note that the color bars are slightly different between some of the figures. Given that we're using the tip of enzo-dev for the gold standard, it and your PR ought both to be subject to the same bug regarding the I-front propagation, right?

          Anyway, I do agree it makes sense to stick with the 64-bit implementation. I am not really all that concerned about the differences, since they're clearly minor, but I was surprised to see any differences at ALL!

          As a side note, a lot of people run Enzo in 32-bit precision to save memory; it might be worth making a tweak to the documentation about the MORAY solver to make it clear to people that there IS an advantage to using 64-bit for the transport solver in some circumstances!

          It would be helpful for a couple of other people to chime in on the differences ( @jwise77 , @gbryan !), but I think that the changes are very minor and it's fine to merge this in.

          1. John Regan author

            Hi @bwoshea,

            Yeah, I can see the differences when I toggle some of the slices. They are very tiny but definitely there - and there should, of course, be some differences. You are also correct that the tip of enzo-dev and my fork both contain the bug. I wasn't sure where you were taking the gold standard from, that's all.

            For the 32-bit vs. 64-bit RT: I guess there is now only the option of 64-bit precision. There isn't any 32-bit HEALPix option any more - is that what you meant?

            1. John Regan author

              Hi @bwoshea ,

              Since there haven't been any further comments do you want to go ahead and merge this in?

              JR

              1. Brian O'Shea

                @john_regan sorry for not responding; last Thurs-Sat was my department's grad recruiting weekend, which I'm in charge of. It's definitely time to merge this in!

                  1. Brian O'Shea

                    Well, that's not good. I missed that one. What's the best way to back this out? (And, just to verify, it does look like your bookmark and all of your new commits got pulled in with the PR.)

                    1. John Regan author

                      Yeah, that's a good question, and I don't have a good answer. I could try backing out my commit somehow and then merging with the commit that doesn't include these changes. Perhaps, though, @brittonsmith could comment on the best next step - I find that experimenting can make things worse very quickly.

                      1. Brian O'Shea

                        Or @ngoldbaum , he is usually my savior when we have a Mercurial SNAFU. Either way, it doesn't affect the test suite, so we can try to Not Panic for a little while. :-/

  5. Nathan Goldbaum

    @bwoshea probably the best way to get rid of those commits is to back them out using "hg backout". You could also strip them, but since they were merged into the main enzo repo, it's probably not a good idea to rewrite history like that - people might have pulled since then.

    There are instructions about how to back out a merge here: https://book.mercurial-scm.org/read/undo.html#backing-out-a-merge
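
    Something along these lines (a sketch only - the revision names are placeholders, and the linked chapter covers the caveats of backing out merges):

    ```
    # <merge> is the merge changeset that pulled in the extra commits;
    # <p1> is the parent whose state we want to keep.
    hg backout --parent <p1> <merge>
    hg commit -m "Backed out merge of unintended commits"   # if backout left it uncommitted
    hg push
    ```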

  6. Nathan Goldbaum

    @john_regan Bitbucket doesn't know anything about bookmarks beyond displaying them in the UI. If you don't want work to be pushed to an existing pull request, do it on a separate head in the repo. You merged with your existing head, which is what caused it to be included with this pull request. If you do need to merge with existing work, don't push to the Bitbucket repo that is the source of open pull requests; push to another fork instead.

  7. John Regan author

    @ngoldbaum that makes sense - thanks! So I guess the moral of the story is to use a separate repo for every PR. I was trying to be clever and experiment with how to avoid that, for bookkeeping purposes and to avoid a bunch of different repos - bad idea :)

    1. Nathan Goldbaum

      You don't need to do that. Just make sure in the repo you use for PRs you have one head per PR.

  8. John Regan author

    @bwoshea so I guess you need to hg backout the merge that pulled in the extra commits. Once you do that, I think I can simply open a new PR with the same commits. Is that how it seems to you too?

    1. Brian O'Shea

      I'm going to actually fork the repository, do the backout in the fork, and issue a PR. That seems like the safest way to do this!