Yt - correctly read gadget binary big endian files

#2534 Declined
Repository
yt-weiguang
Branch
yt
Repository
yt
Branch
yt
Author
  1. Weiguang Cui
Reviewers
Description

modifications to yt/frontend/gadget/io.py, yt/frontend/gadget/data_structures.py to correctly read gadget binary big endian files.

Comments (22)

  1. yt-fido
  2. Nathan Goldbaum

    Any chance that we can get some tests for this? If you have some sample data you can share for upload to yt-project.org/data with different endianness that would be helpful.

    In addition, in the future it would be best to avoid making lots of whitespace changes for code sections that haven't been edited meaningfully. It makes it much more difficult to review the pull request and figure out where you have made meaningful changes to the code.

    1. Weiguang Cui author

      You can use the Gadget3-snap-format2 in yt-project.org/data to make the test. I simply tested with ds=yt.load("Gadget3-snap-format2",index_ptype="Gas") and sp=yt.SlicePlot(ds,'x', ("gas", "density"), width = (800.0, 'kpc')) The plot show difference when you do not include index_ptype. But this is because with index_ptype, yt only calculate the density for gas particles, is that correct?

      1. Nathan Goldbaum

        I'm talking about an automated test, i.e. One our test runner can run on every commit. I was picturing a test that loaded big-endian and little-median data and then verifies that the data are read in correctly.

          1. Nathan Goldbaum

            What's the endianness of this file? Is there any chance you can upload a smaller file with this endianness? We prefer to keep our sample data small to avoid needing to download a bulky file when debugging something. If you do upload a different file you can use the yt curldrop (see https://docs.hub.yt/services.html#curldrop or girder) to share it.

            Can you add an entry for this file (or another smaller file) in the data listing for yt-project.org/data? The repository for the website is at bitbucket.org/yt_analysis/website. You'll need to edit a file named data/datafiles.json in that repo. The data should be packaged as a gzipped tarball containing a folder containing your data. It helps if the folder has a useful descriptive name like e.g. LittleEndianGadgetBinary.

            Add an entry in the section on Gadget data for this file, including a short description. I will take care of uploading the data to the machine that hosts the website.

            Once that's done, you should add an automated test to this pull request making use of the data you've uploaded and any existing data on yt-project.org/data that tests this functionality. The test should live in yt/frontends/gadget/tests/test_outputs.py. You should look at the examples in that file that make use of the requires_file decorator for templates. The test function you write should load the test data and somehow verify that the data are being read in with the correct endianness.

            If you can think of a way to test the functionality you've added in this pull request without using any test datasets that would also work and would avoid the complexity of uploading data. Basically, we just want to make sure that the functionality you've added in this pull request continues to work in the future, and having an automated test we can run on every commit ensures that.

            1. Weiguang Cui author

              I have uploaded a snapshot usign curl, which is smaller but still 700Mb. Sorry for that I don't find too much time to reduce the file size.

              curl -T BigEndianGadgetBinary http://use.yt/upload/ 
              Stream body handler: received 757933052 bytes
              http://use.yt/upload/125c9981
              

              I also added this in the yt_analysis/website :

                      {
                          "code": "Gadget 3",
                          "description": "Gadget snapshot with binary format 2, BigEndian encoding",
                          "filename": "BigEndianGadgetBinary",
                          "size": "723 MB",
                          "url": "http://yt-project.org/data/BigEndianGadgetBinary"
                      },
              

              I will try to find sometime to write the test file tonight.

  3. Nathan Goldbaum

    In addition, my comment about mentioning index_ptype in the documentation still stands, please update the page I mentioned in my previous comment earlier this month. My offer to help out if you have questions about updating the documentation still stands.

  4. Weiguang Cui author

    I will do the doc as soon as possible. The newly changed files are for a new commit, which relates to the big endian files. I do not know how it is mixed with this commit. How can I roll back to issue a new commit? Thank you.

    1. Nathan Goldbaum

      The easiest way would be to use hg revert:

      $ cd yt-hg # or wherever your clone of the yt repo is
      $ hg pull https://bitbucket.org/yt_analysis/yt
      $ hg revert yt/analysis_modules/sunyaev_zeldovich/projection.py -r 07a9a1e
      

      That will revert the SZPack wrapper to the state it exists in 07a9a1e. You'll need to commit the change after you do the revert.

    1. Nathan Goldbaum

      Argh, I forgot about the merge conflict. In lieu of waiting for Weiguang to update here to finish the transition, I'm going to reissue this pull request on github and take responsibility for getting it merged in. @Weiguang Cui put in a lot of effort here going back and forth with me and I want to make sure his efforts lead to real code contributions instead of getting blocked on something silly.