Accept arbitrary box objects and position objects

Issue #134 closed
Carl Simon Adorf created an issue

Description

I propose an update to the python API to allow for ducktyping when passing position arrays and box objects.

The proposed API would, for example, allow to express the following code to compute the RDF for multiple frames:

for frame in traj:
    box = freud.box.Box.from_box(frame.box)
    pos = np.ascontiguousarray(frame.positions, np.float32)
    rdf.accumulate(box, pos, pos)

in a more concise and robust way:

for frame in traj:
    rdf.accumulate(frame.box, frame.positions, frame.positions)

Advantages

  • more concise
  • more pythonic in the sense that types are strong but the interface allows for ducktyping of arbitrary box/position objects
  • more secure: Passing arrays with the incorrect dtype currently results in UB, segfaults and incorrect results
  • more robust: future to changes to the internal box object do not break the front-facing API
  • more robust: future changes to the internal position array types do not break the front-facing API

Disadvantages

  • This change requires a programming guideline that makes sure that all python objects are correctly copy/move constructed at the C++/python interface

Timeline

I propose to implement this update for the 0.7 release.

Comments (16)

  1. Eric Harper

    So, @csadorf if I understand correctly, this involves adding the following code to the beginning of each accumulate/compute:

    def accumulate(box, ref_pos, pos):
        if not isinstance(box, freud.box.Box):
            try:
                box = freud.box.Box.from_box(box)
            except:
                raise TypeError("supplied box is not of type freud.box.Box or cannot be converted")
        ref_pos = np.ascontiguousarray(ref_pos)
        pos = np.ascontiguousarray(pos)
    
  2. Bradley Dice

    @csadorf Are you interested in working on this issue? It is a good suggestion. ...and you can still get it done before the 0.7 release. :)

  3. Vyas Ramasubramani

    We've proposed a new release cycle, but realistically I don't think that cycle will actually kick in for a while. Once we start releasing again we would like to release monthly, but we have a lot of work to do on the current version before we get there so you have time. Just ping us when you're ready to work on this, and we'll do the same if we progress to a point where a release candidate is on the horizon.

  4. Carl Simon Adorf reporter

    I'm on the latest version (0.10.0) and I'm still getting errors when passing a gsd-hoomd-box:

    ttributeError                            Traceback (most recent call last)
    freud/box.pyx in freud.box.Box.from_box()
    
    AttributeError: 'numpy.ndarray' object has no attribute 'Lx'
    
    During handling of the above exception, another exception occurred:
    
    IndexError                                Traceback (most recent call last)
    <ipython-input-8-68067f797d43> in <module>()
          9     psi = list()
         10     for frame in trajectory[-10:]:  # iterate through the last 10 frames
    ---> 11         op_hex.compute(frame.configuration.box, frame.particles.position)
         12         print(np.abs(np.mean(op_hex.psi)))
         13         psi.append(op_hex.compute(frame.configuration.box, frame.particles.position))
    
    freud/order.pyx in freud.order.HexOrderParameter.compute()
    
    ~/miniconda3/lib/python3.6/site-packages/freud/common.py in convert_box(box)
         68     if not isinstance(box, freud.box.Box):
         69         try:
    ---> 70             box = freud.box.Box.from_box(box)
         71         except ValueError:
         72             raise
    
    freud/box.pyx in freud.box.Box.from_box()
    
    IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
    

    Are we not testing gsd-box objects?

  5. Vyas Ramasubramani

    We made a few minor fixes to this after the original 0.10.0 release. Your use case should work, and it works for me with a test GSD file. Are you building freud from source? If so, you may need to specify --ENABLE-CYTHON for setup.py, the CPP files are only up-to-date as of the last release.

  6. Carl Simon Adorf reporter

    @bdice This is the error message when I try to compile freud master (31b9df5d26eb924f2208c4b51be481cdf30ccb7f):

    In file included from freud/environment.cpp:676:
    In file included from /Users/csadorf/miniconda3/lib/python3.6/site-packages/numpy/core/include/numpy/arrayobject.h:4:
    In file included from /Users/csadorf/miniconda3/lib/python3.6/site-packages/numpy/core/include/numpy/ndarrayobject.h:18:
    In file included from /Users/csadorf/miniconda3/lib/python3.6/site-packages/numpy/core/include/numpy/ndarraytypes.h:1816:
    /Users/csadorf/miniconda3/lib/python3.6/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: "Using deprecated NumPy API, disable it by "          "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-W#warnings]
    #warning "Using deprecated NumPy API, disable it by " \
     ^
    freud/environment.cpp:10081:38: error: no member named 'getRMax' in 'freud::environment::LocalDescriptors'; did you mean 'getLMax'?
      __pyx_v_r = __pyx_v_self->thisptr->getRMax();
                                         ^~~~~~~
                                         getLMax
    cpp/environment/LocalDescriptors.h:55:18: note: 'getLMax' declared here
        unsigned int getLMax() const
                     ^
    1 warning and 1 error generated.
    error: command 'gcc' failed with exit status 1
    
  7. Bradley Dice

    @csadorf For now, we've made the decision that master builds will require the use of --ENABLE-CYTHON. We only update the included Cython-generated C++ files on releases. @vramasub Should we change this behavior? It may make sense to switch to a model with a develop branch so that our master branch is always buildable. In either case, we definitely need to document the --ENABLE-CYTHON option.

  8. Vyas Ramasubramani

    Yeah I've been chatting with Simon about this offline. I'm also wondering if we need to go back to having the C++ files be available. I considered the possibility of adding an additional branch in between, but I think that's extra overhead. Let's chat and decide. The command line options for setup.py are now available through argparse at least, but we can and should document them more thoroughly online.

  9. Log in to comment