Improving readability with consistent imports

Issue #182 resolved
Vyas Ramasubramani created an issue

Issue

The current version of freud on master is difficult to read for a number of reasons:

  1. Implicit "inclusions" occur because all the pxi files are incorporated directly into _freud.pyx that are not obvious when modifying a particular pxi file. This issue makes it difficult to pinpoint the definition of a class or function being used.
  2. Many modules are aliased upon import, especially the CPP interface modules, e.g. " cimport freud._module as module". Such imports are especially confusing for new developers or contributors who are unfamiliar with freud's structure and have trouble determining whether the cppclass or the cdef class are being used, for example.
  3. Lots of module members are imported using the "from freud.module import foo", which is not necessarily a problem in general but becomes much more confusing because of the obfuscating factor of all the pxi files getting included into _freud.pyx.
  4. Certain modules (kspace, box, parallel) have implementations that are split between pxi and py files, further confusing the matter of where things are defined.

Current solution

Moving all code from pxi files and py files into a single module.pyx file (see the pure_cython branch) largely fixes issues 1, 3, and 4. I have almost entirely removed all module aliasing, so underscore modules are always imported with their names. I'm also making use of explicit relative imports where possible for consistency and to clearly differentiate internal freud imports from other imports ("from . import common").

Outstanding issues

Ideally, to further improve modularity I would like to stop importing the underscore modules except within the corresponding pyx file. In particular, this means not importing _box and _locality in every other module. From a design perspective, I think it would be cleanest if the classes defined in pyx files provide complete interfaces for everything else, including other pyx files. However, this raises one issue with respect to boxes.

Consider the following code from bond.pyx:

    def compute(self, box, ref_points, ref_orientations, points, orientations,
                nlist=None):
        box = common.convert_box(box)
        ...
        defaulted_nlist = make_default_nlist(
            box, ref_points, points, self.rmax, nlist, None)
        cdef NeighborList nlist_ = defaulted_nlist[0]
        cdef _locality.NeighborList * nlist_ptr = nlist_.get_ptr()
        ...
        cdef _box.Box l_box = _box.Box(
            box.getLx(), box.getLy(), box.getLz(), box.getTiltFactorXY(),
            box.getTiltFactorXZ(), box.getTiltFactorYZ(), box.is2D())
        with nogil:
            self.thisptr.compute(
                l_box, nlist_ptr,
                <vec3[float]*> l_ref_points.data,
                <float*> l_ref_orientations.data, n_ref,
                <vec3[float]*> l_points.data,
                <float*> l_orientations.data, n_p)
        return self

The _locality part of this can be easily removed. Since nlist_ is a cdef class, we can directly use nlist_.get_ptr() inside the with nogil block.

Similarly, we can use something like dereference(box.thisptr) to pass the C++ box into the compute. However, in order to be able to do this, the box object needs to be typed: cdef box.Box box = common.convert_box(box). Unfortunately, this causes a couple of problems. The cdef will fail because the box function parameter will shadow the module name. Additionally, the internal variable name box will appear as a redefinition of the parameter, which Cython won't allow.

The second problem is easily fixed, we can just rename all of the internal box variable to b or bx, the scopes in Cython are small enough that I don't think that will be remotely confusing. The first problem is a little more annoying though. Since we need to maintain the current API, the best I could come up with is some ridiculously explicit import like from . import box as box_module so that it won't conflict.

@joaander @mspells I would be interested in hearing some feedback from the two of you on what you think is best, both with respect to the current changes that have been made and this outstanding problem. Of course, this issue is purely about readability and code quality; the existing version works. In the interest of making it easier for less experienced users to maintain and contribute to freud, however, I think it's worth trying to make things like this as simple as possible.

Comments (4)

  1. Joshua Anderson

    I'm not all that familiar with the current state of the freud code. All I have to offer are general comments based on your description here:

    Simplification is a good thing. The easier it is to develop/maintain/extend, the better. I don't have enough experience with cython to comment on whether moving everything from .py into .pyx files is a good option. This is a big switch, changing over from using cython as a C++ wrapper to using it as a full fledged language for the entire implementation. That adds a certain level of complexity (compilation) for what is otherwise pure python code.

    Explicit is good (see the zen of python), so I see importing _ modules by name and only where needed as a good thing. If this isn't possible in all cases (such as your box example), that's not a huge problem.

  2. Bradley Dice

    @joaander In most cases, the code was already in Cython. The small bits of kspace, box, parallel (and voronoi?) that were in plain Python were exceptions to the rule, and this should help simplify it by making everything Cython. In our case, I think the bigger switch is not from Python to Cython, but from one massive _freud module to each module (_bond, _box, ...) being separately Cythonized and compiled.

  3. Matthew Spellings

    I think verbosity for the purpose of clarity is just fine here. from . import box as box_cy and from . cimport box as box_cpp (or however you'd like to make it explicit) seem quite reasonable to me! Moving things into cython from python also seems like a good move for clarity.

  4. Log in to comment