Permit access to Eigen CSR data

Issue #519 resolved
Prof Garth Wells created an issue

If there is a guarantee that the DOLFIN Eigen sparse matrix is a CSR matrix, then we can safely allow access to the data.

Comments (37)

  1. Chris Richardson

    As I recall, at the moment, we initialise the Eigen sparse matrix by calling matrix.insert() to set the sparsity pattern. After doing this, we could call matrix.makeCompressed() which converts it to CSR format. Most Eigen solvers require CSR format anyway. There is also the Eigen method .isCompressed() to check.

  2. Christian Clason

    Since I currently rely on the uBLAS backend to access CSR in order to build mixed finite element/superposition operator block matrices in my code, I would be really happy if this could be implemented before uBLAS is removed (as currently proposed).

  3. Chris Richardson

    @clason - if you use the C++ interface, you can access the Eigen Matrix directly, using the .mat() method of dolfin::EigenMatrix.

  4. Christian Clason

    @chris_richardson - sorry, I should have specified that I'm using the Python interface. To be precise, what I'm doing is

    # construct scipy matrix from bilinear form a, boundary condition bc
    parameters['linear_algebra_backend'] = 'uBLAS'
    def assemble_csr(a,bc):
        A = assemble(a)
        bc.apply(A)
        row,col,val = A.data()
        return sp.csr_matrix((val,col,row))
    
  5. Chris Richardson

    This turns out to be a pain to implement, because uBLAS uses std::size_t and Eigen uses int for indexing, and despite being templated, seems to fail when the index type is set to long int. It could be a bug in Eigen.

  6. Jan Blechta

    We have already touched this kind of index problems (I think it was in ILU and maybe sparse core), reported it to Eigen guys but kept it to int in DOLFIN for simplicity. As far as I remember it was basically impossible to solve it consistently in DOLFIN with typedefed la_index until it is fixed in Eigen.

    Let me search for it later, I'm currently occupied by something else.

  7. Chris Richardson

    @garth-wells - yes! I noticed, while Jan and I were chatting on the sidelines, you were at work...

  8. Prof Garth Wells reporter

    It doesn't make sense to have GenericMatrix::data because GenericMatrix is abstract and doesn't say anything about the storage. GenericMatrix::data only ever worked for the uBLAS backend, so there was no point to having it in `GenericMatrix'.

    I'm moving the data member function into the FooMatrix classes (only Eigen for now).

    It turns out that this whole thing was a lot easier if I removed the Umfpack and Cholmod wrappers. These aren't required in the absence of uBLAS - Eigen provides its own interface to Umfpack and friends.

  9. Prof Garth Wells reporter

    Add EigenMatrix::data function. Fixes Issue #519.

    To make this change simpler, UmfpackLUSolver and CholmodSolver have been removed. They won't be any use once uBLAS is removed.

    → <<cset 2e1dc31de3f5>>

  10. Christian Clason

    Just to confirm that this works for me in current master, although it took me a while to figure out that I now have to explicitly cast the matrix to the backend type first (would have been nice to mention that). For the record, this works now:

    # construct scipy matrix from bilinear form a, boundary condition bc
    parameters['linear_algebra_backend'] = 'Eigen'
    def assemble_csr(a,bc):
        A = assemble(a)
        bc.apply(A)
        row,col,val = as_backend_type(A).data()
        return sp.csr_matrix((val,col,row))
    
  11. Chris Richardson

    @clason "although it took me a while to figure out that I now have to explicitly cast the matrix to the backend type first (would have been nice to mention that)"

    I'm sure it will be mentioned in the release notes - master branch is very much work in progress.

  12. Christian Clason

    @chris_richardson - yeah, that came across as more annoyed than I had intended. (And I now realize that it was mentioned in a previous commit that .data() was removed from GenericMatrix.) I apologize for that remark. The point was for me to explicitly mention this in case somebody else following master stumbles over this.

    "Maybe we could reverse the order of the data to make it compatible with scipy.sparse?"

    Or even wrap it into a .to_scipy() function?

  13. Prof Garth Wells reporter

    @chris_richardson Yes, we could change the order.

    Even better, we already have EigenMatrix.sparray() to get a SciPy matrix directly (deep copy), but we could add an argument to get a shallow copy.

  14. Christian Clason

    @garth-wells you mean I could do something like

    sp.bmat([[M.sparray(),A.sparray()],[A.sparray(),B]],format='csr')
    

    where A,M are assembled as usual (with the Eigen backend) and B is a self-constructed SciPy matrix? That would be perfect for my purposes!

  15. Prof Garth Wells reporter

    @clason We already have the function

    def sparray(self):
        "Return a scipy.sparse representation of Matrix"
        from scipy.sparse import csr_matrix
        data = self.data(deepcopy=True)
        C = csr_matrix((data[2], data[1], data[0]))
        return C
    

    The above code copies the matrix. I'm suggesting we just an option to share the matrix data, rather than copy.

    We could improve the name of the function it we think it needs improving.

  16. Christian Clason

    @garth-wells Yes, I realize that (now...), and I don't think the deep copy would be a major bottleneck in my code (brief experiments suggest about 10% slower), but if I could write it as A.as_sparray() (without performance penalties), that would be even more elegant. (Although using a shallow copy in .sparray() and move the deep copy to .to_sparray() might be more consistent with .array() for vectors?)

  17. Prof Garth Wells reporter

    @clason Is as_foo() typical SciPy syntax? I know that tofoo() is, e.g. todense().

    I like the idea of as_sparray and to_sparray for shallow and deep copies, respectively. Is there a better name than sparray? It should really reflect that it's a CSR matrix.

  18. Christian Clason

    @garth-wells It's more NumPy, I'd say (e.g., as in asarray). I like .tocsr() as well, although that looses the SciPy connotation (which is useful when looking in the documentation or source for a function to get a matrix into SciPy).

  19. Prof Garth Wells reporter

    Let's make it:

    as_sparray()  # Return Scipy view of a CSR matrix
    to_sparray()   # Return a Scipy CSR matrix (copy)
    

    I think this is 'pythonic' and has some parallels with NumPy/SciPy notation. I'm not keen on an optional copy flag.

  20. Prof Garth Wells reporter

    Since it returns a SciPy object my preference is for SciPy-like syntax.

    Hopefully we will get rid of some of the DOLFIN deepcopy=foo syntax when we introduce more views, see Issue #531.

  21. Johan Hake

    We could use the as_array and to_array concept for view and copy of underlying data for other structures too. Now MeshFunction.array is a view of the data, but GenericVector.array is a copy. This is a stumble stone for some new users. So following the logic @garth-wells suggested we should have:

    MeshFunction.as_array() # Return NumPy view of the data
    GenericVector.to_array() # Return a NumPy array of the data (copy)
    
  22. Log in to comment