Reimplement "project" by looping over subspaces
I'm getting a huge speedup by projecting the individual subspaces of a TensorFunctionSpace instead of the whole TensorFunctionSpace. See the attached proof of concept. I wonder whether the approach can be made robust, and applied to MixedFunctionSpaces. Would it be possible to avoid redefining the linear and bilinear forms inside the loop whenever the shapes of the source and destination subspaces are equal to the previous ones? How should I compare their shapes?
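The underlying observation can be sketched outside of DOLFIN: the mass matrix of a tensor (or vector) function space is block diagonal with identical scalar blocks, so one factorization of the scalar block serves every component. A minimal SciPy sketch of the idea (M below is just a random SPD stand-in for a scalar mass matrix; all names are illustrative, not DOLFIN API):

```python
import numpy as np
from scipy.sparse import identity, kron, random as sprandom, csc_matrix
from scipy.sparse.linalg import splu

rng = np.random.default_rng(0)
n, ncomp = 50, 4  # scalar-space dimension, number of tensor components

# Stand-in for a scalar mass matrix: sparse and SPD.
A = sprandom(n, n, density=0.1, random_state=0)
M = csc_matrix(A @ A.T + identity(n))

b = rng.standard_normal((n, ncomp))  # one right-hand side per component

# Whole-space solve: the full block-diagonal system kron(I, M).
big = csc_matrix(kron(identity(ncomp), M))
x_big = splu(big).solve(b.T.ravel()).reshape(ncomp, n).T

# Subspace loop: factorize the scalar block once, reuse per component.
lu = splu(M)
x_loop = np.column_stack([lu.solve(b[:, i]) for i in range(ncomp)])

assert np.allclose(x_big, x_loop)  # same answer, one small factorization
```

The loop factorizes an n-by-n matrix once instead of an (n*ncomp)-by-(n*ncomp) one, which is where the speedup comes from.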
Comments (21)
-
reporter I just realized that using
fi.sub(i)
in the definition of the linear form is not going to work with ufl expressions. Consider for example
v0 = Expression(("0", "sin(2*pi*x[0])*sin(2*pi*x[1])", "1"))
F1 = VectorFunctionSpace(mesh, "CG", 1)
F2 = TensorFunctionSpace(mesh, "CG", 1)
f1 = Function(F1)
f1.interpolate(v0)
X = grad(f1) + outer(f1, f1)
Here
project(X,F2)
works, while
mixedproject(X, F2)
fails
-
Caching with forms as keys is unsafe: from a memory-leak perspective if a program uses many meshes and discards the old ones over time, and from a correctness perspective if a program modifies the mesh coordinates, which would invalidate the mass matrix.
That said I think mixedproject is a good idea, and we should look into how to add the "subspace awareness" necessary at the right places to make this kind of thing easier. (However I'm very busy these days).
-
Attaching a new script that works for the case @MarcoMorandini mentioned. We just need to avoid subbing the linear RHS form, here achieved by assembling the entire RHS in the mixed space. No penalty in terms of efficiency there.
@martinal, using forms as keys would be safe if the cache is deleted on the exit of mixedproject, right? Nothing inside the projection that can make it wrong? I modified the code to delete the cache on exit.
New implementation: mixedproject.py
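The cache-and-drop pattern discussed in this comment can be sketched in plain Python (names here are hypothetical stand-ins, not the actual mixedproject.py code; `assemble_matrix` plays the role of DOLFIN's assemble):

```python
def mixedproject_cached(subforms, assemble_matrix):
    """Assemble each distinct form once; drop the cache before returning,
    so no cached matrix outlives the call (no leak across meshes)."""
    cache = {}
    try:
        results = []
        for form in subforms:
            key = repr(form)  # stand-in for a UFL form signature
            if key not in cache:
                cache[key] = assemble_matrix(form)
            results.append(cache[key])
        return results
    finally:
        cache.clear()

# Toy usage: identical "forms" hit the cache, so only two assemblies
# happen for three subspaces.
calls = []
def toy_assemble(form):
    calls.append(form)
    return form.upper()

out = mixedproject_cached(["mass", "mass", "stiff"], toy_assemble)
assert out == ["MASS", "MASS", "STIFF"]
assert calls == ["mass", "stiff"]
```

The try/finally guarantees the cache is emptied even if a solve raises, which is the "delete on exit" behaviour discussed above.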
-
reporter @mikael_mortensen, your last version works very well for us, and dramatically reduces our memory and cpu requirements for projections. Thank you!
Even if it were safe to keep the cache, I think it should nevertheless be dropped on exit. Otherwise you would incur a huge memory penalty in real-world computations.
-
It's completely safe when dropped on exit yes.
-
I think project() should be changed to use this when possible. That would speed up a lot of applications with no user edits.
-
- changed milestone to 1.6
-
- marked as major
-
reporter - attached mixedprojection3.py
@mikael_mortensen, your signature sometimes doesn't work. For example, it fails when projecting onto V1*V2, with V1 = FunctionSpace(mesh, "CG", 1) and V2 = FunctionSpace(mesh, "N1curl", 1), because the corresponding signature is "2".
I tried to harden the code (see the attached file mixedprojection3.py), and I would greatly appreciate it if you, @martinal, or someone else could check its correctness, and hopefully reduce its spaghetti-like structure.
The bad news is that for the above example we have a performance penalty with respect to the original project. It's the only case, though, and I hope it is acceptable.
-
Vector elements always cause trouble for my logic. We need to catch those elements as well. I tried to follow your logic, @MarcoMorandini, but as you say it is a bit messy, even though it seems to work. Vector elements have zero subspaces but a value rank of 1, so I think it is better to test for that. I have not considered all angles, but this would work for V1*V2:
if (sum(V.sub(i).num_sub_spaces() for i in range(n_sub)) == 0
        and sum(V.sub(i).element().value_rank() for i in range(n_sub)) == 0):
    return n_sub
There's probably a nicer way though. Will give it some more thought.
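The distinction above (vector elements report zero subspaces but a nonzero value rank) can be exercised with toy stand-ins for the DOLFIN accessors. The classes below are hypothetical mocks, not the real API; only the shape of the check is taken from the snippet above:

```python
# Hypothetical mocks of the DOLFIN accessors used in the check above.
class MockElement:
    def __init__(self, rank):
        self._rank = rank
    def value_rank(self):
        return self._rank

class MockSubspace:
    def __init__(self, n_sub, rank):
        self._n_sub = n_sub
        self._elem = MockElement(rank)
    def num_sub_spaces(self):
        return self._n_sub
    def element(self):
        return self._elem

def count_scalar_subspaces(subs):
    """Return len(subs) only when every subspace is truly scalar: no
    nested subspaces AND value rank 0, mirroring the quoted check."""
    if (sum(s.num_sub_spaces() for s in subs) == 0
            and sum(s.element().value_rank() for s in subs) == 0):
        return len(subs)
    return None

# Two scalar subspaces: flat, so the count is returned.
assert count_scalar_subspaces([MockSubspace(0, 0), MockSubspace(0, 0)]) == 2
# A vector-like subspace: zero subspaces but value rank 1, so it is caught.
assert count_scalar_subspaces([MockSubspace(0, 1)]) is None
```

The second assertion is the case the num_sub_spaces() test alone would miss.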
-
An easier way to iterate over the subspace hierarchy is also needed for improvements to split, for example to handle split(split(nested_mixed_function)[0]). I'll see if I can cook up something that will simplify this issue as well.
-
- changed milestone to 1.7
-
MatNest might be a good solution to this.
-
- removed milestone
Removing milestone: 1.7 (automated comment)
-
I recently posted this question on the forum, and I think it is relevant to this issue. Has this issue been resolved? I could not tell for sure.
-
reporter - attached functions.py
After digging into the hard disk: the attached file, functions.py, is based on @mikael_mortensen's suggestions and should be what you are looking for: use
mixedproject
instead of project. To really get maximum performance, you should perhaps find a way to achieve the effect of the commented line
sol.parameters["preconditioner"]["structure"] = "same"
which does not work any more. Could you try it? I've never made a pull request because of @martinal's comment about the need for an easier way to iterate over the subspace hierarchy. Should I do it?
-
I recently implemented
split
in ufl for functions on nested mixed elements. You can look at the implementation in ufl, but I'm not sure it can be directly translated to function spaces in dolfin. But don't wait for me to do anything more here.
-
I tried
mixedproject
in functions.py, and it is indeed much faster, even faster than the workaround devised in the question I posted. For 162000 elements, it computes the projection in 1.16 seconds; the time complexity appears to be greatly reduced. I have also compared with the results obtained by project()
(I calculated the differences of the fields using the calculator filter in ParaView), and the difference is very small.
I looked around a bit for the preconditioner parameter, but the only parameters that
KrylovSolver
now provides are ['absolute_tolerance', 'divergence_limit', 'error_on_nonconvergence', 'maximum_iterations', 'monitor_convergence', 'nonzero_initial_guess', 'relative_tolerance', 'report'].
I have just started to use FEniCS, so I cannot comment any further on that. I don't know FEniCS well enough to propose an easier way of iterating over the subspace hierarchy, but I would definitely suggest incorporating this function into FEniCS if it is stable enough.
-
The 'right' solution to this issue will be to preserve the block structure in the projection matrix, e.g.
A = [P 0 0]
    [0 P 0]
    [0 0 P]
where the diagonal blocks all point to the same matrix.
-
reporter Not sure. Doubts about MatNest: do PETSc's solvers really deal with this as efficiently as they would with a linear system with many right-hand sides? What about the Trilinos backend? Would the effort of dealing with block matrices solve only this use case, or would it have possible applications elsewhere? Could a dolfin DiagonalBlockMatrix (not limited to 2x2) be an alternative, hiding the complexity there? This last solution should also handle more complex cases, like
A = [P 0 0 0 0]
    [0 P 0 0 0]
    [0 0 P 0 0]
    [0 0 0 R 0]
    [0 0 0 0 R]
that would be straightforward with MatNest (at the expense of having both P and R in memory at the same time, instead of P first and later R, as with
mixedproject
)
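The trade-off in this comment can be illustrated with a toy NumPy version of the 5x5 layout: factorize (here, invert) each distinct diagonal block once and reuse it, detecting shared blocks by object identity. This is only a sketch of the idea, not an implementation proposal:

```python
import numpy as np

# Toy version of the 5x5 block layout above: components 0-2 share block P,
# components 3-4 share block R.
rng = np.random.default_rng(1)
n = 20
P = np.eye(n) + 0.1 * rng.standard_normal((n, n))
R = np.eye(n) + 0.1 * rng.standard_normal((n, n))

layout = [P, P, P, R, R]              # diagonal blocks, shared by reference
b = rng.standard_normal((len(layout), n))

# Solve the block-diagonal system block by block, computing the inverse of
# each *distinct* block only once (id() detects the shared blocks).
inv_cache = {}
x = np.empty_like(b)
for i, blk in enumerate(layout):
    if id(blk) not in inv_cache:
        inv_cache[id(blk)] = np.linalg.inv(blk)
    x[i] = inv_cache[id(blk)] @ b[i]

assert len(inv_cache) == 2  # only P and R were ever factorized
```

Holding both inverses in `inv_cache` at once mirrors the memory cost of the MatNest route; processing the P-components first and dropping P before touching R would mirror the mixedproject behaviour described above.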
This is why I try to avoid Vector and TensorFunctionSpaces: they do not take advantage of the underlying symmetries in assembling and solving. Your procedure is nice and it looks robust to me. Without giving it much more thought I have extended it to mixed spaces in the attachment, just to see that it works. It can certainly be done much more nicely, but I think it should work for any regular mixture of spaces. Efficiency is achieved by caching the mass matrices with forms as keys. I also suggest using the boundary conditions for the matrices in the keys, but I have not really elaborated on that. Like I said, just a first effort. Any suggestions for further improvements or better solutions altogether are much appreciated. The speedup is really good.
mixedprojection.py