Performance regression in UFLACS
Issue #134
new
The following form compiled with UFLACS shows 10 times slow down in dev compared to 2016.1.0. (Note that the form is not very useful as a bilinear form would result in dense UFC element matrix with dimensions (2**8)**2
.)
from six.moves import xrange
element = VectorElement("R", triangle, 0, dim=2**8)
v = TestFunction(element)
summands = [v[i]*dx for i in xrange(len(v))]
L = sum(summands)
#L = Form([i for s in summands for i in s.integrals()])
(The last line is a workaround for circumventing a deep recursion in summation (...(s0 + s1) + s2) + s3) + s4) ...)
which helps in other cases but is irrelevant here.)
Profiling the code shows 18 million calls to finiteelementbase.py:96(__eq__)
:
$ python2 -mffc -r uflacs -v -p bug.ufl
$ python2 -mpstats ffc_bug.profile
Welcome to the profile statistics browser.
ffc_bug.profile% sort cumtime
ffc_bug.profile% stats 30
Sun Nov 20 16:21:44 2016 ffc_bug.profile
35394056 function calls (35334210 primitive calls) in 17.419 seconds
Ordered by: cumulative time
List reduced from 1613 to 30 due to restriction <30>
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 17.238 17.238 /home/jan/dev/fenics-master/src/ffc/ffc/__main__.py:84(compile_ufl_data)
1 0.001 0.001 17.238 17.238 /home/jan/dev/fenics-master/src/ffc/ffc/compiler.py:137(compile_form)
1 0.000 0.000 17.237 17.237 /home/jan/dev/fenics-master/src/ffc/ffc/compiler.py:158(compile_ufl_objects)
1 0.000 0.000 16.732 16.732 /home/jan/dev/fenics-master/src/ffc/ffc/representation.py:129(compute_ir)
1 0.000 0.000 16.637 16.637 /home/jan/dev/fenics-master/src/ffc/ffc/representation.py:425(_compute_integral_ir)
1 0.000 0.000 16.637 16.637 /home/jan/dev/fenics-master/src/ffc/ffc/uflacs/uflacsrepresentation.py:31(compute_integral_ir)
1 0.006 0.006 16.627 16.627 /home/jan/dev/fenics-master/src/ffc/ffc/uflacs/representation/build_uflacs_ir.py:40(build_uflacs_ir)
1 0.000 0.000 16.245 16.245 /home/jan/dev/fenics-master/src/ffc/ffc/uflacs/elementtables/terminaltables.py:359(build_optimized_tables)
1 0.021 0.021 15.700 15.700 /home/jan/dev/fenics-master/src/ffc/ffc/uflacs/elementtables/terminaltables.py:114(build_element_tables)
262 0.120 0.000 11.111 0.042 /home/jan/dev/fenics-master/src/ufl/ufl/algorithms/analysis.py:168(sort_elements)
262 3.836 0.015 10.560 0.040 /home/jan/dev/fenics-master/src/ufl/ufl/utils/sorting.py:26(topological_sorting)
18222126 4.049 0.000 5.061 0.000 /home/jan/dev/fenics-master/src/ufl/ufl/finiteelement/finiteelementbase.py:96(__eq__)
66064 0.047 0.000 4.496 0.000 /home/jan/dev/fenics-master/src/ffc/ffc/uflacs/elementtables/terminaltables.py:148(add_table)
263 0.005 0.000 4.207 0.016 /home/jan/dev/fenics-master/src/ffc/ffc/uflacs/elementtables/table_utils.py:109(get_ffc_table_values)
260 0.336 0.001 4.184 0.016 /home/jan/dev/fenics-master/src/ffc/ffc/mixedelement.py:66(tabulate)
65794 0.041 0.000 3.683 0.000 /home/jan/dev/fenics-master/src/ffc/ffc/enrichedelement.py:109(tabulate)
65809 0.316 0.000 3.644 0.000 /home/jan/dev/fenics-master/src/fiat/FIAT/finite_element.py:175(tabulate)
65809 0.250 0.000 3.238 0.000 /home/jan/dev/fenics-master/src/fiat/FIAT/polynomial_set.py:84(tabulate)
65814 0.155 0.000 2.322 0.000 /home/jan/dev/fenics-master/src/fiat/FIAT/expansions.py:202(tabulate)
65815 0.927 0.000 2.017 0.000 /home/jan/dev/fenics-master/src/fiat/FIAT/expansions.py:208(_tabulate)
132702 0.470 0.000 1.268 0.000 {sum}
2598597 0.423 0.000 1.204 0.000 /home/jan/dev/fenics-master/src/ufl/ufl/finiteelement/finiteelementbase.py:86(_ufl_hash_data_)
2913599/2913593 0.604 0.000 0.871 0.000 {repr}
131385 0.771 0.000 0.771 0.000 {method 'remove' of 'list' objects}
136854 0.758 0.000 0.758 0.000 {method 'pop' of 'list' objects}
394890 0.587 0.000 0.588 0.000 /home/jan/dev/fenics-master/src/fiat/FIAT/expansions.py:212(<genexpr>)
1 0.001 0.001 0.544 0.544 /home/jan/dev/fenics-master/src/ffc/ffc/uflacs/elementtables/terminaltables.py:196(optimize_element_tables)
260 0.086 0.000 0.527 0.002 /home/jan/dev/fenics-master/src/ffc/ffc/uflacs/elementtables/table_utils.py:57(strip_table_zeros)
65814 0.252 0.000 0.441 0.000 /usr/lib/python2.7/dist-packages/numpy/linalg/linalg.py:1924(norm)
65833 0.121 0.000 0.415 0.000 /home/jan/dev/fenics-master/src/fiat/FIAT/polynomial_set.py:201(form_matrix_product)
Comments (7)
-
-
-
assigned issue to
-
assigned issue to
-
- marked as critical
-
reporter - changed milestone to 2017.2
- edited description
-
- changed title to Performance regression in UFLACS
-
- removed responsible
-
reporter - changed milestone to 2018.1
- Log in to comment
Thanks for profiling, I guess I'm doing something O(n^2) for n subelements in the new element table computation.