possible performance regression in JIT with pybind11
Dear all,
I am still finishing up the transition of some of my code from swig to pybind11, so I am currently on master before PR 434 was merged (more precisely, at commit be01a791).
I noticed that my test suite runs considerably slower when using the pybind11 backend. While part of that is definitely the fault of my code-specific wrappers, I managed to boil the problem down to a very trivial test that shows a noticeable performance regression on my laptop.
from dolfin import *
import time
mesh = UnitSquareMesh(4, 4)
k = Expression("1 + x[0]", degree=1)
V = FunctionSpace(mesh, "CG", 1)
u = TrialFunction(V)
v = TestFunction(V)
a = inner(k*grad(u), grad(v))*dx
names = ("Expression", "assemble")
funs = (lambda: Expression("1 + x[0]", degree=1), lambda: assemble(a))
times = dict()
for (name, fun) in zip(names, funs):
    t = time.time()
    for _ in range(1000):
        fun()
    t = time.time() - t
    times[name] = t
if has_pybind11():
    print("pybind11", times)
else:
    print("swig", times)
The results of a few executions are as follows (I discarded the first run, during which the expressions and the form were compiled):
- swig backend:
swig {'Expression': 8.230140686035156, 'assemble': 1.3750722408294678}
swig {'Expression': 8.20879077911377, 'assemble': 1.3839783668518066}
swig {'Expression': 8.177268743515015, 'assemble': 1.3746638298034668}
swig {'Expression': 8.172224044799805, 'assemble': 1.3894646167755127}
swig {'Expression': 8.2942795753479, 'assemble': 1.3838119506835938}
swig {'Expression': 8.36578631401062, 'assemble': 1.4100961685180664}
swig {'Expression': 8.209084033966064, 'assemble': 1.4079020023345947}
- pybind11 backend:
pybind11 {'Expression': 12.34678053855896, 'assemble': 7.130722522735596}
pybind11 {'Expression': 12.30522108078003, 'assemble': 7.242411375045776}
pybind11 {'Expression': 12.457031011581421, 'assemble': 7.19352388381958}
pybind11 {'Expression': 12.320039510726929, 'assemble': 7.148066520690918}
pybind11 {'Expression': 12.562726020812988, 'assemble': 7.249881982803345}
pybind11 {'Expression': 12.55371618270874, 'assemble': 7.276110887527466}
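To put these numbers in perspective, here is a minimal sketch that averages the timings listed above (rounded to three decimals) and computes the slowdown factor and the extra cost per call. The variable names are only illustrative; each timing covers 1000 calls, as in the test script.

```python
from statistics import mean

# Timings in seconds for 1000 calls, copied from the runs above (rounded).
swig = {
    "Expression": [8.230, 8.209, 8.177, 8.172, 8.294, 8.366, 8.209],
    "assemble": [1.375, 1.384, 1.375, 1.389, 1.384, 1.410, 1.408],
}
pybind11 = {
    "Expression": [12.347, 12.305, 12.457, 12.320, 12.563, 12.554],
    "assemble": [7.131, 7.242, 7.194, 7.148, 7.250, 7.276],
}

N_CALLS = 1000  # iterations per timing in the test script

ratios = {}
extra_ms_per_call = {}
for name in swig:
    mean_swig = mean(swig[name])
    mean_pybind11 = mean(pybind11[name])
    # Slowdown factor of the pybind11 backend relative to swig.
    ratios[name] = mean_pybind11 / mean_swig
    # Additional wall time per call, in milliseconds.
    extra_ms_per_call[name] = (mean_pybind11 - mean_swig) / N_CALLS * 1000.0
    print(name, round(ratios[name], 2), round(extra_ms_per_call[name], 2))
```

This gives a slowdown of roughly 1.5x for Expression and roughly 5.2x for assemble, that is, a few extra milliseconds per call in both cases.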
I double-checked the C++ optimization options and they are the same in both cases (optimization is enabled). Can anybody reproduce this? If not, what could be wrong in my current setup?
Thanks,
best regards,
Francesco
Comments (4)
-
reporter -
I agree this would be a good idea.
-
reporter OK I'll prepare a PR
-
reporter - changed status to resolved
Solved by PR 437
I think this can be solved by moving pkgconfig imports and local variables to the global scope in
With this change, I get:
Is there any specific reason for the pkgconfig variable to be local?
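The general idea behind this suggestion can be illustrated with a small sketch. It is not the actual dolfin code: `expensive_lookup`, `jit_local` and `jit_global` are hypothetical names, and the counter stands in for a costly query (such as a pkg-config lookup) that returns the same result on every call.

```python
lookup_calls = 0  # counts how often the expensive step is executed


def expensive_lookup():
    """Hypothetical stand-in for a costly, call-independent query."""
    global lookup_calls
    lookup_calls += 1
    return {"include_dirs": ["/hypothetical/include"]}


def jit_local():
    # Local variable: the lookup is repeated on every single call.
    config = expensive_lookup()
    return config["include_dirs"]


# Global scope: the lookup runs once, when the module is first imported.
_CONFIG = expensive_lookup()


def jit_global():
    # Reuses the module-level result instead of recomputing it.
    return _CONFIG["include_dirs"]


lookup_calls = 0
for _ in range(1000):
    jit_local()
calls_local = lookup_calls

lookup_calls = 0
for _ in range(1000):
    jit_global()
calls_global = lookup_calls

print("local:", calls_local, "global:", calls_global)
```

In this sketch the local variant performs the lookup once per call, while the global variant performs none inside the loop. Whether this accounts for the whole slowdown reported here is an assumption that would have to be confirmed by profiling.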