do not forbid enabling SSE2 for non-Intel x86/x86_64 processors
Issue #52
resolved
Hi! configure.ac contains around line 50 the following:
case $host_cpu in i[[3456]]86*|x86_64*)
    AX_CPU_VENDOR()
    if test "x$ax_cv_cpu_vendor" = "xIntel"; then
        AX_EXT() # SSE2 is slower on the Opteron
    fi
esac
which allows SSE2 instructions only for Intel CPUs. I have been told (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=702287) that the claim that AMD CPUs are slower with SSE2 no longer holds. Could you please consider removing the if/fi lines and just calling AX_EXT()?
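For illustration, the requested change might look roughly like the sketch below. It assumes the vendor check was the only reason AX_CPU_VENDOR() was called here and that AX_EXT() needs no other guard; it is not necessarily the change that was committed.

```m4
case $host_cpu in i[[3456]]86*|x86_64*)
    # Sketch: probe SIMD extensions on any x86/x86_64 CPU, regardless of vendor
    AX_EXT()
esac
```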
Thanks!
Cédric
Comments (3)
-
repo owner -
repo owner I ran tests on an AMD Opteron(tm) Processor 6174, which is a K10, and I stand corrected: one does indeed gain a little by enabling SSE2 on those machines.
Without SSE2
$ ./bench_multiplication 10000
m: 10000, n: 10000, l: 10000, cutoff: 0, cpu cycles: 4402531744, cc/n^2.807: 0.02604, wall time: 2.00114 s
$ ./bench_elimination 10000
m: 10000, n: 10000, last r: 10000, cpu cycles: 2068096385, cc/(mnr^0.807): 0.01223, wall time: 0.94004 s
With SSE2
$ ./bench_multiplication 10000
m: 10000, n: 10000, l: 10000, cutoff: 0, cpu cycles: 4259547616, cc/n^2.807: 0.02520, wall time: 1.93615 s
$ ./bench_elimination 10000
m: 10000, n: 10000, last r: 10000, cpu cycles: 2007073582, cc/(mnr^0.807): 0.01187, wall time: 0.91230 s
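To put "a little" into numbers, the relative drop in CPU cycles can be worked out from the two runs above. The helper name `cycle_reduction` is hypothetical and only serves this illustration; it uses plain POSIX shell with awk.

```shell
#!/bin/sh
# Hypothetical helper: percentage of CPU cycles saved when going
# from a baseline run ($1) to an SSE2-enabled run ($2).
cycle_reduction() {
    awk -v base="$1" -v sse2="$2" \
        'BEGIN { printf "%.2f\n", (base - sse2) / base * 100 }'
}

# Cycle counts copied from the benchmark output above.
echo "multiplication: $(cycle_reduction 4402531744 4259547616) % fewer cycles"
echo "elimination:    $(cycle_reduction 2068096385 2007073582) % fewer cycles"
```

On these figures the gain works out to roughly 3.25 % for multiplication and 2.95 % for elimination, which matches the "gains a little" description.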
-
repo owner - changed status to resolved
enable SSE2 on AMD chips as well (K10s gain a little), closes #52 → <<cset 8f08bc8469ed>>
I don't find the comment in the Debian bug system convincing. We'd need some evidence that bit operations on AMDs indeed benefit from enabling SSE2.