Martin Albrecht / m4ri
Issue #52 resolved

Do not forbid enabling SSE2 on non-Intel x86/x86_64 processors

Cédric Boutillier created an issue

Hi! Around line 50, configure.ac contains the following:

    case $host_cpu in i[[3456]]86*|x86_64*)
         AX_CPU_VENDOR()
         if test "x$ax_cv_cpu_vendor" = "xIntel"; then
            AX_EXT() # SSE2 is slower on the Opteron
         fi
    esac

This enables SSE2 instructions only on Intel CPUs. I have been told (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=702287) that AMD CPUs are no longer slower with SSE2. Could you please consider removing the if/fi lines and just calling AX_EXT()?

Thanks!

Cédric
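
For illustration, the requested change would reduce the fragment to roughly the following. This is a sketch of the suggestion, not necessarily the fix that was committed; it assumes that nothing else in configure.ac relies on AX_CPU_VENDOR() or on `ax_cv_cpu_vendor`.

```m4
# Sketch only: run the SIMD extension checks on every x86/x86_64 host,
# regardless of CPU vendor (assumes the vendor check is not needed elsewhere).
case $host_cpu in i[[3456]]86*|x86_64*)
     AX_EXT()
esac
```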

Comments (3)

  1. Martin Albrecht repo owner

    I ran tests on an AMD Opteron(tm) Processor 6174, which is a K10, and I stand corrected: one does indeed gain a little by enabling SSE2 on those machines.

    Without SSE2

    $ ./bench_multiplication 10000
    m: 10000, n: 10000, l: 10000, cutoff:     0, cpu cycles:   4402531744, cc/n^2.807: 0.02604, wall time:    2.00114 s
    
    $ ./bench_elimination 10000
    m: 10000, n: 10000, last r: 10000, cpu cycles:   2068096385, cc/(mnr^0.807): 0.01223, wall time:    0.94004 s
    

    With SSE2

    $ ./bench_multiplication 10000
    m: 10000, n: 10000, l: 10000, cutoff:     0, cpu cycles:   4259547616, cc/n^2.807: 0.02520, wall time:    1.93615 s
    
    $ ./bench_elimination 10000
    m: 10000, n: 10000, last r: 10000, cpu cycles:   2007073582, cc/(mnr^0.807): 0.01187, wall time:    0.91230 s
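
To put a number on "a little", the cycle counts reported above can be compared directly. The snippet below is a minimal sketch: `rel_gain_bp` is a hypothetical helper (not part of m4ri), and it assumes a shell with 64-bit integer arithmetic. It prints the reduction in CPU cycles in hundredths of a percent.

```shell
#!/bin/sh
# rel_gain_bp BEFORE AFTER
# Hypothetical helper: relative reduction in CPU cycles, expressed in
# hundredths of a percent and truncated by integer division.
rel_gain_bp() {
    before=$1
    after=$2
    echo $(( (before - after) * 10000 / before ))
}

# Cycle counts copied from the benchmark output above
# (first argument: without SSE2, second argument: with SSE2).
echo "multiplication: $(rel_gain_bp 4402531744 4259547616)"   # prints 324
echo "elimination:    $(rel_gain_bp 2068096385 2007073582)"   # prints 295
```

With these inputs the reduction is about 3.2% for multiplication and about 3.0% for elimination, which matches the change in the reported wall times.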
    