Benchmark of rsqrt(inf) = 0 change.

Created by Rasmus Larsen
SSE:
name                                    old time/op             new time/op             delta
BM_eigen_rsqrt_float/1                   1.89ns ± 0%             1.90ns ± 0%   +0.67%          (p=0.008 n=5+5)
BM_eigen_rsqrt_float/8                   4.34ns ± 0%             4.71ns ± 0%   +8.49%          (p=0.016 n=4+5)
BM_eigen_rsqrt_float/64                  29.1ns ± 1%             32.8ns ± 0%  +12.90%          (p=0.008 n=5+5)
BM_eigen_rsqrt_float/512                  233ns ± 1%              260ns ± 0%  +11.72%          (p=0.008 n=5+5)
BM_eigen_rsqrt_float/4k                 1.83µs ± 1%             2.05µs ± 0%  +11.96%          (p=0.008 n=5+5)
BM_eigen_rsqrt_float/32k                14.6µs ± 0%             16.3µs ± 0%  +12.19%          (p=0.016 n=4+5)
BM_eigen_rsqrt_float/256k                120µs ± 1%              131µs ± 1%   +9.35%          (p=0.008 n=5+5)
BM_eigen_rsqrt_float/1M                  477µs ± 1%              526µs ± 1%  +10.19%          (p=0.008 n=5+5)

AVX:
name                                    old time/op             new time/op             delta
BM_eigen_rsqrt_float/1                   1.86ns ± 0%             2.17ns ± 0%  +17.02%          (p=0.008 n=5+5)
BM_eigen_rsqrt_float/8                   2.69ns ± 0%             2.98ns ± 4%  +10.80%          (p=0.016 n=4+5)
BM_eigen_rsqrt_float/64                  16.1ns ± 4%             18.5ns ± 4%  +14.95%          (p=0.008 n=5+5)
BM_eigen_rsqrt_float/512                  123ns ± 4%              139ns ± 3%  +12.69%          (p=0.008 n=5+5)
BM_eigen_rsqrt_float/4k                 1.00µs ± 4%             1.12µs ± 3%  +11.89%          (p=0.008 n=5+5)
BM_eigen_rsqrt_float/32k                8.11µs ± 4%             8.97µs ± 4%  +10.64%          (p=0.008 n=5+5)
BM_eigen_rsqrt_float/256k               89.0µs ± 2%             92.1µs ± 4%   +3.48%          (p=0.032 n=5+5)
BM_eigen_rsqrt_float/1M                  357µs ± 1%              371µs ± 4%   +4.03%          (p=0.032 n=5+5)

AVX512:
name                                    old time/op             new time/op             delta
BM_eigen_rsqrt_double/1                  3.17ns ± 0%             3.17ns ± 0%     ~             (p=0.317 n=5+5)
BM_eigen_rsqrt_double/8                  3.16ns ± 1%             3.43ns ± 1%   +8.47%          (p=0.008 n=5+5)
BM_eigen_rsqrt_double/64                 21.7ns ± 1%             24.2ns ± 1%  +11.72%          (p=0.008 n=5+5)
BM_eigen_rsqrt_double/512                 171ns ± 1%              192ns ± 1%  +12.28%          (p=0.008 n=5+5)
BM_eigen_rsqrt_double/4k                1.64µs ± 2%             1.80µs ± 2%   +9.93%          (p=0.008 n=5+5)
BM_eigen_rsqrt_double/32k               13.1µs ± 2%             14.4µs ± 2%  +10.43%          (p=0.008 n=5+5)
BM_eigen_rsqrt_double/256k               170µs ± 2%              176µs ± 2%   +3.46%          (p=0.016 n=5+5)
BM_eigen_rsqrt_double/1M                 721µs ± 2%              742µs ± 2%     ~             (p=0.056 n=5+5)
BM_eigen_rsqrt_float/1                   2.42ns ± 0%             2.24ns ± 0%   -7.24%          (p=0.008 n=5+5)
BM_eigen_rsqrt_float/8                   14.5ns ± 0%             14.5ns ± 0%     ~             (p=0.571 n=5+5)
BM_eigen_rsqrt_float/64                  8.43ns ± 1%             9.28ns ± 1%  +10.04%          (p=0.008 n=5+5)
BM_eigen_rsqrt_float/512                 66.1ns ± 1%             73.4ns ± 1%  +11.01%          (p=0.008 n=5+5)
BM_eigen_rsqrt_float/4k                   537ns ± 0%              594ns ± 0%  +10.56%          (p=0.008 n=5+5)
BM_eigen_rsqrt_float/32k                4.79µs ± 1%             5.37µs ± 0%  +12.05%          (p=0.016 n=5+4)
BM_eigen_rsqrt_float/256k               80.1µs ± 2%             81.6µs ± 2%     ~             (p=0.056 n=5+5)
BM_eigen_rsqrt_float/1M                  322µs ± 0%              329µs ± 2%   +2.06%          (p=0.016 n=4+5)
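
The snippet records only timings, not the change itself. As a rough illustration of why making rsqrt(inf) return 0 can cost a few extra instructions per packet, here is a minimal scalar sketch. It assumes the fast path is an approximate estimate refined by one Newton-Raphson step. The function names `rsqrt_newton_unguarded` and `rsqrt_newton_guarded` are hypothetical and are not Eigen's API, and `1.0f / std::sqrt(x)` merely stands in for a hardware estimate instruction.

```cpp
#include <cmath>
#include <limits>

// Hypothetical scalar model (not Eigen code): an initial estimate of
// 1/sqrt(x) refined by one Newton-Raphson step.
inline float rsqrt_newton_unguarded(float x) {
  float y = 1.0f / std::sqrt(x);  // stand-in for a hardware estimate
  // Newton-Raphson step for f(y) = 1/y^2 - x:
  //   y <- y * (1.5 - 0.5 * x * y * y)
  // For x = inf the estimate is 0, so x * y is inf * 0 = NaN.
  // For x = 0 the estimate is inf, so x * y is 0 * inf = NaN.
  return y * (1.5f - 0.5f * x * y * y);
}

// Same refinement, but the unrefined estimate is kept for x = 0 and
// x = inf, where it is already the desired result (inf and 0).
// In a vectorized kernel this guard would be an extra compare and select.
inline float rsqrt_newton_guarded(float x) {
  float y = 1.0f / std::sqrt(x);
  if (x == 0.0f || std::isinf(x)) {
    return y;
  }
  return y * (1.5f - 0.5f * x * y * y);
}
```

Under this assumption, the guarded variant does strictly more work per element than the unguarded one, which would be consistent with the roughly 10% slowdown at mid-range sizes in the tables above. The actual change may be implemented differently.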
