Created by Rasmus Larsen
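
The table that follows appears to use the layout printed by a benchstat-style comparison: benchmark name, a bracketed workload label, old and new time per operation with a spread, a delta, and a p-value with sample counts. A `~` in the delta column seems to mark rows where the difference is not statistically significant; the rows are consistent with a 0.05 threshold, though that is an inference from the data and not something the snippet states. The sketch below is one way to parse a row for further analysis. The names `parse_row`, `ROW_RE`, and `UNIT_SECONDS` are hypothetical helpers introduced here, and the recomputed delta uses the rounded times shown in the table, so it can differ slightly from the reported delta.

```python
import re

# Multipliers to convert each time unit used in the table into seconds.
UNIT_SECONDS = {"ns": 1e-9, "µs": 1e-6, "ms": 1e-3, "s": 1.0}

# One data row: name, [label], old time ± spread, new time ± spread,
# delta (or "~"), and "(p=... n=old+new)". This pattern is an assumption
# based on the rows in this snippet, not an official grammar.
ROW_RE = re.compile(
    r"^(?P<name>\S+)\s+\[(?P<label>[^\]]*)\]\s+"
    r"(?P<old>[\d.]+)(?P<old_unit>ns|µs|ms|s)\s*±\s*(?P<old_pct>\d+)%\s+"
    r"(?P<new>[\d.]+)(?P<new_unit>ns|µs|ms|s)\s*±\s*(?P<new_pct>\d+)%\s+"
    r"(?P<delta>~|[+-][\d.]+%)\s+"
    r"\(p=(?P<p>[\d.]+)\s+n=(?P<n_old>\d+)\+(?P<n_new>\d+)\)\s*$"
)


def parse_row(line):
    """Parse one table row into a dict, or return None for non-data lines."""
    m = ROW_RE.match(line.strip())
    if m is None:
        return None
    old_s = float(m["old"]) * UNIT_SECONDS[m["old_unit"]]
    new_s = float(m["new"]) * UNIT_SECONDS[m["new_unit"]]
    return {
        "name": m["name"],
        "label": m["label"].strip(),
        "old_s": old_s,
        "new_s": new_s,
        # Recomputed from the rounded times printed in the table.
        "approx_delta_pct": (new_s - old_s) / old_s * 100.0,
        "reported_delta": m["delta"],
        "p": float(m["p"]),
        # "~" means the tool did not report a significant change.
        "significant": m["delta"] != "~",
    }
```

Feeding every line of the table through `parse_row` and keeping the rows where `significant` is true gives a short list of the configurations whose timing changed measurably between the old and new runs.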
name                                                                                 old time/op             new time/op             delta
BM_Contraction_512_512_512_1_false_false     [usual fully-connected    ]              1.61ms ± 2%             1.67ms ± 5%     ~             (p=0.222 n=5+5)
BM_Contraction_512_512_512_2_false_false     [usual fully-connected    ]              1.00ms ± 3%             0.97ms ±10%     ~             (p=0.548 n=5+5)
BM_Contraction_512_512_512_8_false_false     [usual fully-connected    ]              372µs ± 6%              366µs ±16%     ~             (p=1.000 n=5+5)
BM_Contraction_512_512_512_18_false_false    [usual fully-connected    ]              241µs ± 6%              224µs ± 6%   -6.99%          (p=0.032 n=5+5)
BM_Contraction_512_512_512_28_false_false    [usual fully-connected    ]              245µs ± 6%              259µs ± 0%   +5.47%          (p=0.016 n=5+4)
BM_Contraction_512_512_512_36_false_false    [usual fully-connected    ]              260µs ± 1%              264µs ± 3%     ~             (p=0.222 n=5+5)
BM_Contraction_2048_2048_2048_1_false_false  [usual fully-connected    ]               265ms ± 3%              266ms ± 7%     ~             (p=0.690 n=5+5)
BM_Contraction_2048_2048_2048_2_false_false  [usual fully-connected    ]              62.2ms ± 0%             60.0ms ± 4%   -3.53%          (p=0.016 n=4+5)
BM_Contraction_2048_2048_2048_8_false_false  [usual fully-connected    ]              16.1ms ± 1%             15.9ms ± 1%     ~             (p=0.095 n=5+5)
BM_Contraction_2048_2048_2048_18_false_false [usual fully-connected    ]              9.38ms ± 7%             8.92ms ± 4%     ~             (p=0.056 n=5+5)
BM_Contraction_2048_2048_2048_28_false_false [usual fully-connected    ]              8.11ms ± 2%             7.94ms ± 4%     ~             (p=0.222 n=5+5)
BM_Contraction_2048_2048_2048_36_false_false [usual fully-connected    ]              7.72ms ± 5%             7.82ms ± 6%     ~             (p=1.000 n=5+5)
BM_Contraction_128_1024_1024_1_false_false   [lstm                     ]              1.65ms ± 1%             1.62ms ± 3%     ~             (p=0.095 n=5+5)
BM_Contraction_128_1024_1024_2_false_false   [lstm                     ]              1.20ms ±13%             1.15ms ±15%     ~             (p=0.421 n=5+5)
BM_Contraction_128_1024_1024_8_false_false   [lstm                     ]              424µs ±10%              425µs ±12%     ~             (p=0.548 n=5+5)
BM_Contraction_128_1024_1024_18_false_false  [lstm                     ]              305µs ±10%              294µs ± 5%     ~             (p=0.421 n=5+5)
BM_Contraction_128_1024_1024_28_false_false  [lstm                     ]              312µs ± 6%              297µs ±10%     ~             (p=0.548 n=5+5)
BM_Contraction_128_1024_1024_36_false_false  [lstm                     ]              366µs ± 6%              342µs ±17%     ~             (p=0.151 n=5+5)
BM_Contraction_128_2048_4096_1_false_false   [translate lstm           ]              14.1ms ± 7%             13.3ms ± 0%   -5.45%          (p=0.016 n=5+4)
BM_Contraction_128_2048_4096_2_false_false   [translate lstm           ]              8.67ms ± 4%             8.49ms ± 7%     ~             (p=0.222 n=5+5)
BM_Contraction_128_2048_4096_8_false_false   [translate lstm           ]              3.00ms ± 5%             2.92ms ± 4%     ~             (p=0.222 n=5+5)
BM_Contraction_128_2048_4096_18_false_false  [translate lstm           ]              1.79ms ± 2%             1.74ms ± 1%   -2.43%          (p=0.032 n=5+5)
BM_Contraction_128_2048_4096_28_false_false  [translate lstm           ]              1.63ms ± 2%             1.60ms ± 1%     ~             (p=0.310 n=5+5)
BM_Contraction_128_2048_4096_36_false_false  [translate lstm           ]              1.29ms ± 1%             1.30ms ± 1%     ~             (p=0.310 n=5+5)
BM_Contraction_128_3072_4096_1_false_false   [translate lstm           ]              20.9ms ± 1%             20.5ms ± 2%     ~             (p=0.063 n=4+5)
BM_Contraction_128_3072_4096_2_false_false   [translate lstm           ]              13.8ms ± 6%             12.9ms ± 6%     ~             (p=0.056 n=5+5)
BM_Contraction_128_3072_4096_8_false_false   [translate lstm           ]              4.30ms ±14%             4.24ms ± 9%     ~             (p=1.000 n=5+5)
BM_Contraction_128_3072_4096_18_false_false  [translate lstm           ]              2.75ms ± 2%             2.73ms ± 1%     ~             (p=0.310 n=5+5)
BM_Contraction_128_3072_4096_28_false_false  [translate lstm           ]              2.45ms ± 3%             2.44ms ± 1%     ~             (p=1.000 n=5+5)
BM_Contraction_128_3072_4096_36_false_false  [translate lstm           ]              1.99ms ± 4%             2.00ms ± 5%     ~             (p=0.841 n=5+5)
BM_Contraction_128_4096_2048_1_false_true    [translate lstm backwards ]              46.6ms ± 4%             45.2ms ± 4%     ~             (p=0.151 n=5+5)
BM_Contraction_128_4096_2048_2_false_true    [translate lstm backwards ]              25.8ms ± 3%             25.0ms ± 3%   -3.13%          (p=0.032 n=5+5)
BM_Contraction_128_4096_2048_8_false_true    [translate lstm backwards ]              8.92ms ± 5%             8.62ms ± 4%     ~             (p=0.222 n=5+5)
BM_Contraction_128_4096_2048_18_false_true   [translate lstm backwards ]              3.97ms ±11%             3.64ms ±15%     ~             (p=0.310 n=5+5)
BM_Contraction_128_4096_2048_28_false_true   [translate lstm backwards ]              4.24ms ± 7%             4.33ms ± 3%     ~             (p=0.548 n=5+5)
BM_Contraction_128_4096_2048_36_false_true   [translate lstm backwards ]              2.82ms ± 6%             2.99ms ± 9%     ~             (p=0.421 n=5+5)
BM_Contraction_2048_128_4096_1_true_false    [translate lstm backwards ]              24.5ms ± 5%             21.8ms ±27%     ~             (p=0.222 n=5+5)
BM_Contraction_2048_128_4096_2_true_false    [translate lstm backwards ]              8.79ms ± 3%             8.69ms ± 3%     ~             (p=0.690 n=5+5)
BM_Contraction_2048_128_4096_8_true_false    [translate lstm backwards ]              2.68ms ± 7%             2.62ms ± 1%     ~             (p=0.310 n=5+5)
BM_Contraction_2048_128_4096_18_true_false   [translate lstm backwards ]              1.78ms ± 6%             1.68ms ±10%     ~             (p=0.310 n=5+5)
BM_Contraction_2048_128_4096_28_true_false   [translate lstm backwards ]              1.23ms ± 2%             1.22ms ± 3%     ~             (p=0.841 n=5+5)
BM_Contraction_2048_128_4096_36_true_false   [translate lstm backwards ]              1.44ms ± 3%             1.42ms ± 3%     ~             (p=0.222 n=5+5)
BM_Contraction_3072_4096_128_1_false_true    [translate lstm backwards ]              52.7ms ± 2%             51.0ms ± 3%     ~             (p=0.056 n=5+5)
BM_Contraction_3072_4096_128_2_false_true    [translate lstm backwards ]              17.2ms ±25%             20.1ms ±19%     ~             (p=0.151 n=5+5)
BM_Contraction_3072_4096_128_8_false_true    [translate lstm backwards ]              5.83ms ±20%             6.00ms ±15%     ~             (p=0.690 n=5+5)
BM_Contraction_3072_4096_128_18_false_true   [translate lstm backwards ]              3.38ms ±10%             3.28ms ± 2%     ~             (p=1.000 n=5+5)
BM_Contraction_3072_4096_128_28_false_true   [translate lstm backwards ]              3.35ms ± 7%             3.46ms ± 5%     ~             (p=0.095 n=5+5)
BM_Contraction_3072_4096_128_36_false_true   [translate lstm backwards ]              4.41ms ±24%             3.55ms ±10%     ~             (p=0.222 n=5+5)
BM_Contraction_3072_128_4096_1_true_false    [translate lstm backwards ]              30.2ms ±15%             29.8ms ± 8%     ~             (p=1.000 n=5+5)
BM_Contraction_3072_128_4096_2_true_false    [translate lstm backwards ]              14.1ms ± 5%             14.0ms ± 5%     ~             (p=1.000 n=5+5)
BM_Contraction_3072_128_4096_8_true_false    [translate lstm backwards ]              4.04ms ± 3%             4.01ms ± 2%     ~             (p=0.548 n=5+5)
BM_Contraction_3072_128_4096_18_true_false   [translate lstm backwards ]              3.11ms ±12%             2.60ms ± 4%  -16.67%          (p=0.008 n=5+5)
BM_Contraction_3072_128_4096_28_true_false   [translate lstm backwards ]              2.09ms ± 2%             2.03ms ± 5%     ~             (p=0.095 n=5+5)
BM_Contraction_3072_128_4096_36_true_false   [translate lstm backwards ]              2.34ms ± 1%             2.35ms ± 3%     ~             (p=0.841 n=5+5)
BM_Contraction_512_512_1024_1_false_false    [adbrain lstm             ]              3.37ms ± 7%             3.25ms ± 6%     ~             (p=0.222 n=5+5)
BM_Contraction_512_512_1024_2_false_false    [adbrain lstm             ]              1.95ms ± 5%             1.91ms ±14%     ~             (p=1.000 n=5+5)
BM_Contraction_512_512_1024_8_false_false    [adbrain lstm             ]              626µs ± 5%              615µs ± 9%     ~             (p=0.841 n=5+5)
BM_Contraction_512_512_1024_18_false_false   [adbrain lstm             ]              364µs ± 4%              359µs ± 8%     ~             (p=0.421 n=5+5)
BM_Contraction_512_512_1024_28_false_false   [adbrain lstm             ]              357µs ± 2%              351µs ± 4%     ~             (p=0.690 n=5+5)
BM_Contraction_512_512_1024_36_false_false   [adbrain lstm             ]              363µs ± 2%              360µs ± 1%     ~             (p=0.421 n=5+5)
BM_Contraction_80_3072_4096_1_false_false    [lstm                     ]              14.3ms ± 1%             13.8ms ± 2%   -3.14%          (p=0.008 n=5+5)
BM_Contraction_80_3072_4096_2_false_false    [lstm                     ]              8.00ms ± 5%             7.81ms ± 3%     ~             (p=0.095 n=5+5)
BM_Contraction_80_3072_4096_8_false_false    [lstm                     ]              3.30ms ± 6%             3.25ms ± 8%     ~             (p=1.000 n=5+5)
BM_Contraction_80_3072_4096_18_false_false   [lstm                     ]              2.13ms ± 2%             2.12ms ± 2%     ~             (p=0.690 n=5+5)
BM_Contraction_80_3072_4096_28_false_false   [lstm                     ]              1.99ms ± 3%             1.99ms ± 3%     ~             (p=0.421 n=5+5)
BM_Contraction_80_3072_4096_36_false_false   [lstm                     ]              1.67ms ± 1%             1.66ms ± 1%     ~             (p=0.548 n=5+5)
BM_Contraction_2049_2049_2049_1_false_false  [~2k better L1 alignment  ]               114ms ± 6%              111ms ± 3%     ~             (p=0.095 n=5+5)
BM_Contraction_2049_2049_2049_2_false_false  [~2k better L1 alignment  ]              60.7ms ± 8%             60.6ms ± 8%     ~             (p=0.690 n=5+5)
BM_Contraction_2049_2049_2049_8_false_false  [~2k better L1 alignment  ]              16.1ms ± 1%             16.0ms ± 5%     ~             (p=0.690 n=5+5)
BM_Contraction_2049_2049_2049_18_false_false [~2k better L1 alignment  ]              9.25ms ± 3%             9.16ms ± 5%     ~             (p=0.841 n=5+5)
BM_Contraction_2049_2049_2049_28_false_false [~2k better L1 alignment  ]              7.96ms ± 3%             7.94ms ± 7%     ~             (p=1.000 n=5+5)
BM_Contraction_2049_2049_2049_36_false_false [~2k better L1 alignment  ]              7.91ms ± 2%             7.98ms ± 2%     ~             (p=0.222 n=5+5)
BM_Contraction_2064_2064_2064_1_false_false  [~2k better L1 alignment  ]               114ms ± 2%              110ms ± 3%   -2.84%          (p=0.008 n=5+5)
BM_Contraction_2064_2064_2064_2_false_false  [~2k better L1 alignment  ]              63.2ms ± 3%             58.2ms ±10%     ~             (p=0.095 n=5+5)
BM_Contraction_2064_2064_2064_8_false_false  [~2k better L1 alignment  ]              16.5ms ± 3%             16.3ms ± 2%     ~             (p=0.421 n=5+5)
BM_Contraction_2064_2064_2064_18_false_false [~2k better L1 alignment  ]              9.45ms ± 9%             9.16ms ± 3%     ~             (p=0.548 n=5+5)
BM_Contraction_2064_2064_2064_28_false_false [~2k better L1 alignment  ]              8.11ms ± 4%             8.01ms ± 3%     ~             (p=0.690 n=5+5)
BM_Contraction_2064_2064_2064_36_false_false [~2k better L1 alignment  ]              7.98ms ± 3%             7.92ms ± 3%     ~             (p=0.690 n=5+5)
BM_Contraction_4096_4096_4096_1_false_false  [big for L2 opt           ]               2.04s ± 3%              2.07s ± 3%     ~             (p=0.841 n=5+5)
BM_Contraction_4096_4096_4096_2_false_false  [big for L2 opt           ]               941ms ± 9%              933ms ± 2%     ~             (p=1.000 n=5+5)
BM_Contraction_4096_4096_4096_8_false_false  [big for L2 opt           ]               131ms ± 4%              132ms ± 3%     ~             (p=0.548 n=5+5)
BM_Contraction_4096_4096_4096_18_false_false [big for L2 opt           ]              68.9ms ± 6%             68.0ms ± 3%     ~             (p=1.000 n=5+5)
BM_Contraction_4096_4096_4096_28_false_false [big for L2 opt           ]              61.5ms ± 2%             61.3ms ± 3%     ~             (p=0.841 n=5+5)
BM_Contraction_4096_4096_4096_36_false_false [big for L2 opt           ]              56.2ms ± 4%             56.9ms ± 4%     ~             (p=0.841 n=5+5)
BM_Contraction_512_512_512_1_true_false      [fully-connected transpose]              1.99ms ± 8%             2.08ms ± 5%     ~             (p=0.222 n=5+5)
BM_Contraction_512_512_512_2_true_false      [fully-connected transpose]              1.11ms ± 9%             1.20ms ±13%     ~             (p=0.151 n=5+5)
BM_Contraction_512_512_512_8_true_false      [fully-connected transpose]              404µs ± 8%              400µs ± 2%     ~             (p=0.421 n=5+5)
BM_Contraction_512_512_512_18_true_false     [fully-connected transpose]              261µs ± 6%              250µs ± 2%     ~             (p=0.095 n=5+5)
BM_Contraction_512_512_512_28_true_false     [fully-connected transpose]              268µs ± 9%              270µs ± 5%     ~             (p=1.000 n=5+5)
BM_Contraction_512_512_512_36_true_false     [fully-connected transpose]              283µs ± 3%              282µs ± 2%     ~             (p=0.690 n=5+5)
BM_Contraction_512_512_512_1_false_true      [fully-connected transpose]              1.89ms ± 4%             1.90ms ± 6%     ~             (p=0.841 n=5+5)
BM_Contraction_512_512_512_2_false_true      [fully-connected transpose]              1.10ms ± 3%             1.05ms ± 4%     ~             (p=0.056 n=5+5)
BM_Contraction_512_512_512_8_false_true      [fully-connected transpose]              407µs ± 2%              393µs ± 7%     ~             (p=0.151 n=5+5)
BM_Contraction_512_512_512_18_false_true     [fully-connected transpose]              251µs ± 4%              244µs ± 4%     ~             (p=0.151 n=5+5)
BM_Contraction_512_512_512_28_false_true     [fully-connected transpose]              256µs ± 4%              239µs ± 7%   -6.79%          (p=0.016 n=5+5)
BM_Contraction_512_512_512_36_false_true     [fully-connected transpose]              268µs ± 3%              268µs ± 3%     ~             (p=0.841 n=5+5)
BM_Contraction_512_512_512_1_true_true       [fully-connected transpose]              2.26ms ± 6%             2.25ms ± 2%     ~             (p=0.690 n=5+5)
BM_Contraction_512_512_512_2_true_true       [fully-connected transpose]              1.22ms ±16%             1.25ms ±11%     ~             (p=1.000 n=5+5)
BM_Contraction_512_512_512_8_true_true       [fully-connected transpose]              428µs ± 3%              410µs ± 3%   -4.28%          (p=0.016 n=5+5)
BM_Contraction_512_512_512_18_true_true      [fully-connected transpose]              275µs ± 6%              261µs ± 2%   -5.05%          (p=0.032 n=5+5)
BM_Contraction_512_512_512_28_true_true      [fully-connected transpose]              274µs ± 2%              272µs ± 4%     ~             (p=0.841 n=5+5)
BM_Contraction_512_512_512_36_true_true      [fully-connected transpose]              283µs ± 0%              282µs ± 3%     ~             (p=0.730 n=4+5)
BM_Contraction_2048_2048_2048_1_true_false   [fully-connected transpose]               278ms ± 9%              278ms ±16%     ~             (p=0.841 n=5+5)
BM_Contraction_2048_2048_2048_2_true_false   [fully-connected transpose]              88.2ms ± 8%             82.5ms ± 1%     ~             (p=0.190 n=5+4)
BM_Contraction_2048_2048_2048_8_true_false   [fully-connected transpose]              19.0ms ± 3%             18.8ms ± 2%     ~             (p=0.548 n=5+5)
BM_Contraction_2048_2048_2048_18_true_false  [fully-connected transpose]              12.2ms ±10%             11.6ms ± 5%     ~             (p=0.222 n=5+5)
BM_Contraction_2048_2048_2048_28_true_false  [fully-connected transpose]              8.90ms ± 9%             8.50ms ± 4%     ~             (p=0.310 n=5+5)
BM_Contraction_2048_2048_2048_36_true_false  [fully-connected transpose]              8.21ms ± 4%             8.27ms ± 2%     ~             (p=0.548 n=5+5)
BM_Contraction_2048_2048_2048_1_false_true   [fully-connected transpose]               275ms ± 4%              268ms ± 2%     ~             (p=0.095 n=5+5)
BM_Contraction_2048_2048_2048_2_false_true   [fully-connected transpose]              72.5ms ± 5%             71.7ms ± 3%     ~             (p=0.548 n=5+5)
BM_Contraction_2048_2048_2048_8_false_true   [fully-connected transpose]              18.6ms ± 0%             18.5ms ± 1%     ~             (p=0.190 n=4+5)
BM_Contraction_2048_2048_2048_18_false_true  [fully-connected transpose]              10.2ms ± 3%             10.0ms ± 7%     ~             (p=0.310 n=5+5)
BM_Contraction_2048_2048_2048_28_false_true  [fully-connected transpose]              8.82ms ± 2%             8.84ms ± 3%     ~             (p=1.000 n=5+5)
BM_Contraction_2048_2048_2048_36_false_true  [fully-connected transpose]              8.53ms ± 2%             8.47ms ± 3%     ~             (p=0.841 n=5+5)
BM_Contraction_2048_2048_2048_1_true_true    [fully-connected transpose]               279ms ±10%              269ms ± 4%     ~             (p=0.310 n=5+5)
BM_Contraction_2048_2048_2048_2_true_true    [fully-connected transpose]              83.2ms ± 5%             80.1ms ± 5%     ~             (p=0.421 n=5+5)
BM_Contraction_2048_2048_2048_8_true_true    [fully-connected transpose]              21.3ms ± 2%             21.2ms ± 1%     ~             (p=0.690 n=5+5)
BM_Contraction_2048_2048_2048_18_true_true   [fully-connected transpose]              12.7ms ± 4%             12.5ms ± 4%     ~             (p=0.690 n=5+5)
BM_Contraction_2048_2048_2048_28_true_true   [fully-connected transpose]              9.61ms ± 7%             9.56ms ± 2%     ~             (p=1.000 n=5+5)
BM_Contraction_2048_2048_2048_36_true_true   [fully-connected transpose]              9.01ms ± 1%             8.67ms ± 9%     ~             (p=0.151 n=5+5)
BM_Contraction_64_288_128000_1_false_false   [DeepVariant #1           ]              41.5ms ± 1%             41.0ms ± 1%   -1.02%          (p=0.032 n=5+5)
BM_Contraction_64_288_128000_2_false_false   [DeepVariant #1           ]              20.3ms ± 9%             20.0ms ± 5%     ~             (p=1.000 n=5+5)
BM_Contraction_64_288_128000_8_false_false   [DeepVariant #1           ]              5.45ms ± 2%             5.53ms ± 3%     ~             (p=0.095 n=5+5)
BM_Contraction_64_288_128000_18_false_false  [DeepVariant #1           ]              2.98ms ± 3%             2.89ms ± 2%   -3.05%          (p=0.032 n=4+5)
BM_Contraction_64_288_128000_28_false_false  [DeepVariant #1           ]              2.80ms ± 1%             2.74ms ± 2%   -2.25%          (p=0.032 n=5+5)
BM_Contraction_64_288_128000_36_false_false  [DeepVariant #1           ]              2.69ms ± 2%             2.63ms ± 7%     ~             (p=0.421 n=5+5)
BM_Contraction_96_576_128000_1_false_false   [DeepVariant #1           ]               104ms ± 2%              104ms ± 3%     ~             (p=1.000 n=5+5)
BM_Contraction_96_576_128000_2_false_false   [DeepVariant #1           ]              57.3ms ± 7%             62.8ms ± 3%   +9.58%          (p=0.008 n=5+5)
BM_Contraction_96_576_128000_8_false_false   [DeepVariant #1           ]              16.0ms ± 4%             16.2ms ± 6%     ~             (p=0.310 n=5+5)
BM_Contraction_96_576_128000_18_false_false  [DeepVariant #1           ]              8.77ms ± 2%             8.65ms ± 2%     ~             (p=0.222 n=5+5)
BM_Contraction_96_576_128000_28_false_false  [DeepVariant #1           ]              8.72ms ± 1%             8.49ms ± 4%     ~             (p=0.151 n=5+5)
BM_Contraction_96_576_128000_36_false_false  [DeepVariant #1           ]              8.85ms ± 1%             8.59ms ± 9%     ~             (p=0.690 n=5+5)
BM_Contraction_96_864_128000_1_false_false   [DeepVariant #1           ]               146ms ± 1%              147ms ± 3%     ~             (p=0.548 n=5+5)
BM_Contraction_96_864_128000_2_false_false   [DeepVariant #1           ]              79.6ms ± 5%             78.7ms ± 5%     ~             (p=0.690 n=5+5)
BM_Contraction_96_864_128000_8_false_false   [DeepVariant #1           ]              25.0ms ± 2%             24.9ms ± 4%     ~             (p=1.000 n=5+5)
BM_Contraction_96_864_128000_18_false_false  [DeepVariant #1           ]              13.5ms ± 2%             13.6ms ±12%     ~             (p=0.548 n=5+5)
BM_Contraction_96_864_128000_28_false_false  [DeepVariant #1           ]              13.3ms ± 2%             12.8ms ± 6%     ~             (p=0.310 n=5+5)
BM_Contraction_96_864_128000_36_false_false  [DeepVariant #1           ]              13.7ms ± 3%             13.1ms ± 7%     ~             (p=0.310 n=5+5)
BM_Contraction_384_2592_24576_1_false_false  [DeepVariant #1           ]               287ms ± 3%              284ms ± 6%     ~             (p=0.841 n=5+5)
BM_Contraction_384_2592_24576_2_false_false  [DeepVariant #1           ]               172ms ±12%              168ms ± 6%     ~             (p=0.548 n=5+5)
BM_Contraction_384_2592_24576_8_false_false  [DeepVariant #1           ]              45.1ms ± 1%             45.2ms ± 1%     ~             (p=1.000 n=5+5)
BM_Contraction_384_2592_24576_18_false_false [DeepVariant #1           ]              22.8ms ± 2%             22.4ms ± 1%     ~             (p=0.151 n=5+5)
BM_Contraction_384_2592_24576_28_false_false [DeepVariant #1           ]              20.7ms ± 4%             20.1ms ± 4%     ~             (p=0.222 n=5+5)
BM_Contraction_384_2592_24576_36_false_false [DeepVariant #1           ]              19.3ms ± 1%             19.0ms ± 6%     ~             (p=0.690 n=5+5)
BM_Contraction_192_1120_24576_1_false_false  [DeepVariant #2           ]              62.8ms ± 3%             61.5ms ± 5%     ~             (p=0.151 n=5+5)
BM_Contraction_192_1120_24576_2_false_false  [DeepVariant #2           ]              39.9ms ± 5%             40.1ms ± 2%     ~             (p=0.690 n=5+5)
BM_Contraction_192_1120_24576_8_false_false  [DeepVariant #2           ]              10.3ms ± 1%             10.2ms ± 2%     ~             (p=0.222 n=5+5)
BM_Contraction_192_1120_24576_18_false_false [DeepVariant #2           ]              5.70ms ± 1%             5.72ms ± 3%     ~             (p=1.000 n=5+5)
BM_Contraction_192_1120_24576_28_false_false [DeepVariant #2           ]              5.23ms ± 4%             5.24ms ± 4%     ~             (p=1.000 n=5+5)
BM_Contraction_192_1120_24576_36_false_false [DeepVariant #2           ]              4.93ms ± 3%             4.87ms ± 5%     ~             (p=0.310 n=5+5)
BM_Contraction_192_768_24576_1_false_false   [DeepVariant #3           ]              44.6ms ± 5%             43.4ms ± 1%     ~             (p=0.056 n=5+5)
BM_Contraction_192_768_24576_2_false_false   [DeepVariant #3           ]              27.6ms ± 5%             28.1ms ± 4%     ~             (p=0.310 n=5+5)
BM_Contraction_192_768_24576_8_false_false   [DeepVariant #3           ]              7.29ms ± 5%             7.19ms ± 1%     ~             (p=0.548 n=5+5)
BM_Contraction_192_768_24576_18_false_false  [DeepVariant #3           ]              4.05ms ± 3%             3.99ms ± 4%     ~             (p=0.690 n=5+5)
BM_Contraction_192_768_24576_28_false_false  [DeepVariant #3           ]              3.62ms ± 3%             3.59ms ± 4%     ~             (p=0.690 n=5+5)
BM_Contraction_192_768_24576_36_false_false  [DeepVariant #3           ]              3.31ms ± 8%             3.40ms ± 3%     ~             (p=0.222 n=5+5)
BM_Contraction_160_1120_24576_1_false_false  [DeepVariant #4           ]              57.3ms ± 1%             57.5ms ± 5%     ~             (p=0.841 n=5+5)
BM_Contraction_160_1120_24576_2_false_false  [DeepVariant #4           ]              36.5ms ± 5%             36.4ms ± 3%     ~             (p=1.000 n=5+5)
BM_Contraction_160_1120_24576_8_false_false  [DeepVariant #4           ]              9.34ms ± 1%             9.29ms ± 1%     ~             (p=0.548 n=5+5)
BM_Contraction_160_1120_24576_18_false_false [DeepVariant #4           ]              5.26ms ± 1%             5.09ms ± 4%     ~             (p=0.056 n=5+5)
BM_Contraction_160_1120_24576_28_false_false [DeepVariant #4           ]              4.63ms ± 2%             4.64ms ± 0%     ~             (p=1.000 n=5+4)
BM_Contraction_160_1120_24576_36_false_false [DeepVariant #4           ]              4.34ms ± 3%             4.36ms ± 6%     ~             (p=0.548 n=5+5)
BM_Contraction_128_896_24576_1_false_false   [DeepVariant #5           ]              39.4ms ± 7%             36.2ms ± 2%   -8.18%          (p=0.032 n=5+5)
BM_Contraction_128_896_24576_2_false_false   [DeepVariant #5           ]              24.7ms ± 2%             23.8ms ± 7%     ~             (p=0.095 n=5+5)
BM_Contraction_128_896_24576_8_false_false   [DeepVariant #5           ]              6.13ms ± 0%             6.12ms ± 1%     ~             (p=0.556 n=4+5)
BM_Contraction_128_896_24576_18_false_false  [DeepVariant #5           ]              3.55ms ± 5%             3.57ms ± 2%     ~             (p=0.841 n=5+5)
BM_Contraction_128_896_24576_28_false_false  [DeepVariant #5           ]              3.16ms ± 4%             3.24ms ± 4%     ~             (p=0.548 n=5+5)
BM_Contraction_128_896_24576_36_false_false  [DeepVariant #5           ]              2.94ms ± 4%             3.06ms ± 1%   +4.05%          (p=0.008 n=5+5)
BM_Contraction_512_800_4_1_false_false       [speech lstm a            ]             66.5µs ± 2%             63.4µs ± 5%     ~             (p=0.056 n=5+5)
BM_Contraction_512_800_4_2_false_false       [speech lstm a            ]             34.2µs ± 4%             34.4µs ± 7%     ~             (p=0.841 n=5+5)
BM_Contraction_512_800_4_8_false_false       [speech lstm a            ]             34.7µs ±16%             33.2µs ±19%     ~             (p=0.421 n=5+5)
BM_Contraction_512_800_4_18_false_false      [speech lstm a            ]             38.5µs ± 9%             32.0µs ± 4%  -16.95%          (p=0.008 n=5+5)
BM_Contraction_512_800_4_28_false_false      [speech lstm a            ]             36.9µs ± 4%             32.9µs ±14%     ~             (p=0.056 n=5+5)
BM_Contraction_512_800_4_36_false_false      [speech lstm a            ]             36.4µs ± 2%             33.1µs ± 6%   -9.02%          (p=0.008 n=5+5)
BM_Contraction_512_80_800_1_false_false      [speech lstm b            ]              472µs ± 2%              461µs ± 1%   -2.30%          (p=0.016 n=5+5)
BM_Contraction_512_80_800_2_false_false      [speech lstm b            ]              243µs ±13%              238µs ± 5%     ~             (p=0.421 n=5+5)
BM_Contraction_512_80_800_8_false_false      [speech lstm b            ]              112µs ±15%              115µs ±15%     ~             (p=0.310 n=5+5)
BM_Contraction_512_80_800_18_false_false     [speech lstm b            ]             93.6µs ± 4%             88.1µs ±16%     ~             (p=0.151 n=5+5)
BM_Contraction_512_80_800_28_false_false     [speech lstm b            ]             93.2µs ± 3%             91.1µs ± 2%     ~             (p=0.095 n=5+5)
BM_Contraction_512_80_800_36_false_false     [speech lstm b            ]              122µs ± 6%              117µs ± 4%     ~             (p=0.095 n=5+5)
BM_Contraction_512_13522_80_1_false_false    [speech lstm c            ]              7.34ms ± 5%             7.04ms ± 5%     ~             (p=0.151 n=5+5)
BM_Contraction_512_13522_80_2_false_false    [speech lstm c            ]              7.68ms ±18%             7.19ms ±23%     ~             (p=0.690 n=5+5)
BM_Contraction_512_13522_80_8_false_false    [speech lstm c            ]              3.16ms ±16%             2.89ms ±13%     ~             (p=0.151 n=5+5)
BM_Contraction_512_13522_80_18_false_false   [speech lstm c            ]              1.33ms ±12%             1.28ms ±23%     ~             (p=0.421 n=5+5)
BM_Contraction_512_13522_80_28_false_false   [speech lstm c            ]              1.15ms ±11%             1.13ms ± 6%     ~             (p=0.548 n=5+5)
BM_Contraction_512_13522_80_36_false_false   [speech lstm c            ]              1.01ms ± 4%             1.03ms ± 3%     ~             (p=0.421 n=5+5)
BM_Contraction_1_13522_80_1_false_false      [speech lstm d            ]              319µs ± 1%              311µs ± 1%   -2.55%          (p=0.008 n=5+5)
BM_Contraction_1_13522_80_2_false_false      [speech lstm d            ]              171µs ± 3%              170µs ± 5%     ~             (p=0.310 n=5+5)
BM_Contraction_1_13522_80_8_false_false      [speech lstm d            ]             71.6µs ± 2%             71.3µs ± 8%     ~             (p=0.690 n=5+5)
BM_Contraction_1_13522_80_18_false_false     [speech lstm d            ]             78.4µs ±11%             77.8µs ± 4%     ~             (p=1.000 n=5+5)
BM_Contraction_1_13522_80_28_false_false     [speech lstm d            ]             92.4µs ± 9%             74.8µs ± 2%  -19.04%          (p=0.008 n=5+5)
BM_Contraction_1_13522_80_36_false_false     [speech lstm d            ]             91.1µs ± 7%             77.2µs ± 7%  -15.31%          (p=0.008 n=5+5)
BM_Contraction_3200_512_4_1_false_false      [speech lstm e            ]              273µs ± 2%              271µs ± 3%     ~             (p=0.548 n=5+5)
BM_Contraction_3200_512_4_2_false_false      [speech lstm e            ]              160µs ± 3%              160µs ± 5%     ~             (p=1.000 n=5+5)
BM_Contraction_3200_512_4_8_false_false      [speech lstm e            ]              100µs ±11%               99µs ± 7%     ~             (p=0.690 n=5+5)
BM_Contraction_3200_512_4_18_false_false     [speech lstm e            ]             83.1µs ± 3%             82.9µs ± 4%     ~             (p=1.000 n=5+5)
BM_Contraction_3200_512_4_28_false_false     [speech lstm e            ]             87.4µs ± 2%             90.0µs ± 9%     ~             (p=0.690 n=5+5)
BM_Contraction_3200_512_4_36_false_false     [speech lstm e            ]             87.3µs ± 7%             87.7µs ± 3%     ~             (p=0.421 n=5+5)
BM_Contraction_3200_512_80_1_false_false     [speech lstm f            ]              1.51ms ± 1%             1.57ms ± 4%   +4.13%          (p=0.016 n=5+5)
BM_Contraction_3200_512_80_2_false_false     [speech lstm f            ]              1.35ms ± 9%             1.35ms ±11%     ~             (p=0.841 n=5+5)
BM_Contraction_3200_512_80_8_false_false     [speech lstm f            ]              487µs ±11%              485µs ± 8%     ~             (p=0.841 n=5+5)
BM_Contraction_3200_512_80_18_false_false    [speech lstm f            ]              346µs ± 5%              355µs ±15%     ~             (p=1.000 n=5+5)
BM_Contraction_3200_512_80_28_false_false    [speech lstm f            ]              338µs ±16%              319µs ± 7%     ~             (p=0.421 n=5+5)
BM_Contraction_3200_512_80_36_false_false    [speech lstm f            ]              385µs ± 7%              359µs ±10%     ~             (p=0.095 n=5+5)
BM_Contraction_3200_80_512_1_false_false     [speech lstm g            ]              1.86ms ± 4%             1.85ms ± 5%     ~             (p=1.000 n=5+5)
BM_Contraction_3200_80_512_2_false_false     [speech lstm g            ]              1.01ms ±14%             0.89ms ±16%     ~             (p=0.151 n=5+5)
BM_Contraction_3200_80_512_8_false_false     [speech lstm g            ]              376µs ± 3%              358µs ± 7%     ~             (p=0.151 n=5+5)
BM_Contraction_3200_80_512_18_false_false    [speech lstm g            ]              263µs ± 6%              244µs ± 2%   -7.32%          (p=0.016 n=5+5)
BM_Contraction_3200_80_512_28_false_false    [speech lstm g            ]              258µs ± 3%              256µs ± 1%     ~             (p=0.151 n=5+5)
BM_Contraction_3200_80_512_36_false_false    [speech lstm g            ]              296µs ± 2%              305µs ± 2%   +2.92%          (p=0.032 n=5+5)
BM_Contraction_2048_1024_1_1_false_false     [gemv col_major           ]              320µs ± 1%              317µs ± 1%   -0.97%          (p=0.032 n=5+5)
BM_Contraction_2048_1024_1_2_false_false     [gemv col_major           ]              321µs ± 1%              317µs ± 1%   -1.11%          (p=0.032 n=5+5)
BM_Contraction_2048_1024_1_8_false_false     [gemv col_major           ]              320µs ± 1%              317µs ± 1%   -1.03%          (p=0.032 n=5+5)
BM_Contraction_2048_1024_1_18_false_false    [gemv col_major           ]              321µs ± 1%              318µs ± 0%   -0.88%          (p=0.008 n=5+5)
BM_Contraction_2048_1024_1_28_false_false    [gemv col_major           ]              321µs ± 1%              318µs ± 1%     ~             (p=0.056 n=5+5)
BM_Contraction_2048_1024_1_36_false_false    [gemv col_major           ]              320µs ± 2%              319µs ± 1%     ~             (p=0.421 n=5+5)
BM_Contraction_1_128_128_1_false_false       [saft lstm 128 h 1 b      ]             5.38µs ± 1%             5.26µs ± 1%   -2.23%          (p=0.008 n=5+5)
BM_Contraction_1_128_128_2_false_false       [saft lstm 128 h 1 b      ]             5.41µs ± 3%             5.28µs ± 2%   -2.42%          (p=0.016 n=5+5)
BM_Contraction_1_128_128_8_false_false       [saft lstm 128 h 1 b      ]             5.42µs ± 3%             5.30µs ± 2%     ~             (p=0.095 n=5+5)
BM_Contraction_1_128_128_18_false_false      [saft lstm 128 h 1 b      ]             5.63µs ±14%             5.32µs ± 2%     ~             (p=0.095 n=5+5)
BM_Contraction_1_128_128_28_false_false      [saft lstm 128 h 1 b      ]             5.55µs ± 9%             5.31µs ± 3%     ~             (p=0.056 n=5+5)
BM_Contraction_1_128_128_36_false_false      [saft lstm 128 h 1 b      ]             5.58µs ± 7%             5.33µs ± 2%   -4.40%          (p=0.032 n=5+5)
BM_Contraction_1_192_192_1_false_false       [saft lstm 192 h 1 b      ]             11.1µs ±15%             10.4µs ± 2%     ~             (p=0.095 n=5+5)
BM_Contraction_1_192_192_2_false_false       [saft lstm 192 h 1 b      ]             11.0µs ± 4%             10.4µs ± 1%   -5.63%          (p=0.008 n=5+5)
BM_Contraction_1_192_192_8_false_false       [saft lstm 192 h 1 b      ]             11.4µs ±20%             10.4µs ± 2%   -8.52%          (p=0.008 n=5+5)
BM_Contraction_1_192_192_18_false_false      [saft lstm 192 h 1 b      ]             11.2µs ±16%             10.5µs ± 1%   -6.55%          (p=0.016 n=5+5)
BM_Contraction_1_192_192_28_false_false      [saft lstm 192 h 1 b      ]             10.8µs ± 1%             10.4µs ± 2%   -3.46%          (p=0.016 n=4+5)
BM_Contraction_1_192_192_36_false_false      [saft lstm 192 h 1 b      ]             11.3µs ±15%             10.4µs ± 1%   -7.98%          (p=0.008 n=5+5)
BM_Contraction_1500_500_1_1_false_false      [MatVec                   ]              117µs ± 1%              116µs ± 1%     ~             (p=0.056 n=5+5)
BM_Contraction_1500_500_1_2_false_false      [MatVec                   ]              117µs ± 1%              115µs ± 1%   -1.23%          (p=0.032 n=5+5)
BM_Contraction_1500_500_1_8_false_false      [MatVec                   ]              117µs ± 1%              116µs ± 1%   -1.34%          (p=0.016 n=5+5)
BM_Contraction_1500_500_1_18_false_false     [MatVec                   ]              117µs ± 1%              115µs ± 1%   -1.54%          (p=0.032 n=4+5)
BM_Contraction_1500_500_1_28_false_false     [MatVec                   ]              118µs ± 2%              116µs ± 0%   -2.06%          (p=0.008 n=5+5)
BM_Contraction_1500_500_1_36_false_false     [MatVec                   ]              117µs ± 1%              116µs ± 1%   -1.37%          (p=0.032 n=5+5)
BM_Contraction_1_500_1500_1_false_false      [VecMat                   ]              212µs ± 0%              227µs ±11%     ~             (p=0.730 n=4+5)
BM_Contraction_1_500_1500_2_false_false      [VecMat                   ]              117µs ± 1%              131µs ±12%     ~             (p=0.111 n=4+5)
BM_Contraction_1_500_1500_8_false_false      [VecMat                   ]             79.5µs ± 2%             76.0µs ± 2%   -4.43%          (p=0.016 n=4+5)
BM_Contraction_1_500_1500_18_false_false     [VecMat                   ]              109µs ± 3%               79µs ± 3%  -28.28%          (p=0.008 n=5+5)
BM_Contraction_1_500_1500_28_false_false     [VecMat                   ]              110µs ± 7%               76µs ± 2%  -31.10%          (p=0.008 n=5+5)
BM_Contraction_1_500_1500_36_false_false     [VecMat                   ]              108µs ± 3%               77µs ± 5%  -28.95%          (p=0.008 n=5+5)
BM_Contraction_250_512_3_1_false_false       [adbrain2                 ]             9.66µs ± 5%             9.14µs ± 2%   -5.41%          (p=0.008 n=5+5)
BM_Contraction_250_512_3_2_false_false       [adbrain2                 ]             16.5µs ± 4%              9.2µs ± 3%  -44.10%          (p=0.008 n=5+5)
BM_Contraction_250_512_3_8_false_false       [adbrain2                 ]             21.4µs ±15%             17.8µs ±24%     ~             (p=0.095 n=5+5)
BM_Contraction_250_512_3_18_false_false      [adbrain2                 ]             20.4µs ± 4%             18.2µs ±20%     ~             (p=0.151 n=5+5)
BM_Contraction_250_512_3_28_false_false      [adbrain2                 ]             21.6µs ± 5%             17.3µs ± 5%  -19.87%          (p=0.008 n=5+5)
BM_Contraction_250_512_3_36_false_false      [adbrain2                 ]             22.4µs ± 9%             17.6µs ± 7%  -21.50%          (p=0.008 n=5+5)
BM_Contraction_1500_512_3_1_false_false      [adbrain2                 ]              128µs ± 2%              125µs ± 1%   -2.94%          (p=0.008 n=5+5)
BM_Contraction_1500_512_3_2_false_false      [adbrain2                 ]             74.8µs ± 7%             72.9µs ± 2%     ~             (p=0.310 n=5+5)
BM_Contraction_1500_512_3_8_false_false      [adbrain2                 ]             50.0µs ± 5%             49.7µs ± 4%     ~             (p=0.690 n=5+5)
BM_Contraction_1500_512_3_18_false_false     [adbrain2                 ]             60.1µs ± 4%             51.0µs ±11%  -15.20%          (p=0.008 n=5+5)
BM_Contraction_1500_512_3_28_false_false     [adbrain2                 ]             62.0µs ± 7%             49.1µs ± 5%  -20.89%          (p=0.008 n=5+5)
BM_Contraction_1500_512_3_36_false_false     [adbrain2                 ]             62.4µs ± 7%             48.5µs ± 4%  -22.21%          (p=0.008 n=5+5)
BM_Contraction_3_512_250_1_false_false                                               38.3µs ± 4%             36.9µs ± 2%   -3.47%          (p=0.032 n=5+5)
BM_Contraction_3_512_250_2_false_false                                               30.7µs ± 4%             31.1µs ± 7%     ~             (p=0.222 n=5+5)
BM_Contraction_3_512_250_8_false_false                                               30.4µs ±11%             28.9µs ± 9%     ~             (p=0.095 n=5+5)
BM_Contraction_3_512_250_18_false_false                                              30.4µs ± 7%             27.9µs ± 4%   -8.37%          (p=0.016 n=5+5)
BM_Contraction_3_512_250_28_false_false                                              29.4µs ± 3%             28.6µs ± 6%     ~             (p=0.151 n=5+5)
BM_Contraction_3_512_250_36_false_false                                              30.3µs ± 3%             28.7µs ± 7%     ~             (p=0.056 n=5+5)
BM_Contraction_3_512_1500_1_false_false                                               239µs ± 6%              225µs ± 1%   -6.03%          (p=0.008 n=5+5)
BM_Contraction_3_512_1500_2_false_false                                               126µs ± 3%              125µs ± 2%     ~             (p=0.841 n=5+5)
BM_Contraction_3_512_1500_8_false_false                                              82.3µs ± 3%             78.8µs ± 3%   -4.26%          (p=0.016 n=5+5)
BM_Contraction_3_512_1500_18_false_false                                             75.8µs ± 3%             83.0µs ± 2%   +9.55%          (p=0.008 n=5+5)
BM_Contraction_3_512_1500_28_false_false                                             77.1µs ± 3%             84.7µs ± 5%   +9.81%          (p=0.008 n=5+5)
BM_Contraction_3_512_1500_36_false_false                                             76.1µs ± 4%             83.6µs ± 6%   +9.86%          (p=0.008 n=5+5)
BM_Contraction_1500_512_4_1_false_false                                               129µs ± 2%              125µs ± 1%   -3.25%          (p=0.008 n=5+5)
BM_Contraction_1500_512_4_2_false_false                                              79.3µs ±11%             74.2µs ± 6%     ~             (p=0.151 n=5+5)
BM_Contraction_1500_512_4_8_false_false                                              51.2µs ± 7%             52.5µs ±10%     ~             (p=0.690 n=5+5)
BM_Contraction_1500_512_4_18_false_false                                             63.6µs ± 5%             50.1µs ± 4%  -21.29%          (p=0.008 n=5+5)
BM_Contraction_1500_512_4_28_false_false                                             63.7µs ± 5%             52.8µs ± 5%  -17.15%          (p=0.008 n=5+5)
BM_Contraction_1500_512_4_36_false_false                                             62.6µs ± 2%             54.4µs ± 6%  -13.05%          (p=0.008 n=5+5)
BM_Contraction_4_512_1500_1_false_false                                               229µs ± 1%              224µs ± 2%   -2.43%          (p=0.032 n=5+5)
BM_Contraction_4_512_1500_2_false_false                                               127µs ± 4%              127µs ± 1%     ~             (p=0.548 n=5+5)
BM_Contraction_4_512_1500_8_false_false                                              85.6µs ± 6%             83.5µs ± 6%     ~             (p=0.548 n=5+5)
BM_Contraction_4_512_1500_18_false_false                                             98.8µs ±10%             83.9µs ± 5%  -15.01%          (p=0.008 n=5+5)
BM_Contraction_4_512_1500_28_false_false                                             96.7µs ±10%             86.4µs ± 4%  -10.69%          (p=0.008 n=5+5)
BM_Contraction_4_512_1500_36_false_false                                             95.3µs ± 7%             88.0µs ±10%     ~             (p=0.056 n=5+5)
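
To pick out the rows where the timing actually changed, the table can be parsed mechanically. The following is a minimal sketch, assuming the layout matches the rows above: a benchmark name, an optional bracketed label, old and new timings with a ± spread, a delta column, and a trailing (p=... n=a+b) group. A `~` in the delta column is read here as "no significant change reported", and a positive delta as the new build being slower. The names `ROW`, `parse_row`, and `sample` are hypothetical helpers, not part of any benchmark tool.

```python
import re

# One table row: name, optional [label], old time, new time, delta, (p=... n=a+b).
# Both the micro sign and the Greek mu are accepted in the unit.
ROW = re.compile(
    r"^(?P<name>\S+)\s+"
    r"(?:\[(?P<label>[^\]]*)\]\s+)?"
    r"(?P<old>[\d.]+)(?P<old_unit>ns|[µμ]s|ms|s)\s+±\s*\d+%\s+"
    r"(?P<new>[\d.]+)(?P<new_unit>ns|[µμ]s|ms|s)\s+±\s*\d+%\s+"
    r"(?P<delta>~|[+-][\d.]+%)\s+"
    r"\(p=(?P<p>[\d.]+)\s+n=\d+\+\d+\)\s*$"
)


def parse_row(line):
    """Return a dict for one table row, or None if the line is not a row."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    delta = m["delta"]
    return {
        "name": m["name"],
        "label": (m["label"] or "").strip(),
        "old": m["old"] + m["old_unit"],
        "new": m["new"] + m["new_unit"],
        # None means the delta column held "~" (no significant change reported).
        "delta_pct": None if delta == "~" else float(delta.rstrip("%")),
        "p": float(m["p"]),
    }


# Three rows copied from the table above.
sample = [
    "BM_Contraction_1_500_1500_18_false_false     [VecMat                   ]              109µs ± 3%               79µs ± 3%  -28.28%          (p=0.008 n=5+5)",
    "BM_Contraction_3_512_1500_18_false_false                                             75.8µs ± 3%             83.0µs ± 2%   +9.55%          (p=0.008 n=5+5)",
    "BM_Contraction_1500_500_1_1_false_false      [MatVec                   ]              117µs ± 1%              116µs ± 1%     ~             (p=0.056 n=5+5)",
]

rows = [r for r in map(parse_row, sample) if r is not None]
significant = [r for r in rows if r["delta_pct"] is not None]
# Positive delta: the new time is larger than the old one.
regressions = [r["name"] for r in significant if r["delta_pct"] > 0]
```

Feeding every line of the table through `parse_row` and sorting `significant` by `delta_pct` puts the largest speedups and slowdowns at the two ends of the list.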
