Replace insert/extract in scalar emulation with load/store

Issue #14 new
edanor repo owner created an issue

Because we want to explicitly specialize only those operations that can be expressed in a specific instruction set, all unsupported operations should fall back to scalar emulation.

Most emulated functions use insert/extract on a per-element basis. Per-element insert/extract is slow on non-emulated vectors. Using load/store on emulated data types should not cause any slow-down, because the compiler should be able to optimize it away; even if it cannot, the change will still reduce the slow-down in code for the target vector types.

This proposal is to re-write scalar emulation functions with LOAD/STORE instead of INSERT/EXTRACT.
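
A minimal sketch of the two emulation styles, to make the difference concrete. `ToyVec4f`, `emulated_add_insert_extract` and `emulated_add_load_store` are hypothetical names made up for illustration and are not the library's actual types or functions; the sketch only assumes that a vector type exposes `insert`, `extract`, `load` and `store`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 4-lane float vector standing in for a real SIMD type.
struct ToyVec4f {
    static constexpr std::size_t length = 4;
    float lanes[length];

    // Per-element access (the slow path on real SIMD registers).
    float extract(std::size_t i) const { assert(i < length); return lanes[i]; }
    void insert(std::size_t i, float v) { assert(i < length); lanes[i] = v; }

    // Whole-vector transfer to and from memory.
    void load(const float* p) {
        for (std::size_t i = 0; i < length; ++i) lanes[i] = p[i];
    }
    void store(float* p) const {
        for (std::size_t i = 0; i < length; ++i) p[i] = lanes[i];
    }
};

// Current style: one extract per operand and one insert per element.
template <typename Vec>
Vec emulated_add_insert_extract(const Vec& a, const Vec& b) {
    Vec r{};
    for (std::size_t i = 0; i < Vec::length; ++i)
        r.insert(i, a.extract(i) + b.extract(i));
    return r;
}

// Proposed style: store the operands once, work on plain arrays, load once.
template <typename Vec>
Vec emulated_add_load_store(const Vec& a, const Vec& b) {
    float ta[Vec::length];
    float tb[Vec::length];
    float tr[Vec::length];
    a.store(ta);
    b.store(tb);
    for (std::size_t i = 0; i < Vec::length; ++i)
        tr[i] = ta[i] + tb[i];
    Vec r{};
    r.load(tr);
    return r;
}
```

With the load/store variant the number of vector accesses per emulated operation stays fixed (two stores and one load here) instead of growing with the vector length.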
