fix sqrt() for AVX512, updates to Vectors thorn

This pull request mostly add better support for the AVX512 instructions found in KNL and modern Intel Xeon / Core iN CPUs.

There is one bugfix commit “Correct error in AVX512 sqrt implemention” which is important for those architectures (eg to compute sqrt(detg)).

I am marking this as critical until I can verify that without this fix the result of a BBH simulation on eg Stampede2 is not incorrect because of using an incorrect sqrt() function.