This update will allow me to overload smpAssign() so that the assignment is dispatched to cudaAssign(). That change is required to make sure operator=() eventually calls cudaAssign() for all CUDA-assignable structures (CUDADynamicMatrix, CUDADynamicVector, and also views on these structures).
Thanks a lot for the pull request. The help is highly appreciated. However, it appears as if there is a simpler solution based on the already existing CRTP inheritance hierarchy.
The following shows the inheritance/abstraction hierarchy for vectors:
As soon as both functions — the library smpAssign() in Blaze and the smpAssign() added by the extension — are visible, the compiler will always pick the second one. This, however, might not be desired: some vectors (e.g. DynamicVector) should still bind to the library function in order to use a CPU backend. For this reason, the second smpAssign() function can be constrained:
```cpp
template< typename VT1, bool TF1, typename VT2, bool TF2 >
auto smpAssign( DenseVector<VT1,TF1>&, const DenseVector<VT2,TF2>& )  // smpAssign() in Blaze_CUDA
   -> EnableIf_t< IsCUDAAssignable_v<VT1> && IsCUDAAssignable_v<VT2> >;
```

Now the CUDA-specific backend will only be called for CUDA types; the library function is called for all other vector types. This logic can be applied to views as well by specialising the IsCUDAAssignable type trait accordingly.
This discussion ignores namespaces. By introducing an additional namespace it becomes even simpler to explicitly steer which function is called, since unqualified calls prefer functions in the same namespace as the corresponding argument types (argument-dependent lookup).
In summary: from this perspective the library function doesn't have to be constrained, since it is the most general fallback. Every more specific function can replace the library function. By constraining the more specific function (i.e. the function in the extension!) it is possible to steer which function is called.