blaze / Vector and Matrix Customization


Custom Data Members

So far the Blaze library does not provide a lot of flexibility to customize the data members of existing vector and matrix types. However, to some extent it is possible to customize vectors and matrices by inheritance. The following example gives an impression of how to create a simple variation of CustomMatrix, which automatically takes care of acquiring and releasing custom memory.

template< typename Type                    // Data type of the matrix
        , bool SO = defaultStorageOrder >  // Storage order
class MyCustomMatrix
   : public CustomMatrix< Type, unaligned, unpadded, SO >
{
 public:
   explicit inline MyCustomMatrix( size_t m, size_t n )
      : CustomMatrix<Type,unaligned,unpadded,SO>()
      , array_( new Type[m*n] )
   {
      this->reset( array_.get(), m, n );
   }

 private:
   std::unique_ptr<Type[]> array_;
};

Please note that this is a simplified example with the intent to show the general approach. The number of constructors, the memory acquisition, and the kind of memory management can of course be adapted to specific requirements. Also, please note that since none of the Blaze vectors and matrices have virtual destructors, polymorphic destruction cannot be used.


Custom Operations

There are two approaches to extend Blaze with custom operations. First, the map() functions provide the possibility to execute componentwise custom operations on vectors and matrices. Second, it is possible to add customized free functions.

The map() Functions

Via the unary and binary map() functions it is possible to execute componentwise custom operations on vectors and matrices. The unary map() function can be used to apply a custom operation on each single element of a dense vector or matrix or each non-zero element of a sparse vector or matrix. For instance, the following example demonstrates a custom square root computation on a dense matrix:

blaze::DynamicMatrix<double> A, B;

B = map( A, []( double d ) { return std::sqrt( d ); } );

The binary map() function can be used to apply an operation pairwise to the elements of two dense vectors or two dense matrices. The following example demonstrates the merging of two matrices of double precision values into a matrix of double precision complex numbers:

blaze::DynamicMatrix<double> real{ { 2.1, -4.2 }, { 1.0,  0.6 } };
blaze::DynamicMatrix<double> imag{ { 0.3,  1.4 }, { 2.9, -3.4 } };

blaze::DynamicMatrix< complex<double> > cplx;

// Creating the matrix
//    ( ( 2.1,  0.3) (-4.2,  1.4) )
//    ( ( 1.0,  2.9) ( 0.6, -3.4) )
cplx = map( real, imag, []( double r, double i ){ return complex<double>( r, i ); } );

These examples demonstrate the most convenient way of defining a custom operation: passing a lambda to the map() function. Alternatively, it is possible to pass a custom functor:

struct Sqrt
{
   double operator()( double a ) const
   {
      return std::sqrt( a );
   }
};

B = map( A, Sqrt() );

In order for the functor to work in a call to map(), it must define a function call operator that accepts arguments of the element type of the corresponding vector or matrix.

Although the operation is automatically parallelized depending on the size of the vector or matrix, no automatic vectorization is possible. In order to enable vectorization, a load() function can be added to the functor, which handles the vectorized computation. Depending on the data type this function is passed one of the following Blaze SIMD data types:

  • SIMD data types for fundamental data types
    • blaze::SIMDint8: Packed SIMD type for 8-bit signed integral data types
    • blaze::SIMDuint8: Packed SIMD type for 8-bit unsigned integral data types
    • blaze::SIMDint16: Packed SIMD type for 16-bit signed integral data types
    • blaze::SIMDuint16: Packed SIMD type for 16-bit unsigned integral data types
    • blaze::SIMDint32: Packed SIMD type for 32-bit signed integral data types
    • blaze::SIMDuint32: Packed SIMD type for 32-bit unsigned integral data types
    • blaze::SIMDint64: Packed SIMD type for 64-bit signed integral data types
    • blaze::SIMDuint64: Packed SIMD type for 64-bit unsigned integral data types
    • blaze::SIMDfloat: Packed SIMD type for single precision floating point data
    • blaze::SIMDdouble: Packed SIMD type for double precision floating point data
  • SIMD data types for complex data types
    • blaze::SIMDcint8: Packed SIMD type for complex 8-bit signed integral data types
    • blaze::SIMDcuint8: Packed SIMD type for complex 8-bit unsigned integral data types
    • blaze::SIMDcint16: Packed SIMD type for complex 16-bit signed integral data types
    • blaze::SIMDcuint16: Packed SIMD type for complex 16-bit unsigned integral data types
    • blaze::SIMDcint32: Packed SIMD type for complex 32-bit signed integral data types
    • blaze::SIMDcuint32: Packed SIMD type for complex 32-bit unsigned integral data types
    • blaze::SIMDcint64: Packed SIMD type for complex 64-bit signed integral data types
    • blaze::SIMDcuint64: Packed SIMD type for complex 64-bit unsigned integral data types
    • blaze::SIMDcfloat: Packed SIMD type for complex single precision floating point data
    • blaze::SIMDcdouble: Packed SIMD type for complex double precision floating point data

All SIMD types provide the value data member for direct access to the underlying intrinsic data element. In the following example, this intrinsic element is passed to the AVX function _mm256_sqrt_pd():

struct Sqrt
{
   double operator()( double a ) const
   {
      return std::sqrt( a );
   }

   SIMDdouble load( const SIMDdouble& a ) const
   {
      return _mm256_sqrt_pd( a.value );
   }
};

In this example, whenever vectorization is generally applicable, the load() function is called instead of the function call operator as long as the number of remaining elements is greater than or equal to the width of the packed SIMD type. In all other cases (which also includes peel-off and remainder loops) the scalar operation is used.

Please note that this example has two drawbacks: First, it will only compile in case the intrinsic _mm256_sqrt_pd() function is available (i.e. when AVX is active). Second, the availability of AVX is not taken into account. The first drawback can be alleviated by making the load() function a function template. The second drawback can be dealt with by adding a simdEnabled() function template to the functor:

struct Sqrt
{
   double operator()( double a ) const
   {
      return std::sqrt( a );
   }

   template< typename T >
   T load( const T& a ) const
   {
      return _mm256_sqrt_pd( a.value );
   }

   template< typename T >
   static constexpr bool simdEnabled() {
#if defined(__AVX__)
      return true;
#else
      return false;
#endif
   }
};

The simdEnabled() function must be a static, constexpr function and must return whether or not vectorization is available for the given data type T. If the function returns true, the load() function is used for a vectorized evaluation; if it returns false, load() is neither called nor instantiated.

By default the map() function uses peel-off and remainder loops if the number of elements is not a multiple of the width of the packed SIMD type. However, all dense vector and matrix types in Blaze provide padding as an optimization. In case the custom operation preserves the value zero of the padding elements, it is possible to omit the peel-off and remainder loops and include the padding elements in the computation, thereby increasing performance. For that purpose the paddingEnabled() function can be added to the functor:

struct Sqrt
{
   // ...

   static constexpr bool paddingEnabled() { return true; }
};

The paddingEnabled() function must also be a static, constexpr function and must return whether padding elements can be used in the custom operation. If the function returns true, the padding elements are used during a vectorized operation; if it returns false, they are not.

Note that this is a simplified example that only works for dense vectors and matrices with double precision floating point elements. The following code shows the complete implementation of the corresponding functor as it is used within the Blaze library. The Blaze Sqrt functor works for all data types that provide a square root operation:

namespace blaze {

struct Sqrt
{
   template< typename T >
   BLAZE_ALWAYS_INLINE auto operator()( const T& a ) const
   {
      return sqrt( a );
   }

   template< typename T >
   static constexpr bool simdEnabled() { return HasSIMDSqrt<T>::value; }

   static constexpr bool paddingEnabled() { return true; }

   template< typename T >
   BLAZE_ALWAYS_INLINE auto load( const T& a ) const
   {
      BLAZE_CONSTRAINT_MUST_BE_SIMD_PACK( T );
      return sqrt( a );
   }
};

} // namespace blaze

The same approach can be taken for binary custom operations. The following code demonstrates the Min functor of the Blaze library, which works for all data types that provide a min() operation:

struct Min
{
   explicit inline Min()
   {}

   template< typename T1, typename T2 >
   BLAZE_ALWAYS_INLINE decltype(auto) operator()( const T1& a, const T2& b ) const
   {
      return min( a, b );
   }

   template< typename T1, typename T2 >
   static constexpr bool simdEnabled() { return HasSIMDMin<T1,T2>::value; }

   static constexpr bool paddingEnabled() { return true; }

   template< typename T1, typename T2 >
   BLAZE_ALWAYS_INLINE decltype(auto) load( const T1& a, const T2& b ) const
   {
      BLAZE_CONSTRAINT_MUST_BE_SIMD_PACK( T1 );
      BLAZE_CONSTRAINT_MUST_BE_SIMD_PACK( T2 );
      return min( a, b );
   }
};

For more information on the available Blaze SIMD data types and functions, please see the SIMD module in the complete Blaze documentation.

Free Functions

In order to extend Blaze with new functionality it is possible to add free functions. Free functions can be used either as wrappers around calls to the map() function or to implement general, non-componentwise operations. The following two examples will demonstrate both ideas.

The first example shows the setToZero() function, which resets a sparse matrix to zero without affecting the sparsity pattern. It is implemented as a convenience wrapper around the map() function:

template< typename MT  // Type of the sparse matrix
        , bool SO >    // Storage order
void setToZero( blaze::SparseMatrix<MT,SO>& mat )
{
   (~mat) = blaze::map( ~mat, []( const auto& value ){ return decltype(value){}; } );
}

The blaze::SparseMatrix class template is the base class for all kinds of sparse matrices and provides an abstraction from the actual type MT of the sparse matrix. However, due to the Curiously Recurring Template Pattern (CRTP) it also enables a conversion back to the actual type. This downcast is performed via the tilde operator (i.e. operator~()). The template parameter SO represents the storage order (blaze::rowMajor or blaze::columnMajor) of the matrix.

The second example shows the countZeros() function, which counts the number of values in a dense, row-major matrix that are exactly zero:

template< typename MT >
size_t countZeros( blaze::DenseMatrix<MT,rowMajor>& mat )
{
   const size_t M( (~mat).rows() );
   const size_t N( (~mat).columns() );
   size_t count( 0UL );

   for( size_t i=0UL; i<M; ++i ) {
      for( size_t j=0UL; j<N; ++j ) {
         if( blaze::isDefault<blaze::strict>( (~mat)(i,j) ) )
            ++count;
      }
   }

   return count;
}

The blaze::DenseMatrix class template is the base class for all kinds of dense matrices. Again, it is possible to perform the conversion to the actual type via the tilde operator.

The following two listings show the declarations of all vector and matrix base classes, which can be used for custom free functions:

template< typename VT  // Concrete type of the dense or sparse vector
        , bool TF >    // Transpose flag (blaze::columnVector or blaze::rowVector)
class Vector;

template< typename VT  // Concrete type of the dense vector
        , bool TF >    // Transpose flag (blaze::columnVector or blaze::rowVector)
class DenseVector;

template< typename VT  // Concrete type of the sparse vector
        , bool TF >    // Transpose flag (blaze::columnVector or blaze::rowVector)
class SparseVector;

template< typename MT  // Concrete type of the dense or sparse matrix
        , bool SO >    // Storage order (blaze::rowMajor or blaze::columnMajor)
class Matrix;

template< typename MT  // Concrete type of the dense matrix
        , bool SO >    // Storage order (blaze::rowMajor or blaze::columnMajor)
class DenseMatrix;

template< typename MT  // Concrete type of the sparse matrix
        , bool SO >    // Storage order (blaze::rowMajor or blaze::columnMajor)
class SparseMatrix;

Custom Data Types

Introduction

The Blaze library is not restricted to integral, floating point and complex data types (called numeric types in Blaze), but it supports custom data types. For instance, the following example demonstrates that it is possible to use std::string as data type:

blaze::DynamicVector<std::string> a{ "Hello, ", "Blaze " , "Expression" };
blaze::DynamicVector<std::string> b{ "World"  , "Library", " Templates" };

const auto c( evaluate( a + b ) );
std::cout <<  "c =\n" << c << "\n\n";

const std::string maxString( max( c ) );
std::cout << "maxString = " << std::quoted(maxString) << "\n";

Output:

c =
( Hello, World )
( Blaze Library )
( Expression Templates )

maxString = "Hello, World"

Blaze tries hard to make the use of custom data types as convenient, easy and intuitive as possible. In order to work flawlessly with Blaze, custom data types are required to provide a certain interface (depending on the operations that the type is used for). The following sections give an overview of the necessary steps to enable the use of the hypothetical custom data type custom::double_t for vector and matrix operations.

namespace custom {

struct double_t
{
   constexpr double_t() = default;
   constexpr double_t( double i ) : value( i ) {}
   double value{};
};

} // namespace custom

Arithmetic Operations

The Blaze library assumes that a custom data type provides operator<<() for streaming, operator+=() and operator+() for additions (which for instance includes additions inside matrix/vector multiplications, matrix/matrix multiplications, reduction or norm operations), operator-=() and operator-() for subtractions, operator*=() and operator*() for multiplications and operator/=() and operator/() for divisions:

namespace custom {

constexpr double_t& operator+=( double_t& lhs, double_t rhs ) noexcept { lhs.value += rhs.value; return lhs; }
constexpr double_t& operator-=( double_t& lhs, double_t rhs ) noexcept { lhs.value -= rhs.value; return lhs; }
constexpr double_t& operator*=( double_t& lhs, double_t rhs ) noexcept { lhs.value *= rhs.value; return lhs; }
constexpr double_t& operator/=( double_t& lhs, double_t rhs ) noexcept { lhs.value /= rhs.value; return lhs; }

constexpr double_t operator+( double_t lhs, double_t rhs ) noexcept { return double_t{ lhs.value + rhs.value }; }
constexpr double_t operator-( double_t lhs, double_t rhs ) noexcept { return double_t{ lhs.value - rhs.value }; }
constexpr double_t operator*( double_t lhs, double_t rhs ) noexcept { return double_t{ lhs.value * rhs.value }; }
constexpr double_t operator/( double_t lhs, double_t rhs ) noexcept { return double_t{ lhs.value / rhs.value }; }

inline std::ostream& operator<<( std::ostream& os, double_t d )
{
   return os << d.value;
}

} // namespace custom

Example:

int main()
{
   blaze::DynamicVector<custom::double_t> a{ 1.0, 2.0, 3.0, 4.0 };
   blaze::DynamicVector<custom::double_t> b{ 0.1, 0.2, 0.3, 0.4 };

   std::cout << "a + b =\n" << ( a + b ) << "\n";
   std::cout << "a * b =\n" << ( a * b ) << "\n";

   std::cout << "sum(a) = " << sum(a) << "\n"
             << "prod(a) = " << prod(a) << "\n";
}

Output:

a + b =
(         1.1 )
(         2.2 )
(         3.3 )
(         4.4 )

a * b =
(         0.1 )
(         0.4 )
(         0.9 )
(         1.6 )

sum(a) = 10
prod(a) = 24

Note that similar steps are necessary if several custom data types are combined (for instance custom::double_t and custom::float_t). In this case, both permutations need to be taken into account:

custom::double_t operator+( const custom::double_t& a, const custom::float_t& b );
custom::double_t operator+( const custom::float_t& a, const custom::double_t& b );
// ...

Please note that only built-in data types are eligible for vectorization; custom data types therefore cannot achieve maximum performance!

Relational Operations

In order to compare elements of the custom type, Blaze expects the equality operator (i.e. operator==()) and the inequality operator (i.e. operator!=()). Alternatively, it is possible to provide an equal() function, which distinguishes between strict and relaxed comparison:

namespace custom {

constexpr bool operator==( double_t lhs, double_t rhs ) noexcept { return lhs.value == rhs.value; }
constexpr bool operator!=( double_t lhs, double_t rhs ) noexcept { return !( lhs == rhs ); }

template< blaze::RelaxationFlag RF >
constexpr bool equal( double_t lhs, double_t rhs ) noexcept { return blaze::equal<RF>( lhs.value, rhs.value ); }

} // namespace custom

Example:

int main()
{
   blaze::DynamicVector<custom::double_t> a{ 1.0, 2.0, 3.0, 4.0 };
   blaze::DynamicVector<custom::double_t> b{ 0.1, 0.2, 0.3, 0.4 };

   std::cout << "a == b: " << ( a == b ) << "\n"
             << "a != b: " << ( a != b ) << "\n";
}

Output:

a == b: 0
a != b: 1

Elementwise Operations

For the different kinds of elementwise operations on vectors and matrices (abs(), sin(), cos(), sqrt(), log(), exp(), min(), max(), ...), the custom type is required to provide the corresponding function overload. Note that the sqrt() operation may also be required for several norm computations. Also, for any inversion operation, the type is required to support the inv() function:

namespace custom {

inline    double_t abs ( double_t d ) noexcept { return double_t{ std::abs ( d.value ) }; }
inline    double_t sin ( double_t d ) noexcept { return double_t{ std::sin ( d.value ) }; }
inline    double_t cos ( double_t d ) noexcept { return double_t{ std::cos ( d.value ) }; }
inline    double_t sqrt( double_t d ) noexcept { return double_t{ std::sqrt( d.value ) }; }
inline    double_t log ( double_t d ) noexcept { return double_t{ std::log ( d.value ) }; }
inline    double_t exp ( double_t d ) noexcept { return double_t{ std::exp ( d.value ) }; }
constexpr double_t inv ( double_t d ) noexcept { return double_t{ 1.0/d.value }; }

constexpr double_t min( double_t lhs, double_t rhs ) noexcept { return double_t{ blaze::min( lhs.value, rhs.value ) }; }
constexpr double_t max( double_t lhs, double_t rhs ) noexcept { return double_t{ blaze::max( lhs.value, rhs.value ) }; }

} // namespace custom

Example:

int main()
{
   blaze::DynamicVector<custom::double_t> a{ 1.0, 2.0, 3.0, 4.0 };
   blaze::DynamicVector<custom::double_t> b{ 0.1, 0.2, 0.3, 0.4 };

   std::cout << "abs(a) =\n" << abs(a) << "\n";
   std::cout << "sin(a) =\n" << sin(a) << "\n";
   std::cout << "cos(a) =\n" << cos(a) << "\n";
   std::cout << "sqrt(a) =\n" << sqrt(a) << "\n";
   std::cout << "log(a) =\n" << log(a) << "\n";
   std::cout << "exp(a) =\n" << exp(a) << "\n\n";
   std::cout << "min(a) =\n" << min(a) <<  "\n";
   std::cout << "max(a) =\n" << max(a) << "\n\n";
   std::cout << "min(a,b) =\n" << min(a,b) << "\n";
   std::cout << "max(a,b) =\n" << max(a,b) << "\n";
   std::cout << "norm(a) = " << norm(a) << "\n";
}

Output:

abs(a) =
(           1 )
(           2 )
(           3 )
(           4 )

sin(a) =
(    0.841471 )
(    0.909297 )
(     0.14112 )
(   -0.756802 )

cos(a) =
(    0.540302 )
(   -0.416147 )
(   -0.989992 )
(   -0.653644 )

sqrt(a) =
(           1 )
(     1.41421 )
(     1.73205 )
(           2 )

log(a) =
(           0 )
(    0.693147 )
(     1.09861 )
(     1.38629 )

exp(a) =
(     2.71828 )
(     7.38906 )
(     20.0855 )
(     54.5982 )

min(a) = 1
max(a) = 4

min(a,b) =
(         0.1 )
(         0.2 )
(         0.3 )
(         0.4 )

max(a,b) =
(           1 )
(           2 )
(           3 )
(           4 )

norm(a) = 5.47723

Adaptors

If the custom data type is used in the context of the HermitianMatrix, UniLowerMatrix, or UniUpperMatrix adaptors, it will be necessary to provide overloads of the isZero(), isOne(), and isReal() functions:

namespace custom {

template< blaze::RelaxationFlag RF >
constexpr bool isZero( double_t d ) { return blaze::isZero<RF>( d.value ); }

template< blaze::RelaxationFlag RF >
constexpr bool isOne ( double_t d ) { return blaze::isOne<RF> ( d.value ); }

template< blaze::RelaxationFlag RF >
constexpr bool isReal( double_t ) { return true; }

} // namespace custom

Example:

int main()
{
   blaze::UniLowerMatrix< blaze::DynamicMatrix<custom::double_t> > L
      { { 1.0, 0.0, 0.0 },
        { 2.0, 1.0, 0.0 },
        { 3.0, 4.0, 1.0 } };

   blaze::UniUpperMatrix< blaze::DynamicMatrix<custom::double_t> > U
      { { 1.0, 2.0, 3.0 },
        { 0.0, 1.0, 4.0 },
        { 0.0, 0.0, 1.0 } };

   const auto A( evaluate( L * U ) );
   std::cout << "A =\n" << A << "\n";
}

Output:

A =
(            1            2            3 )
(            2            5           10 )
(            3           10           26 )

Previous: Configuration Files ---- Next: Grouping / Tagging
