Introduce custom matrices for existing chunks of memory into Blaze

Issue #24 resolved
Klaus Iglberger created an issue

Description

In many applications users have to allocate and manage matrix-like data structures manually. It would be very convenient to interface these directly with Blaze. Currently, however, such chunks of memory cannot be used directly within Blaze; they have to be copied into Blaze data structures (such as DynamicMatrix). The Blaze library should provide an API for constructing matrices that represent these chunks of memory, so that linear algebra computations can be performed on them as if they were native Blaze data structures.

Conceptual Example

using blaze::aligned;    // Alignment flag for aligned matrices
using blaze::unaligned;  // Alignment flag for unaligned matrices
using blaze::padded;     // Flag for padded matrices
using blaze::unpadded;   // Flag for unpadded matrices
using blaze::rowMajor;   // Flag for row-major matrices

int* unalignedArray = new int[rows*columns];
int* alignedArray = blaze::allocate<int>( rows*columns + padding );

// Instantiate a matrix to represent the first array of integer elements. The template
// parameters specify that the array is neither specifically aligned nor padded.
// Note that the matrix will NOT take ownership of the memory resource!
blaze::CustomMatrix<int,unaligned,unpadded,rowMajor> A( unalignedArray, rows, columns );

// Instantiate a matrix to represent the second array of integer elements. Via template
// parameter it is specified that the array is both aligned and padded. Also, the matrix
// takes ownership and uses the given deleter to dispose of the memory resource.
blaze::CustomMatrix<int,aligned,padded,rowMajor> B( alignedArray, rows, columns, rows*columns + padding, blaze::Deallocate() );

// Instantiate a native Blaze data structure
blaze::DynamicMatrix<int,rowMajor> C;

// Performing a matrix addition
C = A + B;

Tasks

  • design an API that allows chunks of memory to be used as native Blaze matrices
  • design the API such that the user can pass ownership of a chunk of memory
  • implement the necessary functionality
  • provide full documentation of the feature
  • ensure compatibility with all existing vector and matrix classes
  • ensure compatibility with all existing vector and matrix expressions
  • guarantee maximum performance for all operations
  • add the necessary number of test cases for the entire functionality

Comments (3)

  1. Klaus Iglberger reporter

    Summary

    The feature has been implemented, tested, optimized (including vectorization and parallelization) and documented as required. It is immediately available via cloning the Blaze repository and will be officially released in Blaze 2.6.

    The CustomMatrix Class Template

    The blaze::CustomMatrix class template provides the functionality to represent an external array of elements of arbitrary type and a fixed size as a native Blaze dense matrix data structure. Thus, in contrast to all other dense matrix types, a custom matrix does not perform any kind of memory allocation by itself, but is provided with an existing array of elements during construction. A custom matrix can therefore be considered an alias to the existing array. It can be included via the header file

    #include <blaze/math/CustomMatrix.h>
    

    The type of the elements, the properties of the given array of elements and the storage order of the matrix can be specified via the following four template parameters:

    template< typename Type, bool AF, bool PF, bool SO >
    class CustomMatrix;
    
    • Type : specifies the type of the matrix elements. blaze::CustomMatrix can be used with any non-cv-qualified, non-reference, non-pointer element type.
    • AF : specifies whether the represented, external arrays are properly aligned with respect to the available instruction set (SSE, AVX, ...) or not.
    • PF : specifies whether the represented, external arrays are properly padded with respect to the available instruction set (SSE, AVX, ...) or not.
    • SO : specifies the storage order (blaze::rowMajor, blaze::columnMajor) of the matrix. The default value is blaze::rowMajor.
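
    To illustrate how the storage order interacts with padded rows or columns, the following stand-alone sketch computes where element (i,j) sits in the external array. It does not use Blaze: the helper names and the 'spacing' parameter are hypothetical, and it assumes the common layout in which padding elements follow each row (row-major) or each column (column-major).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers for illustration only; they are not part of Blaze.
// 'spacing' is the distance (in elements, padding included) between the
// start of two consecutive rows (row-major) or columns (column-major).

// Position of element (i,j) in a row-major array.
constexpr std::size_t rowMajorIndex( std::size_t i, std::size_t j, std::size_t spacing )
{
   return i*spacing + j;
}

// Position of element (i,j) in a column-major array.
constexpr std::size_t columnMajorIndex( std::size_t i, std::size_t j, std::size_t spacing )
{
   return i + j*spacing;
}

// Minimum array length: number of rows (row-major) or columns (column-major)
// multiplied by the spacing.
constexpr std::size_t requiredElements( std::size_t lines, std::size_t spacing )
{
   return lines*spacing;
}

static_assert( rowMajorIndex( 1, 2, 4 ) == 6, "second row starts at offset 4" );
static_assert( columnMajorIndex( 1, 2, 4 ) == 9, "third column starts at offset 8" );
```

    Under these assumptions a 3x3 row-major matrix with a spacing of 4 occupies 3*4 = 12 array elements, three of which are padding.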

    The blaze::CustomMatrix is the right choice if any external array needs to be represented as a Blaze dense matrix data structure or if a custom memory allocation strategy needs to be realized:

    using blaze::CustomMatrix;
    using blaze::aligned;
    using blaze::unaligned;
    using blaze::padded;
    using blaze::unpadded;
    using blaze::rowMajor;
    using blaze::columnMajor;
    using std::complex;
    
    // Definition of an unmanaged 3x4 custom matrix for unaligned, unpadded integer arrays
    typedef CustomMatrix<int,unaligned,unpadded,rowMajor>  UnalignedUnpadded;
    std::vector<int> vec( 12UL );
    UnalignedUnpadded A( &vec[0], 3UL, 4UL );
    
    // Definition of a managed 5x6 custom matrix for unaligned but padded 'float' arrays
    // (column-major: 6 columns of 8 elements each, i.e. 48 elements)
    typedef CustomMatrix<float,unaligned,padded,columnMajor>  UnalignedPadded;
    UnalignedPadded B( new float[48], 5UL, 6UL, 8UL, blaze::ArrayDelete() );
    
    // Definition of a managed 12x13 custom matrix for aligned, unpadded 'double' arrays
    typedef CustomMatrix<double,aligned,unpadded,rowMajor>  AlignedUnpadded;
    AlignedUnpadded C( blaze::allocate<double>( 192UL ), 12UL, 13UL, 16UL, blaze::Deallocate() );
    
    // Definition of a 7x14 custom matrix for aligned, padded 'complex<double>' arrays
    // (column-major: 14 columns of 16 elements each, i.e. 224 elements)
    typedef CustomMatrix<complex<double>,aligned,padded,columnMajor>  AlignedPadded;
    AlignedPadded D( blaze::allocate< complex<double> >( 224UL ), 7UL, 14UL, 16UL, blaze::Deallocate() );
    

    Special Properties of Custom Matrices

    In comparison with the remaining Blaze dense matrix types blaze::CustomMatrix has several special characteristics. All of these result from the fact that a custom matrix is not performing any kind of memory allocation, but instead is given an existing array of elements. The following sections discuss all of these characteristics:

    • Memory Management
    • Copy Operations
    • Alignment
    • Padding

    Memory Management

    The blaze::CustomMatrix class template acts as an adaptor for an existing array of elements. As such it provides everything that is required to use the array just like a native Blaze dense matrix data structure. However, this flexibility comes at the price that the user of a custom matrix is responsible for the resource management.

    When constructing a custom matrix there are two choices: Either a user manually manages the array of elements outside the custom matrix, or alternatively passes the responsibility for the memory management to an instance of CustomMatrix. In the second case the CustomMatrix class employs shared ownership between all copies of the custom matrix, which reference the same array.

    The following examples give an impression of several possible types of custom matrices:

    using blaze::CustomMatrix;
    using blaze::ArrayDelete;
    using blaze::Deallocate;
    using blaze::allocate;
    using blaze::aligned;
    using blaze::unaligned;
    using blaze::padded;
    using blaze::unpadded;
    using blaze::rowMajor;
    using blaze::columnMajor;
    
    // Definition of a 3x4 custom row-major matrix with unaligned, unpadded and externally
    // managed integer array. Note that the std::vector must be guaranteed to outlive the
    // custom matrix!
    std::vector<int> vec( 12UL );
    CustomMatrix<int,unaligned,unpadded> A( &vec[0], 3UL, 4UL );
    
    // Definition of a 3x4 custom row-major matrix for unaligned, unpadded integer arrays.
    // The responsibility for the memory management is passed to the custom matrix by
    // providing a deleter of type 'blaze::ArrayDelete' that is used during the destruction
    // of the custom matrix.
    CustomMatrix<int,unaligned,unpadded,rowMajor> B( new int[12], 3UL, 4UL, ArrayDelete() );
    
    // Definition of a custom 8x12 matrix with capacity 128 for an aligned and padded
    // integer array. The memory management is passed to the custom matrix by providing a
    // deleter of type 'blaze::Deallocate'.
    CustomMatrix<int,aligned,padded> C( allocate<int>( 128UL ), 8UL, 12UL, 16UL, Deallocate() );
    

    It is possible to pass any type of deleter to the constructor. The deleter is only required to provide a function call operator that can be passed the pointer to the managed array. As an example the following code snippet shows the implementation of the two native Blaze deleters blaze::ArrayDelete and blaze::Deallocate:

    namespace blaze {
    
    struct ArrayDelete
    {
       template< typename Type >
       inline void operator()( Type ptr ) const { boost::checked_array_delete( ptr ); }
    };
    
    struct Deallocate
    {
       template< typename Type >
       inline void operator()( Type ptr ) const { deallocate( ptr ); }
    };
    
    } // namespace blaze
    
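
    The shared ownership described above can be pictured with the standard library. The sketch below is not Blaze code and makes no claim about how CustomMatrix is implemented; it models two owners of one array with std::shared_ptr and a hypothetical user-defined deleter, CountingDelete, that records how often it is invoked.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical deleter for illustration: like the deleters shown above it only
// needs a function call operator that accepts the pointer to the managed array.
struct CountingDelete
{
   int* counter;  // Incremented each time the deleter runs

   void operator()( int* ptr ) const
   {
      ++(*counter);
      delete[] ptr;
   }
};

// Returns the number of deleter invocations after both owners are destroyed.
inline int sharedOwnershipDemo()
{
   int calls = 0;
   {
      std::shared_ptr<int> first( new int[12], CountingDelete{ &calls } );
      {
         std::shared_ptr<int> second( first );  // Second owner of the same array
         assert( first.use_count() == 2 );
      }
      assert( calls == 0 );  // One owner is still alive, the array is untouched
   }
   return calls;  // The last owner has released the array exactly once
}
```

    In this model the array is released exactly once, when the last owner is destroyed, which matches the behaviour described for managed custom matrices.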

    Copy Operations

    As with all dense matrices it is possible to copy construct a custom matrix:

    using blaze::CustomMatrix;
    using blaze::unaligned;
    using blaze::unpadded;
    
    typedef CustomMatrix<int,unaligned,unpadded>  CustomType;
    
    std::vector<int> vec( 6UL, 10 );    // Vector of 6 integers of the value 10
    CustomType A( &vec[0], 2UL, 3UL );  // Represent the std::vector as Blaze dense matrix
    A(0,1) = 20;                        // Also modifies the std::vector
    
    CustomType B( A );  // Creating a copy of matrix A
    B(0,2) = 20;        // Also affects matrix A and the std::vector
    

    It is important to note that a custom matrix acts as a reference to the specified array. Thus the result of the copy constructor is a new custom matrix that is referencing and representing the same array as the original custom matrix. In case a deleter has been provided to the first custom matrix, both matrices share the responsibility to destroy the array when the last matrix goes out of scope.

    In contrast to copy construction, just as with references, copy assignment does not change which array is referenced by the custom matrices, but modifies the values of the array:

    std::vector<int> vec2( 6UL, 4 );     // Vector of 6 integers of the value 4
    CustomType C( &vec2[0], 2UL, 3UL );  // Represent the std::vector as Blaze dense matrix
    
    A = C;  // Copy assignment: Set all values of matrix A and B to 4.
    
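
    The reference-like copy semantics can be reproduced with a small stand-alone model. ArrayView below is a hypothetical, non-owning one-dimensional view written only for illustration; it is not Blaze's implementation. Copy construction aliases the same array, while copy assignment keeps the referenced array and overwrites its values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view used to model the copy semantics; not part of Blaze.
class ArrayView
{
 public:
   ArrayView( int* data, std::size_t size ) : data_( data ), size_( size ) {}

   // Copy construction: the new view refers to the same array.
   ArrayView( const ArrayView& ) = default;

   // Copy assignment: the referenced array stays the same, its values change.
   ArrayView& operator=( const ArrayView& rhs )
   {
      assert( size_ == rhs.size_ );
      for( std::size_t i=0UL; i<size_; ++i )
         data_[i] = rhs.data_[i];
      return *this;
   }

   int& operator[]( std::size_t i ) { return data_[i]; }
   const int* data() const { return data_; }

 private:
   int*        data_;
   std::size_t size_;
};

// Returns true if the model behaves as described in the text.
inline bool copySemanticsDemo()
{
   std::vector<int> vec( 6UL, 10 ), vec2( 6UL, 4 );

   ArrayView a( vec.data(), vec.size() );
   ArrayView b( a );  // Alias of the same array
   b[2] = 20;         // Visible through a and in vec
   const bool aliased = ( a[2] == 20 && vec[2] == 20 );

   ArrayView c( vec2.data(), vec2.size() );
   a = c;             // Copies the values of vec2 into vec

   return aliased && a.data() == vec.data() && b[2] == 4 && vec[0] == 4;
}
```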

    Alignment

    In case the custom matrix is specified as aligned the passed array must adhere to some alignment restrictions based on the alignment requirements of the used data type and the used instruction set (SSE, AVX, ...). The restriction applies to the first element of each row/column: In case of a row-major matrix the first element of each row must be properly aligned, in case of a column-major matrix the first element of each column must be properly aligned. For instance, if a row-major matrix is used and AVX is active the first element of each row must be 32-byte aligned:

    using blaze::CustomMatrix;
    using blaze::Deallocate;
    using blaze::aligned;
    using blaze::padded;
    using blaze::rowMajor;
    
    int* array = blaze::allocate<int>( 40UL );  // Is guaranteed to be 32-byte aligned
    CustomMatrix<int,aligned,padded,rowMajor> A( array, 5UL, 6UL, 8UL, Deallocate() );
    

    In the example, the row-major matrix has six columns. However, since with AVX eight integer values are loaded together, the matrix is padded with two additional elements per row. This guarantees that the first element of each row is 32-byte aligned. In case the alignment requirements are violated, a std::invalid_argument exception is thrown.
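
    The per-row alignment condition can be checked with plain pointer arithmetic. The following sketch is independent of Blaze; isAligned and rowsAligned are hypothetical helper names, and a 4-byte int together with a 32-byte (AVX-sized) alignment is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

static_assert( sizeof(int) == 4, "The sketch assumes a 4-byte int" );

// Hypothetical helper: true if 'ptr' is a multiple of 'alignment' bytes.
inline bool isAligned( const void* ptr, std::size_t alignment )
{
   return reinterpret_cast<std::uintptr_t>( ptr ) % alignment == 0U;
}

// Hypothetical helper: true if the first element of every row is aligned.
// 'spacing' is the number of elements per row including padding.
template< typename T >
bool rowsAligned( const T* data, std::size_t rows, std::size_t spacing, std::size_t alignment )
{
   for( std::size_t i=0UL; i<rows; ++i ) {
      if( !isAligned( data + i*spacing, alignment ) )
         return false;
   }
   return true;
}

inline bool alignmentDemo()
{
   alignas(32) static int buffer[40];  // Room for 5 rows of 8 ints (32 bytes per row)

   const bool padded   = rowsAligned( buffer, 5UL, 8UL, 32UL );  // Rows start every 32 bytes
   const bool unpadded = rowsAligned( buffer, 5UL, 6UL, 32UL );  // Rows start every 24 bytes

   return padded && !unpadded;
}
```

    With a spacing of eight integers every row starts on a 32-byte boundary; with a spacing of six the second row starts at byte offset 24 and the check fails.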

    Padding

    Adding padding elements to the end of each row/column can have a significant impact on the performance. For instance, assuming that AVX is available, two aligned, padded, 3x3 double precision matrices can be added via three intrinsic addition instructions:

    using blaze::CustomMatrix;
    using blaze::Deallocate;
    using blaze::allocate;
    using blaze::aligned;
    using blaze::padded;
    
    typedef CustomMatrix<double,aligned,padded>  CustomType;
    
    // Creating padded custom 3x3 matrix with an additional padding element in each row
    CustomType A( allocate<double>( 12UL ), 3UL, 3UL, 4UL, Deallocate() );
    CustomType B( allocate<double>( 12UL ), 3UL, 3UL, 4UL, Deallocate() );
    CustomType C( allocate<double>( 12UL ), 3UL, 3UL, 4UL, Deallocate() );
    
    // ... Initialization
    
    C = A + B;  // AVX-based matrix addition
    

    In this example, maximum performance is possible. However, in case no padding elements are inserted a scalar addition has to be used:

    using blaze::CustomMatrix;
    using blaze::Deallocate;
    using blaze::allocate;
    using blaze::aligned;
    using blaze::unpadded;
    
    typedef CustomMatrix<double,aligned,unpadded>  CustomType;
    
    // Creating unpadded custom 3x3 matrix
    CustomType A( allocate<double>( 12UL ), 3UL, 3UL, 4UL, Deallocate() );
    CustomType B( allocate<double>( 12UL ), 3UL, 3UL, 4UL, Deallocate() );
    CustomType C( allocate<double>( 12UL ), 3UL, 3UL, 4UL, Deallocate() );
    
    // ... Initialization
    
    C = A + B;  // Scalar matrix addition
    

    Note that the construction of padded and unpadded aligned matrices looks identical. However, in case of padded matrices, Blaze will zero initialize the padding elements and use them in all computations in order to achieve maximum performance. In case of an unpadded matrix Blaze will ignore the padding elements, with the downside that it is not possible to load a complete row to an AVX register, which makes it necessary to fall back to a scalar addition.

    The number of padding elements is required to be sufficient with respect to the available instruction set: In case of an aligned padded custom matrix the added padding elements must guarantee that the total number of elements in each row/column is a multiple of the intrinsic vector width. In case of an unaligned padded matrix the number of padding elements can be greater than or equal to the number of padding elements of an aligned padded custom matrix. In case the padding is insufficient with respect to the available instruction set, a std::invalid_argument exception is thrown.
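
    The "multiple of the intrinsic vector width" rule for aligned padded matrices amounts to rounding the row/column length up. The sketch below is a stand-alone illustration with hypothetical helper names (simdWidth, paddedSpacing); the register size of 32 bytes is an assumption that corresponds to AVX.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of elements that fit into one SIMD register.
constexpr std::size_t simdWidth( std::size_t registerBytes, std::size_t elementBytes )
{
   return registerBytes / elementBytes;
}

// Hypothetical helper: smallest multiple of 'width' that is not less than 'n',
// i.e. the row/column length including padding elements.
constexpr std::size_t paddedSpacing( std::size_t n, std::size_t width )
{
   return ( ( n + width - 1UL ) / width ) * width;
}

// 3 doubles (8 bytes each) per row with 32-byte registers: one padding element
static_assert( paddedSpacing( 3UL, simdWidth( 32UL, 8UL ) ) == 4UL, "3 doubles are padded to 4" );

// 6 ints (4 bytes each) per row with 32-byte registers: two padding elements
static_assert( paddedSpacing( 6UL, simdWidth( 32UL, 4UL ) ) == 8UL, "6 ints are padded to 8" );
```

    These values match the examples above: a 3x3 matrix of doubles uses a spacing of 4, and a matrix of integers with six columns uses a spacing of 8.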
