Random number distributions

Issue #158 resolved
Hartmut Kaiser created an issue

What would be the best way to integrate (other than uniform) random number distributions (i.e. normal, binomial, etc.) with blaze's existing random number generation scheme? I looked around in the code and it looks like that there is currently no way of changing the default (uniform) distribution.

Any help would be greatly appreciated.

Comments (4)

  1. Klaus Iglberger

    Hi Hartmut!

    Thanks for creating this issue. You are correct, currently there is no way to change the kind of distribution for the random number generation. However, admittedly, we did not anticipate that it needed to be changed. We will look into this issue and try to integrate a way to customize the distribution, but only after the release of Blaze 3.3.

    Best regards,

    Klaus!

  2. Daniel Baker

    This could be reasonably simple. In my experiments, the fastest I've been able to generate these numbers has been using an unrolled AES-CTR RNG which is here (https://github.com/dnbaker/frp/blob/master/include/frp/aesctr.h). By using an UNROLL_COUNT of 8, I've gotten about a 5% speedup from the default 4.

    My generic solution for a related problem was here:

    template<typename FloatType, bool SO, template<typename> typename Distribution, typename RNG=aes::AesCtr<uint64_t>, typename... DistArgs>
    void sample_fill(blaze::DynamicMatrix<FloatType, SO> &con, uint64_t seed, DistArgs &&... args) {
        RNG gen(seed);
        Distribution<FloatType> dist(forward<DistArgs>(args)...);
        for(size_t i(0); i < con.rows(); ++i)
            for(size_t j(0); j < con.columns(); ++j)
                con(i, j) = dist(gen);
    }
    
    
    template<typename RNG=aes::AesCtr<uint64_t>>
    void random_fill(uint64_t *data, uint64_t len, uint64_t seed=0) {
        for(RNG gen(seed); len; data[--len] = gen());
    }
    
    #define DEFINE_DIST_FILL(type, name) \
        template<typename Container, typename RNG=aes::AesCtr<uint64_t>, typename...Args> \
        void name##_fill(Container &con, uint64_t seed, Args &&... args) { \
            sample_fill<Container, type, RNG, Args...>(con, seed, forward<Args>(args)...); \
        }\
        template<typename FloatType, bool SO, typename RNG=aes::AesCtr<uint64_t>, typename...Args> \
        void name##_fill(blaze::DynamicMatrix<FloatType, SO> &con, uint64_t seed, Args &&... args) { \
            sample_fill<FloatType, SO, type, RNG, Args...>(con, seed, forward<Args>(args)...); \
        }\
        struct name##_fill_struct {\
            template<typename Container, typename RNG=aes::AesCtr<uint64_t>, typename...Args>\
            void operator()(Container &con, uint64_t seed, Args &&... args) const {\
                name##_fill<Container, RNG, Args...>(con, seed, forward<Args>(args)...);\
            }\
        };
    
    template<typename FloatType>
    class unit_normal: public boost::random::detail::unit_normal_distribution<FloatType> {
    public:
        void reset() {}
    };
    
    DEFINE_DIST_FILL(boost::normal_distribution, gaussian)
    DEFINE_DIST_FILL(unit_normal, unit_gaussian)
    DEFINE_DIST_FILL(boost::cauchy_distribution, cauchy)
    DEFINE_DIST_FILL(boost::random::chi_squared_distribution, chisq)
    DEFINE_DIST_FILL(boost::lognormal_distribution, lognormal)
    DEFINE_DIST_FILL(boost::random::extreme_value_distribution, extreme_value)
    DEFINE_DIST_FILL(boost::random::weibull_distribution, weibull)
    DEFINE_DIST_FILL(boost::random::uniform_real_distribution, uniform)
    

    One could call DISTRIBUTION_fill(matrix, seed, [... constructor arguments for the distribution...]). I currently use header-only components of boost because they're faster on my machine than the libstdc++ implementations, but I could fall back to namespace std distributions.

    I'd need to add a case overload for sparse and static matrices, but it isn't terribly far from being a solution, if perhaps not optimally performant.

    And specifically for the DenseMatrix case, one could instead copy the values by SIMD, and I think an SIMDEnabled-based functor might be the better way to do it. I'd be willing to put in some work for it.

  3. Klaus Iglberger

    Hi dnbh!

    Thank a lot for your code sample and the offer to put some work into it. However, you might have noticed that Blaze already has the functionality to randomize vectors and matrices (see the complete documentation). Hartmut's request is to extend the existing interface by providing the opportunity to pass different kinds of distribution (similar to the std::shuffle() algorithm). Although your solution would work in general it would be a considerable effort to change everything. Extending the existing solution is therefore preferable. Thanks again,

    Best regards,

    Klaus!

  4. Klaus Iglberger

    Summary

    This feature request is resolved by the introduction of vector and matrix generators (see also issue #313). Generators have been implemented, tested, optimized, and documented as required. They are immediately available via cloning the Blaze repository and will be officially released in Blaze 3.7.

    Vector Generators

    generate()

    The generate() function returns a dense vector filled elementwise via the given custom operation. By default, the returned vector is a column vector, but this setting can be changed via the BLAZE_DEFAULT_TRANSPOSE_FLAG switch (see Default Vector Storage). Alternatively it is possible to specify the transpose flag explicitly.

    The following example demonstrates the use of the generate() function:

    using blaze::generate;
    using blaze::columnVector;
    using blaze::rowVector;
    
    // Generates the homogeneous integer vector ( 2, 2, 2, 2, 2 )
    blaze::DynamicVector<int,columnVector> a;
    a = generate( 5UL, []( size_t index ){ return 2; } );
    
    // Generates the linearly spaced float vector ( 2.1, 3.2, 4.3, 5.4 )
    blaze::DynamicVector<float,columnVector> b;
    b = generate( 4UL, []( size_t index ){ return 2.1F + 1.1F*index; } );
    
    // Generates the logarithmically spaced double vector ( 1.0, 10.0, 100.0, 1000.0 )
    blaze::DynamicVector<double,columnVector> c;
    c = generate<columnVector>( 4UL, []( size_t index ){ return blaze::exp10( 1.0 + 1.0*index ); } );
    
    // Generates the vector of integer vectors ( ( 1, 2 ), ( 2, 3 ), ( 3, 4 ), ( 4, 5 ) )
    using VT = blaze::StaticVector<int,2UL>;
    blaze::StaticVector<VT,4UL,rowVector> d;
    d = generate<rowVector>( []( size_t index ) { return evaluate( VT{ 1, 2 } + index ); } );
    

    Custom Random Generators

    Via the generate() function it is possible to create custom generators. The following example demonstrates this by means of the random() function:

    template< typename Rand    // Random number generator
            , typename Dist >  // Random distribution
    decltype(auto) random( size_t size, Rand& rand, Dist& dist ) {
      return blaze::generate( size, [&rand,&dist]( size_t ){
         return dist( rand );
      } );
    }
    
  5. Log in to comment