Eigen is not allocating enough memory?

Issue #444 wontfix
Esmail Abdul Fattah created an issue

Hi,

I am requesting 250 GB of memory via #SBATCH --mem=250G, but the program still fails with the error below when it reaches the eigen() function.

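#include <blaze/Math.h>
#include <cstddef>
#include <iostream>

int main()
{
   // Symmetric 40000 x 40000 matrix, set to the identity
   blaze::SymmetricMatrix<blaze::DynamicMatrix<double, blaze::rowMajor>> AAA(40000);
   for (std::size_t i = 0; i < 40000; ++i) AAA(i,i) = 1;

   blaze::DynamicVector<double, blaze::columnVector> w;  // receives the eigenvalues
   blaze::DynamicMatrix<double, blaze::rowMajor> V;      // receives the eigenvectors

   w.resize(40000);
   V.resize(40000,40000);

   std::cout << "---------before---------------" << std::endl;

   eigen(AAA, w, V);  // the exception is thrown inside this call

   std::cout << "---------after---------------" << std::endl;

   return 0;
}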

terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Comments (5)

  1. Klaus Iglberger

    Hi Esmail!

    Thanks for taking the time to report a potential issue in Blaze. In this case there is nothing we can do: the eigen() function merely calls the dsyevd() function from the LAPACK libraries (see the Netlib reference). Blaze allocates the amount of memory required for this function call. If this causes an out-of-memory exception on your system, then your system apparently cannot provide enough memory at that point.

    To debug the problem on your end, I suggest incrementally increasing the matrix size until the problem occurs. The size at which it fails should give you an idea of how much memory your system can handle. Alternatively, you might try calling the Blaze syevd() backend function(s) directly (see the syevd high-level wrapper functions and the syevd low-level wrapper functions). These functions are also described in the Blaze wiki.
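
A rough estimate of the memory involved can also be worked out by hand. The sketch below is a minimal illustration, not part of Blaze: the helper names workspaceDoubles and estimatedBytes are made up for this example, and the workspace formula 2*n*n + 6*n + 3 is the one quoted in comment 4 of this thread. It assumes that the matrix, the eigenvector matrix, the eigenvalue vector, and the dsyevd() workspace are all stored as doubles, and it ignores the integer workspace and any temporaries.

```cpp
#include <cstdint>

// Hypothetical helper: number of doubles in the dsyevd() workspace,
// using the formula quoted in comment 4 of this thread.
constexpr std::int64_t workspaceDoubles(std::int64_t n)
{
   return 2*n*n + 6*n + 3;
}

// Hypothetical helper: estimated bytes for the input matrix (n*n),
// the eigenvector matrix (n*n), the eigenvalue vector (n), and the workspace.
// All arithmetic is done in 64 bits so that it cannot overflow for these sizes.
constexpr std::int64_t estimatedBytes(std::int64_t n)
{
   return (2*n*n + n + workspaceDoubles(n)) * static_cast<std::int64_t>(sizeof(double));
}
```

Under these assumptions, estimatedBytes(40000) evaluates to 51202240024 bytes, i.e. roughly 51 GB, which is well below the 250 GB requested in the report.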

    Thanks again for reporting a potential issue,

    Best regards,

    Klaus!

  2. Esmail Abdul Fattah reporter

    Thank you, as always, for the quick reply!

    Would it be possible to know how much memory is needed for this matrix size rather than the amount of memory my system can handle?

    Best Regards,
    Esmail

  3. Esmail Abdul Fattah reporter

    Here are the results:

    • While the code is running: the free mem is 167GB.
      Then I got the error:

      terminate called after throwing an instance of 'std::bad_array_new_length'
      what(): std::bad_array_new_length
      Aborted (core dumped)

    • I checked the memory again: the free mem is 197GB, which means the code used almost 30 GB and then stopped, although at least 180GB is still available.

    Results:

    abdulfe@kw60890:~/Z$ free -mh
                  total        used        free      shared  buff/cache   available
    Mem:          754Gi       567Gi       167Gi       3.0Mi        18Gi       181Gi
    Swap:         2.0Gi       280Mi       1.7Gi
    abdulfe@kw60890:~/Z$ free -mh
                  total        used        free      shared  buff/cache   available
    Mem:          754Gi       537Gi       197Gi       3.0Mi        18Gi       211Gi
    Swap:         2.0Gi       280Mi       1.7Gi
    abdulfe@kw60890:~/Z$ 
    
  4. Klaus Iglberger

    Hi Esmail!

    The type of the exception (i.e. std::bad_array_new_length) has given me the answer to this problem. What is happening here is an integer overflow:

    int main()
    {
       const int size = 40000;  // Your matrix size
       const int result = 2*size*size + 6*size + 3;  // Storage required by the dsyevd() function; results in -1094727293
       // ...
    }
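
The overflow can be checked without relying on signed overflow itself, which is undefined behavior in C++. The following is a minimal sketch with hypothetical helper names (workspaceSize, fitsInInt32, wrappedToInt32 are not Blaze or LAPACK functions): it evaluates the formula above in 64-bit arithmetic, tests whether the result fits into a 32-bit integer, and reproduces the wrapped value that a 32-bit two's-complement integer would hold.

```cpp
#include <cstdint>
#include <limits>

// Workspace size formula from the snippet above, evaluated in 64-bit arithmetic.
constexpr std::int64_t workspaceSize(std::int64_t n)
{
   return 2*n*n + 6*n + 3;
}

// True if the value is representable as a 32-bit signed integer.
constexpr bool fitsInInt32(std::int64_t v)
{
   return v >= std::numeric_limits<std::int32_t>::min() &&
          v <= std::numeric_limits<std::int32_t>::max();
}

// Value a 32-bit two's-complement integer would hold after wraparound.
// The conversion to std::uint32_t is well-defined (modulo 2^32); the sign
// is then restored by hand in 64-bit arithmetic.
constexpr std::int64_t wrappedToInt32(std::int64_t v)
{
   return static_cast<std::uint32_t>(v) > 0x7FFFFFFFu
      ? static_cast<std::int64_t>(static_cast<std::uint32_t>(v)) - 4294967296LL
      : static_cast<std::int64_t>(static_cast<std::uint32_t>(v));
}
```

For size 40000, workspaceSize returns 3200240003, fitsInInt32 reports false, and wrappedToInt32 yields -1094727293, matching the value in the comment above. With this formula, the largest size whose workspace still fits into 32 bits is 32766.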
    

    There is only one solution to this problem: you need a 64-bit LAPACK library for matrices of this size. On the Blaze side there is nothing we can do, since Blaze calls Fortran routines that read only 32 bits from any given integer value. I am, for instance, aware of a 64-bit version of MKL, but I don't have a complete overview of the available LAPACK libraries.

    I hope this helps,

    Best regards,

    Klaus!
