MatCreateMPIBAIJWithArrays segfault

Issue #195 resolved
Marshall Galbraith
created an issue

I'm trying to generate a block matrix with an existing CSR structure. I've tried using MatCreateMPIBAIJWithArrays, but I get a segfault because a->A is a null pointer in MatSetOption_MPISBAIJ on line 1591 in mpisbaij.c. That said, I'm not sure why MatCreateMPIBAIJWithArrays appears to be creating a symmetric block matrix, which seems inconsistent with the name of the function.

Comments (19)

  1. Marshall Galbraith reporter

    I currently have this implemented as part of our C++ project, so it would take some effort to put together a small example. In the meantime, the wrong matrix type is pretty easy to spot:

    PetscErrorCode  MatCreateMPIBAIJWithArrays(MPI_Comm comm,PetscInt bs,PetscInt m,PetscInt n,PetscInt M,PetscInt N,const PetscInt i[],const PetscInt j[],const PetscScalar a[],Mat *mat)
    {
      PetscErrorCode ierr;
    
      PetscFunctionBegin;
      if (i[0]) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_OUTOFRANGE,"i (row indices) must start with 0");
      if (m < 0) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_OUTOFRANGE,"local number of rows (m) cannot be PETSC_DECIDE, or negative");
      ierr = MatCreate(comm,mat);CHKERRQ(ierr);
      ierr = MatSetSizes(*mat,m,n,M,N);CHKERRQ(ierr);
      ierr = MatSetType(*mat,MATMPISBAIJ);CHKERRQ(ierr);
      ierr = MatSetOption(*mat,MAT_ROW_ORIENTED,PETSC_FALSE);CHKERRQ(ierr);
      ierr = MatMPIBAIJSetPreallocationCSR(*mat,bs,i,j,a);CHKERRQ(ierr);
      ierr = MatSetOption(*mat,MAT_ROW_ORIENTED,PETSC_TRUE);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }
    

    I'm also having problems with the ILU preconditioner with BAIJ matrices, i.e.

    Mat Object: () 2 MPI processes
      type: mpibaij
    row 0: (0, 2.)  (1, 0.)  (2, -1.)  (3, -0.)
    row 1: (0, 0.)  (1, 2.)  (2, -0.)  (3, -1.)
    row 2: (0, -1.)  (1, -0.)  (2, 2.)  (3, 0.)  (4, -1.)  (5, -0.)
    row 3: (0, -0.)  (1, -1.)  (2, 0.)  (3, 2.)  (4, -0.)  (5, -1.)
    row 4: (2, -1.)  (3, -0.)  (4, 2.)  (5, 0.)  (6, -1.)  (7, -0.)
    row 5: (2, -0.)  (3, -1.)  (4, 0.)  (5, 2.)  (6, -0.)  (7, -1.)
    row 6: (4, -1.)  (5, -0.)  (6, 2.)  (7, 0.)  (8, -1.)  (9, -0.)
    row 7: (4, -0.)  (5, -1.)  (6, 0.)  (7, 2.)  (8, -0.)  (9, -1.)
    row 8: (6, -1.)  (7, -0.)  (8, 2.)  (9, 0.)  (10, -1.)  (11, -0.)
    row 9: (6, -0.)  (7, -1.)  (8, 0.)  (9, 2.)  (10, -0.)  (11, -1.)
    row 10: (8, -1.)  (9, -0.)  (10, 2.)  (11, 0.)  (12, -1.)  (13, -0.)
    row 11: (8, -0.)  (9, -1.)  (10, 0.)  (11, 2.)  (12, -0.)  (13, -1.)
    row 12: (10, -1.)  (11, -0.)  (12, 2.)  (13, 0.)  (14, -1.)  (15, -0.)
    row 13: (10, -0.)  (11, -1.)  (12, 0.)  (13, 2.)  (14, -0.)  (15, -1.)
    row 14: (12, -1.)  (13, -0.)  (14, 2.)  (15, 0.)  (16, -1.)  (17, -0.)
    row 15: (12, -0.)  (13, -1.)  (14, 0.)  (15, 2.)  (16, -0.)  (17, -1.)
    row 16: (14, -1.)  (15, -0.)  (16, 2.)  (17, 0.)
    row 17: (14, -0.)  (15, -1.)  (16, 0.)  (17, 2.)
    [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
    [0]PETSC ERROR: Invalid argument
    [0]PETSC ERROR: Must be square matrix, rows 10 columns 12
    [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
    [0]PETSC ERROR: Petsc Release Version 3.8.2, Nov, 09, 2017
    [0]PETSC ERROR: Unknown Name on a Linux-x86_64 named gripen by galbramc Mon Nov 13 17:55:20 2017
    [0]PETSC ERROR: Configure options --with-cc=/usr/bin/gcc-4.8 --with-fc=0 --with-cxx=/usr/bin/g++-4.8 --with-mpi=1 --with-mpi-include=/usr/lib/openmpi/include --with-mpi-lib="[/usr/lib/libmpi_cxx.so,/usr/lib/
    [0]PETSC ERROR: #1 MatGetOrdering() line 243 in /home/galbramc/workspace/SANSproto/build/debug_gnu48/libPETSc-prefix/petsc-3.8.2/src/mat/order/sorder.c
    [0]PETSC ERROR: #2 PCSetUp_ILU() line 134 in /home/galbramc/workspace/SANSproto/build/debug_gnu48/libPETSc-prefix/petsc-3.8.2/src/ksp/pc/impls/factor/ilu/ilu.c
    [0]PETSC ERROR: #3 PCSetUp() line 924 in /home/galbramc/workspace/SANSproto/build/debug_gnu48/libPETSc-prefix/petsc-3.8.2/src/ksp/pc/interface/precon.c
    [0]PETSC ERROR: #4 KSPSetUp() line 381 in /home/galbramc/workspace/SANSproto/build/debug_gnu48/libPETSc-prefix/petsc-3.8.2/src/ksp/ksp/interface/itfunc.c
    [0]PETSC ERROR: #5 PCSetUpOnBlocks_BJacobi_Singleblock() line 618 in /home/galbramc/workspace/SANSproto/build/debug_gnu48/libPETSc-prefix/petsc-3.8.2/src/ksp/pc/impls/bjacobi/bjacobi.c
    [0]PETSC ERROR: #6 PCSetUpOnBlocks() line 955 in /home/galbramc/workspace/SANSproto/build/debug_gnu48/libPETSc-prefix/petsc-3.8.2/src/ksp/pc/interface/precon.c
    [0]PETSC ERROR: #7 KSPSetUpOnBlocks() line 213 in /home/galbramc/workspace/SANSproto/build/debug_gnu48/libPETSc-prefix/petsc-3.8.2/src/ksp/ksp/interface/itfunc.c
    [0]PETSC ERROR: #8 KSPSolve() line 613 in /home/galbramc/workspace/SANSproto/build/debug_gnu48/libPETSc-prefix/petsc-3.8.2/src/ksp/ksp/interface/itfunc.c
    Configure options --with-cc=/usr/bin/gcc-4.8 --with-fc=0 --with-cxx=/usr/bin/g++-4.8 --with-mpi=1 --with-mpi-include=/usr/lib/openmpi/include --with-mpi-lib="[/usr/lib/libmpi_cxx.so,/usr/lib/libmpi.so,/usr/l
    

    It works on a single processor, but I get the above error with two processors.

    I'll work on creating some tests for you today.
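
    As a rough sanity check on the numbers in that error message, the sketch below reproduces "rows 10 columns 12" from the printed matrix alone. It assumes the 9 block rows (bs = 2) are split as evenly as possible over the 2 processes, and that the matrix is block tridiagonal as shown in the MatView output. The helper names local_count and touched_columns are hypothetical and not part of PETSc. This is only an observation about where the two numbers could come from, not a diagnosis of the failure.

```c
/* Block rows owned by `rank` when n_global block rows are divided as evenly
   as possible among `size` ranks (assumed layout, lower ranks get the extra). */
static int local_count(int n_global, int size, int rank)
{
  return n_global / size + ((n_global % size) > rank ? 1 : 0);
}

/* Number of scalar columns referenced by block rows [first, first + count)
   of a block-tridiagonal matrix with nb block rows and block size bs. */
static int touched_columns(int first, int count, int nb, int bs)
{
  int lo = first > 0 ? first - 1 : 0;                   /* first block column */
  int hi = first + count < nb ? first + count : nb - 1; /* last block column  */
  return (hi - lo + 1) * bs;
}
```

    With nb = 9 and bs = 2, rank 0 would own local_count(9,2,0) = 5 block rows, i.e. 10 scalar rows, and those rows reference touched_columns(0,5,9,2) = 12 scalar columns, which matches the sizes reported by MatGetOrdering().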

  2. Marshall Galbraith reporter

    Did not work:

    [0]PETSC ERROR: Object is in wrong state
    [0]PETSC ERROR: Must call MatXXXSetPreallocation() or MatSetUp() on argument 1 "A" before MatSetOption_MPIBAIJ()
    

    Here is the test:

    static char help[] = "Tests MatCreateMPIBAIJWithArrays()\n\n";
    
    /*T
       Concepts: partitioning
       Processors: 4
    T*/
    
    /*
      Include "petscmat.h" so that we can use matrices.  Note that this file
      automatically includes:
         petscsys.h    - base PETSc routines
         petscvec.h    - vectors
         petscmat.h    - matrices
         petscis.h     - index sets
         petscviewer.h - viewers
    */
    #include <petscmat.h>
    
    int main(int argc,char **args)
    {
      Mat            A;
      PetscInt       *ia,*ja, bs = 2;
      PetscErrorCode ierr;
      PetscMPIInt    rank,size;
    
      ierr = PetscInitialize(&argc,&args,(char*)0,help);if (ierr) return ierr;
      ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size);CHKERRQ(ierr);
      if (size != 4) SETERRQ(PETSC_COMM_WORLD,1,"Must run with 4 processors");
      ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);
    
      ierr = PetscMalloc1(5,&ia);CHKERRQ(ierr);
      ierr = PetscMalloc1(16,&ja);CHKERRQ(ierr);
      if (!rank) {
        ja[0] = 1; ja[1] = 4; ja[2] = 0; ja[3] = 2; ja[4] = 5; ja[5] = 1; ja[6] = 3; ja[7] = 6;
        ja[8] = 2; ja[9] = 7;
        ia[0] = 0; ia[1] = 2; ia[2] = 5; ia[3] = 8; ia[4] = 10;
      } else if (rank == 1) {
        ja[0] = 0; ja[1] = 5; ja[2] = 8; ja[3] = 1; ja[4] = 4; ja[5] = 6; ja[6] = 9; ja[7] = 2;
        ja[8] = 5; ja[9] = 7; ja[10] = 10; ja[11] = 3; ja[12] = 6; ja[13] = 11;
        ia[0] = 0; ia[1] = 3; ia[2] = 7; ia[3] = 11; ia[4] = 14;
      } else if (rank == 2) {
        ja[0] = 4; ja[1] = 9; ja[2] = 12; ja[3] = 5; ja[4] = 8; ja[5] = 10; ja[6] = 13; ja[7] = 6;
        ja[8] = 9; ja[9] = 11; ja[10] = 14; ja[11] = 7; ja[12] = 10; ja[13] = 15;
        ia[0] = 0; ia[1] = 3; ia[2] = 7; ia[3] = 11; ia[4] = 14;
      } else {
        ja[0] = 8; ja[1] = 13; ja[2] = 9; ja[3] = 12; ja[4] = 14; ja[5] = 10; ja[6] = 13; ja[7] = 15;
        ja[8] = 11; ja[9] = 14;
        ia[0] = 0; ia[1] = 2; ia[2] = 5; ia[3] = 8; ia[4] = 10;
      }
    
      ierr = MatCreateMPIBAIJWithArrays(PETSC_COMM_WORLD,bs,bs*4,bs*4,bs*16,bs*16,ia,ja,NULL,&A);CHKERRQ(ierr);
      ierr = PetscFree(ia);CHKERRQ(ierr);
      ierr = PetscFree(ja);CHKERRQ(ierr);
      ierr = MatView(A,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
      ierr = MatDestroy(&A);CHKERRQ(ierr);
      ierr = PetscFinalize();
      return ierr;
    }
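
    Before handing arrays like these to MatCreateMPIBAIJWithArrays, it can help to check them in plain C. The sketch below is a hypothetical helper (check_block_csr is not a PETSc function) that uses int in place of PetscInt. It treats ia and ja as block indices, consistent with the test above, where 4 local block rows need 5 entries in ia. The i[0] == 0 check mirrors the one in the PETSc source quoted earlier; the monotonicity and range checks are assumptions about what a well-formed CSR structure should satisfy.

```c
/* Validate block-CSR index arrays for mb local block rows and Nb global
   block columns. Returns 0 if the arrays look consistent, otherwise a
   nonzero code identifying the first failed check. */
static int check_block_csr(int mb, int Nb, const int ia[], const int ja[])
{
  int r, k;

  if (ia[0] != 0) return 1;                  /* row offsets must start at 0 */
  for (r = 0; r < mb; r++) {
    if (ia[r + 1] < ia[r]) return 2;         /* offsets must not decrease   */
    for (k = ia[r]; k < ia[r + 1]; k++) {
      if (ja[k] < 0 || ja[k] >= Nb) return 3; /* block column out of range  */
    }
  }
  return 0;
}
```

    For the rank 0 arrays in the test (ia = {0,2,5,8,10} with the ten ja entries listed there, mb = 4, Nb = 16), this helper returns 0, so the "Object is in wrong state" error above does not appear to come from malformed index arrays.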
    
  3. Sean Farley staff

    @Jed Brown,

    It is beyond idiotic that references to the issue aren't linked from the issue.

    Every instance I can find both on this page and the line you link to have the issue linkified.

    Also, this issue shouldn't be closed until the commit that "fixes" it is merged to the appropriate integration branch.

    I've brought this up before, actually. If only we were allowed to improve our issue tracker and devote time to it. Alas.

  4. Jed Brown

    @Sean Farley I made the comment two days ago on that 2010 commit that introduced the bug, but it wasn't referenced in any way from this issue so Marshall didn't get any notification nor did Matt have any indication that I had looked at the problem. As the author of the commit, Barry would have been notified, but he's trying to finish a paper.
