OpenMP enabled PETSc

OpenMP thread-level parallelisation has been added to the PETSc Vec and Mat classes, whose kernels account for the majority of the computation; the CSR and Block-CSR matrix formats are supported.
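Since the threading lives inside the Vec and Mat kernels, an application assembles its matrices and vectors through the usual PETSc interface and should pick up the threaded code paths transparently. A minimal sketch using the standard petsc-3.3 API (error checking omitted for brevity; the 1D Laplacian is purely illustrative):

```c
/* Minimal MatMult driver: any AIJ/BAIJ matrix exercises the threaded kernels. */
#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat      A;
  Vec      x, y;
  PetscInt n = 1000, i, rstart, rend;

  PetscInitialize(&argc, &argv, NULL, NULL);

  /* Assemble a 1D Laplacian in CSR (AIJ) format. */
  MatCreate(PETSC_COMM_WORLD, &A);
  MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
  MatSetFromOptions(A);
  MatSetUp(A);
  MatGetOwnershipRange(A, &rstart, &rend);
  for (i = rstart; i < rend; i++) {
    PetscInt    ncols = 0, cols[3];
    PetscScalar vals[3];
    if (i > 0)     { cols[ncols] = i - 1; vals[ncols++] = -1.0; }
    cols[ncols] = i; vals[ncols++] = 2.0;
    if (i < n - 1) { cols[ncols] = i + 1; vals[ncols++] = -1.0; }
    MatSetValues(A, 1, &i, ncols, cols, vals, INSERT_VALUES);
  }
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

  /* y = A*x; this MatMult runs the threaded spMVM kernel. */
  MatGetVecs(A, &x, &y);
  VecSet(x, 1.0);
  MatMult(A, x, y);

  VecDestroy(&x);
  VecDestroy(&y);
  MatDestroy(&A);
  PetscFinalize();
  return 0;
}
```

A hybrid run then combines MPI ranks with threads per rank, e.g. OMP_NUM_THREADS=4 mpiexec -n 2 ./ex_spmv (binary name illustrative).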

Task-based sparse Matrix-Vector Multiplication (spMVM)

The current version uses task-based spMVM to overlap MPI communication with local computation. In addition, a load-balancing scheme based on non-zero counts is available to balance the workload between threads. This scheme is activated with the option "-matmult_nz_balance"; otherwise a row-based thread partitioning is used.
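The idea behind the non-zero balanced partitioning can be sketched as follows: instead of giving every thread the same number of rows, thread boundaries are chosen from the CSR row-pointer array so that each thread owns roughly the same number of non-zeros. The function and variable names below are illustrative, not the fork's internals:

```c
/* ai is the CSR row-pointer array (length nrows+1; ai[nrows] == total nnz). */
#include <stdio.h>

static void nz_balance(const int *ai, int nrows, int nthreads, int *rowstart)
{
  int t, row = 0;
  rowstart[0] = 0;
  for (t = 1; t < nthreads; t++) {
    /* Thread t starts at the first row at or past t/nthreads of the nnz. */
    long target = (long)ai[nrows] * t / nthreads;
    while (row < nrows && ai[row] < target) row++;
    rowstart[t] = row;
  }
  rowstart[nthreads] = nrows;
}

int main(void)
{
  /* Toy matrix: 6 rows with 1, 1, 1, 1, 2 and 6 non-zeros respectively. */
  int ai[] = {0, 1, 2, 3, 4, 6, 12};
  int rowstart[3];
  nz_balance(ai, 6, 2, rowstart);
  /* Row-based split: rows 0-2 vs 3-5 (3 vs 9 non-zeros).
     Non-zero-based split: rows 0-4 vs 5 (6 vs 6 non-zeros). */
  printf("thread 0: rows %d-%d, thread 1: rows %d-%d\n",
         rowstart[0], rowstart[1] - 1, rowstart[1], rowstart[2] - 1);
  return 0;
}
```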

Release notes:

  • When running a single MPI process with multiple threads, thread 0 will not act as a worker thread during MatMult (see the overlap sketch after this list). For scaling with a single MPI process this means that only t-1 threads are actively sharing work when OMP_NUM_THREADS=t.
  • A purely vector-based version of petsc-3.3-omp is available here.
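To make the communication/computation overlap concrete, the following sketch shows the general pattern: thread 0 drives an MPI neighbour exchange while the remaining t-1 threads compute the local part of the product, matching the release note above. It is a simplified master/worker rendering of the idea rather than the fork's actual task-based implementation; a ring exchange of a single value and a 1D stencil stand in for a real halo exchange and the off-diagonal block.

```c
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

#define N 8  /* local rows per rank; illustrative */

int main(int argc, char **argv)
{
  int provided, rank, size;
  double x[N], y[N], halo_send, halo_recv = 0.0;
  MPI_Request reqs[2];

  MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  for (int i = 0; i < N; i++) { x[i] = 1.0; y[i] = 0.0; }
  halo_send = (double)rank;

  const int right = (rank + 1) % size;
  const int left  = (rank + size - 1) % size;

  #pragma omp parallel
  {
    const int tid      = omp_get_thread_num();
    const int nth      = omp_get_num_threads();
    const int nworkers = (nth > 1) ? nth - 1 : 1;

    if (tid == 0) {
      /* Thread 0: start and complete the neighbour exchange.
         Only this thread calls MPI, so MPI_THREAD_FUNNELED suffices. */
      MPI_Irecv(&halo_recv, 1, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
      MPI_Isend(&halo_send, 1, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);
      MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    }
    if (tid > 0 || nth == 1) {
      /* Workers: y = A_local * x for a tridiagonal stencil, overlapping
         the exchange above. With a single thread, thread 0 computes
         after communicating; otherwise only t-1 threads share the rows. */
      const int wid   = (nth > 1) ? tid - 1 : 0;
      const int chunk = (N + nworkers - 1) / nworkers;
      const int lo    = wid * chunk;
      const int hi    = (lo + chunk < N) ? lo + chunk : N;
      for (int i = lo; i < hi; i++) {
        y[i] = 2.0 * x[i];
        if (i > 0)     y[i] -= x[i - 1];
        if (i < N - 1) y[i] -= x[i + 1];
      }
    }
  } /* implicit barrier: exchange and local product are both complete */

  /* Fold the received halo contribution into the boundary row. */
  y[0] -= halo_recv;

  if (rank == 0) printf("y[0] = %g\n", y[0]);
  MPI_Finalize();
  return 0;
}
```

Compile with mpicc -fopenmp; in the library itself the analogous split happens inside MatMult.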