QESPRESSO (scorep)

Issue #52 new
jg piccinali repo owner created an issue

Scorep/1.4.2

Daint

Setup

  • cd /apps/santis/sandbox/jgp/qe/src/5.2/INTEL1501/espresso-5.2.0/
  • module swap PrgEnv-cray PrgEnv-intel
  • module load craype-accel-nvidia35
  • module unload atp
  • module load fftw

Compile

  • export CC="cc"
  • export FC="ftn"
  • export F77="ftn"
  • export MPIF90="ftn"
  • ./configure ARCH=crayxt --enable-openmp --enable-parallel \
      --with-scalapack --prefix=...
  • make -j12 pw; make install

Run

  • benchmark
  • cd /apps/santis/sandbox/jgp/qe/5.2/JG/small/
  • aprun -n16 -N8 -d1 ./pw.x_int1501 -in test_1.in
     PWSCF        :     5.92s CPU         6.66s WALL
    JOB DONE.
real 9.37

Compile (scorep)

  • module load scorep/1.4.2
  • vim make.sys (prefix the compilers and the linker with the scorep wrapper):
PREP = scorep --mpp=mpi --thread=omp
MPIF90         = $(PREP) ftn
CC             = $(PREP) cc
F77            = $(PREP) ftn
FC             = $(PREP) ftn
LD             = $(PREP) ftn
  • make clean; make pw
    • The build stops on an undefined reference, but everything needed for pw.x is already compiled.
    • Link pw.x by hand with:
export R=/apps/daint/5.2.UP02/sandbox/jgp/qe/src/5.2/INTEL1501/espresso-5.2.0
cd PW/src
scorep --mpp=mpi --thread=omp ftn -openmp -o pw.x pwscf.o  libpw.a  \
-L$R/flib \
-L$R/clib \
-L$R/iotk/src \
-L$R/Modules \
-lptools -lqemod -lflib -lclib -liotk

Run (scorep/profiling)

  • export SCOREP_ENABLE_PROFILING=true
  • export SCOREP_ENABLE_TRACING=false
  • square scorep-n16N8d1/

cube.png

Run (scorep/filtering)

  • filter file (scorep-n16N8d1/filter.jg):
SCOREP_REGION_NAMES_BEGIN
 EXCLUDE
...
SCOREP_REGION_NAMES_END
  • scorep-score -f scorep-n16N8d1/filter.jg scorep-n16N8d1/profile.cubex
Estimated aggregate size of event trace:                   434MB
Estimated requirements for largest trace buffer (max_buf): 32MB
Estimated memory requirements (SCOREP_TOTAL_MEMORY):       34MB
flt     type    max_buf[B]      visits time[s] time[%] time/visit[us]  region
 -       ALL 1,012,309,257 574,452,834  293.68   100.0           0.51  ALL
 -       USR 1,006,841,550 572,942,546  169.87    57.8           0.30  USR
 -       MPI     3,907,698     308,500   60.84    20.7         197.23  MPI
 -       OMP     1,658,094     814,404   15.28     5.2          18.76  OMP
 -       COM       650,572     387,384   47.69    16.2         123.11  COM

 *       ALL    32,618,675  16,989,693  147.96    50.4           8.71  ALL-FLT
 +       FLT   979,690,582 557,463,141  145.73    49.6           0.26  FLT
 *       USR    27,150,968  15,479,405   24.14     8.2           1.56  USR-FLT
 -       MPI     3,907,698     308,500   60.84    20.7         197.23  MPI-FLT
 -       OMP     1,658,094     814,404   15.28     5.2          18.76  OMP-FLT
 *       COM       650,572     387,384   47.69    16.2         123.11  COM-FLT
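
To put the effect of the filter in numbers, here is a small sketch that redoes the arithmetic on the max_buf column. The two byte counts are copied from the ALL and ALL-FLT rows above; the helper name to_mb and the 1 MB = 1024*1024 bytes convention are assumptions made for this illustration, not something scorep-score documents here.

```shell
#!/bin/sh
# max_buf values copied from the ALL and ALL-FLT rows above (bytes).
unfiltered=1012309257
filtered=32618675

# Hypothetical helper: round a byte count up to whole MB,
# assuming 1 MB = 1024*1024 bytes.
to_mb() {
    echo $(( ($1 + 1048575) / 1048576 ))
}

echo "max_buf unfiltered: $(to_mb "$unfiltered") MB"
echo "max_buf filtered:   $(to_mb "$filtered") MB"
echo "reduction factor:   $(( unfiltered / filtered ))x"
```

Under that assumption the filtered value comes out at 32 MB, which agrees with the max_buf estimate printed above, and the filter shrinks the largest trace buffer by roughly a factor of 31.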

Run (scorep/tracing)

  • Recompile with -g
  • export SCOREP_ENABLE_PROFILING=false
  • export SCOREP_ENABLE_TRACING=true
  • export SCOREP_TOTAL_MEMORY=100M
  • export SCOREP_FILTERING_FILE=scorep-n16N8d1/filter.jg
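
Since the profiling and tracing runs differ only in their environment, the two sets of exports can be wrapped in one shell function. This is only a sketch: the function name scorep_mode is hypothetical, and the values are simply the ones listed in this issue.

```shell
#!/bin/sh
# Hypothetical helper (not part of Score-P): select the measurement mode
# by exporting the same variables used in the two "Run" sections.
scorep_mode() {
    case "$1" in
        profile)
            export SCOREP_ENABLE_PROFILING=true
            export SCOREP_ENABLE_TRACING=false
            ;;
        trace)
            export SCOREP_ENABLE_PROFILING=false
            export SCOREP_ENABLE_TRACING=true
            export SCOREP_TOTAL_MEMORY=100M
            export SCOREP_FILTERING_FILE=scorep-n16N8d1/filter.jg
            ;;
        *)
            echo "usage: scorep_mode profile|trace" >&2
            return 1
            ;;
    esac
}
```

Calling scorep_mode trace before the aprun line would then reproduce the tracing setup above, and scorep_mode profile would switch back to profiling.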

vampir.jpg
