QESPRESSO (scorep)
Issue #52
new
Scorep/142
Daint
Setup
- cd /apps/santis/sandbox/jgp/qe/src/5.2/INTEL1501/espresso-5.2.0/
- module swap PrgEnv-cray PrgEnv-intel
- module load craype-accel-nvidia35
- module unload atp
- module load fftw
Compile
- export CC="cc"
- export FC="ftn"
- export F77="ftn"
- export MPIF90="ftn"
- ./configure ARCH=crayxt --enable-openmp --enable-parallel \
    --with-scalapack --prefix=...
- make -j12 pw; make install
Run
- benchmark
- cd /apps/santis/sandbox/jgp/qe/5.2/JG/small/
- aprun -n16 -N8 -d1 ./pw.x_int1501 -in test_1.in
PWSCF : 5.92s CPU 6.66s WALL
JOB DONE.
real 9.37
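To compare the uninstrumented timing with later Score-P runs, the summary line can be pulled out of a saved log. A minimal sketch, assuming the line has the seconds-only form shown above (longer runs that print minutes or hours are not handled); the function name `pwscf_times` and the log file name are hypothetical:

```shell
# Print "<cpu> <wall>" in seconds from a pw.x log read on stdin.
# Assumes a summary line of the form: "PWSCF : 5.92s CPU 6.66s WALL".
pwscf_times() {
    awk '/PWSCF/ && /WALL/ {
        for (i = 2; i <= NF; i++) {
            if ($i == "CPU")  { cpu = $(i-1) }   # value precedes the CPU label
            if ($i == "WALL") { wall = $(i-1) }  # value precedes the WALL label
        }
        sub(/s$/, "", cpu); sub(/s$/, "", wall)  # drop the trailing "s"
    }
    END { if (cpu != "") print cpu, wall }'
}

# Usage (run.log is a hypothetical file holding the aprun output):
#   pwscf_times < run.log
```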
Compile (scorep)
- module load scorep/1.4.2
- vim make.sys
PREP = scorep --mpp=mpi --thread=omp
MPIF90 = $(PREP) ftn
CC = $(PREP) cc
F77 = $(PREP) ftn
FC = $(PREP) ftn
LD = $(PREP) ftn
- make clean; make pw
- compilation stops because of undefined references, BUT pw.x is ready to link...
- Link with:
export R=/apps/daint/5.2.UP02/sandbox/jgp/qe/src/5.2/INTEL1501/espresso-5.2.0
cd PW/src
scorep --mpp=mpi --thread=omp ftn -openmp -o pw.x pwscf.o libpw.a \
-L$R/flib \
-L$R/clib \
-L$R/iotk/src \
-L$R/Modules \
-lptools -lqemod -lflib -lclib -liotk
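The make.sys edits done by hand in vim above could also be scripted. A minimal sketch, assuming each variable is defined on a single `NAME = value` line; the function name `add_prep` is hypothetical, and the result goes to stdout so the original file is left untouched:

```shell
# Emit a copy of a make.sys-style file with the Score-P wrapper applied.
# $1 = path to make.sys. Lines that already contain $(PREP) are left alone.
add_prep() {
    printf 'PREP = scorep --mpp=mpi --thread=omp\n'
    sed -E '/\$\(PREP\)/!s/^(MPIF90|CC|F77|FC|LD)([[:space:]]*=[[:space:]]*)(.*)$/\1\2$(PREP) \3/' "$1"
}

# Usage (review the result before replacing make.sys):
#   add_prep make.sys > make.sys.scorep
```

Only the five variables listed in the notes are rewritten; names such as LDFLAGS do not match because the pattern requires the exact variable name followed by `=`.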
Run (scorep/profiling)
- export SCOREP_ENABLE_PROFILING=true
- export SCOREP_ENABLE_TRACING=false
- square scorep-n16N8d1/
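The notes do not show how the experiment directory scorep-n16N8d1 got its name. One way, assuming it encodes the aprun geometry (-n ranks, -N ranks per node, -d depth), is to build the name and pass it through Score-P's SCOREP_EXPERIMENT_DIRECTORY variable before launching the instrumented pw.x. The helper name `scorep_exp_dir` is hypothetical:

```shell
# Build an experiment directory name from the aprun geometry.
# $1 = total ranks (-n), $2 = ranks per node (-N), $3 = depth (-d)
scorep_exp_dir() {
    printf 'scorep-n%sN%sd%s\n' "$1" "$2" "$3"
}

# Assumption: the measurement is steered only through environment variables.
export SCOREP_ENABLE_PROFILING=true
export SCOREP_ENABLE_TRACING=false
export SCOREP_EXPERIMENT_DIRECTORY=$(scorep_exp_dir 16 8 1)
```

The instrumented binary would then be started with the same aprun line as the benchmark, and the resulting directory is what `square` is pointed at.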
Run (scorep/filtering)
SCOREP_REGION_NAMES_BEGIN
EXCLUDE
...
SCOREP_REGION_NAMES_END
- scorep-score -f scorep-n16N8d1/filter.jg scorep-n16N8d1/profile.cubex
  Estimated aggregate size of event trace:                   434MB
  Estimated requirements for largest trace buffer (max_buf): 32MB
  Estimated memory requirements (SCOREP_TOTAL_MEMORY):       34MB

  flt type    max_buf[B]      visits time[s] time[%] time/visit[us]  region
  -    ALL 1,012,309,257 574,452,834  293.68   100.0           0.51  ALL
  -    USR 1,006,841,550 572,942,546  169.87    57.8           0.30  USR
  -    MPI     3,907,698     308,500   60.84    20.7         197.23  MPI
  -    OMP     1,658,094     814,404   15.28     5.2          18.76  OMP
  -    COM       650,572     387,384   47.69    16.2         123.11  COM

  *    ALL    32,618,675  16,989,693  147.96    50.4           8.71  ALL-FLT
  +    FLT   979,690,582 557,463,141  145.73    49.6           0.26  FLT
  *    USR    27,150,968  15,479,405   24.14     8.2           1.56  USR-FLT
  -    MPI     3,907,698     308,500   60.84    20.7         197.23  MPI-FLT
  -    OMP     1,658,094     814,404   15.28     5.2          18.76  OMP-FLT
  *    COM       650,572     387,384   47.69    16.2         123.11  COM-FLT
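The filter file passed to scorep-score follows the SCOREP_REGION_NAMES_BEGIN / EXCLUDE / SCOREP_REGION_NAMES_END layout shown above. A minimal sketch of generating such a file from a list of region names; the function name `write_filter` and the region pattern in the usage line are hypothetical, since the actual excluded regions are elided in the notes:

```shell
# Write a Score-P filter that excludes the regions given as arguments.
# Each argument becomes one line of the EXCLUDE block.
write_filter() {
    printf 'SCOREP_REGION_NAMES_BEGIN\n'
    printf 'EXCLUDE\n'
    for region in "$@"; do
        printf '%s\n' "$region"
    done
    printf 'SCOREP_REGION_NAMES_END\n'
}

# Usage (the pattern is a placeholder, not one of the regions really filtered):
#   write_filter 'some_hot_routine_*' > scorep-n16N8d1/filter.jg
```

Good candidates for exclusion are USR regions with very many visits and a small time per visit, since they dominate max_buf in the table above.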
Run (scorep/tracing)
- Recompile with -g
- export SCOREP_ENABLE_PROFILING=false
- export SCOREP_ENABLE_TRACING=true
- export SCOREP_TOTAL_MEMORY=100M
- export SCOREP_FILTERING_FILE=scorep-n16N8d1/filter.jg
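Before a tracing run it can help to check that SCOREP_TOTAL_MEMORY covers the estimate printed by scorep-score (34MB with the filter above, against the 100M set here). A minimal sketch, assuming the estimate is reported in MB as in the output above; both function names are hypothetical:

```shell
# Print the SCOREP_TOTAL_MEMORY estimate in MB from scorep-score output on stdin.
# Only the "<number>MB" form is handled.
estimated_total_memory_mb() {
    sed -n 's/^ *Estimated memory requirements (SCOREP_TOTAL_MEMORY): *\([0-9][0-9]*\)MB.*$/\1/p'
}

# Succeed if the configured size ($1, in MB) is at least the estimate on stdin.
memory_is_sufficient() {
    need=$(estimated_total_memory_mb)
    [ -n "$need" ] && [ "$1" -ge "$need" ]
}

# Usage:
#   scorep-score -f scorep-n16N8d1/filter.jg scorep-n16N8d1/profile.cubex \
#       | memory_is_sufficient 100 && echo "100M is enough"
```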