summer school - openacc/cce (perftools)
Issue #36
new
DAINT
Get the src
- ssh -Y daint01
- git clone https://github.com/bcumming/summer-school.git
- cd summer-school/openacc/fortran/
CCE
Compile
- module load craype-accel-nvidia35
- module load perftools
Currently Loaded Modulefiles:
1) modules/3.2.10.3
2) nodestat/2.2-1.0502.53712.3.109.ari
3) sdb/1.0-1.0502.55976.5.27.ari
4) alps/5.2.1-2.0502.9041.11.6.ari
5) lustre-cray_ari_s/2.5_3.0.101_0.31.1_1.0502.8394.10.1-1.0502.17198.8.51
6) udreg/2.3.2-1.0502.9275.1.12.ari
7) ugni/5.0-1.0502.9685.4.24.ari
8) gni-headers/3.0-1.0502.9684.5.2.ari
9) dmapp/7.0.1-1.0502.9501.5.219.ari
10) xpmem/0.1-2.0502.55507.3.2.ari
11) hss-llm/7.2.0
12) Base-opts/1.0.2-1.0502.53325.1.2.ari
13) craype-network-aries
14) craype/2.4.0
15) cce/8.3.12
16) totalview-support/1.1.4
17) totalview/8.11.0
18) cray-libsci/13.0.4
19) pmi/5.0.7-1.0000.10678.155.25.ari
20) rca/1.0.0-2.0502.53711.3.127.ari
21) atp/1.8.2
22) PrgEnv-cray/5.2.40
23) craype-sandybridge
24) slurm
25) cray-mpich/7.2.2
26) ddt/5.0
27) cray-libsci_acc/3.1.1
28) cudatoolkit/6.5.14-1.0502.9613.6.1
29) craype-accel-nvidia35
30) perftools/6.2.4
- make clean
- make
ftn -rmd -hacc -O3 -e Z -c stats.f90 -o stats.o
ftn -rmd -hacc -O3 -e Z -c data.f90 -o data.o
ftn -rmd -hacc -O3 -e Z -c operators.f90 -o operators.o
ftn -rmd -hacc -O3 -e Z -c linalg.f90 -o linalg.o
ftn -rmd -hacc -O3 -e Z -c io.f90 -o io.o
ftn -rmd -hacc -O3 -e Z \
stats.o data.o operators.o linalg.o io.o \
main.f90 -o main
ftn -rmd -hacc -O3 -e Z -c operators_mpi.f90 -DUSE_G2G \
-o operators_mpi.o
ftn -rmd -hacc -O3 -e Z \
stats.o data.o operators_mpi.o linalg.o io.o \
main.f90 -o main_mpi
- ls $CRAYPAT_ROOT/share/traces/
- pat_build -g oacc main
INFO: A maximum of 43 functions from group 'oacc' will be traced.
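The instrumentation step above can be wrapped in a small helper that fails early when perftools is not loaded. This is only a sketch: the function names `instrumented_name` and `instrument` are hypothetical, and it assumes pat_build keeps its default output name (`main` becomes `main+pat`, as in the run below).

```shell
# instrumented_name EXE: default name pat_build gives the instrumented binary
instrumented_name() {
    printf '%s+pat\n' "$1"
}

# instrument EXE GROUP: run pat_build for one trace group and check the result
# (hypothetical helper; GROUP is e.g. oacc)
instrument() {
    exe=$1
    group=$2
    if ! command -v pat_build >/dev/null 2>&1; then
        echo "pat_build not found: load the perftools module first" >&2
        return 1
    fi
    pat_build -g "$group" "$exe" || return 1
    # succeed only if the instrumented executable was actually produced
    [ -x "$(instrumented_name "$exe")" ]
}
```

Usage would be `instrument main oacc`, run after `make` in the same directory.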
Profile
- sbatch.sh santis 1 main+pat 1 1 1 "512 512 50 0.0025"
aprun -n 1 main+pat 512 512 50 0.0025
CrayPat/X: Version 6.2.4
==============================
Welcome to mini-stencil!
mesh :: 512 * 512 dx = 1.95694714784622192E-3
time :: 50 time steps from 0 .. 2.50000000000000005E-3
=============================
-------------------------------------------------
simulation took 2.543 seconds
4676 conjugate gradient iterations 1838.49 per second
246 nonlinear newton iterations
-----------------------------------------------
Experiment data file written: main+pat+23524-12t.xf
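If the job output is saved to a file, the experiment file name and the reported wall time can be pulled out for the next step. A minimal sketch, assuming the log contains the two lines exactly as printed above; the helper names `xf_from_log` and `sim_seconds` are made up for illustration.

```shell
# xf_from_log LOG: print the experiment data file name reported by CrayPat
xf_from_log() {
    sed -n 's/^ *Experiment data file written: *//p' "$1"
}

# sim_seconds LOG: print the wall time reported by mini-stencil
sim_seconds() {
    awk '/simulation took/ { print $3 }' "$1"
}
```

For the run above these would print `main+pat+23524-12t.xf` and `2.543`.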
Analyze
- pat_report *.xf > report.txt
Processing step 5 of 5
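When several experiment files accumulate, it may be safer to write one report per `.xf` file, under a name that the input glob cannot match on a later run. A sketch only: `report_name` and `make_reports` are hypothetical helpers, and the `.txt` suffix is an arbitrary choice.

```shell
# report_name XF: text report name derived from an experiment file name
# (hypothetical convention: replace the .xf suffix with .txt)
report_name() {
    printf '%s.txt\n' "${1%.xf}"
}

# make_reports: run pat_report once per .xf file in the current directory
make_reports() {
    for xf in *.xf; do
        [ -e "$xf" ] || continue   # the glob matched nothing
        pat_report "$xf" > "$(report_name "$xf")"
    done
}
```

With the file from the run above, `make_reports` would write `main+pat+23524-12t.txt`.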
Comments (5)
reporter PROFILE (1024x1024)
- sbatch.sh santis 1 main+pat 1 1 1 "1024 1024 100 0.0025"
-
reporter pat_build -u -g oacc main
- sbatch.sh santis 5 main+pat 1 1 1 "512 512 500 0.0025"
-
reporter perftools-lite
- module load perftools-lite/6.2.4
- export CRAYPAT_LITE=gpu
- make clean; make main
INFO: creating the CrayPat-instrumented executable 'main' (gpu) ...OK
INFO: A maximum of 335 functions from group 'cuda' will be traced.
- sbatch.sh santis 5 main+ptl624 1 1 1 "512 512 500 0.0025"