summer school - openacc/pgi (perftools)

Issue #37 new
jg piccinali repo owner created an issue

DAINT

Get the src

PGI

Compile

  • module swap PrgEnv-cray PrgEnv-pgi
  • module load craype-accel-nvidia35
  • module load perftools
  • module list
Currently Loaded Modulefiles:
  1) modules/3.2.10.3
  2) nodestat/2.2-1.0502.53712.3.109.ari
  3) sdb/1.0-1.0502.55976.5.27.ari
  4) alps/5.2.1-2.0502.9041.11.6.ari
  5) lustre-cray_ari_s/2.5_3.0.101_0.31.1_1.0502.8394.10.1-1.0502.17198.8.51
  6) udreg/2.3.2-1.0502.9275.1.12.ari
  7) ugni/5.0-1.0502.9685.4.24.ari
  8) gni-headers/3.0-1.0502.9684.5.2.ari
  9) dmapp/7.0.1-1.0502.9501.5.219.ari
 10) xpmem/0.1-2.0502.55507.3.2.ari
 11) hss-llm/7.2.0
 12) Base-opts/1.0.2-1.0502.53325.1.2.ari
 13) craype-network-aries
 14) craype-sandybridge
 15) craype/2.4.0
 16) slurm
 17) cray-mpich/7.2.2
 18) ddt/5.0
 19) pgi/15.3.0
 20) totalview-support/1.1.4
 21) totalview/8.11.0
 22) pmi/5.0.7-1.0000.10678.155.25.ari
 23) atp/1.8.2
 24) PrgEnv-pgi/5.2.40
 25) cudatoolkit/6.5.14-1.0502.9613.6.1
 26) craype-accel-nvidia35
 27) rca/1.0.0-2.0502.53711.3.127.ari
 28) perftools/6.2.4
  • make clean
  • make main
ftn -acc=verystrict -ta=nvidia,nofma,cc35,cuda6.5  -Mpreprocess  -g -c stats.f90
ftn -acc=verystrict -ta=nvidia,nofma,cc35,cuda6.5  -Mpreprocess  -g -c data.f90
ftn -acc=verystrict -ta=nvidia,nofma,cc35,cuda6.5  -Mpreprocess  -g -c operators.f90
ftn -acc=verystrict -ta=nvidia,nofma,cc35,cuda6.5  -Mpreprocess  -g -c linalg.f90
ftn -acc=verystrict -ta=nvidia,nofma,cc35,cuda6.5  -Mpreprocess  -g -c io.f90
ftn -acc=verystrict -ta=nvidia,nofma,cc35,cuda6.5  -Mpreprocess  -g  \
stats.o   data.o   operators.o     linalg.o     io.o main.f90  -o main

## ftn -acc=verystrict -ta=nvidia,nofma,cc35,cuda6.5  -Mpreprocess  -g   \
-c operators_mpi.f90 -DUSE_G2G
## ftn -acc=verystrict -ta=nvidia,nofma,cc35,cuda6.5  -Mpreprocess  -g  \
stats.o   data.o   operators_mpi.o linalg.o     io.o main.f90  \
-o main_mpi
  • ls $CRAYPAT_ROOT/share/traces/
  • pat_build -g oacc main
INFO: A maximum of 70 functions from group 'oacc' will be traced.
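
The compile lines above repeat one flag set for every source file. As a reading aid, the sketch below rebuilds that flag set piece by piece. The variable names (ACC_FLAGS, TA_FLAGS, PP_FLAGS, DBG_FLAGS, FFLAGS) are made up for illustration and are not taken from the project Makefile, and the per-flag comments are my reading of the PGI options, not something stated in this issue. The script only prints the commands, it does not run ftn.

```shell
#!/bin/sh
# Illustration only: variable names are hypothetical, not from the Makefile.
ACC_FLAGS="-acc=verystrict"                # OpenACC on; error out on non-OpenACC accelerator directives
TA_FLAGS="-ta=nvidia,nofma,cc35,cuda6.5"   # NVIDIA target, no fused multiply-add, compute capability 3.5, CUDA 6.5
PP_FLAGS="-Mpreprocess"                    # run the preprocessor on the Fortran sources (macros such as USE_G2G)
DBG_FLAGS="-g"                             # debug information
FFLAGS="$ACC_FLAGS $TA_FLAGS $PP_FLAGS $DBG_FLAGS"

# Print the per-file compile commands instead of executing them.
for src in stats data operators linalg io; do
    echo "ftn $FFLAGS -c $src.f90"
done
```

Keeping the accelerator options in one variable makes it easier to compare this build against the other PrgEnv builds from the summer school, since only the flag set changes.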

Profile

  • sbatch.sh santis 1 main+pat 1 1 1 "1024 1024 100 0.0025"
Experiment data file written: main+pat+26467-12t.xf

❗ Tracing overhead: the instrumented main+pat runs slower than the plain main.

Analyze (fails)

  • pat_report *.xf >xf

Attachment: pt624_pgi.png

Comments (6)

  1. jg piccinali reporter
    • pat_build -f -u -g oacc main
    WARNING: Tracing small, frequently called functions can add excessive overhead.
    WARNING: To set a minimum size, say 800 bytes, for traced functions, use:
        -D trace-text-size=800.
    INFO: A total of 15 selected non-group functions were traced.
    INFO: A maximum of 70 functions from group 'oacc' will be traced.
    /tmp/buildvAH7rq/wrapStaticFunctions.c:17: error: redefinition of parameter 'u'
    /tmp/buildvAH7rq/wrapStaticFunctions.c:17: error: previous definition of 'u' was here
    /tmp/buildvAH7rq/wrapStaticFunctions.c:17: error: redefinition of parameter 's'
    /tmp/buildvAH7rq/wrapStaticFunctions.c:17: error: previous definition of 's' was here
    