Perftools API

Issue #45 new
jg piccinali repo owner created an issue

Description

C API

#include <pat_api.h>
// /opt/cray/perftools/default/include/
        int istat = PAT_API_FAIL;
        istat = PAT_record(PAT_STATE_OFF);   // recording off until the region of interest
...
        istat = PAT_record(PAT_STATE_ON);    // printf("%d: pat_rec=%d\n", __LINE__, istat);
        istat = PAT_region_begin(1, "loop1");
... loop1 ...
        istat = PAT_region_end(1);
        istat = PAT_record(PAT_STATE_OFF);
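
A minimal sketch of how the calls above could be wrapped so that the same source builds with or without CrayPat. It assumes that the `_CRAYPAT_CSCS` macro used on the compile line in this issue is what switches the instrumentation on. The helper names `prof_begin`, `prof_end` and `sum_squares` are hypothetical, and the value 0 returned by the uninstrumented branch is only a placeholder, not a real PAT status.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _CRAYPAT_CSCS
#include <pat_api.h>   /* /opt/cray/perftools/default/include/ */
#endif

/* Hypothetical wrapper: switch recording on and open region `id`.
   Returns the status of PAT_region_begin, or 0 when built without CrayPat. */
int prof_begin(int id, const char *label)
{
    assert(label != NULL);
#ifdef _CRAYPAT_CSCS
    int istat = PAT_record(PAT_STATE_ON);
    istat = PAT_region_begin(id, label);
    return istat;
#else
    (void)id;
    (void)label;
    return 0;   /* placeholder, not a real PAT status */
#endif
}

/* Hypothetical wrapper: close region `id` and switch recording off again.
   Returns the status of PAT_record, or 0 when built without CrayPat. */
int prof_end(int id)
{
#ifdef _CRAYPAT_CSCS
    int istat = PAT_region_end(id);
    istat = PAT_record(PAT_STATE_OFF);
    return istat;
#else
    (void)id;
    return 0;   /* placeholder, not a real PAT status */
#endif
}

/* Stand-in for "loop1": sums i*i for i in [0, n) inside region 1. */
double sum_squares(int n)
{
    double sum = 0.0;
    prof_begin(1, "loop1");
    for (int i = 0; i < n; ++i)
        sum += (double)i * (double)i;
    prof_end(1);
    return sum;
}
```

Calling `sum_squares(1000)` from `main` after an initial `PAT_record(PAT_STATE_OFF)` would follow the same on/begin/end/off sequence as the snippet above; without `-D_CRAYPAT_CSCS` the wrappers do nothing and the loop runs unchanged.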

Fortran API

include 'pat_apif.h'
! /opt/cray/perftools/default/include/
      integer :: istat
      call PAT_record(PAT_STATE_OFF, istat)
...
      call PAT_record(PAT_STATE_ON, istat); !print *,"pat_rec=",istat
      call PAT_region_begin( 1, "loop1", istat )
... loop1 ...
      call PAT_region_end ( 1, istat )
      call PAT_record(PAT_STATE_OFF, istat)

Get the source:

  • git clone EuroHack15.git
  • cd examples/qwiklab/perftools_api

Setup:

  • module load perftools/6.2.4
  • module list
Currently Loaded Modulefiles:
modules/3.2.10.3
nodestat/2.2-1.0502.53712.3.109.ari
sdb/1.0-1.0502.55976.5.27.ari
alps/5.2.1-2.0502.9041.11.6.ari
lustre-cray_ari_s/2.5_3.0.101_0.31.1_1.0502.8394.10.1-1.0502.17198.8.51
udreg/2.3.2-1.0502.9275.1.12.ari
ugni/5.0-1.0502.9685.4.24.ari
gni-headers/3.0-1.0502.9684.5.2.ari
dmapp/7.0.1-1.0502.9501.5.219.ari
xpmem/0.1-2.0502.55507.3.2.ari
hss-llm/7.2.0
Base-opts/1.0.2-1.0502.53325.1.2.ari
craype-network-aries
craype/2.4.0
cce/8.3.12
totalview-support/1.1.4
totalview/8.11.0
cray-libsci/13.0.4
pmi/5.0.7-1.0000.10678.155.25.ari
rca/1.0.0-2.0502.53711.3.127.ari
atp/1.8.2
PrgEnv-cray/5.2.40
craype-sandybridge
slurm
cray-mpich/7.2.2
ddt/5.0
cray-libsci_acc/3.1.1
cudatoolkit/6.5.14-1.0502.9613.6.1
craype-accel-nvidia35
perftools/6.2.4

Compile:

  • module load perftools/6.2.4
  • C code:
    • pat_help API regions C .
    • cc -hnoomp -D_CSCS_ITMAX=1000 -D_CRAYPAT_CSCS jacobi_openmp.c -o l1l2
    • pat_build -f -u l1l2
  • Fortran code:
    • pat_help API regions Fortran .
    • ftn -D_CRAYPAT_CSCS task1_patrecord-loop.F90 -o l1l2
    • pat_build -f -u l1l2

Run:

  • rm -f *.xf *.ap2
  • aprun -n1 l1l2+pat

Report:

  • pat_report -T -s traced_functions=show l1l2+pat*.xf > xf

No user instrumentation:

Table 1:  Profile by Function Group and Function
  Time% |     Time | Imb. |  Imb. | Calls |Group
        |          | Time | Time% |       | Function
 100.0% | 1.917253 |   -- |    -- |   7.0 |Total
|-----------------------------------------------------
| 100.0% | 1.917253 |   -- |    -- |   7.0 |USER
||----------------------------------------------------
||  99.9% | 1.914418 |   -- |    -- |   1.0 |jacobi
||   0.1% | 0.002794 |   -- |    -- |   1.0 |init_host
||   0.0% | 0.000024 |   -- |    -- |   1.0 |main
||   0.0% | 0.000008 |   -- |    -- |   1.0 |stop_timer
||   0.0% | 0.000007 |   -- |    -- |   2.0 |mytimer_
||   0.0% | 0.000002 |   -- |    -- |   1.0 |start_timer
|==========================================

User instrumentation of 1 loop only:

Table 1:  Profile by Function Group and Function
  Time% |     Time | Imb. |  Imb. | Calls |Group
        |          | Time | Time% |       | Function
 100.0% | 0.001708 |   -- |    -- |   2.0 |Total
|---------------------------------------------------
| 100.0% | 0.001708 |   -- |    -- |   2.0 |USER
||--------------------------------------------------
||  98.2% | 0.001677 |   -- |    -- |   1.0 |#1.loop1
||   1.8% | 0.000031 |   -- |    -- |   1.0 |main
|===================================================

User instrumentation of 2 loops:

Table 1:  Profile by Function Group and Function
  Time% |     Time | Imb. |  Imb. | Calls |Group
        |          | Time | Time% |       | Function
 100.0% | 0.002019 |   -- |    -- |   3.0 |Total
|---------------------------------------------------
| 100.0% | 0.002019 |   -- |    -- |   3.0 |USER
||--------------------------------------------------
||  82.0% | 0.001656 |   -- |    -- |   1.0 |#1.loop1
||  14.8% | 0.000299 |   -- |    -- |   1.0 |#2.loop2
||   3.2% | 0.000064 |   -- |    -- |   1.0 |main
|===================================================
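
In the tables above, the Time% column appears to be each row's Time divided by the Total row. A small sketch to check that reading against the reported numbers; the helper name `time_percent` is hypothetical and not part of perftools.

```c
#include <assert.h>

/* Hypothetical helper: share of `part` in `total`, expressed in percent. */
double time_percent(double part, double total)
{
    assert(total > 0.0);
    return 100.0 * part / total;
}
```

With the values from the two-loop table, `time_percent(0.001656, 0.002019)` gives about 82.0 and `time_percent(0.000299, 0.002019)` about 14.8, in line with the Time% reported for `#1.loop1` and `#2.loop2`.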

Comments (5)

  1. jg piccinali reporter

    PGI

    Setup

    • export PATH=/users/hck28/pgi/linux86-64/15.5/bin:$PATH
    • ftn -V (reports pgf90 15.5-0)
    • cc -V (reports pgcc 15.5-0)

    Fortran

    Table 1:  Profile by Function Group and Function
      Time% |     Time | Imb. |  Imb. |  Calls |Group
            |          | Time | Time% |        | Function
     100.0% | 1.123486 |   -- |    -- | 2002.0 |Total
    |----------------------------------------------------
    | 100.0% | 1.123486 |   -- |    -- | 2002.0 |USER
    ||---------------------------------------------------
    ||  69.0% | 0.775205 |   -- |    -- | 1000.0 |#1.loop1
    ||  30.9% | 0.347072 |   -- |    -- | 1000.0 |#2.loop2
    ||   0.1% | 0.001208 |   -- |    -- |    1.0 |MAIN_
    ||   0.0% | 0.000001 |   -- |    -- |    1.0 |main
    

    C: KO (the run fails)

    • same failure with pgi/15.3 and pgi/15.5; aprun exits with status 32:
    Experiment data file written:
    /scratch/santis/piccinal/EuroHack15.git/examples/qwiklab/perftools_api/PGI/l1l2+pat+21879-12t.xf
    Application 164785 exit codes: 32
    Command exited with non-zero status 32
    
  2. jg piccinali reporter

    C: OK with perftools/6.2.3

    Table 1:  Profile by Function Group and Function
    
      Time% |     Time | Imb. |  Imb. | Calls |Group
            |          | Time | Time% |       | Function
    
     100.0% | 0.001965 |   -- |    -- |   4.0 |Total
    |---------------------------------------------------
    | 100.0% | 0.001965 |   -- |    -- |   4.0 |USER
    ||--------------------------------------------------
    ||  83.2% | 0.001635 |   -- |    -- |   1.0 |#1.loop1
    ||  14.7% | 0.000289 |   -- |    -- |   1.0 |#2.loop2
    ||   1.4% | 0.000028 |   -- |    -- |   1.0 |main
    ||   0.6% | 0.000012 |   -- |    -- |   1.0 |exit
    |===================================================
    