Test nonblocking MPI collectives (CLE >= 5.2)
Issue #16
new
- cd /apps/santis/sandbox/jgp/mpich/src/mpich-3.1.3/test/mpi/f90/coll
- ftn -c mtestf90.f90
- ftn nonblockingf90.f90 mtestf90.o -o $PE_ENV
- aprun -n8 ./CRAY
Comments (5)
-
reporter -
reporter - cd /apps/santis/sandbox/jgp/mpich/src/mpich-3.1.4/test/mpi/coll
- scorep --mpp=mpi cc 2jg.c -dynamic
# nonblocking2.c
- aprun -n 4 -N 4 -d 1 -j 1 ./a.out
-
reporter Nonblocking collectives such as MPI_Ibcast were introduced in MPI 3.0. Score-P supports MPI only up to v2.2; however, we are working on support for the new MPI 3.x features.
-
reporter Speedup.ch/2015
Compile
- make MPICXX="scorep --mpp=mpi CC"
Run
- aprun -n 1 ./stencil 1024 1 50
last heat: 2433.555556 time: 0.740542
- aprun -n 8 -N 8 -d 1 -j 1 ./stencil_mpi_ddt+sc142 1024 1 50 4 2
[0] last heat: 99.666667 time: 3.936471
- aprun -n 8 -N 8 -d 1 -j 1 ./stencil_mpi_carttopo_neighcolls+sc142 1024 1 50 4 2
[0] last heat: 99.666667
* Nonblocking collective information is missing from the trace file:
MPI_Ineighbor_alltoallv(sbuf, counts, displs, MPI_DOUBLE, rbuf, counts, displs, MPI_DOUBLE, topocomm, &req);
-
reporter Speedup.ch/2015
void operator() (buffer_t* buffer) {
    int size = 2 * buffer->tile_size() * comm->max_n * comm->max_n;
    if (comm->nbc == comm_t::FFT_NBC) {
        NBC_Ialltoall(buffer->a2as, size, MPI_DOUBLE,
                      buffer->a2ar, size, MPI_DOUBLE,
                      MPI_COMM_WORLD, &buffer->handle);
    } else {
        MPI_Alltoall(buffer->a2as, size, MPI_DOUBLE,
                     buffer->a2ar, size, MPI_DOUBLE,
                     MPI_COMM_WORLD);
    }
Cray XC
Compile
MPICXX=CC CC=cc CXX=CC F77=ftn \
./configure \
--prefix=/apps/escha/sandbox/jgp/hoefler/libNBC/1.1.1/xc/gnu_482 \
- module swap PrgEnv-cray PrgEnv-gnu
- module load fftw/3.3.4.4
CC -w \
-I../libNBC/1.1.1/xc/gnu_482/include \
-L../libNBC/1.1.1/xc/gnu_482/lib \
3d-fft.cpp -lnbc
Run
- aprun -n2 -N1 a.out
1 repetitions of N=320, testsize: 0, testint 0, tests: 0, max_n: 160 approx. size: 1000.000000 MB
normal (MPI): 5.617800 (NBC_A2A: 0.080495/0.000000) (Test: 0.000000) (2x1d-fft: 3.083646) - 1x131072000 byte
normal (NBC): 5.597339 (NBC_A2A: 0.048640/0.020811) (Test: 0.000000) (2x1d-fft: 3.094804) - 1x131072000 byte
pipe (NBC): 5.403412 (NBC_A2A: 0.036380/0.025574) (Test: 0.000000) (2x1d-fft: 3.074958) - 1x131072000 byte
tile (NBC): 5.357859 (NBC_A2A: 0.040936/0.018718) (Test: 0.000000) (2x1d-fft: 3.037793) - 1x131072000 byte
win (NBC): 5.298810 (NBC_A2A: 0.044356/0.015685) (Pack: 0.000000) (2x1d-fft: 2.955262) - 1x131072000 byte
wintile (NBC): 5.219840 (NBC_A2A: 0.082914/0.060287) (Pack: 0.000000) (2x1d-fft: 2.887048) - 1x131072000 byte
# real 42.13