Keep track of masked-out volume in CarpetMask

Issue #434 closed
Erik Schnetter created an issue

Keep track of the volume that is masked out by CarpetMask, and take this volume into account when checking in CarpetReduce that the integral over the simulation domain equals the domain volume.

Comments (31)

  1. Roland Haas
    • removed comment

    As a stopgap measure (and maybe also afterwards): warn only once when the volumes begin to differ and once when they stop differing.

    Compiles but not further tested :-)

  2. Erik Schnetter reporter
    • removed comment

    Introduce a grid scalar CarpetReduce::excised_cells that tallies the volume that is excised from reduction operations, as is done e.g. in CarpetMask. This allows testing the correctness of the weight grid function for reduction operations in the presence of excision.
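
    The bookkeeping described here can be sketched as follows. This is a minimal stand-alone illustration, not the CarpetReduce or CarpetMask source: the flat weight vector and the helpers `excise`, `weight_sum` and `volumes_agree` are hypothetical names. The only idea taken from the thread is that the excised volume is tallied from the weight a cell still carries before that weight is set to zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the reduction weight: one value in [0, 1] per cell.
using Weights = std::vector<double>;

// Zero the weight of every cell flagged in `inside` and return the volume
// removed. Only the weight a cell still carries is tallied, so excising the
// same cell twice does not count its volume twice.
double excise(Weights& weight, const std::vector<bool>& inside,
              double cell_volume) {
  assert(weight.size() == inside.size());
  double excised = 0.0;
  for (std::size_t i = 0; i < weight.size(); ++i) {
    if (inside[i]) {
      excised += cell_volume * weight[i];
      weight[i] = 0.0;
    }
  }
  return excised;
}

// Integral of 1 over the domain, as a weighted reduction would compute it.
double weight_sum(const Weights& weight, double cell_volume) {
  double sum = 0.0;
  for (double w : weight) sum += cell_volume * w;
  return sum;
}

// Consistency test: the weight sum must equal the map volume minus the
// excised volume, up to a relative tolerance.
bool volumes_agree(double sum, double map_volume, double excised_volume,
                   double rel_tol = 1.0e-12) {
  const double domain_volume = map_volume - excised_volume;
  return std::fabs(sum - domain_volume) <=
         rel_tol * std::max(1.0, std::fabs(map_volume));
}
```

    Calling `excise` a second time on the same cells returns zero, because their weight has already been removed.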

  3. Roland Haas
    • removed comment

    I tried with the attached parfile (2 processors). The mask patch does not seem to fix the issue. My (completely evidence-less) suspicion is that the issue lies with the outer boundaries (which are included by the reduction operation in CarpetReduce) and the ghost zones (which end up having weight 1 but are ignored by the reduction since it skips cctk_nghostzones points [at least it did so in the git version where I last looked at it]). I get:

    INFO (CarpetReduce): Finalise the weight on level 0
    INFO (CarpetReduce): Testing weight
    INFO (CarpetReduce): Reduction weight sum: 2009.5
    INFO (CarpetReduce): Volume of map #0: 2156
    INFO (CarpetReduce): Simulation domain volume: 1963
    INFO (CarpetReduce): Additional excised volume: 193

    and a warning:

    WARNING level 1 in thorn CarpetReduce processor 0 host horizon.tapir.caltech.edu
      (line 127 of /home/rhaas/ET/arrangements/Carpet/CarpetReduce/src/mask_test.c):
      -> Simulation domain volume and reduction weight sum differ

    The sum of "Simulation domain volume" and "Additional excised volume" is "Volume of map #0".

  4. Erik Schnetter reporter
    • removed comment

    The problem was that the excision region counted how much was excised before the boundary was marked, so that the boundary was effectively removed twice from the domain.

    Updated patch attached.

  5. Roland Haas
    • removed comment

    Doesn't seem to fix it yet though it's a step in the right direction.

    INFO (CarpetReduce): Finalise the weight on level 0
    INFO (CarpetReduce): Testing weight
    INFO (CarpetReduce): Reduction weight sum: 2009.5
    INFO (CarpetReduce): Volume of map #0: 2156
    INFO (CarpetReduce): Simulation domain volume: 2068
    INFO (CarpetReduce): Additional excised volume: 88
    2 2.000
    WARNING level 1 in thorn CarpetReduce processor 0 host horizon.tapir.caltech.edu
      (line 127 of /home/rhaas/ET/arrangements/Carpet/CarpetReduce/src/mask_test.c):
      -> Simulation domain volume and reduction weight sum differ
    WARNING level 1 in thorn CarpetReduce processor 0 host horizon.tapir.caltech.edu
      (line 127 of /home/rhaas/ET/arrangements/Carpet/CarpetReduce/src/mask_test.c):
      -> Simulation domain volume and reduction weight sum differ
    LoopControl timing statistics:
    Loop #0 "CarpetSurfaceSetup_all": total count: 3 total setup: 9.53674e-07 total calc: 0.00165701 avg calc: 0.000552336 avg first calc: 0.000618935 avg improvement: 11% saved: 0.000199795 seconds
    Loop #1 "MaskBase_SetMask_all": total count: 4 total setup: 0 total calc: 0.000102043 avg calc: 2.55108e-05 avg first calc: 3.00407e-05 avg improvement: 15% saved: 1.81198e-05 seconds
    Total calculation time: 0.00182819 seconds; total saved time: 0.000217915 seconds
    --------------------------------------------------------------------------------
    Done.

    If I make the spherical surface large enough so that everything is excluded (radius=500) I get:

    INFO (CarpetReduce): Finalise the weight on level 0
    INFO (CarpetReduce): Testing weight
    INFO (CarpetReduce): Reduction weight sum: 0
    INFO (CarpetReduce): Volume of map #0: 2156
    INFO (CarpetReduce): Simulation domain volume: 1001
    INFO (CarpetReduce): Additional excised volume: 1155
    2 2.000
    LoopControl timing statistics:
    Loop #0 "CarpetSurfaceSetup_all": total count: 3 total setup: 0 total calc: 0.00162721 avg calc: 0.000542402 avg first calc: 0.000565052 avg improvement: 4% saved: 6.79493e-05 seconds
    Loop #1 "MaskBase_SetMask_all": total count: 4 total setup: 0 total calc: 9.67979e-05 avg calc: 2.41995e-05 avg first calc: 2.5034e-05 avg improvement: 3% saved: 3.33786e-06 seconds
    Total calculation time: 0.00177097 seconds; total saved time: 7.12872e-05 seconds
    --------------------------------------------------------------------------------
    Done.
    WARNING level 1 in thorn CarpetReduce processor 0 host horizon.tapir.caltech.edu
      (line 127 of /home/rhaas/ET/arrangements/Carpet/CarpetReduce/src/mask_test.c):
      -> Simulation domain volume and reduction weight sum differ
    WARNING level 1 in thorn CarpetReduce processor 0 host horizon.tapir.caltech.edu
      (line 127 of /home/rhaas/ET/arrangements/Carpet/CarpetReduce/src/mask_test.c):
      -> Simulation domain volume and reduction weight sum differ
    WARNING level 1 in thorn CarpetReduce processor 0 host horizon.tapir.caltech.edu
      (line 127 of /home/rhaas/ET/arrangements/Carpet/CarpetReduce/src/mask_test.c):
      -> Simulation domain volume and reduction weight sum differ

    This works on a single MPI process (which seems to point towards the ghost zones).

  6. Roland Haas
    • removed comment

    actually it seems as if only a CCTK_ReduceLocalScalar was missing. mask.3.patch adds it. This passes my test file with 1 and 2 MPI processes and with small and large excised region.
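
    A toy model shows why a missing cross-process sum produces this symptom. The sketch below is an illustration under stated assumptions, not Cactus code: each entry of the input vector stands for the tally one MPI process accumulated on its own part of the grid, and the hypothetical helper `global_excised` stands in for a sum reduction across processes. The map volume 2156, the weight sum 2009.5 and the excised volume 88 reported on processor 0 come from the logs above. The value 58.5 for the second process is inferred as 2156 - 2009.5 - 88 and does not appear in any log.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Stand-in for a sum reduction across MPI processes (hypothetical helper).
double global_excised(const std::vector<double>& local_excised) {
  return std::accumulate(local_excised.begin(), local_excised.end(), 0.0);
}

// Domain volume used by the consistency test: the map volume minus
// everything that was excised on any process.
double domain_volume(double map_volume, double excised_volume) {
  return map_volume - excised_volume;
}
```

    With the tallies {88, 58.5}, the summed value gives a domain volume of 2009.5, which matches the reduction weight sum. Using only the first tally gives 2068, the mismatching value seen in the two-process run.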

  7. Roland Haas
    • changed status to resolved
    • removed comment

    This works for me, at least with trunk. I tried simfactory's qc-mclachlan.par file on Kraken today. And --- after fixing the number of metric_timelevels to avoid some low-level Carpet assert (the one in ticket #163) --- it properly keeps track of the total volume for at least 2560 iterations or t=5M.

    Upon inspecting the timestamps of comment 3 (fixed, 2011-11-09) which is also the commit timestamp CarpetMask/ofb3ca7d192ff4 and the Maxwell release timestamp (2011-10-24) I am no longer surprised. This was not fixed in time for the release.

  8. Roland Haas
    • changed status to open
    • removed comment

    This seems to happen again with the version of qc0-mclachlan.par proposed in #732 (which only adds more analysis routines). I observe the warning when running with 36 cores and 6 threads each but not when running on 1 core one thread or 2 cores 1 thread each.

  9. Erik Schnetter reporter
    • removed comment

    Roland, do you want to try this patch?

    $ git diff
    diff --git a/Carpet/CarpetMask/src/mask_surface.cc b/Carpet/CarpetMask/src/mask_
    index f2a592b..2b7ee6f 100644
    --- a/Carpet/CarpetMask/src/mask_surface.cc
    +++ b/Carpet/CarpetMask/src/mask_surface.cc
    @@ -133,6 +133,15 @@ namespace CarpetMask {
               CCTK_LOOP3_ALL(CarpetSurfaceSetup, cctkGH, i,j,k) {
                 int const ind = CCTK_GFINDEX3D (cctkGH, i, j, k);
    
    +            bool const is_ghost =
    +              (not cctk_bbox[0] and i < cctk_nghostzones[0]) or
    +              (not cctk_bbox[2] and j < cctk_nghostzones[1]) or
    +              (not cctk_bbox[4] and k < cctk_nghostzones[2]) or
    +              (not cctk_bbox[1] and i >= cctk_lsh[0] - cctk_nghostzones[0]) or
    +              (not cctk_bbox[3] and j >= cctk_lsh[1] - cctk_nghostzones[1]) or
    +              (not cctk_bbox[5] and k >= cctk_lsh[2] - cctk_nghostzones[2]);
    +            CCTK_REAL const ghost_factor = CCTK_REAL (not is_ghost);
    +            
                 CCTK_REAL const dx = x[ind] - x0;
                 CCTK_REAL const dy = y[ind] - y0;
                 CCTK_REAL const dz = z[ind] - z0;
    @@ -142,7 +151,8 @@ namespace CarpetMask {
                 if (rho < 1.0e-12) {
                   // Always excise the surface origin
                   // Tally up the weight we are removing
    -              * excised_cells += cell_volume * factor * BCNT(iweight[ind]);
    +              * excised_cells +=
    +                cell_volume * factor * ghost_factor * BCNT(iweight[ind]);
                   iweight[ind] = 0;
                 } else {
                   CCTK_REAL theta =
    @@ -188,7 +198,8 @@ namespace CarpetMask {
                     sf_radius[a + maxntheta * (b + maxnphi * sn)];
                   if (rho <= dr * shrink_factor) {
                     // Tally up the weight we are removing
    -                * excised_cells += cell_volume * factor * BCNT(iweight[ind]);
    +                * excised_cells +=
    +                  cell_volume * factor * ghost_factor * BCNT(iweight[ind]);
                     iweight[ind] = 0;
                   }
                 }
    
  10. Roland Haas
    • removed comment

    Yes. Unfortunately right now I get a SEGFAULT right after the error message. Currently I am doing a recompile from scratch.

    Even with the segfault I should be able to tell if this improves the situation.

    From the look of the patch and the fact that the error is MPI process number dependent the fix seems to be in the right direction.

  11. Roland Haas
    • removed comment

    Does not seem to fix the issue, though the numbers change slightly. I currently have

    ||= Code version =||= MPI processes =||= threads per process =||= Simulation domain volume =||= Additional excised volume =||= Reduction weight sum =||
    || trunk || 6 || 6 || 431999.99795532227 || 0.002044677734375 || 431999.99794769287 ||
    || trunk || 2 || 6 || 431999.99795532227 || 0.002044677734375 || 431999.99794769287 ||
    || trunk || 8 || 6 || 431999.99795150757 || 0.002048492431640625 || 431999.99794769287 ||
    || patched || 6 || 6 || 431999.99796295166 || 0.00203704833984375 || 431999.99794769287 ||
    || patched || 1 || 1 || no difference || no difference || no difference ||
    || patched || 36 || 1 || no difference || no difference || no difference ||
    || patched || 6 || 1 || no difference || no difference || no difference ||

    The expected total volume before exclusion (which is "Simulation domain volume"+"Additional excised volume") is 432000.

    So it seems as if the error is due to some OpenMP issue since all runs using 1 thread per process find no difference whereas all runs that use 6 threads per process do find a difference.
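
    The arithmetic behind the table can be checked with a few lines of code. This is only a sketch with hypothetical helper names (`total_volume`, `mismatch`); the input numbers in the note below are the ones reported for trunk with 6 processes and 6 threads.

```cpp
#include <cassert>
#include <cmath>

// Volume before excision, reconstructed from the two reported numbers.
double total_volume(double domain_volume, double excised_volume) {
  return domain_volume + excised_volume;
}

// Difference that triggers the warning: domain volume minus weight sum.
double mismatch(double domain_volume, double weight_sum) {
  return domain_volume - weight_sum;
}
```

    For that row, `total_volume(431999.99795532227, 0.002044677734375)` reproduces 432000 to within rounding, while `mismatch(431999.99795532227, 431999.99794769287)` is roughly 7.6e-6. The excised volume is therefore the quantity that is off, not the total.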

    Please note that all runs (with or without the patch, with any number of processes and threads) abort shortly afterwards with a SEGFAULT exception:

    [zwicky163:02171] [ 0] /lib64/libpthread.so.0 [0x2b1c3e286be0]
    [zwicky163:02171] [ 1] /panfs/ds06/sxs/rhaas/simulations/qc0-mclachlanOersted_fixed/SIMFACTORY/exe/cactus_sim(opal_memory_ptmalloc2_int_malloc+0x1fa) [0x2c4bf3a]
    [zwicky163:02171] [ 2] /panfs/ds06/sxs/rhaas/simulations/qc0-mclachlanOersted_fixed/SIMFACTORY/exe/cactus_sim [0x2c4ab58]
    [zwicky163:02171] [ 3] /usr/lib64/libstdc++.so.6(_Znwm+0x1d) [0x2b1c3ff081dd]
    [zwicky163:02171] [ 4] /usr/lib64/libstdc++.so.6(_ZNSs4_Rep9_S_createEmmRKSaIcE+0x21) [0x2b1c3fee6861]
    [zwicky163:02171] [ 5] /usr/lib64/libstdc++.so.6(_ZNSs4_Rep8_M_cloneERKSaIcEm+0x2b) [0x2b1c3fee723b]
    [zwicky163:02171] [ 6] /usr/lib64/libstdc++.so.6(_ZNSs7reserveEm+0x45) [0x2b1c3fee7b45]
    [zwicky163:02171] [ 7] /usr/lib64/libstdc++.so.6(_ZNSt15basic_stringbufIcSt11char_traitsIcESaIcEE8overflowEi+0xd5) [0x2b1c3fee10c5]
    [zwicky163:02171] [ 8] /usr/lib64/libstdc++.so.6(_ZNSt15basic_streambufIcSt11char_traitsIcEE6xsputnEPKcl+0x7d) [0x2b1c3fee5d1d]
    [zwicky163:02171] [ 9] /usr/lib64/libstdc++.so.6(_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc+0x111) [0x2b1c3fedb311]
    [zwicky163:02171] [10] /panfs/ds06/sxs/rhaas/simulations/qc0-mclachlanOersted_fixed/SIMFACTORY/exe/cactus_sim(_ZN6Carpet15reflevel_setterC1EPK4_cGHi+0x1f8) [0x183f608]
    [zwicky163:02171] [11] /panfs/ds06/sxs/rhaas/simulations/qc0-mclachlanOersted_fixed/SIMFACTORY/exe/cactus_sim(_ZN12CarpetReduce9ReduceGVsEPK4_cGHiiiPviPKiPKNS_9reductionEi+0x45e) [0x17bf3fe]
    [zwicky163:02171] [12] /panfs/ds06/sxs/rhaas/simulations/qc0-mclachlanOersted_fixed/SIMFACTORY/exe/cactus_sim(_ZN12CarpetReduce11average_GVsEPK4_cGHiiiPviPKi+0x2d) [0x17c201d]
    [zwicky163:02171] [13] /panfs/ds06/sxs/rhaas/simulations/qc0-mclachlanOersted_fixed/SIMFACTORY/exe/cactus_sim(CCTK_Reduce+0x148) [0x7d9fc8]
    [zwicky163:02171] [14] /panfs/ds06/sxs/rhaas/simulations/qc0-mclachlanOersted_fixed/SIMFACTORY/exe/cactus_sim [0x77f430]
    [zwicky163:02171] [15] /panfs/ds06/sxs/rhaas/simulations/qc0-mclachlanOersted_fixed/SIMFACTORY/exe/cactus_sim [0x77e48a]
    [zwicky163:02171] [16] /panfs/ds06/sxs/rhaas/simulations/qc0-mclachlanOersted_fixed/SIMFACTORY/exe/cactus_sim(CCTKi_TriggerAction+0x66) [0x5acd56]
    [zwicky163:02171] [17] /panfs/ds06/sxs/rhaas/simulations/qc0-mclachlanOersted_fixed/SIMFACTORY/exe/cactus_sim [0x5a947b]
    [zwicky163:02171] [18] /panfs/ds06/sxs/rhaas/simulations/qc0-mclachlanOersted_fixed/SIMFACTORY/exe/cactus_sim(CCTKi_DoScheduleTraverse+0x208) [0x5ac0e8]
    [zwicky163:02171] [19] /panfs/ds06/sxs/rhaas/simulations/qc0-mclachlanOersted_fixed/SIMFACTORY/exe/cactus_sim(CCTK_ScheduleTraverse+0x199) [0x5a5119]
    [zwicky163:02171] [20] /panfs/ds06/sxs/rhaas/simulations/qc0-mclachlanOersted_fixed/SIMFACTORY/exe/cactus_sim [0x17db30d]
    [zwicky163:02171] [21] /panfs/ds06/sxs/rhaas/simulations/qc0-mclachlanOersted_fixed/SIMFACTORY/exe/cactus_sim(_ZN6Carpet10InitialiseEP12tFleshConfig+0x2e7) [0x17de647]
    [zwicky163:02171] [22] /panfs/ds06/sxs/rhaas/simulations/qc0-mclachlanOersted_fixed/SIMFACTORY/exe/cactus_sim(main+0x99) [0x59a3b9]
    [zwicky163:02171] [23] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2b1c4057b994]
    [zwicky163:02171] [24] /panfs/ds06/sxs/rhaas/simulations/qc0-mclachlanOersted_fixed/SIMFACTORY/exe/cactus_sim [0x59a0b9]
    [zwicky163:02171] *** End of error message ***
    

    which comes from Carpet/Carpet/src/modes.cc:1359 which is a call to enter_level_mode.

    If either one of these is reproducible then I think we must bump this issue to critical and fix it before the release.

    These happened with a clean (according to a diff against a fresh checked out copy) -realcleaned configuration. I am trying this on lonestar now.

  12. Erik Schnetter reporter
    • marked as
    • changed milestone to ET_2012_11
    • removed comment

    I confirm that my previous patch has no effect, since ghost zones were already correctly excluded.

    I see the OpenMP error; a global variable is modified by all patches simultaneously. A reduction clause is missing.

    The segfault seems to occur in an I/O statement; a certain "reserve" function fails. This looks like memory management. I don't know where string buffers are used -- this may be a timer issue. Raising level since this may be a blocker.
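
    The OpenMP problem and its remedy can be sketched in isolation. The function below is a minimal stand-alone example, not the CarpetMask loop: `tally_excised` and `excise_flag` are hypothetical names. It shows the general pattern of accumulating into a local variable under a `reduction(+ : ...)` clause, so that every thread sums into a private copy instead of all threads updating one shared variable.

```cpp
#include <cassert>
#include <vector>

// Sum the volume of all flagged cells. With the reduction clause each thread
// accumulates a private partial sum, and OpenMP combines the partial sums
// when the loop ends. Without the clause, concurrent updates of a shared
// accumulator would race and some contributions could be lost.
double tally_excised(const std::vector<int>& excise_flag, double cell_volume) {
  assert(cell_volume >= 0.0);
  double local_excised = 0.0;
  const long n = static_cast<long>(excise_flag.size());
#pragma omp parallel for reduction(+ : local_excised)
  for (long i = 0; i < n; ++i) {
    if (excise_flag[i]) {
      local_excised += cell_volume;
    }
  }
  return local_excised;
}
```

    When compiled without OpenMP support the pragma is ignored and the loop runs serially, giving the same result.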

    New patch for OpenMP issue:

    $ git diff .
    diff --git a/Carpet/CarpetMask/src/mask_excluded.cc b/Carpet/CarpetMask/src/mask
    index aec5ffa..8787ef1 100644
    --- a/Carpet/CarpetMask/src/mask_excluded.cc
    +++ b/Carpet/CarpetMask/src/mask_excluded.cc
    @@ -62,7 +62,8 @@ namespace CarpetMask {
    
             bool const exterior = exclude_exterior[n];
    
    -#pragma omp parallel
    +        CCTK_REAL local_excised = 0.0;
    +#pragma omp parallel reduction (+: local_excised)
             CCTK_LOOP3_ALL(CarpetExcludedSetup, cctkGH, i,j,k) {
               int const ind = CCTK_GFINDEX3D (cctkGH, i, j, k);
    
    @@ -75,11 +76,12 @@ namespace CarpetMask {
                   dx2 + dy2 + dz2 <= r2)
               {
                 // Tally up the weight we are removing
    -            * excised_cells += cell_volume * factor * BCNT(iweight[ind]);
    +            local_excised += cell_volume * factor * BCNT(iweight[ind]);
                 iweight[ind] = 0;
               }
    
             } CCTK_ENDLOOP3_ALL(CarpetExcludedSetup);
    +        * excised_cells += local_excised;
    
           } // if r>=0
         }   // for n
    diff --git a/Carpet/CarpetMask/src/mask_surface.cc b/Carpet/CarpetMask/src/mask_
    index f2a592b..044d1de 100644
    --- a/Carpet/CarpetMask/src/mask_surface.cc
    +++ b/Carpet/CarpetMask/src/mask_surface.cc
    @@ -129,7 +129,8 @@ namespace CarpetMask {
                 }
               }
    
    -#pragma omp parallel
    +        CCTK_REAL local_excised = 0.0;
    +#pragma omp parallel reduction (+: local_excised)
               CCTK_LOOP3_ALL(CarpetSurfaceSetup, cctkGH, i,j,k) {
                 int const ind = CCTK_GFINDEX3D (cctkGH, i, j, k);
    
    @@ -142,7 +143,7 @@ namespace CarpetMask {
                 if (rho < 1.0e-12) {
                   // Always excise the surface origin
                   // Tally up the weight we are removing
    -              * excised_cells += cell_volume * factor * BCNT(iweight[ind]);
    +              local_excised += cell_volume * factor * BCNT(iweight[ind]);
                   iweight[ind] = 0;
                 } else {
                   CCTK_REAL theta =
    @@ -188,11 +189,12 @@ namespace CarpetMask {
                     sf_radius[a + maxntheta * (b + maxnphi * sn)];
                   if (rho <= dr * shrink_factor) {
                     // Tally up the weight we are removing
    -                * excised_cells += cell_volume * factor * BCNT(iweight[ind]);
    +                local_excised += cell_volume * factor * BCNT(iweight[ind]);
                     iweight[ind] = 0;
                   }
                 }
               } CCTK_ENDLOOP3_ALL(CarpetSurfaceSetup);
    +          * excised_cells += local_excised;
    
             } else {
    
  13. Roland Haas
    • removed comment

    I tracked down the commit that causes segfaults: hash 21272f1 "Carpet: Correct #ifdefs in CarpetSimpleMPIDatatypeLength" of Carpet. Going to bed now. I really have no idea how that one causes memory corruption though. Disabling QuasiLocalMeasures in the parfile removes the SEGFAULT. Disabling output for QuasiLocalMeasures' grid arrays and grid scalars does as well. Turning the CarpetIOScalar output for QLM on again restores the SEGFAULT, so it seems related to reduction of complex numbers??? The Info output for qlm_spin is not sufficient to restore the SEGFAULT.

  14. Erik Schnetter reporter
    • removed comment

    The commit mentioned above corrected the routine CarpetSimpleMPIDatatypeLength, but not the sibling routine CarpetSimpleMPIDatatype. Maybe this inconsistency leads to problems.
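
    Why the two routines must agree can be illustrated with a small model. The sketch below uses hypothetical names (`VarType`, `simple_type`, `simple_length`, `bytes_per_value`) and is not the Carpet implementation. It assumes, as the patch below suggests, that a complex value is communicated as two real values of the matching precision.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical variable types; these are not the CCTK_VARIABLE_* constants.
enum class VarType { Real4, Real8, Complex8, Complex16 };

// Element type actually communicated: complex values travel as their real
// components.
VarType simple_type(VarType t) {
  switch (t) {
    case VarType::Complex8:  return VarType::Real4;
    case VarType::Complex16: return VarType::Real8;
    default:                 return t;
  }
}

// Number of simple elements per value: 2 for complex types, 1 otherwise.
int simple_length(VarType t) {
  switch (t) {
    case VarType::Complex8:
    case VarType::Complex16: return 2;
    default:                 return 1;
  }
}

// Size in bytes of one real element.
std::size_t real_size(VarType t) {
  assert(t == VarType::Real4 || t == VarType::Real8);
  return t == VarType::Real4 ? 4 : 8;
}

// Bytes moved per value. This is only right if both helpers agree on which
// types count as complex.
std::size_t bytes_per_value(VarType t) {
  return static_cast<std::size_t>(simple_length(t)) * real_size(simple_type(t));
}
```

    If one helper treated a type as complex and the other did not, the element count and the element size would no longer multiply to the size of the value, and a reduction buffer sized from them would be too small or too large.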

    If so, this patch may help:

    $ git diff .
    diff --git a/Carpet/Carpet/src/helpers.cc b/Carpet/Carpet/src/helpers.cc
    index 3b5ebb2..7fb36f1 100644
    --- a/Carpet/Carpet/src/helpers.cc
    +++ b/Carpet/Carpet/src/helpers.cc
    @@ -347,33 +347,23 @@ namespace Carpet {
       MPI_Datatype CarpetSimpleMPIDatatype (const int vartype)
       {
         switch (vartype) {
    -#ifdef CARPET_COMPLEX
         case CCTK_VARIABLE_COMPLEX:
           return CarpetMPIDatatype (CCTK_VARIABLE_REAL);
    -#endif
    -#ifdef CARPET_COMPLEX8
    -#  ifdef HAVE_CCTK_COMPLEX8
    +#ifdef HAVE_CCTK_COMPLEX8
         case CCTK_VARIABLE_COMPLEX8:
           return CarpetMPIDatatype (CCTK_VARIABLE_REAL4);
    -#  endif
     #endif
    -#ifdef CARPET_COMPLEX16
    -#  ifdef HAVE_CCTK_COMPLEX16
    +#ifdef HAVE_CCTK_COMPLEX16
         case CCTK_VARIABLE_COMPLEX16:
           return CarpetMPIDatatype (CCTK_VARIABLE_REAL8);
    -#  endif
     #endif
    -#ifdef CARPET_COMPLEX32
    -#  ifdef HAVE_CCTK_COMPLEX32
    +#ifdef HAVE_CCTK_COMPLEX32
         case CCTK_VARIABLE_COMPLEX32:
           return CarpetMPIDatatype (CCTK_VARIABLE_REAL16);
    -#  endif
     #endif
    -    default:
    -      return CarpetMPIDatatype (vartype);
         }
    -    // notreached
    -    return MPI_CHAR;
    +    // default
    +    return CarpetMPIDatatype (vartype);
       }
    
       int CarpetSimpleMPIDatatypeLength (const int vartype)
    @@ -390,11 +380,9 @@ namespace Carpet {
         case CCTK_VARIABLE_COMPLEX32:
     #endif
           return 2;
    -    default:
    -      return 1;
         }
    -    // notreached
    -    return 0;
    +    // default
    +    return 1;
       }
    
  15. Erik Schnetter reporter
    • removed comment

    Roland, can you attach the parameter file that triggers this problem? The parameter file in the other bug report is only a patch, and I don't know against what to apply this patch. The patch has no path name, and its description does not mention a path name or thorn name either.

  16. Roland Haas
    • removed comment

    Sure. It's the qc0-mclachlan.par file from CactusExamples/trunk (ie. our demo parfile), which I have just attached. The one proposed in #732 is now in trunk. Sorry for the confusion. I had applied all code patches against current master in Carpet.

  17. Roland Haas
    • removed comment

    Applying only comment:22 does not help. There still is a segfault. The backtrace is:

    (gdb) bt
    #0  0x0000000002c4cc19 in opal_memory_ptmalloc2_int_free ()
    #1  0x0000000002c4d25b in opal_memory_ptmalloc2_free_hook ()
    #2  0x00002b68c06ac261 in free () from /lib64/libc.so.6
    #3  0x0000000002b441ac in ompi_coll_tuned_reduce_intra_basic_linear ()
    #4  0x0000000002b35db5 in ompi_coll_tuned_reduce_intra_dec_fixed ()
    #5  0x0000000002aed529 in mca_coll_sync_reduce ()
    #6  0x0000000002ae7473 in PMPI_Reduce ()
    #7  0x00000000017be25c in CarpetReduce::Finalise (cgh=0x1ee04bc0, proc=-2, num_outvals=80, outvals=0x1ee04bd0, outtype=389, myoutvals=0x0,
        mycounts=0x6f, red=0x17bfa73, $4=<value optimized out>, $5=<value optimized out>, $6=<value optimized out>, $7=<value optimized out>,
        $8=<value optimized out>, $9=<value optimized out>, $0=<value optimized out>, $1=<value optimized out>)
        at /panfs/ds06/sxs/rhaas/zwicky/cactus/ET_trunk/arrangements/Carpet/CarpetReduce/src/reduce.cc:962
    #8  0x00000000017bfa73 in CarpetReduce::ReduceGVs (cgh=0x1ee04bc0, proc=-2, num_outvals=80, outtype=518015952, outvals=0x185, num_invars=0,
        invars=0x1e536490, red=0x17c1aad, igrid=517840208, $6=<value optimized out>, $7=<value optimized out>, $8=<value optimized out>,
        $9=<value optimized out>, $0=<value optimized out>, $1=<value optimized out>, $2=<value optimized out>, $3=<value optimized out>,
        $4=<value optimized out>) at /panfs/ds06/sxs/rhaas/zwicky/cactus/ET_trunk/arrangements/Carpet/CarpetReduce/src/reduce.cc:1577
    #9  0x00000000017c1aad in CarpetReduce::average_GVs (cgh=0x1ee04bc0, proc=-2, num_outvals=80, outtype=518015952, outvals=0x185, num_invars=0,
        invars=0x1edd9d50, $8=<value optimized out>, $9=<value optimized out>, $=<value optimized out>, $=<value optimized out>,
        $=<value optimized out>, $=<value optimized out>, $=<value optimized out>)
        at /panfs/ds06/sxs/rhaas/zwicky/cactus/ET_trunk/arrangements/Carpet/CarpetReduce/src/reduce.cc:1621
    #10 0x00000000007d97b8 in CCTK_Reduce (GH=0x1ee04bc0, proc=-2, operation_handle=80, num_out_vals=518015952, type_out_vals=389, out_vals=0x0,
        num_in_fields=1) at /panfs/ds06/sxs/rhaas/zwicky/cactus/ET_trunk/src/comm/Reduction.c:429
    #11 0x000000000077ec20 in CarpetIOScalar::OutputVarAs (cctkGH=0x1ee04bc0, varname=0xfffffffe <Address 0xfffffffe out of bounds>,
        alias=0x50 <Address 0x50 out of bounds>, out_reductions=0x1ee04bd0 "", $P1=<value optimized out>, $P2=<value optimized out>,
        $P3=<value optimized out>, $P4=<value optimized out>)
        at /panfs/ds06/sxs/rhaas/zwicky/cactus/ET_trunk/arrangements/Carpet/CarpetIOScalar/src/ioscalar.cc:465
    #12 0x000000000077dc7a in CarpetIOScalar::TriggerOutput (cctkGH=0x1ee04bc0, vindex=-2, $O9=<value optimized out>, $P0=<value optimized out>)
        at /panfs/ds06/sxs/rhaas/zwicky/cactus/ET_trunk/arrangements/Carpet/CarpetIOScalar/src/ioscalar.cc:717
    #13 0x00000000005ac4a6 in CCTKi_TriggerAction (GH=0x1ee04bc0, variable=-2)
        at /panfs/ds06/sxs/rhaas/zwicky/cactus/ET_trunk/src/IO/IOMethods.c:931
    #14 0x00000000005a8bcb in CCTKi_ScheduleCallExit (attribute=0x1ee04bc0, data=0xfffffffe)
        at /panfs/ds06/sxs/rhaas/zwicky/cactus/ET_trunk/src/main/ScheduleInterface.c:2780
    #15 0x00000000005ab838 in CCTKi_DoScheduleTraverse (group_name=0x1ee04bc0 "", item_entry=0xfffffffe, item_exit=0x50, while_check=0x1ee04bd0,
        if_check=0x185, function_process=0, data=0x7fff55a60358) at /panfs/ds06/sxs/rhaas/zwicky/cactus/ET_trunk/src/schedule/ScheduleTraverse.c:158
    #16 0x00000000005a4869 in CCTK_ScheduleTraverse (where=0x1ee04bc0 "", GH=0xfffffffe, CallFunction=0x50)
        at /panfs/ds06/sxs/rhaas/zwicky/cactus/ET_trunk/src/main/ScheduleInterface.c:891
    #17 0x00000000017dad9d in Carpet::CallAnalysis (cctkGH=0x1ee04bc0, did_recover=254, $:5=<value optimized out>, $:6=<value optimized out>)
        at /panfs/ds06/sxs/rhaas/zwicky/cactus/ET_trunk/arrangements/Carpet/Carpet/src/Initialise.cc:615
    #18 0x00000000017de0d7 in Carpet::Initialise (fc=0x1ee04bc0, $;8=<value optimized out>)
        at /panfs/ds06/sxs/rhaas/zwicky/cactus/ET_trunk/arrangements/Carpet/Carpet/src/Initialise.cc:132
    #19 0x0000000000599b09 in main (argc=6, argv=0x7fff55a60978) at /panfs/ds06/sxs/rhaas/zwicky/cactus/ET_trunk/src/main/flesh.cc:80
    

    Erik: the data and all is on zwicky where I think you can at least read all my files in case you want a more in-depth look at it. The whole set of files is in /panfs/ds06/sxs/rhaas/simulations/qc0-mclachlantrunk .

  18. Roland Haas
    • removed comment

    comment:19 seems to cure the initial issue (warnings about excised regions) by the way. We should likely close this ticket then and move the discussion about the reduction to a new ticket.

  19. Roland Haas
    • removed comment

    Current master of Carpet (incl. 1bdb022a5dec73500986edb41ea1c8e67c6c5fc5 and 17845edc3640298238c742a51509be2423cd6373) shows neither of the issues anymore. These two commits I reviewed and they look ok. The others seem fine when just looking through them. All ok to be ported to the release branch I think.

    Ok to port into the release branch and close the ticket? I will run qc0-mclachlan.par to completion on zwicky to check that everything still works.
