CUDA component compatibility issue with CUDA/11.5

Issue #97 resolved
Giuseppe Congiu created an issue

Example of reported error:

components/cuda/linux-cuda.c:555:48: error: unknown type name 'NVPA_RawMetricsConfigOptions'; did you mean 'NVPA_RawMetricsConfig'?
  555 | NVPA_Status (*NVPA_RawMetricsConfig_CreatePtr)(NVPA_RawMetricsConfigOptions*, NVPA_RawMetricsConfig**);
      |                                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                                NVPA_RawMetricsConfig

Comments (8)

  1. Michael Knobloch

    This issue popped up on the JUWELS Booster system, using GCC 11.2 and CUDA 11.5. Here’s the full list of errors:

    components/cuda/linux-cuda.c:555:48: error: unknown type name 'NVPA_RawMetricsConfigOptions'; did you mean 'NVPA_RawMetricsConfig'?
    555 | NVPA_Status (*NVPA_RawMetricsConfig_CreatePtr)(NVPA_RawMetricsConfigOptions*, NVPA_RawMetricsConfig**);
    |                                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |                                                NVPA_RawMetricsConfig
    components/cuda/linux-cuda.c: In function '_cuda_linkCudaLibraries':
    components/cuda/linux-cuda.c:1397:5: error: 'NVPA_RawMetricsConfig_CreatePtr' undeclared (first use in this function); did you mean 'NVPW_CUDA_RawMetricsConfig_CreatePtr'?
    1397 |     NVPA_RawMetricsConfig_CreatePtr = DLSYM_AND_CHECK_nvperf(dl4, "NVPA_RawMetricsConfig_Create");
    |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |     NVPW_CUDA_RawMetricsConfig_CreatePtr
    components/cuda/linux-cuda.c:1397:5: note: each undeclared identifier is reported only once for each function it appears in
    components/cuda/linux-cuda.c: In function 'cuda11_getMetricDetails':
    components/cuda/linux-cuda.c:4150:5: error: unknown type name 'NVPA_RawMetricsConfigOptions'; did you mean 'NVPA_RawMetricsConfig'?
    4150 |     NVPA_RawMetricsConfigOptions nvpa_metricsConfigOptions;
    |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |     NVPA_RawMetricsConfig
    components/cuda/linux-cuda.c:4152:45: error: 'NVPA_RAW_METRICS_CONFIG_OPTIONS_STRUCT_SIZE' undeclared (first use in this function); did you mean 'NVPA_RAW_METRIC_REQUEST_STRUCT_SIZE'?
    4152 |     memset(&nvpa_metricsConfigOptions, 0,   NVPA_RAW_METRICS_CONFIG_OPTIONS_STRUCT_SIZE);
    |                                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |                                             NVPA_RAW_METRIC_REQUEST_STRUCT_SIZE
    components/cuda/linux-cuda.c:4153:30: error: request for member 'structSize' in something not a structure or union
    4153 |     nvpa_metricsConfigOptions.structSize =  NVPA_RAW_METRICS_CONFIG_OPTIONS_STRUCT_SIZE;
    |                              ^
    components/cuda/linux-cuda.c:4154:30: error: request for member 'activityKind' in something not a structure or union
    4154 |     nvpa_metricsConfigOptions.activityKind = NVPA_ACTIVITY_KIND_PROFILER;
    |                              ^
    components/cuda/linux-cuda.c:4155:30: error: request for member 'pChipName' in something not a structure or union
    4155 |     nvpa_metricsConfigOptions.pChipName = pChipName;
    |                              ^
    components/cuda/linux-cuda.c:4158:17: error: 'NVPA_RawMetricsConfig_CreatePtr' undeclared (first use in this function); did you mean 'NVPW_CUDA_RawMetricsConfig_CreatePtr'?
    4158 |     NVPW_CALL((*NVPA_RawMetricsConfig_CreatePtr)(&nvpa_metricsConfigOptions, &pRawMetricsConfig),
    |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    components/cuda/linux-cuda.c:397:32: note: in definition of macro 'NVPW_CALL'
    397 |         NVPA_Status _status = (call);                                                               \
    |                                ^~~~
    components/cuda/linux-cuda.c: In function '_cuda11_build_profiling_structures':
    components/cuda/linux-cuda.c:4704:9: error: unknown type name 'NVPA_RawMetricsConfigOptions'; did you mean 'NVPA_RawMetricsConfig'?
    4704 |         NVPA_RawMetricsConfigOptions metricsConfigOptions;
    |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |         NVPA_RawMetricsConfig
    components/cuda/linux-cuda.c:4705:44: error: 'NVPA_RAW_METRICS_CONFIG_OPTIONS_STRUCT_SIZE' undeclared (first use in this function); did you mean 'NVPA_RAW_METRIC_REQUEST_STRUCT_SIZE'?
    4705 |         memset(&metricsConfigOptions, 0,   NVPA_RAW_METRICS_CONFIG_OPTIONS_STRUCT_SIZE);
    |                                            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |                                            NVPA_RAW_METRIC_REQUEST_STRUCT_SIZE
    components/cuda/linux-cuda.c:4706:29: error: request for member 'structSize' in something not a structure or union
    4706 |         metricsConfigOptions.structSize =  NVPA_RAW_METRICS_CONFIG_OPTIONS_STRUCT_SIZE;
    |                             ^
    components/cuda/linux-cuda.c:4707:29: error: request for member 'activityKind' in something not a structure or union
    4707 |         metricsConfigOptions.activityKind = NVPA_ACTIVITY_KIND_PROFILER;
    |                             ^
    components/cuda/linux-cuda.c:4708:29: error: request for member 'pChipName' in something not a structure or union
    4708 |         metricsConfigOptions.pChipName = mydevice->cuda11_chipName;
    |                             ^
    components/cuda/linux-cuda.c:4713:21: error: 'NVPA_RawMetricsConfig_CreatePtr' undeclared (first use in this function); did you mean 'NVPW_CUDA_RawMetricsConfig_CreatePtr'?
    4713 |         NVPW_CALL((*NVPA_RawMetricsConfig_CreatePtr)
    |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    components/cuda/linux-cuda.c:397:32: note: in definition of macro 'NVPW_CALL'
    397 |         NVPA_Status _status = (call);                                                               \
    |                                ^~~~
    components/cuda/linux-cuda.c: In function '_cuda11_read':
    components/cuda/linux-cuda.c:381:31: warning: implicit conversion from 'NVPA_Status' to 'CUptiResult' [-Wenum-conversion]
    381 |         CUptiResult _status = (call);                                                               \
    |                               ^
    components/cuda/linux-cuda.c:5588:9: note: in expansion of macro 'CUPTI_CALL'
    5588 |         CUPTI_CALL((*NVPW_MetricsContext_SetCounterDataPtr) (&setCounterDataParams),
    |         ^~~~~~~~~~
    components/cuda/linux-cuda.c:381:31: warning: implicit conversion from 'NVPA_Status' to 'CUptiResult' [-Wenum-conversion]
    381 |         CUptiResult _status = (call);                                                               \
    |                               ^
    components/cuda/linux-cuda.c:5605:9: note: in expansion of macro 'CUPTI_CALL'
    5605 |         CUPTI_CALL((*NVPW_MetricsContext_EvaluateToGpuValuesPtr) (&evalToGpuParams),
    |         ^~~~~~~~~~
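
The errors above all trace back to one type, `NVPA_RawMetricsConfigOptions`, that the compiler no longer finds in the toolkit headers. One common way to handle this kind of break is to gate the old code path on the toolkit version. The sketch below is only an illustration of that idea and not the fix applied in PAPI. The helper names and the cutoff constant are hypothetical; the cutoff is an assumption based solely on the versions reported in this thread (CUDA 11.4 and 11.5). In real code the version would normally come from the `CUDA_VERSION` macro in `cuda.h`, which uses the same `major*1000 + minor*10` encoding.

```c
#include <assert.h>
#include <stdbool.h>

/* CUDA encodes its version as major*1000 + minor*10, e.g. 11050 for 11.5. */
static int encode_cuda_version(int major, int minor)
{
    assert(major >= 0 && minor >= 0 && minor < 100);
    return major * 1000 + minor * 10;
}

/*
 * ASSUMPTION (hypothetical cutoff): the first toolkit version whose headers
 * no longer declare NVPA_RawMetricsConfigOptions. 11.4 is used here only
 * because it is the oldest version reported as failing in this thread.
 */
#define FIRST_VERSION_WITHOUT_LEGACY_API 11040

/* True when the legacy NVPA_RawMetricsConfig_Create path may be compiled in. */
static bool has_legacy_raw_metrics_config(int cuda_version)
{
    return cuda_version < FIRST_VERSION_WITHOUT_LEGACY_API;
}
```

With such a predicate, the declarations at lines 555, 4150 and 4704 of `linux-cuda.c` could sit behind an equivalent preprocessor check, so that newer toolkits never see the removed type.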

  2. john.rodgers

    This issue is also reproducible with CUDA 11.4. Sample papi_component_avail output:

    Name:   cuda                    CUDA events and metrics via NVIDIA CuPTI interfaces
       \-> Disabled: A required function 'NVPA_RawMetricsConfig_Create' was not found in '/cm/shared/apps/cuda11.4/toolkit/11.4.2/targets/x86_64-lin
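
The message above shows the component being disabled because a single symbol lookup failed at load time. A more tolerant loader could try several candidate names in order and disable the component only if none of them resolves. The following is a minimal sketch of that pattern, not PAPI's actual code. `resolve_first_symbol`, `symbol_lookup_fn` and the stand-in lookup are hypothetical names. The second candidate name is an assumption inferred from the compiler's suggestion `NVPW_CUDA_RawMetricsConfig_CreatePtr` in the build log; its calling convention may differ from the legacy function, so the caller would still need a separate code path for it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same shape as dlsym(), so dlsym itself or a test double can be passed in. */
typedef void *(*symbol_lookup_fn)(void *handle, const char *name);

/*
 * Try each candidate name in order and return the first address that
 * resolves. *resolved_name (if non-NULL) receives the matching name,
 * or NULL when no candidate was found.
 */
static void *resolve_first_symbol(symbol_lookup_fn lookup, void *handle,
                                  const char *const *candidates, size_t count,
                                  const char **resolved_name)
{
    assert(lookup != NULL);
    for (size_t i = 0; i < count; i++) {
        void *addr = lookup(handle, candidates[i]);
        if (addr != NULL) {
            if (resolved_name != NULL)
                *resolved_name = candidates[i];
            return addr;
        }
    }
    if (resolved_name != NULL)
        *resolved_name = NULL;
    return NULL;
}

/* Candidate entry points, legacy name first. The second is an ASSUMPTION. */
static const char *const raw_metrics_config_create_names[] = {
    "NVPA_RawMetricsConfig_Create",
    "NVPW_CUDA_RawMetricsConfig_Create",
};

/* Stand-in lookup imitating a toolkit where only the newer name exists. */
static void *fake_new_toolkit_lookup(void *handle, const char *name)
{
    static int dummy_symbol;
    (void)handle;
    if (strcmp(name, "NVPW_CUDA_RawMetricsConfig_Create") == 0)
        return &dummy_symbol;
    return NULL;
}
```

In a real loader, `dlsym` and the handle returned by `dlopen` for the perfworks host library would take the place of the stand-in lookup, and the returned name would tell the caller which parameter layout to use.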
    

  3. john.rodgers

    I wanted to check on the status of PR 259. Do y’all think it’s likely to be merged into develop on its own, or will it be bundled with another pending PR?

  4. Giuseppe Congiu reporter

    We are testing the PR against different versions of the CUDA Toolkit to make sure it works across all of them. Unfortunately, some CUDA Toolkit versions are not available on the hardware we test on, hence the delay.
