NVML component fails to detect all available events

Issue #62 resolved
Konstantin Stefanov created an issue

The PAPI nvml component uses ROM versions (infoROM, eccROM) to detect the type of the GPU and determine which events are supported. On some newer cards, e.g. Tesla Kepler and Tesla Pascal, this gives wrong results. For example, those cards support GPU and memory utilization, but the component does not detect this: a Kepler card may not have a powerROM, so PAPI nvml treats it as an old card.

On the other hand, the current component marks fan speed as available for all cards, which is not the case for non-discrete cards.

I created a pull request (https://bitbucket.org/icl/papi/pull-requests/5/change-method-for-detecting-available-nvml/diff) with a patch that changes the method used to detect the available events. I propose trying to read data for every event and, if the read does not fail, marking that event as available.
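
To illustrate the idea (this is not the code from the pull request), here is a minimal sketch of probe-based detection. All names in it (`probe_status`, `event_desc`, `detect_available_events`, and the stub probes) are hypothetical. The stubs stand in for real NVML queries such as `nvmlDeviceGetUtilizationRates` or `nvmlDeviceGetFanSpeed`, which return `NVML_SUCCESS` on success and an error such as `NVML_ERROR_NOT_SUPPORTED` otherwise.

```c
#include <stddef.h>

/* Hypothetical status codes standing in for NVML_SUCCESS and
 * NVML_ERROR_NOT_SUPPORTED; a real component would use nvmlReturn_t. */
typedef enum { PROBE_OK = 0, PROBE_NOT_SUPPORTED = 1 } probe_status;

/* A probe tries to read one metric. In a real component it would wrap
 * an NVML query for a specific device. */
typedef probe_status (*probe_fn)(unsigned int *value);

/* Hypothetical event table entry: name, probe, and detection result. */
typedef struct {
    const char *name;
    probe_fn    probe;
    int         available;
} event_desc;

/* Stub: pretends the card reports GPU utilization. */
static probe_status stub_gpu_utilization(unsigned int *value)
{
    *value = 37; /* arbitrary placeholder reading */
    return PROBE_OK;
}

/* Stub: pretends the card has no fan sensor. */
static probe_status stub_fan_speed(unsigned int *value)
{
    (void)value;
    return PROBE_NOT_SUPPORTED;
}

/* Try every probe once; mark an event available only if the read
 * succeeded. Returns the number of available events. */
static size_t detect_available_events(event_desc *events, size_t count)
{
    size_t found = 0;
    for (size_t i = 0; i < count; i++) {
        unsigned int scratch = 0;
        events[i].available = (events[i].probe(&scratch) == PROBE_OK);
        if (events[i].available)
            found++;
    }
    return found;
}
```

With a table holding the two stub probes, `detect_available_events` marks the utilization event as available and the fan speed event as unavailable, regardless of what any ROM version says about the card.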

I've tested it on Tesla K40s and Tesla P100-SXM2 cards; the results are consistent with what nvidia-smi reports.
