Could not load library libcudnn_cnn_infer.so.8

Issue #48 on hold
gennuo wang created an issue

Dear iphop team,
Thank you for developing such a great tool. I am testing it and have run into the issue below:

Welcome to iPHoP

Looks like everything is now set up, we will first clean up the input file, and then we will start the host prediction steps themselves
[1/1/Skip] Skipping computation of blastn against microbial genomes...
[1/3/Skip] Skipping blast parsing...
[2/1/Skip] Skipping computation of blastn against CRISPR...
[2/2/Skip] Skipping crispr parsing...
[3/1/Skip] Skipping computation of WIsH scores...
[3/2/Skip] Skipping WIsH parsing...
[4/1/Skip] Skipping computation of VHM s2 similarities...
[4/2/Skip] Skipping VHM parsing...
[5/1/Skip] Skipping computation of PHP scores...
[5/2/Skip] Skipping PHP parsing...
[6/1/Skip] Skipping RaFAH...
[6/2/Skip] Skipping RaFAH parsing...
[6.5/1/Skip] Skipping diamond search against RaFAH refs...
[6.5/2/Skip] Skipping calculation of AAI to RaFAH refs...
[7/Skip] We already found all the expected files, we skip...
[7.5/Skip] We already found all the expected files, we skip...
[8] Running the convolution networks...
[8/1] Loading data as tensors..
[8/1.1] Getting blast-based scores..
2023-08-03 09:28:59.269503: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-08-03 09:29:02.427915: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 7403 MB memory: -> device: 0, name: Quadro P4000, pci bus id: 0000:3b:00.0, compute capability: 6.1
2023-08-03 09:29:02.442935: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 7403 MB memory: -> device: 1, name: Quadro P4000, pci bus id: 0000:d8:00.0, compute capability: 6.1
[8/1.2] Run blast classifier Model_blast_Conv-87 (by batch)..
Predicting confidence score for all batches of input data [ ] 0%
2023-08-03 09:29:06.499470: I tensorflow/stream_executor/cuda/cuda_dnn.cc:366] Loaded cuDNN version 8800
Could not load library libcudnn_cnn_infer.so.8. Error: libcudnn_cnn_infer.so.8: failed to map segment from shared object: Cannot allocate memory
Could you suggest how to fix this? Thank you very much.

Best,

Gennuo

Comments (2)

  1. Simon Roux repo owner

    Hi,

    “Cannot allocate memory” suggests that the node or computer iPHoP was running on may have run out of memory. I would suggest first processing only a small number of sequences (~20) to see whether iPHoP works as expected. If the error only occurs when processing a (much) larger dataset, then iPHoP must either be run on a larger node or be given the input in smaller batches.
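
The "smaller batches" suggestion above can be sketched as a small FASTA splitter. This is only an illustration, not part of iPHoP: the function names `read_fasta_records` and `split_fasta`, the `batch_NNNN.fna` file naming, and the default batch size of 20 (taken from the "~20" test size mentioned above) are all assumptions made for this example.

```python
from pathlib import Path
from typing import Iterator, List


def read_fasta_records(path) -> Iterator[str]:
    """Yield each FASTA record (header line plus sequence lines) as one string."""
    record: List[str] = []
    with open(path) as handle:
        for line in handle:
            # A new header closes the record collected so far.
            if line.startswith(">") and record:
                yield "".join(record)
                record = []
            if line.strip():
                # Normalise line endings so every kept line ends with "\n".
                record.append(line.rstrip("\r\n") + "\n")
    if record:
        yield "".join(record)


def _write_batch(records: List[str], out_dir: Path, index: int) -> Path:
    """Write one batch of records to out_dir/batch_NNNN.fna (hypothetical naming)."""
    out_path = out_dir / f"batch_{index:04d}.fna"
    out_path.write_text("".join(records))
    return out_path


def split_fasta(path, out_dir, batch_size: int = 20) -> List[Path]:
    """Split a FASTA file into files holding at most batch_size records each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written: List[Path] = []
    batch: List[str] = []
    for record in read_fasta_records(path):
        batch.append(record)
        if len(batch) == batch_size:
            written.append(_write_batch(batch, out_dir, len(written) + 1))
            batch = []
    if batch:
        # Final, possibly smaller, batch.
        written.append(_write_batch(batch, out_dir, len(written) + 1))
    return written
```

Each file returned by `split_fasta("input.fna", "batches")` could then be given to its own iPHoP run with a separate output directory, starting with a single batch to check whether the error still appears on a small input.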
