OP_REQUIRES? Not sure what it is

Issue #44 closed
Lin-Xing Chen created an issue

Hi Simon,

Unfortunately, I ran into another issue when running a big fasta file with iPHoP. Please help.

2023-07-15 19:03:49.679429: W tensorflow/core/common_runtime/bfc_allocator.cc:474] ***************************************************************************************************_
2023-07-15 19:03:49.679903: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at sparse_to_dense_op.cc:227 : RESOURCE_EXHAUSTED: OOM when allocating tensor with shape[500,61,21,31] and type double on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
  File "/home/linking/miniconda3/envs/iphop_env/bin/iphop", line 10, in <module>
    sys.exit(cli())
  File "/home/linking/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/iphop.py", line 128, in cli
    args["func"](args)
  File "/home/linking/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/modules/master_predict.py", line 106, in main
    runmodels.run_individual_models(args)
  File "/home/linking/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/modules/runmodels.py", line 111, in run_individual_models
    full_predicted = run_single_classifier(classifier,tensors,args)
  File "/home/linking/miniconda3/envs/iphop_env/lib/python3.8/site-packages/iphop/modules/runmodels.py", line 238, in run_single_classifier
    predict = best_model.predict([tf.sparse.to_dense(tensors[i])])
  File "/home/linking/miniconda3/envs/iphop_env/lib/python3.8/site-packages/tensorflow/python/ops/sparse_ops.py", line 1715, in sparse_tensor_to_dense
    return gen_sparse_ops.sparse_to_dense(
  File "/home/linking/miniconda3/envs/iphop_env/lib/python3.8/site-packages/tensorflow/python/ops/gen_sparse_ops.py", line 3162, in sparse_to_dense
    _ops.raise_from_not_ok_status(e, name)
  File "/home/linking/miniconda3/envs/iphop_env/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 7107, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[500,61,21,31] and type double on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:SparseToDense]

Any idea what is going on? Thank you.

Best,

LINXING

Comments (5)

  1. Simon Roux repo owner

    Hi Lin-Xing,

    This one is easier to debug: “OOM” stands for out of memory, i.e. there were not enough resources for iPHoP to process this big fasta file on this node / job / computer. My recommendation is typically to give iPHoP more CPU/memory, or to split your input into smaller batches (all sequences are considered independently, so processing them as one big fasta file or as many small fasta files should give you exactly the same results).
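
    For example, here is a minimal sketch of how you could split a big fasta file into smaller batches, assuming Biopython is installed (the batch size of 1,000 sequences and the file names are just placeholders):

    from Bio import SeqIO

    def split_fasta(input_fasta, batch_size=1000, prefix="batch"):
        """Write records from input_fasta into numbered fasta files of at most batch_size sequences."""
        batch, batch_num = [], 0
        for record in SeqIO.parse(input_fasta, "fasta"):
            batch.append(record)
            if len(batch) == batch_size:
                SeqIO.write(batch, f"{prefix}_{batch_num}.fasta", "fasta")
                batch, batch_num = [], batch_num + 1
        if batch:  # write the last, possibly smaller, batch
            SeqIO.write(batch, f"{prefix}_{batch_num}.fasta", "fasta")

    split_fasta("big_input.fasta", batch_size=1000)

    You can then run iPHoP on each batch_*.fasta file separately and concatenate the output tables afterwards.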

    Best,

    Simon

  2. Nicolas Tromas

    Hi Simon,

    I got a similar issue even with a “small” file (2,000 contigs, 30 MB). With the older iPHoP version, I never got that issue, even with bigger files (~300 MB).

    Nico

  3. Simon Roux repo owner

    Hi Nico,

    I’m not sure why this step would behave differently with older versions of iPHoP (it has not changed since version 0.9), although perhaps some dependencies were updated, which would explain the difference. The size of the input file is also not really the issue; it is more about the number of input sequences and the number of hits for each input sequence (which may explain what you observe). In any case, running smaller batches is the recommended “fix” for this.
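
    For reference, a rough back-of-the-envelope on the dense tensor reported in the traceback above (shape [500,61,21,31], double precision) shows how quickly these allocations add up on a GPU:

    # Rough memory footprint of one dense tensor from the traceback above
    # (shape [500, 61, 21, 31], dtype double = 8 bytes per element).
    elements = 500 * 61 * 21 * 31       # ~19.9 million elements
    size_mb = elements * 8 / 1024**2    # ~151 MB for a single dense tensor
    print(f"{size_mb:.0f} MB per dense tensor")

    Several such tensors held at once can easily exceed the memory available to the job, which is why smaller batches help.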

    Best,

    Simon
