nvidia cuda 10.1/2

Issue #33 closed
patrick Shirkey created an issue

Hi Matt,

Using the latest git pull.

There seems to be a problem with nvidia cuda 10.1/2 on linux. It looks like emberrender is unable to find the opencl device: the OpenCL Info section below comes back empty.

optirun -v emberrender --openclinfo
[ 8751.002942] [INFO]Response: Yes. X is active.

[ 8751.002965] [INFO]Running application using primus.

OpenCL Info:

  • I have tested that the GPU itself is detected and working, using the deviceQuery tool from the CUDA samples:

https://github.com/NVIDIA/cuda-samples

cuda-samples/Samples/deviceQuery$ ./deviceQuery
./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 850M"
CUDA Driver Version / Runtime Version 10.2 / 10.1
CUDA Capability Major/Minor version number: 5.0
Total amount of global memory: 4046 MBytes (4242604032 bytes)
( 5) Multiprocessors, (128) CUDA Cores/MP: 640 CUDA Cores
GPU Max Clock rate: 902 MHz (0.90 GHz)
Memory Clock rate: 1001 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Compute Preemption: No
Supports Cooperative Kernel Launch: No
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.1, NumDevs = 1
Result = PASS
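
For anyone hitting the same symptom: deviceQuery exercises the CUDA runtime, so a PASS there does not by itself show that an OpenCL implementation is registered. On Linux the OpenCL ICD loader normally discovers vendor implementations through `.icd` files, conventionally under /etc/OpenCL/vendors. The sketch below lists what is registered there. The function name `list_opencl_icds` is made up for illustration, and the default directory is an assumption that may not hold on every distribution or loader.

```shell
#!/bin/sh
# Sketch: list the vendor ICD files an OpenCL loader would typically read.
# Assumption: /etc/OpenCL/vendors is the registry directory on this system.
# list_opencl_icds is a hypothetical helper, not part of emberrender or any package.
list_opencl_icds() {
    dir="${1:-/etc/OpenCL/vendors}"
    found=0
    for f in "$dir"/*.icd; do
        # When nothing matches, the glob stays unexpanded; skip it.
        [ -f "$f" ] || continue
        found=1
        # Each .icd file holds the name or path of the vendor's OpenCL library.
        printf '%s -> %s\n' "$(basename "$f")" "$(head -n 1 "$f")"
    done
    if [ "$found" -eq 0 ]; then
        echo "no .icd files in $dir" >&2
        return 1
    fi
}
```

Calling `list_opencl_icds` with no argument checks the default directory. If it reports no files, or no entry for the NVIDIA library, the loader has nothing to hand to emberrender, which would match the empty "OpenCL Info:" output above.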

Comments (7)

  1. Matt Feemster repo owner

    Hi Patrick, blast from the past. I didn’t know you still had an interest in Fractorium. Have you been using it this whole time?

    Your problem seems very strange. What is cuda 10.1/2? Is it new or old?

    I’ve tested on linux with my laptop which has a GTX 1050 in it and it works fine.

    If this is a new way of doing cuda, then perhaps nvidia changed around how some things are done?

    Since you are able to build and repro this problem on linux, are you able to debug emberrender to see where the problem might be?

    Thanks.

  2. patrick Shirkey reporter

    Turns out I had to explicitly install these packages:

    apt install nvidia-opencl-icd nvidia-opencl-dev

    It seems they were not pulled in by any of the other nvidia packages.

    optirun -v emberrender --openclinfo
    [81840.705000] [INFO]Response: Yes. X is active.

    [81840.705034] [INFO]Running application using primus.

    OpenCL Info:
    Platform 0: NVIDIA Corporation NVIDIA CUDA OpenCL 1.2 CUDA 10.2.159
    Device 0: NVIDIA Corporation GeForce GTX 850M
    CL_DEVICE_OPENCL_C_VERSION: OpenCL C 1.2
    CL_DEVICE_LOCAL_MEM_SIZE: 49,152
    CL_DEVICE_LOCAL_MEM_TYPE: 1
    CL_DEVICE_MAX_COMPUTE_UNITS: 5
    CL_DEVICE_MAX_READ_IMAGE_ARGS: 256
    CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 16
    CL_DEVICE_MAX_MEM_ALLOC_SIZE: 1,060,651,008
    CL_DEVICE_ADDRESS_BITS: 64
    CL_DEVICE_GLOBAL_MEM_CACHE_TYPE: 2
    CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE: 128
    CL_DEVICE_GLOBAL_MEM_CACHE_SIZE: 122,880
    CL_DEVICE_GLOBAL_MEM_SIZE: 4,242,604,032
    CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 65,536
    CL_DEVICE_MAX_CONSTANT_ARGS: 9
    CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
    CL_DEVICE_MAX_WORK_GROUP_SIZE: 1,024
    CL_DEVICE_MAX_WORK_ITEM_SIZES: 1,024, 1,024, 64

  3. Matt Feemster repo owner

    OK, glad you figured that out. Tbh, I’m not even really sure what’s required on linux, all I know is that my setup was configured properly somewhere along the way, enough for me to make the linux installers.

    On the linux build page of this project:

    https://bitbucket.org/mfeemster/fractorium/src/5146fc5dd2405ee3f78d29a57e45f26f28eda974/Data/BuildGuideLinux.md

    I say that nvidia developers should have these on their system:

    ocl-icd-libopencl1 ocl-icd-opencl-dev opencl-headers nvidia-modprobe nvidia-prime nvidia-384 nvidia-384-dev

    I really don’t even know if all or any of those are still required. Which of them do you think is needed?

    Thanks.

  4. patrick Shirkey reporter

    Yeah, most of those are correct for my system. The problem is that updating nvidia drivers is a major headache. I find I often have to purge nvidia and reinstall after an update.

    e.g.

    apt-get purge nvidia.

    apt-get install nvidia-driver

    apt-get install nvidia-smi
    modprobe nvidia
    apt install nvidia-cuda-toolkit

    • The main takeaway is that I had to install these packages directly. I had already installed them a couple of times previously, but purging removes them, and they are not pulled in by the other nvidia packages or the cuda toolkit.

    apt install nvidia-opencl-icd nvidia-opencl-dev

    If they are not installed explicitly then “optirun clinfo” does not work.
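
Since a purge silently drops these packages, a quick check after each driver reinstall can save some head-scratching. The sketch below reports which of the named packages are not currently installed, assuming a Debian-style system where `dpkg-query` is available. `missing_packages`, `dpkg_status` and the `PKG_STATUS_CMD` override are hypothetical names introduced here for illustration, and the package names are simply the ones from this thread, so they may differ on other distributions.

```shell
#!/bin/sh
# Sketch: report which packages from a list are not installed.
# Assumption: a Debian-style system with dpkg-query.
# PKG_STATUS_CMD is a hypothetical hook so the logic can run without dpkg.
: "${PKG_STATUS_CMD:=dpkg_status}"

# Prints "install ok installed" for an installed package, nothing otherwise.
dpkg_status() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null
}

# Prints the space-separated names of the packages that are not installed.
missing_packages() {
    missing=""
    for pkg in "$@"; do
        case "$("$PKG_STATUS_CMD" "$pkg" || true)" in
            "install ok installed") ;;
            *) missing="$missing $pkg" ;;
        esac
    done
    echo "${missing# }"
}
```

Running `missing_packages nvidia-opencl-icd nvidia-opencl-dev` prints whichever of the two is absent, and prints an empty line when both are present. Anything it lists can then be installed explicitly with apt, as above.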
