Add CUDA compute capability 8.6
Please add CUDA compute capability 8.6 (sm_86) to the list of valid architectures (RTX 30* GPUs).
Comments (8)
-
I guess what is meant is that the Makefile doesn’t allow sm_86, it has only sm_80 for Ampere. PR #5 allows any sm for CMake, but it hasn’t been merged in yet.
-
Account Deleted reporter Yes, I am talking about the ability to pass the sm_86 key. I think it would be nice to have it in the Makefile as well, for consistency.
-
In SLATE, a not-yet-released change in the Makefile does it this way, which avoids having a list of known sm architectures:
    # Generate flags for which CUDA architectures to build.
    # cuda_arch_ is a local copy to modify.
    cuda_arch_ = $(cuda_arch)
    ifneq ($(findstring kepler, $(cuda_arch_)),)
        cuda_arch_ += sm_30
    endif
    # [... and so on for maxwell, pascal, volta, turing, ampere ...]

    # Warn about unrecognized architectures.
    cuda_arch_unknown = $(filter-out sm_% kepler maxwell pascal volta turing ampere, $(cuda_arch))
    ifneq ($(cuda_arch_unknown),)
        $(error ERROR: unknown `$(cuda_arch_unknown)` in cuda_arch)
    endif

    # Extract architectures XX from sm_XX in cuda_arch and sort numerically.
    sms := $(patsubst sm_%,%,$(filter sm_%, $(cuda_arch_)))
    sms_sort := $(shell printf "%s\n" $(sms) | sort -n)

    # code=sm_XX is binary, code=compute_XX is PTX
    gencode_sm = -gencode arch=compute_$(sm),code=sm_$(sm)
    gencode_compute = -gencode arch=compute_$(sm),code=compute_$(sm)

    # Get gencode options for all sm_XX in cuda_arch_.
    nv_sm := $(foreach sm,$(sms_sort),$(gencode_sm))
    nv_compute := $(foreach sm,$(sms_sort),$(gencode_compute))
    ifeq ($(nv_sm),)
        $(error ERROR: unknown `cuda_arch=$(cuda_arch)`. Set cuda_arch to one or more of kepler, maxwell, pascal, volta, turing, ampere, or valid sm_XX from nvcc -h)
    else
        # Get last option (last 2 words) of nv_compute.
        nwords := $(words $(nv_compute))
        nwords_1 := $(shell expr $(nwords) - 1)
        nv_compute_last := $(wordlist $(nwords_1), $(nwords), $(nv_compute))
    endif

    # Use all sm_XX (binary), and the last compute_XX (PTX) for forward compatibility.
    NVCCFLAGS += $(nv_sm) $(nv_compute_last)
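To make the effect of the snippet above concrete, here is a rough Python sketch of the same flag generation. It is only an illustration, not SLATE or MAGMA code: the function name gencode_flags and the NAMED_ARCHS table are hypothetical, the table lists only the two name mappings mentioned in this thread (kepler to sm_30, Ampere to sm_80), and, unlike the Makefile, the sketch removes duplicate architectures and accepts names case-insensitively.

```python
# Illustrative sketch only: mirrors the Makefile logic in Python.
# NAMED_ARCHS is a hypothetical, partial table; the other names
# (maxwell, pascal, volta, turing) are elided in the original snippet.
NAMED_ARCHS = {"kepler": [30], "ampere": [80]}


def gencode_flags(cuda_arch):
    """Return nvcc -gencode arguments for a space-separated arch list."""
    sms = set()
    unknown = []
    for token in cuda_arch.split():
        if token.startswith("sm_") and token[3:].isdigit():
            sms.add(int(token[3:]))                 # explicit sm_XX
        elif token.lower() in NAMED_ARCHS:
            sms.update(NAMED_ARCHS[token.lower()])  # named architecture
        else:
            unknown.append(token)

    if unknown:
        raise ValueError("unknown architectures in cuda_arch: " + " ".join(unknown))
    if not sms:
        raise ValueError("cuda_arch is empty")

    ordered = sorted(sms)  # numeric sort, like `sort -n`
    flags = []
    # One binary (SASS) target per architecture: code=sm_XX.
    for sm in ordered:
        flags += ["-gencode", f"arch=compute_{sm},code=sm_{sm}"]
    # PTX only for the newest architecture, for forward compatibility.
    last = ordered[-1]
    flags += ["-gencode", f"arch=compute_{last},code=compute_{last}"]
    return flags


if __name__ == "__main__":
    print(" ".join(gencode_flags("ampere sm_86")))
    # -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86
```

Because any sm_XX token passes through unchanged, a new architecture such as sm_86 needs no change to the build logic, as long as the installed nvcc accepts it.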
-
I’ve implemented Mark’s suggestion, and tested on my own CUDA machine (with an up-to-date NVCC). It should be available as https://bitbucket.org/icl/magma/commits/6e3a460f9badffb1c391b2d34bb1477ad8eb7367 (which is part of the ‘master’ branch)
You can do it exactly as you say. In your make.inc:
    GPU_TARGET = Pascal Volta Turing Ampere sm_86
-
Account Deleted reporter Thanks for the quick help!
Should I close this one?
-
Yes, go ahead and mark as resolved/close the issue
-
Account Deleted reporter - changed status to resolved
I believe this is already supported; you can check https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#virtual-architecture-feature-list for the list of architectures.
We implement this as Ampere, so in your make.inc, add Ampere to the GPU_TARGET variable. You can also specify the architecture version via sm_XY. For example, you could add sm_80 to GPU_TARGET to achieve the same effect.