CUDA - The behavior of the __CUDA_ARCH__ macro


In host code, it seems that the __CUDA_ARCH__ macro won't generate different code paths; instead, code is generated for exactly the code path of the current device.

However, if __CUDA_ARCH__ is used within device code, it will generate a different code path for each of the devices specified in the compilation options (/arch).

Can anyone confirm that this is correct?

__CUDA_ARCH__, when used in device code, carries a number that reflects the architecture the code is currently being compiled for.

It is not intended to be used in host code. From the nvcc manual:

This macro can be used in the implementation of GPU functions for determining the virtual architecture for which it is currently being compiled. The host code (the non-GPU code) must not depend on it.

Usage of __CUDA_ARCH__ in host code is therefore undefined (at least by CUDA).
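To illustrate the distinction, here is a minimal sketch rather than a definitive recipe. It assumes nvcc's usual behaviour of compiling the source once for the host, where the macro is simply not defined, and once per requested virtual architecture, where it holds a number such as 700 for compute_70. The function names `compiled_arch` and `code_path`, the 700 threshold, and the fallback defines that let the file also build with a plain host compiler are all hypothetical and chosen only for this example.

```cpp
// Fallback so this sketch also builds with a plain C++ compiler (assumption:
// when nvcc is not used, the CUDA qualifiers can be defined away).
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Returns the virtual architecture this translation pass targets,
// or 0 in the host pass, where __CUDA_ARCH__ is not defined.
__host__ __device__ inline int compiled_arch()
{
#ifdef __CUDA_ARCH__
    return __CUDA_ARCH__;   // device pass, e.g. 700 when built for compute_70
#else
    return 0;               // host pass
#endif
}

// Selects a code path at compile time: 2 and 1 are device paths,
// 0 is the host path. The 700 threshold is an arbitrary example value.
__host__ __device__ inline int code_path()
{
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 700)
    return 2;               // device code built for compute capability 7.0 or newer
#elif defined(__CUDA_ARCH__)
    return 1;               // device code built for an older architecture
#else
    return 0;               // host code: nothing here depends on the macro's value
#endif
}
```

When this is compiled for several architectures, each device pass takes its own branch, while a call from host code always takes the host branch. If the host needs to know what the current device actually supports, it should ask at run time (for example with cudaGetDeviceProperties) instead of consulting the macro.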
