cuda - The behavior of the __CUDA_ARCH__ macro


In host code, it seems that the __CUDA_ARCH__ macro won't generate different code paths; instead, it generates code for exactly the code path of the current device.

However, if __CUDA_ARCH__ is used within device code, it generates a different code path for each of the devices specified in the compilation options (-arch).

Can anyone confirm that this is correct?

__CUDA_ARCH__, when used in device code, carries a number that reflects the architecture the code is currently being compiled for.
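A minimal sketch of this device-side behavior follows. The kernel name report_arch, the two-element output buffer, and the 700 threshold are all made up for illustration. The macro is guarded with #ifdef so the host compilation pass never sees it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: records which virtual architecture this copy of the
// device code was compiled for, and which preprocessor branch was taken.
__global__ void report_arch(int *out)
{
#ifdef __CUDA_ARCH__
    // Only the device compilation passes reach this point.
    out[0] = __CUDA_ARCH__;      // e.g. 700 when compiled for compute_70
#if __CUDA_ARCH__ >= 700
    out[1] = 1;                  // path for compute capability 7.0 and newer
#else
    out[1] = 0;                  // path for older virtual architectures
#endif
#endif
}

int main()
{
    int h_out[2] = {0, 0};
    int *d_out = nullptr;

    if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) return 1;
    cudaMemcpy(d_out, h_out, sizeof(h_out), cudaMemcpyHostToDevice);

    report_arch<<<1, 1>>>(d_out);

    // The copy back also surfaces any error from the kernel launch.
    if (cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost) != cudaSuccess) {
        cudaFree(d_out);
        return 1;
    }
    cudaFree(d_out);

    std::printf("__CUDA_ARCH__ = %d, branch = %d\n", h_out[0], h_out[1]);
    return 0;
}
```

If this is built for several targets, for example with nvcc -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70, the kernel is compiled once per target and each copy sees its own value of the macro. The printed value then depends on which copy the driver selects for the device in use.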

It is not intended to be used in host code. From the nvcc manual:

This macro can be used in the implementation of GPU functions for determining the virtual architecture for which it is currently being compiled. The host code (the non-GPU code) must not depend on it.

The usage of __CUDA_ARCH__ in host code is therefore undefined (at least by CUDA).
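If host code needs to know the architecture of the device it is running on, one option (an assumption on my part, not something the nvcc manual passage above prescribes) is to ask the runtime instead of relying on the macro. The sketch below uses cudaGetDeviceProperties; the variable name runtime_arch is hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int device = 0;
    cudaDeviceProp prop;

    // Query the currently selected device at run time.
    if (cudaGetDevice(&device) != cudaSuccess ||
        cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        std::fprintf(stderr, "no usable CUDA device\n");
        return 1;
    }

    // Same major*100 + minor*10 encoding that __CUDA_ARCH__ uses in device
    // code, e.g. compute capability 7.5 becomes 750.
    int runtime_arch = prop.major * 100 + prop.minor * 10;

    std::printf("device %d (%s): compute capability %d.%d, encoded as %d\n",
                device, prop.name, prop.major, prop.minor, runtime_arch);
    return 0;
}
```

This value describes the physical device, whereas __CUDA_ARCH__ describes the virtual architecture a given copy of the device code was compiled for, so the two need not match.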

