CUDA - atomicAdd() for double on GPU
I am doing a project on the GPU, and I have to use atomicAdd() for double. Because CUDA does not support atomicAdd() for double, I use the code below, which NVIDIA provides.
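__device__ double atomicAdd(double* address, double val)
{
    // Reinterpret the double as a 64-bit integer so atomicCAS can operate on it.
    unsigned long long int* address_as_ull = (unsigned long long int*)address;
    unsigned long long int old = *address_as_ull, assumed;
    do {
        assumed = old;
        // Store (assumed + val) only if *address still holds assumed.
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val + __longlong_as_double(assumed)));
    } while (assumed != old);  // retry if another thread changed the value
    return __longlong_as_double(old);
}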
Now I want to know why the implementation requires a loop, while (assumed != old).
Basically because the implementation requires a load, which can't be performed atomically together with the add and the store. The compare-and-swap operation is an atomic version of
(*address == assumed) ? (assumed + val) : *address
There is no guarantee that the value at address won't change between the cycle in which the value is loaded from address and the cycle in which the atomicCAS call is used to store the updated value. If that happens, the value at address won't be updated. Therefore the loop ensures that the two operations are repeated until there is no change in the value at address between the read and the compare-and-swap operation, which implies that the update took place.
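To see the retry loop at work, here is a minimal sketch in which many threads add 1.0 to one shared double. The names atomicAddDouble, addOnes and sumOnes are hypothetical and not part of the original post. The device function is renamed on the assumption that the code may be compiled for compute capability 6.0 or later, where a built-in atomicAdd() for double exists and the original name would clash.

```cuda
#include <cuda_runtime.h>

// Hypothetical name, chosen to avoid clashing with the built-in
// atomicAdd(double*, double) on compute capability 6.0 and later.
__device__ double atomicAddDouble(double* address, double val)
{
    unsigned long long int* address_as_ull = (unsigned long long int*)address;
    unsigned long long int old = *address_as_ull, assumed;
    do {
        assumed = old;  // the value we believe is currently stored
        // Store assumed + val only if *address still equals assumed;
        // atomicCAS returns the value that was actually stored.
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val + __longlong_as_double(assumed)));
    } while (assumed != old);  // another thread updated first: retry
    return __longlong_as_double(old);
}

// Every thread with index below n adds 1.0 to the same accumulator.
__global__ void addOnes(double* sum, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomicAddDouble(sum, 1.0);
}

// Hypothetical host helper: returns the accumulated sum,
// or -1.0 if any CUDA call reports an error.
double sumOnes(int n)
{
    double* d_sum = nullptr;
    double h_sum = 0.0;
    if (cudaMalloc(&d_sum, sizeof(double)) != cudaSuccess) return -1.0;
    if (cudaMemcpy(d_sum, &h_sum, sizeof(double),
                   cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(d_sum);
        return -1.0;
    }
    const int threads = 256;
    addOnes<<<(n + threads - 1) / threads, threads>>>(d_sum, n);
    cudaError_t err = cudaGetLastError();  // launch errors
    if (err == cudaSuccess)                // blocking copy also waits for the kernel
        err = cudaMemcpy(&h_sum, d_sum, sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(d_sum);
    return err == cudaSuccess ? h_sum : -1.0;
}
```

On a working CUDA device, sumOnes(1 << 16) should return 65536.0, because every failed compare-and-swap is retried until it succeeds. If the loop is replaced by a single atomicCAS call, updates that lose the race are silently dropped and the result will usually come out smaller.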