cuda - atomicAdd() for double on GPU -


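I am doing a project on the GPU, and I have to use atomicAdd() for double. Because CUDA does not natively support atomicAdd() for double (this is the case on devices of compute capability below 6.0), I use the code below, which NVIDIA provides.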

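__device__ double atomicAdd(double* address, double val)
{
    // atomicCAS works on 64-bit integers, so view the double's bits as one.
    unsigned long long int* address_as_ull =
                              (unsigned long long int*)address;
    unsigned long long int old = *address_as_ull, assumed;
    do {
        assumed = old;
        // Store (assumed + val) only if *address still holds `assumed`;
        // atomicCAS returns the value it actually found at the address.
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val +
                               __longlong_as_double(assumed)));
    } while (assumed != old);   // another thread wrote in between: retry
    return __longlong_as_double(old);
}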

Now I want to know why the implementation requires a loop, while (assumed != old).

Basically because the implementation requires a load, which can't be performed atomically together with the add and the store. The compare-and-swap operation is an atomic version of

(*address == assumed) ? (assumed + val) : *address 

There is no guarantee that the value at address won't change between the cycle in which the value is loaded from address and the cycle in which the atomicCAS call is used to store the updated value. If that happens, the value at address won't be updated. Therefore the loop ensures that the two operations are repeated until there is no change of the value at address between the read and the compare-and-swap operation, which implies that the update took place.
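The retry behaviour described above can be replayed by hand on the host. The following is only a sketch in plain C++ with std::atomic, not CUDA code: it assumes that std::atomic's compare_exchange_strong is a fair stand-in for atomicCAS, and the names cas, cas_add, to_bits, to_double and interleaving_demo are made up for this illustration. The "other thread" is simulated by an ordinary call placed between the load and the compare-and-swap.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

using u64 = std::uint64_t;

// Host-side stand-ins for __double_as_longlong / __longlong_as_double.
u64 to_bits(double d) { u64 u; std::memcpy(&u, &d, sizeof u); return u; }
double to_double(u64 u) { double d; std::memcpy(&d, &u, sizeof d); return d; }

// Stand-in for atomicCAS (an assumption of this sketch): store `val` only if
// *address == compare, and return the value found at *address either way.
u64 cas(std::atomic<u64>* address, u64 compare, u64 val) {
    // On failure compare_exchange_strong overwrites `compare` with the
    // current value; on success it leaves `compare` untouched.
    address->compare_exchange_strong(compare, val);
    return compare;
}

// Same loop structure as the device function in the question.
double cas_add(std::atomic<u64>* address, double val) {
    u64 old = address->load(), assumed;
    do {
        assumed = old;
        old = cas(address, assumed, to_bits(val + to_double(assumed)));
    } while (assumed != old);
    return to_double(old);
}

// Replays what the loop does when another writer gets in between
// the load and the compare-and-swap.
double interleaving_demo() {
    std::atomic<u64> sum(to_bits(1.0));
    u64 assumed = sum.load();              // we read 1.0
    cas_add(&sum, 10.0);                   // "another thread" makes it 11.0
    u64 old = cas(&sum, assumed, to_bits(to_double(assumed) + 5.0));
    assert(old != assumed);                // stale attempt: nothing was stored
    assert(to_double(sum.load()) == 11.0);
    assumed = old;                         // retry with the fresh value, 11.0
    old = cas(&sum, assumed, to_bits(to_double(assumed) + 5.0));
    assert(old == assumed);                // this time the store happened
    return to_double(sum.load());          // 11.0 + 5.0
}
```

The first attempt compares against the stale 1.0, finds 11.0 and stores nothing, so without a retry the + 5.0 would simply be lost. The second attempt starts from the value the failed compare-and-swap returned, which is exactly what assumed = old does at the top of the loop.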

