optimization - C/ARM4 multiplication and division performance
I am trying to optimize code that is going to be used on an ARM architecture.
The target is an ARM9 architecture, which doesn't support floating-point calculations, and my testing configuration is ARM4 emulation in VS2008. I have tried to run some comparisons to see if it's worth attempting to change divisions into multiplications, and I've run into some interesting results that I don't understand.
Here is the code I ran:
    // Calculation 1: multiply by the reciprocal of 15
    for(int i = 0; i < testrate; i++){
        num = 356.0f * 0.06666666666666666666666666666667f;
    }
    QueryPerformanceCounter( &end );
    double result1 = end.QuadPart - start.QuadPart;

    // Calculation 2: divide by 15
    QueryPerformanceCounter( &start );
    for(int i = 0; i < testrate; i++)
        num2 = 356.0f / 15.0f;
    QueryPerformanceCounter( &end );
    double result2 = end.QuadPart - start.QuadPart;
And the result:
    Calculation time 1: 1485016.000000, result : 23.733335
    Calculation time 2: 1068092.000000, result : 23.733334
Calculation 1 is the one done with multiplications, and calculation 2 the one done with divisions. The result shows that the divisions are faster, and in the better cases the difference between the two calculations becomes negligible. Here testrate = 1000000.
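One thing worth checking in a test like this: both expressions use only constants, so a compiler is allowed to evaluate them at compile time, in which case neither loop actually times a multiply or a divide. Below is a minimal sketch of a variant that reads its operands through volatile variables so the arithmetic has to happen on every iteration. The helper names time_multiply and time_divide are hypothetical, and the portable clock() from <time.h> is swapped in for QueryPerformanceCounter purely for illustration.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: times `iterations` float multiplications.
 * The operands are volatile, so the compiler must reload them and
 * redo the multiply on every pass instead of folding it away. */
static double time_multiply(long iterations, float *out)
{
    assert(iterations > 0);
    volatile float a = 356.0f;
    volatile float b = 1.0f / 15.0f;   /* reciprocal prepared outside the loop */
    volatile float sink = 0.0f;

    clock_t start = clock();
    for (long i = 0; i < iterations; i++) {
        sink = a * b;
    }
    clock_t end = clock();

    *out = sink;
    return (double)(end - start);      /* elapsed time in clock ticks */
}

/* Hypothetical helper: same loop, but with a division. */
static double time_divide(long iterations, float *out)
{
    assert(iterations > 0);
    volatile float a = 356.0f;
    volatile float b = 15.0f;
    volatile float sink = 0.0f;

    clock_t start = clock();
    for (long i = 0; i < iterations; i++) {
        sink = a / b;
    }
    clock_t end = clock();

    *out = sink;
    return (double)(end - start);      /* elapsed time in clock ticks */
}
```

Calling both helpers with the same iteration count (for example 1000000, as above) and comparing the returned tick counts gives a fairer comparison. The volatile loads and stores add their own overhead to both loops, so only the difference between the two timings is meaningful.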
I know that multiplications have a fixed cycle count while divisions can take up to 24 cycles in the worst case; is it the same on the ARM architecture? The code I am trying to port has various divisions with dynamic denominators, so I am considering using a fast reciprocal function to get the reciprocal and multiplying by it instead of dividing. Based on the result of the test above, though, I'm thinking it might not give the result I am expecting. Can anyone clarify this for me?
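To make the reciprocal idea concrete, here is a minimal sketch, assuming the same dynamic denominator is applied to a whole batch of values so that the single division needed to form the reciprocal is paid only once. The function name divide_all_by is hypothetical, and a plain 1.0f / denominator stands in for whatever fast reciprocal routine would be used on the target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: divides every element of `values` by `denominator`
 * using one division (to form the reciprocal) and one multiply per element. */
static void divide_all_by(float *values, size_t count, float denominator)
{
    assert(denominator != 0.0f);

    /* Stand-in for a fast reciprocal function: a single real division. */
    const float reciprocal = 1.0f / denominator;

    for (size_t i = 0; i < count; i++) {
        values[i] *= reciprocal;
    }
}
```

This only pays off when one denominator is reused across several values; if every division has a fresh denominator, the reciprocal itself costs as much as the division it replaces unless the reciprocal routine is cheaper. Results can also differ from a true division in the last digit, as the 23.733335 versus 23.733334 outputs above show.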