2016-02-10 08:21 AM
I am using EmBitz as my IDE and a Nucleo-L152 as the development platform.
I tried this experiment. An array of 1024 integers (values 0 to 1023) was copied to two different arrays, one using a CPU memory transfer and the other using a DMA memory-to-memory transfer. When each transfer operation started, a digital output pin was set, and when the operation ended, the corresponding pin was reset. I then used an oscilloscope to measure the transfer time in both cases while changing the compiler optimization level: -O0 (no optimization), -O1 (optimize for speed), -O2 (optimize even more), and -O3 (full optimization).

For the DMA memory-to-memory transfer, the time was always in the range 220 to 238 us for all optimization options. For the CPU memory transfer, however, I obtained 1290 us with no optimization (-O0), 323 us with -O1, 346 us with -O2, and 607 us with -O3. While I can understand the decrease in time from -O0 to -O1, I really do not understand why it increases again at -O2 and -O3. My expectation was that full optimization (-O3) would give the minimum time. #compiler-optimization
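For reference, the CPU-side measurement might look roughly like this (a minimal sketch assuming the STM32 HAL; the pin choice PA5, the variable names, and the omitted init code are illustrative, not the original code):

    #include "stm32l1xx_hal.h"

    #define N 1024
    static uint32_t src[N];
    static uint32_t dst[N];

    void cpu_copy_timed(void)
    {
        /* Set the marker pin just before the copy starts. */
        HAL_GPIO_WritePin(GPIOA, GPIO_PIN_5, GPIO_PIN_SET);

        /* Plain word-by-word copy done by the CPU. */
        for (uint32_t i = 0; i < N; i++)
            dst[i] = src[i];

        /* Reset the pin when the copy ends; the scope measures the pulse width. */
        HAL_GPIO_WritePin(GPIOA, GPIO_PIN_5, GPIO_PIN_RESET);
    }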
2016-02-11 01:44 PM
Optimization is not like a gas pedal; otherwise, why would anyone use anything other than -O0 for debug and the maximum level for release? To understand what the compiler is actually doing, see:
https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

Did you use memcpy(), or did you roll your own copy (copy by bytes or by 32-bit words)?
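To make the distinction concrete, here are the three copy styles in question (an illustrative sketch, not the original poster's code):

    #include <string.h>
    #include <stdint.h>

    #define N 1024

    /* Byte-by-byte copy: four loads/stores per 32-bit element. */
    void copy_bytes(uint8_t *dst, const uint8_t *src)
    {
        for (uint32_t i = 0; i < N * sizeof(uint32_t); i++)
            dst[i] = src[i];
    }

    /* Word-by-word copy: one load/store per 32-bit element. */
    void copy_words(uint32_t *dst, const uint32_t *src)
    {
        for (uint32_t i = 0; i < N; i++)
            dst[i] = src[i];
    }

    /* Library call: at higher optimization levels GCC may recognize a
     * plain copy loop and substitute a memcpy call on its own. */
    void copy_memcpy(uint32_t *dst, const uint32_t *src)
    {
        memcpy(dst, src, N * sizeof(uint32_t));
    }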