2016-03-02 01:03 PM
Hi all,
For an application I'm working on, I needed to speed up startup. I looked into the .init and .bss routines and saw that they weren't very efficient. I rewrote them and got about a 2.2x improvement in startup time for my application. The revised code does the same thing, just in fewer cycles.
Feel free to use it if it's useful to anyone. @ST, feel free to copy it into the library you distribute.
https://gist.github.com/ppannuto/672328eb8184abdb9559
-Pat
#startup-speed-efficiency-fast
2016-03-03 07:17 AM
To give better visibility on the answered topics, please click on Accept as Solution on the reply which solved your issue or answered your question.
2016-03-03 10:45 AM
Sure. If you compare the two loops, the original code executed several more instructions per iteration. On every loop iteration it would read the same memory address into the same register, which it didn't need to do. You can also eliminate the adds that increments the pointer by using the stmia [store and increment after] instruction; for a single store operation, str and stmia both take 2 cycles. (On more powerful cores, Cortex-M3 and up, you usually use postfix addressing, e.g. str r0, [r1], #4, to do this. Postfix isn't supported on the M0, but stmia is.)

In the old code, the loop part was

      ldr r3, =_sidata
      ldr r3, [r3, r1]
      str r3, [r0, r1]
      adds r1, r1, #4
      ldr r0, =_sdata
      ldr r3, =_edata
      adds r2, r0, r1
      cmp r2, r3
      bcc CopyDataInit

      5 memory operations x 2 cycles each    = 10 cycles
    + 3 alu operations x 1 cycle each        =  3 cycles
    + 1 branch operation x 1 cycle (usually) =  1 cycle

for 14 cycles / loop. In the new code, the loop part is

      ldmia r2!, {r3}
      stmia r0!, {r3}
      cmp r0, r1
      bcc CopyDataInitializersLoop

      2 memory/alu operations x 2 cycles each =  4 cycles
    + 1 alu operation x 1 cycle each          =  1 cycle
    + 1 branch operation x 1 cycle (usually)  =  1 cycle

for 6 cycles / loop.

Is this clear?
-Pat

Complete Old Copy Data:

      movs r1, #0
      b LoopCopyDataInit
    CopyDataInit:
      ldr r3, =_sidata
      ldr r3, [r3, r1]
      str r3, [r0, r1]
      adds r1, r1, #4
    LoopCopyDataInit:
      ldr r0, =_sdata
      ldr r3, =_edata
      adds r2, r0, r1
      cmp r2, r3
      bcc CopyDataInit

Complete New Copy Data:

    CopyDataInitializersStart:
      ldr r0, =_sdata   /* write to this addr */
      ldr r1, =_edata   /* until you get to this addr */
      ldr r2, =_sidata  /* reading from this addr */
      b CopyDataInitializersEnterLoop
    CopyDataInitializersLoop:
      ldmia r2!, {r3}
      stmia r0!, {r3}
    CopyDataInitializersEnterLoop:
      cmp r0, r1
      bcc CopyDataInitializersLoop
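
For anyone who wants to check the arithmetic or the equivalence of the two loops off-target, here is a rough C sketch. It is only a model, not the startup code itself: the function and type names (cycle_costs, old_loop_cycles, new_loop_cycles, copy_data_old, copy_data_new) are made up for illustration, and the cycle costs are inputs taken from the counts in the post rather than anything measured. The branch cost is left as a parameter because the post's 1-cycle figure is marked "usually", and a taken branch can cost more depending on the core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-instruction cycle costs (assumed inputs, supplied by the caller). */
typedef struct {
    unsigned mem;    /* ldr, str, single-register ldmia/stmia */
    unsigned alu;    /* adds, cmp */
    unsigned branch; /* bcc */
} cycle_costs;

/* Old loop body: 5 memory operations, 3 ALU operations, 1 branch. */
unsigned old_loop_cycles(cycle_costs c)
{
    return 5 * c.mem + 3 * c.alu + 1 * c.branch;
}

/* New loop body: ldmia + stmia, 1 cmp, 1 branch. */
unsigned new_loop_cycles(cycle_costs c)
{
    return 2 * c.mem + 1 * c.alu + 1 * c.branch;
}

/* Model of the old loop: base addresses plus a running offset, with the
 * current address recomputed on every pass. The offset counts words here,
 * whereas r1 in the assembly counts bytes. */
void copy_data_old(uint32_t *sdata, const uint32_t *edata,
                   const uint32_t *sidata)
{
    size_t offset = 0;
    assert(sdata <= edata);
    while (sdata + offset < edata) {
        sdata[offset] = sidata[offset];
        offset++;
    }
}

/* Model of the new loop: source and destination pointers that
 * post-increment, like ldmia r2!, {r3} / stmia r0!, {r3}. */
void copy_data_new(uint32_t *dst, const uint32_t *end,
                   const uint32_t *src)
{
    assert(dst <= end);
    while (dst < end) {
        *dst++ = *src++;
    }
}
```

With the post's figures (mem = 2, alu = 1, branch = 1) the two helpers return 14 and 6 cycles per word. Plugging in a different branch cost shows how the ratio shifts if the branch is more expensive on your core.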