2009-06-11 05:48 AM
LOAD/STORE speed in STM32
2011-05-17 04:14 AM
Hello,
I have a question: how fast do the LDRB and STRB instructions execute? I have the following program:
    LDRB R2, [R1], #+1
    STRB R2, [R0]
    LDRB R2, [R1], #+1
    STRB R2, [R0]
    ... etc.
The code above needs 8 cycles, but this other program:
    LDRB R2, [R0]
    STRB R2, [R1], #+1
    LDRB R2, [R0]
    STRB R2, [R1], #+1
    ... etc.
needs 14 cycles to execute! Why? It is almost the same as the first program.
2011-05-17 04:14 AM
Hi,
Could you let me know the contents of R0 and R1 in both cases?
Cheers, STOne-32.
2011-05-17 04:14 AM
So, in the first program:
R0 contains the address of the high byte of GPIOA ODR, 0x4001080D
R1 contains the address of the memory buffer (in RAM) from which the data is read
In the second program:
R0 contains the address of the low byte of GPIOA IDR, 0x40010808
R2 contains the address of the memory where the input port data is written
2011-05-17 04:14 AM
Hi,
Could you repeat the tests at 8 MHz, running from the internal RC oscillator, and see whether the results are the same? Do not add wait states for the Flash; leave them at their defaults.
Cheers, STOne-32.
2011-05-17 04:14 AM
The result is the same at 8 MHz. Flash wait states probably matter only on branches. Maybe the GPIO is slow? Writing to the port is faster than reading it; that is what my results show. Changing the registers has no effect.
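For reference, the two loops compared in this thread can be sketched in C as below. This is only an illustration: the function names `buffer_to_port` and `port_to_buffer` are hypothetical, the register offsets assume the STM32F10x memory map (GPIOA at 0x40010800, IDR at +0x08, ODR at +0x0C), and a compiler will not necessarily emit the exact LDRB/STRB sequence shown in the first post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: STM32F10x memory map. The thread only quotes the two
 * final byte addresses; base and offsets are filled in here. */
#define GPIOA_BASE   0x40010800u
#define GPIO_IDR_OFF 0x08u   /* input data register offset  */
#define GPIO_ODR_OFF 0x0Cu   /* output data register offset */

/* Byte addresses quoted in the thread. */
#define GPIOA_IDR_LOW_BYTE  (GPIOA_BASE + GPIO_IDR_OFF)       /* 0x40010808 */
#define GPIOA_ODR_HIGH_BYTE (GPIOA_BASE + GPIO_ODR_OFF + 1u)  /* 0x4001080D */

/* First program: load each byte from RAM, store it to the port. */
void buffer_to_port(volatile uint8_t *port, const uint8_t *src, size_t n)
{
    assert(port != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        *port = src[i];   /* load from RAM, store to GPIO */
    }
}

/* Second program: load each byte from the port, store it to RAM. */
void port_to_buffer(const volatile uint8_t *port, uint8_t *dst, size_t n)
{
    assert(port != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] = *port;   /* load from GPIO, store to RAM */
    }
}
```

On the target, `port` would be the address cast to a pointer, for example `(volatile uint8_t *)GPIOA_ODR_HIGH_BYTE`; on a PC the same functions can be exercised with an ordinary variable standing in for the port.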
2011-05-17 04:14 AM
Hi, I am wondering how you measure the number of cycles. This bears on your conclusion. :o
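One common way to count cycles on a Cortex-M3 is the DWT cycle counter (CYCCNT). The sketch below assumes that method; the thread does not say how the original figures were obtained. The helper names (`cyccnt_enable`, `cyccnt_read`, `cycles_elapsed`) are hypothetical, and the register accesses are compiled only when building for an ARMv7-M target.

```c
#include <assert.h>
#include <stdint.h>

/* Cortex-M3 debug registers (ARMv7-M architecture). */
#define DEMCR_ADDR      0xE000EDFCu  /* Debug Exception and Monitor Control */
#define DWT_CTRL_ADDR   0xE0001000u  /* DWT control register                */
#define DWT_CYCCNT_ADDR 0xE0001004u  /* DWT cycle counter                   */
#define DEMCR_TRCENA    (1u << 24)   /* enables the DWT unit                */
#define DWT_CYCCNTENA   (1u << 0)    /* enables the cycle counter           */

/* Cycles between two CYCCNT samples. Unsigned subtraction stays
 * correct across a single wraparound of the 32-bit counter. */
uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Host-side sanity check of the wraparound arithmetic. */
void cycles_elapsed_selfcheck(void)
{
    assert(cycles_elapsed(100u, 108u) == 8u);
    assert(cycles_elapsed(0xFFFFFFFCu, 0x0000000Au) == 14u);
}

#if defined(__ARM_ARCH_7M__)
/* Target-only helpers: switch the counter on and read it. */
static inline void cyccnt_enable(void)
{
    *(volatile uint32_t *)DEMCR_ADDR      |= DEMCR_TRCENA;
    *(volatile uint32_t *)DWT_CYCCNT_ADDR  = 0u;
    *(volatile uint32_t *)DWT_CTRL_ADDR   |= DWT_CYCCNTENA;
}

static inline uint32_t cyccnt_read(void)
{
    return *(volatile uint32_t *)DWT_CYCCNT_ADDR;
}
#endif
```

A measurement would read the counter immediately before and after the instruction sequence and pass both samples to `cycles_elapsed`. Timing an empty region first and subtracting that overhead helps keep the instrumentation itself out of the figure.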
2011-05-17 04:14 AM
Quote:
On 09-06-2009 at 21:18, Anonymous wrote: So, in the first program: R0 contains the address of the high byte of GPIOA ODR, 0x4001080D; R1 contains the address of the memory buffer (in RAM) from which the data is read. In the second program: R0 contains the address of the low byte of GPIOA IDR, 0x40010808; R2 contains the address of the memory where the input port data is written.
I believe that the R2 at the end is really R1 and was a typo, right? In that case this is not difficult to explain. I have replaced R1 and R0 by their meaning:
Quote:
    LDRB R2, [Internal RAM]
    STRB R2, [GPIO DR]
    LDRB R2, [Internal RAM]
    STRB R2, [GPIO DR]
    ... etc.
The code above needs 8 cycles, but the other program:
    LDRB R2, [GPIO DR]
    STRB R2, [Internal RAM]
    LDRB R2, [GPIO DR]
    STRB R2, [Internal RAM]
    ... etc.
needs 14 cycles to execute! Why?
The GPIO DR registers are located on the "high-speed APB" bus, behind the AHB-to-APB bridge, whereas the internal RAM is located on the AHB bus. Inside the AHB-to-APB bridge there is a "write buffer" that buffers writes to APB locations. That is why the timing is minimal in the first case. In the second case, however, you load from the APB with LDRB, which is a blocking instruction, and the high-speed APB needs 1 additional cycle compared to RAM if the APB/AHB prescaler is equal to 1. I hope this explains your results.
Cheers, STOne-32.
2011-05-17 04:14 AM
Wow, this is a bit complex, but now I understand why the read is slower in the second case. Thank you very much! Can I read data from the GPIO any faster? For example, is there another LOAD instruction I could use? I have not found anything in the datasheet yet.