2018-02-16 05:52 AM
Hello
In my previous experience with Atmel 8-bit MCUs, which have 1 MIPS/MHz performance, I had exactly 1 executed instruction per system clock tick.
Now I'm using an STM32F103. I noted from the datasheet that its performance is 1.25 DMIPS/MHz. So I wrote a small assembler program, in short:
LDR param0, [R6] ; param0 receiver, R6 contains address in periph bit-bang
STR param0, [R7], #4 ; R7 contains address in SRAM bit-bang
B Loop
There are no prescalers for either AHB or APB1/2. I loaded this small code into embedded SRAM, set flash latency to 0, disabled the flash prefetch buffer, and turned off all interrupts and DMA.
Then I measured how fast this code executes from SRAM. The result is that one instruction takes 4 system clock ticks (the branch takes 8), and the actual performance is 0.25 MIPS/MHz.
What did I do wrong? Or what did I misunderstand?
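For reference, the arithmetic behind these figures can be sketched in C. This is only a minimal sketch: the helper name `mips_per_mhz` is hypothetical, and the cycle counts are the ones measured above, not datasheet values.

```c
/* Cycle counts as measured in this post (not taken from any datasheet). */
enum { CYCLES_LDR = 4, CYCLES_STR = 4, CYCLES_B = 8 };

/* Instructions retired per clock cycle; numerically equal to MIPS/MHz. */
static double mips_per_mhz(int instructions, int cycles)
{
    return (double)instructions / (double)cycles;
}

/* Total cycles for one pass through the three-instruction loop. */
static int loop_cycles(void)
{
    return CYCLES_LDR + CYCLES_STR + CYCLES_B;
}
```

A single 4-tick load or store gives `mips_per_mhz(1, CYCLES_LDR)` = 0.25, while the whole loop (3 instructions in 16 ticks) gives `mips_per_mhz(3, loop_cycles())` = 0.1875.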
2018-02-16 06:04 AM
contains address in periph bit-bang
You mean bit-band?
Bit-banding is internally a read-modify-write (for writes through the alias region).
Peripheral accesses go through the AHB/APB bus, which may add a few more clocks of latency.
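To make the bit-band addressing concrete, here is a minimal C sketch of how a bit-band alias address is formed on a Cortex-M3. The region and alias base addresses come from the Cortex-M3 memory map; the function name `bitband_alias` and the macro names are hypothetical, chosen only for illustration.

```c
#include <stdint.h>

/* Cortex-M3 bit-band regions (1 MB each) and their alias regions. */
#define PERIPH_BB_BASE   0x40000000u
#define PERIPH_BB_ALIAS  0x42000000u
#define SRAM_BB_BASE     0x20000000u
#define SRAM_BB_ALIAS    0x22000000u

/* Each bit in the bit-band region maps to one 32-bit word in the alias:
 * alias = alias_base + byte_offset * 32 + bit_number * 4               */
static uint32_t bitband_alias(uint32_t region_base, uint32_t alias_base,
                              uint32_t byte_addr, uint32_t bit)
{
    return alias_base + (byte_addr - region_base) * 32u + bit * 4u;
}
```

Every access to such an alias word is translated by the bus matrix into an access to the underlying word, which is one reason it is not a single-cycle operation.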
These are NOT microcontrollers as you are used to from the 8-bitters, where all components are clocked tightly with the processor core.
These are SoCs: systems like those you would in the past have built on a board from a processor, memories and a bunch of peripheral chips, with a timing arbitrator relying heavily on combining the wait states of all parties involved.
JW
2018-02-16 06:19 AM
'1.25 DMIPS/MHz'
DMIPS != MIPS.
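To spell out the difference: DMIPS is a Dhrystone benchmark score normalised to the VAX 11/780 (1757 Dhrystones per second is defined as 1 DMIPS), not a count of native instructions executed. A minimal C sketch, with hypothetical function names:

```c
/* Dhrystones per second scored by the VAX 11/780 reference machine,
 * which is defined as exactly 1 DMIPS. */
#define VAX_11_780_DHRYSTONES_PER_SEC 1757.0

/* Convert a raw Dhrystone score into DMIPS. */
static double dmips(double dhrystones_per_sec)
{
    return dhrystones_per_sec / VAX_11_780_DHRYSTONES_PER_SEC;
}

/* Normalise DMIPS by the core clock in MHz. */
static double dmips_per_mhz(double dhrystones_per_sec, double clock_mhz)
{
    return dmips(dhrystones_per_sec) / clock_mhz;
}
```

As a purely illustrative input (not a measured result), a score of 158130 Dhrystones per second at 72 MHz would work out to 1.25 DMIPS/MHz. Because the reference machine needed many of its own instructions per Dhrystone iteration, a core can score more than 1 DMIPS/MHz while retiring fewer than one instruction per clock.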
2018-02-16 07:21 AM
I am afraid you misunderstood what a DMIPS is (datasheet mentions Dhrystone 2.1). See there:
2018-02-16 07:37 AM
That's interesting: where does the border line go between an MCU and a SoC? Imagine it's quite floating, since an STM32Fxx is a lot simpler (yet cumbersome) than a Sitara AM3358, for example.
2018-02-16 07:38 AM
Thanks for the answer. I was pretty sure that I could not get a real 1 MIPS.
But the question is still open for me: what is 1.25 DMIPS? Which instructions should I make the core execute to reach this performance?
And what about the branch instruction? There are no conditions and no peripheral access, but it still takes about 8 ticks.
2018-02-16 07:52 AM
Branches have a defined cost of some sort on most micros. Why not try benchmarking on a device with CCM RAM (F303/F407 etc.), RAM that sits directly on the CPU bus?
2018-02-16 07:57 AM
I do not have any of them, and I do not want to run benchmarks. I built a device with the F103. I thought it could handle the task given to it, but it seems too slow, and I wondered why. The frequency is very high (in comparison with my previous 8-bit MCUs).
2018-02-16 07:58 AM
I agree.
But now I am really confused. How can DMIPS be > MIPS? I thought that DMIPS was some kind of average speed.
2018-02-16 08:05 AM
Mikas,
Imagine it's quite floating
As with most terms around in electronics.
The marketing wants their share, too, so they will loudly disagree with my 'definition', as they want to replace the 8-bitters by 32-bitters at all costs (pun intended).
Roman,
And what about the branch instruction? There are no conditions and no peripheral access, but it still takes about 8 ticks
How do you know?
JW