2024-03-30 05:59 AM
I couldn't find documentation specifying the execution time or number of clock cycles required for each assembly instruction. If I want to calculate how fast the processor executes instructions, what should I do?
Is there a resource available for calculating the number of clock cycles for each assembly instruction?
Thanks.
2024-03-30 11:42 AM
This is an ARM core, so look at the timing that ARM gives for it...
And it's not as simple as saying "this instruction takes one cycle", because this CPU is designed to work together with the compiler that generates the code, so the result depends on the compiler settings.
An instruction may be followed by another instruction that can execute at the same time, or by one that needs a long address (+1 cycle). It depends on how the compiler arranges the code (instruction scheduling) and on whether the instruction is in a cached area or needs a new flash access (+x wait states). So it is better to look at the effective speed at different optimizer settings than at one particular instruction. Most instructions are basically "one cycle", because this is a RISC CPU, but some wait cycles may be added depending on the surrounding code.
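To compare "effective speed" between builds, one option is to reduce each measurement to an average cycles-per-instruction figure and a wall-clock time. The sketch below is only an illustration: the function names are made up here, and the cycle and instruction counts are assumed to come from your own measurement or from a simulator.

```c
#include <stdint.h>

/* Average cycles per instruction over a measured code region.
   Both counts are assumed inputs (measured or simulated), not derived here. */
double effective_cpi(uint32_t cycles, uint32_t instructions)
{
    if (instructions == 0u) {
        return 0.0; /* nothing executed; avoid dividing by zero */
    }
    return (double)cycles / (double)instructions;
}

/* Wall-clock time in microseconds for a cycle count at a given core clock. */
double cycles_to_us(uint32_t cycles, uint32_t core_clock_hz)
{
    return (double)cycles * 1e6 / (double)core_clock_hz;
}
```

Running the same routine at, say, -O0 and -O2 and comparing the two results shows the effect of compiler settings and wait states together, which a per-instruction table cannot.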
2024-03-30 07:25 AM
https://www.st.com/resource/en/programming_manual/pm0056-stm32f10xxx20xxx21xxxl1xxxx-cortexm3-programming-manual-stmicroelectronics.pdf
But beware that some instructions (those accessing peripheral buses, etc.) can take longer, depending on the bus clock and how busy the bus is...
2024-03-30 07:59 AM
For Cortex-M3: https://developer.arm.com/documentation/ddi0337/h/programmers-model/instruction-set-summary/cortex-m3-instructions
2024-03-30 10:34 AM
The pipeline generally allows for a throughput of one instruction per cycle.
You should also look at the DWT unit and the CYCCNT register to benchmark code and loops, etc.
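A minimal sketch of benchmarking with CYCCNT, assuming a Cortex-M3/M4/M7 core (Cortex-M0/M0+ have no cycle counter). The register addresses are the architectural ARMv7-M ones; the function names are made up for this example. With the CMSIS headers you would normally write CoreDebug->DEMCR, DWT->CTRL and DWT->CYCCNT instead of the raw addresses.

```c
#include <stdint.h>

/* Architecturally defined ARMv7-M debug registers. */
#define DEMCR       (*(volatile uint32_t *)(uintptr_t)0xE000EDFCu)
#define DWT_CTRL    (*(volatile uint32_t *)(uintptr_t)0xE0001000u)
#define DWT_CYCCNT  (*(volatile uint32_t *)(uintptr_t)0xE0001004u)

#define DEMCR_TRCENA        (1u << 24) /* enables the DWT unit */
#define DWT_CTRL_CYCCNTENA  (1u << 0)  /* starts the cycle counter */

/* Enable the trace block, clear the counter, and start counting. */
void cyccnt_init(void)
{
    DEMCR |= DEMCR_TRCENA;
    DWT_CYCCNT = 0u;
    DWT_CTRL |= DWT_CTRL_CYCCNTENA;
}

/* Read the free-running 32-bit cycle counter. */
uint32_t cyccnt_read(void)
{
    return DWT_CYCCNT;
}

/* Unsigned subtraction stays correct across a single counter wrap-around. */
uint32_t cyccnt_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Hypothetical helper: cycles spent in fn(), including call and read overhead. */
uint32_t cyccnt_measure(void (*fn)(void))
{
    uint32_t start = cyccnt_read();
    fn();
    return cyccnt_elapsed(start, cyccnt_read());
}
```

Call cyccnt_init() once at startup. Measuring an empty function first gives the fixed overhead, which can then be subtracted from later results.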
2024-03-30 11:03 AM - edited 2024-03-30 11:04 AM
> what should I do?
Please tell us what your real need is. Is it purely academic?
2024-03-30 12:12 PM - edited 2024-03-30 12:13 PM
Completely agree with @AScha.3! The ARM Cortex-M core is a RISC architecture, and in terms of MIPS we can assume an average of one instruction per cycle.
As @Tesla DeLorean also said, this is linked to the pipeline: each stage takes 1 HCLK cycle. More complex pipelines such as the Cortex-M7's are dual-issue, so 2 instructions can execute in the same cycle, but they can have up to 5 to 7 stages. The nightmare for pipelines is branches, which cost the most cycles because they flush the pipeline.
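As a rough illustration of the above, here is a first-order estimate that charges one cycle per instruction plus a refill penalty for each taken branch. The function names and the model itself are assumptions made for this sketch. It ignores flash wait states, multi-cycle instructions and bus stalls, so treat the result as a lower bound. For Cortex-M3, ARM's instruction timing table gives a taken branch as 1 + P cycles, with P between 1 and 3.

```c
#include <stdint.h>

/* First-order cycle estimate: one cycle per instruction, plus a
   pipeline-refill penalty for every taken branch. */
uint32_t estimate_cycles(uint32_t instructions,
                         uint32_t taken_branches,
                         uint32_t refill_penalty)
{
    return instructions + taken_branches * refill_penalty;
}

/* Million instructions per second at a given HCLK for that estimate. */
double estimate_mips(uint32_t hclk_hz, uint32_t instructions, uint32_t cycles)
{
    if (cycles == 0u) {
        return 0.0; /* avoid dividing by zero */
    }
    return ((double)hclk_hz / 1e6) * ((double)instructions / (double)cycles);
}
```

For example, 1000 instructions with 100 taken branches and a 3-cycle refill come to 1300 cycles, so a 72 MHz core would deliver noticeably fewer than 72 MIPS on that code.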
Hope it helps.
STOne-32
2024-03-30 05:41 PM
You can use a software simulator to step through your assembly code and watch the cycle counter - but it will probably not show all the delays with memory accesses - that's where the fun starts, organising the register loads/unloads so they don't slow down the other operations as much.