2014-01-20 07:44 AM
I'm on an STM32 F2, but a generic idea would be ideal :)
Is there any way to get a high-level, 'reasonably' accurate idea of how busy the CPU is? Given the variety of peripherals and so on, I'm sure it's quite a challenge, if doable at all, to get an accurate figure, but for my purposes I'd just like an idea of how effective some optimizations are. (By 'busy' I mean in terms of CPU cycles available to do some manual work, not in terms of how many peripherals are active at once, etc.)
For example, say you're doing a full memcpy sort of operation in pure C or ASM code, versus firing up a DMA to do it. If you perform this optimization, it'd be nice to see a 'cpu is 90% busy' down to 'cpu is 40% busy' sort of test be doable.
jeff
2014-01-20 08:01 AM
Hi
Tricky. I had this discussion with a colleague a few years back and he managed to convince me of his take on it: the CPU is always 100% busy doing what it has been told to do. If it is doing 'nothing', it is spending 100% of its time doing 'nothing'.
"it'd be nice to see a 'cpu is 90% busy' down to 'cpu is 40% busy' sort of test be doable."
This kind of measurement is only possible once you introduce the concept of 'idling'. Simply put, you need an 'idle task'. Then you can measure the time spent idling versus the time spent processing over a fixed period.
2014-01-20 08:22 AM
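For reference, the idle-versus-processing measurement described in the reply above could be sketched roughly as follows. This is only an illustration under assumptions: the `load_meter_t` type and the `load_*` functions are hypothetical names, and `now` is assumed to be a reading of some free-running 32-bit tick counter that the application already has.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for one measurement window. All timestamps are
 * readings of a free-running 32-bit tick counter (an assumption here). */
typedef struct {
    uint32_t window_start; /* timestamp when the window began */
    uint32_t idle_enter;   /* timestamp when idling last began */
    uint32_t idle_ticks;   /* total ticks spent idling in this window */
} load_meter_t;

/* Start a new measurement window. */
static inline void load_window_begin(load_meter_t *m, uint32_t now)
{
    m->window_start = now;
    m->idle_enter = now;
    m->idle_ticks = 0u;
}

/* Call when the idle task (or idle loop) takes over. */
static inline void load_idle_enter(load_meter_t *m, uint32_t now)
{
    m->idle_enter = now;
}

/* Call when real work resumes; unsigned subtraction survives one wrap. */
static inline void load_idle_exit(load_meter_t *m, uint32_t now)
{
    m->idle_ticks += now - m->idle_enter;
}

/* Busy share of the window so far, as a whole-number percentage. */
static inline uint32_t load_busy_percent(const load_meter_t *m, uint32_t now)
{
    uint32_t total = now - m->window_start;
    if (total == 0u) {
        return 0u; /* nothing measured yet */
    }
    uint32_t idle = (m->idle_ticks > total) ? total : m->idle_ticks;
    return (uint32_t)(((uint64_t)(total - idle) * 100u) / total);
}
```

Where the enter and exit calls go (around a WFI, or at interrupt entry and exit) depends on the application, and nested interrupts would need extra care.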
Years ago, a networking guy asked me, 'is it better if the pipes are full or empty?', and naturally I said 'full, not wasting any time', and he said 'empty, networks should always be at minimum use'. Developers want to see things running flat out all the time, but flexible enough to adjust when new input comes in. The CPU can't be sitting around doing nothing, that's no good :) (except nowadays you do want the CPU to idle and conserve battery, but times have changed). But the network should be full of 'my' data, not anyone else's ;)
Good idea; I've got an interrupt-driven application, so main is just a while(1) { nop } sort of loop ... I really should put the CPU to sleep sometime. The trick is that you can't just grab SysTick at the beginning and end of the loop or the like, but I suppose I could add a counter in there and capture it every n milliseconds, say, so I can see how many cycles are NOPping per second (plus the while-loop overhead), then compare after optimizations to see how many additional counts it's getting. Hmm, okay, not bad :)
2014-01-20 08:38 AM
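The counter-in-the-main-loop idea above might look something like this minimal sketch. The names (`idle_loop_body`, `idle_sample`, `idle_set_baseline`, `idle_busy_percent`) are made up for illustration, and the baseline is assumed to be a count captured once over the same period with no interrupt load. The counter is never reset, so a periodic interrupt can take a delta without racing the main loop's increment.

```c
#include <stdint.h>

static volatile uint32_t idle_count; /* bumped by the main loop, never reset */
static uint32_t idle_count_prev;     /* counter value at the previous sample */
static uint32_t idle_baseline = 1u;  /* counts per period with nothing else running */

/* Body of the while(1) loop in main(). */
static inline void idle_loop_body(void)
{
    idle_count++;
}

/* Call every n milliseconds (e.g. from a timer interrupt).
 * Returns how many loop iterations ran since the previous call. */
static inline uint32_t idle_sample(void)
{
    uint32_t now = idle_count;
    uint32_t delta = now - idle_count_prev; /* wrap-safe unsigned delta */
    idle_count_prev = now;
    return delta;
}

/* Record the count seen over one period on an otherwise quiet system. */
static inline void idle_set_baseline(uint32_t counts)
{
    idle_baseline = (counts != 0u) ? counts : 1u;
}

/* Convert one period's loop count into a rough busy percentage. */
static inline uint32_t idle_busy_percent(uint32_t delta)
{
    if (delta > idle_baseline) {
        delta = idle_baseline; /* clamp measurement jitter */
    }
    return 100u - (uint32_t)(((uint64_t)delta * 100u) / idle_baseline);
}
```

Comparing the raw deltas before and after an optimization works just as well; the percentage is only a convenience once a baseline is known.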
Well there are trace and profiling tools. Many RTOS solutions provide methods of collecting task level profile data.
One common method is to modulate a GPIO pin, or pins, and view them with a scope or logic analyzer to understand how long an interrupt takes, or how long you're in an idle/WFI task. If you can measure current over time, that is quite effective at determining how busy CMOS circuits are.
You can also directly benchmark pieces of code, either by themselves or as part of normal operation. The DWT trace unit present on all STM32 F1/F2/F4/F3/L1 parts has a cycle counter that can measure the time some group of instructions or functions takes. It's 32-bit and has the granularity of the processor clock. The SysTick counter is only 24-bit and a bit cruder, and there are several 32-bit TIM units that could be free run at various speeds.
2014-01-20 05:53 PM
CYCCNT is only one of the performance counters available in the DWT. There are a number of other counters, and those can emit reports via the TPIU. Try Keil's uVision IDE (32kB limit) - it (kind of) supports SWV and performance counters. Atollic also has some DWT features.
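For the CYCCNT cycle counter discussed above, a bare-register sketch might look like this. It assumes a Cortex-M3/M4 core and uses the architectural register addresses directly rather than the CMSIS definitions; the macro and function names are illustrative only. The register-touching functions must only be called on the target, while the two arithmetic helpers are plain C.

```c
#include <stdint.h>

/* ARMv7-M core debug / DWT registers (macro names are illustrative). */
#define DEMCR_REG      (*(volatile uint32_t *)0xE000EDFCu)
#define DWT_CTRL_REG   (*(volatile uint32_t *)0xE0001000u)
#define DWT_CYCCNT_REG (*(volatile uint32_t *)0xE0001004u)

/* Turn the cycle counter on. Only call this on the target. */
static inline void cyccnt_enable(void)
{
    DEMCR_REG |= (1u << 24); /* TRCENA: enable the trace/DWT block */
    DWT_CYCCNT_REG = 0u;     /* start counting from zero */
    DWT_CTRL_REG |= 1u;      /* CYCCNTENA: start the cycle counter */
}

/* Read the free-running cycle count. Only call this on the target. */
static inline uint32_t cyccnt_read(void)
{
    return DWT_CYCCNT_REG;
}

/* Cycles between two readings; unsigned subtraction stays correct
 * as long as the counter wrapped at most once in between. */
static inline uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert a cycle count to microseconds for a given core clock in Hz. */
static inline uint32_t cycles_to_us(uint32_t cycles, uint32_t core_hz)
{
    if (core_hz == 0u) {
        return 0u; /* avoid dividing by zero on a bad argument */
    }
    return (uint32_t)(((uint64_t)cycles * 1000000u) / core_hz);
}
```

Typical use would be to call `cyccnt_enable()` once at startup, then wrap the code under test (the memcpy versus the DMA setup, say) between two `cyccnt_read()` calls and pass them to `cycles_elapsed()`. Assuming a 120 MHz core clock, the 32-bit counter wraps roughly every 36 seconds, so this only suits intervals shorter than that.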