2015-05-29 10:48 AM
The STM32F407 datasheet (Table 4, section 2.2.21) gives the max speed for TIM1 as 168MHz, whereas TIM2-5 only run at max. 84MHz. This is quite clear, as TIM1 connects to APB2, running at max. 84MHz, and TIM2-5 connect to APB1, running at max. 42MHz. ...
The strange thing is: in the STM32F411 datasheet (Table 4, section 3.20) the max speed of all timers is specified with the same value, 100MHz, although TIM1 still connects to the double-speed APB2, and TIM2-5 connect to the half-speed APB1. Is this real or maybe some misprint? Or do the timers TIM2-5 need some special "trick setting" to get this higher speed? The include file and the register settings are always STM32F4XX.H, so the STM32F411 seems to use exactly the same settings as the STM32F405/7 (just a subset, of course), which makes this really very strange.
2015-05-29 12:04 PM
2015-05-29 12:17 PM
All the STM32 designs allow for the timers to pull their clock from one tap earlier on the divider chain, except the DIV1 case where there isn't a faster source. Review the "Clock Tree" diagram in the Reference Manual for each series of parts.
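A minimal sketch of that rule in plain C, so it can be checked on a PC. The function name timer_clock_hz and its parameters are made up for illustration and do not come from any ST header; it assumes the default behaviour (no TIMPRE-style override) and a bus prescaler given as a plain divider value.

```c
#include <stdint.h>

/* Hypothetical helper, not an ST API.
 * hclk_hz : AHB clock in Hz
 * apb_div : APB prescaler as a plain divider (1, 2, 4, 8 or 16)
 * Returns the clock feeding the timers on that APB bus:
 *   divider 1      -> same as the bus clock (no faster tap exists)
 *   divider 2..16  -> twice the bus clock (one tap earlier) */
uint32_t timer_clock_hz(uint32_t hclk_hz, uint32_t apb_div)
{
    uint32_t pclk_hz = hclk_hz / apb_div;
    return (apb_div == 1u) ? pclk_hz : 2u * pclk_hz;
}
```

For example, timer_clock_hz(168000000u, 4u) describes an APB1 timer on a 168 MHz part with APB1 at DIV4, and timer_clock_hz(100000000u, 1u) an APB2 timer on a 100 MHz part with APB2 at DIV1.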
The 407 has APB1 at DIV4 and APB2 at DIV2, APB2 here being nominally 84 MHz and the TIMCLKs attached to it at 168 MHz.
The 411 has APB1 at DIV2 and APB2 at DIV1, so the TIMCLKs on both these buses can be 100 MHz.
The general philosophy at work here is that any counter/timer is going to at least divide its input by TWO, i.e. one clock tick up, one clock tick down.
2015-06-16 09:45 AM
Thank you for your answer. It sounds very interesting, but I must admit, that I did not understand it really.
Going from the Discovery STM32F407 to the Discovery STM32F411, if I want to operate the Discovery 411 at 84 MHz for simplicity, I understand that I should make the following changes in the clock configuration file system_stm32f4xx.c:

Parameter   Discovery407                                Discovery411                                Remark
PLL_P       2                                           4                                           to reduce core speed to 84MHz
RCC_CFGR    RCC_CFGR_PPRE2_DIV2 | RCC_CFGR_PPRE1_DIV4   RCC_CFGR_PPRE2_DIV1 | RCC_CFGR_PPRE1_DIV2   to set APB1 to 42MHz and APB2 to 84MHz
FLASH_ACR   FLASH_ACR_LATENCY_5WS                       FLASH_ACR_LATENCY_2WS                       for 84MHz 2 WS required (RM 3.4.1 Table 5)
PWR_CR      PWR_CR_VOS                                  PWR_CR_VOS*2                                see RM 3.4.1 Table 5

I hope this is correct? (Or do you recommend any further changes?)

Could you specify the settings for 100MHz for the 411? I have no clue how to get this accomplished, as in RM-411-6.3.3 (description of register RCC_CFGR) it is clearly stated for the parameters PPRE2 and PPRE1 that APB2 and APB1 must not run at higher speed than 84MHz/42MHz. I have no idea how I should accomplish this with a 100MHz core frequency. (This also collides with the 411 datasheet spec in the timer table (DS 411 3.20 Table 4), where the max interface clock is specified as 100MHz, but I assume that by interface clock they mean the speed of APB2 and APB1?)

PS: Looking through this, I noticed that the prefetch is not enabled in my FLASH_ACR configuration (I still always use the system_stm32f4xx.c from the Discovery project; it is a pity that this Discovery project is somehow not available for the STM32F411). Can you give me an estimation, how much % a typical C program will run higher in speed, if I enable prefetch by setting the PRFTEN bit? (or is this not really recommended?).
2015-06-16 10:19 AM
I think any references to max 84/42 MHz are due to a failure to completely edit the document. Most of the materials, and peripheral speeds quoted (SPI, USART, TIM) are predicated on 100/50 MHz clocks.
"Several prescalers are used to configure the AHB frequency, the high-speed APB (APB2) and the low-speed APB (APB1) domains. The maximum frequency of the AHB domain is 100 MHz. The maximum allowed frequency of the high-speed APB2 domain is 100 MHz. The maximum allowed frequency of the low-speed APB1 domain is 50 MHz."
http://www.st.com/web/en/resource/technical/document/reference_manual/DM00119316.pdf
Getting to 48 MHz might be problematic. For USB consider 96/48 MHz.
For 100 MHz (VOS and Flash wait states set appropriately):
PLL_M = 8 (8 MHz source to 1 MHz comparison)
PLL_N = 200
PLL_P = 2
APB1 DIV2
APB2 DIV1
or
PLL_N = 400
PLL_P = 4
Comparison frequency between 1-2 MHz, VCO frequency between 192 and 432 MHz.
2015-06-16 10:37 AM
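The PLL arithmetic behind those 100 MHz settings can be sanity-checked on a PC with a sketch like the one below. The struct and function names are hypothetical (not from system_stm32f4xx.c), and the range limits are simply the 1-2 MHz comparison and 192-432 MHz VCO figures quoted above; check them against the reference manual for the actual part.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for illustration only. */
typedef struct {
    uint32_t vco_in_hz;   /* comparison frequency = source / PLL_M */
    uint32_t vco_out_hz;  /* VCO frequency        = vco_in * PLL_N */
    uint32_t sysclk_hz;   /* PLL output           = vco_out / PLL_P */
    bool     in_range;    /* true if both limits from the post are met */
} pll_result_t;

pll_result_t pll_compute(uint32_t src_hz, uint32_t m, uint32_t n, uint32_t p)
{
    pll_result_t r = {0u, 0u, 0u, false};

    if (m == 0u || p == 0u) {
        return r;  /* avoid division by zero on nonsense input */
    }
    r.vco_in_hz  = src_hz / m;
    r.vco_out_hz = r.vco_in_hz * n;
    r.sysclk_hz  = r.vco_out_hz / p;
    r.in_range   = r.vco_in_hz  >= 1000000u   && r.vco_in_hz  <= 2000000u &&
                   r.vco_out_hz >= 192000000u && r.vco_out_hz <= 432000000u;
    return r;
}
```

Both variants from the post, pll_compute(8000000u, 8u, 200u, 2u) and pll_compute(8000000u, 8u, 400u, 4u), can be fed through this to confirm that they land on the same output frequency while keeping the VCO inside the quoted window.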
Can you give me an estimation, how much % a typical C program will run higher in speed, if I enable prefetch by setting the PRFTEN bit? (or is this not really recommended?).
At 100 MHz, code that would take 100 cycles would take 60 cycles with prefetching enabled, as I read the documentation. One could presumably benchmark. The ART is quite effective at hiding the inherent slowness of the flash array.
2015-06-16 10:48 AM
Thanks, very helpful.
Now I have also resolved another mystery which I did not get before: this TIMPRE bit, in fact the complete RCC_DCKCFGR register, is brand new in the STM32F411 as well as in the STM32F42x/43x parts. So in the STM32F411 all timers can run at 100MHz if this bit is enabled, and in the STM32F42x/43x all timers can run at 168MHz if this bit is enabled. It would be really nice if STM could invest in an application note which gives a table of register differences for the different parts of the STM32F4 family (only for the basic processor blocks such as RCC, ...). This would be really very helpful.
2015-06-16 12:44 PM
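For reference, a rough sketch of how the TIMPRE bit changes the timer clock, as I read the RCC_DCKCFGR description in the reference manual: with TIMPRE = 0 the timers run at the bus clock for DIV1 and at twice the bus clock otherwise; with TIMPRE = 1 they run at HCLK for DIV1, DIV2 and DIV4, and at four times the bus clock otherwise. The function name and parameters are invented for illustration, so please verify the rule against the manual for your part.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not an ST API.
 * hclk_hz : AHB clock in Hz
 * apb_div : APB prescaler as a plain divider (1, 2, 4, 8 or 16)
 * timpre  : assumed state of the TIMPRE bit in RCC_DCKCFGR */
uint32_t timer_clock_with_timpre_hz(uint32_t hclk_hz, uint32_t apb_div, bool timpre)
{
    uint32_t pclk_hz = hclk_hz / apb_div;

    if (!timpre) {
        /* Default: DIV1 -> PCLK, otherwise 2 x PCLK */
        return (apb_div == 1u) ? pclk_hz : 2u * pclk_hz;
    }
    /* TIMPRE set: DIV1/2/4 -> HCLK, otherwise 4 x PCLK */
    return (apb_div <= 4u) ? hclk_hz : 4u * pclk_hz;
}
```

With a 100 MHz HCLK and an APB divider of 4, this gives 50 MHz with the bit clear and 100 MHz with the bit set, which is where the bit makes a difference.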
... you just over-estimated the prefetch influence quite a bit.
In my software it brings a speed increase of only 10%, but more than nothing of course (STM32F407 Discovery, 168MHz, 5 wait states). I did not try with the STM32F411 yet.
2015-06-16 01:35 PM
Try disabling the cache, and a long linear run of code.
2015-06-16 11:28 PM
Adapting the code to the ART would not really be realistic :). It is just a complex C software which does some very complex, user programmable LED blinking now for testing purpose ... .
If I compare the cycle time to the slowest case of all switched off, I get the following timings (all with STM32F407, 168MHz, VOS set, 5 WS):
all off: 104usec
DCEN: 100usec (-4%)
PRFTEN: 92usec (-11.5%)
DC+PRFTEN: 89usec (-15%)
ICEN: 85usec (-19%)
ICEN+DCEN: 82usec (-22%)
ICEN+PRFTEN: 77usec (-26%)
ICEN+DCEN+PRFTEN: 73usec (-30%)
So the most important seems to be the instruction cache, then the prefetch, then the data cache (that DC is not too important is understandable, my software does not use large data table handling).
... just info ...
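For anyone repeating this comparison, the relative change against the "all off" baseline can be computed with a small helper like the one below. The function name is made up for illustration. Since the posted cycle times are rounded to whole microseconds, values recomputed from them may differ from the posted percentages by a point or so.

```c
/* Hypothetical helper: percentage reduction of a measured cycle time
 * relative to a baseline cycle time (both in the same unit, e.g. usec).
 * A positive result means the measured case is faster than the baseline. */
double reduction_percent(double baseline_us, double measured_us)
{
    return 100.0 * (baseline_us - measured_us) / baseline_us;
}
```

For example, reduction_percent(104.0, 92.0) evaluates the PRFTEN-only case against the 104 usec baseline.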