2024-11-19 02:46 AM
Hi,
We noticed that the startup time after power-up can be affected by setting one GPIOB pin to input. Startup time here means the time from SystemInit() until main(). To my understanding, this basically covers scatterloading and __rt_entry.
Specifically, the sample code below, built with ARM Compiler 6.16, demonstrates this:
#define MEASURE_PIN 5
int main(void)
{
    // Drive the measurement pin high: end of the measured startup interval
    IO_SET_PIN(GPIOA, MEASURE_PIN, 1);
    // Enable IWDG
    IWDG->KR = 0xCCCC;
    // Enable write access
    IWDG->KR = 0x5555;
    // Set prescaler
    IWDG->PR = 0;
    // Set reload
    IWDG->RLR = 1;
    // Wait for the registers reload
    while (IWDG->SR)
    {
    }
    // Refresh the watchdog
    IWDG->KR = 0xAAAA;
    // Spin until the IWDG resets the device
    while (1);
    return 0;
}
void SystemInit(void)
{
    // Enable GPIOA and GPIOB clocks
    RCC->IOPENR |= RCC_IOPENR_GPIOAEN | RCC_IOPENR_GPIOBEN;
    // Set GPIOA pin5 to output
    IO_SET_PIN_DIR(GPIOA, MEASURE_PIN, 1);
    // Set GPIOA pin5 to low: start of the measured startup interval
    IO_SET_PIN(GPIOA, MEASURE_PIN, 0);
    // Set GPIOB pin7 to input
    // GPIOB->MODER &= 0xffff3fff;
}
GPIOA pin 5 is driven low in SystemInit() and high at the start of main(), so the width of the low pulse measures the time between SystemInit() and main().
main() also sets up the IWDG and ends in a while loop, so that an IWDG reset happens periodically after power-up.
On the oscilloscope it looks like the screenshots below.
From the screenshots, the 1st startup (power-up) takes 254 μs, while the 2nd startup (IWDG reset) takes 132 μs.
Now, if we uncomment the line GPIOB->MODER &= 0xffff3fff; and redo the measurement, we get the results below.
We can see that setting GPIOB pin 7 to input reduces the first startup time from 254 μs to 132 μs, while the subsequent startups still take the same time.
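The mask in that line can be checked on a host machine. This is a minimal sketch, assuming the usual STM32 GPIO layout in which each pin owns a 2-bit field in MODER and 00 selects input mode; moder_input_mask is a hypothetical helper introduced here for illustration, not part of any ST header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: AND-mask that clears the 2-bit MODER field of `pin`,
 * which (assuming 00 = input) switches that pin to input mode. */
static uint32_t moder_input_mask(unsigned pin)
{
    assert(pin < 16u);                      /* one GPIO port has 16 pins */
    return (uint32_t)~(UINT32_C(3) << (pin * 2u));
}
```

For pin 7 the cleared field is bits 15:14, so moder_input_mask(7) evaluates to 0xffff3fff, matching the constant used in SystemInit().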
Could you please confirm the following:
Just in case, this is what Reset_Handler looks like in our startup file. It is basically uncustomized.
; Reset handler routine
Reset_Handler   PROC
                EXPORT  Reset_Handler             [WEAK]
                IMPORT  __main
                IMPORT  SystemInit
                LDR     R0, =SystemInit
                BLX     R0
                LDR     R0, =__main
                BX      R0
                ENDP
Anta
2024-11-22 07:22 AM
@waclawek.jan wrote:
> And initializing the RAM variables (in __rt_entry) shouldn't take up any time as I don't see any global or static variable in your code. I suspect the internal clock is not fully stabilized yet.
0x400 bytes starting at 0x20000020 are set to zero - there are 256 word writes.
If the loop takes 6 cycles to execute, which IMO it can, that's around 130us at 12MHz clock.
JW
That's quite a lot of zero-initialized variables. Or does it always clear the entire block reserved in the linker file?
2024-11-22 09:39 AM
I'm just describing what I see in the disasm. I am not ARM/Keil so can't explain their decisions... ;)
I'm not sure the described phenomenon can be explained without deep access to the 'C0's innards. I was asking for the chip revision because of the "first SRAM access may fail" erratum, although the erratum is not clear about what circumstances lead to the failure and what exactly its consequences are. But regardless of that erratum, one mechanism I can envisage is that after power-on reset a hardware process runs which performs some operation across the whole SRAM array, and thus the first SRAM access is delayed by some time. Say it precalculates the parity, one word per cycle: 6 kBytes at 12 MHz would take around 130 us...
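Both timing estimates in this thread (the zero-init loop and the hypothetical SRAM sweep) are plain cycle arithmetic and can be reproduced as below. This is only a sketch: the 6 cycles per loop iteration, the one word per cycle sweep rate and the 12 MHz clock are the assumptions stated in the posts, not measured values, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Duration in microseconds of `cycles` clock cycles at `clock_hz`, rounded down. */
static uint32_t cycles_to_us(uint32_t cycles, uint32_t clock_hz)
{
    assert(clock_hz != 0u);
    return (uint32_t)(((uint64_t)cycles * 1000000u) / clock_hz);
}

/* Zero-init loop: 0x400 bytes written as words, assumed 6 cycles per word. */
static uint32_t zero_init_us(void)
{
    return cycles_to_us((0x400u / 4u) * 6u, 12000000u);  /* 256 * 6 cycles */
}

/* Hypothetical hardware sweep: 6 KB of SRAM, assumed one word per cycle. */
static uint32_t sram_sweep_us(void)
{
    return cycles_to_us((6u * 1024u) / 4u, 12000000u);   /* 1536 cycles */
}
```

Both come out at 1536 cycles, i.e. 128 us, which is in the same range as the 254 μs - 132 μs = 122 μs difference measured in the first post. The arithmetic alone therefore cannot tell the two explanations apart.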
The difference between "with gpio" and "without gpio" is that in the former the first access to RAM is the push, which happens *before* the observed pin is pulled low, i.e. the lengthy RAM operation happens before the pin is pulled low and thus the pulse has the expected length; whereas in the latter (without gpio) the first access to RAM is the first zero write for the initialization, which is *after* the pin is pulled low and thus prolongs the observed pulse.
This is just a theory and could be proven or disproven by a carefully crafted test (possibly in asm), where one pin edge would be generated before the first access to SRAM, and another edge (either on the same or other pin) after that access. I don't have a 'C0 to experiment.
In any case, the total time between the moment when POR releases the processor and the moment when the observed pin is pulled high at the beginning of main() should be roughly the same in both cases. This could be proven or disproven by powering the circuit from a source which can go from 0V to 3V quickly enough (in a few us).
JW
2024-11-28 04:10 AM
Hi @waclawek.jan, I was trying to follow your conjecture and wanted to run the test afterwards if possible, but I could not verify a few points, such as:
Could you please share more pointers on these 2 items?
Anta
2024-11-28 09:53 AM
> I can find it [PUSH] from the "with gpio" but not the other file.
Exactly: there is no PUSH (which is before the observed pin pulled low) in the "without gpio" firmware; that's why I wrote above:
>> whereas in the latter (without gpio) the first access to RAM is the first zero write for the initialization
---
> "0x400 bytes starting at 0x20000020 are set to zero", where is this number 0x400 coming from?
In __scatterload, 0x80001b4 is loaded into r4, and then LDM r4!,{r0-r2} loads 3 words from the table at 0x80001b4 into r0, r1, r2 and calls the address which is the 4th word of the table at 0x80001b4 (loaded previously into r3). That 4th word is 0x80001a4, which is the address of __scatterload_zeroinit, and that transfers r2/4 zero words (i.e. r2 zero bytes) from r0 into memory starting at r1. As r2 was loaded as the 3rd word from the table at 0x80001b4, that's the 0x400.
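The table walk described above can be modelled on a host machine roughly as follows. This is a simplified sketch, not the actual ARM library code: the struct layout only mirrors the "three words plus a handler" description from the disassembly, and region_entry, model_zeroinit and model_scatterload are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one region-table entry as read from the disassembly. */
typedef struct {
    const void *src;         /* 1st word, loaded into r0 (unused by zero-init here) */
    void       *dst;         /* 2nd word, loaded into r1 */
    size_t      size_bytes;  /* 3rd word, loaded into r2 */
    void      (*handler)(const void *src, void *dst, size_t size_bytes); /* 4th word */
} region_entry;

static unsigned word_writes; /* counts word stores, for the timing discussion */

/* Model of a zero-init handler: writes size_bytes / 4 zero words starting at dst. */
static void model_zeroinit(const void *src, void *dst, size_t size_bytes)
{
    uint32_t *p = dst;
    (void)src;
    assert(size_bytes % 4u == 0u);
    for (size_t i = 0; i < size_bytes / 4u; ++i) {
        p[i] = 0u;
        ++word_writes;
    }
}

/* Model of the table walk: call each entry's handler with its three arguments. */
static void model_scatterload(const region_entry *table, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        table[i].handler(table[i].src, table[i].dst, table[i].size_bytes);
    }
}
```

With a single entry whose size is 0x400 and whose handler is model_zeroinit, the walk performs 256 word writes, which is the count used in the timing estimate earlier in the thread.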
JW
2024-12-04 01:47 AM
@waclawek.jan wrote:
> I can find it [PUSH] from the "with gpio" but not the other file.
Exactly: there is no PUSH (which is before the observed pin pulled low) in the "without gpio" firmware; that's why I wrote above:
>> whereas in the latter (without gpio) the first access to RAM is the first zero write for the initialization
---
Thanks for the explanation. I would expect the 2nd reset to go through the same zero initialization, though; however, the measured time on the 2nd reset is the same for "with gpio" and "without gpio". Am I overlooking something here?
Anta
2024-12-04 02:18 AM
My hypothesis was that after power-on reset (i.e. not watchdog reset), the first access to RAM takes longer.
JW
2024-12-09 06:37 AM
@waclawek.jan wrote:
> This is just a theory and could be proven or disproven by a carefully crafted test (possibly in asm), where one pin edge would be generated before the first access to SRAM, and another edge (either on the same or other pin) after that access. I don't have a 'C0 to experiment.
I have tested and confirmed the behavior described. In my setup, I modified SystemInit() to toggle a GPIO pin and ensured that the first SRAM access occurs between the toggling operations. Then, in main(), I set up the IWDG to repeatedly reset the system, ensuring that several SRAM accesses occur in a row for observation.
The results are as follows:
This confirms that the initial access to SRAM is significantly delayed compared to subsequent accesses.
@waclawek.jan Thank you for pointing out this behavior - it was interesting to verify and observe, although I'm still missing the connection/explanation of why having one extra GPIO direction configuration would lead to earlier SRAM access. Any insights or comments from anyone would be appreciated, especially if you can shed more light on this behavior.
2024-12-09 11:14 PM
One more thing worth mentioning is that the erratum says
As the probability of occurrence is extremely low, the failure is rare and difficult to reproduce
while the issue we are seeing happens every time.
2024-12-10 02:12 AM
> I'm still missing the connection/explanation of why having one extra GPIO direction configuration would lead to earlier SRAM access
Because with the extra GPIO direction configuration the compiler felt register pressure and decided to push registers in the prologue of the SystemInit() function.
JW
2024-12-10 03:22 AM
@waclawek.jan wrote:
> I'm still missing the connection/explanation of why having one extra GPIO direction configuration would lead to earlier SRAM access
Because with the extra GPIO direction configuration the compiler felt register pressure and decided to push registers in the prologue of the SystemInit() function.
JW
Push on the stack? Does SystemInit even need stack? Would marking the function as _Noreturn or __attribute__((noreturn)) help?
I think we need to ask the engineers who designed the chip. They should know what conditions trigger this behavior and perhaps how to avoid it.