2020-04-03 06:59 AM
Hi there,
I've been dealing with this problem for a while, and can't seem to find the root of it. I've already dug through the forums, datasheets, application notes, etc. with no success.
At this point, I'd be very grateful for any pointers or suggestions you may have.
Setup:
Goals:
Current status:
Questions:
I'm attaching my current linker config file, MPU initialization code and ethernetif.c.
All that code is very much work in progress; so please, forgive poor naming, dead code, and similar shortcomings.
Should you need anything else, or if you'd like me to simplify any of that code, please let me know.
Thanks for your time,
JC
2020-04-03 01:23 PM
I'm still investigating register values under different scenarios.
I'm running some tests to analyze the MPU registers, so I'll update this post when I have more workable data.
For now, here are some other register values at the fault handler:
If you'd like to know the values of any particular registers, just let me know.
2020-04-03 06:38 PM
> Why doesn't the SDK's ARM_CM4_MPU FreeRTOS port support non-cacheable MPU regions? Or am I missing something?
STM32H743 does not have M4 cores. You're running on M7.
--pa
2020-04-03 07:12 PM
Check the MPU regions at https://community.st.com/s/question/0D50X0000C6eNNSSQ2/bug-fixes-stm32h7-ethernet.
You hadn't said you would, but avoid making the DMA descriptor regions cacheable as it'd cost more cycles than you'd gain.
2020-04-06 07:51 AM
M4 and M7 are compatible at this level. That's why the SDK only provides an M7-specific port to work around a silicon erratum in the r0p1 core revision.
From the M7 port's ReadMe.txt:
The first option is to use the ARM Cortex-M4F port, and the second option is to
use the Cortex-M7 r0p1 port - the latter containing a minor errata workaround.
If the revision of the ARM Cortex-M7 core is not r0p1 then either option can be
used, but it is recommended to use the FreeRTOS ARM Cortex-M4F port located in
the /FreeRTOS/Source/portable/IAR/ARM_CM4F directory.
If the revision of the ARM Cortex-M7 core is r0p1 then use the FreeRTOS ARM
Cortex-M7 r0p1 port located in the /FreeRTOS/Source/portable/IAR/ARM_CM7_MPU/r0p1
directory.
As far as I can tell, my chip does not have an r0p1 core, so I'm using the recommended M4 port, just like the SDK's FreeRTOS examples.
2020-04-07 10:03 AM
Thanks for the feedback, alister.
I've been going through your code, and can't identify any fatal differences. Here are the main ones I could find:
Did you ever see the "crash on 1st non-privileged instruction fetch" problem?
I find it odd that the fault doesn't occur during MPU configuration, or after LwIP/ethernet are put to work, but whenever I switch to user mode.
To me, it feels like my MPU_Config() is reconfiguring the MPU regions allocated to FreeRTOS tasks, which is why I'm studying the MPU registers now. Does that make sense to you? You seem to have it running, and I can't see you working around any incompatibilities. Did I miss anything?
2020-04-07 05:17 PM
>You have the linker script align the memory, while I set the aligned addresses myself
You can check their start addresses and sizes in your linker map file.
>You use the SRAM3 and AXI regions
My region sizes weren't final. I'd selected non-optimal domains after weighing the other peripherals/functions I'm supporting.
>You don't have an MPU region for LwIP's heap. I'm not sure how you ensure the Tx buffer's cache coherency
I'd modified the STM32H7 lwIP's ethernetif.c. These macros I'd added there control cache coherency. I'd described these and my rationale in a post to that thread on "February 17, 2020 at 5:55 PM".
#define ETH_RX_BUFFERS_ARE_CACHED 0 /* To conserve memory, this app positions its Rx Buffers
* (".RxArraySection" section) in a not-cacheable MPU region. */
#define ETH_TX_BUFFERS_ARE_CACHED 1 /* Tx buffers are in normal, cached memory. */
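As a rough illustration of what a setting like ETH_TX_BUFFERS_ARE_CACHED 1 implies: before a cached Tx buffer is handed to the Ethernet DMA, the D-cache lines covering it have to be cleaned, and on the Cortex-M7 that maintenance works on whole 32-byte lines. The sketch below only computes the line-aligned range on a host. The names cache_range_t and cache_align_range are hypothetical, and none of this is taken from the modified ethernetif.c.

```c
#include <stdint.h>

#define DCACHE_LINE_SIZE 32u  /* Cortex-M7 D-cache line size in bytes */

typedef struct {
    uint32_t start;  /* address of the first cache line touched */
    uint32_t size;   /* byte count, always a multiple of DCACHE_LINE_SIZE */
} cache_range_t;

/* Expand [addr, addr + len) outwards to whole cache lines. */
cache_range_t cache_align_range(uint32_t addr, uint32_t len)
{
    cache_range_t r;
    uint32_t end = addr + len;

    /* Round the start down and the end up to a line boundary. */
    r.start = addr & ~(DCACHE_LINE_SIZE - 1u);
    end = (end + DCACHE_LINE_SIZE - 1u) & ~(DCACHE_LINE_SIZE - 1u);
    r.size = end - r.start;
    return r;
}
```

On target, the result would feed a CMSIS call such as SCB_CleanDCache_by_Addr((uint32_t *)r.start, (int32_t)r.size) before the descriptor is given to the DMA. With Rx buffers in a not-cacheable region, no such step is needed on the receive side.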
>Did you ever see the "crash on 1st non-privileged instruction fetch" problem?
I'm not using privileged mode and I'm not an expert.
It's important to understand examples are _only_ that. I sometimes read examples but never use them.
Google widely.
You shouldn't need to globally disable D-cache.
You'll need to determine which tasks should execute as privileged. A privileged task can access any region. Perhaps the task that initializes lwIP (and the ETH driver via ethernetif.c), the ethernetif_input or equivalent (for ETH driver rx) and lwIP's tcpip_thread (for ETH driver tx) should be privileged.
2020-04-16 10:57 AM
Getting back to register-level debugging... Can anybody make sense of these register values?
Memory Protection Unit Value Access
MPU_CTRL 0x00000000 ReadOnly
PRIVDEFENA 0 ReadOnly
HFNMIENA 0 ReadOnly
ENABLE 0 ReadOnly
MPU_RNR 0x00000007 ReadWrite
MPU_RBAR 0x00000007 ReadWrite
ADDR 0x0000000 ReadWrite
VALID 0 ReadWrite
REGION 0x7 ReadWrite
MPU_RASR 0x00000000 ReadWrite
XN 0 ReadWrite
AP 0x0 ReadWrite
TEX 0x0 ReadWrite
S 0 ReadWrite
C 0 ReadWrite
B 0 ReadWrite
SRD 0x00 ReadWrite
SIZE 0x00 ReadWrite
ENABLE 0 ReadWrite
MPU_RBAR_A1 0x00000007 ReadWrite
MPU_RASR_A1 0x00000000 ReadWrite
MPU_RBAR_A2 0x00000007 ReadWrite
MPU_RASR_A2 0x00000000 ReadWrite
MPU_RBAR_A3 0x00000007 ReadWrite
MPU_RASR_A3 0x00000000 ReadWrite
I pulled them off the Nucleo-H743ZI running the (unmodified) FreeRTOS MPU example from the SDK.
The example is apparently working fine, with a Hard Fault occurring when an unprivileged task accesses protected regions, as expected.
None of that makes any sense to me, since not even the MPU_CTRL->ENABLE bit is set.
Any insight into why I'm seeing these values, and how the example app could possibly work like this?
For reference, these are all the RBAR:RASR extracted via printf:
[MPU->RNR: 0x00000000][MPU->RBAR: 0x00000000][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000001][MPU->RBAR: 0x00000001][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000002][MPU->RBAR: 0x00000002][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000003][MPU->RBAR: 0x00000003][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000004][MPU->RBAR: 0x20000004][MPU->RASR: 0x0307001f]
[MPU->RNR: 0x00000005][MPU->RBAR: 0x20000005][MPU->RASR: 0x01070011]
[MPU->RNR: 0x00000006][MPU->RBAR: 0x00000006][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000007][MPU->RBAR: 0x00000007][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000008][MPU->RBAR: 0x00000008][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000009][MPU->RBAR: 0x00000009][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000a][MPU->RBAR: 0x0000000a][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000b][MPU->RBAR: 0x0000000b][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000c][MPU->RBAR: 0x0000000c][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000d][MPU->RBAR: 0x0000000d][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000e][MPU->RBAR: 0x0000000e][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000f][MPU->RBAR: 0x0000000f][MPU->RASR: 0x00000000]
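To make dumps like the one above easier to read, here is a minimal host-side sketch that splits an RBAR:RASR pair into its fields, assuming the ARMv7-M register layout (RBAR: ADDR in bits 31:5, REGION in bits 3:0; RASR: XN bit 28, AP bits 26:24, TEX bits 21:19, S/C/B bits 18/17/16, SRD bits 15:8, SIZE bits 5:1, ENABLE bit 0). The names mpu_region_t, mpu_decode and mpu_region_bytes are hypothetical helpers, not part of the HAL or FreeRTOS.

```c
#include <stdint.h>

/* Decoded fields of one MPU_RBAR / MPU_RASR pair (ARMv7-M layout). */
typedef struct {
    uint32_t base;     /* RBAR[31:5], kept as a byte address */
    unsigned region;   /* RBAR[3:0] */
    unsigned xn;       /* RASR[28] */
    unsigned ap;       /* RASR[26:24] */
    unsigned tex;      /* RASR[21:19] */
    unsigned s, c, b;  /* RASR[18], RASR[17], RASR[16] */
    unsigned srd;      /* RASR[15:8] */
    unsigned size;     /* RASR[5:1]; region spans 2^(size+1) bytes */
    unsigned enable;   /* RASR[0] */
} mpu_region_t;

mpu_region_t mpu_decode(uint32_t rbar, uint32_t rasr)
{
    mpu_region_t r;
    r.base   = rbar & 0xFFFFFFE0u;
    r.region = rbar & 0xFu;
    r.xn     = (rasr >> 28) & 0x1u;
    r.ap     = (rasr >> 24) & 0x7u;
    r.tex    = (rasr >> 19) & 0x7u;
    r.s      = (rasr >> 18) & 0x1u;
    r.c      = (rasr >> 17) & 0x1u;
    r.b      = (rasr >> 16) & 0x1u;
    r.srd    = (rasr >> 8) & 0xFFu;
    r.size   = (rasr >> 1) & 0x1Fu;
    r.enable = rasr & 0x1u;
    return r;
}

/* Size of the region in bytes, or 0 if the region is disabled. */
uint64_t mpu_region_bytes(const mpu_region_t *r)
{
    return r->enable ? (1ULL << (r->size + 1u)) : 0u;
}
```

Applied to the pairs printed above, region 4 (RBAR 0x20000004, RASR 0x0307001f) decodes to an enabled 64 KB region at 0x20000000 with AP=3 (full access), and region 5 (RASR 0x01070011) to 512 bytes at the same base with AP=1 (privileged access only).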
For comparison, these are the register values from the SDK's LwIP example:
Memory Protection Unit Value Access
MPU_CTRL 0x00000005 ReadOnly
PRIVDEFENA 1 ReadOnly
HFNMIENA 0 ReadOnly
ENABLE 1 ReadOnly
MPU_RNR 0x00000001 ReadWrite
REGION 0x01 ReadWrite
MPU_RBAR 0x30044001 ReadWrite
ADDR 0x1802200 ReadWrite
VALID 0 ReadWrite
REGION 0x1 ReadWrite
MPU_RASR 0x030C001B ReadWrite
XN 0 ReadWrite
AP 0x3 ReadWrite
TEX 0x1 ReadWrite
S 1 ReadWrite
C 0 ReadWrite
B 0 ReadWrite
SRD 0x00 ReadWrite
SIZE 0x0D ReadWrite
ENABLE 1 ReadWrite
MPU_RBAR_A1 0x30044001 ReadWrite
MPU_RASR_A1 0x030C001B ReadWrite
MPU_RBAR_A2 0x30044001 ReadWrite
MPU_RASR_A2 0x030C001B ReadWrite
MPU_RBAR_A3 0x30044001 ReadWrite
MPU_RASR_A3 0x030C001B ReadWrite
------------------ All RBAR:RASR pairs: -------------------
[MPU->RNR: 0x00000000][MPU->RBAR: 0x30040000][MPU->RASR: 0x0301000f]
[MPU->RNR: 0x00000001][MPU->RBAR: 0x30044001][MPU->RASR: 0x030c001b]
[MPU->RNR: 0x00000002][MPU->RBAR: 0x00000002][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000003][MPU->RBAR: 0x00000003][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000004][MPU->RBAR: 0x00000004][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000005][MPU->RBAR: 0x00000005][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000006][MPU->RBAR: 0x00000006][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000007][MPU->RBAR: 0x00000007][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000008][MPU->RBAR: 0x00000008][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000009][MPU->RBAR: 0x00000009][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000a][MPU->RBAR: 0x0000000a][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000b][MPU->RBAR: 0x0000000b][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000c][MPU->RBAR: 0x0000000c][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000d][MPU->RBAR: 0x0000000d][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000e][MPU->RBAR: 0x0000000e][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000f][MPU->RBAR: 0x0000000f][MPU->RASR: 0x00000000]
My current codebase has a seemingly correct MPU_CTRL, and my HAL-allocated regions are there, but none of the FreeRTOS-MPU regions are. They seem to have been overwritten by using the HAL (???)
[MPU->RNR: 0x00000000][MPU->RBAR: 0x00000000][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000001][MPU->RBAR: 0x00000001][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000002][MPU->RBAR: 0x00000002][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000003][MPU->RBAR: 0x00000003][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000004][MPU->RBAR: 0x00000004][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000005][MPU->RBAR: 0x00000005][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000006][MPU->RBAR: 0x00000006][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000007][MPU->RBAR: 0x00000007][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000008][MPU->RBAR: 0x00000008][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000009][MPU->RBAR: 0x00000009][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000a][MPU->RBAR: 0x0000000a][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000b][MPU->RBAR: 0x0000000b][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000c][MPU->RBAR: 0x0000000c][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000d][MPU->RBAR: 0x3004710d][MPU->RASR: 0x0301000f]
[MPU->RNR: 0x0000000e][MPU->RBAR: 0x3004000e][MPU->RASR: 0x0300001b]
[MPU->RNR: 0x0000000f][MPU->RBAR: 0x3004400f][MPU->RASR: 0x03000019]
Any ideas as to why the FreeRTOS-MPU regions were seemingly wiped?
2020-04-20 09:59 AM
Update: I've found the first problem, but the app is still not working.
This bug is present in the SDK's FreeRTOS ports. It affects at least ARM_CM4_MPU/port.c and ARM_CM7_MPU/r0p1/port.c, and probably many others.
Those ports use the contents of the MPU->TYPE register to decide whether an MPU is present at runtime. Specifically:
#define portEXPECTED_MPU_TYPE_VALUE ( 8UL << 8UL ) /* 8 regions, unified. */
Since the M7 core supports up to 16 memory regions, the device returns 0x00001000 (16 << 8) instead of the expected 0x00000800 (8 << 8), and FreeRTOS silently initializes without properly configuring or initializing the MPU.
I've fixed that by expecting (16UL<<8UL), and now the FreeRTOS-MPU example actually uses the MPU:
Code:
#if defined(CORE_CM4)
#define portEXPECTED_MPU_TYPE_VALUE ( 8UL << 8UL ) /* 8 regions, unified. */
#else
/* The if statement in this fix is in line with stm32h7xx_hal_cortex.h */
#define portEXPECTED_MPU_TYPE_VALUE ( 16UL << 8UL ) /* 16 regions, unified. */
#endif /* defined(CORE_CM4) */
Results:
[MPU->RNR: 0x00000000][MPU->RBAR: 0x08000000][MPU->RASR: 0x06070029]
[MPU->RNR: 0x00000001][MPU->RBAR: 0x08000001][MPU->RASR: 0x0507001b]
[MPU->RNR: 0x00000002][MPU->RBAR: 0x20000002][MPU->RASR: 0x01070011]
[MPU->RNR: 0x00000003][MPU->RBAR: 0x40000003][MPU->RASR: 0x13000039]
[MPU->RNR: 0x00000004][MPU->RBAR: 0x20000004][MPU->RASR: 0x0307001f]
[MPU->RNR: 0x00000005][MPU->RBAR: 0x20000005][MPU->RASR: 0x01070011]
[MPU->RNR: 0x00000006][MPU->RBAR: 0x00000006][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000007][MPU->RBAR: 0x00000007][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000008][MPU->RBAR: 0x00000008][MPU->RASR: 0x00000000]
[MPU->RNR: 0x00000009][MPU->RBAR: 0x00000009][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000a][MPU->RBAR: 0x0000000a][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000b][MPU->RBAR: 0x0000000b][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000c][MPU->RBAR: 0x0000000c][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000d][MPU->RBAR: 0x0000000d][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000e][MPU->RBAR: 0x0000000e][MPU->RASR: 0x00000000]
[MPU->RNR: 0x0000000f][MPU->RBAR: 0x0000000f][MPU->RASR: 0x00000000]
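For reference, the arithmetic behind that MPU->TYPE check can be verified on a host. The sketch below assumes the ARMv7-M layout, where the DREGION field (number of data regions) sits in bits 15:8 of MPU_TYPE. The helper names mpu_type_dregion and mpu_is_present are hypothetical, and this is not how the FreeRTOS port is written; it only shows that comparing against a single hard-coded value rejects a 16-region core.

```c
#include <stdint.h>

/* DREGION: number of data regions, MPU_TYPE[15:8]. */
unsigned mpu_type_dregion(uint32_t mpu_type)
{
    return (unsigned)((mpu_type >> 8) & 0xFFu);
}

/* Hypothetical presence check: any non-zero region count means an MPU exists. */
int mpu_is_present(uint32_t mpu_type)
{
    return mpu_type_dregion(mpu_type) != 0u;
}
```

With these helpers, 0x00000800 reports 8 regions and 0x00001000 reports 16, so both would count as "MPU present", whereas a comparison against ( 8UL << 8UL ) only matches the first.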
But when I test that fix in my custom codebase, the FreeRTOS RBAR:RASR pairs are still wiped, and only my HAL-allocated regions remain.
As a result, my application still crashes.