Francois Tremblay
Associate II
February 26, 2018
Question

STM32H7 and heap location

Posted on February 26, 2018 at 19:38

I am porting our application from an STM32F779 to an STM32H753 device. The application appears to run fine when the heap is located in DTCM_RAM (the whole 128 KB is reserved for heap allocation).

If the heap is moved to the end of AXI-SRAM (the last 128 KB) or to SRAM1, the application produces wrong computation results.

In all cases, the data and bss sections are at the beginning of AXI-SRAM and the stack is in SRAM2.

Is there something special about DTCM_RAM that can explain this behavior?

-François


    Tesla DeLorean
    Guru
    February 26, 2018
    Posted on February 26, 2018 at 20:06

    >>Is there something special about DTCM_RAM that can explain this behavior?

    It is NOT cached, as it is single cycle anyway.

    You can DMA into it without causing coherency issues, ie DMA buffers for ETHERNET/SDMMC

    If you use other areas of RAM you need to use the MPU to set the shareability, cacheability and write-through properties.

    Or use functions to invalidate DCache on areas you have compromised.
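A minimal sketch of the bookkeeping that usually goes with the cache-invalidate approach mentioned above: cache maintenance works on whole lines (32 bytes on the Cortex-M7), so a buffer range is normally widened to line boundaries first. The helper name `cache_align_range` and its struct are hypothetical, and the CMSIS call appears only in a comment because it needs the target headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cortex-M7 data cache line size in bytes. */
#define DCACHE_LINE_SIZE 32u

/* Hypothetical helper type: a buffer range widened to whole cache lines. */
typedef struct {
    uintptr_t start;  /* address rounded down to a line boundary */
    size_t    size;   /* byte count rounded up to whole lines */
} cache_range_t;

/* Widen [addr, addr + len) so it starts and ends on cache-line boundaries. */
cache_range_t cache_align_range(uintptr_t addr, size_t len)
{
    const uintptr_t mask = ~(uintptr_t)(DCACHE_LINE_SIZE - 1u);
    cache_range_t r;
    uintptr_t end;

    assert(len > 0);
    r.start = addr & mask;
    end = (addr + len + DCACHE_LINE_SIZE - 1u) & mask;
    r.size = (size_t)(end - r.start);
    return r;
}

/* On target, the result would typically be passed to the CMSIS routine:
 *   SCB_InvalidateDCache_by_Addr((void *)r.start, (int32_t)r.size);
 */
```

Note that widening a range this way also invalidates whatever else shares those cache lines, which is one reason DMA buffers are usually aligned and padded to the line size.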

    Tips, Buy me a coffee, or three.. PayPal Venmo (See Profile) Up vote any posts that you find helpful, it shows what's working..
    Francois Tremblay
    Associate II
    February 27, 2018
    Posted on February 27, 2018 at 19:29

    Clive,

    Thanks for taking the time to answer my question.

    Here are a couple of other details:

    • The malloc-ed memory (the memory coming from the heap) is not used for any DMA buffers.
    • Both CPU caches (instruction and data) are currently disabled.
    • The MPU is left in its default reset state (i.e. disabled).
    • Here are the scenarios that work and the ones that don't:
      • Scenario 1 (Working)
        • data+bss: AXI-SRAM (368k)
        • stack: SRAM1
        • heap: DTCM-RAM
      • Scenario 2 (Not working)
        • data+bss: AXI-SRAM (368k)
        • stack: SRAM1
        • heap: SRAM2
      • Scenario 3 (Not working)
        • data+bss: AXI-SRAM (368k)
        • stack: SRAM1
        • heap: AXI-SRAM (last 128k)
    • I don't understand why scenario #3 is not working, because 'data+bss' is already in AXI-SRAM, so no MPU configuration should be 'missing'.
    • For a similar reason, the heap in SRAM2 should work, as our stack is already in SRAM1 (which is in the same clock domain).
    Tesla DeLorean
    Guru
    February 27, 2018
    Posted on February 27, 2018 at 19:49

    I don't know, try turning the question on its head, what are you doing with the allocated memory?

    Do you have some kind of resource leak, or an out-of-bounds access? Is anything returning a NULL that you aren't catching?

    I tend to instrument malloc/free in situations where I use dynamic memory prolifically; it is a good way to see who's creating orphans or releasing the same memory twice. In a pinch I'll build code to walk the allocator's linked list. Stack depth might also be something to watch/monitor, as the stack can dip into statics or locale.

    Dynamic allocation in embedded tends to be highly problematic due to long up-time and the potential to leak or fragment.
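As an illustration of the malloc/free instrumentation described above, here is a small sketch that wraps the standard allocator and keeps a table of live pointers, so that NULL returns, double frees and leaks can be counted. All `dbg_*` names and the table size are hypothetical; this is one possible shape, not the poster's actual tooling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TRACK_SLOTS 64  /* hypothetical limit on tracked live allocations */

static void  *live_ptr[TRACK_SLOTS];  /* pointers currently allocated */
static size_t live_count;             /* number of live allocations */
static size_t null_count;             /* allocations that returned NULL */
static size_t bad_free_count;         /* frees of unknown or already freed pointers */

/* malloc wrapper: records the returned pointer in the tracking table. */
void *dbg_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL) {
        for (size_t i = 0; i < TRACK_SLOTS; i++) {
            if (live_ptr[i] == NULL) {
                live_ptr[i] = p;
                live_count++;
                return p;
            }
        }
        free(p);  /* table full: report it as a failed allocation */
    }
    null_count++;
    return NULL;
}

/* free wrapper: only pointers found in the table are passed to free(). */
void dbg_free(void *p)
{
    if (p == NULL) return;
    for (size_t i = 0; i < TRACK_SLOTS; i++) {
        if (live_ptr[i] == p) {
            assert(live_count > 0);
            live_ptr[i] = NULL;
            live_count--;
            free(p);
            return;
        }
    }
    bad_free_count++;  /* double free or foreign pointer */
}

size_t dbg_live_count(void)     { return live_count; }
size_t dbg_null_count(void)     { return null_count; }
size_t dbg_bad_free_count(void) { return bad_free_count; }
```

A non-zero `dbg_live_count()` at a point where everything should have been released indicates a leak, and `dbg_bad_free_count()` catches memory released twice.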

    Francois Tremblay
    Associate II
    February 28, 2018
    Posted on February 28, 2018 at 18:21

    The wrong computation occurs in an FPU-intensive loop. The problem seems to disappear when I add a printf in the middle of the loop (when the heap is in SRAM2).

    Placing __DMB() and __DSB() at the same location as the printf doesn't help.

    Tesla DeLorean
    Guru
    February 28, 2018
    Posted on February 28, 2018 at 18:26

    printf/scanf are heavy stack users. If you are using an RTOS, where is it allocating its thread stacks from, and how large are they?

    Review system registers/context at the printf()
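One common way to check the stack usage raised here is stack painting: fill the unused part of the stack with a known pattern early on, then later count how much of the pattern is still intact. The sketch below works on any word array so it can be tried off-target; the function names and the fill value are assumptions, and on a real target the painted range must exclude the part of the stack already in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xDEADBEEFu  /* arbitrary paint pattern */

/* Paint an unused stack region with a known pattern (done once, early). */
void stack_paint(uint32_t *lowest, size_t words)
{
    assert(lowest != NULL);
    for (size_t i = 0; i < words; i++) {
        lowest[i] = STACK_FILL;
    }
}

/* Count the words still untouched, scanning up from the lowest address.
 * Cortex-M stacks grow downward, so this is the remaining headroom. */
size_t stack_headroom_words(const uint32_t *lowest, size_t words)
{
    size_t n = 0;

    assert(lowest != NULL);
    while (n < words && lowest[n] == STACK_FILL) {
        n++;
    }
    return n;
}
```

Comparing the headroom before and after a printf() call gives a rough idea of how deep the call went. A headroom of zero means the stack reached the bottom of its region and may have overwritten whatever sits below it.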

    Ibrahim Abdalkader
    Associate II
    March 1, 2018
    Posted on March 01, 2018 at 22:14

    Are you enabling the clocks? Maybe this will help.

    __HAL_RCC_D2SRAM1_CLK_ENABLE();
    __HAL_RCC_D2SRAM2_CLK_ENABLE();
    __HAL_RCC_D2SRAM3_CLK_ENABLE();
    Francois Tremblay
    Associate II
    March 2, 2018
    Posted on March 02, 2018 at 14:21

    Ibrahim, that was already done. By the way, ST, this initialization should be done within STM32Cube H7, or at least be documented there.

    Also, I added 

    *((__IO uint32_t*)0x51003108) = 0x00000001;

    This is the same kind of workaround for SRAM1/2/3 as the workaround for AXI-SRAM for the silicon bug (see 2.2.15 in the errata sheet). It sets the switch matrix read issuing capability to 1 for SRAM1/2/3 (Target 2).
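For readers wondering where 0x51003108 comes from, the sketch below reconstructs it under an assumed register layout: an AXI interconnect base of 0x51000000 with one FN_MOD register per target at offset 0x1108 + 0x1000 × target. That layout is an assumption to be checked against the reference manual; it reproduces the Target 2 address used above, and gives 0x51008108 for Target 7, which is the address used by the AXI-SRAM workaround. The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define AXI_BASE            0x51000000u  /* AXI interconnect register base (assumed) */
#define AXI_TARG_FN_MOD_OFS 0x1108u      /* FN_MOD offset inside a target block (assumed) */
#define AXI_TARG_STRIDE     0x1000u      /* spacing between target blocks (assumed) */
#define READ_ISS_OVERRIDE   0x00000001u  /* value written in the post above */

/* Address of the FN_MOD register for AXI target 1..7 under the assumed layout. */
uint32_t axi_targ_fn_mod_addr(unsigned target)
{
    assert(target >= 1u && target <= 7u);
    return AXI_BASE + AXI_TARG_FN_MOD_OFS + AXI_TARG_STRIDE * (uint32_t)target;
}

/* On target the write would look like:
 *   *(volatile uint32_t *)axi_targ_fn_mod_addr(2) = READ_ISS_OVERRIDE;
 */
```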

    I did more extensive testing of the memory organization, and here are my results:

    stack      heap       test result
    SRAM1      AXI-SRAM   OK
    DTCM-RAM   AXI-SRAM   Fail
    SRAM4      AXI-SRAM   OK
    SRAM4      DTCM-RAM   OK
    SRAM1      DTCM-RAM   OK
    AXI-SRAM   DTCM-RAM   OK
    DTCM-RAM   SRAM2      Fail
    SRAM1      SRAM2      Fail
    AXI-SRAM   SRAM2      Fail
    SRAM4      SRAM2      OK

    Considering:

    • failures occur on the first trial of each failing scenario
    • the passing scenarios keep succeeding over many iterations of our FPU-intensive loop (where both heap and stack are used); some scenarios were tested for a few hours
    • errata sheet 2.2.15

    my opinion is that there is another silicon bug in the STM32H7 similar to 2.2.15, and/or the workaround does not fix all cases.

    I do not have code to share as it is sensitive stuff.

    Ibrahim Abdalkader
    Associate II
    March 13, 2018
    Posted on March 13, 2018 at 14:41

    I was having a similar issue: when using SRAM1/2/3 for data, heap and stack, my application wouldn't run. The fix was to enable the clocks really early in the startup code:

    Reset_Handler:
      /* Enable SRAM clocks */
      ldr r0, =0x580244dc
      ldr r3, [r0]
      orr r3, r3, #3758096384 /* 0xE0000000 */
      str r3, [r0]
      ldr sp, =_estack /* set stack pointer */
      ......
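For comparison, here is a C sketch of the same read-modify-write. It assumes that 0x580244DC is the RCC AHB2 clock enable register and that bits 29 to 31 are the enables for SRAM1, SRAM2 and SRAM3, which is what the 0xE0000000 mask in the startup code sets. The macro names are mine, not the vendor header's, and the register is passed in as a pointer so the logic can be exercised off-target.

```c
#include <assert.h>
#include <stdint.h>

#define RCC_AHB2ENR_ADDR 0x580244DCu  /* address used in the startup code above */

/* Assumed bit positions of the D2 SRAM clock enables. */
#define D2_SRAM1EN  (1u << 29)
#define D2_SRAM2EN  (1u << 30)
#define D2_SRAM3EN  (1u << 31)
#define D2_SRAM_ALL (D2_SRAM1EN | D2_SRAM2EN | D2_SRAM3EN)

static_assert(D2_SRAM_ALL == 0xE0000000u, "mask must match the startup code");
static_assert(D2_SRAM_ALL == 3758096384u, "decimal form of the same mask");

/* Set the three enable bits without disturbing the others. */
void enable_d2_sram_clocks(volatile uint32_t *ahb2enr)
{
    *ahb2enr |= D2_SRAM_ALL;
    (void)*ahb2enr;  /* read back so the write has landed before the RAM is touched */
}

/* On target: enable_d2_sram_clocks((volatile uint32_t *)RCC_AHB2ENR_ADDR); */
```

In C this can only run once a stack is usable, so if the stack itself lives in SRAM1/2/3 the assembly version at the top of Reset_Handler is the safer place for it.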
    Tesla DeLorean
    Guru
    March 2, 2018
    Posted on March 02, 2018 at 15:03

    >>I do not have code to share as it is sensitive stuff.

    I can imagine it is quite complex, but sharing something is what it takes to advance the diagnosis and get a response.

    You'd need to move beyond 'not working' and pin down more precisely the point and mode of failure, and then share something that replicates it. This will be especially true if you think there is a bug in the silicon. This can be simplified and sanitized code that illustrates the issue quickly and cleanly.

    So think of ways you can add sanity checking, and recognizing the failure as soon as possible.
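As one example of the kind of simplified, shareable check suggested here, the sketch below writes an index-derived pattern across a memory region and reads it back, reporting the first word that does not verify. The function name and pattern constant are hypothetical, and a plain write/read-back pass would not necessarily trigger the failure described in this thread; it is only a starting point for a sanitized reproducer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value expected at word index i; mixes the index so neighbours differ. */
static uint32_t pattern_at(size_t i, uint32_t seed)
{
    return seed ^ ((uint32_t)i * 0x9E3779B1u);
}

/* Fill the region, then verify it.
 * Returns the index of the first mismatching word, or -1 if all words match. */
long ram_pattern_test(volatile uint32_t *base, size_t words, uint32_t seed)
{
    assert(base != NULL);
    for (size_t i = 0; i < words; i++) {
        base[i] = pattern_at(i, seed);
    }
    for (size_t i = 0; i < words; i++) {
        if (base[i] != pattern_at(i, seed)) {
            return (long)i;
        }
    }
    return -1;
}
```

On target, `base` would point at the start of the region under suspicion (for example the heap area in SRAM2), and the test could be run both before and after the FPU loop to narrow down where the corruption appears.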

    I'd need to re-check the docs on SRAM2, there is one memory that appears in two address ranges.

    You should perhaps engage an engineer from the local sales office to look at this and work with you.

    Torsten Jaekel
    Associate III
    May 16, 2018
    Posted on May 16, 2018 at 20:26

    My experience (impression): D2 SRAM is not enabled at reset and startup on the H7 MCU. As mentioned in this thread, it has to be enabled at run time, via __HAL_RCC_D2SRAM1_CLK_ENABLE(); etc.

    The consequence is that this SRAM cannot be used for any initialized data (no way to use it for .data or .bss). The startup code cannot load any data into it; even a .bss zero fill done in startup will fail. This memory only becomes accessible after the clocks are enabled, and the SRAM2 content will be random (uninitialized). So you cannot define a region/section in the linker script that gets loaded before main() is called. You can use it for buffers but not for malloc: malloc would need some data structures initialized during startup, after reset is released. So I use SRAM2 only for uninitialized buffer regions at run time, without expecting any pre-initialized data in it.
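One way to live with this constraint, sketched below, is a tiny allocator whose state is set up explicitly at run time, after the clocks are on, instead of relying on startup-time initialization. This is a simple bump allocator with hypothetical names (`region_pool_t`, `pool_init`, `pool_alloc`), not a replacement for the C library's malloc. It assumes the pool descriptor lives in ordinary, already initialized RAM and that the region's base address is 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t *base;  /* start of the late-enabled region */
    size_t   size;  /* region size in bytes */
    size_t   used;  /* bytes handed out so far */
} region_pool_t;

/* Call at run time, once the RAM clock has been enabled. */
void pool_init(region_pool_t *pool, void *region, size_t size)
{
    assert(pool != NULL && region != NULL);
    memset(region, 0, size);  /* contents are undefined until written */
    pool->base = (uint8_t *)region;
    pool->size = size;
    pool->used = 0;
}

/* Bump allocation in 8-byte steps; there is no free().
 * Returns NULL when the region is exhausted. */
void *pool_alloc(region_pool_t *pool, size_t n)
{
    size_t need = (n + 7u) & ~(size_t)7u;
    void *p;

    if (need < n || need > pool->size - pool->used) {
        return NULL;
    }
    p = pool->base + pool->used;
    pool->used += need;
    return p;
}
```

On target, `region` would be the start address of the SRAM block taken from the linker script or the memory map, and `pool_init` would be called right after the clock enable macros.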