STM32H745 Inter-core data transfer

Madhan · ‎2020-04-28

I have periodic tasks (with the period set by timer interrupts) running on both cores, and I want to exchange data between the cores.

Here is a simple example:

M7: does some calculations and generates a variable at 20 kHz

M4: has to use the above variable for some other calculations running at 10 kHz

I don't want either core to have a "blocking wait" for the data generated by the other core.

So how do I achieve the above and ensure that the M4 doesn't read "half-written" data? It is all right if the code on the M4 does not use the most up-to-date instance of the variable generated by the M7.

Should we use the L1 cache or some other special RAM to achieve this?

berendi · ‎2020-04-28

If there is only a single, naturally aligned 32-bit variable, you can be assured that all of its bits are written at once, so there is nothing else to do.

If there is more data, arrange it in a struct, and put the structs in a circular buffer. Something like this:

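#define RINGSIZE 4  /* example value only; pick a depth that suits the two task rates */

struct m7_to_m4 {
  int a;
  double b;
  char c[100];
};

struct m7_to_m4 ring_m7_to_m4[RINGSIZE];   /* slots the M7 fills in turn */
struct m7_to_m4 *volatile last_m7_to_m4;   /* points at the most recent complete slot */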

On the M7 (source) side, update last_m7_to_m4 when there is a consistent set of data in the buffer. On the M4 side, copy last_m7_to_m4 to a private variable once before each calculation.
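
A minimal host-side sketch of the scheme described above, not tested on the H745. The names RINGSIZE, next_slot, m7_publish() and m4_snapshot() are made up for illustration. It assumes the ring is deep enough that the M7 never wraps around onto a slot while the M4 is still copying it, and that the shared memory is either non-cacheable or cleaned on the M7 before the pointer is published.

```c
#include <stdio.h>
#include <string.h>

#define RINGSIZE 4  /* assumed depth, not from the original post */

struct m7_to_m4 {
  int a;
  double b;
  char c[100];
};

static struct m7_to_m4 ring_m7_to_m4[RINGSIZE];
static struct m7_to_m4 *volatile last_m7_to_m4; /* NULL until the first publish */
static unsigned next_slot;                      /* touched by the producer only */

/* Producer (M7 side): fill the next free slot, then publish it
 * with a single aligned pointer store. */
void m7_publish(int a, double b, const char *c)
{
  struct m7_to_m4 *slot = &ring_m7_to_m4[next_slot];

  slot->a = a;
  slot->b = b;
  strncpy(slot->c, c, sizeof slot->c - 1);
  slot->c[sizeof slot->c - 1] = '\0';

  /* On the real target, a memory barrier and any cache clean
   * would have to happen here, before the pointer is updated. */
  last_m7_to_m4 = slot;
  next_slot = (next_slot + 1u) % RINGSIZE;
}

/* Consumer (M4 side): read the pointer once, then copy the struct
 * into a private variable. Returns 0 if nothing was published yet. */
int m4_snapshot(struct m7_to_m4 *out)
{
  struct m7_to_m4 *p = last_m7_to_m4;

  if (p == NULL)
    return 0;
  *out = *p;
  return 1;
}
```

Neither function waits for the other side: the M7 always finds a slot to write, and the M4 simply gets the latest slot that was complete when it read the pointer.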

Manage the L1 cache properly. Either disable caching of the shared buffer (including the pointer used for communication), or clean (M7-to-M4) or invalidate (M4-to-M7) the buffer area in the data cache. Watch out for buggy implementations in libraries and elsewhere.

Madhan · ‎2020-04-28

Hey Berendi, thank you so much for the response. Could you please give me a clearer picture of what you described?

To keep things simple and precise: two variables, a and b, hold random numbers generated by code running on the M7 and should be made available to code running on the M4. The M4 uses a and b, stores their sum in the variable add (add = a + b) and their product in mult (mult = a * b). Then add and mult should be sent back to the code running on the M7 for further operations.

(The whole process will run in a loop, and the data must be protected from being overwritten or read while half-written.)

It would be great if you could share any work you have done on a similar problem!

Tesla DeLorean · ‎2020-04-28

Perhaps your boss could just hire him to perform the work and get paid for it?

The issue here is buffering and cache coherency on the CM7 side. Perhaps read a chapter or two in the TRM on the topic, and look at how to program the MPU and how to use SCB_InvalidateDCache_by_Addr() and SCB_CleanDCache_by_Addr().

Want some examples? grep the CubeH7 source trees
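
As a rough illustration of the by-address cache calls mentioned above: they work on whole 32-byte cache lines on the Cortex-M7, so the range passed in is normally rounded out to line boundaries. The helper below is hypothetical (struct cache_span and cache_span_for() are not part of CMSIS or the HAL) and only does the address arithmetic, so it can be checked on a PC. The CMSIS calls appear only in the comment.

```c
#include <stddef.h>
#include <stdint.h>

#define DCACHE_LINE 32u  /* Cortex-M7 L1 data cache line size in bytes */

/* Hypothetical helper type: an address range rounded out to whole cache lines. */
struct cache_span {
  uintptr_t start;  /* rounded down to a line boundary */
  size_t    size;   /* multiple of DCACHE_LINE covering the whole object */
};

struct cache_span cache_span_for(uintptr_t addr, size_t len)
{
  const uintptr_t mask = ~(uintptr_t)(DCACHE_LINE - 1u);
  struct cache_span s;
  uintptr_t end = (addr + len + DCACHE_LINE - 1u) & mask;  /* round end up */

  s.start = addr & mask;                                   /* round start down */
  s.size  = (size_t)(end - s.start);
  return s;
}

/* On the CM7, with the CMSIS core header included, the span would be
 * used roughly like this ("shared" is a placeholder for the shared object):
 *
 *   struct cache_span s = cache_span_for((uintptr_t)&shared, sizeof shared);
 *
 *   // after the M7 has written, before the M4 reads:
 *   SCB_CleanDCache_by_Addr((uint32_t *)s.start, (int32_t)s.size);
 *
 *   // before the M7 reads what the M4 has written:
 *   SCB_InvalidateDCache_by_Addr((uint32_t *)s.start, (int32_t)s.size);
 */
```

Because invalidating a line throws away everything in it, it is safer to align the shared objects to 32 bytes and pad them to a multiple of 32 bytes, so that no unrelated variable shares a cache line with them.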

Madhan · ‎2020-04-29

I tried implementing it as follows.

Functions and structures in CM7 main.c:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct{            // master structure: written by CM7, read by CM4
    float a;
    float b;
    uint8_t m7dataReady;   // 1 = a and b are valid and not yet consumed by CM4
}m7data_t;

// volatile: the other core changes this memory behind the compiler's back
volatile m7data_t *const m7dataStruct = (volatile m7data_t*)0x30000000; // D2 AHB SRAM1 start address

typedef struct{            // shadow structure: written by CM4, read by CM7
    float add;
    float mult;
    uint8_t m4dataReady;   // 1 = add and mult are valid and not yet consumed by CM7
}m4data_t;

volatile m4data_t *const m4dataStruct = (volatile m4data_t*)0x24000000; // D1 AXI SRAM start address

void m7write(void)         // function called on request by application
{
    if(m7dataStruct->m7dataReady != 1)
    {
        m7dataStruct->a = (float) (rand() % 50); // random integer from 0 to 49
        m7dataStruct->b = (float) (rand() % 10); // random integer from 0 to 9
        m7dataStruct->m7dataReady = 1;           // set the flag last, after a and b
    }
}

void m7read(void)       // function called periodically (synchronized with timer interrupt)
{
    if(m4dataStruct->m4dataReady == 1)
    {
        printf("CM4 valid data received\n");
        m4dataStruct->m4dataReady = 0;
    }
}

Functions and structures in CM4 main.c:

#include <stdint.h>
#include <stdio.h>

typedef struct{               // master structure: written by CM4, read by CM7
    float add;
    float mult;
    uint8_t m4dataReady;      // 1 = add and mult are valid and not yet consumed by CM7
}m4data_t;

// volatile: the other core changes this memory behind the compiler's back
volatile m4data_t *const m4dataStruct = (volatile m4data_t*)0x24000000; // D1 AXI SRAM start address

typedef struct{                 // shadow structure: written by CM7, read by CM4
    float a;
    float b;
    uint8_t m7dataReady;        // 1 = a and b are valid and not yet consumed by CM4
}m7data_t;

volatile m7data_t *const m7dataStruct = (volatile m7data_t*)0x30000000; // D2 AHB SRAM1 start address

void m4write(void)              // function called on request by application
{
    if(m4dataStruct->m4dataReady != 1)
    {
        // note: a and b are read here without checking m7dataReady
        m4dataStruct->add = m7dataStruct->a + m7dataStruct->b;
        m4dataStruct->mult = m7dataStruct->a * m7dataStruct->b;
        m4dataStruct->m4dataReady = 1;   // set the flag last, after add and mult
    }
}

void m4read(void)    // function called periodically (synchronized with timer interrupt)
{
    if(m7dataStruct->m7dataReady == 1)
    {
        printf("CM7 valid data received\n");
        m7dataStruct->m7dataReady = 0;
    }
}

Instead of the data-ready flags, HSEM will be used for signalling the cores back and forth!

Please correct me if I have made any mistakes or if anything in the above method should be changed. It would be a really great help, as I am still at the stage of getting to understand the dual-core architecture better. 😊
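
For comparison, here is a minimal sketch of a single-slot, non-blocking flag handshake written with C11 atomics. It is not the code from the post above and it is untested on the H745. The names mailbox_t, mailbox_put() and mailbox_get() are made up for illustration. It assumes the mailbox sits in RAM that both cores can access and that is non-cacheable on the M7 (or has its cache lines maintained by hand). The difference it is meant to show is that the consumer checks the flag before touching the payload, and the flag is written with release ordering after the payload.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
  float a;
  float b;
  atomic_uchar ready;  /* 0 = slot free for the producer, 1 = data valid for the consumer */
} mailbox_t;

/* Producer side. Never waits: returns false if the previous
 * data has not been consumed yet, so the caller can just try again
 * on its next period. */
bool mailbox_put(mailbox_t *mb, float a, float b)
{
  if (atomic_load_explicit(&mb->ready, memory_order_acquire) != 0)
    return false;

  mb->a = a;
  mb->b = b;
  /* Release: the payload stores above are ordered before the flag store. */
  atomic_store_explicit(&mb->ready, 1, memory_order_release);
  return true;
}

/* Consumer side. Never waits: returns false if there is nothing new. */
bool mailbox_get(mailbox_t *mb, float *a, float *b)
{
  if (atomic_load_explicit(&mb->ready, memory_order_acquire) != 1)
    return false;

  *a = mb->a;
  *b = mb->b;
  /* Hand the slot back to the producer only after the copy is done. */
  atomic_store_explicit(&mb->ready, 0, memory_order_release);
  return true;
}
```

On the target, one mailbox per direction could be placed at fixed addresses such as the 0x30000000 and 0x24000000 used above, with the result type (add, mult) mirroring this one. If HSEM replaces the flag later, the same ordering concern applies: the payload has to be complete before the other core is notified.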