2024-10-16 12:52 AM
Heyho,
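I've been using the H733 (custom board) / H735 (eval kit) with Infineon's HyperRAM S70KL1281 / S70KL1282 at 100 MHz for some time now. Everything works great, except for one very annoying thing: the memory-mapped speed test below sometimes gives fast and sometimes slow results, without any change to the OCTOSPI setup or the test function.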
I'm pretty sure the timing measurement itself is not at fault: it uses the cycle counter, and interrupts are disabled around the for loops.
Here's the test function, first writing to HyperRAM, then reading:
/* +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ */
/* OCTOSPI HyperRAM test
*/
#define HYPER_TEST_UART 1
uint32_t OspiHypRamTest(uint8_t u8CountDown)
{
    uint32_t i = 0;
    uint32_t u32Val = 0xFFFFFFFF;
    uint32_t u32MaxLen = (uint32_t)((uint32_t)OSPI_HYPERRAM_END_ADDR / 4);
    uint32_t u32Errors = 0;
    uint32_t u32Data = 0;
    uint32_t u32CycStart = 0;
    uint32_t u32Cycles = 0;
    float flClockMHz = (float)HAL_RCC_GetSysClockFreq() / 1E6;
    float flVal = 0.0f;
    uint32_t *pu32MemAddr = NULL;

    if( OCTOSPI1 == pOspiHypRam ) pu32MemAddr = (uint32_t *)OCTOSPI1_BASE;
    else if( OCTOSPI2 == pOspiHypRam ) pu32MemAddr = (uint32_t *)OCTOSPI2_BASE;

#if HYPER_TEST_UART
    uart_printf("\n\r+++++++++++++++++++++++++++++++++++++++++++++++++\n\r");
    uart_printf("OCTOSPI HyperRAM test, memory mapped, IRQs OFF\n\rcounting ");
    if( 0 == u8CountDown ) uart_printf("UP, start with 0\n\r\n\r");
    else uart_printf("DOWN, start with %08lX\n\r\n\r", u32Val);
    uart_printf("writing bytes: %lu\n\r", (uint32_t)OSPI_HYPERRAM_END_ADDR);
#endif

    __DSB();
    __disable_irq();

    /* write complete HyperRAM */
    /* UP - should be faster */
    if( 0 == u8CountDown )
    {
        u32CycStart = DWT->CYCCNT;
        for( i = 0; i < u32MaxLen; i++ )
        {
            pu32MemAddr[i] = i;
        }
        __DMB();
        __DSB();
        u32Cycles = DWT->CYCCNT;
    }
    /* DOWN */
    else
    {
        u32Val = 0xFFFFFFFF;
        u32CycStart = DWT->CYCCNT;
        for( i = 0; i < u32MaxLen; i++ )
        {
            pu32MemAddr[i] = u32Val;
            u32Val--;
        }
        __DMB();
        __DSB();
        u32Cycles = DWT->CYCCNT;
    }

    __enable_irq();
    __DSB();

    u32Cycles -= u32CycStart;
    flVal = (float)u32Cycles / flClockMHz;
    flOspiRamSpeedMBpsMmWr = (float)OSPI_HYPERRAM_END_ADDR / flVal;
    flOspiRamSpeedMBpsMmWr *= (float)MEGA_CORRECTION;

#if HYPER_TEST_UART
    uart_printf("%lu CPU cycles = %.1f ms\n\r", u32Cycles, (flVal / 1000.0f));
    uart_printf("\n\r-> %.2f MB/s (%.0f Mbit/s) WRITE\n\r\n\r", flOspiRamSpeedMBpsMmWr, (8.0f * flOspiRamSpeedMBpsMmWr));
    uart_printf("reading & comparing bytes: %lu\n\r", (uint32_t)OSPI_HYPERRAM_END_ADDR);
#endif

    __DSB();
    if( OCTOSPI1 == pOspiHypRam ) pu32MemAddr = (uint32_t *)OCTOSPI1_BASE;
    else if( OCTOSPI2 == pOspiHypRam ) pu32MemAddr = (uint32_t *)OCTOSPI2_BASE;
    __disable_irq();
    __DSB();

    /* read complete HyperRAM and compare */
    /* UP - should be faster */
    if( 0 == u8CountDown )
    {
        u32CycStart = DWT->CYCCNT;
        for( i = 0; i < u32MaxLen; i++ )
        {
            u32Data = pu32MemAddr[i];
            if( u32Data != i ) u32Errors++;
        }
        __DMB();
        __DSB();
        u32Cycles = DWT->CYCCNT;
    }
    /* DOWN */
    else
    {
        u32Val = 0xFFFFFFFF;
        u32CycStart = DWT->CYCCNT;
        for( i = 0; i < u32MaxLen; i++ )
        {
            u32Data = pu32MemAddr[i];
            if( u32Data != (u32Val - i) ) u32Errors++;
        }
        __DMB();
        __DSB();
        u32Cycles = DWT->CYCCNT;
    }

    __enable_irq();

    u32Cycles -= u32CycStart;
    flVal = (float)u32Cycles / flClockMHz;
    flOspiRamSpeedMBpsMmRd = (float)OSPI_HYPERRAM_END_ADDR / flVal;
    flOspiRamSpeedMBpsMmRd *= (float)MEGA_CORRECTION;

#if HYPER_TEST_UART
    uart_printf("%lu CPU cycles = %.1f ms\n\r", u32Cycles, (flVal / 1000.0f));
    uart_printf("\n\r-> %.2f MB/s (%.0f Mbit/s) READ\n\r", flOspiRamSpeedMBpsMmRd, (8.0f * flOspiRamSpeedMBpsMmRd));
    if( 0 == u32Errors ) uart_printf("\n\rNULL errors\n\r");
    else uart_printf("\n\r# ERR: u32Errors = %lu\n\r", u32Errors);
    uart_printf("-------------------------------------------------\n\r");
#endif

    return u32Errors;
}
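For reference, the timing arithmetic in the test boils down to two steps: an unsigned difference of two cycle counter samples, and a division of bytes by microseconds. Below is a minimal, host-runnable sketch of just that arithmetic. The names cycles_elapsed() and cycles_to_mbps() are made up for illustration, and since MEGA_CORRECTION is not shown above, the sketch simply assumes 1 MB = 10^6 bytes.

```c
#include <stdint.h>

/* Elapsed cycles between two samples of a free-running 32-bit counter.
 * Unsigned subtraction stays correct across a single counter wrap. */
static uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Throughput in MB/s, assuming 1 MB = 10^6 bytes (an assumption,
 * not necessarily what MEGA_CORRECTION does in the test above). */
static float cycles_to_mbps(uint32_t bytes, uint32_t cycles, float cpu_mhz)
{
    if (cycles == 0u || cpu_mhz <= 0.0f) {
        return 0.0f;                          /* avoid division by zero */
    }
    float usec = (float)cycles / cpu_mhz;     /* cycles / MHz = microseconds */
    return (float)bytes / usec;               /* bytes / us = MB/s */
}
```

A 32-bit cycle counter clocked at 400 MHz wraps roughly every 10.7 s, so the subtraction is only trustworthy as long as a single measured loop stays well below that.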
Anybody any ideas?
Thanks in advance!
2024-10-16 01:10 AM
2024-10-16 02:02 AM
I'm using
- H735 EVK or H733 custom board
- STM32CubeIDE Version: 1.10.1
- optimization FAST
- CPU clock 400 MHz
- OSPI 100 MHz
- HyperRAM setup via direct register access (doesn't make a difference to HAL setup)
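To put measured numbers into perspective with the clocks listed above, here is a rough sketch of the theoretical ceiling. HyperBus transfers 8 bits on both clock edges, so the raw peak is two bytes per bus clock. The helper names are hypothetical, and the estimate deliberately ignores command/address overhead, initial latency, refresh and chip-select limits, so real throughput will be lower.

```c
#include <stdint.h>

/* Raw HyperBus peak in MB/s: 8-bit DDR moves two bytes per bus clock. */
static uint32_t hyperbus_peak_mbps(uint32_t bus_mhz)
{
    return bus_mhz * 2u;
}

/* CPU cycles one 32-bit word occupies on the bus at that raw peak.
 * 4 bytes / (2 bytes per bus clock) = 2 bus clocks per word. */
static uint32_t cpu_cycles_per_word_at_peak(uint32_t cpu_mhz, uint32_t bus_mhz)
{
    if (bus_mhz == 0u) {
        return 0u;                     /* avoid division by zero */
    }
    return (2u * cpu_mhz) / bus_mhz;
}
```

With a 400 MHz CPU and a 100 MHz bus this gives a 200 MB/s ceiling and 8 CPU cycles per word, which is a handy yardstick when dividing the measured cycle count by the number of words.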
2024-10-16 02:34 AM
I just got the "fast" version again. Maybe there are some bus issues in the background, depending on the UART use:
UART 3 is used for debugging, in TX DMA mode.
The output function uart_printf() fills the TX DMA buffer; it only waits at the beginning for previous transfers to finish, by checking TC and other stuff with a function UartDbgDmaTxWait().
When I put UartDbgDmaTxWait() after each uart_printf() around OspiHypRamTest() I get the high speed - for now at least...
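The idea of draining the debug output before starting a measurement can be sketched as a bounded poll on a completion flag. This is not the actual UartDbgDmaTxWait() (its body is not shown here); wait_flag_set() is a hypothetical helper, and which register and bit mask to pass in depends on the real UART/DMA setup.

```c
#include <stdbool.h>
#include <stdint.h>

/* Poll a status word until all bits in 'mask' are set, or give up after
 * 'max_polls' reads. Returns true if the flag was seen, false on timeout.
 * On target, 'status' would point at the relevant status register. */
static bool wait_flag_set(const volatile uint32_t *status,
                          uint32_t mask, uint32_t max_polls)
{
    while (max_polls-- > 0u) {
        if ((*status & mask) == mask) {
            return true;
        }
    }
    return false;
}
```

The poll limit keeps the test from hanging forever if the flag never comes; the caller can then decide whether to run the measurement anyway or report the timeout.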
The question remains: before I did that, why did I sometimes get fast and sometimes slow results, without changing anything concerning the OCTOSPI peripheral or the test function?
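One way to pin down such run-to-run variation is to repeat the measurement and keep the minimum and maximum cycle counts. The following is only a sketch: cyc_stats_t and its helpers are invented names, and the cycle counts would come from repeated runs of the write or read loop.

```c
#include <stdint.h>

/* Running min/max over repeated cycle measurements. */
typedef struct {
    uint32_t min;
    uint32_t max;
    uint32_t runs;
} cyc_stats_t;

static void cyc_stats_init(cyc_stats_t *s)
{
    s->min = UINT32_MAX;
    s->max = 0u;
    s->runs = 0u;
}

static void cyc_stats_add(cyc_stats_t *s, uint32_t cycles)
{
    if (cycles < s->min) s->min = cycles;
    if (cycles > s->max) s->max = cycles;
    s->runs++;
}

/* Slowest run as a percentage of the fastest run (100 = no spread). */
static uint32_t cyc_stats_spread_pct(const cyc_stats_t *s)
{
    if (s->runs == 0u || s->min == 0u) {
        return 0u;
    }
    return (uint32_t)(((uint64_t)s->max * 100u) / s->min);
}
```

If the spread only shows up while debug output is still in flight, that would point towards background bus traffic rather than the OCTOSPI configuration.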
2024-10-16 02:39 AM
I also compared the assembler in the list files between the slow and the fast version:
the important loops (reading / writing HyperRAM and comparing, while the interrupts are disabled) basically look the same.