
Bizarre hard fault when I put a struct variable in external SDRAM. (Urgent)

Chris Rice
Associate III

We are writing firmware for the STM32F7 that takes pictures and stores them on an SD card. We have implemented a custom database format that keeps a kind of file allocation table in RAM, edits it as necessary, and then copies the modified portions to the SD card (so each member is sized to our SD card block size, 512 bytes).

Our STM32F7 is connected to a large bank of external SDRAM, which we configure to be addressable starting at 0xC0000000. We use it extensively for processing image data and have had no problems with it.

However, when we tried to increase our capacity the CPU RAM wasn't large enough, so I am trying to move the database table variables to specific memory locations in SDRAM. I calculated addresses in SDRAM that were not used otherwise, and placed the database table variables as shown in Figure 1.

However, I started getting hard faults in strange places when I read or write these variables. I have excerpted examples of the failing reads and writes in Figure 2. (Note the deeply nested database structure. It is detailed in Figure 3 if you're interested, but all of this works fine in CPU RAM and was fully tested there.)

I ran some test code that just writes and reads back a few values at various points in the database, and it works fine. But when I run my production code, I get hard faults on reads and writes like those in Figure 2.

I'm at my wits' end here, and this is fairly urgent; I guess I waited too long to try to increase my database size. I built it to be very scalable and didn't anticipate hard faults when I moved it to external SDRAM.

Sorry for the long-winded post. I'm supplying a lot of info because I have no clue what the problem could be. Thanks for any help!

** Figure 1 ** Definitions of my database structure and indices, optionally stored in external SDRAM (which is configured to be addressable starting at 0xC0000000) or in CPU RAM, depending on STORE_DB_IN_SDRAM.

#define STORE_DB_IN_SDRAM (1)
 
#if STORE_DB_IN_SDRAM
 
#define ADDRESS_DATABASEINFO     0xC1BE2AC0
#define ADDRESS_SEQUENCENUMINDEX 0xC1CB0AC0
#define ADDRESS_ITEMNUMINDEX     0xC1CCDF80
#define ADDRESS_CALCIMAGEINFO    0xC1CEB440
 
static sDatabaseInfo * ptr_DatabaseInfo = (sDatabaseInfo *) ADDRESS_DATABASEINFO;
static uint16_t * ptr_SlotIndexForSeqNum = (uint16_t *) ADDRESS_SEQUENCENUMINDEX;
static uint16_t * ptr_SlotIndexForItemNum = (uint16_t *) ADDRESS_ITEMNUMINDEX;
static sCalculatedImageInfo * ptr_CalculatedImageInfo = (sCalculatedImageInfo *) ADDRESS_CALCIMAGEINFO;
#else
static sDatabaseInfo m_DatabaseInfo;
static uint16_t     m_SlotIndexForSeqNum[MAX_NUMBER_OF_SAVED_IMAGES];
static uint16_t     m_SlotIndexForItemNum[MAX_NUMBER_OF_SAVED_IMAGES];
static sCalculatedImageInfo ptr_CalculatedImageInfo[MAX_NUMBER_OF_SAVED_IMAGES];
#endif

** Figure 2 ** The types of lines that are failing:

uint32_t tmp_a = ptr_DatabaseInfo->MasterBlock[mb_index_pending].slotidx_FirstEmptySlotInList;
    
uint32_t tmp_b = ptr_DatabaseInfo->ImageInfoBlockPairs[0].InfoBlocks[0].TableRow[0].slotindex_Next;   // THIS LINE HARD FAULTS!!
        
ptr_DatabaseInfo->MasterBlock[mb_index_pending].slotidx_LastImageInList = tmp_a;
ptr_DatabaseInfo->MasterBlock[mb_index_pending].slotidx_FirstEmptySlotInList = tmp_b;

uint16_t block_pair_index = 0;
uint16_t table_row_index = 0;
    
for (uint16_t image_index=0; image_index<MAX_NUMBER_OF_SAVED_IMAGES; image_index++)
{
     ptr_DatabaseInfo->ImageInfoBlockPairs[block_pair_index].InfoBlocks[0].TableRow[table_row_index].Flags = 0;
     ptr_DatabaseInfo->ImageInfoBlockPairs[block_pair_index].InfoBlocks[1].TableRow[table_row_index].Flags = 0;
     ptr_DatabaseInfo->ImageInfoBlockPairs[block_pair_index].InfoBlocks[0].TableRow[table_row_index].slotindex_Next = image_index+1;  // THIS LINE HARD FAULTS ON FIRST PASS!!

** Figure 3 ** The applicable database declarations... just for reference sake.

///////////////////////////////////////////////////////////////////////
// declaration
///////////////////////////////////////////////////////////////////////
 
 
#define SDCARD_BLOCKSIZE (512)
#define MAX_NUMBER_OF_SAVED_IMAGES (1000)
#define NUMBER_OF_INFO_ROWS_PER_INFO_BLOCK (73)          // 512 bytes in a block, a table row is 7 bytes
#define NUMBER_OF_INFO_BLOCK_PAIRS (14)
#define NUMBER_OF_UINT32_MASKS (1)  
#define TABLEROW_SIZE_BYTES (7)
 
typedef __packed struct
{
    uint32_t NotFirstPowerUpMarker;
    uint8_t  UnusedBytes[SDCARD_BLOCKSIZE - 4];
} sBlock_NotFirstPowerUpMarker;
 
typedef __packed struct
{
    uint8_t MasterBlockIsA_not_B;
    uint8_t UnusedBytes[SDCARD_BLOCKSIZE - 1];
} sBlock_MasterBlockSwitch;
 
typedef __packed struct
{
    uint16_t ImageCount;
    uint16_t slotidx_AbsoluteStartOfList;
    uint16_t slotidx_AbsoluteEndOfList;
    uint16_t slotidx_FirstImageInList;
    uint16_t slotidx_LastImageInList;
    uint16_t slotidx_FirstEmptySlotInList;   
    uint32_t mask_PairSwitches[NUMBER_OF_UINT32_MASKS];    
    uint16_t MyCRC;
    uint8_t  UnusedBytes[SDCARD_BLOCKSIZE - (14 + NUMBER_OF_UINT32_MASKS*4)];	
} sBlock_MasterBlock;
 
typedef __packed struct
{
    uint8_t Flags; 
    uint16_t slotindex_Next;
    uint16_t slotindex_Prev;
    uint16_t slotindex_Link;
} sTableRowSavedData;                                                 // 7 bytes; declared before its first use below
 
typedef __packed struct
{
    sTableRowSavedData TableRow[NUMBER_OF_INFO_ROWS_PER_INFO_BLOCK];  // 73 * 7 bytes = 511
    uint8_t UnusedBytes[SDCARD_BLOCKSIZE - TABLEROW_SIZE_BYTES * NUMBER_OF_INFO_ROWS_PER_INFO_BLOCK];
} sBlock_ImageInfoBlock;
 
typedef __packed struct
{
    sBlock_NotFirstPowerUpMarker NotFirstPowerUpMarker;
    sBlock_MasterBlockSwitch MasterBlockSwitch;
    sBlock_MasterBlock MasterBlock[2];
    sBlock_ImageInfoBlockPair ImageInfoBlockPairs[NUMBER_OF_INFO_BLOCK_PAIRS];
} sDatabaseInfo;
 
 

6 REPLIES

> __packed

Maybe this?

Was my first thought, too.

Chris, did you check the alignment of those variables in the map file?

I also suggest switching to instruction-level (assembly) single stepping when debugging, to reveal the faulting instruction and the addresses it uses.

AVI-crak
Senior

Placement of data in SDRAM should be left to the GCC linker; that is its job.
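A minimal sketch of what linker-driven placement could look like with GCC. The section name `.sdram_data`, the `IN_SDRAM` macro, and the accessor are hypothetical; the sketch assumes the linker script defines a memory region starting at 0xC0000000 and maps that section into it, which is not shown in the thread.

```c
#include <stdint.h>

/* Hypothetical section name: the linker script must map ".sdram_data"
 * into the SDRAM region at 0xC0000000 (assumption, not from the thread). */
#if defined(__GNUC__) && defined(__ELF__)
#define IN_SDRAM __attribute__((section(".sdram_data")))
#else
#define IN_SDRAM /* no-op where ELF section attributes are unavailable */
#endif

#define MAX_NUMBER_OF_SAVED_IMAGES (1000)

/* The linker, not a hand-computed constant, picks the final address and
 * reports overlaps or overflow of the region at link time. */
static uint16_t m_SlotIndexForSeqNum[MAX_NUMBER_OF_SAVED_IMAGES] IN_SDRAM;

/* Accessor so that callers never handle a raw address. */
uint16_t *slot_index_for_seq_num(void)
{
    return m_SlotIndexForSeqNum;
}
```

With this approach the map file shows where each table ended up, so the hand-maintained `ADDRESS_...` constants are no longer needed. Placement alone does not change how unaligned members inside a packed struct are accessed.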

Chris Rice
Associate III

Wow... yeah, that's a perfect fit for my problem, thank you. Assuming the required alignment is 32 bits, I'm definitely not aligned. The smallest item, the table row, is 7 bytes wide, and I have checked that the rows are contiguous, 7 bytes apart.

Ozone, the link you sent says their "recommended" solution is to enable the MPU for this region, and it provides some code to do this. It looks pretty straightforward, but it doesn't mention any downside. Is there one? I imagine there has to be, or the MPU would be enabled for this region by default. I could also, of course, rework my structure to be aligned...
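The 7-byte spacing can be turned into concrete addresses. The sketch below is a host-side calculation, not target code: it uses GCC's `__attribute__((packed))` in place of `__packed`, collapses the four leading 512-byte blocks into padding, and assumes `sBlock_ImageInfoBlockPair` holds two info blocks (that type is not shown in the thread; the layout is inferred from the `InfoBlocks[0]`/`InfoBlocks[1]` accesses in Figure 2).

```c
#include <stddef.h>
#include <stdint.h>

#define SDCARD_BLOCKSIZE (512)
#define NUMBER_OF_INFO_ROWS_PER_INFO_BLOCK (73)
#define NUMBER_OF_INFO_BLOCK_PAIRS (14)
#define TABLEROW_SIZE_BYTES (7)
#define ADDRESS_DATABASEINFO 0xC1BE2AC0u
#define PACKED __attribute__((packed)) /* GCC/Clang stand-in for __packed */

typedef struct PACKED {
    uint8_t  Flags;
    uint16_t slotindex_Next;   /* offset 1 inside the row */
    uint16_t slotindex_Prev;
    uint16_t slotindex_Link;
} sTableRowSavedData;

typedef struct PACKED {
    sTableRowSavedData TableRow[NUMBER_OF_INFO_ROWS_PER_INFO_BLOCK];
    uint8_t UnusedBytes[SDCARD_BLOCKSIZE
                        - TABLEROW_SIZE_BYTES * NUMBER_OF_INFO_ROWS_PER_INFO_BLOCK];
} sBlock_ImageInfoBlock;

/* ASSUMPTION: definition inferred, not taken from the thread. */
typedef struct PACKED {
    sBlock_ImageInfoBlock InfoBlocks[2];
} sBlock_ImageInfoBlockPair;

/* The marker, switch and two master blocks are collapsed into padding. */
typedef struct PACKED {
    uint8_t LeadingBlocks[4 * SDCARD_BLOCKSIZE];
    sBlock_ImageInfoBlockPair ImageInfoBlockPairs[NUMBER_OF_INFO_BLOCK_PAIRS];
} sDatabaseInfo;

/* Address of ImageInfoBlockPairs[0].InfoBlocks[0].TableRow[row].slotindex_Next. */
static uint32_t address_of_next(unsigned row)
{
    return ADDRESS_DATABASEINFO
         + (uint32_t)offsetof(sDatabaseInfo, ImageInfoBlockPairs)
         + (uint32_t)row * (uint32_t)sizeof(sTableRowSavedData)
         + (uint32_t)offsetof(sTableRowSavedData, slotindex_Next);
}
```

Under these assumptions `address_of_next(0)` is 0xC1BE32C1, an odd address for a 16-bit member, while the `MasterBlock` members sit at even offsets. That would be consistent with the first line of Figure 2 working and the second one faulting. As far as I know, the default ARMv7-M memory map treats 0xC0000000 as Device memory, where unaligned accesses fault regardless of the instruction used.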
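For reference, an MPU region of this kind boils down to one attribute word. The sketch below only computes the `MPU_RASR` value for a Normal, non-cacheable, read/write, no-execute region; it does not touch hardware. The helper names are hypothetical, and the 32 MB region size is an assumption based on the addresses in Figure 1, not a figure from the thread.

```c
#include <stdint.h>

/* Bit fields of MPU_RASR on ARMv7-M (Cortex-M7). */
#define RASR_ENABLE   (1u << 0)
#define RASR_SIZE(n)  ((uint32_t)(n) << 1)   /* region size = 2^(n+1) bytes */
#define RASR_B        (1u << 16)             /* bufferable */
#define RASR_C        (1u << 17)             /* cacheable  */
#define RASR_S        (1u << 18)             /* shareable  */
#define RASR_TEX(t)   ((uint32_t)(t) << 19)
#define RASR_AP(a)    ((uint32_t)(a) << 24)  /* 3 = full access */
#define RASR_XN       (1u << 28)             /* execute never */

/* SIZE field for a region of 2^log2_bytes bytes (minimum 32 bytes). */
static uint32_t rasr_size_field(unsigned log2_bytes)
{
    return RASR_SIZE(log2_bytes - 1u);
}

/* TEX=1, C=0, B=0 selects Normal, non-cacheable memory, which permits
 * unaligned accesses by the instructions that support them. */
static uint32_t sdram_rasr_normal_noncacheable(unsigned log2_bytes)
{
    return RASR_XN | RASR_AP(3u) | RASR_TEX(1u)
         | rasr_size_field(log2_bytes) | RASR_ENABLE;
}
```

For a 32 MB region (`log2_bytes` = 25) this gives 0x13080031. On the downside question, the trade-offs I'm aware of are that Normal memory lets the core reorder and merge accesses, and that turning on caching for the region raises coherency questions with DMA. Instructions such as LDRD/STRD still require alignment even in Normal memory.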

Thanks guys... you've brought me down from "panicked" to just "nervous". 🙂

Chris Rice
Associate III

I'm realigning my structure to 32-bit boundaries. To enforce this, so that future maintainers don't make a similar mistake if they modify the database (e.g., by adding members to it), I'm thinking of adding the code below to startup. Basically, if those tests aren't true, weird things will happen, and I'd like to trap this with a known error instead of random-seeming hard faults.

Does this make sense? Good practice? Overkill? Could it lead to spurious errors?

                if (sizeof(sTableRowSavedData)           != TABLEROW_SIZE_BYTES)  { RETURN_MOUNTFAILED(220, 0) }   // a row is not a full block
                if (sizeof(sBlock_MasterBlock)           != SDCARD_BLOCKSIZE)     { RETURN_MOUNTFAILED(220, 0) }
                if (sizeof(sBlock_ImageInfoBlock)        != SDCARD_BLOCKSIZE)     { RETURN_MOUNTFAILED(220, 0) }
                if (sizeof(sBlock_ImageInfoBlockPair)    != (SDCARD_BLOCKSIZE*2)) { RETURN_MOUNTFAILED(220, 0) }   // a pair holds two info blocks
                if (sizeof(sBlock_NotFirstPowerUpMarker) != SDCARD_BLOCKSIZE)     { RETURN_MOUNTFAILED(220, 0) }
                if (sizeof(sBlock_MasterBlockSwitch)     != SDCARD_BLOCKSIZE)     { RETURN_MOUNTFAILED(220, 0) }
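Since `sizeof` and `offsetof` are compile-time constants, checks of this kind could also be trapped at build time, so a bad layout never reaches the device. A sketch using C11 `static_assert`, shown on the Figure 3 layout with GCC's `__attribute__((packed))` standing in for `__packed`; it assumes a C11-capable compiler, and only three of the types are repeated here to keep it short.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

#define SDCARD_BLOCKSIZE (512)
#define NUMBER_OF_INFO_ROWS_PER_INFO_BLOCK (73)
#define NUMBER_OF_UINT32_MASKS (1)
#define TABLEROW_SIZE_BYTES (7)
#define PACKED __attribute__((packed)) /* GCC/Clang stand-in for __packed */

typedef struct PACKED {
    uint8_t  Flags;
    uint16_t slotindex_Next;
    uint16_t slotindex_Prev;
    uint16_t slotindex_Link;
} sTableRowSavedData;

typedef struct PACKED {
    uint16_t ImageCount;
    uint16_t slotidx_AbsoluteStartOfList;
    uint16_t slotidx_AbsoluteEndOfList;
    uint16_t slotidx_FirstImageInList;
    uint16_t slotidx_LastImageInList;
    uint16_t slotidx_FirstEmptySlotInList;
    uint32_t mask_PairSwitches[NUMBER_OF_UINT32_MASKS];
    uint16_t MyCRC;
    uint8_t  UnusedBytes[SDCARD_BLOCKSIZE - (14 + NUMBER_OF_UINT32_MASKS * 4)];
} sBlock_MasterBlock;

typedef struct PACKED {
    sTableRowSavedData TableRow[NUMBER_OF_INFO_ROWS_PER_INFO_BLOCK];
    uint8_t UnusedBytes[SDCARD_BLOCKSIZE
                        - TABLEROW_SIZE_BYTES * NUMBER_OF_INFO_ROWS_PER_INFO_BLOCK];
} sBlock_ImageInfoBlock;

/* Size invariants: every block must match the SD card block size. */
static_assert(sizeof(sTableRowSavedData) == TABLEROW_SIZE_BYTES,
              "table row size changed; update TABLEROW_SIZE_BYTES");
static_assert(sizeof(sBlock_MasterBlock) == SDCARD_BLOCKSIZE,
              "master block must be exactly one SD card block");
static_assert(sizeof(sBlock_ImageInfoBlock) == SDCARD_BLOCKSIZE,
              "image info block must be exactly one SD card block");

/* Alignment invariant: a 32-bit member must sit on a 4-byte offset. */
static_assert(offsetof(sBlock_MasterBlock, mask_PairSwitches) % 4 == 0,
              "mask_PairSwitches is not 32-bit aligned");
```

A failing condition stops the build with the given message. The same `offsetof(...) % N` pattern could be applied to each member of a realigned table row.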
                

>> I suggest to switch to assembly line single stepping when debugging, to reveal the faulting instruction (and addresses).

+1. The actual instructions and register contents are useful for understanding the issue.

The biggest killer tends to be LDRD/STRD, which load or store a pair of registers as one 64-bit access. These fault on an unaligned address on all Cortex-M parts, and the compiler uses them for floating-point doubles and for other optimizations where it thinks it can fold accesses.

Alignment often bites with file structures or packets in byte streams.
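One way to sidestep folded or unaligned accesses for fields in byte-oriented structures is to assemble the value one byte at a time. A minimal sketch, assuming the stored format is little-endian; the helper names `rd16_le` and `wr16_le` are hypothetical.

```c
#include <stdint.h>

/* Read a little-endian 16-bit value from any address. The volatile
 * qualifier forces two separate byte loads, so the compiler cannot
 * merge them into one halfword or multi-register access. */
static uint16_t rd16_le(const volatile uint8_t *p)
{
    uint16_t lo = p[0];
    uint16_t hi = p[1];
    return (uint16_t)(lo | (uint16_t)(hi << 8));
}

/* Write a little-endian 16-bit value to any address, byte by byte. */
static void wr16_le(volatile uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)(v >> 8);
}
```

This trades speed for safety: each field costs two bus accesses, but no access can ever be unaligned.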
