(SDMMC) The Journey of: How exactly does a CRC-16 mismatch get signaled in the SD Bus Protocol?

PhucXDoan
Associate III

Today I started to implement DMA for the SDMMC peripheral, and it worked for the most part, but when I got around to doing some sector writes, I started to get CRC-16 mismatches. This was a relatively simple fix: I was enabling the Data Path State Machine before the Command Path State Machine, rather than enabling it after receiving the response to the command. For sector reads, enabling the data path first is what you're supposed to do, but for writes (or any other transfer where data goes from device to card), one needs to handle this the other way around. By making this post, I hope it fixes someone's potential bug.
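To make the ordering concrete, here is a minimal sketch of the two sequences described above. It only models the order of the steps; `plan_transfer`, `step_t`, and `xfer_dir_t` are hypothetical names made up for illustration and are not part of any ST HAL or register definition.

```c
#include <stddef.h>

/* Hypothetical step identifiers, for illustration only. */
typedef enum {
    STEP_ENABLE_DPSM,   /* enable the Data Path State Machine      */
    STEP_SEND_COMMAND,  /* start the Command Path State Machine    */
    STEP_WAIT_RESPONSE  /* wait for the card's command response    */
} step_t;

typedef enum { XFER_READ, XFER_WRITE } xfer_dir_t;

/* Fills `out` with the step order for the given direction and
 * returns the number of steps written (always 3 in this sketch). */
size_t plan_transfer(xfer_dir_t dir, step_t out[3])
{
    if (dir == XFER_READ) {
        /* Card-to-device: the data path is armed before the command. */
        out[0] = STEP_ENABLE_DPSM;
        out[1] = STEP_SEND_COMMAND;
        out[2] = STEP_WAIT_RESPONSE;
    } else {
        /* Device-to-card: data must not start until the card has
         * responded, so the data path is enabled last. */
        out[0] = STEP_SEND_COMMAND;
        out[1] = STEP_WAIT_RESPONSE;
        out[2] = STEP_ENABLE_DPSM;
    }
    return 3;
}
```

In real code each step would be a register write or a status poll on the SDMMC peripheral; the point here is only that the write path puts the data path enable after the response.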

But it made me wonder: why was there a CRC-16 mismatch anyway? It's a good question. One that didn't need to be answered -- and I almost just moved on -- but it is a good question. What's happening under the hood to cause this issue?

So I busted out my oscilloscope and picked apart what was happening. The following screenshot shows it working properly (no CRC-16 mismatch) on a 1-bit wide bus at 500 kHz, where a 512-byte sector of zeros is being written.

image.png

 

Now what does it look like when the State Machines are enabled in the wrong order?

PhucXDoan_5-1721639920913.png

Ah-hah! It appears that the Data Path State Machine got a little bit too excited and immediately began to transmit the data before the SD card even manages to respond to the command...

Very well... but wait... how exactly does the CRC-16 mismatch even get signaled?

Well, looking through the SD Association's Simplified Specification for the Physical Layer, one can find this diagram on page 9 (abs. pg. 35):

PhucXDoan_6-1721640145430.png

I wasn't doing a multi-block write, but the same principle should apply here.

After the data block (along with the 16-bit CRC) is sent by the SDMMC peripheral, there'd then be the "CRC ok response" and the busy signaling from the SD card shortly thereafter. Here's what it looks like up close:

PhucXDoan_8-1721641262537.png
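For reference, the 16-bit CRC that trails the data block can be reproduced in software. The sketch below assumes the CCITT generator polynomial x^16 + x^12 + x^5 + 1 with an initial value of zero, which is my understanding of what the specification uses for data blocks. The function name `sd_crc16` is my own, and this covers the 1-bit bus case only (on a 4-bit bus, each DAT line carries its own CRC).

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16 over a byte buffer, MSB first.
 * Assumed parameters: polynomial 0x1021, initial value 0x0000,
 * no reflection, no final XOR. */
uint16_t sd_crc16(const uint8_t *data, size_t len)
{
    uint16_t crc = 0x0000;
    for (size_t i = 0; i < len; i++) {
        /* Bring the next byte into the top of the register. */
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000) {
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
    }
    return crc;
}
```

With these parameters, a 512-byte sector of zeros yields a CRC of 0x0000, so in the zero-filled write above the 16 CRC bits after the data should all be low as well.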

Here's what the "CRC not ok response" looks like in the CRC-16 mismatch case:

image.png

But what exactly are these responses? Unfortunately, it'd appear that the simplified specification doesn't actually shed any light on this issue at all. It alludes to CRC failures, but avoids explicit details about this so-called "CRC ok response" as if it were taboo. I searched through the specification many, many times, but alas... this is the simplified document, so it's likely ████████.

So I then had a crazy idea: let's ask ChatGPT. Let's see what type of hallucination it could come up with to explain how exactly a CRC-16 mismatch gets communicated on the SD bus protocol!

I prompted it exactly that, and it gave me a rundown of how error handling is done and stuff; it said that the "SD card will send a negative acknowledgment on the DAT0 line during the data response phase". I guess "negative acknowledgement" is synonymous with "CRC not ok response", so I pressed further on that front...

PhucXDoan_9-1721641838108.png

 

Huh... 010... 101... those look familiar...!

 

PhucXDoan_12-1721642538884.png

 

These are -- in fact -- just control tokens(?)! They were already in the simplified specification, but under the SPI Mode section, which is why I ignored it... but it all makes sense.

The left picture below is the "CRC not ok response" and the right is the "CRC ok response"; both begin with the starting bit 0 and terminate with the ending bit 1. Inside is the status, with 101 being CRC mismatch and 010 as success.

PhucXDoan_10-1721642221551.pngPhucXDoan_11-1721642385835.png
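As a rough sketch, decoding the 5-bit pattern seen on DAT0 could look like the following. It assumes the layout observed in the captures (start bit 0, three status bits, end bit 1) and only the two status values seen here; the names `decode_crc_status` and `crc_status_t` are hypothetical, and the token is assumed to have been sampled MSB first into the low five bits of a byte.

```c
#include <stdint.h>

typedef enum {
    CRC_STATUS_OK,         /* status bits 010 */
    CRC_STATUS_CRC_ERROR,  /* status bits 101 */
    CRC_STATUS_UNKNOWN,    /* framing fine, status not one of the two above */
    CRC_STATUS_MALFORMED   /* start bit not 0 or end bit not 1 */
} crc_status_t;

/* `token` holds the five sampled bits as: start, s2, s1, s0, end
 * (bit 4 down to bit 0). Higher bits are ignored. */
crc_status_t decode_crc_status(uint8_t token)
{
    token &= 0x1F;

    if (token & 0x10) {          /* start bit must be 0 */
        return CRC_STATUS_MALFORMED;
    }
    if (!(token & 0x01)) {       /* end bit must be 1 */
        return CRC_STATUS_MALFORMED;
    }

    switch ((token >> 1) & 0x07) {
    case 0x2: return CRC_STATUS_OK;         /* 010 */
    case 0x5: return CRC_STATUS_CRC_ERROR;  /* 101 */
    default:  return CRC_STATUS_UNKNOWN;
    }
}
```

For instance, the bit sequence 0-0-1-0-1 (0x05) decodes as OK, and 0-1-0-1-1 (0x0B) decodes as a CRC error.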

I then asked ChatGPT whether these "control tokens" could still be used in the SD bus protocol -- outside of SPI mode -- and it said that they're exclusive to SPI, but I'm willing to wager that it's wrong on this account. It seems more likely to me that the SD bus protocol simply reuses the concept of these control tokens on the DAT0 line.

But is that really the case? I wouldn't know. All I've got is a simplified specification.

1 REPLY 1
TDK
Guru

Thank you for sharing. Excellent post.

If you feel a post has answered your question, please click "Accept as Solution".