
STM32H730 AES-GCM Tag issue

GCern.1
Associate III

On the STM32H730, an invalid tag is generated for payloads whose length is not a multiple of 4 bytes, as described in Re: STM32U545 AES GCM Tag mismatch - STMicroelectronics Community.

AES-CCM: no issues, everything works.

AES-GCM:

The HAL already handles required padding on its own.

  • Encryption and tag generation - work
  • Decryption - works
  • Tag generation after decryption - invalid

Tried with and without AD. Tried providing AD and/or ciphertext with lengths that are multiples of 4 and 16 bytes (as suggested by other posts; however, the H7 HAL already handles padding). The tag generated after decryption is invalid whenever the ciphertext length is not a multiple of 4 bytes.

Everything works when the ciphertext length is a multiple of 4 bytes (but not when it is manually padded to such a length with zeros at the end). A 15-byte ciphertext does not produce a correct tag, but 12-byte and 16-byte ciphertexts work.

Standard HAL functions are used: HAL_CRYP_Encrypt, HAL_CRYP_Decrypt, HAL_CRYPEx_AESGCM_GenerateAuthTAG, and HAL_CRYP_SetConfig.

What am I doing wrong?

Edit: Will recheck in the CubeMX-generated project.

Example

DataType = CRYP_DATATYPE_8B
DataWidthUnit  = CRYP_DATAWIDTHUNIT_BYTE

All buffers are aligned (i.e. __attribute__((__aligned__(4)))).

Given

Key (128 bits): [0x071b113b, 0x0ca743fe, 0xcccf3d05, 0x1f737382] (endianness adjusted)
IV (12 bytes, providing 16): [0xf0761e8d, 0xcd3d0001, 0x76d457ed, 0x00000002] (endianness adjusted, counter set to 2)

Encrypt, generate a tag, decrypt, and generate a tag again for 15 bytes of data:
Plaintext (15 bytes) - [0x08, 0x00, 0x0F, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x15, 0x16, 0x17]

Expected
Ciphertext (15 bytes) - [0x13, 0xB4, 0xC7, 0x2B, 0x38, 0x9D, 0xC5, 0x01, 0x8E, 0x72, 0xA1, 0x71, 0xD1, 0x89, 0xA9]
Tag (16 bytes) - [0x8A, 0x0B, 0xF6, 0xE7, 0xC8, 0xAC, 0x33, 0x47, 0x61, 0x16, 0xED, 0xA8, 0x05, 0x2E, 0xF7, 0xC6]

Results
Encryption and tag generation work, getting:

Ciphertext (15 bytes) - [0x13, 0xB4, 0xC7, 0x2B, 0x38, 0x9D, 0xC5, 0x01, 0x8E, 0x72, 0xA1, 0x71, 0xD1, 0x89, 0xA9]
Tag (16 bytes) - [0x8A, 0x0B, 0xF6, 0xE7, 0xC8, 0xAC, 0x33, 0x47, 0x61, 0x16, 0xED, 0xA8, 0x05, 0x2E, 0xF7, 0xC6]

Decryption works, but tag generation fails, getting:
Plaintext (15 bytes) - [0x08, 0x00, 0x0F, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x15, 0x16, 0x17]
Tag (16 bytes) - [0x74, 0x27, 0x4E, 0x06, 0x64, 0x84, 0xD8, 0x4C, 0x04, 0x3E, 0x25, 0x38, 0x15, 0x3A, 0x47, 0xA4] - invalid

Encrypt, generate a tag, decrypt, and generate a tag again for 16 bytes of data:
Plaintext (16 bytes) - [0x08, 0x00, 0x0F, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x15, 0x16, 0x17, 0xAB]

Expected
Ciphertext (16 bytes) - [0x13, 0xB4, 0xC7, 0x2B, 0x38, 0x9D, 0xC5, 0x01, 0x8E, 0x72, 0xA1, 0x71, 0xD1, 0x89, 0xA9, 0x64]
Tag (16 bytes) - [0x85, 0xE9, 0x98, 0x47, 0xBE, 0x4D, 0xAA, 0xCB, 0x86, 0xFB, 0xCF, 0x00, 0xBE, 0x58, 0x5B, 0x47]

Results
Encryption and tag generation, as well as decryption and tag generation, work as expected.

Saket_Om
ST Employee

Hello @GCern.1 

I reported your question internally and will get back to you as soon as possible.

Internal ticket number: 210917 (This is an internal tracking number and is not accessible or usable by customers).

To give better visibility on the answered topics, please click on "Accept as Solution" on the reply which solved your issue or answered your question.
Saket_Om
Saket_Om
ST Employee

Hello @GCern.1 

Could you please share your project so that we can assist you more effectively?


@Saket_Om I observed that tag generation fails when the ciphertext buffer is allocated on the stack rather than globally. Attached an example.

DMA is not used, D/ICaches are disabled.

In the initial project (not the provided example), the tag is generated correctly when the ciphertext is located at 0x24022ed8, but fails when it is located at 0x24021c98. Both addresses are in SRAM, but in that project the D/ICaches are enabled. Maybe I am digging in the wrong direction...

What is the reason for such behavior?

Hello @GCern.1 

Please refer to the example below: 

STM32CubeH7/Projects/STM32H735G-DK/Examples/CRYP/CRYP_AES_GCM at master · STMicroelectronics/STM32CubeH7 · GitHub

GCern.1
Associate III

Hello, thank you. I'm aware of the examples; they don't trigger the issue.

Even in the example I shared earlier (as a zip attachment), it's possible to get it working. However, there is a specific case where it fails. I suspect the problem may be related to memory access and/or cache coherency or something else, though I haven't pinpointed it yet.

Hello @GCern.1 

Could you please check with the attached patch?


Hi @Saket_Om 

After several corrections to the file to make it compile on the H73x, the tag generated after GCM decryption is still incorrect. Decryption of the ciphertext itself works with both the modified and the original `stm32h7xx_hal_cryp.c`.


The tag is generated correctly when:
* The input array is 16 bytes but we decode 15 bytes (as there are only 15 bytes of data).
* We dynamically allocate an aligned input array whose size is a multiple of 4,
   but decode with the real size (i.e. 15).
* The input array is in the global scope (regardless of its size).

The tag is generated incorrectly when:
* The input array is 15 bytes on the stack.

Tried to flush and invalidate the input array memory area, no difference.
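
The working cases above appear consistent with the byte following the 15 data bytes being zero. As a minimal sketch of a caller-side workaround along those lines (an assumption, not a confirmed fix), the ciphertext can be copied into a word-aligned, zero-filled staging buffer before decryption. `gcm_padded_len` and `gcm_stage_input` are hypothetical helper names and are not part of the HAL.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GCM_BLOCK_SIZE 16U

/* Hypothetical helper: round len up to a whole 16-byte AES block. */
static size_t gcm_padded_len(size_t len)
{
    return (len + GCM_BLOCK_SIZE - 1U) / GCM_BLOCK_SIZE * GCM_BLOCK_SIZE;
}

/*
 * Hypothetical helper: copy len bytes of ciphertext into a word-aligned
 * staging buffer and zero every byte up to the next block boundary.
 * Returns 0 on success, -1 if the staging buffer is too small.
 */
static int gcm_stage_input(uint32_t *staging, size_t staging_bytes,
                           const uint8_t *src, size_t len)
{
    size_t padded = gcm_padded_len(len);

    if (padded > staging_bytes) {
        return -1;
    }
    memset(staging, 0, padded);   /* padding bytes become 0x00 */
    memcpy(staging, src, len);    /* real data goes in front   */
    return 0;
}
```

The staged buffer would then be passed to HAL_CRYP_Decrypt with the real size (15 in the example), not the padded size, since padding the size itself changes the data the tag is computed over.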

Instructions for input array initialization:

-- with [16]
ldr r3, [pc, #140];
add.w r12, sp, #80;
ldmia r3, {r0, r1, r2, r3};
stmia r12, {r0, r1, r2, r3};
 
-- with [15]
ldr r3, [pc, #140];
add.w r12, sp, #80;
ldmia r3, {r0, r1, r2, r3};
stmia.w r12!, {r0, r1, r2};
strh.w r3, [r12], #2;
lsrs r3, r3, #16;
strb.w r3, [r12];

Maybe the issue relates to ECC?

@Saket_Om update2 after some further investigation: the culprit seems to be the padding bytes.


During encryption, the CRYP_CR_NPBLB field is used to tell the hardware which bytes to discard;
however, during decryption it is not used/not available.

Here is the debug info. With the 16-byte input, the last word has an explicit `00` padding byte in its top byte (0x00a989d1), while with the 15-byte input buffer we get some "noise" (8a) there instead of `00` (0x8aa989d1).
-----------------------------------------------------------

Working: 16-byte input buffer

x/15xb hcryp->pCrypInBuffPtr
0x2402a5d8: 0x13    0xb4    0xc7    0x2b    0x38    0x9d    0xc5    0x01
0x2402a5e0: 0x8e    0x72    0xa1    0x71    0xd1    0x89    0xa9

x/4xw hcryp->pCrypInBuffPtr
0x2402a5d8: 0x2bc7b413  0x01c59d38  0x71a1728e  0x00a989d1

When CrypInCount == 3
p/x *(uint32_t *)(hcryp->pCrypInBuffPtr + hcryp->CrypInCount)
$2 = 0xa989d1

-----------------------------------------------------------
Not working: 15-byte input buffer

x/15xb hcryp->pCrypInBuffPtr
0x24021c54: 0x13    0xb4    0xc7    0x2b    0x38    0x9d    0xc5    0x01
0x24021c5c: 0x8e    0x72    0xa1    0x71    0xd1    0x89    0xa9

x/4xw hcryp->pCrypInBuffPtr
0x24021c54: 0x2bc7b413  0x01c59d38  0x71a1728e  0x8aa989d1
-----------------------------------------------------------
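
To make the two dumps easier to compare, here is a small sketch of how four consecutive bytes map onto one 32-bit word on a little-endian core. `load_le32` is a hypothetical helper written only for illustration: the fourth byte in memory ends up in bits 31:24, which is exactly where the stray `8a` shows up.

```c
#include <stdint.h>

/* Assemble a 32-bit word from 4 bytes the way a little-endian load would. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

With the last three ciphertext bytes `d1 89 a9` followed by `00`, this gives 0x00a989d1. Followed by `8a`, it gives 0x8aa989d1, matching the debugger output.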
 
After zeroing the padding bytes in software, the authentication tag is generated correctly after decryption.
The modification is made in CRYP_AESGCM_Process in stm32h7xx_hal_cryp.c and sets the padding bytes of the last word to 0:
/* Write the last input block in the IN FIFO */
for (index = 0U; index < lastwordsize; index ++)
{
    if (((hcryp->Instance->CR & CRYP_CR_ALGODIR) == CRYP_OPERATINGMODE_DECRYPT) &&
        ((npblb != 0) && (index + 1 == lastwordsize))) {
        // set pad bytes to 0
        hcryp->Instance->DIN = (*(uint32_t *)(hcryp->pCrypInBuffPtr + hcryp->CrypInCount) << (npblb * 8)) >> (npblb * 8);
    } else {
        hcryp->Instance->DIN  = *(uint32_t *)(hcryp->pCrypInBuffPtr + hcryp->CrypInCount);
    }
    hcryp->CrypInCount++;
}
 
Update3: unfortunately, this won't work for all input cases (i.e. when npblb is > 4).
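
A short sketch of why the first patch breaks for other lengths, assuming npblb counts the padding bytes needed to fill the last 16-byte block: once that count reaches four or more, `npblb * 8` is 32 or larger, and shifting a 32-bit value by that amount is undefined in C. `block_pad_bytes` and `shift_is_defined` are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Padding bytes needed to fill the last 16-byte block
 * (assumed here to mirror how npblb is derived from the size). */
static uint32_t block_pad_bytes(uint32_t size)
{
    return (((size / 16U) + 1U) * 16U) - size;
}

/* A shift of a 32-bit value is only defined for shift counts 0..31. */
static bool shift_is_defined(uint32_t pad_bytes)
{
    return (pad_bytes * 8U) < 32U;
}
```

For a 15-byte input the block padding is 1 byte, so the 8-bit shift is fine. For a 9-byte input it is 7 bytes, which would mean a 56-bit shift on a 32-bit word.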
 
Update4: after zeroing the padding in the last writable word (4 bytes) of the last block (16 bytes), it seems to work fine. Required changes (same function, same file):
/* Write the last input block in the IN FIFO */
unsigned pad_bytes = ((((uint32_t)(hcryp->Size) / 4U) + 1U) * 4U) - (uint32_t)(hcryp->Size);
for (index = 0U; index < lastwordsize; index ++)
{
    if (((hcryp->Instance->CR & CRYP_CR_ALGODIR) == CRYP_OPERATINGMODE_DECRYPT) &&
        (((pad_bytes != 0)) && (index + 1 == lastwordsize))) {
      // set pad bytes to 0
      hcryp->Instance->DIN = (*(uint32_t *)(hcryp->pCrypInBuffPtr + hcryp->CrypInCount) << (pad_bytes * 8)) >> (pad_bytes * 8);
    } else {
      hcryp->Instance->DIN  = *(uint32_t *)(hcryp->pCrypInBuffPtr + hcryp->CrypInCount);
    }
    hcryp->CrypInCount++;
}
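
One caveat with the snippet above: when the size is a multiple of 4, `pad_bytes` evaluates to 4, so the expression would shift a 32-bit value by 32 bits if that branch is ever reached, which is undefined in C. A minimal sketch of a mask-based alternative that avoids this, assuming the valid bytes occupy the low-order bits of the last word as in the debugger dump; `mask_last_word` is a hypothetical helper, not HAL code.

```c
#include <stdint.h>

/*
 * Hypothetical helper: clear the bytes of the last input word that lie
 * beyond the real data length. size is the payload length in bytes.
 */
static uint32_t mask_last_word(uint32_t word, uint32_t size)
{
    uint32_t valid = size % 4U;   /* data bytes in the last word; 0 = full word */

    if (valid == 0U) {
        return word;              /* nothing to clear, and no 32-bit shift */
    }
    /* valid is 1..3 here, so the shift count is 8, 16 or 24 */
    return word & ((UINT32_C(1) << (valid * 8U)) - 1U);
}
```

With the values from the dump, masking 0x8aa989d1 for a size of 15 yields 0x00a989d1, the same word seen in the working 16-byte case.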
GCern.1
Associate III

@Saket_Om I opened an issue on the GitHub repo.