2024-10-22 02:36 AM - edited 2024-10-22 02:41 AM
Hi folks,
I am trying to use the hardware AES of the STM32WLE5JC, but I get results that do not match the expected values. I am comparing my STM32 implementation against RadioLib's Crypto, which is based on tiny-AES-c and AES-CMAC. RadioLib's implementation yields results that completely match verification tools such as this one.
I am investigating two things: 1) single-buffer ECB encryption, and 2) CMAC calculation based on ECB encryption. My observations are as follows:
* I only get correct results for buffers of at most 128 bits (16 bytes).
Example:
uint8_t key[16] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
RadioLibAES128Instance.init(key);
uint8_t data[24] = { 0 };
uint8_t out[24] = { 0 };
// RadioLibAES128Instance.generateCMAC(data, 24, out);
RadioLibAES128Instance.encryptECB(data, 24, out);
RadioLib's software ECB outputs the following (verified using AES calculator):
11:17:16.387 > RLB_PRO: Init key:
11:17:16.389 > RLB_PRO: 00000000: 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f ................
11:17:16.396 > RLB_PRO: Data in:
11:17:16.397 > RLB_PRO: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
11:17:16.405 > RLB_PRO: 00000010: 00 00 00 00 00 00 00 00 ........
11:17:16.412 > RLB_PRO: Data out:
11:17:16.414 > RLB_PRO: 00000000: c6 a1 3b 37 87 8f 5b 82 6f 4f 81 62 a1 c8 d8 79 ..;7..[.oO.b...y
11:17:16.421 > RLB_PRO: 00000010: c6 a1 3b 37 87 8f 5b 82 ..;7..[.
Cube's hardware ECB outputs the following:
11:17:11.161 > RLB_PRO: Init key:
11:17:11.163 > RLB_PRO: 00000000: 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f ................
11:17:11.170 > RLB_PRO: Data in:
11:17:11.172 > RLB_PRO: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
11:17:11.180 > RLB_PRO: 00000010: 00 00 00 00 00 00 00 00 ........
11:17:11.562 > RLB_PRO: Data out:
11:17:11.564 > RLB_PRO: 00000000: 52 9d 08 6c 42 02 9a 85 43 44 05 6d 78 ab 8d 9f R..lB...CD.mx...
11:17:11.573 > RLB_PRO: 00000010: 46 63 6a 4c d9 4f 0c f8 FcjL.O..
When I uncomment the call to `generateCMAC()`, the following is the output of the CMAC call and the subsequent call to `encryptECB()` for RadioLib (CMAC verified using calculator, ECB output same as earlier):
11:29:36.528 > RLB_PRO: CMAC Data out:
11:29:36.530 > RLB_PRO: 00000000: 0b 14 5a b3 85 41 ba e9 2b 04 6a 68 81 0e 6f c2 ..Z..A..+.jh..o.
11:29:36.538 > RLB_PRO: Data in:
11:29:36.539 > RLB_PRO: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
11:29:36.546 > RLB_PRO: 00000010: 00 00 00 00 00 00 00 00 ........
11:29:36.554 > RLB_PRO: Data out:
11:29:36.556 > RLB_PRO: 00000000: c6 a1 3b 37 87 8f 5b 82 6f 4f 81 62 a1 c8 d8 79 ..;7..[.oO.b...y
11:29:36.563 > RLB_PRO: 00000010: c6 a1 3b 37 87 8f 5b 82 ..;7..[.
And the same for Cube-hardware (CMAC output correct, ECB output first 16 bytes correct, other 8 incorrect):
11:29:29.863 > RLB_PRO: CMAC Data out:
11:29:29.866 > RLB_PRO: 00000000: 0b 14 5a b3 85 41 ba e9 2b 04 6a 68 81 0e 6f c2 ..Z..A..+.jh..o.
11:29:29.873 > RLB_PRO: Data in:
11:29:29.875 > RLB_PRO: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
11:29:29.884 > RLB_PRO: 00000010: 00 00 00 00 00 00 00 00 ........
11:29:29.891 > RLB_PRO: Data out:
11:29:29.894 > RLB_PRO: 00000000: c6 a1 3b 37 87 8f 5b 82 6f 4f 81 62 a1 c8 d8 79 ..;7..[.oO.b...y
11:29:29.901 > RLB_PRO: 00000010: 13 97 f1 e0 eb cd 02 f9 ........
I am at a loss for how I should fix this. The key and buffer endianness seem fine, given that I do get correct results, but only under specific circumstances. The crypto initialization is done as follows:
stm32CubeCrypto.Instance = AES;
stm32CubeCrypto.Init.DataType = CRYP_DATATYPE_32B;
stm32CubeCrypto.Init.KeySize = CRYP_KEYSIZE_128B;
stm32CubeCrypto.Init.pKey = key32;
stm32CubeCrypto.Init.Algorithm = CRYP_AES_ECB;
stm32CubeCrypto.Init.DataWidthUnit = CRYP_DATAWIDTHUNIT_WORD;
stm32CubeCrypto.Init.HeaderWidthUnit = CRYP_HEADERWIDTHUNIT_WORD;
stm32CubeCrypto.Init.KeyIVConfigSkip = CRYP_KEYIVCONFIG_ONCE;
status = HAL_CRYP_Init(&stm32CubeCrypto);
Any clues, hints, further questions are very much appreciated!
2024-10-28 02:43 PM
Is there someone who can give some input on this, please? :)
2024-10-28 03:02 PM
Doesn't AES-128 only work on 16-byte blocks? You'd need to pad the input up to the next 16-byte boundary, because the cipher transforms all 128 bits of a block together. I.e. give it 32 bytes of padded input and get 32 bytes of output.
Or use CTR mode, where you encrypt a 16-byte counter block and XOR the result with the data stream, so no padding is needed.
uint8_t key[16] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
You don't show your 32-bit key, which I might expect needs to be
uint32_t key32[4] = { 0x00010203, 0x04050607,0x08090A0B,0x0C0D0E0F };
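For reference, the packing I have in mind could be sketched roughly like this. `key_bytes_to_words` is just an illustrative name (it is not part of the HAL or RadioLib), and the most-significant-byte-first word order is my assumption for `CRYP_DATATYPE_32B`, so check it against a known-good vector before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack a 16-byte AES-128 key into four 32-bit words,
 * most significant byte first, so bytes 00 01 02 03 become 0x00010203. */
static void key_bytes_to_words(const uint8_t key[16], uint32_t key32[4])
{
    assert(key != NULL && key32 != NULL);
    for (size_t w = 0; w < 4; w++) {
        key32[w] = ((uint32_t)key[4 * w]     << 24) |
                   ((uint32_t)key[4 * w + 1] << 16) |
                   ((uint32_t)key[4 * w + 2] <<  8) |
                    (uint32_t)key[4 * w + 3];
    }
}
```

Feeding it the `key[16]` array from the first post should give the same four words as the literal table above.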
2024-10-28 06:20 PM
You'd need to encrypt from a 32-byte, zero-padded buffer:
0000 : C6 A1 3B 37 87 8F 5B 82-6F 4F 81 62 A1 C8 D8 79 ..;7..[.oO.b...y
0010 : C6 A1 3B 37 87 8F 5B 82-6F 4F 81 62 A1 C8 D8 79 ..;7..[.oO.b...y
To get your 24 bytes back, you'd need to decrypt the 32-byte ciphertext and take the first 24 bytes:
IN
0000 : 00 01 02 03 04 05 06 07-08 09 0A 0B 0C 0D 0E 0F ................
0010 : 10 11 12 13 14 15 16 17-00 00 00 00 00 00 00 00 ................
OUT
0000 : 0A 94 0B B5 41 6E F0 45-F1 C3 94 58 C6 53 EA 5A ....An.E...X.S.Z
0010 : 8D E6 D8 C0 DD 5B CC 98-DF 46 3A CA E0 F5 2A 75 .....[...F:...*u
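The zero padding itself could be sketched along these lines. The names `aes_padded_len` and `aes_zero_pad` are made up for illustration, and plain zero padding only works if you keep track of the original length yourself (24 bytes here), since the padding cannot be told apart from trailing zero data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define AES_BLOCK_SIZE 16u

/* Round a byte count up to the next multiple of the AES block size. */
static size_t aes_padded_len(size_t len)
{
    return (len + AES_BLOCK_SIZE - 1u) / AES_BLOCK_SIZE * AES_BLOCK_SIZE;
}

/* Copy `len` bytes from `in` to `out` and zero-fill up to the padded length.
 * Returns the padded length; `out` must have room for at least that much. */
static size_t aes_zero_pad(const uint8_t *in, size_t len,
                           uint8_t *out, size_t out_cap)
{
    size_t padded = aes_padded_len(len);
    assert(out_cap >= padded);
    memcpy(out, in, len);
    memset(out + len, 0, padded - len);
    return padded;
}
```

For the 24-byte example this yields a 32-byte buffer, which is then encrypted as two full blocks.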
2024-10-29 05:12 AM - edited 2024-10-29 05:13 AM
Thank you very much. Since AES is only defined for inputs that are a multiple of 16 bytes, I had assumed that the padding would be handled implicitly. That was clearly wrong, so I have now added the padding. This works well:
13:08:33.389 > RLB_PRO: Data in:
13:08:33.392 > RLB_PRO: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
13:08:33.400 > RLB_PRO: 00000010: 00 00 00 00 00 00 00 00 ........
13:08:33.408 > RLB_PRO: Data out:
13:08:33.411 > RLB_PRO: 00000000: c6 a1 3b 37 87 8f 5b 82 6f 4f 81 62 a1 c8 d8 79 ..;7..[.oO.b...y
13:08:33.419 > RLB_PRO: 00000010: c6 a1 3b 37 87 8f 5b 82 6f 4f 81 62 a1 c8 d8 79 ..;7..[.oO.b...y
One issue remains, however: this only works if I first call the CMAC function. If I just call init() and then encryptECB(), I get different output:
13:08:29.073 > RLB_PRO: Data in:
13:08:29.075 > RLB_PRO: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
13:08:29.083 > RLB_PRO: 00000010: 00 00 00 00 00 00 00 00 ........
13:08:29.091 > RLB_PRO: Data out:
13:08:29.094 > RLB_PRO: 00000000: ab 50 88 66 ee ed 42 69 83 ac 9e 53 d1 78 04 5b .P.f..Bi...S.x.[
13:08:29.102 > RLB_PRO: 00000010: ab 50 88 66 ee ed 42 69 83 ac 9e 53 d1 78 04 5b .P.f..Bi...S.x.[
I haven't yet traced whether I need the full CMAC calculation chain to take place first or only a subset of its computations, but it is really weird. If you have any hints for that, they're very welcome, but I am already grateful for your reply, as part of the problem is resolved :D
(Could it have something to do with a B0 buffer initialization? But if I call encrypt() twice, I get the same (invalid) output...)
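One thing I noticed in both dumps: the two ciphertext blocks are identical to each other, which is what ECB should produce for two identical plaintext blocks. So the per-block handling looks consistent even in the wrong case, and the difference may rather lie in the key or peripheral state on the first call. That is only a guess on my side. A small check like the following (the helper name `all_blocks_equal` is my own) makes that comparison explicit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns true when every 16-byte block in `buf` equals the first block.
 * `len` must be a multiple of 16. */
static bool all_blocks_equal(const uint8_t *buf, size_t len)
{
    assert(len % 16 == 0);
    for (size_t off = 16; off + 16 <= len; off += 16) {
        if (memcmp(buf, buf + off, 16) != 0) {
            return false;
        }
    }
    return true;
}
```

For an all-zero input it should return true for any ECB output, whether or not the key was loaded correctly, so it only separates block-handling problems from key or state problems.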
2024-10-29 08:36 AM
I usually use AES in its CBC and CTR forms, and I don't have a lot of mileage with CMAC.
With ECB I have some, perhaps, although not on your platform.
Most of the CMAC stuff is under the LoRaWAN or mbedTLS libraries
https://github.com/STMicroelectronics/STM32CubeWL/tree/main/Middlewares/Third_Party/mbed-crypto
2024-11-01 12:15 AM
In the end I had to apply a bit of a weird hack: instead of simply encrypting a buffer, I have to encrypt a dummy 128-bit block first:
uint8_t a[1] = { 0 };  // dummy input, padded to one block inside encryptECB_HW()
uint8_t b[16];         // dummy output, discarded
this->encryptECB_HW(a, 1, b);
this->encryptECB_HW(in, len, out);
Where this encryption function looks like this:
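size_t encryptECB_HW(uint8_t* in, size_t len, uint8_t* out) {
  // round up to a whole number of 16-byte AES blocks; out must hold num_blocks * 16 bytes
  size_t num_blocks = (len + 15) / 16;
  // pack the input bytes into big-endian 32-bit words, the tail stays zero-padded
  // (variable-length arrays are a compiler extension in C++)
  uint32_t input[num_blocks * 4] = { 0 };
  for (size_t i = 0; i < len; i++) {
    input[i/4] |= (uint32_t)in[i] << ((3 - (i % 4)) * 8);
  }
  uint32_t output[num_blocks * 4] = { 0 };
  // the size is given in 32-bit words because DataWidthUnit is CRYP_DATAWIDTHUNIT_WORD
  if (HAL_CRYP_Encrypt(&stm32CubeCrypto, input, num_blocks * 4, output, HAL_MAX_DELAY) != HAL_OK) {
    return(0);  // report failure as zero bytes written
  }
  // unpack the big-endian words back into bytes
  for (size_t i = 0; i < num_blocks * 16; i += 4) {
    out[i+3] = output[i/4] & 0xFF;
    out[i+2] = (output[i/4] >> 8) & 0xFF;
    out[i+1] = (output[i/4] >> 16) & 0xFF;
    out[i+0] = (output[i/4] >> 24) & 0xFF;
  }
  return(num_blocks*16);
}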
I don't know why it must be done this way, but it works, and encrypting an additional block using hardware AES is definitely faster than encrypting the actual buffer in software. So I'm happy, and thanks for the help @Tesla DeLorean!
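For anyone who wants to check the byte/word conversion without the hardware, the packing and unpacking can be pulled out into two plain functions and tested on a PC. This is only a sketch of the same conversion as in the function above, with helper names I made up (`bytes_to_be_words`, `be_words_to_bytes`); the HAL call itself is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack `len` bytes into big-endian 32-bit words. `words` must hold
 * `num_words` entries with num_words * 4 >= len; unused tail bytes are zero. */
static void bytes_to_be_words(const uint8_t *in, size_t len,
                              uint32_t *words, size_t num_words)
{
    assert(num_words * 4 >= len);
    for (size_t w = 0; w < num_words; w++) {
        words[w] = 0;
    }
    for (size_t i = 0; i < len; i++) {
        words[i / 4] |= (uint32_t)in[i] << ((3 - (i % 4)) * 8);
    }
}

/* Unpack big-endian 32-bit words into num_words * 4 bytes. */
static void be_words_to_bytes(const uint32_t *words, size_t num_words,
                              uint8_t *out)
{
    for (size_t w = 0; w < num_words; w++) {
        out[4 * w + 0] = (uint8_t)(words[w] >> 24);
        out[4 * w + 1] = (uint8_t)(words[w] >> 16);
        out[4 * w + 2] = (uint8_t)(words[w] >> 8);
        out[4 * w + 3] = (uint8_t)(words[w]);
    }
}
```

Packing and then unpacking should return the original bytes followed by zero padding, which rules out the conversion as the cause of any mismatch.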