STM32H56 PKA Arithmetic operations not working

A_A_Phil
Associate II

I am working with the NUCLEO-H563ZI and trying to write an application that verifies an ECDSA certificate. I have already done the verification successfully using HAL_PKA_ECDSAVerif() with all the parameters supplied. My problem is that the public key I receive is in the compact (compressed) form: I only have the X part and a bit that indicates which Y part to choose, so I have to calculate the Y part separately before running the verification. To get the Y part I must use the ECC curve equation: y = modular square root of (x^3 + a*x + b). The parameters are 192 bits long, so I can't compute this with native integer types. I tried using the PKA arithmetic functions instead, but I cannot get them to work. The function blocks until the timeout expires; with HAL_MAX_DELAY it stays stuck forever. Here is the code:

 

//Just a mockup of the actual code
uint32_t number1[6] = {0x12345678, 0x9ABCDEF0, 0x13579BDF, 0x2468ACE0, 0x02468ACE, 0x13579BDF};
uint32_t number2[6] = {0x98765432, 0xABCDEF09, 0x2468ACE0, 0x13579BDF, 0x02468ACE, 0x2468ACE0};
uint32_t result [7];

PKA_AddInTypeDef Addition_Struct;

int main(void)
{
	HAL_Init();
	SystemClock_Config();
	MX_RNG_Init();
	MX_PKA_Init();
	
	Addition_Struct.size = 6;
	Addition_Struct.pOp1 = number1;
	Addition_Struct.pOp2 = number2;
	
	HAL_StatusTypeDef status = HAL_PKA_Add (&hpka, &Addition_Struct, 1000);
	//It doesn't work even with HAL_MAX_DELAY

	// Check if addition was successful (compare the returned status, not the peripheral state)
	if (status == HAL_OK){
		printf("Operation Successful\n");
		HAL_PKA_Arithmetic_GetResult(&hpka, result);
	} else{
		// Addition failed
		printf("Error occurred during addition\n");
		uint32_t error = HAL_PKA_GetError (&hpka); //Error has value 12
	}

	while (1)
	{
	}
}

 

What could the problem be? And do you have any idea how I can calculate the Y part of the public key efficiently?

Accepted Solutions
Jocelyn RICARD
ST Employee

Hello @A_A_Phil ,

The NUCLEO-H563ZI does not support HW-accelerated crypto.

You should use the STM32H573I-DK.

Best regards

Jocelyn


8 REPLIES
Christian N
ST Employee

This post has been escalated to the ST Online Support Team for additional assistance. We'll contact you directly.

moe_fdi
Associate II
 

Hello,

Have you ever successfully calculated the uncompressed public key from a compressed public key using the PKA arithmetic functions?

BR

Hi,

No, I haven't implemented this on an STM32 yet, but I did on another microcontroller; it should be similar.

The compressed key is basically the x-coordinate of the curve point plus a sign byte or indicator. You use x to calculate y^2 and then take the square root of that using the Tonelli-Shanks algorithm (a great video explaining how to use it: Finding Mod-p Square Roots with the Tonelli-Shanks Algorithm).

  • Calculate y^2 = x^3 + ax + b = n (all modular arithmetic).
  • Calculate y1 = n^((P+1)/4): the division is just a right shift by 2, the addition is a normal one (not modular), and the exponentiation is modular. This shortcut applies when P mod 4 = 3, which holds for the NIST P-192 and P-256 primes.
  • y2 = P - y1 (the modular negation of y1, not a modular inversion).
  • Choose y1 or y2 using the indicator from the compressed key (even or odd).

All of these operations can be carried out with the HAL functions for modular exponentiation, modular addition, modular subtraction and Montgomery multiplication.
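
To make the steps concrete, here is a minimal sketch on a toy curve whose numbers fit in native integers. It is not PKA code: the helper names (mul_mod, pow_mod, decompress_y) are made up for this illustration, and the curve y^2 = x^3 + x + 1 over P = 23 is only a small textbook example. On the real target, each modular operation would be replaced by the corresponding big-number routine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Modular multiplication. Only safe for small toy moduli (p < 2^32),
 * so that a * b cannot overflow 64 bits. */
uint64_t mul_mod(uint64_t a, uint64_t b, uint64_t p) {
    return (a * b) % p;
}

/* Square-and-multiply modular exponentiation: base^exp mod p. */
uint64_t pow_mod(uint64_t base, uint64_t exp, uint64_t p) {
    uint64_t result = 1 % p;
    base %= p;
    while (exp > 0) {
        if (exp & 1u) {
            result = mul_mod(result, base, p);
        }
        base = mul_mod(base, base, p);
        exp >>= 1;
    }
    return result;
}

/* Hypothetical helper: recover y from x on y^2 = x^3 + a*x + b (mod p).
 * Assumes p is prime and p % 4 == 3, so y1 = n^((p+1)/4) is a square root
 * of n whenever one exists. want_odd is the parity indicator taken from
 * the compressed key. Returns false if x is not on the curve. */
bool decompress_y(uint64_t x, uint64_t a, uint64_t b, uint64_t p,
                  bool want_odd, uint64_t *y_out) {
    uint64_t x3 = mul_mod(mul_mod(x, x, p), x, p);
    uint64_t n = (x3 + mul_mod(a, x, p) + (b % p)) % p;   /* n = y^2 */
    uint64_t y = pow_mod(n, (p + 1) / 4, p);              /* candidate y1 */

    if (mul_mod(y, y, p) != n) {
        return false;                 /* n has no square root mod p */
    }
    if (((y & 1u) != 0) != want_odd) {
        y = (p - y) % p;              /* y2 = P - y1 */
    }
    *y_out = y;
    return true;
}
```

For example, decompress_y(3, 1, 1, 23, true, &y) yields y = 13 and the even choice yields y = 10; both satisfy y^2 mod 23 = 8 = 3^3 + 3 + 1 mod 23.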

I hope this helps

Thank you for your reply.

I’ve implemented most of the calculations, but I’m currently stuck on the division by 4. Since there isn’t an arithmetic function for shifting, which function could I use instead? Is it the modular inversion, to calculate (P+1)/4?

You can use something like this:

void shift_array_right_by_2(uint8_t *array, size_t length) {
    if (length == 0) return;

    // Carry to handle the bits that fall off during the shift
    uint8_t carry = 0;
    for (int i = length - 1; i >= 0; i--) {
        // Save the lowest 2 bits to be carried to the next element
        uint8_t new_carry = (array[i] & 0x03) << 6; // Mask the lowest 2 bits and move them to the top
        
        // Right shift the current element by 2 bits and add the previous carry bits at the leftmost position
        array[i] = (array[i] >> 2) | carry;
        
        // Update the carry with the overflow bits for the next iteration
        carry = new_carry;
    }
}

For a uint32_t array, change the array and carry types to uint32_t and compute new_carry as:

// Save the lower 2 bits to be carried over to the next element in the next loop iteration
uint32_t new_carry = (array[i] & 0x3) << (sizeof(uint32_t) * 8 - 2);
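
Putting the pieces together, a complete uint32_t variant that computes (P+1)/4 might look like the sketch below. The function names are made up for this example, and it assumes the same layout as the loop above, where index 0 holds the least significant word; check that against the word order your PKA inputs actually use.

```c
#include <stddef.h>
#include <stdint.h>

/* Add 1 to a multi-word integer (index 0 = least significant word).
 * A carry out of the top word is dropped. */
void add_one_u32(uint32_t *array, size_t length) {
    for (size_t i = 0; i < length; i++) {
        array[i] += 1u;
        if (array[i] != 0u) {
            return;               /* no carry into the next word */
        }
    }
}

/* Shift a multi-word integer right by 2 bits, i.e. divide by 4.
 * Walks from the most significant word down, so the low 2 bits of
 * each word become the top 2 bits of the next lower word. */
void shift_right_by_2_u32(uint32_t *array, size_t length) {
    uint32_t carry = 0;
    for (size_t i = length; i-- > 0; ) {
        uint32_t new_carry = (array[i] & 0x3u) << 30;
        array[i] = (array[i] >> 2) | carry;
        carry = new_carry;
    }
}
```

Applied to the NIST P-192 prime (2^192 - 2^64 - 1), calling add_one_u32 and then shift_right_by_2_u32 gives 2^190 - 2^62, which is the exponent (P+1)/4 for that curve.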

Thank you for the suggested solution. I now have another problem, with the last function, the one used for the modular exponentiation.

I'm encountering a timeout error with HAL_PKA_ModExp, even when setting a long timeout such as HAL_MAX_DELAY. Additionally, the output data is incorrect. Here is a simplified version of my code:
uint8_t data[96] = {
0x9c, 0x29, 0xec, 0x4b, 0xd7, 0x69, 0xf4, 0xe9, 0x30, 0xc7, 0xb0, 0x14, 0x2e, 0x51, 0x81, 0xff,
0x98, 0x8c, 0xb1, 0xa0, 0x3e, 0x66, 0xf8, 0xcc, 0x62, 0x13, 0xb5, 0xed, 0xf6, 0x89, 0xfc, 0x9d,
0x37, 0x0e, 0xa0, 0x94, 0xfe, 0x9e, 0x8d, 0xd5, 0x64, 0xf9, 0x05, 0x28, 0x22, 0x42, 0xfd, 0x87,
0xec, 0xb2, 0x99, 0x5c, 0xaf, 0x3c, 0x0d, 0x9d, 0xd7, 0xcf, 0x2f, 0x35, 0xda, 0xf0, 0xcb, 0xc7,
0xb8, 0xcd, 0x7c, 0xc1, 0xe0, 0xed, 0xa6, 0x13, 0xe5, 0xbd, 0x07, 0xfb, 0x81, 0x80, 0xb6, 0x4a,
0x1a, 0x27, 0x80, 0xd9, 0xd8, 0xc5, 0x28, 0xa0, 0xae, 0xe2, 0xb2, 0xef, 0x93, 0x37, 0x91, 0xa3
};

uint8_t exponent[32] = {
0x3f, 0xff, 0xff, 0xff, 0xc0, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};

uint8_t mod[96] = {
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
};

PKA_ModExpInTypeDef sModExpConfig;
sModExpConfig.expSize = 32;
sModExpConfig.OpSize = 96;
sModExpConfig.pExp = exponent;
sModExpConfig.pOp1 = data;
sModExpConfig.pMod = mod;

uint8_t resultModExp[100];
memset(resultModExp, 0, 100);

if (HAL_PKA_ModExp(&hpka, &sModExpConfig, 10000) != HAL_OK) {
    printf("Modular exponentiation timed out.\n");
}

HAL_PKA_ModExp_GetResult(&hpka, resultModExp);

The expected result (calculated using Python) is:
plaintext = 0x9c29ec4bd769f4e930c7b0142e5181ff988cb1a03e66f8cc6213b5edf689fc9d370ea094fe9e8dd564f905282242fd87ecb2995caf3c0d9dd7cf2f35daf0cbc7b8cd7cc1e0eda613e5bd07fb8180b64a1a2780d9d8c528a0aee2b2ef933791a3
publicExponent = 0x3fffffffc0000000400000000000000000000000400000000000000000000000
modulus = 0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000ffffffff00000001000000000000000000000000ffffffffffffffffffffffff

result = pow(plaintext, publicExponent, modulus)
print("Mod Exp: ")
print(hex(result))

--> 0x80d6d17eb470e9b87827860307fbd48b4adef1ad9346527a32cd6e34e64e7ec2


Strangely, though, HAL_PKA_ModExp works as expected with a different modulus and exponent:

uint8_t modulus[96] = {
0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25,
0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25,
0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25,
0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25,
0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25,
0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25, 0x23, 0x22, 0x22, 0x25};

uint8_t publicExponent[32] = {
0x00, 0x00, 0x05, 0x01, 0x00, 0x00, 0x05, 0x01, 0x00, 0x00, 0x05, 0x01, 0x00, 0x00, 0x05, 0x01,
0x00, 0x00, 0x05, 0x01, 0x00, 0x00, 0x05, 0x01, 0x00, 0x00, 0x05, 0x01, 0x00, 0x00, 0x05, 0x01};

uint8_t plaintext[96] = {
0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x43,
0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x43,
0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x43,
0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x43,
0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x43,
0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x43, 0x44, 0x44, 0x44, 0x44};

expected result:
plaintext = 0x444444434444444344444443444444434444444344444443444444434444444344444443444444434444444344444443444444434444444344444443444444434444444344444443444444434444444344444443444444434444444344444444
publicExponent = 0x0000050100000501000005010000050100000501000005010000050100000501
modulus = 0x232222252322222523222225232222252322222523222225232222252322222523222225232222252322222523222225232222252322222523222225232222252322222523222225232222252322222523222225232222252322222523222225
result = pow(plaintext, publicExponent, modulus)
print("Mod Exp: ")
print(hex(result))
--> 0xeac92680eac92680eac92680eac92680eac92680eac92680eac92680eac92680eac92680eac92680eac92680eac92680eac92680eac92680eac92680eac92680eac92680eac92680eac92680eac92680eac92680eac92680eac92680eac9269

Could you please help me find the problem? Has anyone encountered a similar issue with HAL_PKA_ModExp? Are there specific constraints on the modulus or exponent values that could cause this behavior? Any advice on potential fixes or workarounds would be greatly appreciated!

[Image: A_A_Phil_0-1731088192658.png, table from the reference manual (page 1402)]

The modulus you have used is an even integer:

uint8_t mod[96] = {
LSB -> 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
};

modulus = 0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000ffffffff00000001000000000000000000000000ffffffffffffffffffffffff <- LSB

I think you made a mistake when copying the hex strings into the C arrays. In the hex string the MSB is on the left, but in the C array definition the MSB is on the right, so you should flip the arrays.

The second case should not work correctly either, because the data you entered is larger than the modulus, which is out of spec according to the table.
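
As a rough illustration of the two points above, the sketch below flips a byte array and checks the two constraints mentioned in this thread (odd modulus, operand smaller than the modulus) before the values are handed to the PKA. Both function names are made up for this example. The check assumes MSB-first arrays of equal length; which byte order HAL_PKA_ModExp itself expects should be confirmed against the HAL documentation, and reverse_bytes converts between the two orders if needed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reverse a byte array in place (converts between MSB-first and LSB-first). */
void reverse_bytes(uint8_t *array, size_t length) {
    if (length < 2) {
        return;
    }
    for (size_t lo = 0, hi = length - 1; lo < hi; lo++, hi--) {
        uint8_t tmp = array[lo];
        array[lo] = array[hi];
        array[hi] = tmp;
    }
}

/* Hypothetical pre-check for modular exponentiation inputs.
 * Assumes both arrays are MSB-first and have the same length. */
bool modexp_inputs_in_spec(const uint8_t *operand, const uint8_t *modulus,
                           size_t length) {
    if (length == 0) {
        return false;
    }
    /* The least significant byte is the last one in MSB-first order. */
    bool modulus_is_odd = (modulus[length - 1] & 1u) != 0;
    /* For equal-length MSB-first arrays, memcmp order is numeric order. */
    bool operand_below_modulus = memcmp(operand, modulus, length) < 0;
    return modulus_is_odd && operand_below_modulus;
}
```

Running such a check on the host or in a debug build before calling the PKA makes an out-of-spec input show up as a clear failure instead of a timeout.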