2023-03-27 05:11 AM
Hello, I use OP-TEE on an stm32mp157f-dk2 board, together with the corresponding ST BSP.
All my changes are committed and built using a Yocto meta-layer: https://github.com/embetrix/meta-stm32mp15x
The OP-TEE build config is described here: https://github.com/embetrix/meta-stm32mp15x/blob/kirkstone/recipes-security/optee/optee-os-stm32mp_3.16.0.bb#L33
I enabled the PKCS11 TA, which by the way is not enabled by default, and gave it a try:
EC Prime256 Keypair generation:
# time pkcs11-tool --keypairgen --key-type EC:prime256v1 --label "testkeyEC" --id 1 --login --usage-sign --module /usr/lib/libckteec.so.0
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; EC
label: testkeyEC
ID: 01
Usage: sign
Access: sensitive, always sensitive, never extractable, local
Public Key Object; EC EC_POINT 256 bits
EC_POINT: 044104d7506303c183c36445ef2d5161a5cfe1effaeb12a7b41ef458bc27811d2ddd915518917cd385ec3572032483a6a2efbeb539f585be9d443754862716fabc609d
EC_PARAMS: 06082a8648ce3d030107
label: testkeyEC
ID: 01
Usage: verify
Access: local
real 1m 4.92s
user 0m 0.01s
sys 0m 31.37s
RSA 2048 Keypair generation:
# time pkcs11-tool --keypairgen --key-type RSA:2048 --label "testkeyRSA" --id 2 --login --usage-sign --module /usr/lib/libckteec.so.0
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; RSA
label: testkeyRSA
ID: 02
Usage: sign
Access: sensitive, always sensitive, never extractable, local
Public Key Object; RSA 2048 bits
label: testkeyRSA
ID: 02
Usage: verify
Access: local
real 0m 43.02s
user 0m 0.00s
sys 0m 20.82s
This takes way too long for any real-world application :( and, strangely, the EC prime256 operation takes longer than RSA 2048!
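For anyone who wants to compare such runs numerically, the `real`, `user` and `sys` lines printed by `time` can be converted to seconds with a small helper. This is only a sketch: the function name `parse_time_output` is made up for illustration, and it assumes the `Xm Y.YYs` layout seen in the transcripts above (other shells format `time` output differently).

```python
import re

# Matches lines such as "real 1m 4.92s" or "sys 0m 31.37s" (layout assumed
# from the transcripts in this thread).
_TIME_LINE = re.compile(r"^(real|user|sys)\s+(\d+)m\s*(\d+(?:\.\d+)?)s$")


def parse_time_output(text):
    """Return a dict mapping 'real'/'user'/'sys' to seconds for each line found."""
    result = {}
    for line in text.splitlines():
        match = _TIME_LINE.match(line.strip())
        if match:
            kind, minutes, seconds = match.groups()
            result[kind] = int(minutes) * 60 + float(seconds)
    return result


# Timing lines copied from the EC prime256v1 run above.
sample = "real 1m 4.92s\nuser 0m 0.01s\nsys 0m 31.37s"
for kind, seconds in parse_time_output(sample).items():
    print(f"{kind}: {seconds:.2f} s")
```

Lines that do not look like timing lines (for example the key attributes printed by pkcs11-tool) are simply ignored, so a whole pasted transcript can be fed in as-is.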
For the sake of comparison, I tried the official mainline OP-TEE build using the https://github.com/OP-TEE/manifest/blob/master/stm32mp1.xml manifest.
I got much better times!
EC Prime256 Keypair generation:
# time pkcs11-tool --keypairgen --key-type EC:prime256v1 --label "testkeyEC" --id 1 --login --usage-sign --module /usr/lib/libckteec.so.0
D/TC:? 0 tee_ta_init_session_with_context:624 Re-open TA fd02c9da-306c-48c7-a49c-bbd827ae86ee
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; EC
label: testkeyEC
ID: 01
Usage: sign
Access: sensitive, always sensitive, never extractable, local
Public Key Object; EC EC_POINT 256 bits
EC_POINT: 0441045172428126d0dd3db11d2aaaaf7f7ad5fb4dddc0ad932f12145c6d42306c5a6212d71d9ab5378400c7bced1d31060b881bac7e6ebf66d88e238327920ec2f477
EC_PARAMS: 06082a8648ce3d030107
label: testkeyEC
ID: 01
Usage: verify
Access: local
D/TC:? 0 tee_ta_close_session:529 csess 0x2ffce880 id 1
D/TC:? 0 tee_ta_close_session:548 Destroy session
real 0m 4.14s
user 0m 0.00s
sys 0m 3.96s
RSA 2048 Keypair generation:
# time pkcs11-tool --keypairgen --key-type RSA:2048 --label "testkeyRSA" --id 2 --login --usage-sign --module /usr/lib/libckteec.so.0
D/TC:? 0 tee_ta_init_session_with_context:624 Re-open TA fd02c9da-306c-48c7-a49c-bbd827ae86ee
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; RSA
label: testkeyRSA
ID: 02
Usage: sign
Access: sensitive, always sensitive, never extractable, local
Public Key Object; RSA 2048 bits
label: testkeyRSA
ID: 02
Usage: verify
Access: local
D/TC:? 0 tee_ta_close_session:529 csess 0x2ffce880 id 1
D/TC:? 0 tee_ta_close_session:548 Destroy session
real 0m 15.59s
user 0m 0.00s
sys 0m 15.43s
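To put the difference between the two builds into numbers, here is a short calculation using the `real` times from the four runs quoted above. The dictionary and function names are made up for illustration, and each figure comes from a single run, so the ratios are only indicative (RSA key generation in particular varies from run to run because the prime search is random).

```python
# Wall-clock ("real") seconds copied from the runs quoted in this post.
st_bsp = {"EC prime256v1": 64.92, "RSA 2048": 43.02}    # ST 3.16.0 build
mainline = {"EC prime256v1": 4.14, "RSA 2048": 15.59}   # mainline manifest build


def speedups(slow, fast):
    """Return how many times faster `fast` is than `slow`, per key type."""
    return {name: slow[name] / fast[name] for name in slow}


for name, factor in speedups(st_bsp, mainline).items():
    print(f"{name}: {factor:.1f}x faster on the mainline build")
```

With these numbers the EC case improves far more than the RSA case, which matches the observation that EC being slower than RSA only shows up on the ST build.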
At the moment I'm stuck with the latest official ST BSP release for U-Boot and the kernel; with the new mainline OP-TEE 3.20 on top of it, I cannot even boot the board.
ST's latest OP-TEE release is still 3.16.0-stm32mp, so my question is: are there ways to tweak OP-TEE and remove bottlenecks to obtain better PKCS11 performance?
2023-03-27 05:26 AM
I asked the same question on the mainline OP-TEE OS GitHub:
https://github.com/OP-TEE/optee_os/issues/5918
There is a related issue:
https://github.com/OP-TEE/optee_os/issues/5915
But this problem looks specific to the STM32MP1 and the official ST BSP.
2023-04-03 09:13 AM
Hello @AZaki.2 ,
This PKCS11 performance issue is a known limitation specific to the STM32MP15.
Performance is impacted by the OP-TEE pager, and the impact depends on the tasks that OP-TEE has to handle in parallel. By comparison, this is not the case on the STM32MP13, which does not use the OP-TEE pager.
To go into more detail, we do not have a complete workaround that would improve performance by a large amount. However, reducing the workload of OP-TEE can limit the degradation. Disabling debug in OP-TEE will already improve performance.
If you do not use the M4 coprocessor, you can also allocate the M4 dedicated memory to OP-TEE, which should give better results for key generation.
I hope this information helps you move forward.
Kind regards,
Erwan.
2023-04-03 10:00 AM
hello @Erwan SZYMANSKI,
Luckily the M4 is not required for my application. How can I allocate the M4 memory to OP-TEE? Is this change required only in U-Boot, or is it also necessary to change other BSP components?
Thank you for your support.
Best regards
2023-04-05 02:52 AM
Hello @AZaki.2 ,
You can find some information about memory mapping on the wiki, for example in this article: https://wiki.st.com/stm32mpu/wiki/STM32MP15_RAM_mapping
For information, I ran the same test on an STM32MP135F-DK board for comparison. The results are below:
root@stm32mp13-disco:~# time pkcs11-tool --module /usr/lib/libckteec.so.0 --label testtoken --login --pin 12341234 --keypairgen --label testkey --key-type rsa:2048
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; RSA
label: testkey
Usage: decrypt, sign
Access: sensitive, always sensitive, never extractable, local
Public Key Object; RSA 2048 bits
label: testkey
Usage: encrypt, verify
Access: local
real 0m 5.63s
user 0m 0.00s
sys 0m 4.57s
root@stm32mp13-disco:~# time pkcs11-tool --module /usr/lib/libckteec.so.0 --label testtoken --login --pin 12341234 --keypairgen --label testkey --key-type ec:prime256v1
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; EC
label: testkey
Usage: sign, derive
Access: sensitive, always sensitive, never extractable, local
Public Key Object; EC EC_POINT 256 bits
EC_POINT: 044104179f72afa2693f3fe356fa652cd60379fb235415b19d5ed82ac82a26a91ae5ed1b24be2ceee5acff702fe77d71bbcb37438a278ebcabb69ef30ef660ac877a41
EC_PARAMS: 06082a8648ce3d030107
label: testkey
Usage: verify, derive
Access: local
real 0m 0.97s
user 0m 0.01s
sys 0m 0.20s
Kind regards,
Erwan.
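Since every figure in this thread comes from a single run, repeating each measurement gives a more reliable comparison between boards and builds. The following is a minimal sketch, assuming Python 3 is available on the target (it may not be in a minimal image); `time_command` is a made-up helper name, and the command actually executed here is a harmless placeholder so the sketch runs anywhere.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Run argv `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


# Placeholder command. On the board, one would substitute the pkcs11-tool
# invocation from the thread, e.g.
# ["pkcs11-tool", "--module", "/usr/lib/libckteec.so.0", "--login",
#  "--pin", "12341234", "--keypairgen", "--label", "testkey",
#  "--key-type", "EC:prime256v1"]
# keeping in mind that every run creates a new key pair on the token.
argv = [sys.executable, "-c", "pass"]

samples = time_command(argv, runs=3)
print(f"min {min(samples):.2f}s  "
      f"median {statistics.median(samples):.2f}s  "
      f"max {max(samples):.2f}s")
```

Reporting the median rather than a single value helps most for RSA, where the time to find primes differs from one run to the next.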
2023-05-09 05:27 AM
How long does it take you to generate RSA 2048-bit keys on the U5 with the built-in PKA?