Optee PKCS11 TA Performance really bad!

AZaki.2
Associate II

Hello, I use OP-TEE on an stm32mp157f-dk2 board. The versions from the corresponding BSP are:

  • optee-os 3.16.0-stm32mp
  • u-boot v2021.10-stm32mp
  • Linux v5.15-stm32mp

All my changes are committed and built using a Yocto meta-layer: https://github.com/embetrix/meta-stm32mp15x

The OP-TEE build config is described here: https://github.com/embetrix/meta-stm32mp15x/blob/kirkstone/recipes-security/optee/optee-os-stm32mp_3.16.0.bb#L33

I enabled the PKCS11 TA, which by the way is not enabled by default, and gave it a try:

EC Prime256 Keypair generation:

# time pkcs11-tool --keypairgen --key-type EC:prime256v1 --label "testkeyEC" --id 1 --login --usage-sign --module /usr/lib/libckteec.so.0
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; EC
 label:   testkeyEC
 ID:     01
 Usage:   sign
 Access:   sensitive, always sensitive, never extractable, local
Public Key Object; EC EC_POINT 256 bits
 EC_POINT:  044104d7506303c183c36445ef2d5161a5cfe1effaeb12a7b41ef458bc27811d2ddd915518917cd385ec3572032483a6a2efbeb539f585be9d443754862716fabc609d
 EC_PARAMS: 06082a8648ce3d030107
 label:   testkeyEC
 ID:     01
 Usage:   verify
 Access:   local
real  1m 4.92s
user  0m 0.01s
sys   0m 31.37s

RSA 2048 Keypair generation:

# time pkcs11-tool --keypairgen --key-type RSA:2048 --label "testkeyRSA" --id 2  --login --usage-sign  --module /usr/lib/libckteec.so.0 
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; RSA 
  label:      testkeyRSA
  ID:         02
  Usage:      sign
  Access:     sensitive, always sensitive, never extractable, local
Public Key Object; RSA 2048 bits
  label:      testkeyRSA
  ID:         02
  Usage:      verify
  Access:     local
real    0m 43.02s
user    0m 0.00s
sys     0m 20.82s

It takes way too long for any real-world application :( and it is strange, by the way, that the EC prime256v1 operation takes longer than RSA 2048!

For the sake of comparison, I tried the official mainline OP-TEE build using the https://github.com/OP-TEE/manifest/blob/master/stm32mp1.xml manifest.

I got much better times!

EC Prime256 Keypair generation:

# time pkcs11-tool --keypairgen --key-type EC:prime256v1 --label "testkeyEC" --id 1  --login --usage-sign  --module /usr/lib/libckteec.so.0
D/TC:? 0 tee_ta_init_session_with_context:624 Re-open TA fd02c9da-306c-48c7-a49c-bbd827ae86ee
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; EC
  label:      testkeyEC
  ID:         01
  Usage:      sign
  Access:     sensitive, always sensitive, never extractable, local
Public Key Object; EC  EC_POINT 256 bits
  EC_POINT:   0441045172428126d0dd3db11d2aaaaf7f7ad5fb4dddc0ad932f12145c6d42306c5a6212d71d9ab5378400c7bced1d31060b881bac7e6ebf66d88e238327920ec2f477
  EC_PARAMS:  06082a8648ce3d030107
  label:      testkeyEC
  ID:         01
  Usage:      verify
  Access:     local
D/TC:? 0 tee_ta_close_session:529 csess 0x2ffce880 id 1
D/TC:? 0 tee_ta_close_session:548 Destroy session
real    0m 4.14s
user    0m 0.00s
sys     0m 3.96s

RSA 2048 Keypair generation:

# time pkcs11-tool --keypairgen --key-type RSA:2048 --label "testkeyRSA" --id 2 --login --usage-sign  --module /usr/lib/libckteec.so.0
D/TC:? 0 tee_ta_init_session_with_context:624 Re-open TA fd02c9da-306c-48c7-a49c-bbd827ae86ee
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; RSA 
  label:      testkeyRSA
  ID:         02
  Usage:      sign
  Access:     sensitive, always sensitive, never extractable, local
Public Key Object; RSA 2048 bits
  label:      testkeyRSA
  ID:         02
  Usage:      verify
  Access:     local
D/TC:? 0 tee_ta_close_session:529 csess 0x2ffce880 id 1
D/TC:? 0 tee_ta_close_session:548 Destroy session
real    0m 15.59s
user    0m 0.00s
sys     0m 15.43s
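
To put numbers on the difference, here is a rough sketch that converts the `real` values printed by `time` above into seconds and computes how many times longer the ST BSP build takes than mainline. The helper names `to_seconds` and `slowdown` are made up for this illustration, and the script assumes `awk` is available.

```sh
#!/bin/sh
# Hypothetical helpers, not part of any OP-TEE or OpenSC tooling.

# Convert a busybox-style time value such as "1m 4.92s" into seconds.
to_seconds() {
    printf '%s\n' "$1" | awk '{
        m = $1; s = $2
        sub(/m$/, "", m)   # strip the trailing "m" from the minutes field
        sub(/s$/, "", s)   # strip the trailing "s" from the seconds field
        printf "%.2f\n", m * 60 + s
    }'
}

# Ratio of two durations in seconds, rounded to one decimal place.
slowdown() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# "real" values copied from the runs above.
ec_bsp=$(to_seconds "1m 4.92s")
ec_mainline=$(to_seconds "0m 4.14s")
rsa_bsp=$(to_seconds "0m 43.02s")
rsa_mainline=$(to_seconds "0m 15.59s")

echo "EC prime256v1: ST BSP takes $(slowdown "$ec_bsp" "$ec_mainline")x as long as mainline"
echo "RSA 2048:      ST BSP takes $(slowdown "$rsa_bsp" "$rsa_mainline")x as long as mainline"
```

With the figures above this gives roughly 15.7x for EC prime256v1 and 2.8x for RSA 2048, so the EC case is hit much harder than the RSA case. These are single runs, so the ratios are only indicative.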

At the moment I'm stuck with the latest official ST BSP release for U-Boot and the kernel, and with the new mainline OP-TEE 3.20 on top of that I cannot even boot the board.

ST's latest OP-TEE release is still 3.16.0-stm32mp, so my question is whether there are ways to tweak OP-TEE and remove bottlenecks to obtain better PKCS11 performance.

AZaki.2
Associate II

I asked the same question on the mainline OP-TEE OS GitHub:

https://github.com/OP-TEE/optee_os/issues/5918

There is a related issue:

https://github.com/OP-TEE/optee_os/issues/5915

But this problem looks specific to the STM32MP1 and the official ST BSP.

Erwan SZYMANSKI
ST Employee

Hello @AZaki.2​ ,

This PKCS11 use case performance issue is a known limitation specific to the STM32MP15.

Performance is impacted by the OP-TEE pager, and the impact depends on the tasks that OP-TEE has to handle in parallel. For comparison, this is not the case on the STM32MP13, which does not use the OP-TEE pager.

To go into more detail: we do not have a complete workaround that would improve performance significantly. However, reducing the workload of OP-TEE can limit the degradation. Removing debug from OP-TEE will already improve performance.
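
As a hedged sketch of what removing debug could look like in practice: the variables below are standard OP-TEE OS build options (`CFG_TEE_CORE_LOG_LEVEL`, `CFG_TEE_TA_LOG_LEVEL`, `CFG_TEE_CORE_DEBUG`). How they are passed in is an assumption here; in a Yocto recipe they would typically be appended to the make arguments of the OP-TEE recipe, and the exact mechanism depends on the layer. The name `OPTEE_NODEBUG_FLAGS` is made up for this example.

```sh
#!/bin/sh
# Hypothetical variable name; the CFG_* options themselves come from OP-TEE OS.
#   CFG_TEE_CORE_LOG_LEVEL=0  no trace output from the OP-TEE core
#   CFG_TEE_TA_LOG_LEVEL=0    no trace output from trusted applications
#   CFG_TEE_CORE_DEBUG=n      disable core debug checks
OPTEE_NODEBUG_FLAGS="CFG_TEE_CORE_LOG_LEVEL=0 CFG_TEE_TA_LOG_LEVEL=0 CFG_TEE_CORE_DEBUG=n"

# Only print the resulting command line; platform and toolchain arguments
# are deliberately left out because they depend on the BSP in use.
printf 'make %s\n' "$OPTEE_NODEBUG_FLAGS"
```

A quick sanity check after rebuilding is that lines starting with `D/TC:` no longer appear on the console during a `pkcs11-tool` run.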

If you do not use the M4 coprocessor, you can also allocate the M4 dedicated memory to OP-TEE, and thereby get better results for key generation.

I hope this information helps you move forward.

Kind regards,

Erwan.

In order to give better visibility on the answered topics, please click on 'Accept as Solution' on the reply which solved your issue or answered your question.
AZaki.2
Associate II

Hello @Erwan SZYMANSKI,

Luckily the M4 is not required for my application. How can I allocate the M4 memory to OP-TEE? Is this change required only in U-Boot, or is it also necessary to change other BSP components?

Thank you for your support.

Best regards

Hello @AZaki.2​ ,

You can find some information about memory mapping on the wiki, for example in this article: https://wiki.st.com/stm32mpu/wiki/STM32MP15_RAM_mapping

Just for information, I ran the same test on an STM32MP135F-DK board for comparison. The results are below:

root@stm32mp13-disco:~# 12341234 --keypairgen --label testkey --key-type EC:prime256v1
-sh: 12341234: not found

root@stm32mp13-disco:~# time pkcs11-tool --module /usr/lib/libckteec.so.0 --label testtoken --login --pin 12341234 --keypairgen --label testkey --key-type rsa:2048
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; RSA 
 label:   testkey
 Usage:   decrypt, sign
 Access:   sensitive, always sensitive, never extractable, local
Public Key Object; RSA 2048 bits
 label:   testkey
 Usage:   encrypt, verify
 Access:   local
real  0m 5.63s
user  0m 0.00s
sys   0m 4.57s

root@stm32mp13-disco:~# time pkcs11-tool --module /usr/lib/libckteec.so.0 --label testtoken --login --pin 12341234 --keypairgen --label testkey --key-type ec:prime256v1
Using slot 0 with a present token (0x0)
Key pair generated:
Private Key Object; EC
 label:   testkey
 Usage:   sign, derive
 Access:   sensitive, always sensitive, never extractable, local
Public Key Object; EC EC_POINT 256 bits
 EC_POINT:  044104179f72afa2693f3fe356fa652cd60379fb235415b19d5ed82ac82a26a91ae5ed1b24be2ceee5acff702fe77d71bbcb37438a278ebcabb69ef30ef660ac877a41
 EC_PARAMS: 06082a8648ce3d030107
 label:   testkey
 Usage:   verify, derive
 Access:   local
real  0m 0.97s
user  0m 0.01s
sys   0m 0.20s

Kind regards,

Erwan.

Rajan Soma
Associate II

How long does it take you to generate RSA 2048-bit keys on the U5 with the built-in PKA?