Weights quantization to uint8

AAnci.1
Associate II

Hi,

As the title says, I'm trying to use a model with integer weights in cube_ai for speed and size performance.

I converted the original Keras model to integer operations using TFLite, but when I try to import it, cube_ai throws an error warning me that only float32 operations are allowed. I saw in the user manual and in the generated code that integer operations should be supported; isn't this the case? If so, how should I go about converting my model to integer-only operations?

jean-michel.d
ST Employee

Hi,

I suppose you used the TensorFlow Lite converter utility to convert your original (floating-point Keras) model. Which configuration did you use? Post-training quantization with a representative dataset is what you need. For example, the simple OPTIMIZE_FOR_SIZE option quantizes only the weights to reduce the size of the generated tflite file, but 32-bit float operators are still used at run time. This is not supported by X-CUBE-AI, as the conversion of the weights back to float would have to be done during code generation.
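For reference, here is a minimal sketch of the weight-only configuration described above (attribute names follow the TF 1.x converter API, and <keras_model_path> is a placeholder); this produces the kind of tflite file that X-CUBE-AI rejects:

import tensorflow as tf

# Weight-only ("dynamic range") quantization: the weights are stored as
# integers to shrink the file, but the operators still run in float32.
converter = tf.lite.TFLiteConverter.from_keras_model_file(<keras_model_path>)
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
weight_only_model = converter.convert()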

Otherwise, you can find more information in the pack: C:\Users\<user_name>\STM32Cube\Repository\Packs\STMicroelectronics\X-CUBE-AI\5.0.0\Documentation\index.html.

In particular, see the "quantize command" section of the Command Line article.

br,

Jean-Michel

Typical Python snippet to convert a Keras model (refer to the TensorFlow Lite converter documentation; some parameters can change according to the TF version):

import tensorflow as tf

def representative_dataset_gen():
    # Placeholder: yield a few hundred real input samples, each shaped
    # like the model input and cast to float32.
    for sample in calibration_samples:
        yield [sample]

converter = tf.lite.TFLiteConverter.from_keras_model_file(<keras_model_path>)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enable quantization
converter.representative_dataset = representative_dataset_gen
# Restrict the converter to int8 built-in operators (full integer quantization).
# Older TF versions expose this attribute as converter.target_ops.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
quant_model = converter.convert()
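The converter returns the quantized model as a byte buffer; as a quick follow-up (the output file name here is just illustrative), write it to disk so the resulting .tflite file can be imported into X-CUBE-AI:

# Save the quantized model so it can be loaded into X-CUBE-AI.
with open('model_int8.tflite', 'wb') as f:
    f.write(quant_model)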