2020-03-26 02:31 AM
As the title says, I'm trying to use a model with integer weights in cube_ai to improve speed and reduce size.
I converted the original keras model to int operations using TFtiny, but when I try to import it, cube_ai throws an error warning me that only float32 operations are allowed. I saw in the user manual and in the generated code that int operations should be supported, isn't this the case? If so, how should I go about converting my model to only integer operations?
2020-03-26 02:57 AM
I assume you used the TensorFlow Lite converter utility to convert your original floating-point Keras model. Which configuration did you use? Post-training quantization with a representative dataset should be used. For example, the simple OPTIMIZE_FOR_SIZE option quantizes only the weights to reduce the size of the generated tflite file, but 32-bit float operators are still used at run time. This is not supported by X-CUBE-AI: the conversion of the weights to float is done during the code generation.
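To illustrate why a representative dataset is needed for full integer quantization, here is a simplified sketch of int8 affine quantization, where real_value = scale * (q - zero_point). The function names are hypothetical, and the min/max calibration shown here is an assumption made for illustration; the calibration done by the TensorFlow Lite converter is more elaborate.

```python
def quant_params(samples, qmin=-128, qmax=127):
    """Derive an int8 scale and zero point from observed float values."""
    lo = min(min(samples), 0.0)  # the range must contain 0.0
    hi = max(max(samples), 0.0)
    scale = (hi - lo) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # all samples are zero; any scale works
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to int8, clamping to the representable range."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an int8 value back to its approximate float value."""
    return scale * (q - zero_point)


# Hypothetical activation values seen while running calibration samples.
observed = [-1.0, 0.5, 3.0]
scale, zero_point = quant_params(observed)
q = quantize(0.5, scale, zero_point)
approx = dequantize(q, scale, zero_point)
```

Weights are known at conversion time, so their range can be computed directly. Activation ranges depend on the input data, which is why the converter needs sample inputs to quantize the operators as well.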
Otherwise, you can find more information in the pack: C:\Users\<user_name>\STM32Cube\Repository\Packs\STMicroelectronics\X-CUBE-AI\5.0.0\Documentation\index.html.
In particular, see the "quantize command" in the Command Line article.
Typical Python snippet to convert a Keras model (refer to the TensorFlow Lite converter documentation; some parameters can change according to the TF version):
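import tensorflow as tf

# representative_dataset_gen must be defined by you: a generator yielding calibration inputs
converter = tf.lite.TFLiteConverter.from_keras_model_file(<keras_model_path>)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enable post-training quantization
converter.representative_dataset = representative_dataset_gen
# In more recent TF versions, use converter.target_spec.supported_ops instead of target_ops
converter.target_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
quant_model = converter.convert()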
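The snippet above relies on a representative_dataset_gen callable that is not shown. A minimal sketch could look like the following, assuming a single float32 input of shape (1, 28, 28, 1) and 100 calibration samples. The shape, the sample count, and the random placeholder data are all assumptions; use real samples from your training or validation set, preprocessed exactly as at training time.

```python
import numpy as np

# Assumption: one float32 input tensor of shape (1, 28, 28, 1).
INPUT_SHAPE = (1, 28, 28, 1)
NUM_CALIBRATION_STEPS = 100

# Placeholder data only; replace with real, preprocessed samples.
calibration_data = np.random.rand(
    NUM_CALIBRATION_STEPS, *INPUT_SHAPE[1:]
).astype(np.float32)


def representative_dataset_gen():
    """Yield one sample at a time as a list with one array per model input."""
    for i in range(NUM_CALIBRATION_STEPS):
        # Slicing with i:i + 1 keeps the leading batch dimension of size 1.
        yield [calibration_data[i:i + 1]]


# After conversion, quant_model holds the serialized model as bytes.
# The file name below is hypothetical:
# with open("model_int8.tflite", "wb") as f:
#     f.write(quant_model)
```

The saved tflite file is then the one to import into X-CUBE-AI.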