
QInt16 quantization support

AMurz.1
Associate II

Hi,

The hardware seems to support 16-bit activations, but according to the ST Edge AI documentation this does not seem to be usable.

We have a model where, at a particular tensor between a Conv and a Sigmoid, the precision loss due to quantization causes a non-negligible accuracy loss at the model output. The Conv layer's output is mostly between -90 and 2, and then goes into the Sigmoid. The low values lose too much precision when quantized, yet they are needed for an accurate Sigmoid output (which drives an audio-related ratio where accuracy matters).
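
To put rough numbers on it (back-of-envelope, assuming a uniform quantization over the observed [-90, 2] range):

# Quantization step size over the observed Conv output range, and its
# worst-case effect on the Sigmoid output (whose slope peaks at 0.25).
lo, hi = -90.0, 2.0
step_int8  = (hi - lo) / 255    # ~0.36 per quantization step
step_int16 = (hi - lo) / 65535  # ~0.0014 per quantization step
print(step_int8 * 0.25)   # up to ~0.09 error on the Sigmoid output
print(step_int16 * 0.25)  # up to ~0.00035 with 16-bit activations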

I've tried these parameters with ONNX Runtime quantization, but ST Edge AI fails:

from onnxruntime.quantization import (
    CalibrationMethod, QuantFormat, QuantType, StaticQuantConfig)

conf = StaticQuantConfig(
    calibration_data_reader=dr,  # my CalibrationDataReader
    quant_format=QuantFormat.QDQ,
    calibrate_method=CalibrationMethod.MinMax,
    activation_type=QuantType.QInt16,
    weight_type=QuantType.QInt8,
    per_channel=True)
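
For completeness, that config is then passed to onnxruntime's quantize() and the resulting model is fed to ST Edge AI (paths below are placeholders):

from onnxruntime.quantization import quantize

# Write the QDQ-quantized model next to the original.
quantize("model.onnx", "model_int16.onnx", conf)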

The error message is:

NOT IMPLEMENTED: Unexpected type for constant input of Dequantize layer (SIGNED, 16 bit, C Size: 16 bits Scales: [1.5259021893143654e-05] Zeros: [-32768] Quantizer: UNIFORM)

The only way I found to fix the accuracy issue is to not quantize the Conv and Sigmoid layers, so they run in software in float32 (at the expense of lower inference speed, since the Conv is done in software).
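
Roughly how I exclude them in the ONNX quantization config (the node names below are placeholders for the actual names in my graph):

conf_mixed = StaticQuantConfig(
    calibration_data_reader=dr,
    quant_format=QuantFormat.QDQ,
    activation_type=QuantType.QInt8,
    weight_type=QuantType.QInt8,
    per_channel=True,
    # Keep these nodes in float32; placeholder node names.
    nodes_to_exclude=["/backbone/conv_out/Conv", "/head/Sigmoid"])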

 

TensorFlow already supports 16-bit activations / 8-bit weights using:

converter.target_spec.supported_ops = [tf.lite.OpsSet.EXPERIMENTAL_TFLITE_BUILTINS_ACTIVATIONS_INT16_WEIGHTS_INT8]

This can be used for the Arm Ethos platform.
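
For reference, a full int16x8 conversion with the TFLite converter looks roughly like this (the model path and the calibration iterable are placeholders):

import tensorflow as tf

def representative_dataset():
    # Placeholder: yield calibration samples shaped like the model input.
    for sample in calibration_samples:
        yield [sample]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.EXPERIMENTAL_TFLITE_BUILTINS_ACTIVATIONS_INT16_WEIGHTS_INT8
]
tflite_model = converter.convert()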

I think ONNX needs opset 21 for this, but I'm not sure.

 

Is there a way to use 16-bit activations? Or will this maybe be implemented later?

Or maybe float16 using MVE in software somehow?

 

Thanks for your support.

Alexis Murzeau

 
