Can the STM32 CRC peripheral be made to work with the CRC-15_CAN polynomial?

Can the CRC for the ADBMS1818 (and other Analog Devices BMS parts) be generated using the STM32 CRC peripheral?

The ADBMS1818 datasheet gives a 15-bit polynomial for the CRC:

x^15 + x^14 + x^10 + x^8 + x^7 + x^4 + x^3 + 1

Other sources call this a CAN-15-CRC polynomial, e.g., the Wikipedia article Cyclic redundancy check.

Wikipedia lists this polynomial as "even", and the Reference Manual for the STM32L431 says that the CRC peripheral does not work for even polynomials. However, neither source makes clear exactly what it means by "even," so it is not clear whether the 'L431 CRC can handle this polynomial.

However, if the problem is odd/even, there might be some tricks to make it work, e.g. reversal of the polynomial with a zero added, but I'm not sure if that is possible. I've made some attempts that have not been successful.

Finally, the datasheet has an example software routine for generating the CRC. It represents the polynomial as 0x4599. However, Wikipedia and its references show it as 0xC599 (16 bits!).

For a software implementation the usual table lookup is probably satisfactory. However, if the 'L431 hardware can be used, it would save the memory for the table and give a small improvement in computation time.


Pretty sure the ST hardware doesn't explicitly support 15-bit, but it might work if you get it set up well and mask/shift the answer, really depends on the feed direction, alignment and injection point.

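0x4599 and 0xC599 are basically the same polynomial. The high-order bit is typically left implicit: 0xC599 writes the x**15 term (2**15) out, which still fits in a 16-bit number, while 0x4599 drops it. For a 16-bit polynomial the x**16 term (2**16) would not fit, but it can be seen as the carry out as the register shifts.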
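
The relationship between the two constants can be checked by building the bit mask straight from the exponent list. This is only a minimal sketch; `poly_from_exponents` is a hypothetical helper name, not something from the datasheet or from ST.

```c
#include <stdint.h>

/* Hypothetical helper: set one bit per exponent in the list,
   so bit i of the result is the coefficient of x^i. */
static uint32_t poly_from_exponents(const int *exps, int count)
{
    uint32_t mask = 0;
    for (int i = 0; i < count; i++)
        mask |= (uint32_t)1 << exps[i];
    return mask;
}

/* Drop the top (x^15) term, as the datasheet routine does. */
static uint32_t drop_x15(uint32_t poly)
{
    return poly & 0x7FFFu;
}
```

With the exponents {15, 14, 10, 8, 7, 4, 3, 0} the helper returns 0xC599, and `drop_x15` turns that into 0x4599.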

The STM32 implementation isn't a rocket, the bus takes at least 4 cycles.

Do you have some example test patterns?

Thanks for the response.

I tried shifting 0x4599 left one bit, making it 0x8B32. With the seed/initial shifted left one bit, the result matches the software routine's output when the input data is all zeroes and various lengths, but fails if there is a 1 bit in the data.

As for an example test pattern, I've been using the ADBMS1818 datasheet example of a two-byte {0x00, 0x01} input producing a 0x3D6E output, which uses the polynomial 0x4599 with a seed/initial of 0x10, and shifts the 15b result left by 1.

One possibility that I haven't investigated is whether two CRCs could be generated using the 8b and 7b polynomial size selections and then combined. It would depend on being able to factor the 15b polynomial into 8b and 7b polynomials that could be multiplied. ...
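
Whether the 15b polynomial factors that way can at least be checked with carry-less (GF(2)) multiplication. The sketch below assumes the candidate factors 0x19B (degree 8) and 0x8F (degree 7); those constants and the name `gf2_mul` are illustrative choices, not values from the datasheet. It only checks the factorization, and says nothing about whether two hardware CRCs over the factors could be recombined into the 15b remainder.

```c
#include <stdint.h>

#define FACTOR_8B 0x19Bu  /* x^8 + x^7 + x^4 + x^3 + x + 1 (assumed candidate) */
#define FACTOR_7B 0x08Fu  /* x^7 + x^3 + x^2 + x + 1       (assumed candidate) */

/* Carry-less multiply: bit i of each value is the coefficient of x^i,
   and addition of coefficients is XOR. */
static uint32_t gf2_mul(uint32_t a, uint32_t b)
{
    uint32_t product = 0;
    while (b) {
        if (b & 1u)
            product ^= a;   /* add a * x^k for every set bit k of b */
        a <<= 1;
        b >>= 1;
    }
    return product;
}
```

`gf2_mul(FACTOR_8B, FACTOR_7B)` comes out as 0xC599, the polynomial with the x^15 term written out. The degree-8 factor itself is (x + 1) times x^7 + x^3 + 1, that is `gf2_mul(0x3, 0x89)`.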

The speed isn't a big issue in this application, and saving flash for the lookup table is not likely to be critical, but could become a factor in staying within the limits of the flash for a 'B, 128K flash, part. A simple bit-by-bit shifting computation saves the table, if speed is not important, so the issue for this application is somewhat academic (and also interesting).
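
For reference, the bit-by-bit computation mentioned above could look like the following. It is a sketch assuming the MSB-first convention and the 0x10 seed described in this thread; the function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define PEC15_POLY 0x4599u  /* x^15 term implicit */
#define PEC15_SEED 0x0010u  /* initial value from the datasheet example */

/* Bit-by-bit CRC-15, MSB first, no table. Returns the 15-bit remainder. */
static uint16_t crc15_bitwise(uint16_t crc, const uint8_t *buf, size_t len)
{
    while (len--) {
        crc ^= (uint16_t)(*buf++ << 7);       /* align the byte with bits 14..7 */
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x4000u)                /* top bit of the 15-bit register */
                crc = (uint16_t)((crc << 1) ^ PEC15_POLY);
            else
                crc = (uint16_t)(crc << 1);
            crc &= 0x7FFFu;                   /* keep only 15 bits */
        }
    }
    return crc;
}

/* The 16b value used on the wire: 15-bit CRC shifted left by one, LSB = 0. */
static uint16_t pec15(const uint8_t *buf, size_t len)
{
    return (uint16_t)(crc15_bitwise(PEC15_SEED, buf, len) << 1);
}
```

For the datasheet pattern {0x00, 0x01}, `pec15` returns 0x3D6E.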

uint16_t Quick_CRC_Calc15Bits(uint16_t crc, int Size, uint8_t *Buffer)
{
  static const uint16_t CrcTable[] = { // Nibble Table for 0x4599, 16 entries
    0x0000,0xC599,0xCEAB,0x0B32,0xD8CF,0x1D56,0x1664,0xD3FD,
    0xF407,0x319E,0x3AAC,0xFF35,0x2CC8,0xE951,0xE263,0x27FA };
  while(Size--)
  {
    crc = crc ^ (*Buffer++ << (15-8)); // Align upper bits
    crc = (crc << 4) ^ CrcTable[(crc >> (15-4)) & 0xF]; // Process byte 4-bits at a time
    crc = (crc << 4) ^ CrcTable[(crc >> (15-4)) & 0xF];
  }
  return(crc & 0x7FFF);
}
int main(void)
{
  uint8_t data[] = {0x00, 0x01 };
  printf("crc=%04X Quick\n", Quick_CRC_Calc15Bits(0x0010, sizeof(data), data) << 1);
}

uint16_t Fast_CRC_Calc15Bits(uint16_t crc, int Size, uint8_t *Buffer)
{
  static const uint16_t CrcTable[] = { // Byte Table for 0x4599, 256 entries
    // ... entries 0x00-0xF7 not shown ...
    0x5368,0x96F1,0x9DC3,0x585A,0x8BA7,0x4E3E,0x450C,0x8095 };
  while(Size--)
  {
    crc = crc ^ (*Buffer++ << (15-8)); // Align upper bits
    crc = (crc << 8) ^ CrcTable[(crc >> (15-8)) & 0xFF];
  }
  return(crc & 0x7FFF); // Table should keep high-order bit clear
}
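
The byte table needs 256 entries, and they can be generated from the polynomial rather than typed in. The following is a sketch under the same MSB-first assumptions as the routines above; `crc15_make_table` and `crc15_table` are hypothetical names. Every entry is masked to 15 bits, so bit 15 is always clear here, whereas some entries in the listing above carry a stray bit 15 that the final `& 0x7FFF` discards.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill a 256-entry byte table for the 15-bit polynomial 0x4599.
   Entry i is the 15-bit remainder after clocking byte i through
   an otherwise empty register, MSB first. */
static void crc15_make_table(uint16_t table[256])
{
    for (int i = 0; i < 256; i++) {
        uint16_t r = (uint16_t)(i << 7);              /* byte in bits 14..7 */
        for (int bit = 0; bit < 8; bit++)
            r = (r & 0x4000u) ? (uint16_t)(((r << 1) ^ 0x4599u) & 0x7FFFu)
                              : (uint16_t)((r << 1) & 0x7FFFu);
        table[i] = r;
    }
}

/* Byte-at-a-time CRC-15 using a table built by crc15_make_table(). */
static uint16_t crc15_table(uint16_t crc, const uint8_t *buf, size_t len,
                            const uint16_t table[256])
{
    while (len--) {
        crc ^= (uint16_t)(*buf++ << 7);
        crc = (uint16_t)(((crc << 8) ^ table[(crc >> 7) & 0xFF]) & 0x7FFFu);
    }
    return crc;
}
```

The eight entries shown in the listing above agree, in their low 15 bits, with entries 0xF8 to 0xFF of the generated table. Generating the table at start-up trades 512 bytes of RAM for the flash the constant table would occupy.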

// The ADBMS1818 datasheet shows a 15 bit polynomial for the CRC as
//  x^15 + x^14 + x^10 + x^8 + x^7 + x^4 + x^3 + 1
// Other sources call this a CAN-15-CRC polynomial
// As for an example test pattern, we'll be using the ADBMS1818 datasheet
//  example of a two byte {0x00, 0x01} input producing a 0x3D6E output,
//  which uses the polynomial 0x4599 with seed/initial of 0x10, and shifts
//  the 15b result by 1. The high-order 15-bits will already be suitably
//  aligned.
// Copyright (C) 2022 Clive Turvey (aka Tesla DeLorean,
//  All Rights Reserved
void Crc15Test(void)
{
  uint8_t test1[] = {0x00, 0x01 }; // 0x3D6E
  uint8_t test2[] = {0x11, 0x22, 0x33, 0x44, 0x55 }; // 0x7876
  /* CRC handler declaration */
  CRC_HandleTypeDef CrcHandle = {0};
  /* CRC Peripheral clock enable */
  __HAL_RCC_CRC_CLK_ENABLE();
  /*##-1- Configure the CRC peripheral #######################################*/
  CrcHandle.Instance = CRC;
  /* The default polynomial is not used. It must be defined in CrcHandle.Init.GeneratingPolynomial */
  CrcHandle.Init.DefaultPolynomialUse    = DEFAULT_POLYNOMIAL_DISABLE;
  /* Set the value of the polynomial */
  CrcHandle.Init.GeneratingPolynomial    = (0x4599 << 1); // 15-bit, top aligned
  /* The user-defined generating polynomial generates a 16-bit long CRC */
  CrcHandle.Init.CRCLength               = CRC_POLYLENGTH_16B; // Actually 15-bit, we use the high order bits
  /* The default init value is not used */
  CrcHandle.Init.DefaultInitValueUse     = DEFAULT_INIT_VALUE_DISABLE;
  /* The user-defined initialization value */
  CrcHandle.Init.InitValue               = (0x0010 << 1); // 15-bit top aligned
  /* The input data are not inverted */
  CrcHandle.Init.InputDataInversionMode  = CRC_INPUTDATA_INVERSION_NONE;
  /* The output data are not inverted */
  CrcHandle.Init.OutputDataInversionMode = CRC_OUTPUTDATA_INVERSION_DISABLE;
  /* The input data are 8-bit long */
  CrcHandle.InputDataFormat              = CRC_INPUTDATA_FORMAT_BYTES;
  if (HAL_CRC_Init(&CrcHandle) != HAL_OK)
  {
    /* Initialization Error */
    Error_Handler(__FILE__, __LINE__);
  }
  printf("CRC-15 %04X TEST1 3D6E?\n", (unsigned)HAL_CRC_Calculate(&CrcHandle, (uint32_t *)test1, sizeof(test1)));
  printf("CRC-15 %04X TEST2 7876?\n", (unsigned)HAL_CRC_Calculate(&CrcHandle, (uint32_t *)test2, sizeof(test2)));
}
// Output test on STM32L4R5ZI
//  CRC-15 3D6E TEST1 3D6E?
//  CRC-15 7876 TEST2 7876?

And then I thought, perhaps I can do this with other unsupported-width CRCs that I often encounter.


I'm playing with a stacked deck here: these happen to fit really well with the hardware shift direction, the byte-wise usage, and the MSB take-up of the STM32 implementation. Others could probably be accommodated, but would need a lot of mental gymnastics: reversing the input or output, etc.

Thanks again. The different approaches I now have all agree. Works!

I was close, but I wasn't using HAL_CRC_Calculate and my casting of the pointer was setting up a word rather than a byte pointer.

So, the answer to the posted question is "yes," and it is done by shifting the polynomial and seed/initial left one bit.
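
Why the shift works can be seen in a software model: a 16b MSB-first CRC whose polynomial and seed are both doubled behaves like the 15b register held one bit to the left, with bit 0 always zero. The sketch below is a stand-in for the peripheral, assuming no input or output reversal (as configured in this thread); it is not a description of the hardware internals, and the function names are invented.

```c
#include <stddef.h>
#include <stdint.h>

/* Generic MSB-first 16-bit CRC, fed one byte at a time,
   with no bit reversal and no final XOR. */
static uint16_t crc16_msb_first(uint16_t poly, uint16_t crc,
                                const uint8_t *buf, size_t len)
{
    while (len--) {
        crc ^= (uint16_t)(*buf++ << 8);       /* byte enters at bits 15..8 */
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ poly)
                                  : (uint16_t)(crc << 1);
    }
    return crc;
}

/* ADBMS1818 PEC: 15-bit polynomial and seed, both shifted left one bit. */
static uint16_t pec15_via_16bit(const uint8_t *buf, size_t len)
{
    return crc16_msb_first(0x4599u << 1, 0x0010u << 1, buf, len);
}
```

The result already has the 16b format the ADBMS1818 expects, so no further shift is needed: {0x00, 0x01} gives 0x3D6E and {0x11, 0x22, 0x33, 0x44, 0x55} gives 0x7876, the same values reported for the hardware test above.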

I did some machine-cycle comparisons of the 256-entry (byte) table lookup, the 16-entry (nibble) table lookup, HAL_CRC_Calculate, and using the CRC peripheral without HAL. For a six-byte ADBMS1818 command the cycle counts were 111, 155, 114, and 24, respectively.

The problem with HAL_CRC_Calculate (byte format) is that it spends machine cycles packing the bytes into words, and into half-words when possible. A "for" or "while" loop that sends one byte at a time directly is faster: 86 cycles for six bytes.

HAL also brings in a lot of code. I didn't try to count the number of bytes used by MX_CRC_Init(), but it took 283 machine cycles. Estimating roughly two bytes per machine cycle gives 566 bytes, although the cycle count somewhat overstates the number of half-words because of branches, pushes, and pops (and I didn't see any loops when looking at the code). For convenience the test was run in a hacked FreeRTOS program, and the size jumped 2048 bytes when the CRC was activated in STM32CubeMX. I suspect the additional code merely pushed the code size into another block.

The advantage of using HAL is that it takes care of the low-level setup, and one gains flexibility when moving to different STM32 versions and different applications. However, for a specific application such as this, going bare metal has size and speed advantages. Getting the casts, volatile qualifiers, etc., right for the pointers can be a bit tricky. For this application, here is my non-HAL implementation for the STM32L431:

/******************************************************************************
* File Name          : pec15_reg.c
* Date First Issued  : 06/11/2022
* Description        : ADBMS1818 PEC computation: non-HAL register direct
*******************************************************************************/
#include "pec15_reg.h"
/* *************************************************************************
 * void pec15_reg_init (void);
 *  @brief  : Initialize RCC and CRC registers for ADBMS1818 CRC-15 computation
 * *************************************************************************/
#define CRCBASE ((__IO uint32_t*)0x40023000)
#define SEED 0x10 // ADBMS1818 PEC15 initial
void pec15_reg_init (void)
{
  __IO uint32_t* rccbase = (__IO uint32_t*)0x40021000;
  /* Bit 12 CRCEN: CRC clock enable */
  *(rccbase+0x12) |= 0x1000; // RCC_AHB1ENR: set CRCEN bit
  /* Set CRC registers. */
  *(CRCBASE+4) = SEED*2; // CRC_INIT: seed * 2
  *(CRCBASE+5) = 0x8B32; // CRC_POL: polynomial * 2
}
/* *************************************************************************
 * uint16_t pec15_reg (uint8_t *pdata , int len);
 *  @brief  : Reset and compute CRC
 *  @param  : pdata = pointer to input bytes
 *  @param  : len = number of bytes (at least 1)
 *  @return : CRC-15 * 2 (ADBMS1818 16b format)
 * *************************************************************************/
uint16_t pec15_reg (uint8_t *pdata , int len)
{
  /* Control register configuration includes reset. */
  *(CRCBASE+2) = 0x9; // CRC_CR: 16b + reset
  uint8_t* pend = pdata + len;
  do
  {
     *(__IO uint8_t*)CRCBASE = *pdata++; // Byte-wide write to CRC_DR
  } while (pdata < pend);
  return *CRCBASE;
}
/******************************************************************************
* File Name          : pec15_reg.h
* Date First Issued  : 06/11/2022
* Description        : ADBMS1818 PEC computation: non-HAL register direct
*******************************************************************************/
#ifndef __PEC15_REG
#define __PEC15_REG
#include <stdint.h>
/* *************************************************************************/
 void pec15_reg_init (void);
 /*	@brief	: Initialize RCC and CRC registers for CRC-15 computation
 * *************************************************************************/
 uint16_t pec15_reg (uint8_t *pdata , int len);
/*	@brief	: Reset and compute CRC
 *  @param  : pdata = pointer to input bytes
 *  @param  : len = number of bytes
 *  @return : CRC-15 * 2 (ADBMS1818 16b format)
 * *************************************************************************/
#endif

If you always calculate the CRC on 6 bytes, you can consider an unrolled loop. You can also consider the packing. I don't know why the ST implementation is inefficient, but I am not going to investigate; I really don't care about Cube.

 > The advantage of using HAL is that it takes care of the low level setup and one gains flexibility when it comes to moving to different STM32 versions

Yes, this is how ST advertises it, and academia for some inexplicable reason echoes that. But if you are willing to read the manual, low-level setup is in most cases trivially simple (how hard was it to set up the CRC?), and portability ends exactly where the hardware starts to differ. Try porting your algorithm to the 'F4 (hint: the CRC in the 'F4 has its polynomial fixed in hardware).