What do you do to significantly reduce code size? STM32F0 with 32KB Flash Overflowed

WhereDidFrankGo
Associate II

Hey there,

I have an STM32F042K6 with 32 KB of flash that drives a whole bunch of peripherals using the STM32 HAL and a lot of code. Now that everything is configured, the image overflows flash by nearly 1.5 KB (region `FLASH' overflowed by 1464 bytes).

I need to significantly reduce the code size. I've already turned on the optimization flags -Os, -fmerge-all-constants, -ffunction-sections and -fdata-sections. What else do you do to reduce code size?

Active peripherals:

ADC1, I2C1, SPI1, DMA for SPI1, Timer 2, Timer 16, DMA for timer 2, UART2, USB CDC as VCP

I'm not using any (emulated) floating point and no division at all which saved a lot of space already, but 99% of the code feels like is taken away by HAL.

Thank you for your help.

Accepted Solutions
berendi
Principal

> 99% of the code feels like is taken away by HAL.

Then don't use it.

Replace as many HAL functions as you can with functions that do nothing else but strictly only what is required for your use case. Remove as many runtime checks as you can.

You know, for example, whether the SPI interface uses 8 or 16 bit data, so there is no need to check it in each operation. You know which timer channels are in use, so there is no need for huge switch statements covering all 4 channels each time. Unless there is a complicated power-saving scheme, you know the frequency your MCU runs on, so replace the HAL_RCC_GetWhateverFreq functions with inline functions that return a single value. Get rid of the peripheral handle structures.
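As a rough sketch of what such stripped-down replacements could look like, here is a fixed-frequency getter and a blocking 8-bit SPI send that skips every runtime check. The names `spi_regs_t`, `app_spi_send_byte`, `app_get_sysclk_hz` and the 48 MHz value are made up for illustration; the register layout and the TXE/BSY bit positions are redeclared locally only to keep the sketch self-contained, and in a real project they would come from the CMSIS device header. It assumes the SPI is already configured for 8-bit frames and enabled.

```c
#include <stdint.h>

/* First four registers of the STM32F0 SPI block, redeclared here so the
 * sketch is self-contained. A real project would use the type from the
 * CMSIS device header instead. */
typedef struct {
    volatile uint32_t CR1; /* 0x00: control register 1 */
    volatile uint32_t CR2; /* 0x04: control register 2 */
    volatile uint32_t SR;  /* 0x08: status register    */
    volatile uint32_t DR;  /* 0x0C: data register      */
} spi_regs_t;

#define APP_SPI_SR_TXE (1u << 1) /* transmit buffer empty */
#define APP_SPI_SR_BSY (1u << 7) /* transfer in progress  */

/* Hypothetical fixed system clock; set this to what your clock tree
 * is actually configured for. */
#define APP_SYSCLK_HZ 48000000u

/* Stands in for a HAL_RCC_Get...Freq call: the compiler can fold this
 * into a constant at every call site. */
static inline uint32_t app_get_sysclk_hz(void)
{
    return APP_SYSCLK_HZ;
}

/* Blocking send of one byte. No handle, no state machine, no data-size
 * check, no timeout: the caller guarantees the configuration. */
static inline void app_spi_send_byte(spi_regs_t *spi, uint8_t byte)
{
    while ((spi->SR & APP_SPI_SR_TXE) == 0u) {
        /* wait for room in the transmit buffer */
    }
    /* Byte-wide access to DR: on STM32F0 a wider write in 8-bit mode
     * would queue two frames because of data packing. */
    *(volatile uint8_t *)&spi->DR = byte;
    while ((spi->SR & APP_SPI_SR_BSY) != 0u) {
        /* wait until the frame has left the shift register */
    }
}
```

On the target you would pass a pointer to the SPI1 register block from the device header. Dropping the timeout is a deliberate trade: it saves code, but a misconfigured peripheral will hang here instead of returning an error.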

Once you are done with this, enable -flto in both the compiler and the linker configuration.

Don't initialize global or static variables to 0 in the variable definitions. It will be done anyway, but with less flash overhead.
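To illustrate where initial values end up, here is a small sketch with made-up variable names. One caveat: with GCC's default -fzero-initialized-in-bss, an explicit `= 0` normally lands in .bss as well, so the flash cost mostly comes from nonzero initializers. Comparing the output of arm-none-eabi-size before and after a change is an easy way to check what your toolchain actually does.

```c
#include <stdint.h>

/* No initializer: placed in .bss. The startup code zeroes it, and no
 * copy of the contents is stored in flash. */
static uint8_t rx_buffer[256];

/* Nonzero initializer: placed in .data. All 256 bytes of initial
 * values are stored in flash and copied to RAM at startup, even
 * though only the first element is nonzero. */
static uint8_t tx_buffer[256] = { 0xFF };

/* Read-only table: with the usual linker scripts it stays in flash
 * (.rodata) and takes no RAM. */
static const uint8_t shift_table[4] = { 1, 2, 4, 8 };
```

If a large buffer only needs a few nonzero bytes, leaving it uninitialized and filling those bytes in an init function is usually cheaper than a static initializer.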

If possible, remove all printf-like and malloc-like functions from the code.

7 REPLIES
hs2
Senior

Assuming a (decent) GCC toolchain, use the lean C library ('--specs=nano.specs').

BTW '-fdata-sections' increased image size a bit for my builds. It depends.

Thanks, that's a great idea to replace the HAL and remove the channel checks.

I'm not using any dynamic memory, nor printf-like stuff. Just memcpy and memcmp, which are barely larger than a for loop.

I've tried enabling -flto, but it never worked on this platform, apparently due to a bug I have absolutely no clue how to fix. Here's what the linker tells me:

Invoking: MCU GCC Linker
arm-none-eabi-gcc -mcpu=cortex-m0 -mthumb -mfloat-abi=soft -specs=nosys.specs -specs=nano.specs -Xlinker -flto -T"../STM32F042K6Tx_FLASH.ld" -Wl,-Map=output.map -Wl,--gc-sections -o "FLClassic.elf" @"objects.list"   -lm
/var/folders/fl/23kz1b8551d0jzdld89y1x780000gn/T//ccaNiwaR.ltrans0.ltrans.o: In function `_exit':
<artificial>:(.text+0x360e): multiple definition of `_exit'
/Applications/Ac6/SystemWorkbench.app/Contents/Eclipse/plugins/fr.ac6.mcu.externaltools.arm-none.macos64_1.17.0.201812190825/tools/compiler/bin/../lib/gcc/arm-none-eabi/7.3.1/../../../../arm-none-eabi/lib/thumb/v6-m/libnosys.a(_exit.o):_exit.c:(.text._exit+0x0): first defined here
collect2: error: ld returned 1 exit status

Do you know how to solve this issue?

Try removing -specs=nosys.specs from the compiler options, there should be a checkbox or something in the project options.

I absolutely love you. Thanks, this fixed my problem and shrank my code down to a whopping 22 KB. I'm gonna try to deploy it now and see how it runs on the microcontroller.

Somehow my application is now stuck in Infinite_Loop and not working, hmm.

Edit: It apparently is this bug: https://bugs.launchpad.net/gcc-arm-embedded/+bug/1747966

But I don't seem to get what exactly should be changed in my project because I'm not familiar with compilers and linkers. Apparently the ${STARTUP} should be included prior to ${OBJ}, but I have no idea how to do that in Settings -> C/C++ Build -> Settings. I'm using SystemWorkbench.

Alright, not the most elegant solution, but manually commenting out the IRQ handlers I use that are defined as .weak in startup.s solved the issue:

https://stackoverflow.com/questions/51946333/arm-none-eabi-g-does-not-correctly-handle-weak-alias-with-flto