
Large compiled application size, is this normal?

Scott Dev
Senior
Posted on February 18, 2017 at 08:47

Hi

  I am just a beginner with the STM32; I have been designing for 8-bit Freescale parts for years. I am using ST's Nucleo-64 board with an STM32L07 processor.

 Using STM32CubeMX I designed a very simple application that uses the system timer and a GPIO interrupt for the push button. I wrote some code (just to get used to the system) using Keil V5. I use the system timer interrupt to blink the LED, and pressing the button changes the blink speed. Within CubeMX I also selected RTC and LPUART, but I have done nothing with them in the code yet. After building and running the application I noticed that the code size is 6420 and the RAM used is 1176. To me this seems a lot for what it does; I am used to having tighter code. The application uses the HAL library. Does this add a large overhead to the code size and RAM used? And am I better off stripping out some of the code that CubeMX creates?

Thanks

Scott

21 REPLIES
Posted on February 19, 2017 at 17:19

No I think there is purposeful casting of the maths related to the clock speed vs baud rate ...

IMHO that would make it even worse.

Such sub-optimal implementations seem to be the price of a 'one-click' software ...

Posted on February 19, 2017 at 17:30

There is supposedly a low-level subset of CubeMX, without such dreadful 'helper functionality'. These files carry a '_ll_' as part of their name.

However, this low-level subset is available (and complete) for just a few selected MCU variants.

I have stated the assumption that ST's hardware/silicon team is far outpacing the Cube development team, and never got any objection from ST staff here. So most probably this is the case. And any experienced developer could make a good guess what that means for the quality & maturity of the Cube software in the foreseeable future ...

Posted on February 22, 2017 at 00:13

Just showing some love for the _ll_ stuff for the F3. Porting from SPL to _ll_ was a cakewalk compared with the effort it took to port SPL-based source to a barely functional HAL implementation.

Posted on February 22, 2017 at 10:25

Yes, the F3 (in particular the F303, as found on the F3 discovery) is one of the MCUs which is blessed with a full LL header set.

I have an F746 Nucleo, and miss those LL headers badly. There do (did) exist just two sorry stub files ...

BTW, downloading hundreds of megabytes for a few LL headers seems somehow disproportionate to me. It would be great if they were available as an extra package, for the brave ones.

Maor Avni
Associate II
Posted on February 22, 2017 at 11:44

First of all you're working with 32-bit code, in contrast to 8-bit code. The coded instructions will be bigger by definition.

Second, if you use Debug compilation, which usually means no code optimizations at all, the code will always be larger than optimized code.

Posted on February 22, 2017 at 12:53

First of all you're working with 32-bit code, in contrast to 8-bit code. The coded instructions will be bigger by definition.

That is not quite true.

The size of an instruction is defined by the number of instruction types to decode (like ADD, SUB, MOV, etc. in their variants), and by the source and destination operand range. E.g. selecting one of 16 ARM core registers requires 4 bits of an instruction.

The same holds true for 8-bit processors. The Z80, for example, had 8 general purpose registers, requiring 3 bits for encoding. The only difference is that an 8-bit core needs more cycles to fetch a multi-byte instruction over its 8-bit data bus. That's why many old cores have a dedicated accumulator register: many operations implicitly use this accumulator, saving bits in instruction size and thus fetches. However, in a greater context, you often need more instructions to achieve certain operations, losing more than you gained before.

Posted on February 22, 2017 at 13:26

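Also, ARM hates misalignment, and the compiler will often add padding bytes to a struct to keep its members aligned.

Do a sizeof() on this struct on your 8-bit MCU and then on the STM32 and note the difference:

#include <stdint.h>

typedef struct {
 uint16_t foo;  /* 2 bytes */
 uint8_t bar;   /* 1 byte; a 32-bit target typically pads after it */
 float baz;     /* 4 bytes, usually 4-byte aligned on ARM */
} foobarbaz;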
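To put numbers on the struct above, here is a minimal sketch that reports how many padding bytes the compiler inserted. It uses `uint8_t` for the one-byte member, and the helper names (`foobarbaz_payload`, `foobarbaz_padding`, `foobarbaz_check_layout`) are made up for illustration. The actual figures depend on the target ABI, so no particular result is claimed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t foo;
    uint8_t  bar;
    float    baz;
} foobarbaz;

/* Sum of the member sizes, i.e. the size with no padding at all. */
size_t foobarbaz_payload(void)
{
    return sizeof(uint16_t) + sizeof(uint8_t) + sizeof(float);
}

/* Number of padding bytes the compiler added for this target. */
size_t foobarbaz_padding(void)
{
    return sizeof(foobarbaz) - foobarbaz_payload();
}

/* Sanity checks that hold on any conforming compiler. */
void foobarbaz_check_layout(void)
{
    assert(offsetof(foobarbaz, foo) == 0);           /* first member is at offset 0 */
    assert(sizeof(foobarbaz) >= foobarbaz_payload()); /* padding can only add bytes */
}
```

Calling `foobarbaz_padding()` on the 8-bit part and on the STM32 (or just looking at `offsetof(foobarbaz, baz)` in the debugger) shows where the extra bytes go.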

Posted on February 22, 2017 at 13:47

This is correct - for other 32-bit architectures (and greater) as well.

However, it has no direct impact on code size here (Flash), rather data size (RAM).

Posted on February 22, 2017 at 15:24

The Cortex-Mx processors used here don't run 32-bit ARM instructions, the code is almost entirely 16-bit Thumb opcodes.

The coding rules/standards covering the library should have precluded the use of floating point; its application here is just laziness.
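As an illustration of the point about avoiding floating point, here is a sketch of a rounded clock/baud division done purely in integer arithmetic. This is not the HAL's code; the function name `uart_divisor_rounded` is hypothetical, and the mapping from this value to the actual baud rate register differs per peripheral and oversampling mode, so check the reference manual before using it.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns clock_hz / baud rounded to the nearest
 * integer, using only integer math. Adding half the divisor before
 * dividing gives round-to-nearest instead of truncation.
 * Assumes clock_hz + baud / 2 fits in 32 bits, which holds for
 * typical MCU clock rates.
 */
uint32_t uart_divisor_rounded(uint32_t clock_hz, uint32_t baud)
{
    if (baud == 0u) {
        return 0u; /* guard against division by zero */
    }
    return (clock_hz + baud / 2u) / baud;
}
```

For example, `uart_divisor_rounded(32000000u, 115200u)` yields 278, where plain truncating division would give 277. No soft-float routines need to be linked for this.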

Posted on February 24, 2017 at 20:01

If most of the extra code is there because the floating point library is linked in, then I know (well, think) the code size should from now on only increase due to my own code. And as I will be using double calculations, I would have linked it in anyway. I was just curious why the code was so large for such a small bit of Cube code. Thanks for all the info.