HAL Library seems to not follow classic programming guidelines

Werner Dähn
Associate II
Posted on June 09, 2018 at 15:42

I understand that programming enterprise software on an 8-core CPU at 4 GHz and programming a simple STM32 MCU are different things; nevertheless, I would have added a few things to the HAL layer from the beginning:

  1. All HAL init calls should check for collisions with other resources. I configure PA1 as GPIO, then I use UART1, which needs PA1, so the UART config should return an error. Granted, CubeMX is supposed to help you with that in a graphical manner, but it does not if you assign pins dynamically. And even if it did, what's wrong with validating it a second time in a debug build of the firmware (see the first sketch after this list)? Wouldn't that help beginners significantly?
  2. Programming an MCU is very hardware related. You need to open that gate to provide power to a peripheral by setting a register, switch the line to its alternate function with another register, and so on. From a HAL library I would expect it to abstract exactly that. I want to enable UART1 on pin PA6; therefore __HAL_RCC_GPIOA_CLK_ENABLE is called implicitly by the HAL_GPIO_Init function, the remap is done, the corresponding AF clock is enabled, and so on. Why should I remember every dependency if the HAL can do that for me? Another example: how many lines of code do you need to get a quadrature encoder (up/down counter) to work? Logically, one call saying which pins to use and perhaps which timer number. The counter-specific X2 falling / X2 rising / X4 setting is also needed. Everything else can be derived from that information (see the second sketch below). Today you need 20-30 lines of code, and if just one of them is wrong you will not find out easily.
  3. The implementation is often too basic. Most obvious in the UART part of HAL. You can do character polling, add an interrupt callback, or use DMA. Fine. But that is of limited use, since in most cases you would want a ring buffer (see the third sketch below). No rocket science, but how long has that been requested, and it is still not available out of the box? Another missing receive method is one where timing plays a role, as with serial packets: every 20 ms I get a packet that starts with 0x01, ends with 0xEF, and lasts up to 5 ms.
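
For point 1, the validation does not need much code. Below is a minimal sketch of the idea, nothing official: the names pin_claim and pin_owner are made up for this example, and a real version would be called from each init wrapper in a debug build.

    /* Hypothetical debug-build helper: every driver claims its pins before
       configuring them; a second claim on the same pin is reported. */
    #include <stdint.h>
    #include <stdio.h>

    #define NUM_PORTS     8    /* GPIOA..GPIOH, adjust per device */
    #define PINS_PER_PORT 16

    static const char *pin_owner[NUM_PORTS][PINS_PER_PORT];

    /* Returns 0 on success, -1 if the pin already belongs to someone else. */
    static int pin_claim(unsigned port, unsigned pin, const char *owner)
    {
    #ifdef DEBUG
        if (pin_owner[port][pin] != 0) {
            printf("Pin conflict on P%c%u: %s vs %s\r\n",
                   (char)('A' + port), pin, pin_owner[port][pin], owner);
            return -1;
        }
        pin_owner[port][pin] = owner;
    #endif
        (void)port; (void)pin; (void)owner;
        return 0;
    }

    /* e.g. pin_claim(0, 1, "GPIO") followed later by
            pin_claim(0, 1, "UART1") would flag the collision on PA1. */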
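
For point 2, this is roughly what has to happen under the hood today for a quadrature (X4) encoder, and all of it could be derived from "TIM3 on PA6/PA7, X4". A sketch against the F4-family HAL names; the wrapper name encoder_init_x4 is mine, and the timer, pins and AF number are example choices that must be checked against the actual device:

    static TIM_HandleTypeDef henc;

    static void encoder_init_x4(void)
    {
        __HAL_RCC_GPIOA_CLK_ENABLE();               /* dependencies the HAL */
        __HAL_RCC_TIM3_CLK_ENABLE();                /* could enable for us  */

        GPIO_InitTypeDef gpio = {0};
        gpio.Pin       = GPIO_PIN_6 | GPIO_PIN_7;   /* TIM3_CH1 / TIM3_CH2 */
        gpio.Mode      = GPIO_MODE_AF_PP;
        gpio.Pull      = GPIO_PULLUP;
        gpio.Speed     = GPIO_SPEED_FREQ_LOW;
        gpio.Alternate = GPIO_AF2_TIM3;             /* device dependent */
        HAL_GPIO_Init(GPIOA, &gpio);

        henc.Instance           = TIM3;
        henc.Init.Prescaler     = 0;
        henc.Init.CounterMode   = TIM_COUNTERMODE_UP;
        henc.Init.Period        = 0xFFFF;
        henc.Init.ClockDivision = TIM_CLOCKDIVISION_DIV1;

        TIM_Encoder_InitTypeDef enc = {0};
        enc.EncoderMode  = TIM_ENCODERMODE_TI12;    /* X4: count on both inputs */
        enc.IC1Polarity  = TIM_ICPOLARITY_RISING;
        enc.IC1Selection = TIM_ICSELECTION_DIRECTTI;
        enc.IC1Prescaler = TIM_ICPSC_DIV1;
        enc.IC1Filter    = 0x0F;
        enc.IC2Polarity  = TIM_ICPOLARITY_RISING;
        enc.IC2Selection = TIM_ICSELECTION_DIRECTTI;
        enc.IC2Prescaler = TIM_ICPSC_DIV1;
        enc.IC2Filter    = 0x0F;

        HAL_TIM_Encoder_Init(&henc, &enc);
        HAL_TIM_Encoder_Start(&henc, TIM_CHANNEL_ALL);
    }

    /* position is then (int16_t)__HAL_TIM_GET_COUNTER(&henc) */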
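
And for point 3, the ring buffer everyone ends up writing themselves. A sketch on top of the HAL interrupt API, assuming a CubeMX-generated handle named huart1; the buffer size and the function name uart_getc are my own choices:

    #define RX_RING_SIZE 256
    extern UART_HandleTypeDef huart1;               /* from the CubeMX init code */

    static uint8_t  rx_ring[RX_RING_SIZE];
    static volatile uint16_t rx_head, rx_tail;
    static uint8_t  rx_byte;

    /* start once after the UART init: HAL_UART_Receive_IT(&huart1, &rx_byte, 1); */

    void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart)
    {
        if (huart == &huart1) {
            uint16_t next = (uint16_t)((rx_head + 1) % RX_RING_SIZE);
            if (next != rx_tail) {                  /* drop the byte if full */
                rx_ring[rx_head] = rx_byte;
                rx_head = next;
            }
            HAL_UART_Receive_IT(&huart1, &rx_byte, 1);   /* re-arm */
        }
    }

    int uart_getc(void)                             /* -1 when nothing buffered */
    {
        if (rx_tail == rx_head)
            return -1;
        uint8_t c = rx_ring[rx_tail];
        rx_tail = (uint16_t)((rx_tail + 1) % RX_RING_SIZE);
        return c;
    }

The timed-packet case could then be layered on top of this (the USART idle-line interrupt is the usual building block), but that is more than a sketch.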

Wouldn't such things help greatly to get started? And to debug? And make the user experience better? And lower frustration? And produce proper error messages rather than things simply not working?

Or am I missing something?

Note: this post was migrated and contained many threaded conversations; some content may be missing.
60 REPLIES
Posted on June 15, 2018 at 20:03

BLISS was a system implementation language for DEC's large-machine hardware (PDP-10/20, VAX, Alpha).  It was never a mainstream language for developers.  The target audience was systems programmers who maintained the OS (TOPS and VMS).  Big iron sites usually had a couple in residence.  They did all the usual sysadmin stuff but also had to program in assembler and be well-versed in OS architecture and concepts.  BLISS was a way to speed up building commands and utilities in assembler, along with the less speed-critical sections of OS code.

BLISS was in the ALGOL family and another one of those C ancestors.

In the spirit of 'Those who fail to study history are doomed...' it's been established for decades that code generators have always been the answer to automated software development.  That's why virtually all applications these days are written in RPG....

Don't laugh, there was a local (Nebraska, USA) want ad for RPG programmers a week ago.  And no, it wasn't on a System/3 with the MFCU (bleep bleep Card Unit).

  Jack Peacock

Posted on June 15, 2018 at 20:25

But when it comes to embedded, just what IS the 'application'?  Minimizing the hardware bill of materials and aggressively managing battery power are often more important than whatever else the board does.  A HAL isn't going to work in that kind of environment.

For embedded, commercial economics turn out to be more important than any of the web-oriented development cycles taught in schools.  Sure, abstraction and layering are nice, except the image doesn't fit in the flash, or it's too slow starting up, or the 90-day battery life runs out in 45 days, or the HAL design doesn't support the network stack (anyone ever get a CAN V4 stack running on the HAL, or a fast, half-duplex, low-overhead RS-485 with Modbus?).  ST's HAL doesn't come close to meeting real-world needs.

I need to reuse software as much as possible, but frequently with a change in processor family.  The SPL worked well for this, moving code with minimal changes from F1 to F2 to F4 to L1 to L4 to L0 and soon to F7.  ST has chosen to cut off the SPL in favor of HAL, and I understand their position: scarce resources.  That means I have to factor in the cost of converting the SPL API to a new family, time-consuming but still better than fighting the HAL.  I'm sure it's not the development cycle ST would like to see, but I still buy the parts, and in the end that's all that matters.

  Jack Peacock

Posted on June 15, 2018 at 20:53

'Minimizing the hardware bill of materials and aggressively managing battery power are often more important than whatever else the board does.'

There is a diverse collection of applications out there, and it is impossible to categorize them on one or two criteria. For the kind of applications I have worked on over the last decade or so, hardware cost is of limited concern. Instead, time-to-market and documentation are what get me paid.

'A HAL isn't going to work in that kind of environment.'

Not seeing that myself.

Posted on June 15, 2018 at 21:02

Agree with the above 100%. We have been using the F767 device in our electronics board, and BOM cost for that individual component is not that important since our electronic components in total add up to around $700. For us it's much more important to get to market quickly and for the MCU to offer easy reconfiguration via Cube. As I'm not interested in squeezing every last op per clock out of it, or in rewriting the HAL libraries for no particular reason, I am interested in actually writing the application on top of HAL to do the various hardware control we need it for.

Posted on June 15, 2018 at 22:14

>>As I'm not interested in squeezing every last op per clock out of it, or in rewriting the HAL libraries for no particular reason, I am interested in actually writing the application on top of HAL to do the various hardware control we need it for.

The particular reason to rework some of it would be that it is defective or gets in the way. There is definitely a lot of dross in the current offering: race conditions, deadlocks, no real rigor about what can or should be done in callbacks, and a USART scheme that doesn't fit any buffering model I want to implement.

There is the 95% of it that does a good job across platforms, allowing easier development and porting and expedited code delivery, but one still has to be willing to take a knife to portions of it, and to the bits that metastasize across releases for no good reason. I've seen code added that kills performance for 99.99% of use cases because the API can't enforce pointer alignment at its interfaces; i.e. because some PC programmer doesn't understand that 32-bit pointers should point to aligned memory, performance goes in the toilet for everyone, likely resulting in over/underruns on the specific interface involved. Instead of using an assert() or special-casing, we now have the potential for latent, unexplained errors in the field.
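
To make the alignment point concrete, this is the kind of guard I would rather see at the API boundary than a silent byte-wise fallback. A sketch only, not actual HAL code; the function name is made up:

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    static void xfer_words(const uint32_t *buf, size_t len)
    {
        /* fail loudly in debug builds instead of quietly degrading everyone */
        assert(((uintptr_t)buf & 0x3u) == 0 && "buffer must be 32-bit aligned");
        /* ... word-wide transfer ... */
        (void)buf; (void)len;
    }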

Posted on June 15, 2018 at 22:35

That makes sense. And I agree that Cube + HAL has its bugs. But from a time perspective there are only so many hours in a work day, and I need most of them to develop and debug the actual functionality instead of spending hours reworking HAL. (When it fails due to a bug in HAL is when I have to spend my time on HAL, but that has happened only twice over 1.5 years of STM32 development for me, and it was fixed by reading the forum and changing one line of code in the Cube-generated init blocks. Later updates fixed it properly.)

For my project most of the functionality is rather simple, there is just a lot of it. We are using all six SPI and two I2C buses to connect to 10+ external peripherals. So as you can imagine I just need the HAL SPI transfer function to work (which it does), because I've got 10+ other datasheets to go through at the application level.

Posted on June 15, 2018 at 23:55

Even more than time to market, automated tools such as Cube cut the time from hardware bring-up to actually developing the application. I cannot afford to spend days just figuring out how to start a new chip and set up the various clocks and pins.

'Static', manually tuned libraries for other chips do not help, precisely because I have to figure out what to change for a new chip (= read all the manuals...).

The future of automated tools is AI. This is already beginning in other areas and will come to embedded too.

Machines are better at repetitive work, yes. Machines can learn, do not forget, and know all the specs, manuals and errata.

The first AI-based tools will likely be too heavy to install on a developer's machine and will run in the cloud.

This is why I proposed to ST to make an experimental 'CubeMX online'. Because it would be deployed in a central location, development and bug fixes could roll forward fast, and feedback from users would also be collected faster. Later they could add some 'intelligence'.

-- pa

Posted on June 16, 2018 at 18:33

Alan Chambers

As I understand it, a template, block, or ready-made function is part of a larger whole that outright forbids entering erroneous data: you have no choice but to use the data the template allows.

In C++ you would have to describe every permitted use of the template, which is a lot of code. And even a large number of data-entry rules does not solve the error problem. That is a property of the flexibility of the C++ language itself: you can do almost everything, and a little more.

A note on the functions used:

The set, reset and toggle functions must read the GPIOx_IDR register and write to GPIOx_BSRR. That way you can change individual lines of the selected port without affecting neighboring lines, even when the port is accessed simultaneously from different tasks and from interrupts.
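
In register terms, something like this (plain CMSIS register names, just a sketch):

    /* Set/reset write only GPIOx_BSRR, which is atomic per pin; toggle reads
       GPIOx_IDR and writes BSRR.  Only the masked pins are ever written, so
       neighboring lines are untouched. */
    static inline void pin_set(GPIO_TypeDef *port, uint16_t mask)
    {
        port->BSRR = mask;                          /* lower half: set   */
    }

    static inline void pin_reset(GPIO_TypeDef *port, uint16_t mask)
    {
        port->BSRR = (uint32_t)mask << 16;          /* upper half: reset */
    }

    static inline void pin_toggle(GPIO_TypeDef *port, uint16_t mask)
    {
        uint32_t level = port->IDR;
        port->BSRR = ((level & mask) << 16) | (~level & mask);
    }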

This is a case where you need to read the documentation first, do it once, and then not have to think about how it works.

But you skipped this stage and used a ready-made solution; that is the problem.

With this approach, nothing good will come of it.
Posted on June 16, 2018 at 18:59

It looks like things are going in the direction where embedded SW only handles the HW and communicates with the network; the cloud then does the processing.

Posted on June 16, 2018 at 19:37

>> With this approach, nothing good will come out.

This is contrary to my experience. Do you use C++ much? I use it exclusively for development on Cortex-M processors, and have done so for many years without any troubles whatsoever. C++ is an excellent choice for embedded software development, and far superior to C. C does have more ubiquitous compiler support, but ARMs have excellent compilers, so this is not an issue here.

My aim is to create abstractions which make life easier, not harder. My template library is only an experiment: it has some promising features, but I'm not convinced it will all work out. The basic goal is to create a typesafe and intuitive replacement for CMSIS. This is actually quite easy, definitely worth doing, and the code is smaller than the equivalent C. The secondary goal is capturing all the metadata dependencies, such as which RCC bit to set to enable a given peripheral. This is achievable, but preventing or detecting all invalid usages at compile time is not so straightforward. I definitely do not want to disappear down the incomprehensible rabbit hole of template metaprogramming: there is rather too much of that around already.
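
To illustrate the metadata idea with a toy of my own (not the library itself; the register address and bit shown are for an F4-class part and would need checking against the reference manual):

    #include <cstdint>

    // Each peripheral type carries its clock-enable location as compile-time
    // data, so "enable the clock" cannot target the wrong register or bit.
    template <typename Periph> struct RccEnable;    // specialised per peripheral

    struct Usart1 {};
    template <> struct RccEnable<Usart1> {
        static constexpr std::uintptr_t reg = 0x40023844u;  // RCC APB2ENR (F4)
        static constexpr std::uint32_t  bit = 1u << 4;      // USART1EN
    };

    template <typename Periph>
    inline void enable_clock()
    {
        auto *r = reinterpret_cast<volatile std::uint32_t *>(RccEnable<Periph>::reg);
        *r |= RccEnable<Periph>::bit;
    }

    // usage: enable_clock<Usart1>();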

As for the toy digital output implementation, there is more than one way to do things: for all you know, the ODR implementation uses bit banding (this is an option in the library). But I accept the point about IDR: I goofed a little. It does rather seem, though, that you've missed the point. A suite of ready-made solutions (however they might be implemented), at least for many of the common hardware use cases, is, I believe, precisely what is missing at the moment.

Al