USART Receive Interrupt

gerrysweeney · ‎2024-01-23

Hopefully this is more of a sensible non-noob question. It relates to interrupt handling, in my case a simple USART receive interrupt.

I can see how the CubeMX tool is managing the .c file: it has created an interrupt handler for me to insert my code into, and that's easy. I can detect the interrupt and spit out a short string in response, so it's working - all good.

The question I have is: what can I safely do inside that ISR? Do I need to preserve any registers? My aim is to throw anything I receive into a ring buffer of some form, with a view to building a received packet of data. At some point I will need to pick up each completed packet from a list. In order to do that I will possibly need to allocate some memory. It would also be useful if I did not have to put the whole implementation of that ISR in the auto-generated file, so it would be handy to be able to call a function that contains the implementation of the ISR in its own compilation unit.

One other question is that of controlling interrupts. At some point, I would have a packet of data in a buffer somewhere that I would access from the main loop of the code. In order to safely access that packet in the receive buffer, I would first need to stop/pause/prevent the next interrupt from interrupting me. On x86 architecture this is quite tricky, as you have to rely on the atomic behaviour of specific instructions and block the interrupt in such a way that, should one occur while I am getting the data safely from the buffer, the interrupt controller will hold/queue it until interrupts are re-enabled.

What are the right semantics for doing this on the STM32 platform?

As ever, any help at all much appreciated.
Gerry

TDK · ‎2024-01-23

> During the time I am messing with the buffer from the main loop, how do I ensure that the ISR does not fire again and change the buffer that I am currently copying?

Generally, you should write your accesses such that the ISR firing in the middle of something is okay, but you could also disable the interrupt for a short time to prevent this from happening. HAL_NVIC_DisableIRQ/HAL_NVIC_EnableIRQ could be used here to suspend a particular interrupt.
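For illustration, a minimal sketch of the second approach (masking one interrupt around the copy). It assumes the receive interrupt is USART3, and `rx_packet`, `rx_len` and `packet_take` are hypothetical names, not part of the HAL. The stand-in definitions at the top exist only so the sketch compiles off-target; in a CubeMX project they come from the HAL headers.

```c
#include <stdint.h>

/* Off-target stand-ins so this compiles on a PC. In a CubeMX project,
   remove them and include the generated "main.h" instead. */
typedef int IRQn_Type;
#define USART3_IRQn 39
static int usart3_irq_masked;
static void HAL_NVIC_DisableIRQ(IRQn_Type irq) { (void)irq; usart3_irq_masked = 1; }
static void HAL_NVIC_EnableIRQ(IRQn_Type irq)  { (void)irq; usart3_irq_masked = 0; }

#define PKT_MAX 18u

/* Hypothetical state shared with the receive ISR. */
static volatile uint8_t rx_packet[PKT_MAX];
static volatile uint8_t rx_len;         /* 0 means no complete packet waiting */

/* Called from the main loop. Copies a waiting packet into dst (at least
   PKT_MAX bytes) and returns its length, or 0 if nothing is waiting. */
uint8_t packet_take(uint8_t *dst)
{
    uint8_t n;

    HAL_NVIC_DisableIRQ(USART3_IRQn);   /* the USART3 ISR cannot run from here... */
    n = rx_len;
    for (uint8_t i = 0; i < n; i++) {
        dst[i] = rx_packet[i];
    }
    rx_len = 0;
    HAL_NVIC_EnableIRQ(USART3_IRQn);    /* ...to here */

    return n;
}
```

While the IRQ is masked in the NVIC, a USART3 request is held pending and its handler runs as soon as the IRQ is re-enabled. A byte arriving during the copy is therefore not lost, provided the masked window is shorter than one character time (about 1 ms at 9600 baud).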

A ring buffer can be written in such a way that the ISR pushes data while the main loop pops data, with neither interfering with each other.

> And one other question, how do I know which registers it's safe to mess with during my ISR? For example, if I used memcpy, that's going to use some registers for the src/dst/len params, which I presume would be bad for an ISR, right?

Core registers? Typically those aren't manipulated at the C code level, and you don't need to preserve anything yourself. The caller-saved core registers (r0-r3, r12, LR, PC and xPSR) are pushed onto the stack by hardware before the ISR starts, which takes something like 12 cycles, and they are restored when the ISR exits. Any other registers the ISR uses are saved and restored by the compiler, as for any ordinary C function.

magene · ‎2024-01-23

Your question "how I can access that queue and safely dequeue ..." is very pertinent. I think you're talking about dealing with something like this.

1. The ISR enqueues a message on the queue and exits.

2. The non-ISR part of your code detects something is in the queue and starts dequeuing.

3. Before the dequeue process is finished, the ISR fires again and enqueues a new message.

A thread-safe queue can handle this situation; it requires proper sequencing of when you change the value of the pointers to the head and tail of the queue. I thought the link I gave you talked about this, but I didn't see much. If you google around for something like "thread safe" FIFO queue, you'll be able to educate yourself about how to implement a queue that can deal with the above.

And BTW, the std::queue is immensely resource intensive so I use a much lighter weight version I wrote based on the link I gave you.

magene · ‎2024-01-23

And googling for "lock free queues" is helpful also. Here's one link that might be useful:

https://moodycamel.com/blog/2013/a-fast-lock-free-queue-for-c++

gerrysweeney · ‎2024-01-24

Thanks for the link. I am quite familiar with threading, lock-free queues, etc., but these implementations require the use of threads and synchronisation primitives like mutexes. Under the hood, these synchronisation objects are provided by the operating system, which in turn uses very specific characteristics of the CPU and specific atomic operations in order to function reliably. None of this applies when programming bare metal in the absence of a scheduler.

An interrupt routine is not a thread. It's a lot like one conceptually, but in practice you have to rely 100% on very specific characteristics of the CPU and/or interrupt hardware in order to ensure there is controlled access to critical data structures.

So in the case of the STM32, what I was asking is: what is the *correct* way of ensuring that if I mess with a buffer pointer during an interrupt, I am not breaking other application code that was in the middle of messing with the same pointer value in memory at the time the interrupt fired? The solution here cannot be a generic lock-free thing; it really has to be something very specific to the way in which the STM processor/interrupt controller hardware works.

I spent many hours yesterday evening trawling the internet looking for examples of how to do this, and I am so surprised that there is so little information out there, STM32 parts are very popular, yet there appears to be a total lack of information out there...even on GitHub where you can generally find most things.

Case in point: I want to implement a simple ISR to read chars off a UART and place them into a buffer. I want to do that on interrupts. It's a slow connection (9600 baud), so it should be a breeze for a 180 MHz 32-bit part, and an interrupt per char should be absolutely fine. In my main code I want to **safely** access that same buffer while the interrupts are potentially still firing. This is such a common use case, it's truly remarkable how none of the ST documentation or examples appear to show how to do this.

I was always told that STM32/ARM stuff has a very high barrier to entry, and I am starting to understand why that is now. The HAL is one way of doing things, yet examples around the net seem to show 100 different ways of doing the same thing.

Other parts I have worked with (ESP32, Microchip PIC18/32 for example, even Atmel (now Microchip too) parts) are so much better documented, with so many more decent examples and explanations, that the barrier to entry is far lower.

A few years ago I worked on a project that used an ESP32. This is a Chinese part, and at the time it was new and barely documented, yet even that was 1000x simpler to start working with: the examples were good, and the software supplied was part of the PlatformIO ecosystem. Within hours I was writing C++ code for quite a complex application, about 16,000 lines of code. All of the elements did what they said on the tin, and most importantly there were good working examples to draw from.

Having tried for a few days (on and off), I was thinking this morning that I may just give up on STM32 parts at this point. It is too difficult to do the basics; you seem to have to know an awful lot of stuff that does not appear to be documented. Unless I am looking in the wrong places, but I have really tried...

TDK · ‎2024-01-24

Perhaps show some code with what you think will break. In general, your ISR and your main thread shouldn't be changing the same pointer variable at all.

In the circular buffer example, have a pointer to the head of the buffer and the tail. The ISR updates the head pointer, the main thread updates the tail.
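A minimal sketch of that arrangement, assuming a single producer (the receive ISR) and a single consumer (the main loop) on one core. The names `ring_t`, `ring_push` and `ring_pop` and the 64-byte size are illustrative choices, not anything from ST's libraries.

```c
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 64u                 /* must be a power of two */
#define RB_MASK (RB_SIZE - 1u)

typedef struct {
    volatile uint8_t  data[RB_SIZE];
    volatile uint32_t head;         /* written only by the ISR (producer) */
    volatile uint32_t tail;         /* written only by the main loop (consumer) */
} ring_t;

/* Producer side: call from the receive ISR. Returns false if the buffer is full. */
bool ring_push(ring_t *rb, uint8_t byte)
{
    uint32_t head = rb->head;
    uint32_t next = (head + 1u) & RB_MASK;

    if (next == rb->tail) {
        return false;               /* full: drop the byte rather than overwrite */
    }
    rb->data[head] = byte;          /* store the data first... */
    rb->head = next;                /* ...then publish it with one aligned write */
    return true;
}

/* Consumer side: call from the main loop. Returns false if the buffer is empty. */
bool ring_pop(ring_t *rb, uint8_t *out)
{
    uint32_t tail = rb->tail;

    if (tail == rb->head) {
        return false;               /* empty */
    }
    *out = rb->data[tail];          /* read the data first... */
    rb->tail = (tail + 1u) & RB_MASK;   /* ...then release the slot */
    return true;
}
```

One slot is deliberately left unused so that "full" and "empty" can be told apart without a shared count variable, which means this sketch holds at most RB_SIZE - 1 bytes.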

None of this is STM32-specific. This is all applicable to general coding in C with threads.

There are the LDREX and STREX instructions, but these shouldn't be needed in general. You can use LDREX/STREX to implement a mutex object, but again, that shouldn't be needed for a circular buffer implementation.

gerrysweeney · ‎2024-01-24

Hi TDK,

That's fair, I have not started to code the ring buffer implementation. I agree it's fairly generic, and you are right that the ISR will change the head pointer and the main app will change the tail pointer. But at some point, it is required that the ISR should ensure there is no overflow, and more typically the main app will need to check if there is data to read. The way it will do that is to check if (tail < head) and also check the wrap-around; those checks most definitely will not be atomic, so you could end up reading some value which is wrong. Of course you can design it so that the buffer is big enough that things are much less likely to go wrong, but generally speaking it's better to have guarantees that at those critical moments there is some predictable and understood atomicity.

The pointer to LDREX/STREX is useful and has led me here. I think this is pretty much what I was looking for, and it gives me enough to work on, I expect: https://devblogs.microsoft.com/oldnewthing/20210614-00/?p=105307

I will share the code if I run into a problem. As it is now, I am struggling to get the interrupts to fire predictably. I handle the IDLE state interrupt and turn on an LED each time it triggers, and I trigger it by simply pressing a key in the terminal I have connected to the USART3 port. I would expect the IDLE interrupt to trigger after every key press, but what it does is trigger (very reliably) only after every second keypress. Seems a bit weird; I must be doing something wrong, but I really cannot see what. The lack of any examples is the killer here: I have no point of reference, and am not quite at the stage of breaking out the scope/logic analyser to try and reverse engineer what the UART is doing...

I read somewhere that USART3 and USART4 share the same interrupt, so I am wondering if it might have something to do with that, but to be honest with you, it's a bit like poking around in the dark - quite frustrating. I can't even find any official documentation that provides anything like the level of technical detail one might need to understand how the USART actually works. The MCU data sheet tells you what it is and what features it has, but that's it. I must be looking in the wrong places - and that's after hours of googling...

Thank you
Gerry

Tesla DeLorean · ‎2024-01-24

There are 16550 IP cells and equivalents, both ARM and SAMSUNG had them available 2+ decades ago. Also some very clever and efficient baud rate dividers..

Apparently too expensive to license, so ST opted for DIY. Or too many gates / too much complexity.

Some of the newer STM32 parts/families have a FIFO in the UART and SPI, though not particularly deep.

On STM32 I've typically used a DMA circular/ring buffer for Rx to similar effect, at a depth I want, and swept that periodically. Chaining Tx DMA into largest blocks of pending data out of similar ring buffers.

I don't care for the HAL implementation / paradigm for USART much at all..

gerrysweeney · ‎2024-01-24

Hi Tesla DeLorean,

The DMA method seems to be what people use. I'm not sure that's going to work for me, or at the very least, it adds a level of complexity to deal with while learning.

The basic use case I have is this: a 9600 baud incoming serial data stream on an RS485 bus. The system emitting the stream opted to use "quiet spots", otherwise known as IDLE state, as the end-of-packet markers, with no header length value. Messages are small (2-18 bytes). If you feed that into a normal USB-attached UART on a PC, you cannot reliably decode the data, because buffering at the OS level eradicates the timed IDLE gaps in the stream. So as part of my "Learn STM32" I thought I would make an RS485-to-USB thingy that will read the messages off the bus, package each one into something I can send to a PC, and go from there.

Microcontrollers (STM32 included) have IDLE detection: if the incoming RX line stays high for a full frame time after a byte has been received (10 bit times for 8N1: start, 8 data, stop), the UART can fire an IDLE state interrupt. So it would seem the simplest way to reliably deal with the stream would be to ...

1. Start with an empty buffer
2. Read a byte at a time adding to a buffer.
3. At the point I get an IDLE interrupt, mark the buffer's bytes (however many there are) as a packet and let the main application code send that to the PC.

It could not be any simpler, but I cannot seem to make the STM32 UART do even the most basic of things. I am using an LED as a debug aid, and even with such a simple thing, I cannot make the IDLE interrupt do what, on paper, it should so easily do... The lack of examples and documentation makes for a very frustrating experience when I am trying to do something so simple, something I know I could do in an hour or two on a PIC.
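The three steps above can be sketched independently of the UART hardware. This is only an illustration under stated assumptions: `framer_on_byte` would be called from the ISR for each received byte and `framer_on_idle` when the IDLE flag is seen; all names are hypothetical, and the single-slot handoff (one packet waiting at a time) is a simplification of the "list of packets" idea.

```c
#include <stdbool.h>
#include <stdint.h>

#define PKT_MAX 18u                 /* largest message on the bus, per the description above */

typedef struct {
    uint8_t bytes[PKT_MAX];
    uint8_t len;
} packet_t;

static packet_t building;           /* touched only by the ISR */
static volatile packet_t ready;     /* last complete packet, handed to the main loop */
static volatile bool ready_flag;    /* set by the ISR, cleared by the main loop */

/* Step 2: call from the ISR for every received byte. */
void framer_on_byte(uint8_t b)
{
    if (building.len < PKT_MAX) {
        building.bytes[building.len++] = b;
    }                               /* bytes beyond PKT_MAX are dropped */
}

/* Step 3: call from the ISR when the IDLE flag is seen. */
void framer_on_idle(void)
{
    if (building.len == 0u) {
        return;                     /* idle with nothing received: ignore */
    }
    if (!ready_flag) {              /* previous packet not yet taken: drop this one */
        ready = building;
        ready_flag = true;
    }
    building.len = 0u;              /* step 1 again: start with an empty buffer */
}

/* Main loop: returns true and copies the packet out if one is waiting. */
bool framer_take(packet_t *out)
{
    if (!ready_flag) {
        return false;
    }
    *out = ready;
    ready_flag = false;             /* hand the slot back to the ISR */
    return true;
}
```

The ISR writes `ready` only while the flag is clear and the main loop reads it only while the flag is set, so the two sides never touch the slot at the same time and no interrupt masking is needed in this sketch.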

I am not normally this grumpy!

Gerry

TDK · ‎2024-01-24

> But at some point, it is required that the ISR should ensure there is no overflow, and more typically the main app will need to check if there is data to read. The way it will do that is to check if (tail < head) and also check the wrap-around; those checks most definitely will not be atomic, so you could end up reading some value which is wrong.

Reads and writes of aligned uint32_t values will be atomic (probably unaligned as well). You will not get intermediate values.

Because of that, the ISR can check for space available and the main thread can check for data available with (head - tail). When the ISR reads the tail in the middle of the main thread modifying it, it will read either the old value or the new value. Either one is valid.
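A small sketch of that (head - tail) arithmetic, assuming `head` and `tail` are free-running `uint32_t` counters that only ever increase and are reduced modulo the buffer size only when indexing the storage. `BUF_SIZE`, `bytes_used` and `bytes_free` are hypothetical names.

```c
#include <stdint.h>

#define BUF_SIZE 16u                /* hypothetical capacity, a power of two */

/* Bytes waiting to be read. Unsigned subtraction is modulo 2^32, so the
   result stays correct even after head has wrapped past zero. */
static inline uint32_t bytes_used(uint32_t head, uint32_t tail)
{
    return head - tail;
}

/* Space the ISR may still write into. */
static inline uint32_t bytes_free(uint32_t head, uint32_t tail)
{
    return BUF_SIZE - (head - tail);
}
```

Each side should read the other side's counter once into a local and pass that snapshot in. A stale snapshot only makes the ISR see less free space, or the main loop see less data, than there really is, never more.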

> The MCU data sheet tells you what it is, and what features it has, but thats it, I must be looking in the wrong places - and thats after hours of googling...

This is a common issue. The reference manual will describe how the chip works at the register level. That should be your primary reference for how the chip works. The reference manual is good. Not sure if you've mentioned a chip, but here it is for the H743:

https://www.st.com/resource/en/reference_manual/dm00314099-stm32h742-stm32h743-753-and-stm32h750-value-line-advanced-arm-based-32-bit-mcus-stmicroelectronics.pdf

There is also the Cortex documentation, which describes in some detail how the core works. This is somewhat harder to interpret, and also not as useful for peripheral questions.

A shared interrupt is still a single interrupt and cannot pre-empt itself even if the flag comes from the other peripheral.

Edit: I'll also echo the sentiment that USART is not handled particularly well in the HAL library. The basic paradigm of shoving things into a circular buffer in the ISR and interpreting those in the main loop is a simple concept and isn't covered in any examples. You can cobble it together by using HAL_UART_Receive_IT on single bytes, which is very inefficient, but works. There is also HAL_UARTEx_ReceiveToIdle_IT or similar, which is a different method but also works.
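A rough sketch of the single-byte HAL_UART_Receive_IT approach, assuming the port is USART3 with a CubeMX-generated `huart3` handle. `uart_rx_start`, `rx_store` and `rx_count` are hypothetical names, and the stand-in definitions at the top exist only so the sketch compiles off-target; the real types and functions come from the HAL.

```c
#include <stdint.h>

/* Off-target stand-ins so this compiles on a PC; in a CubeMX project these
   come from the HAL headers and the generated usart.c. */
typedef struct { int unused; } UART_HandleTypeDef;
typedef enum { HAL_OK = 0 } HAL_StatusTypeDef;
static UART_HandleTypeDef huart3;
static uint8_t *armed_dst;          /* where the fake HAL would store the next byte */
static HAL_StatusTypeDef HAL_UART_Receive_IT(UART_HandleTypeDef *huart,
                                             uint8_t *pData, uint16_t Size)
{
    (void)huart; (void)Size;
    armed_dst = pData;
    return HAL_OK;
}

#define RX_MAX 32u

static uint8_t rx_byte;                     /* one-byte landing area handed to the HAL */
static volatile uint8_t rx_store[RX_MAX];   /* stands in for a ring buffer */
static volatile uint32_t rx_count;

/* Call once after UART initialisation to arm the first one-byte receive. */
void uart_rx_start(void)
{
    (void)HAL_UART_Receive_IT(&huart3, &rx_byte, 1);
}

/* The HAL calls this from interrupt context once the requested number of
   bytes (here 1) has arrived. It overrides the weak default in the HAL. */
void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart)
{
    if (huart != &huart3) {
        return;
    }
    if (rx_count < RX_MAX) {
        rx_store[rx_count++] = rx_byte;
    }
    /* The HAL stops receiving when the count reaches zero, so re-arm here. */
    (void)HAL_UART_Receive_IT(&huart3, &rx_byte, 1);
}
```

The re-arm inside the callback is what keeps reception going. Without it, the HAL turns the receive interrupt off after the first byte.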

gerrysweeney · ‎2024-01-24

@TDK

Ahh, thank you. I was looking for the detailed reference manual and could not find it... not even under the technical documentation section for the part. The chip I am using is the STM32F429ZI on the Nucleo-144 eval board.

Can I ask how you located that document, so I can try and find the same one for the chip I am using please?

I think you are right about the HAL; the abstraction seems a bit weird for sure. I found a post on Stack Overflow that explains that when you call the HAL_UARTx_xxxx_IT function to receive, you have to tell it how many bytes, and under the hood it maintains a counter. Once that count reaches zero it disables the interrupt again, which sounds totally dumb to me.

I don't suppose you know where I might get an example of how to enable interrupts and implement the ISR without the HAL?

Thanks again for your help TDK...

Gerry