USART help - basic IRQ handler

john · ‎2008-08-22

Posted on August 23, 2008 at 07:34

picguy · ‎2011-05-17

Posted on May 17, 2011 at 12:38

Comment back to lanchon who posted on 19-08-2008 at 22:33

>[The link http://www.hmtown.com/fifo.htm] proposes incorrect code. it says:

>Pointer and data must be updated in the correct order. This is always true in interruptible code and is good discipline in an ISR. In our getData() function the code should look something like this:

>char getData() {

>char temp;

>if (in == out) {whatever;}

>temp = *in; // The order here

>in++; // is important

>return temp; }

>that's a mistake, the compiler and the processor are free to reorder things the way they like. C only guarantees that things happen as if by sequential execution, but nothing specific can be said about the state of memory when you interrupt code. getData() needs to synchronize with the ISR using some kind of nonportable (non-C) functionality.

Side note rant, actually: do this all in C then recompile for a Luminary Micro Cortex processor. Then let's talk about portable code. "Little" details about which interrupt vector(s) are involved and the I/O port addresses and how the I/O registers are set up will become not so little details. Without even looking I can guarantee that setting the baud rate will be different. IMHO it's better to do the low level stuff in your own assembly code and allow the higher level C code the freedom to ignore the low level stuff. End rant.

lanchon is correct about the potential problem if/when "the compiler and the processor are free to reorder things the way they like." The order of storing inside an ISR (or any code where the other half of the half duplex transfer cannot get in between the pointer and data stores) will not break things.

As long as the C compiler does not reorder the stores all will be okay. The computer hardware SHOULD (or better, MUST) be able to guarantee that if writes are done to locations A & B in that order that another process reading B & A in that order will if getting the updated B will get the updated A as well. There is a comp sci name for this. But I don't know the name. (I got my degree before BSCS existed.)

This method was used in the CDC6000 / Cyber 70 mainframe computers for ALL I/O. In that environment separate autonomous processors moved data in and out of the same buffer simultaneously and asynchronously. I was there working inside the 6000 Series Scope OS.

lanchon · ‎2011-05-17

Posted on May 17, 2011 at 12:38

for those wanting to read a bit about the JMM:

-first read the current FAQ, which is very java-centered but gives an overview:

http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html

-then take a look at these slides from 2000, when problems were being discovered and solutions weren't yet found, it's very interesting:

http://www.cs.umd.edu/~pugh/java/memoryModel/multithreaded.pdf

(Multithreaded semantics for Java, presentation given at MIT Sept 10th, 2000)

I guess you'll agree that things have moved on quite a bit since mainframe days... :)

lanchon · ‎2011-05-17

Posted on May 17, 2011 at 12:38

joseph, thanks I completely missed your reply before.

picguy, that getData() function is really wrong in a very practical sense, even under a simple architecture such as the cm3. (in fact it's "wrong" under any architecture since that code has undefined behavior.) the compilers *do* reorder accesses and that kind of code routinely fails.

> The order of storing inside an ISR [...] will not break things.

that's correct in the general case (unless the ISR talks to another ISR with higher preemptive priority), but it's the reordering in the main code that causes the problems; the getData() in question isn't an ISR, it's a user function. (also, there may be other issues affecting the correctness of the code, such as cache coherency issues, that could affect the ISR itself. note that this article is not directed to a particular architecture, it's given as if correct for any architecture.)

> As long as the C compiler does not reorder the stores all will be okay.

but they do, and processors themselves also do it. that code requires a compiler and processor that guarantee that they won't reorder accesses. (I really don't know whether those things exist, mainly because I have no use for such contraptions.)

> The computer hardware SHOULD (or better, MUST) be able to guarantee that if writes are done to locations A & B in that order that another process reading B & A in that order will if getting the updated B will get the updated A as well.

that's not true, architecture designers are free to choose any memory model. what you mention is a specific characteristic that some memory models provide and others don't. the tendency is towards weaker memory models (providing fewer guarantees) because they outperform the stronger ones under several metrics. some architectures implement more than one model; weaker ones for performance, stronger ones for legacy compatibility.

however keep in mind that that code is broken under *any* hardware memory model because C doesn't define memory visibility outside of the normal flow of execution (ie: compilers and hardware are free to optimize). for synchronization you must use a mechanism that's external to C (that can't be implemented in C).

if you're interested you could read about the redefinition of the java memory model:

http://www.cs.umd.edu/~pugh/java/memoryModel/

Bill Pugh realized that the original model didn't make any sense, and after a while it was clear that nobody understood the model at all, not even the original authors, and that most of java code written to that date was technically incorrect, and that no virtual machine in existence complied with the java specification. the whole affair was a big mess.

since java is meant to be portable, it's very instructive to read the rationale for the new model: it has to work well on all hardware implementations. (C on the other hand takes the easy way out: it doesn't define a model, and thus it can't be used for making concurrent programs (without using non-C mechanisms).)

miles · ‎2011-05-17

Posted on May 17, 2011 at 12:38

Quote:

char getData() {

char temp;

if (in == out) {whatever;}

temp = *in; // The order here

in++; // is important

return temp;

}

Quote:

lanchon wrote:

picguy, that getData() function is really wrong in a very practical sense, even under a simple architecture such as the cm3. (in fact it's "wrong" under any architecture since that code has undefined behavior.) the compilers *do* reorder accesses and that kind of code routinely fails.

Can you elaborate? At first I thought you meant that "temp = *in;" would read from the post-increment location, which is a reordering a compiler would never do. I assume you mean that the compiler will generate an instruction to store the incremented value of 'in' (while keeping the old value of 'in' in a register) and then follow it with an instruction to retrieve the character from the old location 'in' pointed to. That would allow the ISR to put a character into the old location, before the instruction to retrieve from that location executes.

If I've understood it correctly, I'm curious to hear how you'd change getData() to fix it. Making 'in' a volatile pointer, or a volatile pointer to volatile data, won't change the body of getData() at all, since 'in' isn't declared there.

- Miles

lanchon · ‎2011-05-17

Posted on May 17, 2011 at 12:38

hi miles,

> At first I thought you meant that "temp = *in;" would read from the post-increment location, which is a reordering a compiler would never do.

correct, that'd be a compiler or hardware bug.

> I assume you mean that the compiler will generate an instruction to store the incremented value of 'in' (while keeping the old value of 'in' in a register) and then follow it with an instruction to retrieve the character from the old location 'in' pointed to. That would allow the ISR to put a character into the old location, before the instruction to retrieve from that location executes.

exactly, that's perfectly legal behavior for a compiler.
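To make the reordering above concrete, here is a minimal sketch in plain C. It puts the quoted getData() next to a hand-written version of the transformation miles describes. The buffer, its size, the function names and the `return 0` for the empty case are assumptions for illustration (the original only says `whatever`), and like the quoted code there is no wrap-around. Run single-threaded, both functions return the same values, which is all C promises; the difference only matters if an ISR runs at the marked point.

```c
#include <assert.h>

#define BUF_SIZE 16               /* hypothetical buffer size */

static char buf[BUF_SIZE];
static char *in  = buf;           /* names follow the quoted code */
static char *out = buf;

/* The function as written in the quoted article. */
char getData_as_written(void)
{
    char temp;
    assert(in >= buf && in <= buf + BUF_SIZE);  /* the sketch has no wrap-around */
    if (in == out) { return 0; }  /* assumption: empty buffer yields 0 */
    temp = *in;                   /* the order here   */
    in++;                         /* "is important"   */
    return temp;
}

/* One thing a compiler may legally turn it into: the pointer store is
 * moved ahead of the data load, with the old pointer kept in a register. */
char getData_as_reordered(void)
{
    char *old = in;
    if (old == out) { return 0; }
    in = old + 1;                 /* slot now looks free to the ISR */
    /* an interrupt landing here could overwrite *old before it is read */
    return *old;
}
```

Nothing in the C source distinguishes the two versions, so the comment "the order here is important" cannot be relied on by itself.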

> If I've understood it correctly, I'm curious to hear how you'd change getData() to fix it.

that's exactly the point, there's *no way* to fix it. C can't be used to write concurrent code, it hasn't got any concurrent semantics, you need a C with extended semantics. the extension could come from, say, a threading library, or from your own knowledge of the specific compiler and architecture you're using. in the second case your code would be correct only on that specific implementation; it wouldn't be "correct C", since it depends on semantics that C doesn't provide. in the first case, it'd be portable only to platforms where correct threading lib ports are available. (the porting effort would be done only once for all applications.)

in essence, in whatever way you choose (lib or low level fiddling), you need to provide semantics for ordering, atomicity and visibility. ordering is clear. atomicity means mutual exclusion of critical parts (disable interrupt, read-modify-write, enable interrupt -when talking to an ISR). visibility means assuring that the data sent by one flow will be there for the target flow (disable interrupt, invalidate the local cache to read global memory, read-modify-write, flush the local cache to global memory, enable interrupt -for instance on multiprocessors without cache coherency, or uniprocessors with virtual memory and virtual-address-space caches).
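The "disable interrupt, read-modify-write, enable interrupt" pattern for talking to an ISR might look roughly like the sketch below. Everything named here (`FIFO_SIZE`, `irq_disable`, `irq_enable`, `fifo_put_from_isr`, `fifo_get`) is hypothetical. The two irq functions are host-side stand-ins: on a real target they would be the platform's own mask/unmask mechanism, which is exactly the non-C part. The compiler barrier uses `atomic_signal_fence` from C11's `<stdatomic.h>`, which postdates this thread, so treat it as an assumption about the toolchain. The sketch covers ordering and atomicity on a single core only; it does nothing about cache visibility.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_SIZE 8u                    /* hypothetical capacity (holds 7) */

static volatile uint8_t  fifo[FIFO_SIZE];
static volatile unsigned head;          /* advanced by the ISR       */
static volatile unsigned tail;          /* advanced by the main code */

/* Host stand-ins for the platform's interrupt mask (assumption). */
static bool irq_masked;

static void irq_disable(void)
{
    irq_masked = true;
    atomic_signal_fence(memory_order_seq_cst);  /* compiler may not move accesses across */
}

static void irq_enable(void)
{
    atomic_signal_fence(memory_order_seq_cst);
    irq_masked = false;
}

/* Would be called from the receive ISR. Returns false when full. */
bool fifo_put_from_isr(uint8_t byte)
{
    unsigned next = (head + 1u) % FIFO_SIZE;
    if (next == tail) { return false; }
    fifo[head] = byte;
    head = next;
    return true;
}

/* Main-code side: the whole check/read/advance runs with interrupts masked. */
bool fifo_get(uint8_t *byte)
{
    bool ok = false;
    assert(byte != NULL);
    irq_disable();
    if (head != tail) {
        *byte = fifo[tail];
        tail = (tail + 1u) % FIFO_SIZE;
        ok = true;
    }
    irq_enable();
    return ok;
}
```

With the critical section masked, it no longer matters in which order the compiler emits the data load and the index store, because the ISR cannot run between them.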

> Making 'in' a volatile pointer, or a volatile pointer to volatile data, won't change the body of getData() at all, since 'in' isn't declared there.

it can change the compiled output. but what is volatile? AFAIK, there's no clear meaning of volatile. it's said of volatiles that they won't be cached in registers and that accesses won't be optimized away by the compiler or hardware, that volatile accesses won't be reordered by the compiler or hardware (with respect to other volatile accesses), and that they'll tunnel the caches. so what? volatiles defined like that are utterly useless.

imagine a complex OS in which a program has to print a document. the word processor needs to give a sort of standardized list of instructions (an object graph) representing the document to the spooler and printer driver. so it builds the object graph and queues a pointer to it on the spooler FIFO. the FIFO is great, it's done with volatiles and great care, so the spooler is guaranteed to get the pointer.

now what? when it dereferences it, what will it find? there's nothing to assure that the graph is visible to the spooler. because the graph itself isn't volatile, it could be cached locally in a word processor thread! or maybe the graph stores were reordered to after the FIFO push by the compiler, and after the push the OS preempted the thread to run solitaire. so the graph doesn't exist anywhere, it hasn't really been built yet!

you'd have to make *all the graph volatile* and keep going back until everything in the OS is volatile. which is another way of saying "let's disable the cache and forbid absolutely all access reordering", or let's go back to the performance levels of the 1980s. clearly the "portable" meaning of volatile, *if it exists at all*, is not enough. like I said, you need extended semantics.

(some implementations of volatile act like mem barriers, in effect providing some kind of acquire/release semantics, but again, that's extended C. BTW, does anyone here know the volatile semantics of GCC?)
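For readers wondering what acquire/release semantics look like when spelled out: the C11 standard, published after this thread, added `<stdatomic.h>`, which is one form of the "extended semantics" discussed above. The sketch below assumes a C11 toolchain and mirrors the spooler example in miniature. `struct document`, `mailbox`, `producer_publish` and `consumer_poll` are made-up names, and the two-field struct stands in for the object graph. Note that nothing in the graph itself is volatile.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the object graph handed to the spooler. */
struct document {
    int pages;
    int copies;
};

static struct document doc;                  /* plain, non-volatile data   */
static _Atomic(struct document *) mailbox;   /* one-slot "FIFO", starts NULL */

/* Word-processor side: build the graph, then publish the pointer.
 * The release store keeps the writes to doc from being reordered after it. */
void producer_publish(int pages, int copies)
{
    assert(pages > 0 && copies > 0);
    doc.pages  = pages;
    doc.copies = copies;
    atomic_store_explicit(&mailbox, &doc, memory_order_release);
}

/* Spooler side: the acquire load pairs with the release store, so a
 * non-NULL result means the fields written before publishing are visible. */
const struct document *consumer_poll(void)
{
    return atomic_load_explicit(&mailbox, memory_order_acquire);
}
```

The ordering guarantee is attached to the publish and poll operations, not to every object in the graph, which is why this avoids the "make everything volatile" trap.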