Problem with Single Step STMF7

Posted on December 07, 2016 at 21:47

There is a problem with IDE debuggers for the STM32F7 and, as I understand it, all ARM Cortex-M7 based processors. With other Cortex controllers, when a debugger pauses or hits a breakpoint, it stops interrupt processing so that, even if interrupts are running in the system, a single step takes you to the next line in the software module you are debugging. With the F7 debuggers, like GDB which is used by Eclipse, SW4STM32, Keil and other toolsets, when you hit a breakpoint and then try to single step while any interrupts are running in the system, you step into an ISR or land on the same instruction again, not your next instruction, and this repeats forever. The 'workaround' is apparently to put your cursor on another line and tell the debugger to 'Run to Cursor'. But even this does not work until you disable the breakpoint where you stopped, because the interrupt return keeps taking you back to the breakpoint rather than going on.

Does anyone know if this is being fixed? Or if there is a workaround? Or if any toolsets have fixed it? I have been working on a complex application on the F746 Discovery board that uses FreeRTOS, STemWin, A/Ds running on interrupts, etc., and this bug makes it really painful to work. I am using SW4STM32, but I assume this also affects the big-money toolchains that also use the basic GDB debugger as their main debug engine.

Help?

#f7 #sw4stm32 #ide

Note: this post was migrated and contained many threaded conversations, some content may be missing.
64 REPLIES
A L
Associate II
Posted on April 22, 2018 at 14:33

Hi. I have the issue described in http://www.keil.com/support/docs/3778.htm on an STM32F746ZG board (http://www.st.com/en/microcontrollers/stm32f746zg.html); single stepping in GDB is impossible due to the pending interrupt bug.

Here is some info on my particular microcontroller:

ARM Cortex-M7 r0p1

Unique ID: 0x00420037, 0x32355112, 0x35353534

CPUID 410FC271 DEVID 449

System Clock: 216000000 Hz

C0000038 2004FD18 00000000

10110021 11000011 00000040

FPU-S Single-precision only

Are there any workarounds for debugging it with GDB that solve this issue? (I have not read the whole thread yet...)

Does it make sense to try to change the DHCSR register in GDB?

And are there any chips with fixed r0p2 cores available already?

Any help would be appreciated. I have had this issue since 2016 and hoped that it would be fixed in more recent chips, but it wasn't when I ordered new ones recently.
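For reference on the DHCSR question above, here is a minimal sketch of how the value written to that register is composed. The address and bit positions are the ones defined by the ARMv7-M architecture; the function names are made up for illustration. Nothing in this thread confirms that setting C_MASKINTS by hand avoids the r0p1 stepping problem, so treat it as something to experiment with rather than a known fix.

```c
#include <stdint.h>

/* Architectural definitions (ARMv7-M debug), not specific to ST parts. */
#define DHCSR_ADDR        0xE000EDF0u      /* Debug Halting Control and Status Register */
#define DHCSR_DBGKEY      (0xA05Fu << 16)  /* key required in bits [31:16] on every write */
#define DHCSR_C_DEBUGEN   (1u << 0)        /* halting debug enabled */
#define DHCSR_C_HALT      (1u << 1)        /* core halted */
#define DHCSR_C_STEP      (1u << 2)        /* single-step request */
#define DHCSR_C_MASKINTS  (1u << 3)        /* mask PendSV, SysTick and external interrupts */

/* Hypothetical helper: value to write while the core stays halted.
 * C_MASKINTS should only be changed while C_HALT is set, and the write
 * that changes it must also write C_HALT as 1. */
static inline uint32_t dhcsr_halt_with_ints_masked(void)
{
    return DHCSR_DBGKEY | DHCSR_C_MASKINTS | DHCSR_C_HALT | DHCSR_C_DEBUGEN;
}

/* Hypothetical helper: value for one step with interrupts still masked. */
static inline uint32_t dhcsr_step_with_ints_masked(void)
{
    return DHCSR_DBGKEY | DHCSR_C_MASKINTS | DHCSR_C_STEP | DHCSR_C_DEBUGEN;
}
```

From GDB, a raw write of the first value would look like `set {unsigned int}0xE000EDF0 = 0xA05F000B`, but the debug probe software normally manages DHCSR itself and may overwrite it on the next step, so this is untested here. OpenOCD also documents a `cortex_m maskisr` command that may be the more practical way to try the same idea.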

Posted on April 22, 2018 at 16:01

ST's policy up to now has been to keep using the same core for a specific product, and only step their peripheral IP around it.

The F77x/F76x and H7xx use different cores

CPUID 411FC270 DEVID 451 REVID 1000

Cortex M7 r1p0

STM32F76xxx or F77xxx

C0000018 20021EB0 00000000

10110221 12000011 00000040

FPU-D Single-precision and Double-precision

SystemCoreClock: 200000000, 200.00 MHz

CPUID 411FC271 DEVID 450 REVID 1003

Cortex M7 r1p1

STM32H7xx

C0000018 20000438 00000000

10110221 12000011 00000040

FPU-D Single-precision and Double-precision

SystemCoreClock: 400000000, 400 MHz
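The rNpM names in the dumps above come straight out of the CPUID values. A small sketch of the decoding, using the architectural CPUID field layout; the struct and function names are made up for illustration:

```c
#include <stdint.h>

/* Fields of the ARMv7-M CPUID register. */
typedef struct {
    uint8_t  implementer; /* bits [31:24], 0x41 = ARM */
    uint8_t  variant;     /* bits [23:20], the N in rNpM */
    uint16_t partno;      /* bits [15:4], 0xC27 = Cortex-M7 */
    uint8_t  revision;    /* bits [3:0], the M in rNpM */
} cpuid_fields;

/* Hypothetical helper: split a raw CPUID value into its fields. */
static inline cpuid_fields cpuid_decode(uint32_t cpuid)
{
    cpuid_fields f;
    f.implementer = (uint8_t)(cpuid >> 24);
    f.variant     = (uint8_t)((cpuid >> 20) & 0xFu);
    f.partno      = (uint16_t)((cpuid >> 4) & 0xFFFu);
    f.revision    = (uint8_t)(cpuid & 0xFu);
    return f;
}
```

With the values posted in this thread, 0x410FC271 decodes as r0p1, 0x411FC270 as r1p0 and 0x411FC271 as r1p1, all with part number 0xC27. On the target the raw value would be read from SCB->CPUID (address 0xE000ED00) when using CMSIS headers.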

My general recommendation would be to try the J-Link firmware, and to instrument your code to understand flow and interaction.

The situations where I need to single-step my code to know what it is doing are an extremely fractional case; static analysis is very effective, and knowing, via reporting, what it is doing dynamically augments that. Add code to get the visibility you need, and use bit flags or levels to turn on/off or isolate reporting.
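A minimal sketch of what reporting gated by bit flags could look like. The channel names, `dbg_mask` and `dbg_report` are hypothetical, and the sketch assumes `printf` output has been retargeted to something visible on the board (UART, SWO or similar):

```c
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical reporting channels, one bit each so they can be combined. */
enum {
    DBG_ADC  = 1u << 0,
    DBG_RTOS = 1u << 1,
    DBG_GUI  = 1u << 2
};

/* Runtime mask: set or clear bits to turn channels on or off. */
uint32_t dbg_mask = DBG_ADC;

/* Prints only if the channel is enabled.
 * Returns the number of characters written, or 0 if the channel is off. */
int dbg_report(uint32_t channel, const char *fmt, ...)
{
    va_list args;
    int written;

    if ((dbg_mask & channel) == 0u) {
        return 0;
    }
    va_start(args, fmt);
    written = vprintf(fmt, args);
    va_end(args);
    return written;
}
```

A call such as `dbg_report(DBG_ADC, "adc=%d\n", sample)` prints while the ADC bit is set, and `dbg_mask |= DBG_RTOS` enables the RTOS channel at run time without rebuilding.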

Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..
Posted on April 23, 2018 at 13:43

I am sure that there is an erratum available that identifies which core revisions have this bug.

ARM Cortex-M7 r0p0 and ARM Cortex-M7 r0p1, if I remember it right.

Posted on April 23, 2018 at 16:25

Thank you Clive One!

The Keil article http://www.keil.com/support/docs/3778.htm says:

Also, check if the device manufacturer offers updated revisions of the device with a Cortex-M7 core revision r0p2 or newer.

According to this article, the bug should have been fixed in revision r0p2 of the ARM Cortex-M7.

Clive One wrote:

ST's policy up to now has been to keep using the same core for a specific product, and only step their peripheral IP around it.

Does that mean that the core will not be updated at all? And that we need to move on to another STM32 type? (Which type would be a replacement for an STM32F746ZGT6?)

My general recommendation would be to try the J-Link firmware, and to instrument your code to understand flow and interaction.

I use the official ST-LINK/V2 programmer by ST (http://www.st.com/en/development-tools/st-link-v2.html) for debugging and programming purposes. And GDB with a standard GNU toolchain.

The situations where I need to single-step my code to know what it is doing are an extremely fractional case; static analysis is very effective, and knowing, via reporting, what it is doing dynamically augments that. Add code to get the visibility you need, and use bit flags or levels to turn on/off or isolate reporting.

Well, it is possible to live without debugging the code on the target and stepping through it, but I don't understand how such an impactful bug can survive for over 2 years, and why buggy chips are still sold by official distributors.

Any hints on how to solve the issue would be much appreciated.

Posted on April 23, 2018 at 18:05

There is a J-LINK OB firmware that runs on the ST-LINK, it is described earlier in the thread. Segger's business has more focus on resolving/addressing ARM level anomalies.

Debugging is an extremely fractional use case, expect 99.99999% of all parts sold never get attached to one, so don't expect millions of dollars to be spent on new mask sets and validation to address low impact problems. Your view of the bug/behaviour is distorted, it is of marginal importance in the millions of deployed devices. You're debugging a handful of devices in a lab, the number of devices the software will go into is several orders of magnitude higher in most any commercial context.

Most debugging relates to logic flaws and bugs in code you've written. Most problems where single-stepping is employed can be solved by static analysis and understanding what code you've written, and what the compiler generated; computers are stupid and do very predictable things. Problems that are dynamic and bound in the time domain are not resolved by dead stopping live systems. In mechanical systems this can result in physical damage and destruction, things with momentum or inertia are apt to rip themselves apart, and things where gravity is involved can fall out of the sky.

>>Does that mean that the core will not be updated at all?

Correct, the ARM IP is generally left alone. Problems that affect peripherals, i.e. Ethernet, may be addressed in steppings as that impacts user facing functionality and sale-ability of the product.

>>Which type would be a replacement for an STM32F746ZGT6

Check for a pin equivalent in the F76x/F77x families. Should be able to run same code in a materially equivalent way. The Cache and TCM memory sizes may have an impact, but if you test your code thoroughly on both parts it's hard to believe these differences will be impactful to the logic of your code and just add interesting variability in the dynamics of things. You can build for the FPU-S. Do your single-stepping on the more advanced core, deploy the final software to the least cost device.

Posted on April 23, 2018 at 22:29

Hi Clive,

your comment explains it in detail and should make the situation clear. I had a phone call with Segger (they are quite close to me in Germany) about 18 months ago about this issue, and they confirmed the bug in the two core versions.

I don't know exactly what they are doing as a workaround, but it works pretty nicely in the J-Link firmware.

The debugger is quite helpful for solving issues around externally attached hardware. Debugging a PC platform is much easier than debugging a microcontroller in a hardware-dependent environment. As you wrote before, timing issues account for most of the problems, and often enough the software I have written is not the source of the unexpected behavior.

Hardware-based (OCD) debugging is a powerful tool that makes development easier and faster. The ST-LINK/V2 is too slow and unreliable for my purposes. What I don't understand is that ST knows about the problem and doesn't implement a fix in their own ST-LINK firmware like Segger did.
Posted on April 23, 2018 at 22:42

I suspect ST engineers have been busy with CubeMX and the increasingly expanding STM32 portfolio. Generally teams get moved from one project to the next, the original F7 design is some 4 years old at this point, people have moved on, and whatever flaws in the ARM IP will remain in what is an archived design.

IC Design is not like a software project that drags on indefinitely, there are points where it is done, the design tapes out and it moves to the realm of manufacture, validation and test. Flaws and issues that are identified get fed into the next designs as things not to repeat, new IP may also be available from third-parties based on feedback from all consumers of that IP, and improvements added.

Posted on April 23, 2018 at 23:22

Oh, I understand that the R&D people at ST (and of course at all the others too) are in a rush. Time is money and the market develops fast, but this fix is not magic, it is not a nuclear power plant :-). ST has spent a lot of money on market development in the last few years. The various Nucleos, Discoveries and shields for nearly every purpose you can imagine are available for just a few bucks. Arduino has taught them how a controller can be successful with a wide range of customers. They acquired Atollic TrueSTUDIO, even though they had supported AC6 for a long time. They develop a newer HAL, develop newer high-performance controllers like the Arm M8, and in addition the CubeMX development consumes a lot of money without a defined ROI. All of this is meant to support and secure the business of today and tomorrow.

And yet they fall short on a more reliable firmware for the debugger?

The Nucleo and Disco boards are aimed at education and at individuals such as radio amateurs, digital photographers, the whole IoT community, students and many other non-commercial users. They are the ones who need a functional debugger, not the professional who sells 2000 or more devices each day. I guess ST has ignored this because professional developers don't use an ST-LINK/1,2,3... They use a J-Link Trace or a Lauterbach TRACE32 for a few k$ each, all of which work around such issues.

Many of the ST boards will be debugged, and their users will get into trouble if the IP core is not bug-free or a workaround is not available.

The projects I have in mind are less complex, and a cheaper J-Link will be universal enough for our purposes.
Posted on April 24, 2018 at 17:08

Thank you for the detailed answer!

Clive One wrote:

>>Which type would be a replacement for an STM32F746ZGT6

Check for a pin equivalent in the F76x/F77x families. Should be able to run same code in a materially equivalent way. The Cache and TCM memory sizes may have an impact, but if you test your code thoroughly on both parts it's hard to believe these differences will be impactful to the logic of your code and just add interesting variability in the dynamics of things. You can build for the FPU-S. Do your single-stepping on the more advanced core, deploy the final software to the least cost device.

This is probably the most straight-forward solution for me right now, since the board is nearly finished and ready for more software development... And I don't want to fiddle with more advanced debugging gear right now.

So, it would be possible to replace the old F74x with one of those?

STM32F767ZG

STM32F767ZI

STM32F777ZI

How can we make sure that the ARM core is up-to-date in those chips?

I've contacted Digikey support, and asked how one can find out more about the actual revision of the parts in stock. Hope to get an answer soon.

As https://community.st.com/people/64044 pointed out earlier in the thread, the bug seems to have been fixed on the STM32F769, for example. I cannot find any details about this in the datasheets, reference manuals or programming manuals, though I'm sure the info is somewhere. And the errata sheets only list errors in the peripherals, I think, not the ARM part.

There is a J-LINK OB firmware that runs on the ST-LINK, it is described earlier in the thread. Segger's business has more focus on resolving/addressing ARM level anomalies.

Interesting. So far I've only used ST-LINK/V2 with GDB / OpenOCD on Linux. I've heard good things about Segger tools and may eventually switch over, but I'm not sure how much hassle it will be to get them working with my setup, and if it's worth the cost.

Debugging is an extremely fractional use case, expect 99.99999% of all parts sold never get attached to one, so don't expect millions of dollars to be spent on new mask sets and validation to address low impact problems. Your view of the bug/behaviour is distorted, it is of marginal importance in the millions of deployed devices. You're debugging a handful of devices in a lab, the number of devices the software will go into is several orders of magnitude higher in most any commercial context.

Yes, that makes sense.

Most debugging relates to logic flaws and bugs in code you've written. Most problems where single-stepping is employed can be solved by static analysis and understanding what code you've written, and what the compiler generated; computers are stupid and do very predictable things. Problems that are dynamic and bound in the time domain are not resolved by dead stopping live systems. In mechanical systems this can result in physical damage and destruction, things with momentum or inertia are apt to rip themselves apart, and things where gravity is involved can fall out of the sky.

Yes, I agree. The reason why I really want to be able to debug the program on the device is that it is partially an educational board I'm making, so the user will be programming it too.

>>Does that mean that the core will not be updated at all?

Correct, the ARM IP is generally left alone. Problems that affect peripherals, i.e. Ethernet, may be addressed in steppings as that impacts user facing functionality and sale-ability of the product.

That's good to know; after making the first prototype I thought: well, they are certainly going to fix this soon, so I simply need to make basic tests, and replace the chip later. - But I see the reasons. - ST needs to make many cuts to keep their powerful parts affordable enough I guess.

Posted on April 24, 2018 at 18:25

I've posted core revisions for the F76x earlier in the thread, the initial product was r1p0, not the r0px cores

https://community.st.com/0D50X00009XkgAOSAZ

The stepping of the ST IP is the A, X, Y, Z type designators. The published Errata (ST) should cover the details there.

Stupid post level links still broken, argh!!!

The r1p0 should be newer than r0p2
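To make that ordering concrete, here is a small sketch that checks whether a raw CPUID value is at or above a given rNpM revision, using the architectural variant (bits [23:20]) and revision (bits [3:0]) fields. The function names are made up for illustration, and the r0p2 threshold is the one quoted from the Keil article earlier in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: pack rNpM into one number so revisions order naturally. */
static inline unsigned core_rev_key(unsigned variant, unsigned patch)
{
    return ((variant & 0xFu) << 4) | (patch & 0xFu);
}

/* Hypothetical helper: true if the core described by cpuid is at least rNpM. */
static inline bool cpuid_at_least(uint32_t cpuid, unsigned variant, unsigned patch)
{
    unsigned v = (cpuid >> 20) & 0xFu;  /* N in rNpM */
    unsigned p = cpuid & 0xFu;          /* M in rNpM */
    return core_rev_key(v, p) >= core_rev_key(variant, patch);
}
```

With the values posted in this thread, `cpuid_at_least(0x410FC271, 0, 2)` is false for the F746 (r0p1), while the F76x value 0x411FC270 (r1p0) and the H7 value 0x411FC271 (r1p1) both pass.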
