
DWT and Microsecond Delay


Hello everyone
I want to use the DWT unit to create a microsecond delay function. How safe is it to use the DWT unit?
Does it only work in debug mode, or does it also work at runtime?

Is there a guarantee that it will start up properly every time the MCU is reset?

Can I use it in places where critical processing is required?


What chip? Not all series have this.

It's perfectly safe to use DWT->CYCCNT for any use, including non-debugging scenarios. You need to start it at startup, which is easily done:


  // enable core debug timers
  SET_BIT(CoreDebug->DEMCR, CoreDebug_DEMCR_TRCENA_Msk);
  // unlock write access to DWT registers
#ifdef STM32H7
  DWT->LAR = 0xC5ACCE55;
#endif
  // reset and enable the clock counter
  DWT->CYCCNT = 0;
  SET_BIT(DWT->CTRL, DWT_CTRL_CYCCNTENA_Msk);




You are responsible for starting it, just like you are for the SysTick.

It counts machine cycles, so is the highest resolution counter typically available.
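As a minimal sketch of how a cycle counter turns into a microsecond delay: convert the requested time to cycles, then spin until that many cycles have elapsed. The helper names and the `read_cycles` callback are hypothetical and used here only for illustration; on a target, `read_cycles` would simply return `DWT->CYCCNT` and `core_hz` would normally be `SystemCoreClock`.

```c
#include <stdint.h>

/* Hypothetical helper: convert microseconds to core clock cycles.
   The 64-bit intermediate avoids overflow for large delays. */
static inline uint32_t us_to_cycles(uint32_t us, uint32_t core_hz)
{
    return (uint32_t)(((uint64_t)us * core_hz) / 1000000u);
}

/* Cycles elapsed between two readings of a 32-bit up-counter.
   Unsigned subtraction stays correct across one counter wrap. */
static inline uint32_t cycles_elapsed(uint32_t start, uint32_t now)
{
    return now - start;
}

/* Busy-wait for 'us' microseconds. read_cycles is an assumed callback
   that returns the current counter value (DWT->CYCCNT on a target). */
static inline void delay_us(uint32_t us, uint32_t core_hz,
                            uint32_t (*read_cycles)(void))
{
    const uint32_t start = read_cycles();
    const uint32_t ticks = us_to_cycles(us, core_hz);

    while (cycles_elapsed(start, read_cycles()) < ticks) {
        /* spin */
    }
}
```

Comparing the elapsed count against the target, rather than comparing the counter against an absolute end value, is what keeps the wait correct when the counter wraps during the delay.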

It is not present in the CM0(+) offerings, but is included in all of ST's other STM32 cores to date. It is optional, though: units like the ITM, DWT and FPB can be dropped by implementers to save gates.

Your alternative is to use a 16- or 32-bit TIM set to its maximal period, clocked at 1 MHz or as fast as possible, and to gauge elapsed time from how far TIM->CNT has advanced. A fast count gives you tight accuracy, but interrupts can still cause other things to run in the meantime and stretch the delay.
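A rough sketch of the arithmetic for the TIM approach, assuming a free-running 16-bit up-counter with the auto-reload at 0xFFFF. The function names are hypothetical; on STM32 timers the counter clock is the timer input clock divided by (PSC + 1), which is what the second helper computes.

```c
#include <stdint.h>

/* Elapsed ticks of a free-running 16-bit up-counter (auto-reload 0xFFFF).
   The cast back to uint16_t makes the subtraction wrap at 16 bits. */
static inline uint16_t tim16_elapsed(uint16_t start, uint16_t now)
{
    return (uint16_t)(now - start);
}

/* Prescaler register value for a desired tick rate.
   Assumes tim_clk_hz is an exact multiple of tick_hz. */
static inline uint32_t tim_prescaler(uint32_t tim_clk_hz, uint32_t tick_hz)
{
    return (tim_clk_hz / tick_hz) - 1u;
}
```

With a 1 MHz tick, a 16-bit counter wraps every 65.536 ms, so a single delay measured this way has to stay below that.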

SysTick is 24-bit and down-counting. It can be used, but the math and wrapping are far more clumsy.
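To illustrate the extra care a down-counter needs, here is a small sketch that assumes the SysTick reload value is 0xFFFFFF, so the counter covers the full 24-bit range. The function name is hypothetical. With the more common 1 ms reload the modulus is no longer a power of two and the arithmetic gets messier, which is the clumsiness mentioned above.

```c
#include <stdint.h>

#define SYSTICK_MASK 0x00FFFFFFu  /* 24-bit counter range */

/* Elapsed ticks of a 24-bit down-counter reloading from 0xFFFFFF.
   The operands are swapped relative to an up-counter, and the result
   is masked to 24 bits so a wrap through zero is handled. */
static inline uint32_t systick_elapsed(uint32_t start, uint32_t now)
{
    return (start - now) & SYSTICK_MASK;
}
```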


For the STM32F4 series, I will use the CYCCNT register of the DWT unit. Can the CYCCNT register ever fail to start counting after a processor reset?
My other functions rely on my microsecond delay function. If CYCCNT does not start, my other functions will not work either.
Is the code you wrote above reliable for initializing the CYCCNT register?

I wrote a microsecond delay function using the CYCCNT register and tested it with a logic analyzer. If HSI is selected as the system clock, it does not produce the correct results, but when HSE is selected, I get the delay I want. Any thoughts about this? Is this happening due to HSI stability?


The code I posted reliably starts the CYCCNT counter.

> Any thoughts about this? Is this happening due to HSI stability?

Probably a code bug. Don't see how HSI tolerance could have any effect. Perhaps expand upon "it cannot produce the correct results". Why can't it? What results does it produce? What results are you expecting instead? Be detailed.

CYCCNT works with HSI just as it does with any other clock input.


I have a microsecond delay function. It reads the CYCCNT register, then waits in a while loop, comparing the initial value with the current one. I create a delay of 10 microseconds with the HSI clock. When I check it with a logic analyzer, I observe a delay of 15 microseconds, while when I select HSE, I see a delay of 11 - 12 microseconds. What I mean is that I get more accurate results with HSE.
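One thing that may be worth checking, offered as an assumption and not something confirmed in this thread: any fixed overhead around the wait loop (function call, GPIO toggling, the compare itself) costs the same number of cycles at every clock speed, so it costs more time at a slower core clock. The sketch below converts cycles to nanoseconds; the helper name and the 80-cycle figure are purely illustrative. On STM32F4 the HSI runs at 16 MHz.

```c
#include <stdint.h>

/* Time in nanoseconds taken by a fixed number of core cycles.
   Hypothetical helper, used only to illustrate the arithmetic. */
static inline uint32_t cycles_to_ns(uint32_t cycles, uint32_t core_hz)
{
    return (uint32_t)(((uint64_t)cycles * 1000000000u) / core_hz);
}
```

For example, `cycles_to_ns(80u, 16000000u)` gives 5000 ns, while the same 80 cycles at a 168 MHz core clock take under 500 ns. Whether that explains the numbers above depends on the actual SYSCLK in each configuration, which the post does not state.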


What is "DWT"?

When I implemented a "microsecond" delay timer in my project:

  • most of my projects are based on an RTOS:
    this gives me a 1 millisecond delay (via osDelay())
  • but that is not fine-grained enough, sometimes I need microseconds
  • I have used a TIM (counter/timer):
    configure the counter to expire (and fire an INT) after X microseconds
    and wait in blocking mode until the INT is seen
  • but the drawback is:
    I have to initialize, fire and wait for completion.
    During this time, the RTOS is not scheduling any other task/thread (it would not make much sense to try
    another thread/task for a few microseconds, the overhead is way too much anyway)
  • So, with a TIM I can wait for microseconds, but if it is 1000 microseconds I am still blocking the RTOS scheduler
    (OK, you could "combine" osDelay with the TIM microsecond delay, my concern is just the "jitter" generated when
    the RTOS kicks in, schedules a new task/thread and my timer expires too late...)
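The "combine" idea from the last bullet could be sketched roughly as follows: hand the bulk of a long delay to the RTOS, but keep at least one millisecond in reserve and spin on a free-running counter for the rest, measured from the original start time. The function name and the one-tick margin are assumptions for illustration, and the sketch assumes a 1 ms RTOS tick. It does not remove jitter from higher-priority tasks or interrupts.

```c
#include <stdint.h>

/* Number of whole milliseconds to pass to the RTOS sleep call (e.g. osDelay)
   for a total delay of total_us microseconds. One millisecond is held back
   so the final stretch can be busy-waited against a free-running counter. */
static inline uint32_t coarse_sleep_ms(uint32_t total_us)
{
    const uint32_t whole_ms = total_us / 1000u;

    return (whole_ms > 1u) ? (whole_ms - 1u) : 0u;
}
```

A caller would record the counter value first, sleep for `coarse_sleep_ms(total_us)` milliseconds if that is non-zero, and then spin until `total_us` has elapsed since the recorded start.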

For very fine-grained timing, I have also used a "NOOP_delay": wait until N NOPs have executed (this also blocks the RTOS scheduler). It is just a bit tricky to "calibrate" (how many NOPs make one microsecond?).

Just "waste the time" with NOPs (in a loop with N as parameter). As long as it remains below 1 millisecond, you should be fine.
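A minimal sketch of such a NOP loop, assuming a GCC- or Clang-style compiler for the inline assembly. `NOP_LOOPS_PER_US` is a hypothetical calibration constant: the value 10 is a placeholder, and the real figure has to be measured on the target because it depends on core clock, flash wait states and compiler settings.

```c
#include <stdint.h>

/* Hypothetical calibration constant: loop iterations per microsecond.
   Placeholder value; measure on the actual target. */
#define NOP_LOOPS_PER_US 10u

/* Loop count for a requested delay in microseconds. */
static inline uint32_t nop_loops_for_us(uint32_t us)
{
    return us * NOP_LOOPS_PER_US;
}

/* Burn time by executing 'loops' iterations of a NOP.
   The volatile loop variable keeps the compiler from removing the loop. */
static inline void nop_delay(uint32_t loops)
{
    for (volatile uint32_t i = 0; i < loops; ++i) {
        __asm__ volatile ("nop");
    }
}
```

Unlike a counter-based wait, this loop is simply extended by any interrupt that fires during it, so it only gives a lower bound on the delay.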