con3
Senior
November 6, 2017
Solved

Toggling a GPIO pin at 50+ MHz

Posted on November 06, 2017 at 16:22

The original post was too long to process during our migration. Please click on the attachment to read the original post.
    Best answer by waclawek.jan
    Posted on November 07, 2017 at 14:22

    Looks very much like you're hitting the limits of your measurement setup, i.e. scope bandwidth and (more probably) probe bandwidth - unless there's some limitation imposed by the PCB and/or connected circuitry, too.

    Is it a 1:10 probe?

    Also, from which memory are you running? Did you confirm your system clock runs at the frequency you assume it does?

    JW

    3 replies

    S.Ma
    Principal
    November 6, 2017
    Posted on November 06, 2017 at 16:43

    You should use a high speed timer output compare to run a GPIO at this high speed (beware of signal integrity), and make sure the GPIO speed (slew rate) is programmed as appropriate.

    With SPI, you can easily get 27 or 54 MHz (with good PCB)
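To put numbers on the timer suggestion above, here is a minimal sketch of the usual general-purpose timer relation f_out = f_tim / ((PSC + 1) * (ARR + 1)). The function names and the 216 MHz timer clock used in the note below are assumptions for illustration only; the real timer clock depends on how the clock tree is configured.

```c
#include <stdint.h>

/* PWM mode: one output period per counter period.
   f_out = f_tim / ((PSC + 1) * (ARR + 1)) */
static inline uint32_t tim_pwm_freq_hz(uint32_t tim_clk_hz, uint32_t psc, uint32_t arr)
{
    /* 64-bit divisor so large PSC/ARR values cannot overflow */
    uint64_t div = (uint64_t)(psc + 1u) * (uint64_t)(arr + 1u);
    return (uint32_t)(tim_clk_hz / div);
}

/* Output-compare toggle mode: the pin flips once per counter period,
   so a full output cycle takes two counter periods. */
static inline uint32_t tim_toggle_freq_hz(uint32_t tim_clk_hz, uint32_t psc, uint32_t arr)
{
    return tim_pwm_freq_hz(tim_clk_hz, psc, arr) / 2u;
}
```

With an assumed 216 MHz timer clock, PSC = 0 and ARR = 3 would give 54 MHz in PWM mode, and ARR = 1 would give 54 MHz in toggle mode. Whether the pin and PCB can carry such an edge rate is a separate question.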

    waclawek.jan
    Super User
    November 6, 2017
    Posted on November 06, 2017 at 17:11

    Try an unrolled sequence of

    while (1)

    {

      GPIOC->ODR = GPIO_PIN_9;

      GPIOC->ODR = 0;

      GPIOC->ODR = GPIO_PIN_9;

      GPIOC->ODR = 0;

      GPIOC->ODR = GPIO_PIN_9;

      GPIOC->ODR = 0;

      GPIOC->ODR = GPIO_PIN_9;

      GPIOC->ODR = 0;

      GPIOC->ODR = GPIO_PIN_9;

      GPIOC->ODR = 0;

      GPIOC->ODR = GPIO_PIN_9;

      GPIOC->ODR = 0;

    }

    The jump at the end of the while(1) loop may take surprisingly long, especially when running from uncached flash. The read-modify-write operation (XOR) takes some time, too.

    JW

    PS. I wonder what exactly in this post triggered moderation...

    Tesla DeLorean
    Guru
    November 6, 2017
    Posted on November 06, 2017 at 17:22

    Don't use XOR; it burns a lot of cycles on reads and writes that can't be pipelined. Write to BSRR to set and then clear the GPIO, and unroll the loop.

    Use a TIM to output toggling signals.
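The difference between the XOR toggle and BSRR writes can be illustrated with a small host-side model. This is only a sketch: `gpio_model_t` and the helper names are hypothetical and stand in for the real peripheral, but the BSRR semantics modelled here (low half sets, high half resets, set wins when both are given) follow the STM32 reference manuals.

```c
#include <stdint.h>

/* Hypothetical host-side model of a GPIO port; not the real peripheral. */
typedef struct { uint16_t odr; } gpio_model_t;

/* XOR toggle: needs a load, an XOR and a store (read-modify-write). */
static inline void model_toggle_xor(gpio_model_t *g, unsigned pin)
{
    g->odr ^= (uint16_t)(1u << pin);
}

/* BSRR write: a single store. Bits 0..15 set pins, bits 16..31 reset pins,
   and a set request takes priority over a reset of the same pin. */
static inline void model_write_bsrr(gpio_model_t *g, uint32_t value)
{
    uint16_t set   = (uint16_t)(value & 0xFFFFu);
    uint16_t reset = (uint16_t)(value >> 16);
    g->odr = (uint16_t)((g->odr & (uint16_t)~reset) | set);
}

/* Mask helpers for one pin. */
static inline uint32_t bsrr_set(unsigned pin)   { return 1u << pin; }
static inline uint32_t bsrr_reset(unsigned pin) { return 1u << (pin + 16u); }
```

On the target, the two BSRR writes would be `GPIOC->BSRR = bsrr_set(9);` followed by `GPIOC->BSRR = bsrr_reset(9);`. Neither needs to read the port first, and other pins on the port keep their state.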

    con3 (Author)
    Senior
    November 7, 2017
    Posted on November 07, 2017 at 11:15

    Just to double check, something like this:

    while (1)
    {
    /* USER CODE END WHILE */

    /* USER CODE BEGIN 3 */
    GPIOC->BSRR = (1<<9);
    GPIOC->BSRR = (1<<25);
    }

    I just want to check how fast I can actually do this in the main. 

    +

    SYSCFG->CMPCR = 0x1;

    Something seems odd to me: I tried using an external interrupt, but the fastest signal I could get it to react to was 800 kHz, so I'm checking code execution time against the number of assembler instructions.

    Jan Waclawek
    Visitor II
    November 7, 2017
    Posted on November 07, 2017 at 12:03

    That's about 200 cycles, not surprising at all.

    The inherent entry/exit is about two dozen cycles, provided the stack is in single-cycle-access memory. Add to that the prologue and epilogue added by the compiler, plus whatever processing happens in the ISR. You can get down to, say, 3-4 dozen cycles if you can run it from single-cycle-execution memory like the TCM RAM.
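The cycle budget behind that estimate is just the core clock divided by the event rate. A minimal sketch follows; the function name is hypothetical, and the 216 MHz core clock in the note is an assumption inferred from the 108 MHz half-clock figure mentioned elsewhere in the thread, not something the poster stated.

```c
#include <stdint.h>

/* Core cycles available per event at a given event rate. */
static inline uint32_t cycles_per_event(uint32_t core_clk_hz, uint32_t event_rate_hz)
{
    return core_clk_hz / event_rate_hz;
}
```

At an assumed 216 MHz, an 800 kHz event rate leaves 270 cycles per interrupt, the same order of magnitude as the figure quoted above.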

    JW

    Martin HUBIK
    Associate
    November 9, 2017
    Posted on November 09, 2017 at 11:38

    Just to give another view on this, take the code that was suggested above:

    while (1)
    {
      GPIOC->ODR = GPIO_PIN_9;
      GPIOC->ODR = 0;
    }

    Each of these writes assembles into a simple STR Rx, [Ry], where Rx holds the value being written and Ry holds the address of the ODR register. According to section 3.3.3, Load/store timings, of the ARM Cortex-M7 Technical Reference Manual (Cortex M7_r0p2.pdf), this instruction takes a single cycle. So you might expect to toggle the pin at half the SystemClock frequency, which is indeed 108 MHz.

    At the end of the while loop there is an unconditional branch back to its beginning. This takes 1 + P cycles, where P ranges from 1 to 3 depending on the alignment and width of the target instruction, and on whether the processor manages to speculate the address early. On F4 devices you would probably see a discontinuity when this happens. On F7 you might not, because of the superscalar nature of the core. Check the following simple benchmark, which was executed on F4 and F7:

    C code

    for (n = 0; n < NUM_SAMPLES; n++)
    {
      acc += array1[n];
    }

    Assembly

    ??main_1:
    LDR R2,[R0], #+4 // load the next value from array1
    ADDS R4,R2,R4 // add it to accumulator
    SUBS R1,R1,#+1 // decrement loop counter
    BNE.N ??main_1 // loop back

    Core    Cycles for 500 iterations    Cycles per iteration
    M7      1024                         2.048
    M4      3006                         6.012

    As was already said, it is important that the caches are enabled to compensate for slow flash access: either the ART cache when the ITCM interface is used, or the core caches when the AXI bus is used.
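The cycle counting earlier in this post can be turned into a rough estimate of the average toggle rate. The sketch below is idealized and rests on stated assumptions: one cycle per store, 1 + P cycles for the closing branch, no bus wait states and no dual-issue effects. The function names and the 216 MHz clock in the note are hypothetical.

```c
#include <stdint.h>

/* Cycles for one pass through the loop: one per store,
   plus 1 + P for the branch back to the top (P = 1..3). */
static inline uint32_t loop_cycles(uint32_t stores, uint32_t p)
{
    return stores + 1u + p;
}

/* Average output frequency: each pair of stores (high, low)
   produces one output period per pass. */
static inline uint32_t avg_toggle_hz(uint32_t core_clk_hz, uint32_t stores, uint32_t p)
{
    uint64_t periods = stores / 2u;
    return (uint32_t)((uint64_t)core_clk_hz * periods / loop_cycles(stores, p));
}
```

Under these assumptions, a two-store loop at 216 MHz averages between 36 MHz (P = 3) and 54 MHz (P = 1), while unrolling to twelve stores lifts the average to roughly 92.6 MHz with P = 1. The branch still shows up as one stretched period per pass, so the waveform is not a clean square wave.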

    Best regards,

    Martin

    STMicroelectronics, Microcontroller Application Support Engineer

    alexandre239955_stm1
    Associate
    November 9, 2017
    Posted on November 09, 2017 at 14:43

    You could use the BSRR register:

    while(1)
    {
      GPIOC->BSRR = (1 << 9); // Set bit 9
      GPIOC->BSRR = (1 << (9+16)); // Reset bit 9
    }

    Another way to optimize is to use bit-banding, because it doesn't affect the other pins of the same port.

    __no_init volatile unsigned int GPIOC_ODR_9 @ 0x424102A4;

    while(1)
    {
      GPIOC_ODR_9 = 1;
      GPIOC_ODR_9 = 0;
    }
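For readers wondering where 0x424102A4 comes from, the sketch below reproduces it from the standard Cortex-M3/M4 bit-band alias formula. Several things here are assumptions: the GPIOC->ODR address is the STM32F4 one (GPIOC base 0x40020800 plus ODR offset 0x14), the macro and function names are illustrative, and the `__no_init` and `@` syntax in the declaration above is an IAR compiler extension, for which a plain pointer cast is shown as a portable alternative. Bit-banding is a Cortex-M3/M4 feature and is not present on the Cortex-M7, so this would not apply to an F7 part.

```c
#include <stdint.h>

#define PERIPH_BB_BASE   0x40000000u  /* start of the peripheral bit-band region */
#define PERIPH_BB_ALIAS  0x42000000u  /* start of the matching alias region */

/* Each bit in the bit-band region maps to one 32-bit word in the alias region:
   alias = alias_base + byte_offset * 32 + bit_number * 4 */
static inline uint32_t bitband_alias(uint32_t reg_addr, uint32_t bit)
{
    return PERIPH_BB_ALIAS + (reg_addr - PERIPH_BB_BASE) * 32u + bit * 4u;
}

/* Assumed address of GPIOC->ODR on an STM32F4 (0x40020800 + 0x14). */
#define GPIOC_ODR_ADDR   0x40020814u

/* Portable alternative to the IAR-specific declaration.
   Only dereference this on hardware that actually has bit-banding. */
#define GPIOC_ODR_9      (*(volatile uint32_t *)0x424102A4u)
```

Calling `bitband_alias(GPIOC_ODR_ADDR, 9)` yields 0x424102A4, matching the address in the post. Note that a bit-band write is carried out by the bus as a read-modify-write, so it is not necessarily faster than a BSRR write.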

    LMI2
    Senior III
    November 9, 2017
    Posted on November 09, 2017 at 15:37

    Interesting

    What does __no_init volatile unsigned int GPIOC_ODR_9 @ 0x424102A4; do? Especially __no_init and @ 0x424102A4;

    I have seen __commands in Windows programming but have forgotten their meaning.

    And is @ 0x424102A4 CPU-specific, and what does it do?

    Regards

    Leif M