
Create a 32b parallel bus

Associate II

Hello, I am working on the STM32MP157-DK1. I would like to create a 32-bit parallel bus driven from the Cortex-A7 at a frequency greater than 20 MHz, but when I test on a single pin I get a maximum frequency of 120 kHz, which is far too low. I think my program is the wrong approach.

Can anyone help me?

Here is my code:

// SPDX-License-Identifier: GPL-2.0-only
/*
 * gpio-hammer - example swiss army knife to shake GPIO lines on a system
 *
 * Copyright (C) 2016 Linus Walleij
 *
 * Usage:
 *           gpio-hammer -n <device-name> -o <offset1> -o <offset2>
 */

#include <unistd.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdio.h>
#include <dirent.h>
#include <errno.h>
#include <string.h>
#include <poll.h>
#include <fcntl.h>
#include <getopt.h>
#include <sys/ioctl.h>
#include <linux/gpio.h>
#include "gpio-utils.h"

int hammer_device(const char *device_name, unsigned int *lines, int nlines, unsigned int loops)
{
        struct gpiohandle_data data;
        int fd;
        int ret;
        unsigned int iteration = 0;

        memset(&data.values, 0, sizeof(data.values));
        ret = gpiotools_request_linehandle(device_name, lines, nlines,
                                           GPIOHANDLE_REQUEST_OUTPUT, &data,
                                           "gpio-hammer");
        if (ret < 0)
                return ret;
        fd = ret;

        ret = gpiotools_get_values(fd, &data);
        if (ret < 0)
                goto exit_close;

        /* Toggle the first requested line; loops == 0 means run forever */
        while (1) {
                data.values[0] = !data.values[0];
                ret = gpiotools_set_values(fd, &data);
                if (ret < 0)
                        goto exit_close;

                /*
                 * gpio_set_value() is a kernel-only API and PIN90 is not
                 * defined in this userspace program, so the call is disabled:
                 * gpio_set_value(PIN90, 0);
                 */
                for (short int h = 0; h < 1; h++)
                        ;

                iteration++;
                if (loops && iteration == loops)
                        break;
        }
        ret = 0;

exit_close:
        gpiotools_release_linehandle(fd);
        return ret;
}

int main(int argc, char **argv)
{
        int ret = 1;
        const char *device_name = NULL;
        unsigned int loops = 0;
        int nlines = 1;

        device_name = "gpiochip5"; // PORT F
        unsigned int lines[] = {6}; // pin 6
        ret = hammer_device(device_name, lines, nlines, loops);
        return ret;
}

Thank you.

Principal III

Linux keeps your userland code far away from the kernel driver and the GPIO hardware. 20 MHz is out of reach that way. That's why you have the M4 in the system, and even that might be a challenge.

Associate II

Thank you for your reply. I don't understand: the bus that drives the GPIOs, the MLAHB, runs at 209 MHz, so the GPIOs should be able to reach several MHz, right?

Are you saying that it is impossible to make the 32-bit bus with the STM32MP?

Principal III

What I was saying is that it takes many, many clock cycles before the GPIO function has flipped the bits and returned.

ST Employee


Linux is far from being optimized for real-time work, and you cannot expect to get below a few µs in software (with no guarantee against being preempted). Furthermore, in hardware there are asynchronous bridges between the Cortex-A7 and the GPIOs, which add some clock cycles of latency (less for writes than for reads from the AHB).

As stated by @KnarfB, it would be much better to use the Cortex-M4, which can likely handle a GPIO access in a couple of 209 MHz clock cycles.

Note that toggling 2x16 GPIOs at 20 MHz requires 2 x 2 x 20M = 80M accesses per second to the GPIO output registers (two banks, two edges per period), which leaves very little Cortex-M4 processing time: about 2.6 clock cycles per access at 209 MHz.

Please note that since each GPIO bank is 16 bits wide, there will be some skew between the upper and lower halves of your 32-bit bus.

You could consider using DMA to copy a data buffer to the GPIO register, synchronized by a timer.

However, as one GPIO bank is only 16 bits wide, managing 32 bits would be more complex and would need two DMA channels.

Could you give more details on your expectations/application?

Even if you manage to toggle GPIOs at 20 MHz, there is probably some additional processing to do on your 'data', which would take CPU time as well.

Using the FMC with some external hardware to reconstruct a 32-bit bus (e.g. a small FPGA with a FIFO) might be another option that could work on Linux (it would need prototyping).


In order to give better visibility on the answered topics, please click on 'Accept as Solution' on the reply which solved your issue or answered your question.
Associate II


Indeed, making a 32-bit bus on the Cortex-A7 seems very hard.

My project is a robot that uses TI's AWR1243 sensors and transceivers, which make a very efficient radar possible, but they need a CSI-2 bus.

I have thought about using a Zynq-7000 instead. It has Cortex-A9 cores and an FPGA part, which could be a good fit for my project.

Thank you very much for your very detailed response.


ST Employee


Depending on the required pixel clock rate and image format, you could use CSI-2 on the STM32MP1 with an external STMIPID02 bridge connected to the DCMI; see AN5470, "STM32MP1 Series interfacing with a MIPI® CSI-2 camera". Two sensors can be connected, but only one can be active at a time. We achieved up to 1280 × 720 @ 27 fps or 5 Mpixels @ 3 fps.

All drivers are included in OpenSTLinux.

