cancel
Showing results for 
Search instead for 
Did you mean: 

Clock cycle shift on GPIO output STM32F103

vbesson
Associate III

Dear Community,

I am porting an old application made on AVR to STM32, and I am facing a strange timing issue. 

In a nutshell, the application is reading sector (512 Bytes) from a SDCARD and output the content of the buffer to GPIO with 4us cycle (meaning 3us low, 1 us data signal). 

The SDCard read is working fine, and I have written a small assembly code to output GPIO signal with precise MCU cycle counting. 

Using DWT on the debugger, it give a very stable and precise counting (288 cycles for a total of 4us).

When using a Logic analyser with 24 MHz freq, I can see shift of signal by 1 or 2 cpu cycles and so delay. 

I have tried to use ODR directly and BSRR but with no luck. 

Attached :

- Screenshot of the logic analyzer

Screenshot 2024-05-12 at 06.30.59.png
As you can see I do not have 3us but 3.042 and this is not always the case
 

Clock configuration

Screenshot 2024-05-12 at 06.32.34.png

Port configuration:

 

GPIO_InitStruct.Pin = GPIO_PIN_13| READ_PULSE_Pin|READ_CLK_Pin;
GPIO_InitStruct.Mode = GPIO_MODE_OUTPUT_PP;
GPIO_InitStruct.Pull = GPIO_NOPULL;
GPIO_InitStruct.Speed=GPIO_SPEED_FREQ_HIGH;
HAL_GPIO_Init(GPIOC, &GPIO_InitStruct);
 
Assembly code : 
 
.global wait_1us
wait_1us:
.fnstart
push {lr}
nop ;// 1 1
nop ;// 1 2
mov r2,#20 ;// 1 3
wait_1us_1:
subs r2,r2,#1 ;// 1 1
bne wait_1us_1 ;// 1 2
pop {lr}
bx lr // return from function call
.fnend

.global wait_3us
wait_3us:
.fnstart
push {lr}
nop
nop
wait_3us_1:
subs r2,r2,#1
bne wait_3us_1
pop {lr}
bx lr // return from function call
.fnend
 
 
sendByte:
 
and r5,r3,0x80000000;// 1 1
lsl r3,r3,#1 ;// 1 2 // right shift r3 by 1
subs r4,r4,#1 ;// 1 3 //; dec r4 bit counter
//mov r6,#0 // Reset the DWT Cycle counter for debug cycle counting
//ldr r6,=DWTCYCNT
//mov r2,#0
//str r2,[r6] // end
bne sendBit ;// 1 4
beq process ;// 1 5
// Clk 15, Readpulse 14, Enable 13
sendBit:
ldr r6,=PIN_BSRR ;// 2 2
LDR r2, [r6] ;// 3 5
cmp r5,#0 ;// 1 6
ITE EQ ;// 1 7

 
ORREQ r2,r2, #0x80000000 ;// 1 8 set bit 13 to 1, OR with 0000 0010 0000 0000 0x2000 (Bit13) 0x6000 (Bit13 & 14)
ORRNE r2,r2, #0x00008000 ;// 1 9 set bit 29 to 1, OR with 0010 0000 0000 0000
 
 
ORR r2,r2, #0x00004000 ;// 1 8 set bit 13 to 1, OR with 0000 0010 0000 0000 0x2000 (Bit13) 0x6000 (Bit13 & 14)
 
STR r2, [r6] ;// 1 10 set the GPIO port -> from this point we need 1us, 72 CPU cycles (to be confirmed)
bl wait_1us ;// 65 75 144 209
ORR r2,r2, #0xC0000000 ;// 1 12 ; // Bring the pin down
STR r2,[r6] ;// 1 13 ; //
; // We need to adjust the duration of the 3us function if it is the first bit (coming from process less 10 cycle)
cmp r4,#1
ite eq
moveq r2,#56
movne r2,#62
bl wait_3us ; // wait for 3 us in total
b sendByte

 

I do not know where to look at to be honnest

 

40 REPLIES 40

Ok I found it :

It needs to have 

void HAL_SPI_TxCpltCallback(SPI_HandleTypeDef *hspi)
{
  printf("debug full\n");
}


void HAL_SPI_TxHalfCpltCallback(SPI_HandleTypeDef *hspi){
   printf("debug half\n");
}

along with

  HAL_SPI_Transmit_DMA(&hspi1,DMA_BIT_BUFFER,1608);               // 402*8*4

By the way the DMA_SPI_TX seems to be the best approach to save RAM and very precise & accurate

 

Vincent