STM32F103 C faster than ASM?

darkfirefighter
Associate III
Posted on September 07, 2010 at 12:11

Andrew Neil
Evangelist
Posted on May 17, 2011 at 14:06

It is a common fallacy that, simply by writing in assembler, your code will somehow magically become much faster and/or smaller!

Assembler is not a magic bullet - it is just a tool and, therefore, is only as good as the person using the tool!

Modern optimising compilers really are very good - so, if you want to beat a modern optimising compiler, you are going to have to be an exceptionally good assembler programmer!

If you can't understand the assembler that the compiler produces, that probably means that it is better at assembler than you are - and, thus, it's not surprising that its code is faster than yours!

stforum2
Associate II
Posted on May 17, 2011 at 14:06

I think that:

MOV      r2,#0x01
BICS     r1,r2,r1,LSR #8
BEQ      0x080005E4

is the equivalent of:

TST      r1,#0x100
BNE      0x080005E4
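
The claim above can be sanity-checked on a host machine. The following is only a sketch: it models the flag behaviour of the two sequences in plain C (the function names are made up for illustration) and counts the inputs on which the branch decisions differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models MOV r2,#1 ; BICS r1,r2,r1,LSR #8 ; BEQ target.
   BICS computes r2 AND NOT (r1 >> 8) and sets Z when the result is 0. */
static bool branch_taken_bics(uint32_t r1)
{
    uint32_t r2 = 0x01u;
    uint32_t result = r2 & ~(r1 >> 8);
    return result == 0u;            /* BEQ: branch when Z is set */
}

/* Models TST r1,#0x100 ; BNE target. */
static bool branch_taken_tst(uint32_t r1)
{
    uint32_t result = r1 & 0x100u;
    return result != 0u;            /* BNE: branch when Z is clear */
}

/* Returns the number of sampled inputs on which the two models disagree. */
static unsigned count_mismatches(void)
{
    unsigned mismatches = 0;
    /* Only bit 8 matters, but sweep the low 16 bits with varied high bits. */
    for (uint32_t v = 0; v <= 0xFFFFu; ++v) {
        uint32_t samples[3] = { v, v | 0xFFFF0000u, v << 16 };
        for (int i = 0; i < 3; ++i) {
            if (branch_taken_bics(samples[i]) != branch_taken_tst(samples[i]))
                ++mismatches;
        }
    }
    return mismatches;
}

/* Aborts if any sampled input makes the two sequences branch differently. */
static void check_equivalence(void)
{
    assert(count_mismatches() == 0);
}
```

Both sequences branch exactly when bit 8 of r1 is set. They differ in side effects, though: the BICS version overwrites r1 and r2, while TST only updates the flags.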

stforum2
Associate II
Posted on May 17, 2011 at 14:06

And in C, a preload can be done like this:

void sendDataC(unsigned long *dat)
{
    unsigned long nPreload;

    TIM3->CR1 = 0x1;              // Enable TIM3

    //----------------Startbit (Low)---------------
    GPIO_Port->BSRR = dat[0];
    nPreload = dat[1];            // Preload Data 1
    while ((Toggle_Port->IDR & Toggle_Pin) != 0);   // Spin while the toggle pin is high

    //----------------Data 1-----------------------
    GPIO_Port->BSRR = nPreload;
    nPreload = dat[2];            // Preload Data 2
    while ((Toggle_Port->IDR & Toggle_Pin) == 0);   // Spin while the toggle pin is low

    //----------------Data 2-----------------------
    GPIO_Port->BSRR = nPreload;
    nPreload = dat[3];            // Preload Data 3
    while ((Toggle_Port->IDR & Toggle_Pin) != 0);   // Spin while the toggle pin is high

    //----------------Data 3-----------------------
    GPIO_Port->BSRR = nPreload;
    while ((Toggle_Port->IDR & Toggle_Pin) == 0);   // Spin while the toggle pin is low

    TIM3->CR1 = 0x00;             // Disable TIM3
}

Posted on May 17, 2011 at 14:06

If proximity to the rising/falling edge is important, you really shouldn't be fluffing around loading and incrementing the index before outputting the data. The key thing the compiler did was remove your "r0 += 4" code. Placement to the edge would be quicker if you loaded the output value before entering the spin loop.

__asm void sendDataASM(unsigned long *data){
;****************************Init Registers****************************************
        PUSH    {r4-r7}             ; r4-r7 are callee-saved under the AAPCS
        LDR     r1,=0x40011810      ; GPIOE->BSRR
        LDR     r3,=0x40011c08      ; GPIOF->IDR
        LDR     r5,=0x40000400      ; TIM3->CR1
        MOV     r6,#0x01
        MOV     r7,#0x00
        STRH    r6,[r5,#0x00]       ; Enable TIM3
        LDR     r2,[r0,#0x00]       ; Data Start, Preload
;----------------Wait for the first edge------
while0  LDR     r4,[r3,#0x00]
        TST     r4,#0x100
        BEQ     while0
;****************************Now send Bit for Bit, synchronized by TIM3************
;----------------Startbit (Low)---------------
        STR     r2,[r1,#0x00]       ; Out Data Start
        LDR     r2,[r0,#0x04]       ; Data 1, Preload
while1  LDR     r4,[r3,#0x00]
        TST     r4,#0x100
        BNE     while1
;----------------Data 1-----------------------
        STR     r2,[r1,#0x00]       ; Out Data 1
        LDR     r2,[r0,#0x08]       ; Data 2, Preload
while2  LDR     r4,[r3,#0x00]
        TST     r4,#0x100
        BEQ     while2
;----------------Data 2-----------------------
        STR     r2,[r1,#0x00]       ; Out Data 2
        LDR     r2,[r0,#0x0C]       ; Data 3, Preload
while3  LDR     r4,[r3,#0x00]
        TST     r4,#0x100
        BNE     while3
;----------------Data 3-----------------------
        STR     r2,[r1,#0x00]       ; Out Data 3
while4  LDR     r4,[r3,#0x00]
        TST     r4,#0x100
        BEQ     while4
;****************************Sending done, Stop Timer and Jump back****************
        STRH    r7,[r5,#0x00]       ; Disable TIM3
        POP     {r4-r7}             ; Restore the callee-saved registers
        BX      lr
}

If the data/clock edge placement is critical, and you *have* to bit-bang it in software, you'd be better off driving a pair of GPIOs together, and either have a software-calibrated spin loop, or use a high-resolution free-running counter to handle the mark/space ratio.
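
As a rough illustration of the free-running counter approach, here is a minimal host-side sketch. Everything in it is an assumption for demonstration purposes: read_counter() and write_pin() are hypothetical stand-ins (on a Cortex-M3 the DWT cycle counter is one candidate for the counter, and a BSRR write for the pin), and the fake counter just advances one tick per read so the timing logic can be exercised off-target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a hardware free-running counter. On a host it
   simply advances by one tick per read so the sketch can be exercised. */
static uint32_t fake_ticks;
static uint32_t read_counter(void) { return fake_ticks++; }

/* Hypothetical stand-in for a GPIO write; records the level and timestamp. */
#define MAX_EDGES 16
static struct { uint32_t tick; int level; } edge_log[MAX_EDGES];
static unsigned edge_count;

static void write_pin(int level, uint32_t now)
{
    assert(edge_count < MAX_EDGES);
    edge_log[edge_count].tick = now;
    edge_log[edge_count].level = level;
    ++edge_count;
}

/* Spin until the counter reaches `deadline`; returns the tick observed.
   The signed difference keeps the comparison valid across 32-bit wrap
   (assumes the usual two's complement conversion). */
static uint32_t wait_until(uint32_t deadline)
{
    uint32_t now;
    do {
        now = read_counter();
    } while ((int32_t)(now - deadline) < 0);
    return now;
}

/* Emit `cycles` periods with the given mark (high) and space (low) lengths. */
static void send_mark_space(uint32_t mark, uint32_t space, unsigned cycles)
{
    uint32_t deadline = read_counter() + 1;   /* first edge one tick from now */
    for (unsigned i = 0; i < cycles; ++i) {
        write_pin(1, wait_until(deadline));
        deadline += mark;     /* absolute deadlines, so errors do not accumulate */
        write_pin(0, wait_until(deadline));
        deadline += space;
    }
}
```

Scheduling each edge against an absolute deadline, rather than waiting a fixed delay after the previous write, means a late edge does not push every following edge later as well.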
