2012-09-24 07:22 PM
Hello All,
Finally got to an STM32 Discovery specific question. I've turned my baud rate up to a really high rate, 921600. I have a simple loop that sends a request for data and then gets data back from the STM32 as fast as possible.
I only ever see about 300 Kbit/s. I don't see any errors, so I'm wondering if simply the processing of the packet takes so long that it bottlenecks my baud rate? I'm using the STM32F0 series (an STM32F051) for testing.
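(Sanity check on my numbers, assuming 8N1 framing: each byte costs 10 bits on the wire, so 921600 baud tops out around 92160 payload bytes/s. 300 Kbit/s is only about a third of the line rate, so the wire must be sitting idle most of the time.)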
I have thought about DMA'ing the USART data into memory and then, on DMA transfer complete, using a CRC carried inside the sent data to check the memory, then marking the data as good, bad, or new. I think this would give me the maximum throughput, allowing one RX DMA completion to kick off the TX DMA, but I don't know if that's a good idea. I'm currently using a circular buffer and pulling the bytes out into another buffer before processing that data as a packet; then I build the response packet and send it using DMA. My packet sizes are just over 100 bytes.

I want to tell whether the program flow is what's taking the time and causing the bottleneck, because that would be common across telemetry/peripheral code. I guess I could run some tests, taking the time in ms between getting the first byte and sending the packet, to profile the code. Any advice is appreciated! Thanks

#usart-stm32-discovery
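Edit: roughly what I mean, as a sketch only. crc16_update(), start_tx_dma(), the handler name, and the packet layout are all placeholders, not my real code. DMA fills rx_buf; on transfer complete I'd check the trailing CRC, mark the packet, and kick the TX DMA straight away if it's good:

#include <stdint.h>

#define PKT_LEN 104   /* ~100 bytes of payload + 2 CRC bytes, say */

enum pkt_state { PKT_NEW, PKT_GOOD, PKT_BAD };

static volatile uint8_t rx_buf[PKT_LEN];
static volatile enum pkt_state rx_state = PKT_NEW;

extern uint16_t crc16_update(uint16_t crc, uint8_t b); /* any CRC16 */
extern void start_tx_dma(void);                        /* placeholder */

void usart_rx_dma_tc(void)   /* call from the RX DMA TC interrupt */
{
    uint16_t crc = 0;
    for (int i = 0; i < PKT_LEN - 2; i++)
        crc = crc16_update(crc, rx_buf[i]);

    /* last two bytes of the packet hold the CRC, big-endian here */
    uint16_t sent = (uint16_t)((rx_buf[PKT_LEN - 2] << 8) | rx_buf[PKT_LEN - 1]);
    rx_state = (crc == sent) ? PKT_GOOD : PKT_BAD;

    if (rx_state == PKT_GOOD)
        start_tx_dma();   /* one RX completion kicks off the TX */
    /* ...clear the TC flag and re-arm the RX DMA here... */
}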
2012-09-24 09:18 PM
Well, you could use GPIO toggling to apportion time among the various parts of the process.
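For example, something like this (pin choice is arbitrary, and the pin must already be clocked and configured as an output):

#include "stm32f0xx.h"

static inline void prof_high(void) { GPIOA->BSRR = (1u << 8); }  /* PA8 set   */
static inline void prof_low(void)  { GPIOA->BRR  = (1u << 8); }  /* PA8 reset */

void handle_request(void)
{
    prof_high();
    /* ...stage under test: parse request, build response, start DMA... */
    prof_low();   /* high time on PA8 = time spent in this stage */
}

Watch the pin on a scope or logic analyzer and move the brackets around until you find where the time goes.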
It should be pretty easy to time the inter-symbol gap between bytes, and frankly I'd believe it should be easy to saturate the USART output with data; the gap should tend to zero. I have found with XMODEM-1K-CRC that it is better to compute the CRC a byte at a time as it transmits, rather than compute it for the whole block at the end. Building a buffer and computing a CRC prior to dispatching a DMA transfer will add to your latency. Better perhaps to have a scatter-gather list of DMA buffers, or to form buffers so header information can be prepended, or to transmit the payload portion whilst computing the CRC, and avoid copying buffers.
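Something along these lines for the CRC-as-you-transmit part (typed from memory, not compiled; USART1 assumed):

#include <stdint.h>
#include "stm32f0xx.h"

/* XMODEM CRC-16 (CCITT polynomial 0x1021, init 0x0000), one byte at a time */
static uint16_t crc16_xmodem(uint16_t crc, uint8_t b)
{
    crc ^= (uint16_t)b << 8;
    for (int i = 0; i < 8; i++)
        crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                             : (uint16_t)(crc << 1);
    return crc;
}

/* Blocking form for clarity; the same update drops into a TXE interrupt */
void send_block(const uint8_t *buf, int len)
{
    uint16_t crc = 0;
    for (int i = 0; i < len; i++) {
        while (!(USART1->ISR & USART_ISR_TXE)) ;  /* wait for TX empty */
        USART1->TDR = buf[i];
        crc = crc16_xmodem(crc, buf[i]);          /* CRC rides along */
    }
    while (!(USART1->ISR & USART_ISR_TXE)) ;
    USART1->TDR = (uint8_t)(crc >> 8);            /* CRC high byte */
    while (!(USART1->ISR & USART_ISR_TXE)) ;
    USART1->TDR = (uint8_t)crc;                   /* CRC low byte */
}

The point being that by the time the last payload byte leaves, the CRC is already done, with no separate pass over the block.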
2012-09-25 04:20 AM
Thanks Clive,
Designing my buffers in such a way that the header information can simply be prepended before the data is shipped out is a good idea. I was thinking of that too; it would work for some portions of memory I'm sending, but other portions are contiguous 7K blocks. I never thought of the "scatter-gather" approach; that makes sense. I could simply have a queue that holds chunks of memory and their sizes that need to be sent out, and pass them to the DMA on completion of the previous DMA.

As soon as I work out the bugs I introduced trying to make it less latent, I'll set some timers and see exactly how many ms I spend in each area. I also have a bunch of "bytestou16()" functions that I'm going to replace with structs so I can just point at the data. I guess if I really wanted to not be lazy, I could just count the steps in the debugger and use the clock cycles to calculate the latency.

Thx, MM
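Edit: the sort of chunk queue I have in mind, as a sketch (dma_start_tx() and all the names here are made up):

#include <stdint.h>

#define QLEN 8   /* power of two keeps the wrap cheap */

struct chunk { const uint8_t *ptr; uint16_t len; };

static struct chunk txq[QLEN];
static volatile uint8_t q_head, q_tail;

extern void dma_start_tx(const uint8_t *ptr, uint16_t len); /* placeholder */

void tx_queue_push(const uint8_t *ptr, uint16_t len)
{
    txq[q_head] = (struct chunk){ ptr, len };
    q_head = (uint8_t)((q_head + 1) % QLEN);
    /* if the DMA is idle, start the first chunk here */
}

void on_tx_dma_complete(void)   /* from the TX DMA TC interrupt */
{
    if (q_tail != q_head) {
        dma_start_tx(txq[q_tail].ptr, txq[q_tail].len);  /* next chunk */
        q_tail = (uint8_t)((q_tail + 1) % QLEN);
    }
}

So a header chunk and a 7K block chunk go out back to back without ever being copied into one buffer.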
2012-09-27 07:21 PM
After much changing and refactoring, it seems like my throughput only got worse :(
I changed all of my copying of data from memory into packets to simply passing pointers to the data to use/send to the DMA. The only other thing I could do is change the logic around and remove my main switch-statement state machine. I guess blindly trying to resolve a problem when I don't know what it is is futile. I should determine where the hold-up is and try to fix that; for all I know it's on the PC side.
2012-09-27 07:29 PM
Oops. I forgot I had turned on printf to print debug output to USART3. I was trying to actually see that in the USART or printf window in Keil, but it never worked, so at some point I was sending out data in a blocking way at a much slower rate. Commenting that out, I now get
BitsPerSec: 463552.0 when my baud is set to 921600, so only 50% "goodput". Without further analyzing the actual time spent inside each function, I don't think I'll fiddle with the code anymore. Is there any analyzer within Keil that can tell you where you spend the most time in your code, like a profiler?
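Edit: in case I end up timing it by hand after all, this is the kind of SysTick millisecond stopwatch I had in mind (sketch only; handle_packet() is a stand-in for whatever stage is being timed):

#include <stdint.h>
#include "stm32f0xx.h"

static volatile uint32_t ms_ticks;

void SysTick_Handler(void) { ms_ticks++; }   /* standard CMSIS handler name */

void timing_init(void)
{
    SysTick_Config(SystemCoreClock / 1000);  /* 1 ms tick */
}

/* usage:
 *   uint32_t t0 = ms_ticks;
 *   handle_packet();
 *   uint32_t dt = ms_ticks - t0;   // ms spent in that stage
 */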