GDall.387
Associate
October 16, 2019
Question

Fast data logging with STM32F4 using USB_OTG_FS and FatFS

Hi everyone,

I'm trying to log data from different sensors to a USB flash drive at a 500 Hz sampling frequency.

I set up a timer to define the sampling frequency for data logging, and everything seems to work fine.

The problem comes when I try to write the data to a log file (I've tried txt, csv and bin, but it doesn't seem to make a difference for FatFs), because the write call introduces an irregular delay.

So far I've tried:

  • buffering data in different ways to match the sector size
  • formatting the USB disk in different ways (FAT32, FAT16, exFAT)
  • cleaning up the code as much as possible so that it only does the write operation while logging

Does anyone know a method to speed up the write operation?

Thank you

Here's the main:

 if(eff==1){
 
 	if(fs==1){
 	 HAL_GPIO_WritePin(LED2_GPIO_Port, LED2_Pin, RESET);
 	 myData = LIS3DSH_GetDataScaled();
 	 	 x=(int)(myData.x*9806.65);
 	 	 acc_raw[k]=x;
 	 	 k++;
 	 	 y=(int)(myData.y*9806.65);
 	 	 acc_raw[k]=y;
 	 k++;
 	 	 z=(int)(myData.z*9806.65);
 	 	 acc_raw[k]=z;
 	 	 k++;
 	 	 HAL_GPIO_WritePin(LED2_GPIO_Port, LED2_Pin, SET);
 	 fs=0;
 	 }
 
 		if(k==99){
 			s++;
 			for(int g=0; g<98; g=g+3){
 
 			f_printf(&myFilea, "%d : %d, %d, %d\n", s, acc_raw[g], acc_raw[g+1], acc_raw[g+2] );
 				
 			}
 			k=0;
 		bufclear1();
 
 		}
 
 
 if(HAL_GPIO_ReadPin(GPIOA, GPIO_PIN_0) == GPIO_PIN_SET && write_var==0){
 
 	 i=0;
 	write_var=1;
 	 while(deb<250){
 
 	 	 }
 
 	 eff=1;
 	 }
 
 if(one_sec==500 && write_var==1){
 	 	 	 	 for(int g=0; g<98; g=g+3){
 
 f_printf(&myFilea, "%d : %d, %d, %d\n", s+1, acc_raw[g], acc_raw[g+1], acc_raw[g+2] );
 	 				//g=g+2;
 	 			}
 	 HAL_GPIO_WritePin(LED2_GPIO_Port, LED2_Pin, GPIO_PIN_RESET);
 	 HAL_GPIO_WritePin(LED1_GPIO_Port, LED1_Pin, GPIO_PIN_RESET);
 	 HAL_GPIO_WritePin(LED4_GPIO_Port, LED4_Pin, GPIO_PIN_RESET);
 	 f_close(&myFilea);
 	 f_close(&myFilet);
 	 write_var=0;
 	 s=0;
 	 deb=0;
 	 eff=0;
 	 open=0;
 	 one_sec=0;
 	 acqu_count++;
 	 HAL_Delay(500);
 	 }
 if(Appli_state == APPLICATION_DISCONNECT){
 	 HAL_GPIO_WritePin(LED1_GPIO_Port, LED1_Pin, GPIO_PIN_RESET);
 	 HAL_GPIO_WritePin(LED2_GPIO_Port, LED2_Pin, GPIO_PIN_RESET);
 	 open=0;
 }
 	 	
 if(open==0){
 	 	 switch(Appli_state){
 	 	 case APPLICATION_IDLE:
 	 		break;
 	 	 case APPLICATION_START:
 	 	 	res=f_mount(&myUsbFatFS, (TCHAR const*)USBHPath, 0);
 	 	 	if( res != FR_OK)
 	 	 					{
 	 	 						/* FatFs Initialization Error */
 	 	 						Error_Handler();
 	 	 					}
 	 	 					else
 	 	 					{
 	 	 						//HAL_GPIO_WritePin(LED1_GPIO_Port, LED1_Pin, GPIO_PIN_SET);
 	 	 					}
 	 	 					break;
 	 	 case APPLICATION_READY:
 	 	 					sprintf (buffer, "A%d.bin",acqu_count);
 	 	 					res=f_open(&myFilea, buffer, FA_OPEN_APPEND | FA_WRITE );
 	 	 					bufclear();
 	 	 					open=1;
 	 	 	 				HAL_GPIO_WritePin(LED1_GPIO_Port, LED1_Pin, GPIO_PIN_SET );
 
 	 	 					//HAL_Delay(500);
 	 	 				//}
 
 
 	 	 	break;
 	 	 case APPLICATION_DISCONNECT:
 	 	 	 //Turn green on
 	 	 	 HAL_GPIO_WritePin(LED1_GPIO_Port, LED1_Pin, GPIO_PIN_RESET);
 	 	 	break;
 	 	 }
 	 	 }
 }
 
 

Here's the Timer interrupt handler:

void HAL_TIM_PeriodElapsedCallback(TIM_HandleTypeDef *htim)
{
 /* Prevent unused argument(s) compilation warning */
		HAL_GPIO_TogglePin(LED3_GPIO_Port, LED3_Pin);
 
		if(eff==1){
			one_sec++;
		}
 
		if(write_var==1){
		deb++;
		fs=1;
		if(temp_counter <250) temp_counter++;
		else {
			fs_temp=1;
			temp_counter=0;
		}
		//i=1-i;
		}
}

11 replies

Tesla DeLorean
Guru
October 16, 2019

Not sure how your states advance here, or what you do to prevent multiple f_mount or f_open calls.

Using any f_printf() or f_write() with small bursts of data is going to be brutal. I personally would shoot for 8 KB or 32 KB per write.

If you can't handle significant hits in your loop, you're going to have to buffer more data and separate the writing into a worker task/thread, which has more immunity to the ebb and flow.

Bill Dempsey
Associate III
October 17, 2019

Here are a couple of things to try to learn your way through this issue even though it seems like you've tried some of this already. Set up a simple write and read test by writing a large amount of data made up of smaller blocks such as (512B, 1024B, 2048B, etc) and then read them back using the same block size. Do this as a fully dedicated loop and measure the time it takes to write/read all the data and you'll get an average write/read speed from your sdcard (hint: this code exists everywhere on the net...). You're going to find there is a "sweet" spot for the block size that works best for you. Then play around with the FatFS "conf" file and change when FatFS syncs (sector or cluster). Rerun the speed test and you'll find a difference in performance when you change this setting.

By writing the "petite" blocks that you are doing that don't fit a sector you are forcing a whole bunch of data to be read from the flash, altered, and then be written back. This is going to hurt the overall performance. Also, when the flash controller has to switch sectors or clusters you'll find the writes can sometimes take 10x the normal time or more, depending on what is already out on disk and whether the controller is erasing and writing behind the scenes.

(There is also a way to disable write verification on transfer which will also speed things up...can't remember where that lives tho!)

What you're going to find is that you really need to manage your data in larger chunks. Maybe you can't take the risk of losing data between writes due to power fails so you will end up paying a penalty for writing all the time. That's a system design issue... Be sure to also examine a large amount of write times. I bet you'll find that 512B or larger takes more than 2ms (500Hz rate) so you will need to buffer data significantly. If you're using SPI you might need to crank up the clock speed to transfer to the SD internal write buffer. That is only one part of the write speed issue but it does help.

Out of curiosity, why are you doing the float multiply against a constant for the accelerometer data? Why not read the signed integer data from the LIS3DSH and store it "raw"? Then in post-processing do any conversion. It won't affect you much right now with this small code size but in a busy system it doesn't make sense to do multiplies on well-known data that can be converted later.

GDall.387 (Author)
Associate
October 17, 2019

Thank you both for the answers.

@Community member What do you mean by "separating the writing into a worker thread"? Is there a specific way to do it?

@Bill Dempsey Just to check that my procedure is correct: to buffer the writes, I set up a char array of variable size (512 B up to 32 kB), and while reading data from the accelerometer I sprintf it into the string until it is full; then I write the entire char array to the SD. Is this the correct way to buffer?

The reason I'm multiplying the accelerometer data is that f_printf is limited in printing floats. I should modify that part of the code as you say, thanks.

Bill Dempsey
Associate III
October 17, 2019

I think you have the gist of it but understand you need a data "capture" and a data "write" buffer. With two equal-sized memory arrays it's easy to swap back and forth while writing and capturing. The ISR will point to one buffer while the main loop points to the other...

Regarding the side question -- I was really talking about the *need* for the multiply - there's no need if both sides using the data know its relative value. Sure it's convenient but if you are trying to write only "usable" data to the SD card you can just store the 16-bit data for X,Y,Z instead of multiplying it and then storing it as an ASCII converted string. I know you're nowhere near the processing limits of the micro you're using but think about that in the future when you might be hurting for cpu cycles.
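A minimal sketch of what storing the raw 16-bit data could look like, assuming the accelerometer delivers three signed 16-bit values per reading. The names `sample_t`, `pack_sample` and `SAMPLE_BYTES` are hypothetical and do not come from the thread or from the LIS3DSH driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one raw accelerometer reading. */
typedef struct {
    int16_t x, y, z;
} sample_t;

#define SAMPLE_BYTES 6u

/* Serialize one sample as little-endian bytes.
 * Returns the number of bytes written, or 0 if fewer than
 * SAMPLE_BYTES remain in the destination buffer. */
size_t pack_sample(uint8_t *dst, size_t space, const sample_t *s)
{
    if (space < SAMPLE_BYTES) {
        return 0;
    }
    const int16_t v[3] = { s->x, s->y, s->z };
    for (size_t i = 0; i < 3; i++) {
        uint16_t u = (uint16_t)v[i];          /* keep the two's-complement bit pattern */
        dst[2 * i]     = (uint8_t)(u & 0xFFu); /* low byte first */
        dst[2 * i + 1] = (uint8_t)(u >> 8);
    }
    return SAMPLE_BYTES;
}
```

At 6 bytes per sample, 500 Hz works out to 3000 bytes/s, and the scaling to physical units can be applied when the file is read back on the PC.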

GDall.387 (Author)
Associate
October 21, 2019

@Bill Dempsey I managed to reach the desired writing speed (and even go further) with your suggestion, but now I'm facing another problem.

Basically I'm reading two different sensors (one accelerometer and one thermocouple) and filling a buffer in the timer interrupt routine. When it is full, I copy it to another buffer and write that one to the USB stick while the first one is filled again. Everything seems to run fine, but after about 3 to 4 minutes it goes into the Hard Fault handler.

After investigating a bit, it seems like a stack overflow problem. I've tried increasing the stack size, but without any result.

Maybe your experience could help me resolve this problem too.

Thank you very much

 while (1)
 {
 /* USER CODE END WHILE */
 
 MX_USB_HOST_Process();
 
 
 
 
 
 /* USER CODE BEGIN 3 */
 
 
 	if(fs==1 && eff==1){
 	 HAL_GPIO_WritePin(LED2_GPIO_Port, LED2_Pin, RESET);
 	 myData = LIS3DSH_GetDataScaled();
 	 	 HAL_GPIO_WritePin(LED2_GPIO_Port, LED2_Pin, SET);
 	 fs=0;
 	 }
 
 //Writing to file as buffer is full
 		if(k==100){
 				k=0;
 				memcpy(USB_buff_write, USB_buff_read, sizeof(USB_buff_read));
 				clearUSB_L(USB_buff_read);
 				f_write(&myFilea, USB_buff_write, strlen(USB_buff_write), NULL);
 				//f_sync(&myFilea);
 				clearUSB_L(USB_buff_write);
 
 					}
 
 		if(t==150){
 			 t=0;
 		 memcpy(USB_buff_writet, USB_buff_readt, sizeof(USB_buff_readt));
 		 clearUSB_S(USB_buff_readt);
 		 f_write(&myFilet, USB_buff_writet, strlen(USB_buff_writet), NULL);
 		 clearUSB_S(USB_buff_writet);
 
 		 	}
 
 	if(fs_temp==1 && eff==1){
 		HAL_GPIO_TogglePin(LED4_GPIO_Port, LED4_Pin);
 		redtemp=(float)readData()*0.25;
 
 	 fs_temp=0;
 	 t++;
 		}

And the Handler routine:

void HAL_TIM_PeriodElapsedCallback(TIM_HandleTypeDef *htim)
{
 /* Prevent unused argument(s) compilation warning */
		HAL_GPIO_TogglePin(LED3_GPIO_Port, LED3_Pin);
 
		if(eff==1){
			sprintf(USB_buff_read, "%s%f, %f, %f\n",USB_buff_read, myData.x, myData.y, myData.z );
			k++;
			//one_sec++;
 
		}
 
		if(temp_counter < 250) temp_counter++;
						else {
							fs_temp=1;
							sprintf(USB_buff_readt, "%s%.2f\n",USB_buff_readt, redtemp );
							temp_counter=0;
						}
 
		fs=1;
}

Thanks

Tesla DeLorean
Guru
October 21, 2019

strlen() requires buffers that are properly terminated with NUL characters.

sprintf() returns a length.
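A minimal sketch of using that returned length, assuming a 4096-byte text buffer like the one in the question. `text_buf_t` and `text_buf_append` are hypothetical names. The standard `snprintf` is swapped in for `sprintf` so that a line that does not fit is rejected instead of running past the end of the array.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical text buffer that remembers how many bytes it holds,
 * so no strlen() call is needed before writing it out. */
typedef struct {
    char   data[4096];
    size_t len;
} text_buf_t;

/* Append one formatted line at the current end of the buffer.
 * Returns 1 on success, 0 if the line would not fit; in that case
 * the buffer content and length are left as they were. */
int text_buf_append(text_buf_t *b, float x, float y, float z)
{
    size_t space = sizeof b->data - b->len;   /* always >= 1, see below */
    int n = snprintf(b->data + b->len, space, "%f, %f, %f\n",
                     (double)x, (double)y, (double)z);
    if (n < 0 || (size_t)n >= space) {
        b->data[b->len] = '\0';               /* discard the truncated line */
        return 0;
    }
    b->len += (size_t)n;                      /* len stays <= sizeof data - 1 */
    return 1;
}
```

The length to pass to f_write() is then `b.len`. The documented f_write() signature takes a pointer to a UINT for the written-byte count as its fourth argument, so passing the address of a real variable is a safer assumption than NULL.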

Look at what specifically is Hard Faulting (disassembled instructions and registers), try using a proper fault handler rather than a while(1) loop. Posted examples several times.

Bill Dempsey
Associate III
October 21, 2019

Definitely don't want to do a memcpy approach. Use two buffers (for each device)

BUFFER_A

BUFFER_B

Capture to A while writing B to SDCARD

when capture A buffer full then

Capture to B while writing A to SDCARD

and then start over

We call these "ping-pong" buffers. There of course is some init logic and flags that wrap around the code so you know when the buffers are valid but no need to copy.

And if the callback is at the interrupt level, which I suspect it is, there may be an issue with sprintf in interrupts...that is something I would check with the community. I personally have been bitten by any "printf" in an ISR so I never use them.

GDall.387 (Author)
Associate
October 21, 2019

Thanks for your time and effort trying to help me.

@Bill Dempsey

The aim of what I was doing was exactly a ping-pong kind of thing. Before "sprintfing" in the ISR I tried the normal ping-pong procedure, but I had the same problem: the f_write call writing one buffer delayed the data acquisition into the other one, just as without ping-pong. I then found that when the filling is done in the ISR, the buffer is filled even if a write takes longer than my sampling period. I will try again with better code; maybe I was missing something the last time I tried.

Basically what I need to do, in short, is:

- Use the ISR just to raise the flag that marks the sampling instant

- In the main loop, whenever a sampling flag is raised, fill one buffer

- As soon as one is full, swap the buffers and write the full one to the USB, hoping the other one is still being filled while the full one is written

Am I right?

@Community member

According to the documentation of sprintf, it does actually terminate the string with a null character. I also tried strcpy() with the same result.

From the disassembly I can't find the real cause of the Hard Fault; I'm going to try harder.

Thanks

Bill Dempsey
Associate III
October 22, 2019

Hmmm...can you publish the write-speed findings when you did the earlier tests I suggested? That would help understand the true throughput of the system. There can't be any "hope" as part of it working.

From the code I saw, it did not appear to be "ping-pong" style without copies...perhaps that code stayed on the bench?

Remember this comment?

"you will end up paying a penalty for writing all the time. That's a system design issue... Be sure to also examine a large amount of write times. I bet you'll find that 512B or larger takes more than 2ms (500Hz rate) so you will need to buffer data significantly"

If you are pulling in more data than you are writing you are going to have a collision. You have to decouple the input from the output. Setting buffer size and determining system requirements is part of what seems to be missing. If you have the calculations you're using (bytes/sec inbound, bytes/sec outbound) and you'd like to share, that'd be a better starting point than why did my code hard-fault?

The input is coming from an ISR. You are burning around 32 bytes with your formatted string per capture. I'll assume you did a 512B buffer. That means you need to write after every 16th ISR. So in your case at 500Hz you need to write every 32ms. Well, if you go back to your write-evaluation test and really dig-in you're going to find that *on occasion* the SD card writes of 512B buffer actually take longer than 32ms as some behind the scenes FAT table updates are getting made. That then implies that a two-buffer system will NOT work. So now the problem becomes perhaps a 3-buffer solution? Or 4?
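The arithmetic above can be sketched as a small helper. `buffer_fill_ms` is a hypothetical name, and the 32 bytes per capture is the estimate from this post, not a measured value.

```c
#include <stdint.h>

/* Time in milliseconds until a buffer of buf_bytes is full, given the
 * size of one formatted capture and the sampling rate. Only whole
 * captures are counted; the result is rounded down.
 * Returns 0 if either divisor is 0. */
uint32_t buffer_fill_ms(uint32_t buf_bytes,
                        uint32_t bytes_per_sample,
                        uint32_t sample_hz)
{
    if (bytes_per_sample == 0u || sample_hz == 0u) {
        return 0u;
    }
    uint32_t samples = buf_bytes / bytes_per_sample;  /* captures per buffer */
    return (samples * 1000u) / sample_hz;
}
```

With these numbers, a 512 B buffer holds 16 captures and fills in 32 ms at 500 Hz, while a 4096 B buffer holds 128 captures and fills in 256 ms. Any single write that takes longer than the fill time has to be absorbed by additional buffers.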

Once you can *guarantee* that the system I/O rate is balanced you don't need to hope. I, like others on here, do the same kind of system you imply you are working on and it runs "forever" (assuming I don't run out of memory on the SDCARD). I have a multi-buffer "FIFO" implemented in software that helps average my slow writes with the fast ones. I made sure that my average write speed was much greater than my input read speed. The rest is easy.
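A minimal sketch of one way such a multi-buffer FIFO could look. This is not the implementation described above. It assumes a single producer (the ISR) and a single consumer (the main loop) on a single core where aligned 32-bit loads and stores are atomic, as on a Cortex-M. All names and the depth of 4 blocks are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE  4096u
#define BLOCK_COUNT 4u   /* hypothetical depth; keep it a power of two */

typedef struct {
    uint8_t data[BLOCK_COUNT][BLOCK_SIZE];
    volatile uint32_t head;  /* blocks committed so far; written only by the producer */
    volatile uint32_t tail;  /* blocks released so far; written only by the consumer */
} block_fifo_t;

/* Producer: block to fill next, or NULL if every block is still
 * waiting to be written (overrun: the writer has fallen behind). */
uint8_t *fifo_acquire(block_fifo_t *f)
{
    if (f->head - f->tail >= BLOCK_COUNT) {
        return NULL;
    }
    return f->data[f->head % BLOCK_COUNT];
}

/* Producer: mark the acquired block as full. */
void fifo_commit(block_fifo_t *f)
{
    f->head++;
}

/* Consumer: oldest full block, or NULL if nothing is pending. */
const uint8_t *fifo_peek(const block_fifo_t *f)
{
    if (f->head == f->tail) {
        return NULL;
    }
    return f->data[f->tail % BLOCK_COUNT];
}

/* Consumer: hand the block back once its write has completed. */
void fifo_release(block_fifo_t *f)
{
    f->tail++;
}
```

Because each index has exactly one writer, no lock is needed in this arrangement. A NULL from `fifo_acquire` is the collision case and is worth counting so that it shows up during testing.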

One final comment: make sure that the ISR write does not over-write its buffer. I did not see where you were tracking the remaining buffer size vs the input capture size. The sprintf will blow across a memory boundary as it has no bounds-checking. If you've made sure that this can't happen then good, it's one less thing to worry about.

GDall.387 (Author)
Associate
October 22, 2019

Sure. From the write-speed test (writing around 10 MB per test) I got:

- 512 B = 0.64 MB/s

- 1024 B = 0.66 MB/s

- 2 kB = 0.61 MB/s

- 4 kB = 0.81 MB/s

- 8 kB = 0.62 MB/s

- 16 kB = 0.80 MB/s

So I went for 4 kB buffers (rather than 16 kB) to prevent future RAM overflow.

I'm sampling at 500 Hz, which results in a 20 kB/s data rate; compared to the tested write speed, that should be nothing.

"The input is coming from an ISR. You are burning around 32 bytes with your formatted string per capture. I'll assume you did a 512B buffer. That means you need to write after every 16th ISR. So in your case at 500Hz you need to write every 32ms. Well, if you go back to your write-evaluation test and really dig-in you're going to find that *on occasion* the SD card writes of 512B buffer actually take longer than 32ms as some behind the scenes FAT table updates are getting made. That then implies that a two-buffer system will NOT work. So now the problem becomes perhaps a 3-buffer solution? Or 4?"

I'm actually writing the data collected in the buffer to the file every 100 ISRs, but to do so I had to copy my data from the acquiring buffer to another buffer, or swap the two ping-pong buffers if doing it that way.

And the problem is that everything runs fine until everything freezes. The only problem I can imagine is that when I do an f_write() I'm actually writing just a fraction of the 4 kB buffer: the acquired data varies in size, so I had to keep a margin when deciding how much of it to store in the read buffer. Maybe this invalidates the write-speed tests I made, because I'm not always writing the same length as a full buffer.

I'm trying to fix this too.

I've even made a 3-buffer version, but unfortunately without any difference; part of the ISR is shown below. In the main loop I always copy it to a write buffer to write it to the file.

// In ISR if writing is enabled, i start to store data in the read buffer	
if(eff==1){
			if(ping==true){
				sprintf(USB_buff_read1, "%s%f, %f, %f\n",USB_buff_read1, myData.x, myData.y, myData.z );
			}else{
				sprintf(USB_buff_read2, "%s%f, %f, %f\n",USB_buff_read2, myData.x, myData.y, myData.z );
		}
			k++;
			//one_sec++;
 
		}

Thank you so much; with your help I'm gradually getting to the result.

Bill Dempsey
Associate III
October 23, 2019

It seems as if you're missing a basic concept of how to do the ping-pong-etc buffers. I get this from the sense of your replies and missing code segments...

I can't see all your code but it should have some kind of pointer swap logic in it. Two pointers would be a useful way of implementing this such as below:

char bufferA[4096];
char bufferB[4096];
 
char * isrPtr = bufferA;
char * wrtPtr = bufferB;

Never would you need to copy. isrPtr is changed to point to bufferA or to bufferB and at the same time wrtPtr would point to the opposite buffer. Above shows a starting condition but of course the data is not full so out of the gate the write side pends until the isr buffer is full.

Somehow in your code you need to have a software mutex (https://en.wikipedia.org/wiki/Mutual_exclusion) such that you can't swap if the "other" side is busy.

According to your math, this should never happen but I suspect due to unexpected timing events you have a race-case where either the main loop is trying to advance before the ISR is done or vice-versa.

The trick is signaling when it's time to write and then making sure the write is complete before a swap can occur (enough_bytes && write_not_busy). If you encounter a situation where the case is not true, stop the code and debug why. I'll bet you'll find this race-case exists.
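A minimal sketch of the pointer swap with a busy check, assuming a bare-metal, single-core setup where the ISR can preempt the main loop but not the other way round. It reuses the `bufferA`/`bufferB`/`isrPtr`/`wrtPtr` names from above; the functions, `isrLen`, `wrtLen` and the `dropped` counter are hypothetical. A non-NULL `wrtPtr` plays the role of the "write busy" flag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 4096u

static char bufferA[BUF_SIZE];
static char bufferB[BUF_SIZE];

static char *isrPtr = bufferA;         /* buffer the ISR is filling */
static size_t isrLen = 0;              /* bytes used in isrPtr */
static char *volatile wrtPtr = NULL;   /* full buffer awaiting write; NULL = writer idle */
static volatile size_t wrtLen = 0;     /* valid bytes in wrtPtr */
static volatile uint32_t dropped = 0;  /* records lost because the writer was busy */

/* ISR side: append one record; swap buffers when the current one is full. */
bool capture_record(const void *rec, size_t n)
{
    if (n > BUF_SIZE) {
        return false;
    }
    if (isrLen + n > BUF_SIZE) {       /* enough bytes: time to swap */
        if (wrtPtr != NULL) {          /* other side busy: never swap, count the loss */
            dropped++;
            return false;
        }
        wrtLen = isrLen;
        wrtPtr = isrPtr;               /* publish the full buffer to the main loop */
        isrPtr = (isrPtr == bufferA) ? bufferB : bufferA;
        isrLen = 0;
    }
    memcpy(isrPtr + isrLen, rec, n);
    isrLen += n;
    return true;
}

/* Main-loop side: full buffer to write, or NULL if none is pending. */
const char *writer_take(size_t *len)
{
    char *p = wrtPtr;
    if (p != NULL) {
        *len = wrtLen;
    }
    return p;
}

/* Main-loop side: call only after the write has completed. */
void writer_done(void)
{
    wrtPtr = NULL;
}

uint32_t capture_dropped(void)
{
    return dropped;
}
```

In the main loop, the idea would be to call `writer_take`, pass the returned pointer and length to f_write(), and call `writer_done` only after f_write() returns. A non-zero `capture_dropped()` is the race-case described above and is a good place for a breakpoint.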

Even though you show the *average* write-rate is high enough, did you set a watermark for the worst-case write? Go ahead and write a GB worth of data and time each block-write and keep track of the highest time. This can be card specific so be aware. Hopefully no blocks exceed 50ms or so...
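A minimal sketch of such a watermark, kept separate from the hardware so it can be checked on a PC. `write_stats_t` and its functions are hypothetical names. The usage comment assumes the HAL millisecond tick (`HAL_GetTick`) and a UINT variable called `written`, neither of which appears in the posted code.

```c
#include <stdint.h>

/* Running statistics over the duration of each block write. */
typedef struct {
    uint32_t count;     /* number of writes timed */
    uint32_t max_ms;    /* worst-case (watermark) write time */
    uint64_t total_ms;  /* sum of all write times */
} write_stats_t;

/* Record the duration of one write. */
void write_stats_add(write_stats_t *s, uint32_t elapsed_ms)
{
    s->count++;
    s->total_ms += elapsed_ms;
    if (elapsed_ms > s->max_ms) {
        s->max_ms = elapsed_ms;
    }
}

/* Mean write time in ms, rounded down; 0 if nothing was recorded. */
uint32_t write_stats_mean_ms(const write_stats_t *s)
{
    return s->count ? (uint32_t)(s->total_ms / s->count) : 0u;
}

/* Possible use on the target (assumed, not from the thread):
 *
 *   uint32_t t0 = HAL_GetTick();
 *   f_write(&myFilea, wrtPtr, 4096, &written);
 *   write_stats_add(&stats, HAL_GetTick() - t0);
 */
```

Comparing `max_ms` with the time one buffer takes to fill gives a rough idea of how many buffers are needed to ride out the slowest write.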

Also why not go ahead and write the full 4096 bytes during the write? That simplifies things on the capture side... set your 3rd argument to the buffer size (4096). http://irtos.sourceforge.net/FAT32_ChaN/doc/en/write.html

If you're already doing all this above, I have not seen it in the code you've posted so feel free to share any additional code that might be useful.

GDall.387 (Author)
Associate
October 23, 2019

Again a big thanks,

I'll try to explain the situation as clearly as possible:

Here are the global variables I'm working with (never mind the temperature or t-related variables):

//Buffer for file names
char buffer[8];
//Acc data buffers
char USB_buff_read1[4096];
char USB_buff_read2[4096];
//Temp data buffers
char USB_buff_readt[2048];
char USB_buff_writet[2048];
//usb variables
extern ApplicationTypeDef Appli_state;
//Button behaviour flag
bool write_var=0;
//USB mounted & File opened flag
bool mounted=0;
bool open=0;
//Sampling frequency flag
bool fs=0;
//Number of acquisition made
int acqu_count=0;
//Temperature sampling flags&counter
uint16_t temp_counter;
uint16_t fs_temp=0;
//buffer filling counters
int k=0;
int t=0;
//Actually acquiring flag
int eff=0;
//Ping pong flag
bool ping=0;
 

After everything has been initialized, the first thing the program does is wait for the USB drive to be mounted and open a file, ready to acquire:

MX_USB_HOST_Process();
 
if(open==0){
 	 	 switch(Appli_state){
 	 	 case APPLICATION_IDLE:
 	 		break;
 
 	 	 case APPLICATION_START:
 	 	 	if(mounted == 0){
 	 	 	res=f_mount(&myUsbFatFS, (TCHAR const*)USBHPath, 0);
 	 	 	if( res != FR_OK)
 	 	 					{
 	 	 						/* FatFs Initialization Error */
 	 	 						Error_Handler();
 	 	 					}
 	 	 					else
 	 	 					{
 	 	 						mounted=1;
 	 	 		 	 	 					}
 	 	 	}
 	 	 					break;
 	 	 case APPLICATION_READY:
 //Open file to log datas
 	 	 					sprintf (buffer, "A%d.bin",acqu_count);
 	 	 					res=f_open(&myFilea, buffer, FA_OPEN_APPEND | FA_WRITE );
 	 	 					bufclear();
 	 	 					open=1;
 	 	 	break;
 	 	 case APPLICATION_DISCONNECT:
 	 	 	 //Turn green led on
 	 	 	 HAL_GPIO_WritePin(LED1_GPIO_Port, LED1_Pin, GPIO_PIN_RESET);
 	 	 	break;
 	 	 }
 	 	 }
 }

When the file is open and ready to be written, the program waits for a button press; once the button is pressed, the acquisition can start:

if(HAL_GPIO_ReadPin(GPIOA, GPIO_PIN_0) == GPIO_PIN_SET && write_var==0){
 	 k=0;
 	 t=0;
 	 write_var=1;
 	 clearUSB_L(USB_buff_read1);
 	 clearUSB_L(USB_buff_read2);
 	 HAL_Delay(500);
 	 temp_counter=0;
 	 eff=1;
 	 }

When this happens, the ISR starts to acquire data and store it in one of the two buffers, depending on the ping-pong state:

if(eff==1){
 	 myData = LIS3DSH_GetDataScaled();
			if(ping==0){
				sprintf(USB_buff_read1, "%s%f, %f, %f\n",USB_buff_read1, myData.x, myData.y, myData.z );
			}else{
				sprintf(USB_buff_read2, "%s%f, %f, %f\n",USB_buff_read2, myData.x, myData.y, myData.z );
		}
			k++;
			//one_sec++;
 
		}

As soon as the buffer is full (counter k reaches 100), the main loop swaps the buffers, writes the full one to the file, and then clears it:

if(k==100){
 				ping=1-ping;
 				k=0;
 
 				if(ping==1){
 				f_write(&myFilea, USB_buff_read1, 4096, NULL);
 			 clearUSB_L(USB_buff_read1);
 				}else{
 		 f_write(&myFilea, USB_buff_read2, 4096, NULL);
 		 clearUSB_L(USB_buff_read2);
 				}
 
 					}

And this goes on until I press the button again, which closes and saves the file, resets all the flags, and opens a new file ready for logging.

I didn't use a pointer approach for swapping buffers; I switch them based on flags, as you can see from the code above.

The program runs fine for acquisitions of 2 minutes, even ten of them, but when trying to log for a longer time the previously discussed errors come into play.

I tried to find some hints about mutex implementation, but everything I found was based on an RTOS, which is a completely new world for me, since this is just one step of a longer and more complex project that is supposed to use the data I retrieve.

  • Is the way I implemented the double buffering correct, or must I rearrange it to work with pointer logic?
  • Is there any way I can implement a software mutex without using an RTOS?

Thank you