2015-10-12 09:59 AM
Hi
I want to know whether there is a penalty for accessing an array inside a structure compared with accessing a plain array. Here is the snippet I used when I ran into the problem:
unsigned char array_1[10240];
typedef struct
{
    int header;
    unsigned char array_2[10240];
} simple_queue;
typedef struct
{
    int top;
    simple_queue simple_arr[3];
} simple_stack;
simple_stack struct_var1;
....some file open ....
f_write(&fp, &struct_var1.simple_arr.array_2[0], 10240);
f_write(&fp, &array_1[0], 10240);
....
This is a file operation that works correctly, but I could see that the performance differed between the two steps.
In the benchmark test I observed that the first f_write took 50 ms to complete, while the second f_write took only 2 ms.
The only difference between the two f_write calls is that one uses an array inside a structure and the other uses a plain array.
Please explain this behavior.
#memory-organization #stm32f429
2015-10-12 10:29 AM
> So all that differs while comparing the two f_write are , one uses array in a structure and other uses a simple array..
... and the state of the file metadata (e.g. FAT table), and the state of the medium where you are writing (e.g. cache in an SD card full from a previous write)... The two arrays may be aligned differently in the MCU memory, but it's unlikely that would make such a difference.
JW
2015-10-12 10:37 AM
You are measuring the sum of two variables in your benchmark, array access and file write time. Use a generated test pattern instead of file I/O to reduce your benchmark to a single variable.
Your time difference could easily be from the file I/O. If it's flash you may be seeing a sector erase taking place.
Jack Peacock
2015-10-12 11:55 PM
I did some debugging today with the f_write removed, using a simple memcpy() operation instead:
unsigned char array_1[10240];
unsigned char array_2[10240];
typedef struct { unsigned char array_2[10240]; } simple_queue;
typedef struct { simple_queue simple_arr[3]; } simple_stack;
simple_stack struct_var1;
...
....
memcpy(array_2, array_1, 10240);
memcpy(&struct_var1.simple_arr.array_2[0], array_1, 10240);
....
The first memcpy() took 90 microseconds, while the second memcpy() took 310 microseconds. When I changed the unsigned char to int, each memcpy() took 40 microseconds.
So I assume this is related to compiler optimization. Can someone please explain how compiler optimization can lead to this behaviour?
2015-10-13 12:04 AM
Is this a Cortex-M0/M0+ device? Those don't allow unaligned accesses at all, and the compiler might be aware of the alignment of the source/target.
Also, are both arrays in the same memory?
JW
2015-10-13 12:31 AM
You got it! This is something that happens with memcpy. As f_write also uses memcpy, you can observe the same penalties.
The memcpy routine is provided by the C library, and depending on your library provider, this memcpy might use some optimisations based on source and target alignment. In other words, when possible, bytes are moved 4 or even 8 at a time for most loads and stores. Otherwise bytes are moved one by one. Of course the two cases end up with very different performance.
How is it connected to your data? Alignment. Yes, try the following to understand how it works:
memcpy (dst, src+0, count)
memcpy (dst, src+1, count)
memcpy (dst, src+2, count)
memcpy (dst, src+3, count)
memcpy (dst, src+4, count)
Normally there should be a difference. When the +0 and +4 cases exhibit a significant difference, try running up to +8.
2015-10-13 10:46 AM
Laurent,
> you got it !
Memory access alone would explain a (310-40)us difference, not the (50-2)ms difference.
JW
2015-10-13 11:17 AM
Oh, you are certainly right. Who knows what's between f_write and memcpy that explains a 50-2 ms difference!