2011-04-26 08:24 AM
memory allocation advice needed
2011-05-17 05:33 AM
My point would be that you will probably need to get more involved in managing memory usage between this parsing task and your primary task. You will likely have to replace some of your statically allocated, globally scoped data with something more flexible in order not to exceed your available resources. There is some time division between your needs for certain resources; if you allocate them all up front, assuming they have a simultaneous/infinite utilization/lifespan, you will run out of RAM.
Develop a floor plan of the memory currently being used, and a timeline for who gets to use which sections. If you can diagram some of these it might be helpful to the discussion.

One approach is to use the scatter file to direct the linker on how to place and group certain things within RAM. This might permit you to find a contiguous chunk of memory (say 16-22K) which you can share in a cooperative manner.

Another is to look at your application's current RAM utilization. Do you have some large uninitialized structures or buffers that you currently allocate statically at compile time, which you could instead allocate from a heap at run time? Do you have a lot of static allocations that could be migrated to local, stack-based allocations? Do you have global structures that could be handled on the stack if you passed parameters/pointers down your call trees? If so, consider making them stack based, and make your stack footprint bigger to accommodate them.

When you talk about uploading to a PC, do you mean just sending the raw 16KB of data from your SPI flash, or the 6KB parsed version? The former could be sent in 128-byte chunks; for the latter, consider storing your parsed-down data in another place within the SPI flash.
2011-05-17 05:33 AM
Are you sure this is the best approach?
You clearly know it isn't.
2011-05-17 05:33 AM
You wrote, "Would really appreciate considerations on this subject." There's no magic solution being hidden from you. It just seems silly not to pre-process the data if it is constant. Maybe you already have a system using CAN to transfer the data; if not, have you considered how you are going to pack the data into CAN messages and then unpack them?
It's hard to make suggestions without knowing more about your system. Anyway, Clive is better at constructive suggestions - I just like to pick at loose ends.
2011-05-17 05:33 AM
Are you sure this is the best approach?
If you can diagram some of these it might be helpful to the discussion. Is it OK to place some description here, so the discussion may be closer to reality?
2011-05-17 05:33 AM
It's hard to make suggestions without knowing more about your system.
I am not looking for 'magic solutions', and I am not asking anyone to do my job for me. I am just asking for considerations from people who definitely know more than I do and are willing to share that knowledge. And I learn a lot, and really appreciate any input. So here it is.

The system consists of 3 controllers: master, slave1, slave2.
- Master - Ethernet PC interface, video processing (switching, on-screen displays, etc.), slave control, etc.
- Slave1 - audio processing
- Slave2 - audio inputs interface

Communication between controllers is over CAN. The cycle time is 10 ms (driven by the audio processing), so speed is a big factor.

The PC uploads the Main profile to the master, which stores it in SPI flash as Main_spi_profile. This is the base profile; it also contains some data unrelated to the system operation. Size - about 16K. The master and slaves operate on subprofiles of the Main profile:
- Master_profile - about 7K
- Slave1_profile - 3K
- Slave2_profile - 6K

Some data in the profiles can be modified at any time by a command from the PC. The Master profile contains all the data related to master operation, plus the data from the slave profiles which can be modified. Since profiles can be modified, I keep a copy of the Master profile in non-volatile FRAM, which allows me to store modified data without erasing. The host (PC) can also request a profile download - and this should be the Main profile with the modified data included.

So this is what I am doing now.

1. On a Main profile upload (can happen at any time):
- Store the Main profile in Main_spi_profile.
- Parse Main_spi_profile into Master_ram_profile (either on the fly or from an instantiated Main_ram_profile).
- Store a copy of Master_ram_profile in FRAM - creating Master_fram_profile.
- Restart the system.

2. On a system reboot:
- Verify the integrity of Master_fram_profile. If not OK, redo the parsing as in step 1.
- Create Master_ram_profile (a copy of Master_fram_profile).
- Create Slave1_profile (the parts which can change are parsed from the Master profile, the static parts from the Main profile).
- Send the Slave1 profile to Slave1 over CAN. BTW, this works quite nicely and fast, since at this point no other CAN communication happens and it's effectively point-to-point. The only things I needed to add were a byte number as the first byte of each packet, a confirmation after each packet, and a final checksum verification.
- Create Slave2_profile (the same way as Slave1) and send it to Slave2 over CAN.

Creating the Slave1_ and Slave2_ profiles in RAM doesn't waste memory: since they are used one at a time and only once, I share them in a union with the on-screen display buffers.

3. On data modification (from the PC): update the data in Master_ram_profile and Master_fram_profile, and update the slaves over CAN.

4. On a profile download request: replace the modified parts of the Main profile (easy if there is an instance of the Main profile in RAM, quite cumbersome otherwise).

So it all actually works fine, and seems logical (to me). The main question is effective memory usage - in particular, instantiating the Main profile in RAM. Doing so simplifies parsing/deparsing and makes firmware modifications easier if the Main profile format changes, but it costs an extra 16KB of RAM. Parsing on the fly is doable (thanks to Hex Editor Neo, which allows binding the binary file to the structure and makes defining the addresses simpler). It is also hard for me to justify using the heap for Main_ram_profile, since there is very little other data in the program that could be dynamically allocated for RAM reuse. Thank you.
2011-05-17 05:33 AM
The diagram is attached to the previous post
2011-05-17 05:33 AM
Is it OK to place some description here, so the discussion may be closer to reality?
It is a public forum, so sanitize any confidential/sensitive information; you can attach RAR/ZIP archives to your posts. Perhaps having an example of the 16KB and 6KB JSON profiles would be helpful, along with a C harness that transforms one into the other. I think debunking the need for 22KB of RAM would be a good first step. Consider also a paradigm shift: how would you perform this task if you only had 4KB of RAM to work with? This is the difference between programming embedded and programming a 3 GHz Windows PC with 4GB of RAM.
2011-05-17 05:33 AM
No more input on the general setup?
2011-05-17 05:33 AM
No more input on the general setup?
Unless you can get beyond holding everything in memory at once, probably not. You need to move your goal posts.

If you can't get the PC application to provide you with more optimally formatted data, or limited line lengths, etc., you might need to transform the data into a representation you can handle more easily as an initial step. Whether that is some tokenization, or preprocessing, depends on just how variable the output from the PC application can be.

If I had a large chunk of string/ASCII data which I needed to filter into three streams, I'd probably scan the data and generate a bitmap array indicating the outputs to send each input byte to, i.e.:

00 = Master, 01 = Slave 1, 10 = Slave 2, 11 = ALL

or

xxx1 = Master, xx1x = Slave 1, x1xx = Slave 2, 0000 = Stripped, 0111 = ALL

This would allow you to decode into multiple streams very rapidly, on the fly. So perhaps 24K, or 20K total, for Original+Master+Slave1+Slave2. If you were to get really clever, you could perhaps create some RLE (Run Length Encoding) scheme to compress the bitmap quite significantly, as you are probably taking/not-taking whole lines of data at a time, or at least several characters.

How many lines in this file, 1000? What's the bandwidth of the FRAM vs CAN? If the speed of the FRAM is an order of magnitude, or two, faster than where you are pushing the data, why does it all need to be in RAM?

But to do any of this you're going to have to get a tad more complex than treating the data as a monolithic lump that must be instantiated multiple times.