Need to confirm how FMAC loads/consumes coefficients on STM32G4


I want to use the filter math accelerator (FMAC) in an unorthodox way and need to understand exactly how it loads coefficients. The input buffer will be loaded by a combination of DMA and code, with the load initiated in code. When the FMAC calculation starts, 8 input coefficients will already be loaded, and 4 more still need to be calculated and loaded into the input buffer. I want to start the FMAC calculation and then load the 4 remaining values into the input buffer while the FMAC unit performs the first 8 calculations. Does the FMAC consume the input buffer in a predictable sequence like this, and can operations be streamlined in this way?
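
To make the intended sequence concrete, here is a rough sketch of what I have in mind. It builds on a host-side stand-in for the register block so it compiles anywhere. The struct `fmac_regs_t`, the function `start_fir_then_top_up`, and the macros are my own hypothetical names, and the bit positions (START in bit 31, FUNC in bits 30:24, P in bits 7:0, FIR as function 8) are my reading of the reference manual, so treat them as assumptions. Whether the four late writes are actually safe while the calculation is running is exactly what I am asking.

```c
#include <stdint.h>

/* Hypothetical host-side stand-in for the FMAC register block.
   On target this would be the real peripheral, not a plain struct. */
typedef struct {
    volatile uint32_t X1BUFCFG, X2BUFCFG, YBUFCFG;
    volatile uint32_t PARAM, CR, SR, WDATA, RDATA;
} fmac_regs_t;

/* Assumed FMAC_PARAM layout: START = bit 31, FUNC = bits 30:24, P = bits 7:0. */
#define FMAC_PARAM_START  UINT32_C(0x80000000)
#define FMAC_FUNC_FIR     UINT32_C(8)

/* Start an FIR run with `taps` coefficients, then write the four values
   that were not ready at start time into the input buffer via WDATA. */
static void start_fir_then_top_up(fmac_regs_t *fmac, uint8_t taps,
                                  const int16_t late[4])
{
    /* 8 input values are assumed to be preloaded before this point. */
    fmac->PARAM = FMAC_PARAM_START | (FMAC_FUNC_FIR << 24) | taps;

    /* Intended overlap: these writes happen while the FMAC is already
       working on the preloaded values. No X1FULL check is shown here. */
    for (int i = 0; i < 4; i++) {
        fmac->WDATA = (uint16_t)late[i];
    }
}
```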

Similarly, if I want to update values in the input buffer after they are consumed by the FMAC, but before its calculations are complete, can I do that?

I need to reload the filter coefficients between FMAC operations. When the filter coefficient buffer is reloaded, does the FMAC do that in the background once a reload command is given, or does the loading process run on the CPU?
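
For clarity, the "reload command" I mean is the Load X2 Buffer function written to FMAC_PARAM. Below is a minimal sketch of how I would build that word. The helper `fmac_param_word` is a hypothetical name of my own, and the field layout (START bit 31, FUNC bits 30:24, R bits 23:16, Q bits 15:8, P bits 7:0, with function code 2 for Load X2 Buffer) is how I read the reference manual, so please correct me if that is wrong.

```c
#include <stdint.h>

/* Assumed function code for "Load X2 Buffer" (coefficient buffer). */
#define FMAC_FUNC_LOAD_X2  2U

/* Pack an FMAC_PARAM word from its fields (assumed layout, see above).
   func is masked to 7 bits; start is treated as a boolean. */
static uint32_t fmac_param_word(uint8_t func, uint8_t p, uint8_t q,
                                uint8_t r, int start)
{
    return ((start ? UINT32_C(1) : UINT32_C(0)) << 31)
         | ((uint32_t)(func & 0x7FU) << 24)
         | ((uint32_t)r << 16)
         | ((uint32_t)q << 8)
         |  (uint32_t)p;
}
```

For a 12-tap FIR coefficient set I would then write `fmac_param_word(FMAC_FUNC_LOAD_X2, 12, 0, 0, 1)` to FMAC_PARAM, assuming P carries the coefficient count for this function.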