OpenAMP u-boot startup

JMein.1 · 2021-08-27

After reading most of the topics related to my question without finding an answer that solves our problem, here is my description:

We are currently integrating OpenAMP in our Application. For now, we are using the configuration from the OpenAMP_TTY_echo example project with enabled trace logs, and a single virtual UART between the M4 and the A7.

When starting the Application by using 'rproc start' in a booted linux system, it works as expected. The trace logs are available, and the /dev/ttyRPMSG0 virtual UART is loaded:

cat /sys/class/remoteproc/remoteproc0/state outputs 'running'
cat /sys/kernel/debug/remoteproc/remoteproc0/trace0 shows our debug outputs from the M4

We can send dummy data to /dev/ttyRPMSG0 and get the echoed response

In our use case we need the M4 Application to start as early as possible, so we have configured u-boot SSBL to load the ELF file and start the M4 execution. We are then facing the problem that the software is started, but the linux kernel does not load the /dev/ttyRPMSG0 device:

cat /sys/class/remoteproc/remoteproc0/state outputs 'running'
/sys/kernel/debug/remoteproc/remoteproc0/trace0 is not loaded
cat /sys/kernel/debug/remoteproc/remoteproc0/resource_table is empty
/dev/ttyRPMSG0 is not loaded
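The four observations above can be collected in one step. The following is only a minimal sketch, assuming Python 3 is available on the target and that debugfs is mounted and readable (usually root only). The function name and the `root` parameter are hypothetical and are not part of any ST tooling.

```python
from pathlib import Path


def remoteproc_status(root="/", index=0, tty="ttyRPMSG0"):
    """Gather the four observations listed above for one remoteproc instance.

    `root` is a hypothetical knob so the check can also run against a
    copied directory tree instead of the live system.
    """
    base = Path(root)
    sysfs = base / "sys/class/remoteproc" / f"remoteproc{index}"
    debugfs = base / "sys/kernel/debug/remoteproc" / f"remoteproc{index}"

    def read(path):
        # Unreadable files (e.g. debugfs without root) count as missing.
        try:
            return path.read_text(errors="replace").strip()
        except OSError:
            return None

    def present(path):
        # Path.exists() can raise on permission errors, so guard it.
        try:
            return path.exists()
        except OSError:
            return False

    return {
        "state": read(sysfs / "state"),
        "trace0_present": present(debugfs / "trace0"),
        "resource_table_empty": not read(debugfs / "resource_table"),
        "tty_present": present(base / "dev" / tty),
    }
```

Calling `remoteproc_status()` once after 'rproc start' and once after a U-Boot start gives two dictionaries that can be compared field by field.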

We checked the following points:

When attaching to the running M4 execution after the startup sequence has completed we see that the function rproc_virtio_wait_remote_ready(vdev) does not return. It seems that linux rproc does not load the resource table.

The resource_table is present in the ELF file (checked with readelf)

u-boot rproc does load the correct address of the resource table into the TAMPER backup register

During our investigations we found out that the issue seems to be M4 code + ram size dependent. When the code size exceeds ~130KB (Code + RAM where Code is placed in MCU SRAM 1) we can reproduce the problem.

Below that size the M4 Application is started correctly from u-boot and under Linux the trace is available and the RPMSG device is created.

We tried to get verbose log messages from the related kernel modules (stm32_rproc and remoteproc*) during boot by adding the dyndbg parameter to the kernel command line in u-boot (dyndbg="file ec.c +p"), but we only got kernel panics when trying to boot with this parameter. Dynamic debugging is enabled in the kernel configuration. How can we enable this additional debug output?

Any ideas or suggestions from your side to solve this issue would be greatly appreciated.

Best regards,

Jan-Otto

Olivier GALLIEN · 2021-08-29

Hi @JMein.1 ,

Thanks for your post.

Can you please confirm that you followed all the instructions in

https://wiki.st.com/stm32mpu/wiki/How_to_start_the_coprocessor_from_the_bootloader

Can you confirm that you are using 'st,auto-boot' in the kernel DT?

It indeed seems that the issue comes from a resource table allocation problem when your code exceeds the MCU RAM1 size (128 KB).

I need to escalate this to an expert and will come back to you.

Olivier

Olivier GALLIEN
In order to give better visibility on the answered topics, please click on 'Accept as Solution' on the reply which solved your issue or answered your question.

JMein.1 · 2021-08-30

Hello Olivier,

thank you for your hint, but unfortunately I have to confirm that we double-checked the wiki instructions.

Yes, we are using 'st,auto-boot' in the kernel DT.

After some more investigation we also observed that, in the bad case, the *.elf file loaded into DDR differs from the originally compiled one (especially in the last part of the *.elf file). However, the "Tamper" register holds the correct address of the resource table, so the parsing seems to have been correct?

Hopefully you have some other points for us to check, or directly an idea of what goes wrong in our case.

Best regards,

Jan-Otto

Olivier GALLIEN · 2021-09-02

Hi @JMein.1 ,

After several iterations with you via private message to get the complete details of your project, we are now in a position to explain the issue.

The issue comes from an overlap between the code (m_text) and data (m_data) memory sections.

The example linker script defines the following regions:

/* Memories definition */
MEMORY
{
  m_interrupts (RX) : ORIGIN = 0x00000000, LENGTH = 0x00000298
  m_text       (RX) : ORIGIN = 0x10000000, LENGTH = 0x00020000
  m_data       (RW) : ORIGIN = 0x10020000, LENGTH = 0x00020000
  m_ipc_shm    (RW) : ORIGIN = 0x10040000, LENGTH = 0x00008000
}

Your project was based on this definitions but with additional code when you . the elf resulting is mapped as following:

Sections:
Idx Name                             Size      VMA       LMA
  0 .isr_vector                      00000298  00000000  00000000
  1 .text                            000161f0  10000000  10000000
  2 .startup_copro_fw.Reset_Handler  00000050  100161f0  100161f0
  3 .rodata                          0000876c  10016240  10016240
  4 .ARM.extab                       00000000  1001e9ac  1001e9ac
  5 .ARM                             00000008  1001e9ac  1001e9ac
  6 .preinit_array                   00000000  1001e9b4  1001e9b4
  7 .init_array                      00000008  1001e9b4  1001e9b4
  8 .fini_array                      00000004  1001e9bc  1001e9bc
  9 .data                            00000df0  10020000  1001e9c0
 10 .resource_table                  0000008c  10020df0  1001f7b0
 11 .bss                             00000fb4  10020e80  1001f840
 12 ._user_heap_stack                00000604  10021e34  100207f4

The VMA is the virtual memory address, i.e. the effective address of the code and the variables during firmware execution.

The LMA is the load memory address. For code it is identical to the VMA; for data it lies in the m_text region and holds the initial values of the variables.

The issue in the ELF above is that the LMA of the ._user_heap_stack section does not lie within the m_text memory area but overflows into m_data (the same holds for the end of the .bss section).

As a side effect, part of m_data is overwritten.

The current linker script does not warn about this problem and should be improved.

The problem was visible only when booting from U-Boot, because when the firmware is started from Linux, the resource table is re-initialized by Linux after the M4 firmware starts.

A workaround is to carefully monitor the real size of m_text and shift m_data accordingly.
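The monitoring mentioned in the workaround can be automated. Below is a minimal sketch in Python, assuming the section values and the m_text region quoted in this post; the function names are hypothetical. It only compares address ranges. Which bytes a given loader really writes depends on the ELF program headers and is not modeled here.

```python
# m_text region from the linker script quoted above: (origin, length).
M_TEXT = (0x10000000, 0x00020000)

# (name, size, vma, lma) for the sections whose LMA differs from their VMA,
# copied from the section listing above.
SECTIONS = [
    (".data", 0x0DF0, 0x10020000, 0x1001E9C0),
    (".resource_table", 0x008C, 0x10020DF0, 0x1001F7B0),
    (".bss", 0x0FB4, 0x10020E80, 0x1001F840),
    ("._user_heap_stack", 0x0604, 0x10021E34, 0x100207F4),
]


def lma_overflows(sections, region):
    """Sections whose load image ends beyond the region, with the excess."""
    region_end = region[0] + region[1]
    report = []
    for name, size, _vma, lma in sections:
        excess = lma + size - region_end
        if excess > 0:
            report.append((name, excess))
    return report


def vma_overlaps(sections, lo, hi):
    """Sections whose run-time (VMA) range intersects [lo, hi)."""
    hits = []
    for name, size, vma, _lma in sections:
        overlap = min(vma + size, hi) - max(vma, lo)
        if overlap > 0:
            hits.append((name, overlap))
    return hits


if __name__ == "__main__":
    region_end = M_TEXT[0] + M_TEXT[1]
    for name, excess in lma_overflows(SECTIONS, M_TEXT):
        print(f"{name}: LMA range ends {excess:#x} bytes past m_text")
    # Highest load address reached by any of the listed sections.
    top = max(lma + size for _n, size, _v, lma in SECTIONS)
    for name, overlap in vma_overlaps(SECTIONS, region_end, top):
        print(f"{name}: {overlap:#x} bytes of its VMA range are overlapped")
```

With the values from this thread, the first check flags .bss and ._user_heap_stack, and the second shows that the overflowing load range reaches into the run-time addresses of .data and the start of .resource_table.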

We are working on a clean solution.

Olivier


JMein.1 · 2021-09-03

Hello Olivier,

thank you for your quick reply. Your answer raised a few, hopefully final, questions:

Which tool (I guess a command-line one) do you use to extract exactly this section output with VMA and LMA?
"Your project was based on this definitions but with additional code when you . the elf resulting is mapped as following:" After the "you" something seems to be missing. Can you please add it, if that is the case?
What exactly is the difference between starting the M4 from U-Boot and starting it from the Linux kernel? The M4 has in its resource table, for example, the trace buffer, which is filled in by Linux, so the *.elf file is parsed completely by rproc. What is the exact workflow here?

So for now we will first take care of the m_text size (once we get the same output as in your post) and hope that you will come back with a *.ld file in which this issue is caught directly by the M4 linker (normally, if something does not fit, it complains).

Thank you in advance for your investigation and looking forward to hearing from you.

Best regards,

Jan-Otto

Olivier GALLIEN · 2021-09-07

Hi @JMein.1 ,

Please find the answers to your questions below:

To extract the VMA and LMA we use "objdump -hw HelloWorldM4.elf" on a Linux workstation.
Sorry, nothing is missing; the words "when you" are nonsense and can be removed.
Please find a detailed explanation below.
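For scripted checks, the listing printed by that objdump command can be turned into numbers. This is a minimal sketch in Python, assuming the usual column order (Idx, Name, Size, VMA, LMA) shown earlier in this thread; `parse_sections` is a hypothetical helper name.

```python
import re

# One row of an `objdump -h` listing: index, name, then Size, VMA, LMA in hex.
# Anything after the LMA column (file offset, alignment, flags) is ignored.
_ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\b"
)


def parse_sections(listing):
    """Return (name, size, vma, lma) tuples parsed from an objdump -hw listing."""
    sections = []
    for line in listing.splitlines():
        match = _ROW.match(line)
        if match:
            _idx, name, size, vma, lma = match.groups()
            sections.append((name, int(size, 16), int(vma, 16), int(lma, 16)))
    return sections
```

The listing text could come from `subprocess.run(["objdump", "-hw", "HelloWorldM4.elf"], capture_output=True, text=True, check=True).stdout`. Whether the host objdump accepts the Cortex-M4 ELF or a cross-toolchain objdump is needed is an assumption to verify on your workstation.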

When the Cortex-M4 firmware is loaded and started by U-Boot:

In this case the Cortex-M4 is started before Linux. U-Boot loads the Cortex-M4 ELF firmware and stores the resource table address and the state of the Cortex-M4 in the backup registers (https://wiki.st.com/stm32mpu/wiki/STM32MP15_backup_registers#Memory_mapping) to inform Linux.

When the Linux kernel boots, the remoteproc framework detects the state of the Cortex-M4 and attaches to it by parsing the resource table (already loaded in memory by U-Boot) and enabling the associated resources (RPMsg, trace).

With the m_text overflow issue, the resource table memory section has been corrupted before Linux boots. So when the Linux remoteproc framework tries to read the resource table, the data is already corrupted.

When Linux starts the Cortex-M4, the sequence is different:

1. The Linux remoteproc framework parses the ELF firmware to get the resource table and makes a local copy.
2. The Linux remoteproc framework loads the ELF file into memory.
3. The Linux remoteproc framework parses the resource table (local copy), enables the associated resources (RPMsg, trace) and updates the local copy accordingly.
4. The Linux remoteproc framework starts the Cortex-M4 firmware.

With the m_text overflow issue, the resource table is corrupted at this point.

5. The Linux remoteproc framework overwrites the resource table section with the local copy, which is not corrupted.
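The difference between the two sequences can be illustrated with a toy model. This is only a sketch in Python: it is not the remoteproc implementation, all names are hypothetical, and `corrupt` merely stands in for the overflow side effect described above.

```python
def attach_after_uboot(memory, offset, table, corrupt):
    """U-Boot path: Linux reads the table in place, after any corruption."""
    memory[offset:offset + len(table)] = table        # U-Boot loads the firmware
    corrupt(memory)                                   # overflow damages the table
    return bytes(memory[offset:offset + len(table)])  # what Linux then parses


def start_from_linux(memory, offset, table, corrupt):
    """Linux path: a local copy is written back after the firmware starts."""
    local_copy = bytes(table)                         # step 1: copy from the ELF
    memory[offset:offset + len(table)] = table        # step 2: load the firmware
    # step 3: resources are set up from local_copy (not modeled here)
    corrupt(memory)                                   # step 4: start, table damaged
    memory[offset:offset + len(table)] = local_copy   # step 5: write-back
    return bytes(memory[offset:offset + len(table)])


TABLE = b"RSC_TABLE"  # placeholder bytes, not a real resource table


def clobber(memory):
    # Overwrite the first four bytes, standing in for the overflow.
    memory[0:4] = b"\xff" * 4


if __name__ == "__main__":
    print(attach_after_uboot(bytearray(16), 0, TABLE, clobber) == TABLE)
    print(start_from_linux(bytearray(16), 0, TABLE, clobber) == TABLE)
```

In this model only the Linux-started path ends with an intact table, which matches the observation that the problem showed up only when booting from U-Boot.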

Olivier


JMein.1 · 2021-09-08

Hello Olivier,

thank you for your reply. I hope you can then provide us with a *.ld file in which this behavior is handled correctly.

In the meantime, we'll have a look at the .text area and adapt the linker file accordingly.

I also double-checked the M4 startup code on its own with the overlap in place. The overlap does not affect the startup, because the startup code does not touch the overlapping part: it only initializes .bss at its VM addresses and .data from its LM addresses. The .resource_table and ._user_heap_stack sections are untouched by the startup code.

To summarize, all of the trouble with the overlapping area comes from remoteproc and the U-Boot loader. Is this correct, or am I still missing something?

Best regards,

Jan-Otto

Olivier GALLIEN · 2021-09-15

Hi @JMein.1 ,

Sorry for the late reply.

Please find attached the patch that will be integrated in the next version.

Olivier


JMein.1 · 2021-09-15

Hello Olivier,

thank you for the patch. We will integrate this on our machine.

One last question:

So the .text area now contains the code plus the initial values of the initialized data? And rproc does not care about the LM addresses of the .bss segment, because .bss is not filled with values from .text but only with zeros?
Otherwise the 128K .text segment would have to hold the code, the initial values of the data AND the uninitialized .bss data (which would not make sense).

That would leave much less room for the 'real' code.

I hope my question and my doubts are clear.

Best regards,

Jan-Otto

JMein.1 · ‎2021-09-15