2022-04-19 11:35 AM
Hi!
Problem:
I found a bug when enabling the ETH peripheral on the STM32F407 processor. The error is a DMA bus error.
What's happening:
It occurs when the Ethernet PHY receives a message from my router.
How to produce the error:
I do the following steps to reproduce the error.
Then the callback function HAL_ETH_ErrorCallback(ETH_HandleTypeDef *heth) is called.
The error code is a DMA error.
/** @defgroup ETH_Error_Code ETH Error Code
* @{
*/
#define HAL_ETH_ERROR_NONE ((uint32_t)0x00000000U) /*!< No error */
#define HAL_ETH_ERROR_PARAM ((uint32_t)0x00000001U) /*!< Parameter error */
#define HAL_ETH_ERROR_BUSY ((uint32_t)0x00000002U) /*!< Busy error */
#define HAL_ETH_ERROR_TIMEOUT ((uint32_t)0x00000004U) /*!< Timeout error */
#define HAL_ETH_ERROR_DMA ((uint32_t)0x00000008U) /*!< DMA transfer error */
#define HAL_ETH_ERROR_MAC ((uint32_t)0x00000010U) /*!< MAC transfer error */
#if (USE_HAL_ETH_REGISTER_CALLBACKS == 1)
#define HAL_ETH_ERROR_INVALID_CALLBACK ((uint32_t)0x00000020U) /*!< Invalid Callback error */
#endif /* USE_HAL_ETH_REGISTER_CALLBACKS */
/**
* @}
*/
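To see in the callback which of these bits are actually set, the error word can be decoded bit by bit. Below is a minimal sketch; the constants are copied from the listing above (with a DEMO_ prefix so they do not clash with the real HAL header), and the helper name demo_eth_error_describe is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Values copied from the ETH_Error_Code group quoted above.
 * The DEMO_ prefix is an assumption to keep this sketch HAL-independent. */
#define DEMO_ETH_ERROR_PARAM   0x00000001U
#define DEMO_ETH_ERROR_BUSY    0x00000002U
#define DEMO_ETH_ERROR_TIMEOUT 0x00000004U
#define DEMO_ETH_ERROR_DMA     0x00000008U
#define DEMO_ETH_ERROR_MAC     0x00000010U

/* Hypothetical helper: writes the names of all set error bits into out,
 * separated by '|', or "NONE" when no bit is set. */
void demo_eth_error_describe(uint32_t code, char *out, size_t out_len)
{
    static const struct { uint32_t bit; const char *name; } table[] = {
        { DEMO_ETH_ERROR_PARAM,   "PARAM"   },
        { DEMO_ETH_ERROR_BUSY,    "BUSY"    },
        { DEMO_ETH_ERROR_TIMEOUT, "TIMEOUT" },
        { DEMO_ETH_ERROR_DMA,     "DMA"     },
        { DEMO_ETH_ERROR_MAC,     "MAC"     },
    };

    if (out_len == 0U) {
        return;
    }
    out[0] = '\0';
    if (code == 0U) {
        snprintf(out, out_len, "NONE");
        return;
    }
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if ((code & table[i].bit) != 0U) {
            size_t used = strlen(out);
            /* snprintf truncates safely if the buffer is too small */
            snprintf(out + used, out_len - used, "%s%s",
                     used ? "|" : "", table[i].name);
        }
    }
}
```

Inside HAL_ETH_ErrorCallback, one could pass heth->ErrorCode to such a helper and print the result over a UART or inspect it in the debugger.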
The DMA error is a DMA bus error.
The following part of the HAL code sets the DMA error code. The relevant line is marked with an arrow (<--- HERE!).
/* ETH DMA Error */
if (__HAL_ETH_DMA_GET_IT(heth, ETH_DMASR_AIS))
{
  if (__HAL_ETH_DMA_GET_IT_SOURCE(heth, ETH_DMAIER_AISE))
  {
    heth->ErrorCode |= HAL_ETH_ERROR_DMA;
    /* if fatal bus error occurred */
    if (__HAL_ETH_DMA_GET_IT(heth, ETH_DMASR_FBES))
    {
      /* Get DMA error code */
      heth->DMAErrorCode = READ_BIT(heth->Instance->DMASR, (ETH_DMASR_FBES | ETH_DMASR_TPS | ETH_DMASR_RPS)); /* <--- HERE! */
      /* Disable all interrupts */
      __HAL_ETH_DMA_DISABLE_IT(heth, ETH_DMAIER_NISE | ETH_DMAIER_AISE);
      /* Set HAL state to ERROR */
      heth->gState = HAL_ETH_STATE_ERROR;
    }
    else
    {
      /* Get DMA error status */
      heth->DMAErrorCode = READ_BIT(heth->Instance->DMASR, (ETH_DMASR_ETS | ETH_DMASR_RWTS |
                                                            ETH_DMASR_RBUS | ETH_DMASR_AIS));
      /* Clear the interrupt summary flag */
      __HAL_ETH_DMA_CLEAR_IT(heth, (ETH_DMASR_ETS | ETH_DMASR_RWTS |
                                    ETH_DMASR_RBUS | ETH_DMASR_AIS));
    }
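Note that the fatal-bus-error branch stores only FBES, TPS and RPS in DMAErrorCode. If I read the STM32F4 reference manual (RM0090) correctly, ETH_DMASR also carries a 3-bit EBS field that tells which DMA channel faulted and during what kind of access. Here is a rough sketch of decoding a raw DMASR value; the bit positions are my reading of the manual and should be checked against your copy, and the struct and function names are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed ETH_DMASR layout (my reading of RM0090; verify before relying on it). */
#define DEMO_DMASR_FBES     (1UL << 13)  /* fatal bus error status          */
#define DEMO_DMASR_RPS_POS  17U          /* receive process state, 3 bits   */
#define DEMO_DMASR_TPS_POS  20U          /* transmit process state, 3 bits  */
#define DEMO_DMASR_EBS_POS  23U          /* error bits status, 3 bits       */

typedef struct {
    bool    fatal_bus_error;    /* FBES set                                 */
    bool    by_tx_dma;          /* EBS bit 0: 1 = TxDMA, 0 = RxDMA          */
    bool    read_transfer;      /* EBS bit 1: 1 = read, 0 = write           */
    bool    descriptor_access;  /* EBS bit 2: 1 = descriptor, 0 = buffer    */
    uint8_t rx_state;           /* raw RPS value, 0..7                      */
    uint8_t tx_state;           /* raw TPS value, 0..7                      */
} demo_dmasr_info;

/* Hypothetical helper: splits a raw DMASR value into its diagnostic fields. */
demo_dmasr_info demo_decode_dmasr(uint32_t dmasr)
{
    demo_dmasr_info info;
    uint32_t ebs = (dmasr >> DEMO_DMASR_EBS_POS) & 0x7U;

    info.fatal_bus_error   = (dmasr & DEMO_DMASR_FBES) != 0U;
    info.by_tx_dma         = (ebs & 0x1U) != 0U;
    info.read_transfer     = (ebs & 0x2U) != 0U;
    info.descriptor_access = (ebs & 0x4U) != 0U;
    info.rx_state          = (uint8_t)((dmasr >> DEMO_DMASR_RPS_POS) & 0x7U);
    info.tx_state          = (uint8_t)((dmasr >> DEMO_DMASR_TPS_POS) & 0x7U);
    return info;
}
```

In the error callback, passing heth->Instance->DMASR to such a decoder should show whether the fault came from the receive side and whether it happened on a descriptor or on a data buffer, which narrows down where the bad address comes from.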
My main function
/* Private variables ---------------------------------------------------------*/
ETH_TxPacketConfig TxConfig;
ETH_DMADescTypeDef DMARxDscrTab[ETH_RX_DESC_CNT]; /* Ethernet Rx DMA Descriptors */
ETH_DMADescTypeDef DMATxDscrTab[ETH_TX_DESC_CNT]; /* Ethernet Tx DMA Descriptors */
void HAL_ETH_ErrorCallback(ETH_HandleTypeDef *heth){
  uint32_t errorCode = heth->ErrorCode;
}
int main(void)
{
  /* USER CODE BEGIN 1 */
  /* USER CODE END 1 */
  /* MCU Configuration--------------------------------------------------------*/
  /* Reset of all peripherals, Initializes the Flash interface and the Systick. */
  HAL_Init();
  /* USER CODE BEGIN Init */
  /* USER CODE END Init */
  /* Configure the system clock */
  SystemClock_Config();
  /* USER CODE BEGIN SysInit */
  /* USER CODE END SysInit */
  /* Initialize all configured peripherals */
  MX_GPIO_Init();
  MX_FSMC_Init();
  MX_DCMI_Init();
  MX_SPI2_Init();
  MX_TIM1_Init();
  MX_TIM3_Init();
  MX_ADC1_Init();
  MX_CAN1_Init();
  MX_RTC_Init();
  MX_TIM4_Init();
  MX_UART5_Init();
  MX_ETH_Init();
  /* USER CODE BEGIN 2 */
  /* Start up LCD */
  HAL_GPIO_WritePin(LCD_RESET_GPIO_Port, LCD_RESET_Pin, GPIO_PIN_SET);
  LCD_BL_ON();
  lcdInit();
  HAL_GPIO_WritePin(ETH_RESET_GPIO_Port, ETH_RESET_Pin, GPIO_PIN_RESET);
  HAL_Delay(1);
  HAL_GPIO_WritePin(ETH_RESET_GPIO_Port, ETH_RESET_Pin, GPIO_PIN_SET);
  /* Enable interrupt */
  HAL_ETH_Start_IT(&heth);
  /* USER CODE END 2 */
  /* Infinite loop */
  /* USER CODE BEGIN WHILE */
  while (1)
  {
    /* USER CODE END WHILE */
    /* USER CODE BEGIN 3 */
  }
  /* USER CODE END 3 */
}
Hardware settings
The hardware is configured for RMII with the DP83848 Ethernet PHY.
Yes! The LED D1 flashes when something happens on the network. The ACT_LED/COL pin should go low when there is activity. The oscillator runs at 50 MHz and sits very close to the DP83848 chip.
Software settings:
Download my project here:
STM32 project:
Schematic project (KiCAD):
Why do I think this is a bug?
Because I have not configured ETH DMA, and yet I get an error about it when my Ethernet PHY receives a message and passes it on to the STM32 processor. So I assume that STM32CubeIDE 1.9.0 has some issues.
Why am I 100% sure that I have built the hardware correctly?
The Ethernet PHY address is 0x1, and I have verified that this is the correct address. The LED D1 flashes when there is activity on the network.
The callback function is called when the LED D1 flashes after initialization.
2022-04-19 12:55 PM
Your hardware likely is good, because it seems to "do something" when the cable is connected.
But the software obviously has issues.
> Because I have not configured ETH DMA
ETH has its own DMA master. It does not use DMA controllers of STM32. This DMA starts when you call HAL_ETH_Start_IT.
Again: your code does not assign RX buffer memory to the descriptors, so the ETH DMA fails. What else can it do?
2022-04-19 01:02 PM
I call `HAL_ETH_Start_IT` at the beginning.
>> Your code does not assign RX buffer memory to the descriptors.
How can I do that?
2022-04-19 01:37 PM
Please see the LwIP examples. File ethernetif.c.
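The idea there, roughly, is that the driver must be handed a pointer to a free, suitably sized buffer for every RX descriptor. Below is a minimal sketch of a static buffer pool, kept independent of the HAL and of LwIP so it can be tried on a PC; the pool size, the 1536-byte buffer length and all demo_ names are assumptions for illustration. With the reworked HAL ETH driver, I believe the matching hook is the weak HAL_ETH_RxAllocateCallback(uint8_t **buff), which would call something like demo_rx_alloc(); please check this against the driver version you have.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_RX_BUF_COUNT 4U     /* assumption: one buffer per RX descriptor   */
#define DEMO_RX_BUF_SIZE  1536U  /* assumption: room for one full-size frame   */

/* Statically allocated buffers; on the target they must live in RAM
 * that the ETH DMA can reach. */
static uint8_t demo_rx_pool[DEMO_RX_BUF_COUNT][DEMO_RX_BUF_SIZE];
static uint8_t demo_rx_in_use[DEMO_RX_BUF_COUNT];

/* Hand out one free buffer. *buff is set to NULL when the pool is exhausted. */
void demo_rx_alloc(uint8_t **buff)
{
    *buff = NULL;
    for (size_t i = 0; i < DEMO_RX_BUF_COUNT; i++) {
        if (!demo_rx_in_use[i]) {
            demo_rx_in_use[i] = 1U;
            *buff = demo_rx_pool[i];
            return;
        }
    }
}

/* Give a buffer back to the pool once the received frame has been processed. */
void demo_rx_free(uint8_t *buff)
{
    for (size_t i = 0; i < DEMO_RX_BUF_COUNT; i++) {
        if (buff == demo_rx_pool[i]) {
            demo_rx_in_use[i] = 0U;
            return;
        }
    }
}
```

A fixed pool avoids malloc in interrupt context. The cost is that frames are dropped when all buffers are in use, so the application has to free each buffer promptly after reading the frame.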
2022-04-19 02:05 PM
I'm not using LwIP here. Only pure ETH.
2022-04-21 02:05 PM
Nobody knows? I haven't found a solution to this problem.
2022-04-21 05:19 PM
You can also look at the FreeRTOS+ implementation.
https://github.com/FreeRTOS/FreeRTOS-Plus-TCP/tree/main/portable/NetworkInterface/STM32Fxx
2022-04-22 01:29 AM
Sorry. I cannot implement that. I'm using a CubeMX project, and these .c files are generated as pure HAL code only. If I use that implementation and then update my project, CubeMX will overwrite the code.
It's much better to understand why I'm getting a Fatal Bus error for the DMA.
Is it because I have the FSMC LCD activated?
2022-04-23 10:15 AM
I looked at that FreeRTOS+ link. Oh, dear... They are using task notifications, but are setting interrupt flags in a separate ulISREvents variable. Then later in a task they are testing and resetting those flags without disabling at least the ETH interrupt at NVIC. That's just a bunch of race conditions, including this one.
But the solution is ridiculously simple. Instead of using vTaskNotifyGiveFromISR(...) from the ISRs, they should use xTaskNotifyFromISR(hTask, fEvents, eSetBits, &fYield). Then in the task ulTaskNotifyTake(pdTRUE, pdMS_TO_TICKS(250)) will just return those flags and clear the internal variable without race conditions. And, if it returns zero, then it has timed out because there are no new frames received for at least 250ms, which means the time has come to check (and update, if necessary) the state of physical link. A proper link state checking logic free of charge without additional tasks, timers or whatnot!
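The underlying pattern, stripped of FreeRTOS: the ISR ORs event bits into a shared word, and the task fetches and clears that word in one atomic step, so a bit set between the read and the clear cannot be lost. Here is a small sketch using C11 atomics as a stand-in for the task notification value. It is an analogy to illustrate the race-free take, not the FreeRTOS API, and the DEMO_EVT_ names are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical event bits, one per interrupt cause. */
#define DEMO_EVT_RX   0x01U
#define DEMO_EVT_TX   0x02U
#define DEMO_EVT_ERR  0x04U

/* Shared between ISR and task; plays the role of the notification value. */
static _Atomic uint32_t demo_events;

/* ISR side: set bits without disturbing bits that are already pending. */
void demo_isr_post(uint32_t bits)
{
    atomic_fetch_or(&demo_events, bits);
}

/* Task side: return all pending bits and reset the word to zero
 * in a single atomic operation. */
uint32_t demo_task_take(void)
{
    return atomic_exchange(&demo_events, 0U);
}
```

The racy variant reads the flags and then clears them in a separate step; an interrupt that fires in between sets a bit that the clear then wipes out. The single exchange removes that window.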
I cannot understand this... At least 5 years of many developers looking at a bunch of blatant race conditions and nobody sees the problem? And do I have to teach FreeRTOS/Amazon developers how to use... FreeRTOS kernel API?
@Pavel A., if you want, of course, report it to them and probably give a link to this topic.
2022-04-23 10:23 AM
> Sorry. I cannot implement that.
ST's lwIP implementation and FreeRTOS+ both use the HAL. Pavel was advising you on where to find an example of how to use the HAL ETH driver.