AnsweredAssumed Answered

Correct handling of NPTXFEM USB host interrupt

Question asked by Sheludko.Dave on Sep 20, 2016
I have written a USB DFU host class on a STM32F7 series to upload new firmware to a second STM32F7 which is running the built-in DFU bootloader. My host class works.

However, successful download of a new firmware image to the target seems to be very dependant on the timing of the USB interrupts, to the extent that I'm sure I must be mis-handling something. If I remove my debug printf statements, the interrupts occur in a different order and the channel halts prior to completion. I could simply insert delays as needed, but I'd rather understand the root cause of the problem rather than patch it.

The problems occur during the download phase of the DFU process, when a control transfer with a 2048-byte data stage containing the firmware data is sent using:
/*
         * Read from flash to buffer
         */
        NOR_ReadBuffer((uint16_t *)&flashBuf, lastAddr,(readBytes+1)/2); //size to read is in 16-bit half words
 
        /*
         * Configure setup packet with data stage pointing to filled buffer
         */
        phost->Control.setup.b.bmRequestType = USB_H2D |USB_REQ_TYPE_CLASS | USB_REQ_RECIPIENT_INTERFACE;
        phost->Control.setup.b.bRequest = DFU_DNLOAD;
        phost->Control.setup.b.wValue.w = blknum;
        phost->Control.setup.b.wIndex.w = 0;
        phost->Control.setup.b.wLength.w = readBytes; //length of data stage: can be IN or OUT depending on USB_D2H or H2D in bmRequestType
        phost->Control.buff = flashBuf;
        phost->Control.length = readBytes;
        while (status != USBH_OK){
            status = USBH_CtlReq(phost, flashBuf ,readBytes ); /* status = 0 if OK, 1 = BUSY */
            if (HAL_GetTick() >= (startTime + dfu_timeout)) {
                return USBH_FAIL;
            }
        }

 
The setup packet is acknowledged OK. When the data stage begins, the (CubeMX-generated) driver unmasks the NPTXFEM interrupt as the transfer size is larger than the FIFO buffer. At that stage I also set a flag (USE_NPTXFEM_IRQ) which enables some extra sections in the USBH channel interrupt handler (see further below). There was no existing handler for NPTXFEM interrupt generated by Cube, so I have written my own, following the basic guide given in the ref manual (RM0385, pages 1411-1415):

void DFU_NPTXFE_IRQ_Handler(HCD_HandleTypeDef *hhcd) {
    //volatile uint16_t f_avail=0;
    uint32_t bRemaining,bAvail=0;
    uint32_t nbytes=(hhcd->hc[0].max_packet);
 
    /*
     * Mask the interrupt while we're dealing with it
     */
    USB_MASK_INTERRUPT(hhcd->Instance, USB_OTG_GINTSTS_NPTXFE);
    printf("S%dU%d\n",hhcd->hc[0].state,hhcd->hc[0].urb_state);
    if ((hhcd->hc[0].xfer_count >= hhcd->hc[0].xfer_len) || (hhcd->hc[0].xfer_len ==0)) {
        return//nothing to do...
    }
    if ((!(USE_NPTXFEM_IRQ & (DFU_ACK | DFU_NAK))) && (hhcd->hc[0].xfer_count !=0)) {
        printf("Wait: 0x%x-%d,%d", USE_NPTXFEM_IRQ,hhcd->hc[0].xfer_count,hhcd->hc[0].xfer_len);
        //not the first packet, and haven't got an ACK or a NAK yet for the last one, so return and wait for response
        USB_UNMASK_INTERRUPT(hhcd->Instance, USB_OTG_GINTSTS_NPTXFE);
        return;
    }
 
 
    /*
     * Channel is ready for data:
     *  how much space is there in the NP Tx FIFO?
     *
     */
    bAvail = (hhcd->Instance->HNPTXSTS & (0x0000FFFF))*4; //n 32-bit words available in FIFO.
    bRemaining=hhcd->hc[0].xfer_len-hhcd->hc[0].xfer_count;
    //there is space in the buffer
    if ((bAvail > nbytes) && (bRemaining >= nbytes)) {
        DFU_send_bytes = nbytes;
        // not enough space for whole transfer, send nbytes
        USB_WritePacket(hhcd->Instance, hhcd->hc[0].xfer_buff, hhcd->hc[0].ch_num, nbytes, 0);
        hhcd->hc[0].xfer_buff += nbytes; //move the data buffer pointer
        hhcd->hc[0].xfer_count  += nbytes;
        printf("%d,%d\n", hhcd->hc[0].xfer_count,hhcd->hc[0].xfer_len);
        USE_NPTXFEM_IRQ=1; //leave it on, but clear any ACK/NAK
        if (hhcd->hc[0].xfer_count >= hhcd->hc[0].xfer_len) {
            return;
        }
        else {
            USB_UNMASK_INTERRUPT(hhcd->Instance, USB_OTG_GINTSTS_NPTXFE);//more data to go
            return;
        }
    }
    else if ((bAvail >= bRemaining) && (bRemaining < nbytes)) {
 
        //there is at least bRemaining space available, and if we're here, theres < nbytes left to send, so just send bRemaining
        DFU_send_bytes = bRemaining;
        //enough space for remaining bytes:
        USB_WritePacket(hhcd->Instance, hhcd->hc[0].xfer_buff, hhcd->hc[0].ch_num, bRemaining, 0);
        hhcd->hc[0].xfer_buff += bRemaining; //move the data buffer pointer
        hhcd->hc[0].xfer_count  += bRemaining;
        printf("%d\n", hhcd->hc[0].xfer_count);
        //finished sending: don't unmask the interrupt again - will be done in NAK handler if needed
        USE_NPTXFEM_IRQ=1; //leave it on (so NAK handler catches bad packets), but clear any ACK/NAK
        //Do we need a ZLP???
        return;
    }
    else {
        //not enough space in TX FIFO for bRemaining
        printf("N\n");
        USB_UNMASK_INTERRUPT(hhcd->Instance, USB_OTG_GINTSTS_NPTXFE);
        return//wait for more space
    }
return//should never get to here
}

My full (altered) USB host channel handler is below:
static void HCD_HC_OUT_IRQHandler  (HCD_HandleTypeDef *hhcd, uint8_t chnum)
{
  USB_OTG_GlobalTypeDef *USBx = hhcd->Instance;
  GPIOC->BSRR = GPIO_PIN_11; //set debug line high
  /*
   * When using the NPTXFIFO, we only send 1 packet at a time. If it gets NAK'd, (and the first packet ALWAYS does)
   * the OTG peripheral expects to be told to finish the transfer then halt the channel, BUT if we get and process the HALT
   * interrupt BEFORE we've finished the transfer then the device sits around forever more waiting for the rest of the transfer.
   * If we DON'T halt the channel on a NAK, the peripheral doesn't try to send any more packets and we hang.
   * So: slow the IRQ handler down so that it doesn't get to the halt. This works, but can't be the right way to do it?.
   */
  if (USE_NPTXFEM_IRQ) {
      delay_us(300);
  }
  //printf("OUT:%x-%d-%d\n",USBx_HC(chnum)->HCINT,hhcd->hc[chnum].urb_state,hhcd->hc[chnum].state);
  GPIOC->BSRR = (uint32_t)GPIO_PIN_11<<16; //set debug line low
 
  if ((USBx_HC(chnum)->HCINT) &  USB_OTG_HCINT_AHBERR)
  {
    __HAL_HCD_CLEAR_HC_INT(chnum, USB_OTG_HCINT_AHBERR);
    __HAL_HCD_UNMASK_HALT_HC_INT(chnum);
  
  else if ((USBx_HC(chnum)->HCINT) &  USB_OTG_HCINT_ACK)
  {
    __HAL_HCD_CLEAR_HC_INT(chnum, USB_OTG_HCINT_ACK);
    printf("A\n");
    if (USE_NPTXFEM_IRQ) {
        USE_NPTXFEM_IRQ |= DFU_ACK;
    }
    if( hhcd->hc[chnum].do_ping == 1)
    {
      hhcd->hc[chnum].state = HC_NYET;    
      __HAL_HCD_UNMASK_HALT_HC_INT(chnum);
      USB_HC_Halt(hhcd->Instance, chnum);
      hhcd->hc[chnum].urb_state  = URB_NOTREADY;
    }
  }
   
  else if ((USBx_HC(chnum)->HCINT) &  USB_OTG_HCINT_NYET)
  {
    hhcd->hc[chnum].state = HC_NYET;
    hhcd->hc[chnum].ErrCnt= 0;   
    __HAL_HCD_UNMASK_HALT_HC_INT(chnum);
    USB_HC_Halt(hhcd->Instance, chnum);     
    __HAL_HCD_CLEAR_HC_INT(chnum, USB_OTG_HCINT_NYET);
     
  
   
  else if ((USBx_HC(chnum)->HCINT) &  USB_OTG_HCINT_FRMOR)
  {
    __HAL_HCD_UNMASK_HALT_HC_INT(chnum);
    USB_HC_Halt(hhcd->Instance, chnum); 
    __HAL_HCD_CLEAR_HC_INT(chnum, USB_OTG_HCINT_FRMOR);
  }
   
  else if ((USBx_HC(chnum)->HCINT) &  USB_OTG_HCINT_XFRC)
  {
      if (USE_NPTXFEM_IRQ) {
          USE_NPTXFEM_IRQ=0; //turn it off, transfer is complete.
          if ((USBx_HC(chnum)->HCINT) &  USB_OTG_HCINT_NAK){
              __HAL_HCD_CLEAR_HC_INT(chnum, USB_OTG_HCINT_NAK); //we got xfrc, so clear any NAK
          }
      }
      hhcd->hc[chnum].ErrCnt = 0; 
    __HAL_HCD_UNMASK_HALT_HC_INT(chnum);
    USB_HC_Halt(hhcd->Instance, chnum);  
    __HAL_HCD_CLEAR_HC_INT(chnum, USB_OTG_HCINT_XFRC);
    hhcd->hc[chnum].state = HC_XFRC;
 
  
 
  else if ((USBx_HC(chnum)->HCINT) &  USB_OTG_HCINT_STALL) 
  {
    __HAL_HCD_CLEAR_HC_INT(chnum, USB_OTG_HCINT_STALL); 
    __HAL_HCD_UNMASK_HALT_HC_INT(chnum);
    USB_HC_Halt(hhcd->Instance, chnum);  
    hhcd->hc[chnum].state = HC_STALL;   
  }
 
  else if ((USBx_HC(chnum)->HCINT) &  USB_OTG_HCINT_NAK)
  
      if (USE_NPTXFEM_IRQ) {
          USE_NPTXFEM_IRQ |= DFU_NAK;
          //NAK'd: rewind the buffers
          hhcd->hc[chnum].xfer_buff -= DFU_send_bytes;
          hhcd->hc[chnum].xfer_count -= DFU_send_bytes;
          USB_UNMASK_INTERRUPT(hhcd->Instance, USB_OTG_GINTSTS_NPTXFE);//more data, unmask the txfifo empty interrupt
      }
          hhcd->hc[chnum].ErrCnt = 0;
          __HAL_HCD_UNMASK_HALT_HC_INT(chnum);
          USB_HC_Halt(hhcd->Instance, chnum);
          hhcd->hc[chnum].state = HC_NAK;
 
    __HAL_HCD_CLEAR_HC_INT(chnum, USB_OTG_HCINT_NAK);
  }
 
  else if ((USBx_HC(chnum)->HCINT) &  USB_OTG_HCINT_TXERR)
  {
    __HAL_HCD_UNMASK_HALT_HC_INT(chnum);
    USB_HC_Halt(hhcd->Instance, chnum);     
    hhcd->hc[chnum].state = HC_XACTERR; 
     __HAL_HCD_CLEAR_HC_INT(chnum, USB_OTG_HCINT_TXERR);
  }
   
  else if ((USBx_HC(chnum)->HCINT) &  USB_OTG_HCINT_DTERR)
  {
    __HAL_HCD_UNMASK_HALT_HC_INT(chnum);
    USB_HC_Halt(hhcd->Instance, chnum);     
    __HAL_HCD_CLEAR_HC_INT(chnum, USB_OTG_HCINT_NAK);
    __HAL_HCD_CLEAR_HC_INT(chnum, USB_OTG_HCINT_DTERR);   
    hhcd->hc[chnum].state = HC_DATATGLERR;
  }
   
   
  else if ((USBx_HC(chnum)->HCINT) &  USB_OTG_HCINT_CHH)
  {
    __HAL_HCD_MASK_HALT_HC_INT(chnum);
     
    if(hhcd->hc[chnum].state == HC_XFRC)
    {
      hhcd->hc[chnum].urb_state  = URB_DONE;
      if (hhcd->hc[chnum].ep_type == EP_TYPE_BULK)
      {
        hhcd->hc[chnum].toggle_out ^= 1;
      }     
    }
    else if (hhcd->hc[chnum].state == HC_NAK)
    {
 
            hhcd->hc[chnum].urb_state  = URB_NOTREADY;
 
    
     
    else if (hhcd->hc[chnum].state == HC_NYET)
    {
      hhcd->hc[chnum].urb_state  = URB_NOTREADY;
      hhcd->hc[chnum].do_ping = 0;
    }  
     
    else if (hhcd->hc[chnum].state == HC_STALL)
    {
      hhcd->hc[chnum].urb_state  = URB_STALL;
    }
     
    else if((hhcd->hc[chnum].state == HC_XACTERR) ||
            (hhcd->hc[chnum].state == HC_DATATGLERR))
    {
      if(hhcd->hc[chnum].ErrCnt++ > 3)
      {     
        hhcd->hc[chnum].ErrCnt = 0;
        hhcd->hc[chnum].urb_state = URB_ERROR;
      }
      else
      {
        hhcd->hc[chnum].urb_state = URB_NOTREADY;
      }
       
      /* re-activate the channel  */
      USBx_HC(chnum)->HCCHAR &= ~USB_OTG_HCCHAR_CHDIS;        
      USBx_HC(chnum)->HCCHAR |= USB_OTG_HCCHAR_CHENA;     
    }
     
    __HAL_HCD_CLEAR_HC_INT(chnum, USB_OTG_HCINT_CHH);
    HAL_HCD_HC_NotifyURBChange_Callback(hhcd, chnum, hhcd->hc[chnum].urb_state); 
 
  }
}

The flow appears to be as follows, but I'm new to USB comms, so please correct me as needed.

When the first data packet is sent to the device, the device returns NAK. As I understand it, the host should then try to re-send the packet. However, the CubeMX code halts the channel, and then tries to start the entire data stage again (switches back to CTRL_DATA_OUT stage in USBH_HandleControl) . This behaviour doesn't seem to work for a DFU device (I tried it first) as device uses NAK for flow control, and simply expects the last packet to be resent until ACK'd. By slowing down the USBH channel OUT interrupt (the delay_us(300) line) , I was able to allow time for the packet to be resent before the channel halt interrupt was processed, thus continuing the transfer. This method works, but it doesn't seem like 'right' way to do it.

If I remove the delay_us(300) from the channel interrupt handler, then the halt interrupt is processed, the data stage is restarted and the DFU device seems very confused, sending NAKs for ever more. No more data is transferred.

If I simply don't halt the channel when I receive a NAK, it looks as if the USB peripheral on the host refuses to send more packets, even if I continue to write to the NPTX fifo. I can't verify that behaviour directly as I don't have a USB hardware analyser to see what's going on between the host and the device, however without doing the halt channel when I get a NAK, the channel OUT interrupt never fires again (no ACK,NAK, or any other response). URB state is URB_IDLE.

I'd greatly appreciate a clarification of what the flow here should be when a NAK is received in this situation.

Thanks!

Outcomes