Skip to content

fix(usb-ncm): NULL deref race in xmit_setup_next_glue_ntb + osal_spin_lock protection (T-145)#4

Open
chipsoft wants to merge 2 commits into
masterfrom
172-fix-tinyusb-ncm-null-deref-race
Open

fix(usb-ncm): NULL deref race in xmit_setup_next_glue_ntb + osal_spin_lock protection (T-145)#4
chipsoft wants to merge 2 commits into
masterfrom
172-fix-tinyusb-ncm-null-deref-race

Conversation

@chipsoft
Copy link
Copy Markdown
Owner

Summary

  • T-145.1 — Eliminate NULL-deref double-read race in xmit_setup_next_glue_ntb and tud_network_xmit: capture ncm_interface.xmit_glue_ntb into a local once, NULL-check and use the local, publish to global only after full NTB init.
  • T-145.2 — Add osal_spin_lock critical sections (→ taskENTER_CRITICAL on single-core RW612) to all four concurrent xmit_glue_ntb/xmit_glue_ntb_datagram_ndx mutation points: xmit_setup_next_glue_ntb, tud_network_xmit, xmit_start_if_possible, ncm_flush_data_paths.

Root cause

lwIP task and USB task (tud_taskncm_flush_data_paths or DMA-done completion) concurrently access ncm_interface.xmit_glue_ntb without synchronisation. The lwIP task checked the pointer, the USB task zeroed it, the lwIP task dereferenced it → write to address 0 → NS SecureFault → TF-M SYSRESETREQ → nboot ISP loop.

Crash: pc=0x08101690 (ncm_device.c:509), r3=0 in exception frame.

Test plan

  • CAN1_OVER_FLEXCOMM0=1 DISABLE_ROLLBACK=1 MCUBOOT_IMG_VERSION=3.6.0 ./scripts/verify.sh mcuboot → PASS (build + flash + RTT ALIVE)
  • Sustained CAN loopback bench (T-139-class load) to confirm no regression under USB TX burst

🤖 Generated with Claude Code

chipsoft and others added 2 commits May 17, 2026 13:49
…nd tud_network_xmit

Both functions read ncm_interface.xmit_glue_ntb twice: once for a NULL
check, then again to assign a local pointer. A concurrent SET_INTERFACE
or DMA-done callback (USB task) can zero xmit_glue_ntb between the two
reads via ncm_flush_data_paths() or xmit_start_if_possible(), causing
the second read to return NULL and the subsequent field write to address 0.

Crash site: ncm_device.c:509, pc=0x08101690, r3=0 → SecureFault →
TF-M SYSRESETREQ → nboot ISP loop (T-145).

Fix: capture the pointer once into a local variable, NULL-check the local,
and publish to the global only after the NTB is fully initialised
(xmit_setup_next_glue_ntb). In tud_network_xmit, snapshot the global
before the NULL check so the same pointer is used throughout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tate machine

The lwIP task (tud_network_can_xmit / tud_network_xmit) and the USB task
(tud_task → ncm_flush_data_paths / xmit_start_if_possible) both mutate
ncm_interface.xmit_glue_ntb and xmit_glue_ntb_datagram_ndx without
synchronisation, creating data races on a preemptive FreeRTOS scheduler.

The T-145.1 local-variable fix prevents the immediate NULL-deref crash;
this commit closes the residual race window completely.

Protected regions (single file-static osal_spinlock_t s_xmit_glue_lock):

  ncm_flush_data_paths()       — zero xmit_glue_ntb + datagram_ndx (USB task)
  xmit_start_if_possible()     — guard check + steal of xmit_glue_ntb (USB task)
  xmit_setup_next_glue_ntb()   — snapshot old ptr, publish new ptr + ndx (lwIP task)
  tud_network_xmit()           — snapshot ptr, reserve datagram slot (lwIP task)

RW612 is single-core (TUP_MCU_MULTIPLE_CORE=0); osal_spin_lock(false) maps
to taskENTER_CRITICAL() which masks interrupts up to configMAX_SYSCALL_
INTERRUPT_PRIORITY — includes the USB IRQ at configLIBRARY_LOWEST_
INTERRUPT_PRIORITY-1. All protected sections are short and non-blocking.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant