Merged

Todo #34

80 changes: 53 additions & 27 deletions TODO
@@ -1,4 +1,4 @@
NuttX TODO List (Last updated November 21, 2019)
NuttX TODO List (Last updated January 3, 2019)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This file summarizes known NuttX bugs, limitations, inconsistencies with
@@ -589,32 +589,58 @@ o SMP
can that occur? I think it can occur in the following
situation:

CPU0 - Task A is running.
- The CPU0 IDLE task is the only other task in the
CPU0 ready-to-run list.
CPU1 - Task B is running.
- Task C is blocked but remains in the g_assignedtasks[]
list because of a CPU affinity selection. Task C
also holds the critical section which is temporarily
relinquished because Task C is blocked by Task B.
- The CPU1 IDLE task is at the end of the list.

Actions:
1. Task A/CPU 0 takes the critical section.
2. Task B/CPU 1 suspends waiting for an event
3. Task C is restarted.

Now both Task A and Task C hold the critical section.

This problem has never been observed, but seems to be a
possibility. I believe it could only occur if CPU affinity
is used (otherwise, tasks will pend much as when
pre-emption is disabled).

A proper solution would probably involve re-designing how
CPU affinity is implemented. The CPU1 IDLE thread should
more appropriately run, but cannot because the Task C TCB
is in the g_assignedtasks[] list.
The log below was reported on NuttX running on a two-core
Cortex-A7 architecture in SMP mode. Note that when
sched_addreadytorun() was called, g_cpu_irqset was 3.

sched_addreadytorun: irqset cpu 1, me 0 btcbname init, irqset 1 irqcount 2.
sched_addreadytorun: sched_addreadytorun line 338 g_cpu_irqset = 3.

This can happen, but only under one very specific condition.
g_cpu_irqset exists only to support that condition:

a. A task running on CPU 0 takes the critical section. So
g_cpu_irqset == 0x1.

b. A task exits on CPU 1 and a waiting, ready-to-run task
is re-started on CPU 1. This new task also holds the
critical section. So when the task is re-started on
CPU 1, we then have g_cpu_irqset == 0x3.

So we are in a very perverse state! There are two tasks
running on two different CPUs and both hold the critical
section. I believe that is a dangerous situation and there
could be undiscovered bugs that could happen in that case.
However, as of this moment, I have not heard of any specific
problems caused by this weird behavior.
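
The two steps above can be sketched as plain bit operations. This is
only an illustrative model, assuming one bit per CPU as the 0x1 and 0x3
values suggest; g_model_irqset and the irqset_model_*() functions are
hypothetical names, not the NuttX implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for g_cpu_irqset: bit n is set while a task
 * running on CPU n holds the critical section.
 */

uint32_t g_model_irqset;

/* Record that the task now running on 'cpu' holds the critical section */

void irqset_model_take(int cpu)
{
  assert(cpu >= 0 && cpu < 32);
  g_model_irqset |= (UINT32_C(1) << cpu);
}

/* Record that 'cpu' no longer holds the critical section */

void irqset_model_release(int cpu)
{
  assert(cpu >= 0 && cpu < 32);
  g_model_irqset &= ~(UINT32_C(1) << cpu);
}

/* Count the CPUs that currently hold the critical section.  Any value
 * greater than one is the state described above.
 */

int irqset_model_holders(void)
{
  uint32_t bits = g_model_irqset;
  int count = 0;

  while (bits != 0)
    {
      count += (int)(bits & 1u);
      bits >>= 1;
    }

  return count;
}
```

Calling irqset_model_take(0) models step a, and irqset_model_take(1)
models step b. After both calls irqset_model_holders() reports two
holders.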

A possible solution would be to add a new task state that
would exist only for SMP.

- Add a new SMP-only task list and state. Say,
g_csection_wait[]. It should be prioritized.
- When a task acquires the critical section, all tasks in
g_readytorun[] that need the critical section would be
moved to g_csection_wait[].
- When any task is unblocked for any reason and moved to the
g_readytorun[] list, if that unblocked task needs the
critical section, it would also be moved to the
g_csection_wait[] list. No task that needs the critical
section can be in the ready-to-run list if the critical
section is not available.
- When the task releases the critical section, all tasks in
g_csection_wait[] need to be moved back to
g_readytorun[].
- This may result in a context switch. The tasks should be
moved back to g_readytorun[] highest priority first. If a
context switch occurs and the critical section is re-taken
by the re-started task, the lower priority tasks in
g_csection_wait[] must stay in that list.
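
The list handling proposed above might be modeled roughly as follows.
This is a simplified sketch, not scheduler code: the lists are reduced
to a per-task state field, "needs the critical section" is a plain flag,
and a context switch is assumed to occur whenever a task moved back has
a higher priority than the releasing task. All cswait_* names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum cswait_state
{
  CSWAIT_BLOCKED,   /* Waiting for some unrelated event */
  CSWAIT_READY,     /* Stands in for g_readytorun[] */
  CSWAIT_PENDING    /* Stands in for the proposed g_csection_wait[] */
};

struct cswait_task
{
  int priority;
  bool needs_csection;
  enum cswait_state state;
};

#define CSWAIT_NTASKS 3

struct cswait_task g_cswait_tasks[CSWAIT_NTASKS];
struct cswait_task *g_cswait_holder;   /* NULL: critical section is free */

/* A task is unblocked: it may enter the ready list only if it does not
 * need the critical section or the critical section is available.
 */

void cswait_make_ready(struct cswait_task *t)
{
  bool unavailable = (g_cswait_holder != NULL && g_cswait_holder != t);

  t->state = (t->needs_csection && unavailable) ?
             CSWAIT_PENDING : CSWAIT_READY;
}

/* 'holder' takes the critical section: every other ready task that
 * needs it is moved to the wait list.
 */

void cswait_acquire(struct cswait_task *holder)
{
  size_t i;

  g_cswait_holder = holder;
  for (i = 0; i < CSWAIT_NTASKS; i++)
    {
      struct cswait_task *t = &g_cswait_tasks[i];

      if (t != holder && t->state == CSWAIT_READY && t->needs_csection)
        {
          t->state = CSWAIT_PENDING;
        }
    }
}

/* The holder releases the critical section: waiters move back highest
 * priority first.  If one preempts the releaser, it is assumed to
 * re-take the critical section and the remaining waiters stay put.
 */

void cswait_release(int releaser_priority)
{
  g_cswait_holder = NULL;
  for (; ; )
    {
      struct cswait_task *best = NULL;
      size_t i;

      for (i = 0; i < CSWAIT_NTASKS; i++)
        {
          struct cswait_task *t = &g_cswait_tasks[i];

          if (t->state == CSWAIT_PENDING &&
              (best == NULL || t->priority > best->priority))
            {
              best = t;
            }
        }

      if (best == NULL)
        {
          break;
        }

      best->state = CSWAIT_READY;
      if (best->priority > releaser_priority)
        {
          g_cswait_holder = best;
          break;
        }
    }
}
```

A real implementation would keep a prioritized list rather than scanning
an array, and would decide on the context switch through the normal
scheduler paths. The scan is used here only to keep the model short.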

That is really not as much work as it sounds. It is
something that could be done in 2-3 days of work if you know
what you are doing. Getting the proper test setup and
verifying the change would be the more difficult task.

Status: Open
Priority: Unknown. Might be high, but first we would need to confirm