kernel/queue: Fix spurious NULL exit condition when using timeouts
The queue loop used when CONFIG_POLL is enabled has an inherent race
between the return of k_poll() and the inspection of the list, during
which no lock can be held. Other contending readers of the same queue
can sneak in and steal the item out of the list before the current
thread gets to the sys_sflist_get() call, and the current loop will
(if it has a timeout) spuriously return NULL before the timeout
expires.

It's not even a hard race to exercise. Consider three threads at
different priorities: High (which can be an ISR too), Mid, and Low:

1. Mid and Low both enter k_queue_get() and sleep inside k_poll() on
   an empty queue.

2. High comes along and calls k_queue_insert(). The queue code then
   wakes up Mid and reschedules, but because High is still running,
   Mid doesn't get to run yet.

3. High inserts a SECOND item. The queue then unpends the next thread
   in the list (Low) and readies it to run. But as before, it won't
   be scheduled yet.

4. Now High sleeps (or, if it's an interrupt, exits), and Mid gets to
   run. It dequeues and returns the item it was delivered normally.

5. But Mid is still running! So it re-enters the loop it's sitting in
   and calls k_queue_get() again, which sees and returns the second
   item in the queue synchronously. Then it calls k_queue_get() a
   third time and goes to sleep because the queue is empty.

6. Finally, Low wakes up to find an empty queue and returns NULL
   despite the fact that the timeout hasn't expired.

The fix is simple enough: check the timeout expiration inside the loop
so we don't return early.

Signed-off-by: Andy Ross <[email protected]>
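
The idea behind the fix (re-check the timeout inside the loop instead of
returning NULL on the first empty wakeup) can be sketched as a small
host-side simulation. This is an illustration under stated assumptions,
not the actual patch: fake_poll(), fake_take(), sim_reset() and
queue_get_with_timeout() are hypothetical names standing in for k_poll(),
the locked sys_sflist_get() call, and the kernel's queue loop, and the
clock is a plain counter rather than kernel uptime.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side stand-ins for kernel state (not Zephyr APIs). */
static int64_t fake_now_ms;   /* simulated uptime, in milliseconds */
static void *fake_slot;       /* one-element stand-in for the queue's list */
static int stolen_wakeups;    /* wakeups where a rival reader took the item first */
static void *late_item;       /* item delivered once the stolen wakeups are used up */

/* Reset the simulation: `wakeups` empty wakeups, then `item` (may be NULL). */
static void sim_reset(int wakeups, void *item)
{
    fake_now_ms = 0;
    fake_slot = NULL;
    stolen_wakeups = wakeups;
    late_item = item;
}

/* Stand-in for k_poll(): waits at most wait_ms, but may return early. */
static void fake_poll(int64_t wait_ms)
{
    int64_t tick = (wait_ms < 1) ? wait_ms : 1;

    if (stolen_wakeups > 0) {
        stolen_wakeups--;         /* woken, but the list is already empty */
        fake_now_ms += tick;
    } else if (late_item != NULL) {
        fake_slot = late_item;    /* woken with an item still in the list */
        late_item = NULL;
        fake_now_ms += tick;
    } else {
        fake_now_ms += wait_ms;   /* nothing arrived: the full wait elapses */
    }
}

/* Stand-in for the locked list removal: take the item if one is present. */
static void *fake_take(void)
{
    void *val = fake_slot;

    fake_slot = NULL;
    return val;
}

/* Keep polling until an item is obtained or the timeout has really expired. */
static void *queue_get_with_timeout(int64_t timeout_ms)
{
    int64_t start = fake_now_ms;
    int64_t elapsed = 0;
    void *val;

    do {
        /* Wait only for the time still remaining, not the full timeout. */
        fake_poll(timeout_ms - elapsed);
        val = fake_take();
        elapsed = fake_now_ms - start;
    } while (val == NULL && elapsed < timeout_ms);

    return val;
}
```

In this sketch, a reader that wakes twice to an empty list and then finds
an item returns that item instead of NULL, and a reader that never gets
an item returns NULL only once the full timeout has elapsed on the
simulated clock.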