Skip to content

Commit

Permalink
kernel/queue: Fix spurious NULL exit condition when using timeouts
Browse files Browse the repository at this point in the history
The queue loop when CONFIG_POLL is in used has an inherent race
between the return of k_poll() and the inspection of the list where no
lock can be held.  Other contending readers of the same queue can
sneak in and steal the item out of the list before the current thread
gets to the sys_sflist_get() call, and the current loop will (if it
has a timeout) spuriously return NULL before the timeout expires.

It's not even a hard race to exercise.  Consider three threads at
different priorities: High (which can be an ISR too), Mid, and Low:

1. Mid and Low both enter k_queue_get() and sleep inside k_poll() on
   an empty queue.

2. High comes along and calls k_queue_insert().  The queue code then
   wakes up Mid, and reschedules, but because High is still running Mid
   doesn't get to run yet.

3. High inserts a SECOND item.  The queue then unpends the next thread
   in the list (Low), and readies it to run.  But as before, it won't
   be scheduled yet.

4. Now High sleeps (or if it's an interrupt, exits), and Mid gets to
   run.  It dequeues and returns the item it was delivered normally.

5. But Mid is still running!  So it re-enters the loop it's sitting in
   and calls k_queue_get() again, which sees and returns the second
   item in the queue synchronously.  Then it calls it a third time and
   goes to sleep because the queue is empty.

6. Finally, Low wakes up to find an empty queue, and returns NULL
   despite the fact that the timeout hadn't expired.

The fix is simple enough: check the timeout expiration inside the loop
so we don't return early.

Signed-off-by: Andy Ross <[email protected]>
  • Loading branch information
Andy Ross authored and nashif committed Jun 11, 2018
1 parent 10d8e71 commit b173e43
Showing 1 changed file with 15 additions and 7 deletions.
22 changes: 15 additions & 7 deletions kernel/queue.c
Original file line number Diff line number Diff line change
Expand Up @@ -274,31 +274,39 @@ void k_queue_merge_slist(struct k_queue *queue, sys_slist_t *list)
static void *k_queue_poll(struct k_queue *queue, s32_t timeout)
{
struct k_poll_event event;
int err;
int err, elapsed = 0, done = 0;
unsigned int key;
void *val;
u32_t start;

k_poll_event_init(&event, K_POLL_TYPE_FIFO_DATA_AVAILABLE,
K_POLL_MODE_NOTIFY_ONLY, queue);

if (timeout != K_FOREVER) {
start = k_uptime_get_32();
}

do {
event.state = K_POLL_STATE_NOT_READY;

err = k_poll(&event, 1, timeout);
if (err) {
err = k_poll(&event, 1, timeout - elapsed);

if (err && err != -EAGAIN) {
return NULL;
}

__ASSERT_NO_MSG(event.state ==
K_POLL_STATE_FIFO_DATA_AVAILABLE);

/* sys_sflist_* aren't threadsafe, so must be always protected
* by irq_lock.
*/
key = irq_lock();
val = z_queue_node_peek(sys_sflist_get(&queue->data_q), true);
irq_unlock(key);
} while (!val && timeout == K_FOREVER);

if (!val && timeout != K_FOREVER) {
elapsed = k_uptime_get_32() - start;
done = elapsed > timeout;
}
} while (!val && !done);

return val;
}
Expand Down

0 comments on commit b173e43

Please sign in to comment.