tcp: tsq: fix nonagle handling
Commit 46d3cea ("tcp: TCP Small Queues") introduced a possible
regression for applications using TCP_NODELAY.

If a TCP session is throttled because of TSQ, we should consult
tp->nonagle when a TX completion allows us to send an additional
segment, especially if this segment is not a full MSS.
Otherwise this segment is sent only after an RTO.

[edumazet] : Cooked the changelog, added another fix about testing
sk_wmem_alloc twice because TX completion can happen right before
setting TSQ_THROTTLED bit.

This problem is particularly visible with the recent auto corking,
but might also be triggered with low tcp_limit_output_bytes
values or NIC drivers delaying TX completion by hundreds of usec,
and very low rtt.

Thomas Glanzmann for example reported an iscsi regression, caused
by tcp auto corking making this bug quite visible.

Fixes: 46d3cea ("tcp: TCP Small Queues")
Signed-off-by: John Ogness <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Thomas Glanzmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
jogness authored and davem330 committed Feb 10, 2014
1 parent 684bd2e commit bf06200
Showing 1 changed file with 11 additions and 2 deletions.
13 changes: 11 additions & 2 deletions net/ipv4/tcp_output.c
@@ -698,7 +698,8 @@ static void tcp_tsq_handler(struct sock *sk)
 	if ((1 << sk->sk_state) &
 	    (TCPF_ESTABLISHED | TCPF_FIN_WAIT1 | TCPF_CLOSING |
 	     TCPF_CLOSE_WAIT | TCPF_LAST_ACK))
-		tcp_write_xmit(sk, tcp_current_mss(sk), 0, 0, GFP_ATOMIC);
+		tcp_write_xmit(sk, tcp_current_mss(sk), tcp_sk(sk)->nonagle,
+			       0, GFP_ATOMIC);
 }
 /*
  * One tasklet per cpu tries to send more skbs.
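The effect of passing tp->nonagle instead of 0 can be illustrated with a small user-space model. This is only a sketch: may_send_now() is a hypothetical helper, not the kernel's Nagle test, and it leaves out details such as the Minshall check. The flag values are the ones I believe include/net/tcp.h defines.

```c
#include <stdbool.h>

/* Flag values assumed to match include/net/tcp.h. */
#define TCP_NAGLE_OFF  1  /* Nagle disabled, e.g. via TCP_NODELAY */
#define TCP_NAGLE_CORK 2  /* socket is corked */
#define TCP_NAGLE_PUSH 4  /* cork overridden for an immediate send */

/*
 * Hypothetical, simplified model of the decision "may this segment
 * go out now?". Not the kernel implementation.
 */
static bool may_send_now(unsigned int len, unsigned int mss,
                         int nonagle, unsigned int packets_in_flight)
{
    if (nonagle & TCP_NAGLE_PUSH)
        return true;                /* explicit push always sends */
    if (len >= mss)
        return true;                /* full-sized segments are never held */
    if (nonagle & TCP_NAGLE_CORK)
        return false;               /* corked: hold partial segments */
    if (nonagle & TCP_NAGLE_OFF)
        return true;                /* TCP_NODELAY: send partial segments */
    return packets_in_flight == 0;  /* classic Nagle: wait for ACKs */
}
```

In this model, a 100-byte segment with one packet in flight is held when nonagle is 0, which corresponds to what the handler passed before the fix, and is sent when nonagle is TCP_NAGLE_OFF, which is what a TCP_NODELAY socket would carry in tp->nonagle.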
@@ -1904,7 +1905,15 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
 
 		if (atomic_read(&sk->sk_wmem_alloc) > limit) {
 			set_bit(TSQ_THROTTLED, &tp->tsq_flags);
-			break;
+			/* It is possible TX completion already happened
+			 * before we set TSQ_THROTTLED, so we must
+			 * test again the condition.
+			 * We abuse smp_mb__after_clear_bit() because
+			 * there is no smp_mb__after_set_bit() yet
+			 */
+			smp_mb__after_clear_bit();
+			if (atomic_read(&sk->sk_wmem_alloc) > limit)
+				break;
 		}
 
 		limit = mss_now;
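The second hunk follows a "set the flag, then test the condition again" pattern. Below is a minimal user-space sketch of that pattern using C11 atomics. The struct txq type and both functions are hypothetical names for illustration, and sequentially consistent atomics stand in for the kernel's set_bit() plus explicit barrier. It is a model of the idea, not the kernel code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a throttled transmit queue. */
struct txq {
    atomic_long bytes_queued;  /* plays the role of sk_wmem_alloc */
    atomic_bool throttled;     /* plays the role of TSQ_THROTTLED */
};

/* Sender side: returns true if the sender must stop and wait. */
static bool txq_must_wait(struct txq *q, long limit)
{
    if (atomic_load(&q->bytes_queued) <= limit)
        return false;

    /* Sequentially consistent store, so it also orders the load below. */
    atomic_store(&q->throttled, true);

    /*
     * A completion may have run between the first load and the store.
     * It would have seen throttled == false and restarted nobody, so
     * the condition has to be tested again before giving up.
     */
    return atomic_load(&q->bytes_queued) > limit;
}

/* Completion side: returns true if the sender has to be restarted. */
static bool txq_complete(struct txq *q, long bytes)
{
    atomic_fetch_sub(&q->bytes_queued, bytes);
    return atomic_exchange(&q->throttled, false);
}
```

With this ordering, either the sender's second load observes the reduced byte count, or the completion observes the flag and restarts the sender. Without the second test, a completion that lands in the window would leave the sender waiting with nothing left to wake it. The flag may stay set when the recheck succeeds, which at worst causes one spurious restart.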