Skip to content

Commit

Permalink
tcp: always timestamp on every skb transmission
Browse files Browse the repository at this point in the history
Previously TCP skbs are not always timestamped if the transmission
failed due to memory or other local issues. This makes deciding
when to abort a socket tricky and complicated because the first
unacknowledged skb's timestamp may be 0 on TCP timeout.

The straight-forward fix is to always timestamp skb on every
transmission attempt. Also every skb retransmission needs to be
flagged properly to avoid RTT under-estimation. This can happen
upon receiving an ACK for the original packet and the a previous
(spurious) retransmission has failed.

It's worth noting that this reverts to the old time-stamping
style before commit 8c72c65 ("tcp: update skb->skb_mstamp more
carefully") which addresses a problem in computing the elapsed time
of a stalled window-probing socket. The problem will be addressed
differently in the next patches with a simpler approach.

Signed-off-by: Yuchung Cheng <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Neal Cardwell <[email protected]>
Reviewed-by: Soheil Hassas Yeganeh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
  • Loading branch information
yuchungcheng authored and davem330 committed Jan 17, 2019
1 parent 88f8598 commit 7f12422
Showing 1 changed file with 8 additions and 8 deletions.
16 changes: 8 additions & 8 deletions net/ipv4/tcp_output.c
Original file line number Diff line number Diff line change
Expand Up @@ -980,7 +980,6 @@ static void tcp_update_skb_after_send(struct sock *sk, struct sk_buff *skb,
{
struct tcp_sock *tp = tcp_sk(sk);

skb->skb_mstamp_ns = tp->tcp_wstamp_ns;
if (sk->sk_pacing_status != SK_PACING_NONE) {
unsigned long rate = sk->sk_pacing_rate;

Expand Down Expand Up @@ -1028,7 +1027,9 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,

BUG_ON(!skb || !tcp_skb_pcount(skb));
tp = tcp_sk(sk);

prior_wstamp = tp->tcp_wstamp_ns;
tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache);
skb->skb_mstamp_ns = tp->tcp_wstamp_ns;
if (clone_it) {
TCP_SKB_CB(skb)->tx.in_flight = TCP_SKB_CB(skb)->end_seq
- tp->snd_una;
Expand All @@ -1045,11 +1046,6 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
return -ENOBUFS;
}

prior_wstamp = tp->tcp_wstamp_ns;
tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache);

skb->skb_mstamp_ns = tp->tcp_wstamp_ns;

inet = inet_sk(sk);
tcb = TCP_SKB_CB(skb);
memset(&opts, 0, sizeof(opts));
Expand Down Expand Up @@ -2937,12 +2933,16 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
err = tcp_transmit_skb(sk, skb, 1, GFP_ATOMIC);
}

/* To avoid taking spuriously low RTT samples based on a timestamp
* for a transmit that never happened, always mark EVER_RETRANS
*/
TCP_SKB_CB(skb)->sacked |= TCPCB_EVER_RETRANS;

if (BPF_SOCK_OPS_TEST_FLAG(tp, BPF_SOCK_OPS_RETRANS_CB_FLAG))
tcp_call_bpf_3arg(sk, BPF_SOCK_OPS_RETRANS_CB,
TCP_SKB_CB(skb)->seq, segs, err);

if (likely(!err)) {
TCP_SKB_CB(skb)->sacked |= TCPCB_EVER_RETRANS;
trace_tcp_retransmit_skb(sk, skb);
} else if (err != -EBUSY) {
NET_ADD_STATS(sock_net(sk), LINUX_MIB_TCPRETRANSFAIL, segs);
Expand Down

0 comments on commit 7f12422

Please sign in to comment.