Skip to content

Commit

Permalink
tcp: fix child sockets to use system default congestion control if no…
Browse files Browse the repository at this point in the history
…t set

Linux 3.17 and earlier are explicitly engineered so that if the app
doesn't specifically request a CC module on a listener before the SYN
arrives, then the child gets the system default CC when the connection
is established. See tcp_init_congestion_control() in 3.17 or earlier,
which says "if no choice made yet assign the current value set as
default". The change ("net: tcp: assign tcp cong_ops when tcp sk is
created") altered these semantics, so that children got their parent
listener's congestion control even if the system default had changed
after the listener was created.

This commit returns to those original semantics from 3.17 and earlier,
since they are the original semantics from 2007 in 4d4d3d1 ("[TCP]:
Congestion control initialization."), and some Linux congestion
control workflows depend on that.

In summary, if a listener socket specifically sets TCP_CONGESTION to
"x", or the route locks the CC module to "x", then the child gets
"x". Otherwise the child gets current system default from
net.ipv4.tcp_congestion_control. That's the behavior in 3.17 and
earlier, and this commit restores that.

Fixes: 55d8694 ("net: tcp: assign tcp cong_ops when tcp sk is created")
Cc: Florian Westphal <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Glenn Judd <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Signed-off-by: Neal Cardwell <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: Yuchung Cheng <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
  • Loading branch information
nealcardwell authored and davem330 committed Jun 1, 2015
1 parent beb39db commit 9f95041
Show file tree
Hide file tree
Showing 3 changed files with 10 additions and 3 deletions.
3 changes: 2 additions & 1 deletion include/net/inet_connection_sock.h
Original file line number Diff line number Diff line change
Expand Up @@ -98,7 +98,8 @@ struct inet_connection_sock {
const struct tcp_congestion_ops *icsk_ca_ops;
const struct inet_connection_sock_af_ops *icsk_af_ops;
unsigned int (*icsk_sync_mss)(struct sock *sk, u32 pmtu);
__u8 icsk_ca_state:7,
__u8 icsk_ca_state:6,
icsk_ca_setsockopt:1,
icsk_ca_dst_locked:1;
__u8 icsk_retransmits;
__u8 icsk_pending;
Expand Down
5 changes: 4 additions & 1 deletion net/ipv4/tcp_cong.c
Original file line number Diff line number Diff line change
Expand Up @@ -187,6 +187,7 @@ static void tcp_reinit_congestion_control(struct sock *sk,

tcp_cleanup_congestion_control(sk);
icsk->icsk_ca_ops = ca;
icsk->icsk_ca_setsockopt = 1;

if (sk->sk_state != TCP_CLOSE && icsk->icsk_ca_ops->init)
icsk->icsk_ca_ops->init(sk);
Expand Down Expand Up @@ -335,8 +336,10 @@ int tcp_set_congestion_control(struct sock *sk, const char *name)
rcu_read_lock();
ca = __tcp_ca_find_autoload(name);
/* No change asking for existing value */
if (ca == icsk->icsk_ca_ops)
if (ca == icsk->icsk_ca_ops) {
icsk->icsk_ca_setsockopt = 1;
goto out;
}
if (!ca)
err = -ENOENT;
else if (!((ca->flags & TCP_CONG_NON_RESTRICTED) ||
Expand Down
5 changes: 4 additions & 1 deletion net/ipv4/tcp_minisocks.c
Original file line number Diff line number Diff line change
Expand Up @@ -420,7 +420,10 @@ void tcp_ca_openreq_child(struct sock *sk, const struct dst_entry *dst)
rcu_read_unlock();
}

if (!ca_got_dst && !try_module_get(icsk->icsk_ca_ops->owner))
/* If no valid choice made yet, assign current system default ca. */
if (!ca_got_dst &&
(!icsk->icsk_ca_setsockopt ||
!try_module_get(icsk->icsk_ca_ops->owner)))
tcp_assign_congestion_control(sk);

tcp_set_ca_state(sk, TCP_CA_Open);
Expand Down

0 comments on commit 9f95041

Please sign in to comment.