Skip to content

Commit

Permalink
tcp: Only init congestion control if not initialized already
Browse files Browse the repository at this point in the history
Change tcp_init_transfer() to only initialize congestion control if it
has not been initialized already.

With this new approach, we can arrange things so that if the EBPF code
sets the congestion control by calling setsockopt(TCP_CONGESTION) then
tcp_init_transfer() will not re-initialize the CC module.

This is an approach that has the following beneficial properties:

(1) This allows CC module customizations made by the EBPF called in
    tcp_init_transfer() to persist, and not be wiped out by a later
    call to tcp_init_congestion_control() in tcp_init_transfer().

(2) Does not flip the order of EBPF and CC init, to avoid causing bugs
    for existing code upstream that depends on the current order.

(3) Does not cause 2 initializations for for CC in the case where the
    EBPF called in tcp_init_transfer() wants to set the CC to a new CC
    algorithm.

(4) Allows follow-on simplifications to the code in net/core/filter.c
    and net/ipv4/tcp_cong.c, which currently both have some complexity
    to special-case CC initialization to avoid double CC
    initialization if EBPF sets the CC.

Signed-off-by: Neal Cardwell <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Yuchung Cheng <[email protected]>
Acked-by: Kevin Yang <[email protected]>
Cc: Lawrence Brakmo <[email protected]>
  • Loading branch information
nealcardwell authored and Alexei Starovoitov committed Sep 11, 2020
1 parent 18841da commit 8919a9b
Show file tree
Hide file tree
Showing 4 changed files with 8 additions and 3 deletions.
3 changes: 2 additions & 1 deletion include/net/inet_connection_sock.h
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,8 @@ struct inet_connection_sock {
void (*icsk_clean_acked)(struct sock *sk, u32 acked_seq);
struct hlist_node icsk_listen_portaddr_node;
unsigned int (*icsk_sync_mss)(struct sock *sk, u32 pmtu);
__u8 icsk_ca_state:6,
__u8 icsk_ca_state:5,
icsk_ca_initialized:1,
icsk_ca_setsockopt:1,
icsk_ca_dst_locked:1;
__u8 icsk_retransmits;
Expand Down
1 change: 1 addition & 0 deletions net/ipv4/tcp.c
Original file line number Diff line number Diff line change
Expand Up @@ -2698,6 +2698,7 @@ int tcp_disconnect(struct sock *sk, int flags)
if (icsk->icsk_ca_ops->release)
icsk->icsk_ca_ops->release(sk);
memset(icsk->icsk_ca_priv, 0, sizeof(icsk->icsk_ca_priv));
icsk->icsk_ca_initialized = 0;
tcp_set_ca_state(sk, TCP_CA_Open);
tp->is_sack_reneg = 0;
tcp_clear_retrans(tp);
Expand Down
3 changes: 2 additions & 1 deletion net/ipv4/tcp_cong.c
Original file line number Diff line number Diff line change
Expand Up @@ -176,7 +176,7 @@ void tcp_assign_congestion_control(struct sock *sk)

void tcp_init_congestion_control(struct sock *sk)
{
const struct inet_connection_sock *icsk = inet_csk(sk);
struct inet_connection_sock *icsk = inet_csk(sk);

tcp_sk(sk)->prior_ssthresh = 0;
if (icsk->icsk_ca_ops->init)
Expand All @@ -185,6 +185,7 @@ void tcp_init_congestion_control(struct sock *sk)
INET_ECN_xmit(sk);
else
INET_ECN_dontxmit(sk);
icsk->icsk_ca_initialized = 1;
}

static void tcp_reinit_congestion_control(struct sock *sk,
Expand Down
4 changes: 3 additions & 1 deletion net/ipv4/tcp_input.c
Original file line number Diff line number Diff line change
Expand Up @@ -5894,8 +5894,10 @@ void tcp_init_transfer(struct sock *sk, int bpf_op, struct sk_buff *skb)
tp->snd_cwnd = tcp_init_cwnd(tp, __sk_dst_get(sk));
tp->snd_cwnd_stamp = tcp_jiffies32;

icsk->icsk_ca_initialized = 0;
bpf_skops_established(sk, bpf_op, skb);
tcp_init_congestion_control(sk);
if (!icsk->icsk_ca_initialized)
tcp_init_congestion_control(sk);
tcp_init_buffer_space(sk);
}

Expand Down

0 comments on commit 8919a9b

Please sign in to comment.