tcp: fix possible freeze in tx path under memory pressure

The blamed commit only dealt with applications issuing small writes.

The issue is that we allow a forced memory schedule for the sk_buff
allocation, but have no guarantee that sendmsg() is able to
copy any payload into it.

In this patch, I make sure the socket can use up to tcp_wmem[0] bytes.

For example, with tcp_wmem[0] = 4096 (the default on x86) and an
initial skb->truesize of 1280, tcp_sendmsg() is able to
copy up to 2816 bytes (4096 - 1280) under memory pressure.

Before this patch, a sendmsg() sending more than 2816 bytes
would either block forever (under persistent memory pressure)
or return -EAGAIN.
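
The fallback can be modelled in user space to check the figure quoted above. This is a minimal sketch, not kernel code: `struct model_sock`, `model_wmem_schedule()` and the `under_pressure` flag are hypothetical names introduced for illustration. The forced reservation is simplified to byte granularity, whereas the kernel accounts memory in page-sized units.

```c
/* Hypothetical user-space model; field names echo the kernel's, but this
 * is not struct sock. */
struct model_sock {
	int wmem_queued;    /* bytes already charged to the write queue */
	int forward_alloc;  /* bytes already reserved for this socket */
	int wmem_min;       /* stands in for tcp_wmem[0] */
	int under_pressure; /* nonzero: normal scheduling is assumed to fail */
};

static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/* Returns how many of the requested `copy` bytes may be copied right now. */
static int model_wmem_schedule(struct model_sock *sk, int copy)
{
	int left, want;

	/* Stands in for sk_wmem_schedule() succeeding. */
	if (!sk->under_pressure)
		return copy;

	/* Force a reservation for whatever is left below wmem_min.
	 * Simplification: byte granularity, no page rounding. */
	left = sk->wmem_min - sk->wmem_queued;
	if (left > 0) {
		want = min_int(left, copy);
		if (sk->forward_alloc < want)
			sk->forward_alloc = want;
	}
	return min_int(copy, sk->forward_alloc);
}
```

With `wmem_min = 4096`, `wmem_queued = 1280` and the pressure flag set, a 4000-byte request is trimmed to 2816 bytes in this model. Once `wmem_queued` reaches `wmem_min` and nothing is reserved, the model returns 0, which corresponds to the caller waiting for space.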

For networks with a bigger MTU, it is advised to increase tcp_wmem[0]
to avoid sending packets that are too small.
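
To check what tcp_wmem[0] currently is on a Linux host, the three values can be read from /proc/sys/net/ipv4/tcp_wmem. The following is a minimal sketch: `parse_tcp_wmem()` and `read_tcp_wmem_min()` are hypothetical helper names, and the code assumes the file holds three whitespace-separated integers (min, default, max).

```c
#include <stdio.h>

/* Parses "min default max" from one line of text.
 * Returns 0 on success, -1 if the line does not hold three integers. */
static int parse_tcp_wmem(const char *line, int vals[3])
{
	if (sscanf(line, "%d %d %d", &vals[0], &vals[1], &vals[2]) != 3)
		return -1;
	return 0;
}

/* Reads tcp_wmem[0] from procfs. Returns -1 if the file is missing
 * (for example on a non-Linux host) or cannot be parsed. */
static int read_tcp_wmem_min(void)
{
	char line[128];
	int vals[3];
	FILE *f = fopen("/proc/sys/net/ipv4/tcp_wmem", "r");

	if (!f)
		return -1;
	if (!fgets(line, sizeof(line), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_tcp_wmem(line, vals) == 0 ? vals[0] : -1;
}
```

Splitting the parsing from the file access keeps the parser usable on hosts where the procfs entry is absent.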

v2: deal with zero copy paths.

Fixes: 8e4d980 ("tcp: fix behavior for epoll edge trigger")
Signed-off-by: Eric Dumazet <[email protected]>
Acked-by: Soheil Hassas Yeganeh <[email protected]>
Reviewed-by: Wei Wang <[email protected]>
Reviewed-by: Shakeel Butt <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Eric Dumazet authored and davem330 committed Jun 17, 2022
1 parent c4ee118 commit 849b425
Showing 1 changed file with 29 additions and 4 deletions.
33 changes: 29 additions & 4 deletions net/ipv4/tcp.c
@@ -951,6 +951,23 @@ static int tcp_downgrade_zcopy_pure(struct sock *sk, struct sk_buff *skb)
 	return 0;
 }
 
+static int tcp_wmem_schedule(struct sock *sk, int copy)
+{
+	int left;
+
+	if (likely(sk_wmem_schedule(sk, copy)))
+		return copy;
+
+	/* We could be in trouble if we have nothing queued.
+	 * Use whatever is left in sk->sk_forward_alloc and tcp_wmem[0]
+	 * to guarantee some progress.
+	 */
+	left = sock_net(sk)->ipv4.sysctl_tcp_wmem[0] - sk->sk_wmem_queued;
+	if (left > 0)
+		sk_forced_mem_schedule(sk, min(left, copy));
+	return min(copy, sk->sk_forward_alloc);
+}
+
 static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags,
 				      struct page *page, int offset, size_t *size)
 {
@@ -986,7 +1003,11 @@ static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags,
 		tcp_mark_push(tp, skb);
 		goto new_segment;
 	}
-	if (tcp_downgrade_zcopy_pure(sk, skb) || !sk_wmem_schedule(sk, copy))
+	if (tcp_downgrade_zcopy_pure(sk, skb))
+		return NULL;
+
+	copy = tcp_wmem_schedule(sk, copy);
+	if (!copy)
 		return NULL;
 
 	if (can_coalesce) {
@@ -1334,8 +1355,11 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 
 			copy = min_t(int, copy, pfrag->size - pfrag->offset);
 
-			if (tcp_downgrade_zcopy_pure(sk, skb) ||
-			    !sk_wmem_schedule(sk, copy))
+			if (tcp_downgrade_zcopy_pure(sk, skb))
+				goto wait_for_space;
+
+			copy = tcp_wmem_schedule(sk, copy);
+			if (!copy)
 				goto wait_for_space;
 
 			err = skb_copy_to_page_nocache(sk, &msg->msg_iter, skb,
@@ -1362,7 +1386,8 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 				skb_shinfo(skb)->flags |= SKBFL_PURE_ZEROCOPY;
 
 			if (!skb_zcopy_pure(skb)) {
-				if (!sk_wmem_schedule(sk, copy))
+				copy = tcp_wmem_schedule(sk, copy);
+				if (!copy)
 					goto wait_for_space;
 			}
 