xfrm: do not set IPv4 DF flag when encapsulating IPv6 frames <= 1280 bytes.

One may want to have DF set on large packets to support discovering
path mtu and limiting the size of generated packets (hence not
setting the XFRM_STATE_NOPMTUDISC tunnel flag), while still
supporting networks that are incapable of carrying even minimal
sized IPv6 frames (post encapsulation).

Having IPv4 Don't Frag bit set on encapsulated IPv6 frames that
are not larger than the minimum IPv6 mtu of 1280 isn't useful,
because the resulting ICMP Fragmentation Required error isn't
actionable (even assuming you receive it) because IPv6 will not
drop its path mtu below 1280 anyway.  While the IPv4 stack
could prefrag the packets post encap, this requires the ICMP
error to be successfully delivered and causes a loss of the
original IPv6 frame (thus requiring a retransmit and latency
hit).  Luckily with IPv4 if we simply don't set the DF flag,
we'll just make further fragmenting the packets some other
router's problem.

We'll still learn the correct IPv4 path mtu through encapsulation
of larger IPv6 frames.

I'm still not convinced this patch is entirely sufficient to make
everything happy... but I don't see how it could possibly
make things worse.

See also recent:
  4ff2980 'xfrm: fix tunnel model fragmentation behavior'
and friends

Cc: Lorenzo Colitti <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Lina Wang <[email protected]>
Cc: Steffen Klassert <[email protected]>
Signed-off-by: Maciej Zenczykowski <[email protected]>
Signed-off-by: Steffen Klassert <[email protected]>
zenczykowski authored and klassert committed May 25, 2022
1 parent 9c90c9b commit 6821ad8
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion net/xfrm/xfrm_output.c
@@ -273,6 +273,7 @@ static int xfrm4_beet_encap_add(struct xfrm_state *x, struct sk_buff *skb)
*/
static int xfrm4_tunnel_encap_add(struct xfrm_state *x, struct sk_buff *skb)
{
+bool small_ipv6 = (skb->protocol == htons(ETH_P_IPV6)) && (skb->len <= IPV6_MIN_MTU);
struct dst_entry *dst = skb_dst(skb);
struct iphdr *top_iph;
int flags;
@@ -303,7 +304,7 @@ static int xfrm4_tunnel_encap_add(struct xfrm_state *x, struct sk_buff *skb)
if (flags & XFRM_STATE_NOECN)
IP_ECN_clear(top_iph);

-top_iph->frag_off = (flags & XFRM_STATE_NOPMTUDISC) ?
+top_iph->frag_off = (flags & XFRM_STATE_NOPMTUDISC) || small_ipv6 ?
0 : (XFRM_MODE_SKB_CB(skb)->frag_off & htons(IP_DF));

top_iph->ttl = ip4_dst_hoplimit(xfrm_dst_child(dst));
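
For readers who want to poke at the new condition outside the kernel, here is a minimal userspace sketch of the DF decision this hunk makes. It is not the kernel code: the `sketch_` function, the `SKETCH_` constants, and the plain-integer parameters are hypothetical stand-ins for `skb->protocol`, `skb->len`, the state flags, and `XFRM_MODE_SKB_CB(skb)->frag_off`. In particular, `SKETCH_NOPMTUDISC` is an arbitrary illustrative bit, not a claim about the value of `XFRM_STATE_NOPMTUDISC`.

```c
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h>	/* htons() */

/* Stand-ins so the sketch builds in userspace (assumptions, not kernel headers). */
#define SKETCH_ETH_P_IPV6	0x86DD	/* EtherType for IPv6 */
#define SKETCH_IPV6_MIN_MTU	1280	/* minimum IPv6 MTU, as in the commit message */
#define SKETCH_IP_DF		0x4000	/* Don't Fragment bit in the IPv4 frag_off field */
#define SKETCH_NOPMTUDISC	0x4	/* illustrative bit for the "no PMTU discovery" flag */

/*
 * Returns the frag_off value (network byte order) for the outer IPv4 header.
 * protocol_be and inner_frag_off_be are expected in network byte order,
 * len is the length of the frame being encapsulated.
 */
static uint16_t sketch_outer_frag_off(unsigned int flags, uint16_t protocol_be,
				      unsigned int len, uint16_t inner_frag_off_be)
{
	/* An IPv6 frame no larger than the minimum IPv6 MTU never gets DF. */
	bool small_ipv6 = (protocol_be == htons(SKETCH_ETH_P_IPV6)) &&
			  (len <= SKETCH_IPV6_MIN_MTU);

	if ((flags & SKETCH_NOPMTUDISC) || small_ipv6)
		return 0;

	/* Otherwise carry over only the DF bit from the inner value. */
	return inner_frag_off_be & htons(SKETCH_IP_DF);
}
```

With these stand-ins, a 1280-byte IPv6 frame yields 0 (DF clear) while a 1281-byte one keeps DF if the inner value had it, so larger frames can still drive IPv4 path MTU discovery as the commit message describes.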