This example shows how to use BPF to preserve DSCP values across an encapsulating interface such as WireGuard. It relies on the encapsulation layer preserving the skb->hash value across the encapsulation, which is commonly the case for in-kernel encapsulation protocols (including WireGuard).
PRESERVING DSCP MARKS ACROSS ENCAPSULATION LEAKS DATA! DON’T DO THIS IF YOUR TUNNEL GOES ACROSS THE PUBLIC INTERNET!
The example contains two TC filters. The first parses the packet header, reads the DSCP value out of the packet, and stores that value in a map keyed on the skb->hash value (which is calculated if it isn’t already set). The second reads back the skb->hash value, looks it up in the map, and, if an entry is found, rewrites the packet DSCP based on the stored value.
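The store-then-rewrite logic described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the BPF code shipped with the example: it handles IPv4 only, a fixed-size array stands in for the BPF map, the hash is passed in by the caller rather than taken from skb->hash, and every name (sketch_read_dscp, sketch_write_dscp, the table helpers) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 256 /* assumed size; stands in for the BPF map */

struct entry { uint32_t hash; uint8_t dscp; uint8_t used; };
static struct entry table[TABLE_SIZE];

/* DSCP is the upper six bits of the second IPv4 header byte;
 * the lower two bits carry ECN. */
static uint8_t ipv4_get_dscp(const uint8_t *iph)
{
	return iph[1] >> 2;
}

/* Recompute the IPv4 header checksum, skipping the checksum field itself. */
static uint16_t ipv4_checksum(const uint8_t *iph)
{
	size_t hlen = (size_t)(iph[0] & 0x0f) * 4;
	uint32_t sum = 0;

	for (size_t i = 0; i < hlen; i += 2) {
		if (i == 10)
			continue;
		sum += (uint32_t)(iph[i] << 8 | iph[i + 1]);
	}
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Overwrite the DSCP bits, keep the ECN bits, and fix up the checksum. */
static void ipv4_set_dscp(uint8_t *iph, uint8_t dscp)
{
	assert(dscp < 64);
	iph[1] = (uint8_t)((dscp << 2) | (iph[1] & 0x03));
	uint16_t csum = ipv4_checksum(iph);
	iph[10] = (uint8_t)(csum >> 8);
	iph[11] = (uint8_t)(csum & 0xff);
}

/* First step: remember the inner packet's DSCP under the flow hash.
 * A later packet with the same hash simply overwrites the entry. */
void sketch_read_dscp(const uint8_t *inner_iph, uint32_t hash)
{
	struct entry *e = &table[hash % TABLE_SIZE];

	e->hash = hash;
	e->dscp = ipv4_get_dscp(inner_iph);
	e->used = 1;
}

/* Second step: if the hash is known, copy the stored DSCP to the outer
 * header; otherwise leave the packet untouched. */
void sketch_write_dscp(uint8_t *outer_iph, uint32_t hash)
{
	const struct entry *e = &table[hash % TABLE_SIZE];

	if (e->used && e->hash == hash)
		ipv4_set_dscp(outer_iph, e->dscp);
}
```

Because each entry is overwritten by the most recent packet with the same hash, the sketch also shows why the result is per-flow rather than per-packet.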
The idea is that the first filter is run on the internal (encapsulating) interface, and the second is run on the physical interface that transmits the encapsulated packet. To install the filters, run the userspace component like:
sudo ./preserve-dscp <ifname pre> <ifname post>
To unload the filters again, run:
sudo ./preserve-dscp <ifname pre> <ifname post> --unload
Note that unloading will remove the clsact qdisc from the interfaces entirely, so don’t run this if you want to preserve that; instead manually remove the filters using tc.
There are a couple of caveats to this approach:
- As mentioned above, doing this in the first place LEAKS DATA! I.e., it makes it possible for an outside observer to distinguish between different types of traffic inside the tunnel. This is generally a bad idea, especially if the traffic goes across the public internet.
- This only works for encapsulation protocols that preserve the SKB hash in the first place.
- The userspace program will try to detect if the pre interface has an Ethernet header by checking if the interface has a type of ARPHRD_NONE, and if so will assume the packet starts with the IP header. If this heuristic turns out to be wrong, the filter will fail.
- There is no sanity checking on the outer filter that the packets actually come from the interface that we ran the pre filter on in the first place; there is no general way to check this from BPF, but the write_dscp filter can be amended to do some other sanity checks on the packet before modifying it (such as checking port numbers).
- Since this relies on skb->hash, it is flow-based; if individual packets in the same flow have different marks, which ones will be preserved is racy.
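The link-type heuristic from the caveats above can be sketched as follows. How the actual userspace program obtains the interface type is not shown in this document; reading /sys/class/net/<ifname>/type is one assumed approach, and the function and macro names here are hypothetical. The constant 0xFFFE matches ARPHRD_NONE in <linux/if_arp.h>, and 14 is the length of an Ethernet header.

```c
#include <stddef.h>
#include <stdio.h>

#define SKETCH_ARPHRD_NONE 0xFFFE /* same value as ARPHRD_NONE */
#define SKETCH_ETH_HLEN    14     /* Ethernet header length in bytes */

/* Offset of the IP header from the start of the packet data:
 * 0 for interfaces without a link-layer header, otherwise the
 * sketch assumes an Ethernet header precedes it. */
int sketch_ip_offset(int link_type)
{
	return link_type == SKETCH_ARPHRD_NONE ? 0 : SKETCH_ETH_HLEN;
}

/* Read the link type of an interface from sysfs (assumed approach).
 * Returns -1 if the interface does not exist or cannot be read. */
int sketch_link_type(const char *ifname)
{
	char path[128];
	int type = -1;
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path), "/sys/class/net/%s/type", ifname);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%d", &type) != 1)
		type = -1;
	fclose(f);
	return type;
}
```

If the assumption is wrong, for example on an interface whose link-layer header is neither absent nor Ethernet, the offset points at the wrong bytes, which is the failure mode the caveat describes.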