classifier: Adjust segment boundary to execute prerequisite processing.
During flow processing, the flow wildcards are checked as a series of
stages, and these stages are intended to carry dependencies in a single
direction.  But neighbor discovery processing, for example, has an
incorrect dependency chain: fields from stage 4 are needed to determine
whether fields from stage 3 are needed.

We can build a set of flow rules to demonstrate this:
  table=0,priority=100,ipv6,ipv6_src=1000::/10 actions=resubmit(,1)
  table=0,priority=0 actions=NORMAL
  table=1,priority=110,ipv6,ipv6_dst=1000::3 actions=resubmit(,2)
  table=1,priority=100,ipv6,ipv6_dst=1000::4 actions=resubmit(,2)
  table=1,priority=0 actions=NORMAL
  table=2,priority=120,icmp6,nw_ttl=255,icmp_type=135,icmp_code=0,nd_sll=10:de:ad:be:ef:10 actions=NORMAL
  table=2,priority=100,tcp actions=NORMAL
  table=2,priority=100,icmp6 actions=NORMAL
  table=2,priority=0 actions=NORMAL

With this set of flows, any IPv6 packet that traverses this pipeline
will have the corresponding nd_sll field flagged as a required match for
classification, even if that field doesn't make sense in such a context
(for example, TCP packets).  When the corresponding flow is installed into
the kernel datapath, this field is not reflected when the revalidator
executes the dump stage (see net/openvswitch/flow_netlink.c for more details).
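The effect described above can be illustrated with a toy model of a staged
lookup.  This is a minimal sketch, not the classifier's code: the types, the
helper names (toy_subtable, toy_nd_subtable, toy_lookup_unwildcard) and the
four-field layout are all hypothetical, and it assumes a simplified model in
which every field examined up to and including the first mismatching stage
becomes a required match while later stages stay wildcarded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy fields; F_TP_SRC doubles as the ICMPv6 type. */
enum toy_field { F_DL_TYPE, F_NW_PROTO, F_ND_TARGET, F_TP_SRC, N_TOY_FIELDS };
enum { N_TOY_STAGES = 4 };      /* Metadata, L2, L3, L4. */
static_assert(N_TOY_FIELDS <= 32, "field bitmap must fit in uint32_t");

#define FIELD_BIT(F) (UINT32_C(1) << (F))

struct toy_subtable {
    uint32_t stage_fields[N_TOY_STAGES]; /* Fields examined in each stage. */
    uint32_t rule[N_TOY_FIELDS];         /* Values the single rule requires. */
};

/* Builds a subtable for a toy ND rule.  'nd_in_l3' selects whether the ND
 * target is examined in the L3 stage (old layout) or L4 stage (new layout). */
static struct toy_subtable
toy_nd_subtable(bool nd_in_l3)
{
    struct toy_subtable st = {
        .stage_fields = { 0, FIELD_BIT(F_DL_TYPE), FIELD_BIT(F_NW_PROTO),
                          FIELD_BIT(F_TP_SRC) },
        .rule = { [F_DL_TYPE] = 0x86dd, [F_NW_PROTO] = 58,
                  [F_ND_TARGET] = 1, [F_TP_SRC] = 135 },
    };
    st.stage_fields[nd_in_l3 ? 2 : 3] |= FIELD_BIT(F_ND_TARGET);
    return st;
}

/* Returns the bitmap of fields that become required matches after looking
 * up 'pkt'.  The walk stops at the first stage that fails to match. */
static uint32_t
toy_lookup_unwildcard(const struct toy_subtable *st,
                      const uint32_t pkt[N_TOY_FIELDS])
{
    uint32_t wc = 0;

    for (int s = 0; s < N_TOY_STAGES; s++) {
        uint32_t fields = st->stage_fields[s];

        wc |= fields;           /* Everything examined here must be matched. */
        for (int f = 0; f < N_TOY_FIELDS; f++) {
            if ((fields & FIELD_BIT(f)) && pkt[f] != st->rule[f]) {
                return wc;      /* Mismatch: later stages stay wildcarded. */
            }
        }
    }
    return wc;
}
```

In this model a TCP packet (nw_proto 6) fails the L3 stage.  With the ND
target examined in L3 it is pulled into the required match even though the
packet is not ICMPv6; with the ND target examined in L4 it stays wildcarded.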

During the sweep stage, revalidator will compare the dumped WC with a
generated WC - these will mismatch because the generated WC will match on
the Neighbor Discovery fields, while the datapath WC will not match on
these fields.  We will then invalidate the flow and as a side effect
force an upcall.
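
The comparison described above can be pictured as a word-by-word mask check.
This is a minimal sketch under stated assumptions: masks are modeled as plain
arrays of 64-bit words in which a set bit means "must match", and the helper
name dumped_mask_is_missing_bits is hypothetical, not a function from the
OVS tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if 'generated' requires a match on at least one bit that
 * 'dumped' leaves wildcarded.  In this toy model such a result stands for
 * the mismatch that leads to the flow being invalidated. */
static bool
dumped_mask_is_missing_bits(const uint64_t *dumped, const uint64_t *generated,
                            size_t n_words)
{
    assert(dumped && generated);

    for (size_t i = 0; i < n_words; i++) {
        if ((dumped[i] & generated[i]) != generated[i]) {
            return true;
        }
    }
    return false;
}
```

If the second word of the mask stands for the Neighbor Discovery fields, a
dumped mask of { 0xffff, 0 } checked against a generated mask of
{ 0xffff, 0xffffffffffff } reports a mismatch, while two identical masks do
not.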

By redefining the boundary, we shift these fields to the l4 subtable, and
cause masks to be generated matching just the requisite fields.  The list
of fields being shifted:

    struct in6_addr nd_target;
    struct eth_addr arp_sha;
    struct eth_addr arp_tha;
    ovs_be16 tcp_flags;
    ovs_be16 pad2;
    struct ovs_key_nsh nsh;
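
The boundary shift can be sketched with offsetof() on a trimmed stand-in
structure.  Everything here is illustrative: struct toy_flow, the placeholder
array sizes, the TOY_/OLD_/NEW_ constants and in_l3_segment() are
hypothetical, and only the relative order of the named members follows the
list above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed stand-in for struct flow.  Placeholder sizes are
 * made up; they only keep the segment boundaries 64-bit aligned. */
struct toy_flow {
    uint64_t metadata[2];    /* Placeholder for the metadata segment. */
    uint64_t l2[2];          /* Placeholder for the L2 segment. */
    uint64_t l3[5];          /* Placeholder for addresses, nw_tos, etc. */
    uint8_t nd_target[16];   /* First field of the L4 segment after the shift. */
    uint8_t arp_sha[6];
    uint8_t arp_tha[6];
    uint16_t tcp_flags;
    uint16_t pad2;
    uint8_t nsh[24];         /* Placeholder for struct ovs_key_nsh. */
    uint16_t tp_src;         /* First field of the L4 segment before the shift. */
    uint16_t tp_dst;
    uint32_t pad3;
};

enum {
    TOY_SEGMENT_2_ENDS_AT = offsetof(struct toy_flow, l3),
    OLD_SEGMENT_3_ENDS_AT = offsetof(struct toy_flow, tp_src),
    NEW_SEGMENT_3_ENDS_AT = offsetof(struct toy_flow, nd_target),
};
static_assert(OLD_SEGMENT_3_ENDS_AT % sizeof(uint64_t) == 0, "unaligned");
static_assert(NEW_SEGMENT_3_ENDS_AT % sizeof(uint64_t) == 0, "unaligned");

/* Returns true if a field at byte offset 'field_ofs' belongs to the L3
 * segment, given where that segment ends. */
static bool
in_l3_segment(size_t field_ofs, size_t segment_3_ends_at)
{
    return field_ofs >= (size_t) TOY_SEGMENT_2_ENDS_AT
           && field_ofs < segment_3_ends_at;
}
```

With the old boundary, in_l3_segment() is true for nd_target and tcp_flags;
with the new boundary it is false for both, which is the shift into the l4
subtable.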

A standout field would be tcp_flags moving from l3 subtable matches to
the l4 subtable matches.  This reverts a partial performance optimization
in the case of stateless firewalling.  The tcp_flags field might have
been a good candidate to retain in the l3 segment, but it got overloaded
with ICMPv6 ND matching, and therefore we can't preserve this kind of
optimization.

Two other approaches were considered - moving the nd_target field alone
and collapsing the l3/l4 segments into a single subtable for matching.
Moving any field individually introduces ABI mismatch, and doesn't
completely address the problems with other neighbor discovery related
fields (such as nd_sll/nd_tll).  Collapsing the two subtables creates
an issue with datapath flow explosion, since the l3 and l4 fields will
be unwildcarded together (this can be seen with some of the existing
classifier tests).

A simple test is added to showcase the behavior.

Fixes: 476f36e ("Classifier: Staged subtable matching.")
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2081773
Reported-by: Numan Siddique <[email protected]>
Suggested-by: Ilya Maximets <[email protected]>
Signed-off-by: Aaron Conole <[email protected]>
Acked-by: Eelco Chaudron <[email protected]>
Acked-by: Cian Ferriter <[email protected]>
Tested-by: Numan Siddique <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
apconole authored and igsilya committed Jun 7, 2022
1 parent c0d7d63 commit ca44218
Showing 2 changed files with 28 additions and 4 deletions.
7 changes: 3 additions & 4 deletions include/openvswitch/flow.h
@@ -141,15 +141,14 @@ struct flow {
     uint8_t nw_tos; /* IP ToS (including DSCP and ECN). */
     uint8_t nw_ttl; /* IP TTL/Hop Limit. */
     uint8_t nw_proto; /* IP protocol or low 8 bits of ARP opcode. */
+    /* L4 (64-bit aligned) */
     struct in6_addr nd_target; /* IPv6 neighbor discovery (ND) target. */
     struct eth_addr arp_sha; /* ARP/ND source hardware address. */
     struct eth_addr arp_tha; /* ARP/ND target hardware address. */
-    ovs_be16 tcp_flags; /* TCP flags/ICMPv6 ND options type.
-                         * With L3 to avoid matching L4. */
+    ovs_be16 tcp_flags; /* TCP flags/ICMPv6 ND options type. */
     ovs_be16 pad2; /* Pad to 64 bits. */
     struct ovs_key_nsh nsh; /* Network Service Header keys */

-    /* L4 (64-bit aligned) */
     ovs_be16 tp_src; /* TCP/UDP/SCTP source port/ICMP type. */
     ovs_be16 tp_dst; /* TCP/UDP/SCTP destination port/ICMP code. */
     ovs_be16 ct_tp_src; /* CT original tuple source port/ICMP type. */
@@ -179,7 +178,7 @@ BUILD_ASSERT_DECL(offsetof(struct flow, igmp_group_ip4) + sizeof(uint32_t)
 enum {
     FLOW_SEGMENT_1_ENDS_AT = offsetof(struct flow, dl_dst),
     FLOW_SEGMENT_2_ENDS_AT = offsetof(struct flow, nw_src),
-    FLOW_SEGMENT_3_ENDS_AT = offsetof(struct flow, tp_src),
+    FLOW_SEGMENT_3_ENDS_AT = offsetof(struct flow, nd_target),
 };
 BUILD_ASSERT_DECL(FLOW_SEGMENT_1_ENDS_AT % sizeof(uint64_t) == 0);
 BUILD_ASSERT_DECL(FLOW_SEGMENT_2_ENDS_AT % sizeof(uint64_t) == 0);
25 changes: 25 additions & 0 deletions tests/classifier.at
@@ -129,6 +129,31 @@ Datapath actions: 3
 OVS_VSWITCHD_STOP(["/'prefixes' with incompatible field: ipv6_label/d"])
 AT_CLEANUP

+AT_SETUP([flow classifier - ipv6 ND dependency])
+OVS_VSWITCHD_START
+add_of_ports br0 1 2
+AT_DATA([flows.txt], [dnl
+ table=0,priority=100,ipv6,ipv6_src=1000::/10 actions=resubmit(,1)
+ table=0,priority=0 actions=NORMAL
+ table=1,priority=110,ipv6,ipv6_dst=1000::3 actions=resubmit(,2)
+ table=1,priority=100,ipv6,ipv6_dst=1000::4 actions=resubmit(,2)
+ table=1,priority=0 actions=NORMAL
+ table=2,priority=120,icmp6,nw_ttl=255,icmp_type=135,icmp_code=0,nd_target=1000::1 actions=NORMAL
+ table=2,priority=100,tcp actions=NORMAL
+ table=2,priority=100,icmp6 actions=NORMAL
+ table=2,priority=0 actions=NORMAL
+])
+AT_CHECK([ovs-ofctl add-flows br0 flows.txt])
+
+# test ICMPv6 echo request (which should have no nd_target field)
+AT_CHECK([ovs-appctl ofproto/trace br0 "in_port=1,eth_src=f6:d2:b0:19:5e:7b,eth_dst=d2:49:19:91:78:fe,dl_type=0x86dd,ipv6_src=1000::3,ipv6_dst=1000::4,nw_proto=58,icmpv6_type=128,icmpv6_code=0"], [0], [stdout])
+AT_CHECK([tail -2 stdout], [0],
+ [Megaflow: recirc_id=0,eth,icmp6,in_port=1,dl_src=f6:d2:b0:19:5e:7b,dl_dst=d2:49:19:91:78:fe,ipv6_src=1000::/10,ipv6_dst=1000::4,nw_ttl=0,nw_frag=no
+Datapath actions: 100,2
+])
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
 AT_BANNER([conjunctive match])

 AT_SETUP([single conjunctive match])