Skip to content

Commit

Permalink
ofproto-dpif-xlate: Fix zone set from non-frozen-metadata fields.
Browse files Browse the repository at this point in the history
CT zone could be set from a field that is not included in frozen
metadata. Consider the example rules which are typically seen in
OpenStack security group rules:

priority=100,in_port=1,tcp,ct_state=-trk,action=ct(zone=5,table=0)
priority=100,in_port=1,tcp,ct_state=+trk,action=ct(commit,zone=NXM_NX_CT_ZONE[]),2

The zone is set from the first rule's ct action. These two rules will
generate two megaflows: the first one uses zone=5 to query the CT module,
the second one sets the zone-id from the first megaflow and commit to CT.

The current implementation will generate a megaflow that does not use
ct_zone=5 as a match, but directly commit into the ct using zone=5, as zone is
set by an Imm not a field.

Consider a situation that one changes the zone id (for example to 15)
in the first rule, however, still keep the second rule unchanged. During
this change, there is traffic hitting the two generated megaflows, the
revaldiator would revalidate all megaflows, however, the revalidator will
not change the second megaflow, because zone=5 is recorded in the
megaflow, so the xlate will still translate the commit action into zone=5,
and the new traffic will still commit to CT as zone=5, not zone=15,
resulting in taffic drops and other issues.

Just like OVS set-field convention, if a field X is set by Y
(Y is a variable not an Imm), we should also mask Y as a match
in the generated megaflow. An exception is that if the zone-id is
set by the field that is included in the frozen state (i.e. regs) and this
upcall is a resume of a thawed xlate, the un-wildcarding can be skipped,
as the recirc_id is a hash of the values in these fields, and it will change
following the changes of these fields. When the recirc_id changes,
all megaflows with the old recirc id will be invalid later.

Fixes: 0765951 ("Add support for connection tracking.")
Reported-by: Sai Su <[email protected]>
Signed-off-by: Peng He <[email protected]>
Acked-by: Mark D. Gray <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
  • Loading branch information
xnhp0320 authored and igsilya committed Oct 13, 2021
1 parent 02aebad commit c1fdb83
Show file tree
Hide file tree
Showing 4 changed files with 143 additions and 8 deletions.
1 change: 1 addition & 0 deletions include/openvswitch/meta-flow.h
Original file line number Diff line number Diff line change
Expand Up @@ -2305,6 +2305,7 @@ void mf_set_flow_value_masked(const struct mf_field *,
const union mf_value *mask,
struct flow *);
bool mf_is_tun_metadata(const struct mf_field *);
bool mf_is_frozen_metadata(const struct mf_field *);
bool mf_is_pipeline_field(const struct mf_field *);
bool mf_is_set(const struct mf_field *, const struct flow *);
void mf_mask_field(const struct mf_field *, struct flow_wildcards *);
Expand Down
13 changes: 13 additions & 0 deletions lib/meta-flow.c
Original file line number Diff line number Diff line change
Expand Up @@ -1788,6 +1788,19 @@ mf_is_tun_metadata(const struct mf_field *mf)
mf->id < MFF_TUN_METADATA0 + TUN_METADATA_NUM_OPTS;
}

bool
mf_is_frozen_metadata(const struct mf_field *mf)
{
if (mf->id >= MFF_TUN_ID && mf->id <= MFF_IN_PORT_OXM) {
return true;
}

if (mf->id >= MFF_REG0 && mf->id < MFF_ETH_SRC) {
return true;
}
return false;
}

bool
mf_is_pipeline_field(const struct mf_field *mf)
{
Expand Down
32 changes: 24 additions & 8 deletions ofproto/ofproto-dpif-xlate.c
Original file line number Diff line number Diff line change
Expand Up @@ -6199,11 +6199,32 @@ static void
compose_conntrack_action(struct xlate_ctx *ctx, struct ofpact_conntrack *ofc,
bool is_last_action)
{
ovs_u128 old_ct_label_mask = ctx->wc->masks.ct_label;
uint32_t old_ct_mark_mask = ctx->wc->masks.ct_mark;
size_t ct_offset;
uint16_t zone;
if (ofc->zone_src.field) {
union mf_subvalue value;
memset(&value, 0xff, sizeof(value));

zone = mf_get_subfield(&ofc->zone_src, &ctx->xin->flow);
if (ctx->xin->frozen_state) {
/* If the upcall is a resume of a recirculation, we only need to
* unwildcard the fields that are not in the frozen_metadata, as
* when the rules update, OVS will generate a new recirc_id,
* which will invalidate the megaflow with old the recirc_id.
*/
if (!mf_is_frozen_metadata(ofc->zone_src.field)) {
mf_write_subfield_flow(&ofc->zone_src, &value,
&ctx->wc->masks);
}
} else {
mf_write_subfield_flow(&ofc->zone_src, &value, &ctx->wc->masks);
}
} else {
zone = ofc->zone_imm;
}

size_t ct_offset;
ovs_u128 old_ct_label_mask = ctx->wc->masks.ct_label;
uint32_t old_ct_mark_mask = ctx->wc->masks.ct_mark;
/* Ensure that any prior actions are applied before composing the new
* conntrack action. */
xlate_commit_actions(ctx);
Expand All @@ -6215,11 +6236,6 @@ compose_conntrack_action(struct xlate_ctx *ctx, struct ofpact_conntrack *ofc,
do_xlate_actions(ofc->actions, ofpact_ct_get_action_len(ofc), ctx,
is_last_action, false);

if (ofc->zone_src.field) {
zone = mf_get_subfield(&ofc->zone_src, &ctx->xin->flow);
} else {
zone = ofc->zone_imm;
}

ct_offset = nl_msg_start_nested(ctx->odp_actions, OVS_ACTION_ATTR_CT);
if (ofc->flags & NX_CT_F_COMMIT) {
Expand Down
105 changes: 105 additions & 0 deletions tests/system-traffic.at
Original file line number Diff line number Diff line change
Expand Up @@ -1981,6 +1981,111 @@ tcp,orig=(src=10.1.1.3,dst=10.1.1.4,sport=<cleared>,dport=<cleared>),reply=(src=
OVS_TRAFFIC_VSWITCHD_STOP
AT_CLEANUP

AT_SETUP([conntrack - zones from other field])
CHECK_CONNTRACK()
OVS_TRAFFIC_VSWITCHD_START()

ADD_NAMESPACES(at_ns0, at_ns1)

ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")

dnl Allow any traffic from ns0->ns1. Only allow nd, return traffic from ns1->ns0.
AT_DATA([flows.txt], [dnl
priority=1,action=drop
priority=10,arp,action=normal
priority=10,icmp,action=normal
priority=100,in_port=1,tcp,ct_state=-trk,action=ct(zone=5,table=0)
priority=100,in_port=1,tcp,ct_state=+trk,action=ct(commit,zone=NXM_NX_CT_ZONE[]),2
priority=100,in_port=2,ct_state=-trk,tcp,action=ct(table=0,zone=5)
priority=100,in_port=2,ct_state=+trk,ct_zone=5,tcp,action=1
])

AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])

OVS_START_L7([at_ns1], [http])

dnl HTTP requests from p0->p1 should work fine.
NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v -o wget0.log])

AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(10.1.1.2)], [0], [dnl
tcp,dnl
orig=(src=10.1.1.1,dst=10.1.1.2,sport=<cleared>,dport=<cleared>),dnl
reply=(src=10.1.1.2,dst=10.1.1.1,sport=<cleared>,dport=<cleared>),dnl
zone=5,protoinfo=(state=<cleared>)
])

dnl This is to test when the zoneid is set by a field variable like
dnl NXM_NX_CT_ZONE, the OVS xlate should generate a megaflow with a form of
dnl "ct_zone(5), ... actions: ct(commit, zone=5)". The match "ct_zone(5)"
dnl is needed as if we changes the zoneid into 15 in the following, the old
dnl "ct_zone(5), ... actions: ct(commit, zone=5)" megaflow will not get hit,
dnl and OVS will generate a new megaflow with the match "ct_zone(0xf)".
dnl This will make sure that the new packets are committing to zoneid 15
dnl rather than old 5.
AT_CHECK([ovs-appctl dpctl/dump-flows --names filter=in_port=ovs-p0 dnl
| grep "+trk" | grep -q "ct_zone(0x5)" ], [0], [])

AT_CHECK([ovs-ofctl mod-flows br0 dnl
'priority=100,ct_state=-trk,tcp,in_port="ovs-p0" actions=ct(table=0,zone=15)'])

NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v -o wget0.log])

AT_CHECK([ovs-appctl dpctl/dump-flows --names filter=in_port=ovs-p0 dnl
| grep "+trk" | grep -q "ct_zone(0xf)" ], [0], [])

OVS_TRAFFIC_VSWITCHD_STOP
AT_CLEANUP

AT_SETUP([conntrack - zones from other field, more tests])
CHECK_CONNTRACK()
OVS_TRAFFIC_VSWITCHD_START()

ADD_NAMESPACES(at_ns0, at_ns1)

ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")

dnl Allow any traffic from ns0->ns1. Only allow nd, return traffic from ns1->ns0.
AT_DATA([flows.txt], [dnl
priority=1,action=drop
priority=10,arp,action=normal
priority=10,icmp,action=normal
priority=100,in_port=1,tcp,ct_state=-trk,action=ct(zone=5,table=0,commit,exec(load:0xffff0005->NXM_NX_CT_LABEL[[0..31]]))
priority=100,in_port=1,tcp,ct_state=+trk,action=ct(commit,zone=NXM_NX_CT_LABEL[[0..15]]),2
priority=100,in_port=2,ct_state=-trk,tcp,action=ct(table=0,zone=5)
priority=100,in_port=2,ct_state=+trk,ct_zone=5,tcp,action=1
])

AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])

OVS_START_L7([at_ns1], [http])

dnl HTTP requests from p0->p1 should work fine.
NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v -o wget0.log])

AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(10.1.1.2)], [0], [dnl
tcp,dnl
orig=(src=10.1.1.1,dst=10.1.1.2,sport=<cleared>,dport=<cleared>),dnl
reply=(src=10.1.1.2,dst=10.1.1.1,sport=<cleared>,dport=<cleared>),dnl
zone=5,labels=0xffff0005,protoinfo=(state=<cleared>)
])

AT_CHECK([ovs-appctl dpctl/dump-flows --names filter=in_port=ovs-p0 dnl
| grep "+trk" | sed 's/0xffff0005\/0xffff/0x5\/0xffff/' dnl
| grep -q "ct_label(0x5/0xffff)" ], [0], [])

AT_CHECK([ovs-ofctl mod-flows br0 'priority=100,ct_state=-trk,tcp,in_port="ovs-p0" actions=ct(table=0,zone=15,commit,exec(load:0xffff000f->NXM_NX_CT_LABEL[[0..31]]))'])

NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v -o wget0.log])

AT_CHECK([ovs-appctl dpctl/dump-flows --names filter=in_port=ovs-p0 dnl
| grep "+trk" | sed 's/0xffff000f\/0xffff/0xf\/0xffff/' dnl
| grep -q "ct_label(0xf/0xffff)" ], [0], [])

OVS_TRAFFIC_VSWITCHD_STOP
AT_CLEANUP

AT_SETUP([conntrack - multiple bridges])
CHECK_CONNTRACK()
OVS_TRAFFIC_VSWITCHD_START(
Expand Down

0 comments on commit c1fdb83

Please sign in to comment.