Userspace datapath: Add fragmentation handling.
Add fragmentation handling in support of conntrack.
Both IPv4 and IPv6 are supported.

After discussion with several people, I decided not to store
configuration state in the database. This keeps the userspace datapath
more consistent with the kernel in the future, matches other conntrack
configuration, which will not be in the database either, and is simpler
overall. Accordingly, fragmentation handling is enabled by default.

This patch enables fragmentation tests for the userspace datapath.

Signed-off-by: Darrell Ball <[email protected]>
Signed-off-by: Ben Pfaff <[email protected]>
darball1 authored and blp committed Feb 14, 2019
1 parent 9f17f10 commit 4ea9669
Showing 18 changed files with 2,330 additions and 83 deletions.
51 changes: 26 additions & 25 deletions Documentation/faq/releases.rst
@@ -105,31 +105,32 @@ Q: Are all features available with all datapaths?
The following table lists the datapath supported features from an Open
vSwitch user's perspective.

===================== ============== ============== ========= =======
Feature               Linux upstream Linux OVS tree Userspace Hyper-V
===================== ============== ============== ========= =======
NAT                   4.6            YES            Yes       NO
Connection tracking   4.3            YES            PARTIAL   PARTIAL
Tunnel - LISP         NO             YES            NO        NO
Tunnel - STT          NO             YES            NO        YES
Tunnel - GRE          3.11           YES            YES       YES
Tunnel - VXLAN        3.12           YES            YES       YES
Tunnel - Geneve       3.18           YES            YES       YES
Tunnel - GRE-IPv6     4.18           YES            YES       NO
Tunnel - VXLAN-IPv6   4.3            YES            YES       NO
Tunnel - Geneve-IPv6  4.4            YES            YES       NO
Tunnel - ERSPAN       4.18           YES            YES       NO
Tunnel - ERSPAN-IPv6  4.18           YES            YES       NO
QoS - Policing        YES            YES            YES       NO
QoS - Shaping         YES            YES            NO        NO
sFlow                 YES            YES            YES       NO
IPFIX                 3.10           YES            YES       NO
Set action            YES            YES            YES       PARTIAL
NIC Bonding           YES            YES            YES       YES
Multiple VTEPs        YES            YES            YES       YES
Meters                4.15           YES            YES       NO
Conntrack zone limit  4.18           YES            NO        NO
===================== ============== ============== ========= =======
========================== ============== ============== ========= =======
Feature                    Linux upstream Linux OVS tree Userspace Hyper-V
========================== ============== ============== ========= =======
Connection tracking        4.3            YES            YES       YES
Conntrack Fragment Reass.  4.3            YES            YES       YES
NAT                        4.6            YES            YES       NO
Conntrack zone limit       4.18           YES            NO        NO
Tunnel - LISP              NO             YES            NO        NO
Tunnel - STT               NO             YES            NO        YES
Tunnel - GRE               3.11           YES            YES       YES
Tunnel - VXLAN             3.12           YES            YES       YES
Tunnel - Geneve            3.18           YES            YES       YES
Tunnel - GRE-IPv6          NO             NO             YES       NO
Tunnel - VXLAN-IPv6        4.3            YES            YES       NO
Tunnel - Geneve-IPv6       4.4            YES            YES       NO
Tunnel - ERSPAN            4.18           YES            YES       NO
Tunnel - ERSPAN-IPv6       4.18           YES            YES       NO
QoS - Policing             YES            YES            YES       NO
QoS - Shaping              YES            YES            NO        NO
sFlow                      YES            YES            YES       NO
IPFIX                      3.10           YES            YES       NO
Set action                 YES            YES            YES       PARTIAL
NIC Bonding                YES            YES            YES       YES
Multiple VTEPs             YES            YES            YES       YES
Meters                     4.15           YES            YES       NO
========================== ============== ============== ========= =======

Do note, however:

10 changes: 9 additions & 1 deletion NEWS
@@ -8,7 +8,15 @@ Post-v2.11.0
- Userspace datapath:
* ICMPv6 ND enhancements: support for match and set ND options type
and reserved fields.

* Add v4/v6 fragmentation support for conntrack.
* New ovs-appctl "dpctl/ipf-set-enabled" and "dpctl/ipf-set-disabled"
commands for userspace datapath conntrack fragmentation support.
* New "ovs-appctl dpctl/ipf-set-min-frag" command for userspace
datapath conntrack fragmentation support.
* New "ovs-appctl dpctl/ipf-set-max-nfrags" command for userspace datapath
conntrack fragmentation support.
* New "ovs-appctl dpctl/ipf-get-status" command for userspace datapath
conntrack fragmentation support.

v2.11.0 - xx xxx xxxx
---------------------
1 change: 1 addition & 0 deletions include/sparse/netinet/ip6.h
@@ -64,5 +64,6 @@ struct ip6_frag {
};

#define IP6F_OFF_MASK ((OVS_FORCE ovs_be16) 0xfff8)
#define IP6F_MORE_FRAG ((OVS_FORCE ovs_be16) 0x0001)

#endif /* netinet/ip6.h sparse */
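
The `IP6F_OFF_MASK` and `IP6F_MORE_FRAG` constants above select the two parts of the 16-bit offset/flags field of an IPv6 fragment header that reassembly depends on: the fragment offset and the more-fragments bit. As a rough illustration, assuming the field is read in network byte order and converted with `ntohs()`, the decode could look like the sketch below. The `FRAG_*_HOST` masks and the `frag_*` helpers are hypothetical names invented for this example; they are not part of the patch or of OVS.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Host-byte-order masks (hypothetical names, not from the patch).  The
 * sparse header expresses the same bits as big-endian constants. */
#define FRAG_OFF_MASK_HOST  0xfff8u  /* Upper 13 bits: offset field. */
#define FRAG_MORE_FLAG_HOST 0x0001u  /* Lowest bit: more fragments follow. */

/* Returns the byte offset of a fragment within the original packet.
 * 'offlg_be' is the offset/flags field exactly as it appears on the wire.
 * The offset field counts 8-byte units and sits 3 bits up from bit 0, so
 * the masked value already equals the offset in bytes. */
static inline uint16_t
frag_byte_offset(uint16_t offlg_be)
{
    return ntohs(offlg_be) & FRAG_OFF_MASK_HOST;
}

/* Returns true if more fragments follow this one. */
static inline bool
frag_more(uint16_t offlg_be)
{
    return (ntohs(offlg_be) & FRAG_MORE_FLAG_HOST) != 0;
}

/* Returns true for the last fragment of a packet: some earlier data exists
 * (non-zero offset) and no further fragments follow. */
static inline bool
frag_is_last(uint16_t offlg_be)
{
    return frag_byte_offset(offlg_be) != 0 && !frag_more(offlg_be);
}
```

For instance, a field holding `htons(0x0009)` decodes to byte offset 8 with the more-fragments bit set, so that fragment is neither first nor last.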
4 changes: 3 additions & 1 deletion lib/automake.mk
@@ -1,4 +1,4 @@
# Copyright (C) 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017 Nicira, Inc.
# Copyright (C) 2009-2018 Nicira, Inc.
#
# Copying and distribution of this file, with or without modification,
# are permitted in any medium without royalty provided the copyright
@@ -108,6 +108,8 @@ lib_libopenvswitch_la_SOURCES = \
lib/hmapx.h \
lib/id-pool.c \
lib/id-pool.h \
lib/ipf.c \
lib/ipf.h \
lib/jhash.c \
lib/jhash.h \
lib/json.c \
22 changes: 19 additions & 3 deletions lib/conntrack.c
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2015, 2016, 2017 Nicira, Inc.
* Copyright (c) 2015-2019 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -30,6 +30,7 @@
#include "ct-dpif.h"
#include "dp-packet.h"
#include "flow.h"
#include "ipf.h"
#include "netdev.h"
#include "odp-netlink.h"
#include "openvswitch/hmap.h"
@@ -340,6 +341,7 @@ conntrack_init(struct conntrack *ct)
atomic_init(&ct->n_conn_limit, DEFAULT_N_CONN_LIMIT);
latch_init(&ct->clean_thread_exit);
ct->clean_thread = ovs_thread_create("ct_clean", clean_thread_main, ct);
ct->ipf = ipf_init();
}

/* Destroys the connection tracker 'ct' and frees all the allocated memory. */
@@ -382,6 +384,7 @@ conntrack_destroy(struct conntrack *ct)
hindex_destroy(&ct->alg_expectation_refs);
ct_rwlock_unlock(&ct->resources_lock);
ct_rwlock_destroy(&ct->resources_lock);
ipf_destroy(ct->ipf);
}

static unsigned hash_to_bucket(uint32_t hash)
@@ -1299,7 +1302,8 @@ process_one(struct conntrack *ct, struct dp_packet *pkt,

/* Sends the packets in '*pkt_batch' through the connection tracker 'ct'. All
* the packets must have the same 'dl_type' (IPv4 or IPv6) and should have
* the l3 and and l4 offset properly set.
* the l3 and and l4 offset properly set. Performs fragment reassembly with
* the help of ipf_preprocess_conntrack().
*
* If 'commit' is true, the packets are allowed to create new entries in the
* connection tables. 'setmark', if not NULL, should point to a two
@@ -1314,11 +1318,15 @@ conntrack_execute(struct conntrack *ct, struct dp_packet_batch *pkt_batch,
const struct nat_action_info_t *nat_action_info,
long long now)
{
ipf_preprocess_conntrack(ct->ipf, pkt_batch, now, dl_type, zone,
ct->hash_basis);

struct dp_packet *packet;
struct conn_lookup_ctx ctx;

DP_PACKET_BATCH_FOR_EACH (i, packet, pkt_batch) {
if (!conn_key_extract(ct, packet, dl_type, &ctx, zone)) {
if (packet->md.ct_state == CS_INVALID
|| !conn_key_extract(ct, packet, dl_type, &ctx, zone)) {
packet->md.ct_state = CS_INVALID;
write_ct_md(packet, zone, NULL, NULL, NULL);
continue;
@@ -1327,6 +1335,8 @@ conntrack_execute(struct conntrack *ct, struct dp_packet_batch *pkt_batch,
setlabel, nat_action_info, tp_src, tp_dst, helper);
}

ipf_postprocess_conntrack(ct->ipf, pkt_batch, now, dl_type);

return 0;
}
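
The `conntrack_execute()` change above wraps the existing per-packet loop between a pre-stage and a post-stage, and the loop now skips any packet already marked `CS_INVALID` before attempting key extraction. The toy model below sketches only that control flow. Every `toy_*` name is hypothetical, the `malformed_fragment` flag is merely a stand-in for whatever the real pre-stage rejects, and the internals of `ipf_preprocess_conntrack()` and `ipf_postprocess_conntrack()` are not shown in this excerpt, so they are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_state { TOY_NONE, TOY_INVALID, TOY_TRACKED };

struct toy_packet {
    bool malformed_fragment;  /* Stand-in for a packet the pre-stage rejects. */
    bool key_ok;              /* Stand-in for conn_key_extract() succeeding. */
    enum toy_state state;
};

/* Pre-stage: may mark packets invalid before the tracker sees them. */
static inline void
toy_preprocess(struct toy_packet *pkts, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (pkts[i].malformed_fragment) {
            pkts[i].state = TOY_INVALID;
        }
    }
}

/* Runs the batch through the toy tracker and returns how many packets were
 * tracked.  Packets that are already invalid, or whose key cannot be
 * extracted, are marked invalid and skipped, as in the patched loop. */
static inline size_t
toy_execute(struct toy_packet *pkts, size_t n)
{
    size_t n_tracked = 0;

    toy_preprocess(pkts, n);

    for (size_t i = 0; i < n; i++) {
        if (pkts[i].state == TOY_INVALID || !pkts[i].key_ok) {
            pkts[i].state = TOY_INVALID;
            continue;
        }
        pkts[i].state = TOY_TRACKED;
        n_tracked++;
    }

    /* A post-stage would run here, mirroring the position of
     * ipf_postprocess_conntrack(); its behaviour is not modelled. */
    return n_tracked;
}
```

Checking the invalid state first means a packet rejected by the pre-stage never reaches key extraction, which is the short-circuit order the patched condition uses.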

@@ -2484,6 +2494,12 @@ conn_to_ct_dpif_entry(const struct conn *conn, struct ct_dpif_entry *entry,
}
}

struct ipf *
conntrack_ipf_ctx(struct conntrack *ct)
{
return ct->ipf;
}

int
conntrack_dump_start(struct conntrack *ct, struct conntrack_dump *dump,
const uint16_t *pzone, int *ptot_bkts)
6 changes: 5 additions & 1 deletion lib/conntrack.h
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2015, 2016, 2017 Nicira, Inc.
* Copyright (c) 2015, 2016, 2017, 2019 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -122,6 +122,7 @@ int conntrack_flush_tuple(struct conntrack *, const struct ct_dpif_tuple *,
int conntrack_set_maxconns(struct conntrack *ct, uint32_t maxconns);
int conntrack_get_maxconns(struct conntrack *ct, uint32_t *maxconns);
int conntrack_get_nconns(struct conntrack *ct, uint32_t *nconns);
struct ipf *conntrack_ipf_ctx(struct conntrack *ct);

/* 'struct ct_lock' is a wrapper for an adaptive mutex. It's useful to try
* different types of locks (e.g. spinlocks) */
@@ -293,6 +294,9 @@ struct conntrack {
*/
struct ct_rwlock resources_lock;

/* Fragmentation handling context. */
struct ipf *ipf;

};

#endif /* conntrack.h */
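
Taken together, the `conntrack.c` and `conntrack.h` hunks give the fragmentation context a simple ownership rule: `conntrack_init()` creates it, `conntrack_destroy()` releases it, and `conntrack_ipf_ctx()` hands out a borrowed pointer. The sketch below illustrates that lifecycle with stand-in types. All `toy_*` names, the `max_nfrags` field, and its default value are assumptions made for the example, since `lib/ipf.c` is not part of this excerpt.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the fragmentation context; the real 'struct ipf' is only
 * forward-declared in the headers shown here. */
struct toy_ipf {
    unsigned int max_nfrags;  /* Hypothetical setting. */
};

struct toy_conntrack {
    struct toy_ipf *ipf;      /* Owned by the tracker. */
};

static inline struct toy_ipf *
toy_ipf_init(void)
{
    struct toy_ipf *ipf = calloc(1, sizeof *ipf);
    if (ipf) {
        ipf->max_nfrags = 1000;  /* Arbitrary value for illustration. */
    }
    return ipf;
}

static inline void
toy_ipf_destroy(struct toy_ipf *ipf)
{
    free(ipf);
}

/* The tracker creates the context once, at initialization. */
static inline void
toy_conntrack_init(struct toy_conntrack *ct)
{
    ct->ipf = toy_ipf_init();
}

/* The tracker releases the context when it is destroyed. */
static inline void
toy_conntrack_destroy(struct toy_conntrack *ct)
{
    toy_ipf_destroy(ct->ipf);
    ct->ipf = NULL;
}

/* Accessor: callers borrow the context and must not free it. */
static inline struct toy_ipf *
toy_conntrack_ipf_ctx(struct toy_conntrack *ct)
{
    return ct->ipf;
}
```

An accessor of this kind lets code outside the tracker reach the context without knowing the layout of the tracker structure.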
58 changes: 57 additions & 1 deletion lib/ct-dpif.c
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2015 Nicira, Inc.
* Copyright (c) 2015, 2018 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -194,6 +194,62 @@ ct_dpif_del_limits(struct dpif *dpif, const struct ovs_list *zone_limits)
: EOPNOTSUPP);
}

int
ct_dpif_ipf_set_enabled(struct dpif *dpif, bool v6, bool enable)
{
return (dpif->dpif_class->ipf_set_enabled
? dpif->dpif_class->ipf_set_enabled(dpif, v6, enable)
: EOPNOTSUPP);
}

int
ct_dpif_ipf_set_min_frag(struct dpif *dpif, bool v6, uint32_t min_frag)
{
return (dpif->dpif_class->ipf_set_min_frag
? dpif->dpif_class->ipf_set_min_frag(dpif, v6, min_frag)
: EOPNOTSUPP);
}

int
ct_dpif_ipf_set_max_nfrags(struct dpif *dpif, uint32_t max_frags)
{
return (dpif->dpif_class->ipf_set_max_nfrags
? dpif->dpif_class->ipf_set_max_nfrags(dpif, max_frags)
: EOPNOTSUPP);
}

int ct_dpif_ipf_get_status(struct dpif *dpif,
struct dpif_ipf_status *dpif_ipf_status)
{
return (dpif->dpif_class->ipf_get_status
? dpif->dpif_class->ipf_get_status(dpif, dpif_ipf_status)
: EOPNOTSUPP);
}

int
ct_dpif_ipf_dump_start(struct dpif *dpif, struct ipf_dump_ctx **dump_ctx)
{
return (dpif->dpif_class->ipf_dump_start
? dpif->dpif_class->ipf_dump_start(dpif, dump_ctx)
: EOPNOTSUPP);
}

int
ct_dpif_ipf_dump_next(struct dpif *dpif, void *dump_ctx, char **dump)
{
return (dpif->dpif_class->ipf_dump_next
? dpif->dpif_class->ipf_dump_next(dpif, dump_ctx, dump)
: EOPNOTSUPP);
}

int
ct_dpif_ipf_dump_done(struct dpif *dpif, void *dump_ctx)
{
return (dpif->dpif_class->ipf_dump_done
? dpif->dpif_class->ipf_dump_done(dpif, dump_ctx)
: EOPNOTSUPP);
}

void
ct_dpif_entry_uninit(struct ct_dpif_entry *entry)
{
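
Each `ct_dpif_ipf_*()` wrapper added in this hunk follows the same shape: call the datapath class's function pointer if it is set, otherwise return `EOPNOTSUPP`. The self-contained sketch below reproduces that optional-callback dispatch with stand-in types. The `toy_*` structures and the two example classes are hypothetical and much smaller than the real `struct dpif` and `struct dpif_class`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_dpif;

/* Stand-in for a datapath class: a table of optional callbacks. */
struct toy_dpif_class {
    /* NULL when the datapath does not implement fragmentation control. */
    int (*ipf_set_enabled)(struct toy_dpif *, bool v6, bool enable);
};

struct toy_dpif {
    const struct toy_dpif_class *dpif_class;
    bool v4_enabled;
    bool v6_enabled;
};

/* Example provider that implements the callback. */
static inline int
toy_userspace_ipf_set_enabled(struct toy_dpif *dpif, bool v6, bool enable)
{
    if (v6) {
        dpif->v6_enabled = enable;
    } else {
        dpif->v4_enabled = enable;
    }
    return 0;
}

static const struct toy_dpif_class toy_userspace_class = {
    toy_userspace_ipf_set_enabled,
};

/* Example provider that leaves the callback unset. */
static const struct toy_dpif_class toy_other_class = {
    NULL,
};

/* Wrapper in the style of ct_dpif_ipf_set_enabled(): dispatch if the
 * callback exists, otherwise report that the operation is unsupported. */
static inline int
toy_ct_dpif_ipf_set_enabled(struct toy_dpif *dpif, bool v6, bool enable)
{
    return (dpif->dpif_class->ipf_set_enabled
            ? dpif->dpif_class->ipf_set_enabled(dpif, v6, enable)
            : EOPNOTSUPP);
}
```

With this pattern, callers get a uniform error code from datapaths that lack the feature and never have to test the function pointer themselves.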
12 changes: 11 additions & 1 deletion lib/ct-dpif.h
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2015 Nicira, Inc.
* Copyright (c) 2015, 2018 Nicira, Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -186,6 +186,8 @@ enum {
};

struct dpif;
struct dpif_ipf_status;
struct ipf_dump_ctx;

struct ct_dpif_dump_state {
struct dpif *dpif;
@@ -212,6 +214,14 @@ int ct_dpif_set_limits(struct dpif *dpif, const uint32_t *default_limit,
int ct_dpif_get_limits(struct dpif *dpif, uint32_t *default_limit,
const struct ovs_list *, struct ovs_list *);
int ct_dpif_del_limits(struct dpif *dpif, const struct ovs_list *);
int ct_dpif_ipf_set_enabled(struct dpif *, bool v6, bool enable);
int ct_dpif_ipf_set_min_frag(struct dpif *, bool v6, uint32_t min_frag);
int ct_dpif_ipf_set_max_nfrags(struct dpif *, uint32_t max_frags);
int ct_dpif_ipf_get_status(struct dpif *dpif,
struct dpif_ipf_status *dpif_ipf_status);
int ct_dpif_ipf_dump_start(struct dpif *dpif, struct ipf_dump_ctx **);
int ct_dpif_ipf_dump_next(struct dpif *dpif, void *, char **);
int ct_dpif_ipf_dump_done(struct dpif *dpif, void *);
void ct_dpif_entry_uninit(struct ct_dpif_entry *);
void ct_dpif_format_entry(const struct ct_dpif_entry *, struct ds *,
bool verbose, bool print_stats);