userspace: Handling of versatile tunnel ports
In netdev_gre_build_header() and netdev_vxlan_build_header(), the GRE
protocol and the VXLAN-GPE next_protocol are now set based on the
packet_type of the flow. For an Ethernet packet, the GRE protocol is set to
ETH_TYPE_TEB. Otherwise, if the name space is OFPHTN_ETHERTYPE, it is set
according to the name space type. Any other packet type makes the header
build fail.

Signed-off-by: Jan Scheurich <[email protected]>
Signed-off-by: Ben Pfaff <[email protected]>
blp committed Jun 27, 2017
1 parent 3d4b2e6 commit 875ab13
Showing 18 changed files with 280 additions and 102 deletions.
6 changes: 2 additions & 4 deletions NEWS
@@ -59,11 +59,9 @@ Post-v2.7.0
      * OVN services are no longer restarted automatically after upgrade.
    - Add --cleanup option to command 'ovs-appctl exit' (see ovs-vswitchd(8)).
    - L3 tunneling:
-     * Add "layer3" options for tunnel ports that support non-Ethernet (L3)
-       payload (GRE, VXLAN-GPE).
+     * Use new tunnel port option "packet_type" to configure L2 vs. L3.
      * New vxlan tunnel extension "gpe" to support VXLAN-GPE tunnels.
-     * Transparently pop and push Ethernet headers at transmit/reception
-       of packets to/from L3 tunnels.
+     * New support for non-Ethernet (L3) payloads in GRE and VXLAN-GPE.
    - The BFD detection multiplier is now user-configurable.
    - Add experimental support for hardware offloading
      * HW offloading is disabled by default.
30 changes: 15 additions & 15 deletions lib/meta-flow.xml
@@ -26,19 +26,27 @@
       networking technology in use are called <dfn>root fields</dfn>.
       Open vSwitch 2.7 and earlier considered Ethernet fields to be root fields,
       and this remains the default mode of operation for Open vSwitch bridges.
-      In this mode, when a packet is received from a non-Ethernet interfaces,
-      such as a layer-3 LISP or GRE tunnel, Open vSwitch force-fits it to this
+      When a packet is received from a non-Ethernet interface, such as a layer-3
+      LISP tunnel, Open vSwitch 2.7 and earlier force-fit the packet to this
       Ethernet-centric point of view by pretending that an Ethernet header is
       present whose Ethernet type indicates the packet's actual type (and
       whose source and destination addresses are all-zero).
     </p>
 
     <p>
-      Open vSwitch 2.8 and later supports the ``packet type-aware pipeline''
-      concept introduced in OpenFlow 1.5. A bridge configured to be packet
-      type-aware can handle packets of multiple networking technologies, such as
-      Ethernet, IP, ARP, MPLS, or NSH in parallel. Such a bridge does not have
-      any root fields.
+      Open vSwitch 2.8 and later implement the ``packet type-aware pipeline''
+      concept introduced in OpenFlow 1.5. Such a pipeline does not have any root
+      fields. Instead, a new metadata field, <ref field="packet_type"/>,
+      indicates the basic type of the packet, which can be Ethernet, IPv4, IPv6,
+      or another type. For backward compatibility, by default Open vSwitch 2.8
+      imitates the behavior of Open vSwitch 2.7 and earlier. Later versions of
+      Open vSwitch may change the default, and in the meantime controllers can
+      turn off this legacy behavior, on a port-by-port basis, by setting
+      <code>options:packet_type</code> to <code>ptap</code> in the
+      <code>Interface</code> table. This is significant only for ports that can
+      handle non-Ethernet packets, which is currently just LISP, VXLAN-GPE, and
+      GRE tunnel ports. See <code>ovs-vswitchd.conf.db</code>(5) for more
+      information.
     </p>
 
     <p>
@@ -332,14 +340,6 @@ tcp,tp_src=0x07c0/0xfff0
       <dt><code>mplsm</code></dt> <dd><code>eth_type=0x8848</code></dd>
     </dl>
 
-    <p>
-      These shorthand notations continue to work in packet type-aware bridges.
-      The absence of a packet_type match implies
-      <code>packet_type=ethernet</code>, so that shorthands match on Ethernet
-      packets with the implied eth_type. Please note that the shorthand
-      <code>ip</code> does not match packets of packet_type (1,0x800) for IPv4.
-    </p>
-
 
     <h2>Evolution of OpenFlow Fields</h2>
 
1 change: 1 addition & 0 deletions lib/netdev-bsd.c
@@ -1517,6 +1517,7 @@ netdev_bsd_update_flags(struct netdev *netdev_, enum netdev_flags off,
     \
     GET_FEATURES, \
     NULL, /* set_advertisement */ \
+    NULL, /* get_pt_mode */ \
     NULL, /* set_policing */ \
     NULL, /* get_qos_type */ \
     NULL, /* get_qos_capabilities */ \
1 change: 1 addition & 0 deletions lib/netdev-dpdk.c
@@ -3276,6 +3276,7 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
     GET_STATS, \
     GET_FEATURES, \
     NULL, /* set_advertisements */ \
+    NULL, /* get_pt_mode */ \
     \
     netdev_dpdk_set_policing, \
     netdev_dpdk_get_qos_types, \
1 change: 1 addition & 0 deletions lib/netdev-dummy.c
@@ -1382,6 +1382,7 @@ netdev_dummy_update_flags(struct netdev *netdev_,
     \
     NULL, /* get_features */ \
     NULL, /* set_advertisements */ \
+    NULL, /* get_pt_mode */ \
     \
     NULL, /* set_policing */ \
     NULL, /* get_qos_types */ \
1 change: 1 addition & 0 deletions lib/netdev-linux.c
@@ -2843,6 +2843,7 @@ netdev_linux_update_flags(struct netdev *netdev_, enum netdev_flags off,
     \
     GET_FEATURES, \
     netdev_linux_set_advertisements, \
+    NULL, /* get_pt_mode */ \
     \
     netdev_linux_set_policing, \
     netdev_linux_get_qos_types, \
23 changes: 17 additions & 6 deletions lib/netdev-native-tnl.c
@@ -463,10 +463,13 @@ netdev_gre_build_header(const struct netdev *netdev,
 
     greh = netdev_tnl_ip_build_header(data, params, IPPROTO_GRE);
 
-    if (tnl_cfg->is_layer3) {
-        greh->protocol = params->flow->dl_type;
-    } else {
+    if (params->flow->packet_type == htonl(PT_ETH)) {
         greh->protocol = htons(ETH_TYPE_TEB);
+    } else if (pt_ns(params->flow->packet_type) == OFPHTN_ETHERTYPE) {
+        greh->protocol = pt_ns_type_be(params->flow->packet_type);
+    } else {
+        ovs_mutex_unlock(&dev->mutex);
+        return 1;
     }
     greh->flags = 0;
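
The GRE protocol selection in the hunk above can be sketched in isolation. This is a minimal illustration, not the Open vSwitch code: it assumes a packet_type laid out as a 16-bit name space in the upper half and a 16-bit name-space type in the lower half, uses host byte order throughout for readability (the real code works on network-byte-order fields), and every `sketch_` name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-order stand-ins for the OVS constants. */
#define SKETCH_NS_ETHERTYPE 1u          /* Plays the role of OFPHTN_ETHERTYPE. */
#define SKETCH_PT_ETH       0x00000000u /* Name space 0, type 0: Ethernet. */
#define SKETCH_ETH_TYPE_TEB 0x6558u     /* Transparent Ethernet Bridging. */

/* Upper 16 bits of a packet_type: the name space. */
static uint16_t
sketch_pt_ns(uint32_t packet_type)
{
    return (uint16_t) (packet_type >> 16);
}

/* Lower 16 bits of a packet_type: the type within the name space. */
static uint16_t
sketch_pt_ns_type(uint32_t packet_type)
{
    return (uint16_t) (packet_type & 0xffff);
}

/* Stores the GRE protocol for 'packet_type' in '*proto' and returns true,
 * or returns false if the packet type cannot be carried in GRE. */
static bool
sketch_gre_protocol(uint32_t packet_type, uint16_t *proto)
{
    if (packet_type == SKETCH_PT_ETH) {
        *proto = SKETCH_ETH_TYPE_TEB;
        return true;
    } else if (sketch_pt_ns(packet_type) == SKETCH_NS_ETHERTYPE) {
        *proto = sketch_pt_ns_type(packet_type);
        return true;
    }
    return false;
}
```

With this layout, an IPv4 packet of packet_type (1,0x800) yields GRE protocol 0x0800, while an Ethernet packet yields ETH_TYPE_TEB.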

@@ -575,8 +578,10 @@ netdev_vxlan_build_header(const struct netdev *netdev,
         put_16aligned_be32(&vxh->vx_flags, htonl(VXLAN_FLAGS | VXLAN_HF_GPE));
         put_16aligned_be32(&vxh->vx_vni,
                            htonl(ntohll(params->flow->tunnel.tun_id) << 8));
-        if (tnl_cfg->is_layer3) {
-            switch (ntohs(params->flow->dl_type)) {
+        if (params->flow->packet_type == htonl(PT_ETH)) {
+            vxh->vx_gpe.next_protocol = VXLAN_GPE_NP_ETHERNET;
+        } else if (pt_ns(params->flow->packet_type) == OFPHTN_ETHERTYPE) {
+            switch (pt_ns_type(params->flow->packet_type)) {
             case ETH_TYPE_IP:
                 vxh->vx_gpe.next_protocol = VXLAN_GPE_NP_IPV4;
                 break;
@@ -586,9 +591,11 @@ netdev_vxlan_build_header(const struct netdev *netdev,
             case ETH_TYPE_TEB:
                 vxh->vx_gpe.next_protocol = VXLAN_GPE_NP_ETHERNET;
                 break;
+            default:
+                goto drop;
             }
         } else {
-            vxh->vx_gpe.next_protocol = VXLAN_GPE_NP_ETHERNET;
+            goto drop;
         }
     } else {
         put_16aligned_be32(&vxh->vx_flags, htonl(VXLAN_FLAGS));
@@ -600,6 +607,10 @@ netdev_vxlan_build_header(const struct netdev *netdev,
     data->header_len += sizeof *vxh;
     data->tnl_type = OVS_VPORT_TYPE_VXLAN;
     return 0;
+
+drop:
+    ovs_mutex_unlock(&dev->mutex);
+    return 1;
 }
 
 struct dp_packet *
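
The VXLAN-GPE mapping above follows the same pattern as the GRE case, except that only a few Ethertypes have a next_protocol code. A minimal sketch under stated assumptions: host byte order, hypothetical `gpe_` names, the next-protocol codes as assigned in the VXLAN-GPE draft (0x01 IPv4, 0x02 IPv6, 0x03 Ethernet), and an IPv6 case that is assumed because the diff elides the lines between the two hunks.

```c
#include <stdint.h>

/* Hypothetical host-order stand-ins for the OVS constants. */
#define GPE_NS_ETHERTYPE 1u          /* Plays the role of OFPHTN_ETHERTYPE. */
#define GPE_PT_ETH       0x00000000u /* Name space 0, type 0: Ethernet. */

enum {
    GPE_NP_IPV4 = 0x01,
    GPE_NP_IPV6 = 0x02,
    GPE_NP_ETHERNET = 0x03
};

/* Returns the VXLAN-GPE next_protocol for 'packet_type', or -1 if the
 * packet cannot be encapsulated and should be dropped. */
static int
gpe_next_protocol(uint32_t packet_type)
{
    if (packet_type == GPE_PT_ETH) {
        return GPE_NP_ETHERNET;
    }
    if ((packet_type >> 16) != GPE_NS_ETHERTYPE) {
        return -1;              /* Unknown name space: drop. */
    }
    switch (packet_type & 0xffff) {
    case 0x0800:                /* ETH_TYPE_IP */
        return GPE_NP_IPV4;
    case 0x86dd:                /* ETH_TYPE_IPV6 (assumed, see lead-in) */
        return GPE_NP_IPV6;
    case 0x6558:                /* ETH_TYPE_TEB */
        return GPE_NP_ETHERNET;
    default:
        return -1;              /* No next_protocol code: drop. */
    }
}
```

The notable change relative to the old code is the failure path: an unmappable packet type is now rejected rather than silently labelled as Ethernet.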
6 changes: 6 additions & 0 deletions lib/netdev-provider.h
@@ -474,6 +474,12 @@ struct netdev_class {
     int (*set_advertisements)(struct netdev *netdev,
                               enum netdev_features advertise);
 
+    /* Returns 'netdev''s configured packet_type mode.
+     *
+     * This function may be set to null if it would always return
+     * NETDEV_PT_LEGACY_L2. */
+    enum netdev_pt_mode (*get_pt_mode)(const struct netdev *netdev);
+
     /* Attempts to set input rate limiting (policing) policy, such that up to
      * 'kbits_rate' kbps of traffic is accepted, with a maximum accumulative
      * burst size of 'kbits' kb.
107 changes: 76 additions & 31 deletions lib/netdev-vport.c
@@ -98,18 +98,6 @@ netdev_vport_is_patch(const struct netdev *netdev)
     return class->get_config == get_patch_config;
 }
 
-bool
-netdev_vport_is_layer3(const struct netdev *dev)
-{
-    if (is_vport_class(netdev_get_class(dev))) {
-        struct netdev_vport *vport = netdev_vport_cast(dev);
-
-        return vport->tnl_cfg.is_layer3;
-    }
-
-    return false;
-}
-
 static bool
 netdev_vport_needs_dst_port(const struct netdev *dev)
 {
@@ -407,23 +395,45 @@ parse_tunnel_ip(const char *value, bool accept_mcast, bool *flow,
     return 0;
 }
 
+enum tunnel_layers {
+    TNL_L2 = 1 << 0,       /* 1 if a tunnel type can carry Ethernet traffic. */
+    TNL_L3 = 1 << 1        /* 1 if a tunnel type can carry L3 traffic. */
+};
+static enum tunnel_layers
+tunnel_supported_layers(const char *type,
+                        const struct netdev_tunnel_config *tnl_cfg)
+{
+    if (!strcmp(type, "lisp")) {
+        return TNL_L3;
+    } else if (!strcmp(type, "gre")) {
+        return TNL_L2 | TNL_L3;
+    } else if (!strcmp(type, "vxlan") && tnl_cfg->exts & (1 << OVS_VXLAN_EXT_GPE)) {
+        return TNL_L2 | TNL_L3;
+    } else {
+        return TNL_L2;
+    }
+}
+static enum netdev_pt_mode
+default_pt_mode(enum tunnel_layers layers)
+{
+    return layers == TNL_L3 ? NETDEV_PT_LEGACY_L3 : NETDEV_PT_LEGACY_L2;
+}
+
 static int
 set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
 {
     struct netdev_vport *dev = netdev_vport_cast(dev_);
     const char *name = netdev_get_name(dev_);
     const char *type = netdev_get_type(dev_);
     struct ds errors = DS_EMPTY_INITIALIZER;
-    bool needs_dst_port, has_csum, optional_layer3;
+    bool needs_dst_port, has_csum;
     uint16_t dst_proto = 0, src_proto = 0;
     struct netdev_tunnel_config tnl_cfg;
     struct smap_node *node;
-    bool is_layer3 = false;
     int err;
 
     has_csum = strstr(type, "gre") || strstr(type, "geneve") ||
                strstr(type, "stt") || strstr(type, "vxlan");
-    optional_layer3 = !strcmp(type, "gre");
     memset(&tnl_cfg, 0, sizeof tnl_cfg);
 
     /* Add a default destination port for tunnel ports if none specified. */
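
The two helpers added in this hunk amount to a small capability table plus a default rule. The sketch below restates them without the OVS types. The names (`SK_L2`, `sk_supported_layers`, and so on) are hypothetical, and the GPE extension is passed as a plain boolean rather than tested as a bit in `tnl_cfg->exts`.

```c
#include <stdbool.h>
#include <string.h>

/* Bitmask of payload layers a tunnel type can carry (hypothetical names). */
enum sk_layers {
    SK_L2 = 1 << 0,             /* Can carry Ethernet frames. */
    SK_L3 = 1 << 1              /* Can carry bare L3 packets. */
};

enum sk_pt_mode {
    SK_PT_LEGACY_L2,
    SK_PT_LEGACY_L3,
    SK_PT_AWARE
};

/* Returns the layers supported by tunnel 'type'.  'vxlan_gpe' says whether
 * the GPE extension is enabled; it only matters for "vxlan". */
static int
sk_supported_layers(const char *type, bool vxlan_gpe)
{
    if (!strcmp(type, "lisp")) {
        return SK_L3;
    } else if (!strcmp(type, "gre")) {
        return SK_L2 | SK_L3;
    } else if (!strcmp(type, "vxlan") && vxlan_gpe) {
        return SK_L2 | SK_L3;
    }
    return SK_L2;
}

/* Only an L3-only tunnel defaults to legacy L3; everything else, including
 * tunnels that can carry both layers, defaults to legacy L2. */
static enum sk_pt_mode
sk_default_pt_mode(int layers)
{
    return layers == SK_L3 ? SK_PT_LEGACY_L3 : SK_PT_LEGACY_L2;
}
```

Comparing with `==` rather than testing a bit is what keeps GRE and VXLAN-GPE on the legacy L2 default even though they can also carry L3 payloads.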
@@ -437,7 +447,6 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
 
     if (!strcmp(type, "lisp")) {
         tnl_cfg.dst_port = htons(LISP_DST_PORT);
-        tnl_cfg.is_layer3 = true;
     }
 
     if (!strcmp(type, "stt")) {
@@ -501,9 +510,10 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
             }
         } else if (!strcmp(node->key, "key") ||
                    !strcmp(node->key, "in_key") ||
-                   !strcmp(node->key, "out_key")) {
+                   !strcmp(node->key, "out_key") ||
+                   !strcmp(node->key, "packet_type")) {
             /* Handled separately below. */
-        } else if (!strcmp(node->key, "exts")) {
+        } else if (!strcmp(node->key, "exts") && !strcmp(type, "vxlan")) {
             char *str = xstrdup(node->value);
             char *ext, *save_ptr = NULL;
 
@@ -515,7 +525,6 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
                     tnl_cfg.exts |= (1 << OVS_VXLAN_EXT_GBP);
                 } else if (!strcmp(type, "vxlan") && !strcmp(ext, "gpe")) {
                     tnl_cfg.exts |= (1 << OVS_VXLAN_EXT_GPE);
-                    optional_layer3 = true;
                 } else {
                     ds_put_format(&errors, "%s: unknown extension '%s'\n",
                                   name, ext);
@@ -528,21 +537,44 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
         } else if (!strcmp(node->key, "egress_pkt_mark")) {
             tnl_cfg.egress_pkt_mark = strtoul(node->value, NULL, 10);
             tnl_cfg.set_egress_pkt_mark = true;
-        } else if (!strcmp(node->key, "layer3")) {
-            if (!strcmp(node->value, "true")) {
-                is_layer3 = true;
-            }
         } else {
             ds_put_format(&errors, "%s: unknown %s argument '%s'\n", name,
                           type, node->key);
         }
     }
 
-    if (optional_layer3 && is_layer3) {
-        tnl_cfg.is_layer3 = is_layer3;
-    } else if (!optional_layer3 && is_layer3) {
-        ds_put_format(&errors, "%s: unknown %s argument '%s'\n",
-                      name, type, "layer3");
+    enum tunnel_layers layers = tunnel_supported_layers(type, &tnl_cfg);
+    const char *full_type = (strcmp(type, "vxlan") ? type
+                             : tnl_cfg.exts & (1 << OVS_VXLAN_EXT_GPE) ? "VXLAN-GPE"
+                             : "VXLAN (without GPE)");
+    const char *packet_type = smap_get(args, "packet_type");
+    if (!packet_type) {
+        tnl_cfg.pt_mode = default_pt_mode(layers);
+    } else if (!strcmp(packet_type, "legacy_l2")) {
+        tnl_cfg.pt_mode = NETDEV_PT_LEGACY_L2;
+        if (!(layers & TNL_L2)) {
+            ds_put_format(&errors, "%s: legacy_l2 configured on %s tunnel "
+                          "that cannot carry L2 traffic\n",
+                          name, full_type);
+            err = EINVAL;
+            goto out;
+        }
+    } else if (!strcmp(packet_type, "legacy_l3")) {
+        tnl_cfg.pt_mode = NETDEV_PT_LEGACY_L3;
+        if (!(layers & TNL_L3)) {
+            ds_put_format(&errors, "%s: legacy_l3 configured on %s tunnel "
+                          "that cannot carry L3 traffic\n",
+                          name, full_type);
+            err = EINVAL;
+            goto out;
+        }
+    } else if (!strcmp(packet_type, "ptap")) {
+        tnl_cfg.pt_mode = NETDEV_PT_AWARE;
+    } else {
+        ds_put_format(&errors, "%s: unknown packet_type '%s'\n",
+                      name, packet_type);
+        err = EINVAL;
+        goto out;
     }
 
     if (!ipv6_addr_is_set(&tnl_cfg.ipv6_dst) && !tnl_cfg.ip_dst_flow) {
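
The validation of the `packet_type` option in this hunk reduces to a short decision procedure: an absent option selects the default, `legacy_l2` and `legacy_l3` are accepted only if the tunnel can carry that layer, `ptap` is always accepted, and anything else is an error. A minimal sketch, assuming hypothetical `cfg_` names and omitting the error-message formatting:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

enum cfg_layers {
    CFG_L2 = 1 << 0,            /* Tunnel can carry Ethernet frames. */
    CFG_L3 = 1 << 1             /* Tunnel can carry bare L3 packets. */
};

enum cfg_pt_mode {
    CFG_PT_LEGACY_L2,
    CFG_PT_LEGACY_L3,
    CFG_PT_AWARE
};

/* Parses the packet_type option 'value' (NULL if the option is absent) for
 * a tunnel supporting 'layers'.  On success, stores the mode in '*mode' and
 * returns 0.  On failure, returns EINVAL and leaves '*mode' untouched. */
static int
cfg_parse_packet_type(const char *value, int layers, enum cfg_pt_mode *mode)
{
    if (!value) {
        /* Default: legacy L3 only for L3-only tunnels. */
        *mode = layers == CFG_L3 ? CFG_PT_LEGACY_L3 : CFG_PT_LEGACY_L2;
    } else if (!strcmp(value, "legacy_l2")) {
        if (!(layers & CFG_L2)) {
            return EINVAL;      /* Tunnel cannot carry L2 traffic. */
        }
        *mode = CFG_PT_LEGACY_L2;
    } else if (!strcmp(value, "legacy_l3")) {
        if (!(layers & CFG_L3)) {
            return EINVAL;      /* Tunnel cannot carry L3 traffic. */
        }
        *mode = CFG_PT_LEGACY_L3;
    } else if (!strcmp(value, "ptap")) {
        *mode = CFG_PT_AWARE;
    } else {
        return EINVAL;          /* Unknown packet_type value. */
    }
    return 0;
}
```

Unlike the code in the diff, the sketch checks the layer before assigning the mode, so a rejected value never leaves a partially updated configuration behind. That ordering is a choice made for the illustration, not something the commit does.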
@@ -675,9 +707,12 @@ get_tunnel_config(const struct netdev *dev, struct smap *args)
         smap_add(args, "csum", "true");
     }
 
-    if (tnl_cfg.is_layer3 && (!strcmp("gre", type) ||
-                              !strcmp("vxlan", type))) {
-        smap_add(args, "layer3", "true");
+    enum tunnel_layers layers = tunnel_supported_layers(type, &tnl_cfg);
+    if (tnl_cfg.pt_mode != default_pt_mode(layers)) {
+        smap_add(args, "packet_type",
+                 tnl_cfg.pt_mode == NETDEV_PT_LEGACY_L2 ? "legacy_l2"
+                 : tnl_cfg.pt_mode == NETDEV_PT_LEGACY_L3 ? "legacy_l3"
+                 : "ptap");
     }
 
     if (!tnl_cfg.dont_fragment) {
@@ -809,6 +844,15 @@ get_stats(const struct netdev *netdev, struct netdev_stats *stats)
     return 0;
 }
 
+static enum netdev_pt_mode
+get_pt_mode(const struct netdev *netdev)
+{
+    struct netdev_vport *dev = netdev_vport_cast(netdev);
+
+    return dev->tnl_cfg.pt_mode;
+}
+
+
 
 #ifdef __linux__
 static int
@@ -873,6 +917,7 @@ netdev_vport_get_ifindex(const struct netdev *netdev_)
     \
     NULL, /* get_features */ \
     NULL, /* set_advertisements */ \
+    get_pt_mode, \
     \
     NULL, /* set_policing */ \
     NULL, /* get_qos_types */ \
1 change: 0 additions & 1 deletion lib/netdev-vport.h
@@ -31,7 +31,6 @@ void netdev_vport_tunnel_register(void);
 void netdev_vport_patch_register(void);
 
 bool netdev_vport_is_patch(const struct netdev *);
-bool netdev_vport_is_layer3(const struct netdev *);
 
 char *netdev_vport_patch_peer(const struct netdev *netdev);
 
8 changes: 8 additions & 0 deletions lib/netdev.c
@@ -727,6 +727,14 @@ netdev_set_tx_multiq(struct netdev *netdev, unsigned int n_txq)
     return error;
 }
 
+enum netdev_pt_mode
+netdev_get_pt_mode(const struct netdev *netdev)
+{
+    return (netdev->netdev_class->get_pt_mode
+            ? netdev->netdev_class->get_pt_mode(netdev)
+            : NETDEV_PT_LEGACY_L2);
+}
+
 /* Sends 'batch' on 'netdev'.  Returns 0 if successful (for every packet),
  * otherwise a positive errno value.  Returns EAGAIN without blocking if
  * at least one the packets cannot be queued immediately.  Returns EMSGSIZE
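
netdev_get_pt_mode() relies on a common provider-table idiom: a class may leave the callback null, and the generic wrapper then supplies the documented default. This is why every non-tunnel netdev class in the commit only gains a `NULL, /* get_pt_mode */` entry. A self-contained sketch of the idiom, with hypothetical `demo_` types standing in for `struct netdev` and `struct netdev_class`:

```c
#include <stddef.h>

enum demo_pt_mode {
    DEMO_PT_LEGACY_L2,
    DEMO_PT_LEGACY_L3,
    DEMO_PT_AWARE
};

struct demo_netdev;

/* Provider table: a null 'get_pt_mode' means "always legacy L2". */
struct demo_class {
    enum demo_pt_mode (*get_pt_mode)(const struct demo_netdev *);
};

struct demo_netdev {
    const struct demo_class *class_;
    enum demo_pt_mode configured;   /* Only meaningful for tunnel ports. */
};

/* Callback that a tunnel-like class would install. */
static enum demo_pt_mode
demo_tunnel_get_pt_mode(const struct demo_netdev *dev)
{
    return dev->configured;
}

/* Generic wrapper: calls the provider if present, else uses the default. */
static enum demo_pt_mode
demo_get_pt_mode(const struct demo_netdev *dev)
{
    return (dev->class_->get_pt_mode
            ? dev->class_->get_pt_mode(dev)
            : DEMO_PT_LEGACY_L2);
}
```

Callers never test the function pointer themselves, so a class that gains packet-type awareness later only has to fill in its table entry.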