openvswitch: Userspace tunneling.
This patch adds support for userspace tunneling. Tunneling needs three
more components. The first is a routing table, which is populated by
caching kernel routes. The second is an ARP cache, which is built
automatically by snooping ARP traffic. The third is a tunnel protocol
table, which lists all listening protocols and is populated by vswitchd
as tunnel ports are added. This patch adds GRE and VXLAN protocol support.

Tunneling works as follows:
On packet receive, vswitchd checks whether the packet is targeted at a
tunnel port. If it is, vswitchd inserts a tunnel pop action, which pops
the header and sends the packet to the tunnel port.
On packet transmit, rather than generating a set tunnel action, vswitchd
generates a tunnel push action, which carries the tunnel header data. The
datapath can use the tunnel-push action data to generate the header for
each packet and forward the packet to the output port. Since the
tunnel-push action contains most of the packet header, vswitchd needs to
look up the routing table and the ARP table to build this action.

Signed-off-by: Pravin B Shelar <[email protected]>
Acked-by: Jarno Rajahalme <[email protected]>
Acked-by: Thomas Graf <[email protected]>
Acked-by: Ben Pfaff <[email protected]>
Pravin B Shelar committed Nov 12, 2014
1 parent 0746a84 commit a36de77
Showing 46 changed files with 2,144 additions and 77 deletions.
1 change: 1 addition & 0 deletions Makefile.am
@@ -85,6 +85,7 @@ EXTRA_DIST = \
PORTING.md \
README.md \
README-lisp.md \
README-native-tunneling.md \
REPORTING-BUGS.md \
TODO.md \
.travis.yml \
2 changes: 2 additions & 0 deletions NEWS
@@ -49,6 +49,8 @@ Post-v2.3.0
- A simple wrapper script, 'ovs-docker', to integrate OVS with Docker
containers. If and when there is a native integration of Open vSwitch
with Docker, the wrapper script will be retired.
- Added support for DPDK Tunneling. VXLAN and GRE are supported protocols.
This is a generic tunneling mechanism for the userspace datapath.


v2.3.0 - 14 Aug 2014
82 changes: 82 additions & 0 deletions README-native-tunneling.md
@@ -0,0 +1,82 @@

Open vSwitch supports tunneling in userspace. Tunneling is implemented in
a platform-independent way.

Setup:
======
Set up physical bridges for all physical interfaces. Create an integration
bridge. Add a VXLAN port to the integration bridge. Assign an IP address to
the physical bridge where VXLAN traffic is expected.

Example:
========
Connect two VXLAN tunnel endpoints with logical IP addresses 192.168.1.1 and 192.168.1.2.

Configure OVS bridges as follows.

1. Assume the remote endpoint 172.168.1.2/24 is reachable via eth1. Create the
   physical bridge br-eth1, assign the IP address 172.168.1.1/24 to br-eth1,
   and add eth1 to br-eth1.
2. Check the routes cached by OVS using the appctl command:
       ovs-appctl ovs/route/show
   Add the tunnel route if it is not present in the OVS route table:
       ovs-appctl ovs/route/add 172.168.1.1/24 br-eth1
3. Add the integration bridge int-br and add the tunnel port using the standard syntax:
       ovs-vsctl add-port int-br vxlan0 -- set interface vxlan0 type=vxlan options:remote_ip=172.168.1.2
4. Assign an IP address to int-br. The final topology looks like:


     192.168.1.1/24
    +--------------+
    |    int-br    |                                   192.168.1.2/24
    +--------------+                                  +--------------+
    |    vxlan0    |                                  |    vxlan0    |
    +--------------+                                  +--------------+
           |                                                 |
           |                                                 |
           |                                                 |
     172.168.1.1/24                                          |
    +--------------+                                         |
    |   br-eth1    |                                   172.168.1.2/24
    +--------------+                                  +--------------+
    |     eth1     |----------------------------------|     eth1     |
    +--------------+                                  +--------------+

    Host A with OVS.                                    Remote host.

With this setup, a ping to the VXLAN target device (192.168.1.2) should work.
The following commands show the internal tables:

Tunneling related commands:
===========================
Tunnel routing table:
To add a route:
    ovs-appctl ovs/route/add <IP address>/<prefix length> <output-bridge-name> <gw>
To see all configured routes:
    ovs-appctl ovs/route/show
To delete a route:
    ovs-appctl ovs/route/del <IP address>/<prefix length>
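
As an illustration of what a prefix-based tunnel route table has to do, here is a minimal C sketch of a longest-prefix-match lookup. The `sketch_*` names and the linear scan are assumptions made for this example. The route table that this patch adds to OVS is not shown in this excerpt and may work differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical route entry; addresses are kept in host byte order. */
struct sketch_route {
    uint32_t prefix;        /* network address */
    uint8_t plen;           /* prefix length, 0..32 */
    const char *bridge;     /* output bridge name */
};

/* Packs four octets into a host-order IPv4 address. */
static uint32_t
sketch_ip(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t) a << 24) | ((uint32_t) b << 16)
           | ((uint32_t) c << 8) | (uint32_t) d;
}

/* Returns the netmask for a prefix length; a /0 prefix matches everything. */
static uint32_t
sketch_mask(uint8_t plen)
{
    assert(plen <= 32);
    return plen ? (uint32_t) (~UINT32_C(0) << (32 - plen)) : 0;
}

/* Returns the matching route with the longest prefix, or NULL if none. */
static const struct sketch_route *
sketch_route_lookup(const struct sketch_route *routes, size_t n, uint32_t dst)
{
    const struct sketch_route *best = NULL;

    for (size_t i = 0; i < n; i++) {
        uint32_t mask = sketch_mask(routes[i].plen);

        if ((dst & mask) == (routes[i].prefix & mask)
            && (!best || routes[i].plen > best->plen)) {
            best = &routes[i];
        }
    }
    return best;
}
```

With a /24 route for 172.168.1.0 via br-eth1 and a /0 default route in the table, a lookup for 172.168.1.2 selects the /24 entry, while any other destination falls back to the default route.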

ARP:
To see the ARP cache contents:
    ovs-appctl tnl/arp/show
To flush the ARP cache:
    ovs-appctl tnl/arp/flush

To check the tunnel ports listening in vswitchd:
    ovs-appctl tnl/ports/show

VXLAN UDP source port range:
To set the range:
    ovs-appctl tnl/egress_port_range <num1> <num2>
To show the current range:
    ovs-appctl tnl/egress_port_range
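
To make the purpose of the port range concrete, the following C sketch shows one common way to derive a UDP source port from a per-flow hash so that it always lands inside a configured range. This is an assumption for illustration only: the function name is hypothetical, and the excerpt does not show how OVS actually picks the source port.

```c
#include <assert.h>
#include <stdint.h>

/* Maps a flow hash onto the inclusive range [min_port, max_port].
 * Packets of the same flow (same hash) always get the same port. */
static uint16_t
sketch_pick_src_port(uint32_t flow_hash, uint16_t min_port, uint16_t max_port)
{
    uint32_t span;

    assert(min_port <= max_port);
    span = (uint32_t) max_port - min_port + 1;   /* number of usable ports */
    return (uint16_t) (min_port + flow_hash % span);
}
```

Keeping the port stable per flow while spreading different flows across the range lets intermediate devices load-balance on the outer UDP header without reordering packets within a flow.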

To check datapath ports:
    ovs-appctl dpif/show

To check datapath flows:
    ovs-appctl dpif/dump-flows

Contact
=======
[email protected]
31 changes: 31 additions & 0 deletions datapath/linux/compat/include/linux/openvswitch.h
@@ -582,6 +582,28 @@ struct ovs_action_hash {
uint32_t hash_basis;
};

#ifndef __KERNEL__
#define TNL_PUSH_HEADER_SIZE 128

/*
* struct ovs_action_push_tnl - %OVS_ACTION_ATTR_TUNNEL_PUSH
* @tnl_port: To identify tunnel port to pass header info.
* @out_port: Physical port to send encapsulated packet.
* @header_len: Length of the header to be pushed.
* @tnl_type: This is only required to format this header. Otherwise
* ODP layer can not parse %header.
* @header: Partial header for the tunnel. Tunnel push action can use
* this header to build final header according to actual packet parameters.
*/
struct ovs_action_push_tnl {
uint32_t tnl_port;
uint32_t out_port;
uint32_t header_len;
uint32_t tnl_type; /* For logging. */
uint8_t header[TNL_PUSH_HEADER_SIZE];
};
#endif

/**
* enum ovs_action_attr - Action types.
*
@@ -617,6 +639,11 @@ struct ovs_action_hash {
* Only a single header can be set with a single %OVS_ACTION_ATTR_SET. Not all
* fields within a header are modifiable, e.g. the IPv4 protocol and fragment
* type may not be changed.
*
* @OVS_ACTION_ATTR_TUNNEL_PUSH: Push tunnel header described by struct
* ovs_action_push_tnl.
* @OVS_ACTION_ATTR_TUNNEL_POP: Lookup tunnel port by port-no passed and pop
* tunnel header.
*/

enum ovs_action_attr {
@@ -636,6 +663,10 @@ enum ovs_action_attr {
* The data must be zero for the unmasked
* bits. */

#ifndef __KERNEL__
OVS_ACTION_ATTR_TUNNEL_PUSH, /* struct ovs_action_push_tnl*/
OVS_ACTION_ATTR_TUNNEL_POP, /* u32 port number. */
#endif
__OVS_ACTION_ATTR_MAX
};

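The comment on `struct ovs_action_push_tnl` says the action carries a prebuilt header that the datapath prepends to each packet. A minimal C sketch of that idea follows. Everything prefixed `sketch_` is hypothetical, including the packet buffer with headroom. The real work is done by `netdev_push_header()` and `netdev_pop_header()`, whose implementations are not part of this excerpt.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_HEADER_MAX 128   /* mirrors TNL_PUSH_HEADER_SIZE in the patch */

/* Hypothetical, trimmed stand-in for struct ovs_action_push_tnl. */
struct sketch_push_tnl {
    uint32_t header_len;
    uint8_t header[SKETCH_HEADER_MAX];
};

/* Hypothetical packet buffer with headroom in front of the payload. */
struct sketch_packet {
    uint8_t buf[256];
    size_t data_ofs;    /* offset of the first valid byte in buf */
    size_t size;        /* number of valid bytes */
};

/* Prepends the prebuilt tunnel header to the packet.
 * Returns 0 on success, -1 if the header is too long for the headroom. */
static int
sketch_push_header(struct sketch_packet *pkt,
                   const struct sketch_push_tnl *act)
{
    assert(pkt->data_ofs + pkt->size <= sizeof pkt->buf);
    if (act->header_len > SKETCH_HEADER_MAX
        || act->header_len > pkt->data_ofs) {
        return -1;
    }
    pkt->data_ofs -= act->header_len;
    pkt->size += act->header_len;
    memcpy(pkt->buf + pkt->data_ofs, act->header, act->header_len);
    return 0;
}

/* Strips header_len bytes from the front of the packet.
 * Returns 0 on success, -1 if the packet is shorter than the header. */
static int
sketch_pop_header(struct sketch_packet *pkt, size_t header_len)
{
    if (header_len > pkt->size) {
        return -1;
    }
    pkt->data_ofs += header_len;
    pkt->size -= header_len;
    return 0;
}
```

The sketch copies the header verbatim. Per the struct comment, the real push action treats `header` as a partial header and completes it from the actual packet parameters, a step that is omitted here.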
1 change: 1 addition & 0 deletions debian/openvswitch-common.docs
@@ -1,2 +1,3 @@
FAQ.md
INSTALL.DPDK.md
README-native-tunneling.md
4 changes: 4 additions & 0 deletions lib/automake.mk
@@ -239,6 +239,10 @@ lib_libopenvswitch_la_SOURCES = \
lib/timer.h \
lib/timeval.c \
lib/timeval.h \
lib/tnl-arp-cache.c \
lib/tnl-arp-cache.h \
lib/tnl-ports.c \
lib/tnl-ports.h \
lib/token-bucket.c \
lib/token-bucket.h \
lib/type-props.h \
104 changes: 101 additions & 3 deletions lib/dpif-netdev.c
@@ -63,6 +63,7 @@
#include "shash.h"
#include "sset.h"
#include "timeval.h"
#include "tnl-arp-cache.h"
#include "unixctl.h"
#include "util.h"
#include "vlog.h"
@@ -226,6 +227,7 @@ struct dp_netdev {
* for pin of pmd threads. */
size_t n_dpdk_rxqs;
char *pmd_cmask;
uint64_t last_tnl_conf_seq;
};

static struct dp_netdev_port *dp_netdev_lookup_port(const struct dp_netdev *dp,
@@ -610,6 +612,7 @@ create_dp_netdev(const char *name, const struct dpif_class *class,
return error;
}

dp->last_tnl_conf_seq = seq_read(tnl_conf_seq);
*dpp = dp;
return 0;
}
@@ -2185,12 +2188,14 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
}
}

static void
/* Return true if needs to revalidate datapath flows. */
static bool
dpif_netdev_run(struct dpif *dpif)
{
struct dp_netdev_port *port;
struct dp_netdev *dp = get_dp_netdev(dpif);
struct dp_netdev_pmd_thread *non_pmd = dp_netdev_get_nonpmd(dp);
uint64_t new_tnl_seq;

ovs_mutex_lock(&dp->non_pmd_mutex);
CMAP_FOR_EACH (port, node, &dp->ports) {
@@ -2203,6 +2208,14 @@ dpif_netdev_run(struct dpif *dpif)
}
}
ovs_mutex_unlock(&dp->non_pmd_mutex);
tnl_arp_cache_run();
new_tnl_seq = seq_read(tnl_conf_seq);

if (dp->last_tnl_conf_seq != new_tnl_seq) {
dp->last_tnl_conf_seq = new_tnl_seq;
return true;
}
return false;
}

static void
@@ -2222,6 +2235,7 @@ dpif_netdev_wait(struct dpif *dpif)
}
}
ovs_mutex_unlock(&dp_netdev_mutex);
seq_wait(tnl_conf_seq, dp->last_tnl_conf_seq);
}

struct rxq_poll {
@@ -2925,15 +2939,45 @@ dpif_netdev_register_upcall_cb(struct dpif *dpif, upcall_callback *cb,
static void
dp_netdev_drop_packets(struct dpif_packet ** packets, int cnt, bool may_steal)
{
int i;

if (may_steal) {
int i;

for (i = 0; i < cnt; i++) {
dpif_packet_delete(packets[i]);
}
}
}

static int
push_tnl_action(const struct dp_netdev *dp,
const struct nlattr *attr,
struct dpif_packet **packets, int cnt)
{
struct dp_netdev_port *tun_port;
const struct ovs_action_push_tnl *data;

data = nl_attr_get(attr);

tun_port = dp_netdev_lookup_port(dp, u32_to_odp(data->tnl_port));
if (!tun_port) {
return -EINVAL;
}
netdev_push_header(tun_port->netdev, packets, cnt, data);

return 0;
}

static void
dp_netdev_clone_pkt_batch(struct dpif_packet **tnl_pkt,
struct dpif_packet **packets, int cnt)
{
int i;

for (i = 0; i < cnt; i++) {
tnl_pkt[i] = dpif_packet_clone(packets[i]);
}
}

static void
dp_execute_cb(void *aux_, struct dpif_packet **packets, int cnt,
const struct nlattr *a, bool may_steal)
@@ -2956,6 +3000,60 @@ dp_execute_cb(void *aux_, struct dpif_packet **packets, int cnt,
}
break;

case OVS_ACTION_ATTR_TUNNEL_PUSH:
if (*depth < MAX_RECIRC_DEPTH) {
struct dpif_packet *tnl_pkt[NETDEV_MAX_RX_BATCH];
int err;

if (!may_steal) {
dp_netdev_clone_pkt_batch(tnl_pkt, packets, cnt);
packets = tnl_pkt;
}

err = push_tnl_action(dp, a, packets, cnt);
if (!err) {
(*depth)++;
dp_netdev_input(pmd, packets, cnt);
(*depth)--;
} else {
dp_netdev_drop_packets(tnl_pkt, cnt, !may_steal);
}
return;
}
break;

case OVS_ACTION_ATTR_TUNNEL_POP:
if (*depth < MAX_RECIRC_DEPTH) {
odp_port_t portno = u32_to_odp(nl_attr_get_u32(a));

p = dp_netdev_lookup_port(dp, portno);
if (p) {
struct dpif_packet *tnl_pkt[NETDEV_MAX_RX_BATCH];
int err;

if (!may_steal) {
dp_netdev_clone_pkt_batch(tnl_pkt, packets, cnt);
packets = tnl_pkt;
}

err = netdev_pop_header(p->netdev, packets, cnt);
if (!err) {

for (i = 0; i < cnt; i++) {
packets[i]->md.in_port.odp_port = portno;
}

(*depth)++;
dp_netdev_input(pmd, packets, cnt);
(*depth)--;
} else {
dp_netdev_drop_packets(tnl_pkt, cnt, !may_steal);
}
return;
}
}
break;

case OVS_ACTION_ATTR_USERSPACE:
if (!fat_rwlock_tryrdlock(&dp->upcall_rwlock)) {
const struct nlattr *userdata;
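Both new cases in `dp_execute_cb()` follow the same pattern: check the recursion depth, clone the packets when the caller keeps ownership (`!may_steal`), then re-inject them one level deeper. The C sketch below isolates that pattern for a single packet. All `sketch_*` names are hypothetical, the depth limit of 5 is an assumed value (`MAX_RECIRC_DEPTH` is not defined in this excerpt), and the nested processing is reduced to a counter increment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_MAX_DEPTH 5      /* assumed value, for illustration only */

struct sketch_pkt {
    int hops;                   /* how many times the packet was processed */
};

static int sketch_live_pkts;    /* heap packets alive, to check for leaks */

/* Returns a heap copy of 'p' that the caller owns. */
static struct sketch_pkt *
sketch_clone(const struct sketch_pkt *p)
{
    struct sketch_pkt *c = malloc(sizeof *c);

    assert(c != NULL);
    *c = *p;
    sketch_live_pkts++;
    return c;
}

static void
sketch_free(struct sketch_pkt *p)
{
    free(p);
    sketch_live_pkts--;
}

/* Processes 'pkt' one level deeper.  If 'may_steal' is false the caller
 * keeps 'pkt', so the work happens on a private clone.  Returns the hop
 * count seen by the nested processing, or -1 if the depth limit is hit,
 * in which case the packet is left untouched. */
static int
sketch_reinject(struct sketch_pkt *pkt, bool may_steal, int *depth)
{
    int hops;

    if (*depth >= SKETCH_MAX_DEPTH) {
        return -1;
    }
    if (!may_steal) {
        pkt = sketch_clone(pkt);
    }
    (*depth)++;
    hops = ++pkt->hops;         /* stand-in for the nested datapath pass */
    (*depth)--;
    sketch_free(pkt);           /* owned here: either stolen or cloned */
    return hops;
}
```

Cloning only when `may_steal` is false avoids a copy on the common path while still guaranteeing that a caller who retains the packet never sees it modified.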
3 changes: 2 additions & 1 deletion lib/dpif-netlink.c
@@ -689,7 +689,7 @@ dpif_netlink_destroy(struct dpif *dpif_)
return dpif_netlink_dp_transact(&dp, NULL, NULL);
}

static void
static bool
dpif_netlink_run(struct dpif *dpif_)
{
struct dpif_netlink *dpif = dpif_netlink_cast(dpif_);
@@ -700,6 +700,7 @@ dpif_netlink_run(struct dpif *dpif_)
dpif_netlink_refresh_channels(dpif, dpif->n_handlers);
fat_rwlock_unlock(&dpif->upcall_lock);
}
return false;
}

static int
5 changes: 3 additions & 2 deletions lib/dpif-provider.h
@@ -132,8 +132,9 @@ struct dpif_class {
* the 'close' member function. */
int (*destroy)(struct dpif *dpif);

/* Performs periodic work needed by 'dpif', if any is necessary. */
void (*run)(struct dpif *dpif);
/* Performs periodic work needed by 'dpif', if any is necessary.
* Returns true if need to revalidate. */
bool (*run)(struct dpif *dpif);

/* Arranges for poll_block() to wake up if the "run" member function needs
* to be called for 'dpif'. */
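The new `bool` return value of `run` lets a datapath provider tell its caller that flows need revalidation. `dpif_netdev_run()` decides this by comparing a saved sequence number with the current value of `tnl_conf_seq`. The C sketch below shows that change-detection pattern on its own. It uses a plain counter instead of the OVS `seq` library, and all `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical global change counter, standing in for tnl_conf_seq. */
static uint64_t sketch_conf_seq = 1;

/* Called whenever the tunnel configuration changes. */
static void
sketch_conf_changed(void)
{
    sketch_conf_seq++;
}

/* Per-datapath state: the last sequence number this datapath acted on. */
struct sketch_dp {
    uint64_t last_conf_seq;
};

static void
sketch_dp_init(struct sketch_dp *dp)
{
    assert(dp != NULL);
    dp->last_conf_seq = sketch_conf_seq;
}

/* Periodic work.  Returns true exactly once after each batch of
 * configuration changes, meaning "revalidate the datapath flows". */
static bool
sketch_dp_run(struct sketch_dp *dp)
{
    uint64_t new_seq = sketch_conf_seq;

    assert(dp != NULL);
    if (dp->last_conf_seq != new_seq) {
        dp->last_conf_seq = new_seq;
        return true;
    }
    return false;
}
```

Because each datapath stores its own copy of the sequence number, several changes between two `run` calls collapse into a single revalidation request.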