ovn: Add a case of policy based routing.
OVN currently supports multiple gateway routers (residing on
different chassis) connected to the same logical topology.

When external traffic enters the logical topology, it can enter
through any gateway router and reach its eventual destination. This
is achieved with proper static routes configured on the gateway
routers.

But when traffic is initiated in the logical space by a logical
port, we do not have a good way to distribute that traffic across
multiple gateway routers.

This commit introduces one particular way to do it. Based on the
source IP address or source IP network of the packet, we can now
jump to a specific gateway router.

This is very useful for a specific Kubernetes use case.
When traffic initiated inside a container is heading to the outside
world, we want to send it out through the gateway router residing on
the same host as the container. Since each host gets a specific
subnet, we can use source IP address based policy routing to decide
on the gateway router.
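
The per-host selection described above can be pictured with a small
standalone sketch. It is only an illustration: OVN implements this with
logical flows, not a lookup loop, and `struct src_route`, `prefix_mask`,
`gateway_for_source` and the gateway names are hypothetical, not part of
this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: one source subnet per host, mapped to the
 * gateway router on that host.  Addresses are in host byte order. */
struct src_route {
    uint32_t network;      /* e.g. 10.1.2.0 as 0x0A010200 */
    unsigned int plen;     /* prefix length, 0..32 */
    const char *gateway;   /* illustrative gateway router name */
};

/* Returns the netmask for 'plen' bits.  plen == 0 is special-cased to
 * avoid an undefined 32-bit shift. */
static uint32_t
prefix_mask(unsigned int plen)
{
    assert(plen <= 32);
    return plen == 0 ? 0 : UINT32_MAX << (32 - plen);
}

/* Returns the gateway of the longest source prefix that contains 'src',
 * or NULL if no prefix matches. */
static const char *
gateway_for_source(const struct src_route *routes, size_t n, uint32_t src)
{
    const struct src_route *best = NULL;

    for (size_t i = 0; i < n; i++) {
        uint32_t mask = prefix_mask(routes[i].plen);

        if ((src & mask) == (routes[i].network & mask)
            && (!best || routes[i].plen > best->plen)) {
            best = &routes[i];
        }
    }
    return best ? best->gateway : NULL;
}
```

With one /24 per host in the table, a packet sourced from a container
resolves to the gateway on that container's host, and a source outside
every listed subnet resolves to NULL.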

Rationale for using the same routing table for both source and
destination IP address based routing:

Some hardware network vendors support policy routing in a separate
table, on an arbitrary "match". When a packet enters and matches a rule
in the policy based routing table, the default routing table is not
consulted at all. In the case of OVN, we mainly want policy based
routing for north-south traffic; east-west traffic should flow as-is.
Creating a separate table for policy based routing complicates the
configuration quite a bit. For example, if we add a source IP network
based rule that selects a particular gateway router as the next hop, we
would also have to add higher-priority rules for all the connected
routes in the policy based routing table itself to make sure that
east-west traffic is not affected.
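
As a rough standalone sketch of how the shared table orders the two
kinds of routes, the helpers below mirror the priority and match rules
that this commit adds to add_route(). The function names
`route_priority` and `route_match` are illustrative only and do not
exist in ovn-northd.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Same rule as add_route() in this commit: a "src-ip" route gets
 * 2 * plen; any other policy (including NULL) is treated as "dst-ip"
 * and gets 2 * plen + 1. */
static uint16_t
route_priority(const char *policy, unsigned int plen)
{
    assert(plen <= 128);
    bool is_src = policy && !strcmp(policy, "src-ip");
    return is_src ? plen * 2 : plen * 2 + 1;
}

/* Formats the match expression, e.g. "ip4.src == 10.1.2.0/24".
 * Returns the snprintf() result. */
static int
route_match(char *buf, size_t size, const char *policy,
            const char *network, unsigned int plen)
{
    bool is_ipv4 = strchr(network, '.') != NULL;
    bool is_src = policy && !strcmp(policy, "src-ip");

    return snprintf(buf, size, "ip%s.%s == %s/%u", is_ipv4 ? "4" : "6",
                    is_src ? "src" : "dst", network, plen);
}
```

Under this scheme a connected /24 route (priority 49) outranks a src-ip
/24 rule (priority 48), so east-west traffic is untouched, while the
src-ip rule still outranks a 0.0.0.0/0 default route (priority 1).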

Signed-off-by: Gurucharan Shetty <[email protected]>
Acked-by: Ben Pfaff <[email protected]>
shettyg committed Nov 3, 2016
1 parent 75fd74f commit 440a9f4
Showing 8 changed files with 334 additions and 39 deletions.
1 change: 1 addition & 0 deletions NEWS
@@ -4,6 +4,7 @@ Post-v2.6.0
* QoS is now implemented via egress shaping rather than ingress policing.
* DSCP marking is now supported, via the new northbound QoS table.
* IPAM now supports fixed MAC addresses.
* Support for source IP address based routing.
- Fixed regression in table stats maintenance introduced in OVS
2.3.0, wherein the number of OpenFlow table hits and misses was
not accurate.
24 changes: 18 additions & 6 deletions ovn/northd/ovn-northd.c
@@ -3247,10 +3247,20 @@ find_lrp_member_ip(const struct ovn_port *op, const char *ip_s)
static void
add_route(struct hmap *lflows, const struct ovn_port *op,
const char *lrp_addr_s, const char *network_s, int plen,
const char *gateway)
const char *gateway, const char *policy)
{
bool is_ipv4 = strchr(network_s, '.') ? true : false;
struct ds match = DS_EMPTY_INITIALIZER;
const char *dir;
uint16_t priority;

if (policy && !strcmp(policy, "src-ip")) {
dir = "src";
priority = plen * 2;
} else {
dir = "dst";
priority = (plen * 2) + 1;
}

/* IPv6 link-local addresses must be scoped to the local router port. */
if (!is_ipv4) {
@@ -3260,7 +3270,7 @@ add_route(struct hmap *lflows, const struct ovn_port *op,
ds_put_format(&match, "inport == %s && ", op->json_key);
}
}
ds_put_format(&match, "ip%s.dst == %s/%d", is_ipv4 ? "4" : "6",
ds_put_format(&match, "ip%s.%s == %s/%d", is_ipv4 ? "4" : "6", dir,
network_s, plen);

struct ds actions = DS_EMPTY_INITIALIZER;
@@ -3284,7 +3294,7 @@ add_route(struct hmap *lflows, const struct ovn_port *op,

/* The priority here is calculated to implement longest-prefix-match
* routing. */
ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_ROUTING, plen,
ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_ROUTING, priority,
ds_cstr(&match), ds_cstr(&actions));
ds_destroy(&match);
ds_destroy(&actions);
@@ -3397,7 +3407,9 @@ build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od,
goto free_prefix_s;
}

add_route(lflows, out_port, lrp_addr_s, prefix_s, plen, route->nexthop);
char *policy = route->policy ? route->policy : "dst-ip";
add_route(lflows, out_port, lrp_addr_s, prefix_s, plen, route->nexthop,
policy);

free_prefix_s:
free(prefix_s);
@@ -4031,13 +4043,13 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
add_route(lflows, op, op->lrp_networks.ipv4_addrs[i].addr_s,
op->lrp_networks.ipv4_addrs[i].network_s,
op->lrp_networks.ipv4_addrs[i].plen, NULL);
op->lrp_networks.ipv4_addrs[i].plen, NULL, NULL);
}

for (int i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
add_route(lflows, op, op->lrp_networks.ipv6_addrs[i].addr_s,
op->lrp_networks.ipv6_addrs[i].network_s,
op->lrp_networks.ipv6_addrs[i].plen, NULL);
op->lrp_networks.ipv6_addrs[i].plen, NULL, NULL);
}
}

8 changes: 6 additions & 2 deletions ovn/ovn-nb.ovsschema
@@ -1,7 +1,7 @@
{
"name": "OVN_Northbound",
"version": "5.4.0",
"cksum": "4176761817 11225",
"version": "5.4.1",
"cksum": "3773248894 11490",
"tables": {
"NB_Global": {
"columns": {
@@ -196,6 +196,10 @@
"Logical_Router_Static_Route": {
"columns": {
"ip_prefix": {"type": "string"},
"policy": {"type": {"key": {"type": "string",
"enum": ["set", ["src-ip",
"dst-ip"]]},
"min": 0, "max": 1}},
"nexthop": {"type": "string"},
"output_port": {"type": {"key": "string", "min": 0, "max": 1}}},
"isRoot": false},
28 changes: 28 additions & 0 deletions ovn/ovn-nb.xml
@@ -1083,12 +1083,40 @@
Each record represents a static route.
</p>

<p>
When multiple routes match a packet, the longest-prefix match is chosen.
For a given prefix length, a <code>dst-ip</code> route is preferred over
a <code>src-ip</code> route.
</p>

<column name="ip_prefix">
<p>
IP prefix of this route (e.g. 192.168.100.0/24).
</p>
</column>

<column name="policy">
<p>
If it is specified, this setting describes the policy used to make
routing decisions. This setting must be one of the following strings:
</p>
<ul>
<li>
<code>src-ip</code>: This policy sends the packet to the
<ref column="nexthop"/> when the packet's source IP address matches
<ref column="ip_prefix"/>.
</li>
<li>
<code>dst-ip</code>: This policy sends the packet to the
<ref column="nexthop"/> when the packet's destination IP address
matches <ref column="ip_prefix"/>.
</li>
</ul>
<p>
If not specified, the default is <code>dst-ip</code>.
</p>
</column>

<column name="nexthop">
<p>
Nexthop IP address for this route. Nexthop IP address should be the IP
8 changes: 7 additions & 1 deletion ovn/utilities/ovn-nbctl.8.xml
@@ -380,7 +380,7 @@
<h1>Logical Router Static Route Commands</h1>

<dl>
<dt>[<code>--may-exist</code>] <code>lr-route-add</code> <var>router</var> <var>prefix</var> <var>nexthop</var> [<var>port</var>]</dt>
<dt>[<code>--may-exist</code>] [<code>--policy</code>=<var>POLICY</var>] <code>lr-route-add</code> <var>router</var> <var>prefix</var> <var>nexthop</var> [<var>port</var>]</dt>
<dd>
<p>
Adds the specified route to <var>router</var>.
@@ -395,6 +395,12 @@
on <var>nexthop</var>.
</p>

<p>
<code>--policy</code> describes the policy used to make routing
decisions. This should be one of "dst-ip" or "src-ip". If not
specified, the default is "dst-ip".
</p>

<p>
It is an error if a route with <var>prefix</var> already exists,
unless <code>--may-exist</code> is specified.
43 changes: 32 additions & 11 deletions ovn/utilities/ovn-nbctl.c
@@ -377,7 +377,7 @@ Logical router port commands:\n\
('enabled' or 'disabled')\n\
\n\
Route commands:\n\
lr-route-add ROUTER PREFIX NEXTHOP [PORT]\n\
[--policy=POLICY] lr-route-add ROUTER PREFIX NEXTHOP [PORT]\n\
add a route to ROUTER\n\
lr-route-del ROUTER [PREFIX]\n\
remove routes from ROUTER\n\
@@ -2031,6 +2031,11 @@ nbctl_lr_route_add(struct ctl_context *ctx)
lr = lr_by_name_or_uuid(ctx, ctx->argv[1], true);
char *prefix, *next_hop;

const char *policy = shash_find_data(&ctx->options, "--policy");
if (policy && strcmp(policy, "src-ip") && strcmp(policy, "dst-ip")) {
ctl_fatal("bad policy: %s", policy);
}

prefix = normalize_prefix_str(ctx->argv[2]);
if (!prefix) {
ctl_fatal("bad prefix argument: %s", ctx->argv[2]);
@@ -2091,6 +2096,9 @@ nbctl_lr_route_add(struct ctl_context *ctx)
nbrec_logical_router_static_route_set_output_port(route,
ctx->argv[4]);
}
if (policy) {
nbrec_logical_router_static_route_set_policy(route, policy);
}
free(rt_prefix);
free(next_hop);
free(prefix);
@@ -2104,6 +2112,9 @@ nbctl_lr_route_add(struct ctl_context *ctx)
if (ctx->argc == 5) {
nbrec_logical_router_static_route_set_output_port(route, ctx->argv[4]);
}
if (policy) {
nbrec_logical_router_static_route_set_policy(route, policy);
}

nbrec_logical_router_verify_static_routes(lr);
struct nbrec_logical_router_static_route **new_routes
@@ -2457,7 +2468,7 @@ nbctl_lrp_get_enabled(struct ctl_context *ctx)
}

struct ipv4_route {
int plen;
int priority;
ovs_be32 addr;
const struct nbrec_logical_router_static_route *route;
};
@@ -2468,8 +2479,8 @@ ipv4_route_cmp(const void *route1_, const void *route2_)
const struct ipv4_route *route1p = route1_;
const struct ipv4_route *route2p = route2_;

if (route1p->plen != route2p->plen) {
return route1p->plen > route2p->plen ? -1 : 1;
if (route1p->priority != route2p->priority) {
return route1p->priority > route2p->priority ? -1 : 1;
} else if (route1p->addr != route2p->addr) {
return ntohl(route1p->addr) < ntohl(route2p->addr) ? -1 : 1;
} else {
@@ -2478,7 +2489,7 @@ ipv4_route_cmp(const void *route1_, const void *route2_)
}

struct ipv6_route {
int plen;
int priority;
struct in6_addr addr;
const struct nbrec_logical_router_static_route *route;
};
@@ -2489,8 +2500,8 @@ ipv6_route_cmp(const void *route1_, const void *route2_)
const struct ipv6_route *route1p = route1_;
const struct ipv6_route *route2p = route2_;

if (route1p->plen != route2p->plen) {
return route1p->plen > route2p->plen ? -1 : 1;
if (route1p->priority != route2p->priority) {
return route1p->priority > route2p->priority ? -1 : 1;
}
return memcmp(&route1p->addr, &route2p->addr, sizeof(route1p->addr));
}
@@ -2505,6 +2516,12 @@ print_route(const struct nbrec_logical_router_static_route *route, struct ds *s)
free(prefix);
free(next_hop);

if (route->policy) {
ds_put_format(s, " %s", route->policy);
} else {
ds_put_format(s, " %s", "dst-ip");
}

if (route->output_port) {
ds_put_format(s, " %s", route->output_port);
}
@@ -2530,11 +2547,13 @@ nbctl_lr_route_list(struct ctl_context *ctx)
= lr->static_routes[i];
unsigned int plen;
ovs_be32 ipv4;
const char *policy = route->policy ? route->policy : "dst-ip";
char *error;

error = ip_parse_cidr(route->ip_prefix, &ipv4, &plen);
if (!error) {
ipv4_routes[n_ipv4_routes].plen = plen;
ipv4_routes[n_ipv4_routes].priority = !strcmp(policy, "dst-ip")
? (2 * plen) + 1
: 2 * plen;
ipv4_routes[n_ipv4_routes].addr = ipv4;
ipv4_routes[n_ipv4_routes].route = route;
n_ipv4_routes++;
@@ -2544,7 +2563,9 @@ nbctl_lr_route_list(struct ctl_context *ctx)
struct in6_addr ipv6;
error = ipv6_parse_cidr(route->ip_prefix, &ipv6, &plen);
if (!error) {
ipv6_routes[n_ipv6_routes].plen = plen;
ipv6_routes[n_ipv6_routes].priority = !strcmp(policy, "dst-ip")
? (2 * plen) + 1
: 2 * plen;
ipv6_routes[n_ipv6_routes].addr = ipv6;
ipv6_routes[n_ipv6_routes].route = route;
n_ipv6_routes++;
@@ -2947,7 +2968,7 @@ static const struct ctl_command_syntax nbctl_commands[] = {

/* logical router route commands. */
{ "lr-route-add", 3, 4, "ROUTER PREFIX NEXTHOP [PORT]", NULL,
nbctl_lr_route_add, NULL, "--may-exist", RW },
nbctl_lr_route_add, NULL, "--may-exist,--policy=", RW },
{ "lr-route-del", 1, 2, "ROUTER [PREFIX]", NULL, nbctl_lr_route_del,
NULL, "--if-exists", RW },
{ "lr-route-list", 1, 1, "ROUTER", NULL, nbctl_lr_route_list, NULL,
42 changes: 23 additions & 19 deletions tests/ovn-nbctl.at
@@ -657,20 +657,23 @@ AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:1::/64 2001:0db8:0:f103::1/64], [
])

AT_CHECK([ovn-nbctl --may-exist lr-route-add lr0 10.0.0.111/24 11.0.0.1])
AT_CHECK([ovn-nbctl --policy=src-ip lr-route-add lr0 9.16.1.0/24 11.0.0.1])

AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
IPv4 Routes
10.0.0.0/24 11.0.0.1
10.0.1.0/24 11.0.1.1 lp0
0.0.0.0/0 192.168.0.1
10.0.0.0/24 11.0.0.1 dst-ip
10.0.1.0/24 11.0.1.1 dst-ip lp0
9.16.1.0/24 11.0.0.1 src-ip
0.0.0.0/0 192.168.0.1 dst-ip
])

AT_CHECK([ovn-nbctl --may-exist lr-route-add lr0 10.0.0.111/24 11.0.0.1 lp1])
AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
IPv4 Routes
10.0.0.0/24 11.0.0.1 lp1
10.0.1.0/24 11.0.1.1 lp0
0.0.0.0/0 192.168.0.1
10.0.0.0/24 11.0.0.1 dst-ip lp1
10.0.1.0/24 11.0.1.1 dst-ip lp0
9.16.1.0/24 11.0.0.1 src-ip
0.0.0.0/0 192.168.0.1 dst-ip
])

dnl Delete non-existent prefix
@@ -680,11 +683,12 @@ AT_CHECK([ovn-nbctl lr-route-del lr0 10.0.2.1/24], [1], [],
AT_CHECK([ovn-nbctl --if-exists lr-route-del lr0 10.0.2.1/24])

AT_CHECK([ovn-nbctl lr-route-del lr0 10.0.1.1/24])
AT_CHECK([ovn-nbctl lr-route-del lr0 9.16.1.0/24])

AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
IPv4 Routes
10.0.0.0/24 11.0.0.1 lp1
0.0.0.0/0 192.168.0.1
10.0.0.0/24 11.0.0.1 dst-ip lp1
0.0.0.0/0 192.168.0.1 dst-ip
])

AT_CHECK([ovn-nbctl lr-route-del lr0])
@@ -698,17 +702,17 @@ AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:1::/64 2001:0db8:0:f103::1])

AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
IPv6 Routes
2001:db8::/64 2001:db8:0:f102::1 lp0
2001:db8:1::/64 2001:db8:0:f103::1
::/0 2001:db8:0:f101::1
2001:db8::/64 2001:db8:0:f102::1 dst-ip lp0
2001:db8:1::/64 2001:db8:0:f103::1 dst-ip
::/0 2001:db8:0:f101::1 dst-ip
])

AT_CHECK([ovn-nbctl lr-route-del lr0 2001:0db8:0::/64])

AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
IPv6 Routes
2001:db8:1::/64 2001:db8:0:f103::1
::/0 2001:db8:0:f101::1
2001:db8:1::/64 2001:db8:0:f103::1 dst-ip
::/0 2001:db8:0:f101::1 dst-ip
])

AT_CHECK([ovn-nbctl lr-route-del lr0])
@@ -725,14 +729,14 @@ AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:1::/64 2001:0db8:0:f103::1])

AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
IPv4 Routes
10.0.0.0/24 11.0.0.1
10.0.1.0/24 11.0.1.1 lp0
0.0.0.0/0 192.168.0.1
10.0.0.0/24 11.0.0.1 dst-ip
10.0.1.0/24 11.0.1.1 dst-ip lp0
0.0.0.0/0 192.168.0.1 dst-ip

IPv6 Routes
2001:db8::/64 2001:db8:0:f102::1 lp0
2001:db8:1::/64 2001:db8:0:f103::1
::/0 2001:db8:0:f101::1
2001:db8::/64 2001:db8:0:f102::1 dst-ip lp0
2001:db8:1::/64 2001:db8:0:f103::1 dst-ip
::/0 2001:db8:0:f101::1 dst-ip
])

OVN_NBCTL_TEST_STOP