Skip to content

Commit

Permalink
ice: Report stats for allocated queues via ethtool stats
Browse files Browse the repository at this point in the history
It is not safe to have the string table for statistics change order or
size over the lifetime of a given netdevice. This is because of the
nature of the 3-step process for obtaining stats. First, user space
performs a request for the size of the strings table. Second it performs
a separate request for the strings themselves, after allocating space
for the table. Third, it requests the stats themselves, also allocating
space for the table.

If the size decreased, there is potential to see garbage data or stats
values. In the worst case, we could potentially see stats values become
mis-aligned with their strings, so that it looks like a statistic is
being reported differently than it actually is.

Even worse, if the size increased, there is potential that the strings
table or stats table was not allocated large enough and the stats code
could access and write to memory it should not, potentially resulting in
undefined behavior and system crashes.

It isn't even safe if the size always changes under the RTNL lock. This
is because the calls take place over multiple user space commands, so it
is not possible to hold the RTNL lock for the entire duration of
obtaining strings and stats. Further, not all consumers of the ethtool
API are the user space ethtool program, and it is possible that one
assumes the strings will not change (valid under the current contract),
and thus only requests the stats values when requesting stats in a loop.

Finally, it's not possible in the general case to detect when the size
changes, because it is quite possible that one value which could impact
the stat size increased, while another decreased. This would result in
the same total number of stats, but reordering them so that stats no
longer line up with the strings they belong to. Since only size changes
aren't enough, we would need some sort of hash or token to determine
when the strings no longer match. This would require extending the
ethtool stats commands, but there is no more space in the relevant
structures.

The real solution to resolve this would be to add a completely new API
for stats, probably over netlink.

In the ice driver, the only thing impacting the stats that is not
constant is the number of queues. Instead of reporting stats for each
used queue, report stats for each allocated queue. We do not change the
number of queues allocated for a given netdevice, as we pass this into
the alloc_etherdev_mq() function to set the num_tx_queues and
num_rx_queues.

This resolves the potential bugs at the slight cost of displaying many
queue statistics which will not be activated.

Signed-off-by: Jacob Keller <[email protected]>
Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
  • Loading branch information
jacob-keller authored and Jeff Kirsher committed Aug 23, 2018
1 parent 5ab5224 commit f8ba7db
Show file tree
Hide file tree
Showing 2 changed files with 46 additions and 13 deletions.
7 changes: 7 additions & 0 deletions drivers/net/ethernet/intel/ice/ice.h
Original file line number Diff line number Diff line change
Expand Up @@ -89,6 +89,13 @@ extern const char ice_drv_ver[];
#define ice_for_each_rxq(vsi, i) \
for ((i) = 0; (i) < (vsi)->num_rxq; (i)++)

/* Macros for each allocated tx/rx ring whether used or not in a VSI */
#define ice_for_each_alloc_txq(vsi, i) \
for ((i) = 0; (i) < (vsi)->alloc_txq; (i)++)

#define ice_for_each_alloc_rxq(vsi, i) \
for ((i) = 0; (i) < (vsi)->alloc_rxq; (i)++)

struct ice_tc_info {
u16 qoffset;
u16 qcount;
Expand Down
52 changes: 39 additions & 13 deletions drivers/net/ethernet/intel/ice/ice_ethtool.c
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ static int ice_q_stats_len(struct net_device *netdev)
{
struct ice_netdev_priv *np = netdev_priv(netdev);

return ((np->vsi->num_txq + np->vsi->num_rxq) *
return ((np->vsi->alloc_txq + np->vsi->alloc_rxq) *
(sizeof(struct ice_q_stats) / sizeof(u64)));
}

Expand Down Expand Up @@ -218,15 +218,15 @@ static void ice_get_strings(struct net_device *netdev, u32 stringset, u8 *data)
p += ETH_GSTRING_LEN;
}

ice_for_each_txq(vsi, i) {
ice_for_each_alloc_txq(vsi, i) {
snprintf(p, ETH_GSTRING_LEN,
"tx-queue-%u.tx_packets", i);
p += ETH_GSTRING_LEN;
snprintf(p, ETH_GSTRING_LEN, "tx-queue-%u.tx_bytes", i);
p += ETH_GSTRING_LEN;
}

ice_for_each_rxq(vsi, i) {
ice_for_each_alloc_rxq(vsi, i) {
snprintf(p, ETH_GSTRING_LEN,
"rx-queue-%u.rx_packets", i);
p += ETH_GSTRING_LEN;
Expand All @@ -253,6 +253,24 @@ static int ice_get_sset_count(struct net_device *netdev, int sset)
{
switch (sset) {
case ETH_SS_STATS:
/* The number (and order) of strings reported *must* remain
* constant for a given netdevice. This function must not
* report a different number based on run time parameters
* (such as the number of queues in use, or the setting of
* a private ethtool flag). This is due to the nature of the
* ethtool stats API.
*
* User space programs such as ethtool must make 3 separate
* ioctl requests, one for size, one for the strings, and
* finally one for the stats. Since these cross into
* user space, changes to the number or size could result in
* undefined memory access or incorrect string<->value
* correlations for statistics.
*
* Even if it appears to be safe, changes to the size or
* order of strings will suffer from race conditions and are
* not safe.
*/
return ICE_ALL_STATS_LEN(netdev);
default:
return -EOPNOTSUPP;
Expand Down Expand Up @@ -280,18 +298,26 @@ ice_get_ethtool_stats(struct net_device *netdev,
/* populate per queue stats */
rcu_read_lock();

ice_for_each_txq(vsi, j) {
ice_for_each_alloc_txq(vsi, j) {
ring = READ_ONCE(vsi->tx_rings[j]);
if (!ring)
continue;
data[i++] = ring->stats.pkts;
data[i++] = ring->stats.bytes;
if (ring) {
data[i++] = ring->stats.pkts;
data[i++] = ring->stats.bytes;
} else {
data[i++] = 0;
data[i++] = 0;
}
}

ice_for_each_rxq(vsi, j) {
ice_for_each_alloc_rxq(vsi, j) {
ring = READ_ONCE(vsi->rx_rings[j]);
data[i++] = ring->stats.pkts;
data[i++] = ring->stats.bytes;
if (ring) {
data[i++] = ring->stats.pkts;
data[i++] = ring->stats.bytes;
} else {
data[i++] = 0;
data[i++] = 0;
}
}

rcu_read_unlock();
Expand Down Expand Up @@ -519,7 +545,7 @@ ice_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring)
goto done;
}

for (i = 0; i < vsi->num_txq; i++) {
for (i = 0; i < vsi->alloc_txq; i++) {
/* clone ring and setup updated count */
tx_rings[i] = *vsi->tx_rings[i];
tx_rings[i].count = new_tx_cnt;
Expand Down Expand Up @@ -551,7 +577,7 @@ ice_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring)
goto done;
}

for (i = 0; i < vsi->num_rxq; i++) {
for (i = 0; i < vsi->alloc_rxq; i++) {
/* clone ring and setup updated count */
rx_rings[i] = *vsi->rx_rings[i];
rx_rings[i].count = new_rx_cnt;
Expand Down

0 comments on commit f8ba7db

Please sign in to comment.