net: add a generic tracepoint for TX queue timeout
Although devlink health report does a nice job of reporting TX
timeouts and other NIC errors, it requires driver support, and
currently only mlx5 implements it.
Until other drivers catch up, it is useful to have a generic
tracepoint to monitor this kind of TX timeout. We have been
hitting TX timeouts with several different drivers, and we plan to
monitor them with rasdaemon, which only needs a new tracepoint.

Sample output:

  ksoftirqd/1-16    [001] ..s2   144.043173: net_dev_xmit_timeout: dev=ens3 driver=e1000 queue=0

Cc: Eran Ben Elisha <[email protected]>
Cc: Jiri Pirko <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Reviewed-by: Eran Ben Elisha <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
congwang authored and davem330 committed May 4, 2019
1 parent f3f050a commit 141b6b2
Showing 2 changed files with 25 additions and 0 deletions.
23 changes: 23 additions & 0 deletions include/trace/events/net.h
@@ -95,6 +95,29 @@ TRACE_EVENT(net_dev_xmit,
 		__get_str(name), __entry->skbaddr, __entry->len, __entry->rc)
 );

+TRACE_EVENT(net_dev_xmit_timeout,
+
+	TP_PROTO(struct net_device *dev,
+		 int queue_index),
+
+	TP_ARGS(dev, queue_index),
+
+	TP_STRUCT__entry(
+		__string(	name,		dev->name	)
+		__string(	driver,		netdev_drivername(dev))
+		__field(	int,		queue_index	)
+	),
+
+	TP_fast_assign(
+		__assign_str(name, dev->name);
+		__assign_str(driver, netdev_drivername(dev));
+		__entry->queue_index = queue_index;
+	),
+
+	TP_printk("dev=%s driver=%s queue=%d",
+		__get_str(name), __get_str(driver), __entry->queue_index)
+);
+
 DECLARE_EVENT_CLASS(net_dev_template,

 	TP_PROTO(struct sk_buff *skb),
2 changes: 2 additions & 0 deletions net/sched/sch_generic.c
@@ -32,6 +32,7 @@
 #include <net/pkt_sched.h>
 #include <net/dst.h>
 #include <trace/events/qdisc.h>
+#include <trace/events/net.h>
 #include <net/xfrm.h>

 /* Qdisc to use by default */
@@ -441,6 +442,7 @@ static void dev_watchdog(struct timer_list *t)
 			}

 			if (some_queue_timedout) {
+				trace_net_dev_xmit_timeout(dev, i);
 				WARN_ONCE(1, KERN_INFO "NETDEV WATCHDOG: %s (%s): transmit queue %u timed out\n",
 				       dev->name, netdev_drivername(dev), i);
 				dev->netdev_ops->ndo_tx_timeout(dev);
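
For readers who want to consume the new event from userspace, the following is a minimal sketch of parsing one text line in the format shown under "Sample output" in the commit message. It is not how rasdaemon does it; the struct `xmit_timeout_event` and the function `parse_xmit_timeout` are hypothetical names introduced only for this illustration, and the 31-character field limits are an assumption of the sketch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record type for this sketch; not a kernel or rasdaemon type. */
struct xmit_timeout_event {
	char dev[32];		/* value of the "dev=" field, e.g. "ens3" */
	char driver[32];	/* value of the "driver=" field, e.g. "e1000" */
	int queue;		/* value of the "queue=" field */
};

/*
 * Parse one trace text line that contains a net_dev_xmit_timeout event.
 * Returns 0 on success, -1 if the line does not carry that event or the
 * three fields cannot be read. An empty driver name is treated as a
 * parse failure by this sketch.
 */
int parse_xmit_timeout(const char *line, struct xmit_timeout_event *ev)
{
	static const char tag[] = "net_dev_xmit_timeout: ";
	const char *p = strstr(line, tag);

	if (!p)
		return -1;
	p += sizeof(tag) - 1;	/* skip past the event name */

	/* Field order follows the TP_printk() format string in the patch. */
	if (sscanf(p, "dev=%31s driver=%31s queue=%d",
		   ev->dev, ev->driver, &ev->queue) != 3)
		return -1;

	return 0;
}
```

Passing the sample line from the commit message to `parse_xmit_timeout()` should fill the struct with `ens3`, `e1000` and `0`; a line for a different event, such as `net_dev_xmit`, is rejected with -1.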