diff options
Diffstat (limited to 'Documentation/networking/device_drivers/mellanox/mlx5.rst')
-rw-r--r-- | Documentation/networking/device_drivers/mellanox/mlx5.rst | 79 |
1 files changed, 77 insertions, 2 deletions
diff --git a/Documentation/networking/device_drivers/mellanox/mlx5.rst b/Documentation/networking/device_drivers/mellanox/mlx5.rst index 214325897732..b30a63dbf4b7 100644 --- a/Documentation/networking/device_drivers/mellanox/mlx5.rst +++ b/Documentation/networking/device_drivers/mellanox/mlx5.rst @@ -12,6 +12,7 @@ Contents - `Enabling the driver and kconfig options`_ - `Devlink info`_ - `Devlink health reporters`_ +- `mlx5 tracepoints`_ Enabling the driver and kconfig options ================================================ @@ -126,7 +127,7 @@ Devlink health reporters tx reporter ----------- -The tx reporter is responsible of two error scenarios: +The tx reporter is responsible for reporting and recovering of the following two error scenarios: - TX timeout Report on kernel tx timeout detection. @@ -135,7 +136,7 @@ The tx reporter is responsible of two error scenarios: Report on error tx completion. Recover by flushing the TX queue and reset it. -TX reporter also support Diagnose callback, on which it provides +TX reporter also support on demand diagnose callback, on which it provides real time information of its send queues status. User commands examples: @@ -144,11 +145,40 @@ User commands examples: $ devlink health diagnose pci/0000:82:00.0 reporter tx +NOTE: This command has valid output only when interface is up, otherwise the command has empty output. + - Show number of tx errors indicated, number of recover flows ended successfully, is autorecover enabled and graceful period from last recover:: $ devlink health show pci/0000:82:00.0 reporter tx +rx reporter +----------- +The rx reporter is responsible for reporting and recovering of the following two error scenarios: + +- RX queues initialization (population) timeout + RX queues descriptors population on ring initialization is done in + napi context via triggering an irq, in case of a failure to get + the minimum amount of descriptors, a timeout would occur and it + could be recoverable by polling the EQ (Event Queue). +- RX completions with errors (reported by HW on interrupt context) + Report on rx completion error. + Recover (if needed) by flushing the related queue and reset it. + +RX reporter also supports on demand diagnose callback, on which it +provides real time information of its receive queues status. + +- Diagnose rx queues status, and corresponding completion queue:: + + $ devlink health diagnose pci/0000:82:00.0 reporter rx + +NOTE: This command has valid output only when interface is up, otherwise the command has empty output. + +- Show number of rx errors indicated, number of recover flows ended successfully, + is autorecover enabled and graceful period from last recover:: + + $ devlink health show pci/0000:82:00.0 reporter rx + fw reporter ----------- The fw reporter implements diagnose and dump callbacks. @@ -190,3 +220,48 @@ User commands examples: $ devlink health dump show pci/0000:82:00.1 reporter fw_fatal NOTE: This command can run only on PF. + +mlx5 tracepoints +================ + +mlx5 driver provides internal trace points for tracking and debugging using +kernel tracepoints interfaces (refer to Documentation/trace/ftrase.rst). + +For the list of support mlx5 events check /sys/kernel/debug/tracing/events/mlx5/ + +tc and eswitch offloads tracepoints: + +- mlx5e_configure_flower: trace flower filter actions and cookies offloaded to mlx5:: + + $ echo mlx5:mlx5e_configure_flower >> /sys/kernel/debug/tracing/set_event + $ cat /sys/kernel/debug/tracing/trace + ... + tc-6535 [019] ...1 2672.404466: mlx5e_configure_flower: cookie=0000000067874a55 actions= REDIRECT + +- mlx5e_delete_flower: trace flower filter actions and cookies deleted from mlx5:: + + $ echo mlx5:mlx5e_delete_flower >> /sys/kernel/debug/tracing/set_event + $ cat /sys/kernel/debug/tracing/trace + ... + tc-6569 [010] .N.1 2686.379075: mlx5e_delete_flower: cookie=0000000067874a55 actions= NULL + +- mlx5e_stats_flower: trace flower stats request:: + + $ echo mlx5:mlx5e_stats_flower >> /sys/kernel/debug/tracing/set_event + $ cat /sys/kernel/debug/tracing/trace + ... + tc-6546 [010] ...1 2679.704889: mlx5e_stats_flower: cookie=0000000060eb3d6a bytes=0 packets=0 lastused=4295560217 + +- mlx5e_tc_update_neigh_used_value: trace tunnel rule neigh update value offloaded to mlx5:: + + $ echo mlx5:mlx5e_tc_update_neigh_used_value >> /sys/kernel/debug/tracing/set_event + $ cat /sys/kernel/debug/tracing/trace + ... + kworker/u48:4-8806 [009] ...1 55117.882428: mlx5e_tc_update_neigh_used_value: netdev: ens1f0 IPv4: 1.1.1.10 IPv6: ::ffff:1.1.1.10 neigh_used=1 + +- mlx5e_rep_neigh_update: trace neigh update tasks scheduled due to neigh state change events:: + + $ echo mlx5:mlx5e_rep_neigh_update >> /sys/kernel/debug/tracing/set_event + $ cat /sys/kernel/debug/tracing/trace + ... + kworker/u48:7-2221 [009] ...1 1475.387435: mlx5e_rep_neigh_update: netdev: ens1f0 MAC: 24:8a:07:9a:17:9a IPv4: 1.1.1.10 IPv6: ::ffff:1.1.1.10 neigh_connected=1 |