From f4bdd7103652fab5ac8b0ed75fa5cbc515b50b8b Mon Sep 17 00:00:00 2001 From: Jacob Keller Date: Thu, 9 Jan 2020 14:46:10 -0800 Subject: devlink: move devlink documentation to subfolder Combine the documentation for devlink into a subfolder, and provide an index.rst file that can be used to generally describe devlink. Signed-off-by: Jacob Keller Signed-off-by: David S. Miller --- Documentation/networking/devlink-health.txt | 86 ------- Documentation/networking/devlink-info-versions.rst | 64 ----- Documentation/networking/devlink-params-bnxt.txt | 18 -- Documentation/networking/devlink-params-mlx5.txt | 17 -- Documentation/networking/devlink-params-mlxsw.txt | 10 - .../networking/devlink-params-mv88e6xxx.txt | 7 - Documentation/networking/devlink-params-nfp.txt | 5 - .../networking/devlink-params-ti-cpsw-switch.txt | 10 - Documentation/networking/devlink-params.txt | 71 ------ .../networking/devlink-trap-netdevsim.rst | 20 -- Documentation/networking/devlink-trap.rst | 270 --------------------- .../networking/devlink/devlink-health.txt | 86 +++++++ .../networking/devlink/devlink-info-versions.rst | 64 +++++ .../networking/devlink/devlink-params-bnxt.txt | 18 ++ .../networking/devlink/devlink-params-mlx5.txt | 17 ++ .../networking/devlink/devlink-params-mlxsw.txt | 10 + .../devlink/devlink-params-mv88e6xxx.txt | 7 + .../networking/devlink/devlink-params-nfp.txt | 5 + .../devlink/devlink-params-ti-cpsw-switch.txt | 10 + .../networking/devlink/devlink-params.txt | 71 ++++++ .../networking/devlink/devlink-trap-netdevsim.rst | 20 ++ Documentation/networking/devlink/devlink-trap.rst | 270 +++++++++++++++++++++ Documentation/networking/devlink/index.rst | 14 ++ Documentation/networking/index.rst | 4 +- 24 files changed, 593 insertions(+), 581 deletions(-) delete mode 100644 Documentation/networking/devlink-health.txt delete mode 100644 Documentation/networking/devlink-info-versions.rst delete mode 100644 Documentation/networking/devlink-params-bnxt.txt delete mode 100644 Documentation/networking/devlink-params-mlx5.txt delete mode 100644 Documentation/networking/devlink-params-mlxsw.txt delete mode 100644 Documentation/networking/devlink-params-mv88e6xxx.txt delete mode 100644 Documentation/networking/devlink-params-nfp.txt delete mode 100644 Documentation/networking/devlink-params-ti-cpsw-switch.txt delete mode 100644 Documentation/networking/devlink-params.txt delete mode 100644 Documentation/networking/devlink-trap-netdevsim.rst delete mode 100644 Documentation/networking/devlink-trap.rst create mode 100644 Documentation/networking/devlink/devlink-health.txt create mode 100644 Documentation/networking/devlink/devlink-info-versions.rst create mode 100644 Documentation/networking/devlink/devlink-params-bnxt.txt create mode 100644 Documentation/networking/devlink/devlink-params-mlx5.txt create mode 100644 Documentation/networking/devlink/devlink-params-mlxsw.txt create mode 100644 Documentation/networking/devlink/devlink-params-mv88e6xxx.txt create mode 100644 Documentation/networking/devlink/devlink-params-nfp.txt create mode 100644 Documentation/networking/devlink/devlink-params-ti-cpsw-switch.txt create mode 100644 Documentation/networking/devlink/devlink-params.txt create mode 100644 Documentation/networking/devlink/devlink-trap-netdevsim.rst create mode 100644 Documentation/networking/devlink/devlink-trap.rst create mode 100644 Documentation/networking/devlink/index.rst (limited to 'Documentation') diff --git a/Documentation/networking/devlink-health.txt b/Documentation/networking/devlink-health.txt deleted file mode 100644 index 1db3fbea0831..000000000000 --- a/Documentation/networking/devlink-health.txt +++ /dev/null @@ -1,86 +0,0 @@ -The health mechanism is targeted for Real Time Alerting, in order to know when -something bad had happened to a PCI device -- Provide alert debug information -- Self healing -- If problem needs vendor support, provide a way to gather all needed debugging - information. - -The main idea is to unify and centralize driver health reports in the -generic devlink instance and allow the user to set different -attributes of the health reporting and recovery procedures. - -The devlink health reporter: -Device driver creates a "health reporter" per each error/health type. -Error/Health type can be a known/generic (eg pci error, fw error, rx/tx error) -or unknown (driver specific). -For each registered health reporter a driver can issue error/health reports -asynchronously. All health reports handling is done by devlink. -Device driver can provide specific callbacks for each "health reporter", e.g. - - Recovery procedures - - Diagnostics and object dump procedures - - OOB initial parameters -Different parts of the driver can register different types of health reporters -with different handlers. - -Once an error is reported, devlink health will do the following actions: - * A log is being send to the kernel trace events buffer - * Health status and statistics are being updated for the reporter instance - * Object dump is being taken and saved at the reporter instance (as long as - there is no other dump which is already stored) - * Auto recovery attempt is being done. Depends on: - - Auto-recovery configuration - - Grace period vs. time passed since last recover - -The user interface: -User can access/change each reporter's parameters and driver specific callbacks -via devlink, e.g per error type (per health reporter) - - Configure reporter's generic parameters (like: disable/enable auto recovery) - - Invoke recovery procedure - - Run diagnostics - - Object dump - -The devlink health interface (via netlink): -DEVLINK_CMD_HEALTH_REPORTER_GET - Retrieves status and configuration info per DEV and reporter. -DEVLINK_CMD_HEALTH_REPORTER_SET - Allows reporter-related configuration setting. -DEVLINK_CMD_HEALTH_REPORTER_RECOVER - Triggers a reporter's recovery procedure. -DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE - Retrieves diagnostics data from a reporter on a device. -DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET - Retrieves the last stored dump. Devlink health - saves a single dump. If an dump is not already stored by the devlink - for this reporter, devlink generates a new dump. - dump output is defined by the reporter. -DEVLINK_CMD_HEALTH_REPORTER_DUMP_CLEAR - Clears the last saved dump file for the specified reporter. - - - netlink - +--------------------------+ - | | - | + | - | | | - +--------------------------+ - |request for ops - |(diagnose, - mlx5_core devlink |recover, - |dump) -+--------+ +--------------------------+ -| | | reporter| | -| | | +---------v----------+ | -| | ops execution | | | | -| <----------------------------------+ | | -| | | | | | -| | | + ^------------------+ | -| | | | request for ops | -| | | | (recover, dump) | -| | | | | -| | | +-+------------------+ | -| | health report | | health handler | | -| +-------------------------------> | | -| | | +--------------------+ | -| | health reporter create | | -| +----------------------------> | -+--------+ +--------------------------+ diff --git a/Documentation/networking/devlink-info-versions.rst b/Documentation/networking/devlink-info-versions.rst deleted file mode 100644 index 4914f581b1fd..000000000000 --- a/Documentation/networking/devlink-info-versions.rst +++ /dev/null @@ -1,64 +0,0 @@ -.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) - -===================== -Devlink info versions -===================== - -board.id -======== - -Unique identifier of the board design. - -board.rev -========= - -Board design revision. - -asic.id -======= - -ASIC design identifier. - -asic.rev -======== - -ASIC design revision. - -board.manufacture -================= - -An identifier of the company or the facility which produced the part. - -fw -== - -Overall firmware version, often representing the collection of -fw.mgmt, fw.app, etc. - -fw.mgmt -======= - -Control unit firmware version. This firmware is responsible for house -keeping tasks, PHY control etc. but not the packet-by-packet data path -operation. - -fw.app -====== - -Data path microcode controlling high-speed packet processing. - -fw.undi -======= - -UNDI software, may include the UEFI driver, firmware or both. - -fw.ncsi -======= - -Version of the software responsible for supporting/handling the -Network Controller Sideband Interface. - -fw.psid -======= - -Unique identifier of the firmware parameter set. diff --git a/Documentation/networking/devlink-params-bnxt.txt b/Documentation/networking/devlink-params-bnxt.txt deleted file mode 100644 index 481aa303d5b4..000000000000 --- a/Documentation/networking/devlink-params-bnxt.txt +++ /dev/null @@ -1,18 +0,0 @@ -enable_sriov [DEVICE, GENERIC] - Configuration mode: Permanent - -ignore_ari [DEVICE, GENERIC] - Configuration mode: Permanent - -msix_vec_per_pf_max [DEVICE, GENERIC] - Configuration mode: Permanent - -msix_vec_per_pf_min [DEVICE, GENERIC] - Configuration mode: Permanent - -gre_ver_check [DEVICE, DRIVER-SPECIFIC] - Generic Routing Encapsulation (GRE) version check will - be enabled in the device. If disabled, device skips - version checking for incoming packets. - Type: Boolean - Configuration mode: Permanent diff --git a/Documentation/networking/devlink-params-mlx5.txt b/Documentation/networking/devlink-params-mlx5.txt deleted file mode 100644 index 5071467118bd..000000000000 --- a/Documentation/networking/devlink-params-mlx5.txt +++ /dev/null @@ -1,17 +0,0 @@ -flow_steering_mode [DEVICE, DRIVER-SPECIFIC] - Controls the flow steering mode of the driver. - Two modes are supported: - 1. 'dmfs' - Device managed flow steering. - 2. 'smfs - Software/Driver managed flow steering. - In DMFS mode, the HW steering entities are created and - managed through the Firmware. - In SMFS mode, the HW steering entities are created and - managed though by the driver directly into Hardware - without firmware intervention. - Type: String - Configuration mode: runtime - -enable_roce [DEVICE, GENERIC] - Enable handling of RoCE traffic in the device. - Defaultly enabled. - Configuration mode: driverinit diff --git a/Documentation/networking/devlink-params-mlxsw.txt b/Documentation/networking/devlink-params-mlxsw.txt deleted file mode 100644 index c63ea9fc7009..000000000000 --- a/Documentation/networking/devlink-params-mlxsw.txt +++ /dev/null @@ -1,10 +0,0 @@ -fw_load_policy [DEVICE, GENERIC] - Configuration mode: driverinit - -acl_region_rehash_interval [DEVICE, DRIVER-SPECIFIC] - Sets an interval for periodic ACL region rehashes. - The value is in milliseconds, minimal value is "3000". - Value "0" disables the periodic work. - The first rehash will be run right after value is set. - Type: u32 - Configuration mode: runtime diff --git a/Documentation/networking/devlink-params-mv88e6xxx.txt b/Documentation/networking/devlink-params-mv88e6xxx.txt deleted file mode 100644 index 21c4b3556ef2..000000000000 --- a/Documentation/networking/devlink-params-mv88e6xxx.txt +++ /dev/null @@ -1,7 +0,0 @@ -ATU_hash [DEVICE, DRIVER-SPECIFIC] - Select one of four possible hashing algorithms for - MAC addresses in the Address Translation Unit. - A value of 3 seems to work better than the default of - 1 when many MAC addresses have the same OUI. - Configuration mode: runtime - Type: u8. 0-3 valid. diff --git a/Documentation/networking/devlink-params-nfp.txt b/Documentation/networking/devlink-params-nfp.txt deleted file mode 100644 index 43e4d4034865..000000000000 --- a/Documentation/networking/devlink-params-nfp.txt +++ /dev/null @@ -1,5 +0,0 @@ -fw_load_policy [DEVICE, GENERIC] - Configuration mode: permanent - -reset_dev_on_drv_probe [DEVICE, GENERIC] - Configuration mode: permanent diff --git a/Documentation/networking/devlink-params-ti-cpsw-switch.txt b/Documentation/networking/devlink-params-ti-cpsw-switch.txt deleted file mode 100644 index 4037458499f7..000000000000 --- a/Documentation/networking/devlink-params-ti-cpsw-switch.txt +++ /dev/null @@ -1,10 +0,0 @@ -ale_bypass [DEVICE, DRIVER-SPECIFIC] - Allows to enable ALE_CONTROL(4).BYPASS mode for debug purposes. - All packets will be sent to the Host port only if enabled. - Type: bool - Configuration mode: runtime - -switch_mode [DEVICE, DRIVER-SPECIFIC] - Enable switch mode - Type: bool - Configuration mode: runtime diff --git a/Documentation/networking/devlink-params.txt b/Documentation/networking/devlink-params.txt deleted file mode 100644 index 04e234e9acc9..000000000000 --- a/Documentation/networking/devlink-params.txt +++ /dev/null @@ -1,71 +0,0 @@ -Devlink configuration parameters -================================ -Following is the list of configuration parameters via devlink interface. -Each parameter can be generic or driver specific and are device level -parameters. - -Note that the driver-specific files should contain the generic params -they support to, with supported config modes. - -Each parameter can be set in different configuration modes: - runtime - set while driver is running, no reset required. - driverinit - applied while driver initializes, requires restart - driver by devlink reload command. - permanent - written to device's non-volatile memory, hard reset - required. - -Following is the list of parameters: -==================================== -enable_sriov [DEVICE, GENERIC] - Enable Single Root I/O Virtualisation (SRIOV) in - the device. - Type: Boolean - -ignore_ari [DEVICE, GENERIC] - Ignore Alternative Routing-ID Interpretation (ARI) - capability. If enabled, adapter will ignore ARI - capability even when platforms has the support - enabled and creates same number of partitions when - platform does not support ARI. - Type: Boolean - -msix_vec_per_pf_max [DEVICE, GENERIC] - Provides the maximum number of MSIX interrupts that - a device can create. Value is same across all - physical functions (PFs) in the device. - Type: u32 - -msix_vec_per_pf_min [DEVICE, GENERIC] - Provides the minimum number of MSIX interrupts required - for the device initialization. Value is same across all - physical functions (PFs) in the device. - Type: u32 - -fw_load_policy [DEVICE, GENERIC] - Controls the device's firmware loading policy. - Valid values: - * DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_DRIVER (0) - Load firmware version preferred by the driver. - * DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_FLASH (1) - Load firmware currently stored in flash. - * DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_DISK (2) - Load firmware currently available on host's disk. - Type: u8 - -reset_dev_on_drv_probe [DEVICE, GENERIC] - Controls the device's reset policy on driver probe. - Valid values: - * DEVLINK_PARAM_RESET_DEV_ON_DRV_PROBE_VALUE_UNKNOWN (0) - Unknown or invalid value. - * DEVLINK_PARAM_RESET_DEV_ON_DRV_PROBE_VALUE_ALWAYS (1) - Always reset device on driver probe. - * DEVLINK_PARAM_RESET_DEV_ON_DRV_PROBE_VALUE_NEVER (2) - Never reset device on driver probe. - * DEVLINK_PARAM_RESET_DEV_ON_DRV_PROBE_VALUE_DISK (3) - Reset only if device firmware can be found in the - filesystem. - Type: u8 - -enable_roce [DEVICE, GENERIC] - Enable handling of RoCE traffic in the device. - Type: Boolean diff --git a/Documentation/networking/devlink-trap-netdevsim.rst b/Documentation/networking/devlink-trap-netdevsim.rst deleted file mode 100644 index b721c9415473..000000000000 --- a/Documentation/networking/devlink-trap-netdevsim.rst +++ /dev/null @@ -1,20 +0,0 @@ -.. SPDX-License-Identifier: GPL-2.0 - -====================== -Devlink Trap netdevsim -====================== - -Driver-specific Traps -===================== - -.. list-table:: List of Driver-specific Traps Registered by ``netdevsim`` - :widths: 5 5 90 - - * - Name - - Type - - Description - * - ``fid_miss`` - - ``exception`` - - When a packet enters the device it is classified to a filtering - indentifier (FID) based on the ingress port and VLAN. This trap is used - to trap packets for which a FID could not be found diff --git a/Documentation/networking/devlink-trap.rst b/Documentation/networking/devlink-trap.rst deleted file mode 100644 index 03311849bfb1..000000000000 --- a/Documentation/networking/devlink-trap.rst +++ /dev/null @@ -1,270 +0,0 @@ -.. SPDX-License-Identifier: GPL-2.0 - -============ -Devlink Trap -============ - -Background -========== - -Devices capable of offloading the kernel's datapath and perform functions such -as bridging and routing must also be able to send specific packets to the -kernel (i.e., the CPU) for processing. - -For example, a device acting as a multicast-aware bridge must be able to send -IGMP membership reports to the kernel for processing by the bridge module. -Without processing such packets, the bridge module could never populate its -MDB. - -As another example, consider a device acting as router which has received an IP -packet with a TTL of 1. Upon routing the packet the device must send it to the -kernel so that it will route it as well and generate an ICMP Time Exceeded -error datagram. Without letting the kernel route such packets itself, utilities -such as ``traceroute`` could never work. - -The fundamental ability of sending certain packets to the kernel for processing -is called "packet trapping". - -Overview -======== - -The ``devlink-trap`` mechanism allows capable device drivers to register their -supported packet traps with ``devlink`` and report trapped packets to -``devlink`` for further analysis. - -Upon receiving trapped packets, ``devlink`` will perform a per-trap packets and -bytes accounting and potentially report the packet to user space via a netlink -event along with all the provided metadata (e.g., trap reason, timestamp, input -port). This is especially useful for drop traps (see :ref:`Trap-Types`) -as it allows users to obtain further visibility into packet drops that would -otherwise be invisible. - -The following diagram provides a general overview of ``devlink-trap``:: - - Netlink event: Packet w/ metadata - Or a summary of recent drops - ^ - | - Userspace | - +---------------------------------------------------+ - Kernel | - | - +-------+--------+ - | | - | drop_monitor | - | | - +-------^--------+ - | - | - | - +----+----+ - | | Kernel's Rx path - | devlink | (non-drop traps) - | | - +----^----+ ^ - | | - +-----------+ - | - +-------+-------+ - | | - | Device driver | - | | - +-------^-------+ - Kernel | - +---------------------------------------------------+ - Hardware | - | Trapped packet - | - +--+---+ - | | - | ASIC | - | | - +------+ - -.. _Trap-Types: - -Trap Types -========== - -The ``devlink-trap`` mechanism supports the following packet trap types: - - * ``drop``: Trapped packets were dropped by the underlying device. Packets - are only processed by ``devlink`` and not injected to the kernel's Rx path. - The trap action (see :ref:`Trap-Actions`) can be changed. - * ``exception``: Trapped packets were not forwarded as intended by the - underlying device due to an exception (e.g., TTL error, missing neighbour - entry) and trapped to the control plane for resolution. Packets are - processed by ``devlink`` and injected to the kernel's Rx path. Changing the - action of such traps is not allowed, as it can easily break the control - plane. - -.. _Trap-Actions: - -Trap Actions -============ - -The ``devlink-trap`` mechanism supports the following packet trap actions: - - * ``trap``: The sole copy of the packet is sent to the CPU. - * ``drop``: The packet is dropped by the underlying device and a copy is not - sent to the CPU. - -Generic Packet Traps -==================== - -Generic packet traps are used to describe traps that trap well-defined packets -or packets that are trapped due to well-defined conditions (e.g., TTL error). -Such traps can be shared by multiple device drivers and their description must -be added to the following table: - -.. list-table:: List of Generic Packet Traps - :widths: 5 5 90 - - * - Name - - Type - - Description - * - ``source_mac_is_multicast`` - - ``drop`` - - Traps incoming packets that the device decided to drop because of a - multicast source MAC - * - ``vlan_tag_mismatch`` - - ``drop`` - - Traps incoming packets that the device decided to drop in case of VLAN - tag mismatch: The ingress bridge port is not configured with a PVID and - the packet is untagged or prio-tagged - * - ``ingress_vlan_filter`` - - ``drop`` - - Traps incoming packets that the device decided to drop in case they are - tagged with a VLAN that is not configured on the ingress bridge port - * - ``ingress_spanning_tree_filter`` - - ``drop`` - - Traps incoming packets that the device decided to drop in case the STP - state of the ingress bridge port is not "forwarding" - * - ``port_list_is_empty`` - - ``drop`` - - Traps packets that the device decided to drop in case they need to be - flooded (e.g., unknown unicast, unregistered multicast) and there are - no ports the packets should be flooded to - * - ``port_loopback_filter`` - - ``drop`` - - Traps packets that the device decided to drop in case after layer 2 - forwarding the only port from which they should be transmitted through - is the port from which they were received - * - ``blackhole_route`` - - ``drop`` - - Traps packets that the device decided to drop in case they hit a - blackhole route - * - ``ttl_value_is_too_small`` - - ``exception`` - - Traps unicast packets that should be forwarded by the device whose TTL - was decremented to 0 or less - * - ``tail_drop`` - - ``drop`` - - Traps packets that the device decided to drop because they could not be - enqueued to a transmission queue which is full - * - ``non_ip`` - - ``drop`` - - Traps packets that the device decided to drop because they need to - undergo a layer 3 lookup, but are not IP or MPLS packets - * - ``uc_dip_over_mc_dmac`` - - ``drop`` - - Traps packets that the device decided to drop because they need to be - routed and they have a unicast destination IP and a multicast destination - MAC - * - ``dip_is_loopback_address`` - - ``drop`` - - Traps packets that the device decided to drop because they need to be - routed and their destination IP is the loopback address (i.e., 127.0.0.0/8 - and ::1/128) - * - ``sip_is_mc`` - - ``drop`` - - Traps packets that the device decided to drop because they need to be - routed and their source IP is multicast (i.e., 224.0.0.0/8 and ff::/8) - * - ``sip_is_loopback_address`` - - ``drop`` - - Traps packets that the device decided to drop because they need to be - routed and their source IP is the loopback address (i.e., 127.0.0.0/8 and ::1/128) - * - ``ip_header_corrupted`` - - ``drop`` - - Traps packets that the device decided to drop because they need to be - routed and their IP header is corrupted: wrong checksum, wrong IP version - or too short Internet Header Length (IHL) - * - ``ipv4_sip_is_limited_bc`` - - ``drop`` - - Traps packets that the device decided to drop because they need to be - routed and their source IP is limited broadcast (i.e., 255.255.255.255/32) - * - ``ipv6_mc_dip_reserved_scope`` - - ``drop`` - - Traps IPv6 packets that the device decided to drop because they need to - be routed and their IPv6 multicast destination IP has a reserved scope - (i.e., ffx0::/16) - * - ``ipv6_mc_dip_interface_local_scope`` - - ``drop`` - - Traps IPv6 packets that the device decided to drop because they need to - be routed and their IPv6 multicast destination IP has an interface-local scope - (i.e., ffx1::/16) - * - ``mtu_value_is_too_small`` - - ``exception`` - - Traps packets that should have been routed by the device, but were bigger - than the MTU of the egress interface - * - ``unresolved_neigh`` - - ``exception`` - - Traps packets that did not have a matching IP neighbour after routing - * - ``mc_reverse_path_forwarding`` - - ``exception`` - - Traps multicast IP packets that failed reverse-path forwarding (RPF) - check during multicast routing - * - ``reject_route`` - - ``exception`` - - Traps packets that hit reject routes (i.e., "unreachable", "prohibit") - * - ``ipv4_lpm_miss`` - - ``exception`` - - Traps unicast IPv4 packets that did not match any route - * - ``ipv6_lpm_miss`` - - ``exception`` - - Traps unicast IPv6 packets that did not match any route - -Driver-specific Packet Traps -============================ - -Device drivers can register driver-specific packet traps, but these must be -clearly documented. Such traps can correspond to device-specific exceptions and -help debug packet drops caused by these exceptions. The following list includes -links to the description of driver-specific traps registered by various device -drivers: - - * :doc:`devlink-trap-netdevsim` - -Generic Packet Trap Groups -========================== - -Generic packet trap groups are used to aggregate logically related packet -traps. These groups allow the user to batch operations such as setting the trap -action of all member traps. In addition, ``devlink-trap`` can report aggregated -per-group packets and bytes statistics, in case per-trap statistics are too -narrow. The description of these groups must be added to the following table: - -.. list-table:: List of Generic Packet Trap Groups - :widths: 10 90 - - * - Name - - Description - * - ``l2_drops`` - - Contains packet traps for packets that were dropped by the device during - layer 2 forwarding (i.e., bridge) - * - ``l3_drops`` - - Contains packet traps for packets that were dropped by the device or hit - an exception (e.g., TTL error) during layer 3 forwarding - * - ``buffer_drops`` - - Contains packet traps for packets that were dropped by the device due to - an enqueue decision - -Testing -======= - -See ``tools/testing/selftests/drivers/net/netdevsim/devlink_trap.sh`` for a -test covering the core infrastructure. Test cases should be added for any new -functionality. - -Device drivers should focus their tests on device-specific functionality, such -as the triggering of supported packet traps. diff --git a/Documentation/networking/devlink/devlink-health.txt b/Documentation/networking/devlink/devlink-health.txt new file mode 100644 index 000000000000..1db3fbea0831 --- /dev/null +++ b/Documentation/networking/devlink/devlink-health.txt @@ -0,0 +1,86 @@ +The health mechanism is targeted for Real Time Alerting, in order to know when +something bad had happened to a PCI device +- Provide alert debug information +- Self healing +- If problem needs vendor support, provide a way to gather all needed debugging + information. + +The main idea is to unify and centralize driver health reports in the +generic devlink instance and allow the user to set different +attributes of the health reporting and recovery procedures. + +The devlink health reporter: +Device driver creates a "health reporter" per each error/health type. +Error/Health type can be a known/generic (eg pci error, fw error, rx/tx error) +or unknown (driver specific). +For each registered health reporter a driver can issue error/health reports +asynchronously. All health reports handling is done by devlink. +Device driver can provide specific callbacks for each "health reporter", e.g. + - Recovery procedures + - Diagnostics and object dump procedures + - OOB initial parameters +Different parts of the driver can register different types of health reporters +with different handlers. + +Once an error is reported, devlink health will do the following actions: + * A log is being send to the kernel trace events buffer + * Health status and statistics are being updated for the reporter instance + * Object dump is being taken and saved at the reporter instance (as long as + there is no other dump which is already stored) + * Auto recovery attempt is being done. Depends on: + - Auto-recovery configuration + - Grace period vs. time passed since last recover + +The user interface: +User can access/change each reporter's parameters and driver specific callbacks +via devlink, e.g per error type (per health reporter) + - Configure reporter's generic parameters (like: disable/enable auto recovery) + - Invoke recovery procedure + - Run diagnostics + - Object dump + +The devlink health interface (via netlink): +DEVLINK_CMD_HEALTH_REPORTER_GET + Retrieves status and configuration info per DEV and reporter. +DEVLINK_CMD_HEALTH_REPORTER_SET + Allows reporter-related configuration setting. +DEVLINK_CMD_HEALTH_REPORTER_RECOVER + Triggers a reporter's recovery procedure. +DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE + Retrieves diagnostics data from a reporter on a device. +DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET + Retrieves the last stored dump. Devlink health + saves a single dump. If an dump is not already stored by the devlink + for this reporter, devlink generates a new dump. + dump output is defined by the reporter. +DEVLINK_CMD_HEALTH_REPORTER_DUMP_CLEAR + Clears the last saved dump file for the specified reporter. + + + netlink + +--------------------------+ + | | + | + | + | | | + +--------------------------+ + |request for ops + |(diagnose, + mlx5_core devlink |recover, + |dump) ++--------+ +--------------------------+ +| | | reporter| | +| | | +---------v----------+ | +| | ops execution | | | | +| <----------------------------------+ | | +| | | | | | +| | | + ^------------------+ | +| | | | request for ops | +| | | | (recover, dump) | +| | | | | +| | | +-+------------------+ | +| | health report | | health handler | | +| +-------------------------------> | | +| | | +--------------------+ | +| | health reporter create | | +| +----------------------------> | ++--------+ +--------------------------+ diff --git a/Documentation/networking/devlink/devlink-info-versions.rst b/Documentation/networking/devlink/devlink-info-versions.rst new file mode 100644 index 000000000000..4914f581b1fd --- /dev/null +++ b/Documentation/networking/devlink/devlink-info-versions.rst @@ -0,0 +1,64 @@ +.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) + +===================== +Devlink info versions +===================== + +board.id +======== + +Unique identifier of the board design. + +board.rev +========= + +Board design revision. + +asic.id +======= + +ASIC design identifier. + +asic.rev +======== + +ASIC design revision. + +board.manufacture +================= + +An identifier of the company or the facility which produced the part. + +fw +== + +Overall firmware version, often representing the collection of +fw.mgmt, fw.app, etc. + +fw.mgmt +======= + +Control unit firmware version. This firmware is responsible for house +keeping tasks, PHY control etc. but not the packet-by-packet data path +operation. + +fw.app +====== + +Data path microcode controlling high-speed packet processing. + +fw.undi +======= + +UNDI software, may include the UEFI driver, firmware or both. + +fw.ncsi +======= + +Version of the software responsible for supporting/handling the +Network Controller Sideband Interface. + +fw.psid +======= + +Unique identifier of the firmware parameter set. diff --git a/Documentation/networking/devlink/devlink-params-bnxt.txt b/Documentation/networking/devlink/devlink-params-bnxt.txt new file mode 100644 index 000000000000..481aa303d5b4 --- /dev/null +++ b/Documentation/networking/devlink/devlink-params-bnxt.txt @@ -0,0 +1,18 @@ +enable_sriov [DEVICE, GENERIC] + Configuration mode: Permanent + +ignore_ari [DEVICE, GENERIC] + Configuration mode: Permanent + +msix_vec_per_pf_max [DEVICE, GENERIC] + Configuration mode: Permanent + +msix_vec_per_pf_min [DEVICE, GENERIC] + Configuration mode: Permanent + +gre_ver_check [DEVICE, DRIVER-SPECIFIC] + Generic Routing Encapsulation (GRE) version check will + be enabled in the device. If disabled, device skips + version checking for incoming packets. + Type: Boolean + Configuration mode: Permanent diff --git a/Documentation/networking/devlink/devlink-params-mlx5.txt b/Documentation/networking/devlink/devlink-params-mlx5.txt new file mode 100644 index 000000000000..5071467118bd --- /dev/null +++ b/Documentation/networking/devlink/devlink-params-mlx5.txt @@ -0,0 +1,17 @@ +flow_steering_mode [DEVICE, DRIVER-SPECIFIC] + Controls the flow steering mode of the driver. + Two modes are supported: + 1. 'dmfs' - Device managed flow steering. + 2. 'smfs - Software/Driver managed flow steering. + In DMFS mode, the HW steering entities are created and + managed through the Firmware. + In SMFS mode, the HW steering entities are created and + managed though by the driver directly into Hardware + without firmware intervention. + Type: String + Configuration mode: runtime + +enable_roce [DEVICE, GENERIC] + Enable handling of RoCE traffic in the device. + Defaultly enabled. + Configuration mode: driverinit diff --git a/Documentation/networking/devlink/devlink-params-mlxsw.txt b/Documentation/networking/devlink/devlink-params-mlxsw.txt new file mode 100644 index 000000000000..c63ea9fc7009 --- /dev/null +++ b/Documentation/networking/devlink/devlink-params-mlxsw.txt @@ -0,0 +1,10 @@ +fw_load_policy [DEVICE, GENERIC] + Configuration mode: driverinit + +acl_region_rehash_interval [DEVICE, DRIVER-SPECIFIC] + Sets an interval for periodic ACL region rehashes. + The value is in milliseconds, minimal value is "3000". + Value "0" disables the periodic work. + The first rehash will be run right after value is set. + Type: u32 + Configuration mode: runtime diff --git a/Documentation/networking/devlink/devlink-params-mv88e6xxx.txt b/Documentation/networking/devlink/devlink-params-mv88e6xxx.txt new file mode 100644 index 000000000000..21c4b3556ef2 --- /dev/null +++ b/Documentation/networking/devlink/devlink-params-mv88e6xxx.txt @@ -0,0 +1,7 @@ +ATU_hash [DEVICE, DRIVER-SPECIFIC] + Select one of four possible hashing algorithms for + MAC addresses in the Address Translation Unit. + A value of 3 seems to work better than the default of + 1 when many MAC addresses have the same OUI. + Configuration mode: runtime + Type: u8. 0-3 valid. diff --git a/Documentation/networking/devlink/devlink-params-nfp.txt b/Documentation/networking/devlink/devlink-params-nfp.txt new file mode 100644 index 000000000000..43e4d4034865 --- /dev/null +++ b/Documentation/networking/devlink/devlink-params-nfp.txt @@ -0,0 +1,5 @@ +fw_load_policy [DEVICE, GENERIC] + Configuration mode: permanent + +reset_dev_on_drv_probe [DEVICE, GENERIC] + Configuration mode: permanent diff --git a/Documentation/networking/devlink/devlink-params-ti-cpsw-switch.txt b/Documentation/networking/devlink/devlink-params-ti-cpsw-switch.txt new file mode 100644 index 000000000000..4037458499f7 --- /dev/null +++ b/Documentation/networking/devlink/devlink-params-ti-cpsw-switch.txt @@ -0,0 +1,10 @@ +ale_bypass [DEVICE, DRIVER-SPECIFIC] + Allows to enable ALE_CONTROL(4).BYPASS mode for debug purposes. + All packets will be sent to the Host port only if enabled. + Type: bool + Configuration mode: runtime + +switch_mode [DEVICE, DRIVER-SPECIFIC] + Enable switch mode + Type: bool + Configuration mode: runtime diff --git a/Documentation/networking/devlink/devlink-params.txt b/Documentation/networking/devlink/devlink-params.txt new file mode 100644 index 000000000000..04e234e9acc9 --- /dev/null +++ b/Documentation/networking/devlink/devlink-params.txt @@ -0,0 +1,71 @@ +Devlink configuration parameters +================================ +Following is the list of configuration parameters via devlink interface. +Each parameter can be generic or driver specific and are device level +parameters. + +Note that the driver-specific files should contain the generic params +they support to, with supported config modes. + +Each parameter can be set in different configuration modes: + runtime - set while driver is running, no reset required. + driverinit - applied while driver initializes, requires restart + driver by devlink reload command. + permanent - written to device's non-volatile memory, hard reset + required. + +Following is the list of parameters: +==================================== +enable_sriov [DEVICE, GENERIC] + Enable Single Root I/O Virtualisation (SRIOV) in + the device. + Type: Boolean + +ignore_ari [DEVICE, GENERIC] + Ignore Alternative Routing-ID Interpretation (ARI) + capability. If enabled, adapter will ignore ARI + capability even when platforms has the support + enabled and creates same number of partitions when + platform does not support ARI. + Type: Boolean + +msix_vec_per_pf_max [DEVICE, GENERIC] + Provides the maximum number of MSIX interrupts that + a device can create. Value is same across all + physical functions (PFs) in the device. + Type: u32 + +msix_vec_per_pf_min [DEVICE, GENERIC] + Provides the minimum number of MSIX interrupts required + for the device initialization. Value is same across all + physical functions (PFs) in the device. + Type: u32 + +fw_load_policy [DEVICE, GENERIC] + Controls the device's firmware loading policy. + Valid values: + * DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_DRIVER (0) + Load firmware version preferred by the driver. + * DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_FLASH (1) + Load firmware currently stored in flash. + * DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_DISK (2) + Load firmware currently available on host's disk. + Type: u8 + +reset_dev_on_drv_probe [DEVICE, GENERIC] + Controls the device's reset policy on driver probe. + Valid values: + * DEVLINK_PARAM_RESET_DEV_ON_DRV_PROBE_VALUE_UNKNOWN (0) + Unknown or invalid value. + * DEVLINK_PARAM_RESET_DEV_ON_DRV_PROBE_VALUE_ALWAYS (1) + Always reset device on driver probe. + * DEVLINK_PARAM_RESET_DEV_ON_DRV_PROBE_VALUE_NEVER (2) + Never reset device on driver probe. + * DEVLINK_PARAM_RESET_DEV_ON_DRV_PROBE_VALUE_DISK (3) + Reset only if device firmware can be found in the + filesystem. + Type: u8 + +enable_roce [DEVICE, GENERIC] + Enable handling of RoCE traffic in the device. + Type: Boolean diff --git a/Documentation/networking/devlink/devlink-trap-netdevsim.rst b/Documentation/networking/devlink/devlink-trap-netdevsim.rst new file mode 100644 index 000000000000..b721c9415473 --- /dev/null +++ b/Documentation/networking/devlink/devlink-trap-netdevsim.rst @@ -0,0 +1,20 @@ +.. SPDX-License-Identifier: GPL-2.0 + +====================== +Devlink Trap netdevsim +====================== + +Driver-specific Traps +===================== + +.. list-table:: List of Driver-specific Traps Registered by ``netdevsim`` + :widths: 5 5 90 + + * - Name + - Type + - Description + * - ``fid_miss`` + - ``exception`` + - When a packet enters the device it is classified to a filtering + indentifier (FID) based on the ingress port and VLAN. This trap is used + to trap packets for which a FID could not be found diff --git a/Documentation/networking/devlink/devlink-trap.rst b/Documentation/networking/devlink/devlink-trap.rst new file mode 100644 index 000000000000..03311849bfb1 --- /dev/null +++ b/Documentation/networking/devlink/devlink-trap.rst @@ -0,0 +1,270 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============ +Devlink Trap +============ + +Background +========== + +Devices capable of offloading the kernel's datapath and perform functions such +as bridging and routing must also be able to send specific packets to the +kernel (i.e., the CPU) for processing. + +For example, a device acting as a multicast-aware bridge must be able to send +IGMP membership reports to the kernel for processing by the bridge module. +Without processing such packets, the bridge module could never populate its +MDB. + +As another example, consider a device acting as router which has received an IP +packet with a TTL of 1. Upon routing the packet the device must send it to the +kernel so that it will route it as well and generate an ICMP Time Exceeded +error datagram. Without letting the kernel route such packets itself, utilities +such as ``traceroute`` could never work. + +The fundamental ability of sending certain packets to the kernel for processing +is called "packet trapping". + +Overview +======== + +The ``devlink-trap`` mechanism allows capable device drivers to register their +supported packet traps with ``devlink`` and report trapped packets to +``devlink`` for further analysis. + +Upon receiving trapped packets, ``devlink`` will perform a per-trap packets and +bytes accounting and potentially report the packet to user space via a netlink +event along with all the provided metadata (e.g., trap reason, timestamp, input +port). This is especially useful for drop traps (see :ref:`Trap-Types`) +as it allows users to obtain further visibility into packet drops that would +otherwise be invisible. + +The following diagram provides a general overview of ``devlink-trap``:: + + Netlink event: Packet w/ metadata + Or a summary of recent drops + ^ + | + Userspace | + +---------------------------------------------------+ + Kernel | + | + +-------+--------+ + | | + | drop_monitor | + | | + +-------^--------+ + | + | + | + +----+----+ + | | Kernel's Rx path + | devlink | (non-drop traps) + | | + +----^----+ ^ + | | + +-----------+ + | + +-------+-------+ + | | + | Device driver | + | | + +-------^-------+ + Kernel | + +---------------------------------------------------+ + Hardware | + | Trapped packet + | + +--+---+ + | | + | ASIC | + | | + +------+ + +.. _Trap-Types: + +Trap Types +========== + +The ``devlink-trap`` mechanism supports the following packet trap types: + + * ``drop``: Trapped packets were dropped by the underlying device. Packets + are only processed by ``devlink`` and not injected to the kernel's Rx path. + The trap action (see :ref:`Trap-Actions`) can be changed. + * ``exception``: Trapped packets were not forwarded as intended by the + underlying device due to an exception (e.g., TTL error, missing neighbour + entry) and trapped to the control plane for resolution. Packets are + processed by ``devlink`` and injected to the kernel's Rx path. Changing the + action of such traps is not allowed, as it can easily break the control + plane. + +.. _Trap-Actions: + +Trap Actions +============ + +The ``devlink-trap`` mechanism supports the following packet trap actions: + + * ``trap``: The sole copy of the packet is sent to the CPU. + * ``drop``: The packet is dropped by the underlying device and a copy is not + sent to the CPU. + +Generic Packet Traps +==================== + +Generic packet traps are used to describe traps that trap well-defined packets +or packets that are trapped due to well-defined conditions (e.g., TTL error). +Such traps can be shared by multiple device drivers and their description must +be added to the following table: + +.. list-table:: List of Generic Packet Traps + :widths: 5 5 90 + + * - Name + - Type + - Description + * - ``source_mac_is_multicast`` + - ``drop`` + - Traps incoming packets that the device decided to drop because of a + multicast source MAC + * - ``vlan_tag_mismatch`` + - ``drop`` + - Traps incoming packets that the device decided to drop in case of VLAN + tag mismatch: The ingress bridge port is not configured with a PVID and + the packet is untagged or prio-tagged + * - ``ingress_vlan_filter`` + - ``drop`` + - Traps incoming packets that the device decided to drop in case they are + tagged with a VLAN that is not configured on the ingress bridge port + * - ``ingress_spanning_tree_filter`` + - ``drop`` + - Traps incoming packets that the device decided to drop in case the STP + state of the ingress bridge port is not "forwarding" + * - ``port_list_is_empty`` + - ``drop`` + - Traps packets that the device decided to drop in case they need to be + flooded (e.g., unknown unicast, unregistered multicast) and there are + no ports the packets should be flooded to + * - ``port_loopback_filter`` + - ``drop`` + - Traps packets that the device decided to drop in case after layer 2 + forwarding the only port from which they should be transmitted through + is the port from which they were received + * - ``blackhole_route`` + - ``drop`` + - Traps packets that the device decided to drop in case they hit a + blackhole route + * - ``ttl_value_is_too_small`` + - ``exception`` + - Traps unicast packets that should be forwarded by the device whose TTL + was decremented to 0 or less + * - ``tail_drop`` + - ``drop`` + - Traps packets that the device decided to drop because they could not be + enqueued to a transmission queue which is full + * - ``non_ip`` + - ``drop`` + - Traps packets that the device decided to drop because they need to + undergo a layer 3 lookup, but are not IP or MPLS packets + * - ``uc_dip_over_mc_dmac`` + - ``drop`` + - Traps packets that the device decided to drop because they need to be + routed and they have a unicast destination IP and a multicast destination + MAC + * - ``dip_is_loopback_address`` + - ``drop`` + - Traps packets that the device decided to drop because they need to be + routed and their destination IP is the loopback address (i.e., 127.0.0.0/8 + and ::1/128) + * - ``sip_is_mc`` + - ``drop`` + - Traps packets that the device decided to drop because they need to be + routed and their source IP is multicast (i.e., 224.0.0.0/8 and ff::/8) + * - ``sip_is_loopback_address`` + - ``drop`` + - Traps packets that the device decided to drop because they need to be + routed and their source IP is the loopback address (i.e., 127.0.0.0/8 and ::1/128) + * - ``ip_header_corrupted`` + - ``drop`` + - Traps packets that the device decided to drop because they need to be + routed and their IP header is corrupted: wrong checksum, wrong IP version + or too short Internet Header Length (IHL) + * - ``ipv4_sip_is_limited_bc`` + - ``drop`` + - Traps packets that the device decided to drop because they need to be + routed and their source IP is limited broadcast (i.e., 255.255.255.255/32) + * - ``ipv6_mc_dip_reserved_scope`` + - ``drop`` + - Traps IPv6 packets that the device decided to drop because they need to + be routed and their IPv6 multicast destination IP has a reserved scope + (i.e., ffx0::/16) + * - ``ipv6_mc_dip_interface_local_scope`` + - ``drop`` + - Traps IPv6 packets that the device decided to drop because they need to + be routed and their IPv6 multicast destination IP has an interface-local scope + (i.e., ffx1::/16) + * - ``mtu_value_is_too_small`` + - ``exception`` + - Traps packets that should have been routed by the device, but were bigger + than the MTU of the egress interface + * - ``unresolved_neigh`` + - ``exception`` + - Traps packets that did not have a matching IP neighbour after routing + * - ``mc_reverse_path_forwarding`` + - ``exception`` + - Traps multicast IP packets that failed reverse-path forwarding (RPF) + check during multicast routing + * - ``reject_route`` + - ``exception`` + - Traps packets that hit reject routes (i.e., "unreachable", "prohibit") + * - ``ipv4_lpm_miss`` + - ``exception`` + - Traps unicast IPv4 packets that did not match any route + * - ``ipv6_lpm_miss`` + - ``exception`` + - Traps unicast IPv6 packets that did not match any route + +Driver-specific Packet Traps +============================ + +Device drivers can register driver-specific packet traps, but these must be +clearly documented. Such traps can correspond to device-specific exceptions and +help debug packet drops caused by these exceptions. The following list includes +links to the description of driver-specific traps registered by various device +drivers: + + * :doc:`devlink-trap-netdevsim` + +Generic Packet Trap Groups +========================== + +Generic packet trap groups are used to aggregate logically related packet +traps. These groups allow the user to batch operations such as setting the trap +action of all member traps. In addition, ``devlink-trap`` can report aggregated +per-group packets and bytes statistics, in case per-trap statistics are too +narrow. The description of these groups must be added to the following table: + +.. list-table:: List of Generic Packet Trap Groups + :widths: 10 90 + + * - Name + - Description + * - ``l2_drops`` + - Contains packet traps for packets that were dropped by the device during + layer 2 forwarding (i.e., bridge) + * - ``l3_drops`` + - Contains packet traps for packets that were dropped by the device or hit + an exception (e.g., TTL error) during layer 3 forwarding + * - ``buffer_drops`` + - Contains packet traps for packets that were dropped by the device due to + an enqueue decision + +Testing +======= + +See ``tools/testing/selftests/drivers/net/netdevsim/devlink_trap.sh`` for a +test covering the core infrastructure. Test cases should be added for any new +functionality. + +Device drivers should focus their tests on device-specific functionality, such +as the triggering of supported packet traps. diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst new file mode 100644 index 000000000000..1252c2a1b680 --- /dev/null +++ b/Documentation/networking/devlink/index.rst @@ -0,0 +1,14 @@ +Linux Devlink Documentation +=========================== + +devlink is an API to expose device information and resources not directly +related to any device class, such as chip-wide/switch-ASIC-wide configuration. + +Contents: + +.. toctree:: + :maxdepth: 1 + + devlink-info-versions + devlink-trap + devlink-trap-netdevsim diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst index bee73be7af93..d07d9855dcd3 100644 --- a/Documentation/networking/index.rst +++ b/Documentation/networking/index.rst @@ -13,9 +13,7 @@ Contents: can_ucan_protocol device_drivers/index dsa/index - devlink-info-versions - devlink-trap - devlink-trap-netdevsim + devlink/index ethtool-netlink ieee802154 j1939 -- cgit v1.2.3-59-g8ed1b