Diffstat (limited to 'Documentation')
-rw-r--r-- Documentation/ABI/testing/sysfs-bus-mdio | 63
-rw-r--r-- Documentation/devicetree/bindings/net/broadcom-bluetooth.txt | 7
-rw-r--r-- Documentation/devicetree/bindings/net/dsa/ar9331.txt | 148
-rw-r--r-- Documentation/devicetree/bindings/net/mediatek-dwmac.txt | 33
-rw-r--r-- Documentation/devicetree/bindings/net/ti,dp83867.txt | 12
-rw-r--r-- Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml | 273
-rw-r--r-- Documentation/devicetree/bindings/ptp/ptp-ines.txt | 35
-rw-r--r-- Documentation/devicetree/bindings/ptp/timestamper.txt | 42
-rw-r--r-- Documentation/networking/device_drivers/index.rst | 1
-rw-r--r-- Documentation/networking/device_drivers/netronome/nfp.rst | 116
-rw-r--r-- Documentation/networking/device_drivers/stmicro/stmmac.rst | 697
-rw-r--r-- Documentation/networking/device_drivers/stmicro/stmmac.txt | 401
-rw-r--r-- Documentation/networking/device_drivers/ti/cpsw_switchdev.txt | 2
-rw-r--r-- Documentation/networking/devlink-health.txt | 86
-rw-r--r-- Documentation/networking/devlink-info-versions.rst | 64
-rw-r--r-- Documentation/networking/devlink-params-bnxt.txt | 18
-rw-r--r-- Documentation/networking/devlink-params-mlx5.txt | 17
-rw-r--r-- Documentation/networking/devlink-params-mlxsw.txt | 10
-rw-r--r-- Documentation/networking/devlink-params-mv88e6xxx.txt | 7
-rw-r--r-- Documentation/networking/devlink-params-nfp.txt | 5
-rw-r--r-- Documentation/networking/devlink-params-ti-cpsw-switch.txt | 10
-rw-r--r-- Documentation/networking/devlink-params.txt | 71
-rw-r--r-- Documentation/networking/devlink-trap-netdevsim.rst | 20
-rw-r--r-- Documentation/networking/devlink/bnxt.rst | 41
-rw-r--r-- Documentation/networking/devlink/devlink-dpipe.rst | 252
-rw-r--r-- Documentation/networking/devlink/devlink-health.rst | 114
-rw-r--r-- Documentation/networking/devlink/devlink-info.rst | 94
-rw-r--r-- Documentation/networking/devlink/devlink-params.rst | 108
-rw-r--r-- Documentation/networking/devlink/devlink-region.rst | 60
-rw-r--r-- Documentation/networking/devlink/devlink-resource.rst | 62
-rw-r--r-- Documentation/networking/devlink/devlink-trap.rst (renamed from Documentation/networking/devlink-trap.rst) | 21
-rw-r--r-- Documentation/networking/devlink/index.rst | 42
-rw-r--r-- Documentation/networking/devlink/ionic.rst | 29
-rw-r--r-- Documentation/networking/devlink/mlx4.rst | 56
-rw-r--r-- Documentation/networking/devlink/mlx5.rst | 59
-rw-r--r-- Documentation/networking/devlink/mlxsw.rst | 81
-rw-r--r-- Documentation/networking/devlink/mv88e6xxx.rst | 28
-rw-r--r-- Documentation/networking/devlink/netdevsim.rst | 72
-rw-r--r-- Documentation/networking/devlink/nfp.rst | 65
-rw-r--r-- Documentation/networking/devlink/qed.rst | 26
-rw-r--r-- Documentation/networking/devlink/ti-cpsw-switch.rst | 31
-rw-r--r-- Documentation/networking/ethtool-netlink.rst | 520
-rw-r--r-- Documentation/networking/index.rst | 5
-rw-r--r-- Documentation/networking/ip-sysctl.txt | 4
-rw-r--r-- Documentation/networking/phy.rst | 18
-rw-r--r-- Documentation/networking/sfp-phylink.rst | 3
46 files changed, 3201 insertions, 728 deletions
diff --git a/Documentation/ABI/testing/sysfs-bus-mdio b/Documentation/ABI/testing/sysfs-bus-mdio
new file mode 100644
index 000000000000..da86efc7781b
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-mdio
@@ -0,0 +1,63 @@
+What: /sys/bus/mdio_bus/devices/.../statistics/
+Date: January 2020
+KernelVersion: 5.6
+Contact: netdev@vger.kernel.org
+Description:
+ This folder contains global statistics for the MDIO bus and
+ per-address statistics for each MDIO bus address.
+
+What: /sys/bus/mdio_bus/devices/.../statistics/transfers
+Date: January 2020
+KernelVersion: 5.6
+Contact: netdev@vger.kernel.org
+Description:
+ Total number of transfers for this MDIO bus.
+
+What: /sys/bus/mdio_bus/devices/.../statistics/errors
+Date: January 2020
+KernelVersion: 5.6
+Contact: netdev@vger.kernel.org
+Description:
+ Total number of transfer errors for this MDIO bus.
+
+What: /sys/bus/mdio_bus/devices/.../statistics/writes
+Date: January 2020
+KernelVersion: 5.6
+Contact: netdev@vger.kernel.org
+Description:
+ Total number of write transactions for this MDIO bus.
+
+What: /sys/bus/mdio_bus/devices/.../statistics/reads
+Date: January 2020
+KernelVersion: 5.6
+Contact: netdev@vger.kernel.org
+Description:
+ Total number of read transactions for this MDIO bus.
+
+What: /sys/bus/mdio_bus/devices/.../statistics/transfers_<addr>
+Date: January 2020
+KernelVersion: 5.6
+Contact: netdev@vger.kernel.org
+Description:
+ Total number of transfers for this MDIO bus address.
+
+What: /sys/bus/mdio_bus/devices/.../statistics/errors_<addr>
+Date: January 2020
+KernelVersion: 5.6
+Contact: netdev@vger.kernel.org
+Description:
+ Total number of transfer errors for this MDIO bus address.
+
+What: /sys/bus/mdio_bus/devices/.../statistics/writes_<addr>
+Date: January 2020
+KernelVersion: 5.6
+Contact: netdev@vger.kernel.org
+Description:
+ Total number of write transactions for this MDIO bus address.
+
+What: /sys/bus/mdio_bus/devices/.../statistics/reads_<addr>
+Date: January 2020
+KernelVersion: 5.6
+Contact: netdev@vger.kernel.org
+Description:
+ Total number of read transactions for this MDIO bus address.
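The counters documented above can be collected from user space with a few lines of Python. This is a minimal sketch, not part of the patch: it assumes each attribute file holds a single decimal integer and that the `<addr>` suffix is a decimal bus address. The function names `read_mdio_statistics` and `split_per_address` are hypothetical helpers introduced only for illustration.

```python
from pathlib import Path

# Counter base names taken from the ABI entries above.
COUNTER_NAMES = ("transfers", "errors", "writes", "reads")


def read_mdio_statistics(stats_dir):
    """Return {attribute_name: int} for every file in a statistics/ directory.

    Assumption: each attribute file contains one decimal integer.
    """
    counters = {}
    for entry in sorted(Path(stats_dir).iterdir()):
        if entry.is_file():
            counters[entry.name] = int(entry.read_text().strip())
    return counters


def split_per_address(counters):
    """Split bus-wide counters from per-address ones named <counter>_<addr>.

    Assumption: <addr> is written in decimal.
    """
    bus, per_addr = {}, {}
    for name, value in counters.items():
        base, sep, addr = name.rpartition("_")
        if sep and base in COUNTER_NAMES and addr.isdigit():
            per_addr.setdefault(int(addr), {})[base] = value
        else:
            bus[name] = value
    return bus, per_addr
```

A caller would pass the path of one bus's `statistics/` directory under `/sys/bus/mdio_bus/devices/`; the elided part of that path depends on the system and is left to the reader.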
diff --git a/Documentation/devicetree/bindings/net/broadcom-bluetooth.txt b/Documentation/devicetree/bindings/net/broadcom-bluetooth.txt
index f16b99571af1..b5eadee4a9a7 100644
--- a/Documentation/devicetree/bindings/net/broadcom-bluetooth.txt
+++ b/Documentation/devicetree/bindings/net/broadcom-bluetooth.txt
@@ -30,6 +30,12 @@ Optional properties:
- "lpo": external low power 32.768 kHz clock
- vbat-supply: phandle to regulator supply for VBAT
- vddio-supply: phandle to regulator supply for VDDIO
+ - brcm,bt-pcm-int-params: configure PCM parameters via a 5-byte array
+ - sco-routing: 0 = PCM, 1 = Transport, 2 = Codec, 3 = I2S
+ - pcm-interface-rate: 128KBps, 256KBps, 512KBps, 1024KBps, 2048KBps
+ - pcm-frame-type: short, long
+ - pcm-sync-mode: slave, master
+ - pcm-clock-mode: slave, master
Example:
@@ -41,5 +47,6 @@ Example:
bluetooth {
compatible = "brcm,bcm43438-bt";
max-speed = <921600>;
+ brcm,bt-pcm-int-params = [01 02 00 01 01];
};
};
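To make the 5-byte `brcm,bt-pcm-int-params` array easier to read, the following sketch decodes it field by field. It is illustrative only. The binding gives explicit numbers for sco-routing; for the other four fields the code assumes that the values are zero-based indexes into the lists in the order the binding prints them. The helper name `decode_pcm_int_params` is hypothetical.

```python
# Lookup tables in the order listed by the binding text.
# Assumption: apart from sco-routing, the numeric encoding is the
# zero-based position in each list.
SCO_ROUTING = ("PCM", "Transport", "Codec", "I2S")
PCM_INTERFACE_RATE = ("128KBps", "256KBps", "512KBps", "1024KBps", "2048KBps")
PCM_FRAME_TYPE = ("short", "long")
PCM_ROLE = ("slave", "master")

FIELDS = (
    ("sco-routing", SCO_ROUTING),
    ("pcm-interface-rate", PCM_INTERFACE_RATE),
    ("pcm-frame-type", PCM_FRAME_TYPE),
    ("pcm-sync-mode", PCM_ROLE),
    ("pcm-clock-mode", PCM_ROLE),
)


def decode_pcm_int_params(raw):
    """Translate the 5-byte array into a {field: label} dictionary."""
    if len(raw) != len(FIELDS):
        raise ValueError("expected exactly 5 bytes")
    decoded = {}
    for (name, table), value in zip(FIELDS, raw):
        if value >= len(table):
            raise ValueError(f"{name}: value {value} out of range")
        decoded[name] = table[value]
    return decoded
```

Under those assumptions, the array from the example, `[01 02 00 01 01]`, decodes as Transport routing, a 512KBps interface rate, short frames, and master for both sync and clock mode.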
diff --git a/Documentation/devicetree/bindings/net/dsa/ar9331.txt b/Documentation/devicetree/bindings/net/dsa/ar9331.txt
new file mode 100644
index 000000000000..320607cbbb17
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/dsa/ar9331.txt
@@ -0,0 +1,148 @@
+Atheros AR9331 built-in switch
+=============================
+
+This switch is built into the Atheros AR9331 WiSoC and is addressable over the
+internal MDIO bus. All PHYs are built in as well.
+
+Required properties:
+
+ - compatible: should be: "qca,ar9331-switch"
+ - reg: Address on the MII bus for the switch.
+ - resets : Must contain an entry for each entry in reset-names.
+ - reset-names : Must include the following entries: "switch"
+ - interrupt-parent: Phandle to the parent interrupt controller
+ - interrupts: IRQ line for the switch
+ - interrupt-controller: Indicates the switch is itself an interrupt
+ controller. This is used for the PHY interrupts.
+ - #interrupt-cells: must be 1
+ - mdio: Container of PHY and devices on the switch's MDIO bus.
+
+See Documentation/devicetree/bindings/net/dsa/dsa.txt for a list of additional
+required and optional properties.
+Examples:
+
+eth0: ethernet@19000000 {
+ compatible = "qca,ar9330-eth";
+ reg = <0x19000000 0x200>;
+ interrupts = <4>;
+
+ resets = <&rst 9>, <&rst 22>;
+ reset-names = "mac", "mdio";
+ clocks = <&pll ATH79_CLK_AHB>, <&pll ATH79_CLK_AHB>;
+ clock-names = "eth", "mdio";
+
+ phy-mode = "mii";
+ phy-handle = <&phy_port4>;
+};
+
+eth1: ethernet@1a000000 {
+ compatible = "qca,ar9330-eth";
+ reg = <0x1a000000 0x200>;
+ interrupts = <5>;
+ resets = <&rst 13>, <&rst 23>;
+ reset-names = "mac", "mdio";
+ clocks = <&pll ATH79_CLK_AHB>, <&pll ATH79_CLK_AHB>;
+ clock-names = "eth", "mdio";
+
+ phy-mode = "gmii";
+
+ fixed-link {
+ speed = <1000>;
+ full-duplex;
+ };
+
+ mdio {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ switch10: switch@10 {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ compatible = "qca,ar9331-switch";
+ reg = <0x10>;
+ resets = <&rst 8>;
+ reset-names = "switch";
+
+ interrupt-parent = <&miscintc>;
+ interrupts = <12>;
+
+ interrupt-controller;
+ #interrupt-cells = <1>;
+
+ ports {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ switch_port0: port@0 {
+ reg = <0x0>;
+ label = "cpu";
+ ethernet = <&eth1>;
+
+ phy-mode = "gmii";
+
+ fixed-link {
+ speed = <1000>;
+ full-duplex;
+ };
+ };
+
+ switch_port1: port@1 {
+ reg = <0x1>;
+ phy-handle = <&phy_port0>;
+ phy-mode = "internal";
+ };
+
+ switch_port2: port@2 {
+ reg = <0x2>;
+ phy-handle = <&phy_port1>;
+ phy-mode = "internal";
+ };
+
+ switch_port3: port@3 {
+ reg = <0x3>;
+ phy-handle = <&phy_port2>;
+ phy-mode = "internal";
+ };
+
+ switch_port4: port@4 {
+ reg = <0x4>;
+ phy-handle = <&phy_port3>;
+ phy-mode = "internal";
+ };
+ };
+
+ mdio {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ interrupt-parent = <&switch10>;
+
+ phy_port0: phy@0 {
+ reg = <0x0>;
+ interrupts = <0>;
+ };
+
+ phy_port1: phy@1 {
+ reg = <0x1>;
+ interrupts = <0>;
+ };
+
+ phy_port2: phy@2 {
+ reg = <0x2>;
+ interrupts = <0>;
+ };
+
+ phy_port3: phy@3 {
+ reg = <0x3>;
+ interrupts = <0>;
+ };
+
+ phy_port4: phy@4 {
+ reg = <0x4>;
+ interrupts = <0>;
+ };
+ };
+ };
+ };
+};
diff --git a/Documentation/devicetree/bindings/net/mediatek-dwmac.txt b/Documentation/devicetree/bindings/net/mediatek-dwmac.txt
index 8a08621a5b54..afbcaebf062e 100644
--- a/Documentation/devicetree/bindings/net/mediatek-dwmac.txt
+++ b/Documentation/devicetree/bindings/net/mediatek-dwmac.txt
@@ -14,7 +14,7 @@ Required properties:
Should be "macirq" for the main MAC IRQ
- clocks: Must contain a phandle for each entry in clock-names.
- clock-names: The name of the clock listed in the clocks property. These are
- "axi", "apb", "mac_main", "ptp_ref" for MT2712 SoC
+ "axi", "apb", "mac_main", "ptp_ref", "rmii_internal" for MT2712 SoC.
- mac-address: See ethernet.txt in the same directory
- phy-mode: See ethernet.txt in the same directory
- mediatek,pericfg: A phandle to the syscon node that control ethernet
@@ -23,8 +23,10 @@ Required properties:
Optional properties:
- mediatek,tx-delay-ps: TX clock delay macro value. Default is 0.
It should be defined for RGMII/MII interface.
+ It should be defined for RMII interface when the reference clock is from MT2712 SoC.
- mediatek,rx-delay-ps: RX clock delay macro value. Default is 0.
- It should be defined for RGMII/MII/RMII interface.
+ It should be defined for RGMII/MII interface.
+ It should be defined for RMII interface.
Both delay properties need to be a multiple of 170 for RGMII interface,
or will round down. Range 0~31*170.
Both delay properties need to be a multiple of 550 for MII/RMII interface,
@@ -34,13 +36,20 @@ or will round down. Range 0~31*550.
reference clock, which is from external PHYs, is connected to RXC pin
on MT2712 SoC.
Otherwise, is connected to TXC pin.
+- mediatek,rmii-clk-from-mac: boolean property, if present indicates that
+ MT2712 SoC provides the RMII reference clock, which outputs to TXC pin only.
- mediatek,txc-inverse: boolean property, if present indicates that
1. tx clock will be inversed in MII/RGMII case,
2. tx clock inside MAC will be inversed relative to reference clock
which is from external PHYs in RMII case, and it rarely happen.
+ 3. the reference clock, which outputs to TXC pin will be inversed in RMII case
+ when the reference clock is from MT2712 SoC.
- mediatek,rxc-inverse: boolean property, if present indicates that
1. rx clock will be inversed in MII/RGMII case.
- 2. reference clock will be inversed when arrived at MAC in RMII case.
+ 2. reference clock will be inversed when arrived at MAC in RMII case, when
+ the reference clock is from external PHYs.
+ 3. the inside clock, which is sent to MAC, will be inversed in RMII case when
+ the reference clock is from MT2712 SoC.
- assigned-clocks: mac_main and ptp_ref clocks
- assigned-clock-parents: parent clocks of the assigned clocks
@@ -50,29 +59,33 @@ Example:
reg = <0 0x1101c000 0 0x1300>;
interrupts = <GIC_SPI 237 IRQ_TYPE_LEVEL_LOW>;
interrupt-names = "macirq";
- phy-mode ="rgmii";
+ phy-mode ="rgmii-rxid";
mac-address = [00 55 7b b5 7d f7];
clock-names = "axi",
"apb",
"mac_main",
"ptp_ref",
- "ptp_top";
+ "rmii_internal";
clocks = <&pericfg CLK_PERI_GMAC>,
<&pericfg CLK_PERI_GMAC_PCLK>,
<&topckgen CLK_TOP_ETHER_125M_SEL>,
- <&topckgen CLK_TOP_ETHER_50M_SEL>;
+ <&topckgen CLK_TOP_ETHER_50M_SEL>,
+ <&topckgen CLK_TOP_ETHER_50M_RMII_SEL>;
assigned-clocks = <&topckgen CLK_TOP_ETHER_125M_SEL>,
- <&topckgen CLK_TOP_ETHER_50M_SEL>;
+ <&topckgen CLK_TOP_ETHER_50M_SEL>,
+ <&topckgen CLK_TOP_ETHER_50M_RMII_SEL>;
assigned-clock-parents = <&topckgen CLK_TOP_ETHERPLL_125M>,
- <&topckgen CLK_TOP_APLL1_D3>;
+ <&topckgen CLK_TOP_APLL1_D3>,
+ <&topckgen CLK_TOP_ETHERPLL_50M>;
+ power-domains = <&scpsys MT2712_POWER_DOMAIN_AUDIO>;
mediatek,pericfg = <&pericfg>;
mediatek,tx-delay-ps = <1530>;
mediatek,rx-delay-ps = <1530>;
mediatek,rmii-rxc;
mediatek,txc-inverse;
mediatek,rxc-inverse;
- snps,txpbl = <32>;
- snps,rxpbl = <32>;
+ snps,txpbl = <1>;
+ snps,rxpbl = <1>;
snps,reset-gpio = <&pio 87 GPIO_ACTIVE_LOW>;
snps,reset-active-low;
};
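The rounding rule for `mediatek,tx-delay-ps` and `mediatek,rx-delay-ps` can be checked with a short calculation. The sketch below only restates what the binding says (steps of 170 ps for RGMII, 550 ps for MII/RMII, round down, at most 31 steps). It is not taken from the driver, and the helper names are hypothetical. Treating phy-mode variants such as "rgmii-rxid" as RGMII is an assumption made for the example.

```python
# Step size in picoseconds per interface family, as stated in the binding.
STEP_PS = {"rgmii": 170, "mii": 550, "rmii": 550}
MAX_STEPS = 31


def _family(phy_mode):
    # Assumption: "rgmii-id", "rgmii-rxid", "rgmii-txid" all use the RGMII step.
    return phy_mode.split("-")[0]


def delay_to_steps(delay_ps, phy_mode):
    """Number of whole delay steps for a requested delay (rounds down)."""
    step = STEP_PS[_family(phy_mode)]
    if not 0 <= delay_ps <= MAX_STEPS * step:
        raise ValueError(f"delay must be within 0..{MAX_STEPS * step} ps")
    return delay_ps // step


def effective_delay_ps(delay_ps, phy_mode):
    """Delay that results after rounding down to a whole step."""
    return delay_to_steps(delay_ps, phy_mode) * STEP_PS[_family(phy_mode)]
```

For the example node, 1530 ps with "rgmii-rxid" is exactly 9 steps of 170 ps, so nothing is lost to rounding. The same 1530 ps on an RMII link would round down to 2 steps, or 1100 ps.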
diff --git a/Documentation/devicetree/bindings/net/ti,dp83867.txt b/Documentation/devicetree/bindings/net/ti,dp83867.txt
index 388ff48f53ae..44e2a4fab29e 100644
--- a/Documentation/devicetree/bindings/net/ti,dp83867.txt
+++ b/Documentation/devicetree/bindings/net/ti,dp83867.txt
@@ -8,8 +8,6 @@ Required properties:
- ti,tx-internal-delay - RGMII Transmit Clock Delay - see dt-bindings/net/ti-dp83867.h
for applicable values. Required only if interface type is
PHY_INTERFACE_MODE_RGMII_ID or PHY_INTERFACE_MODE_RGMII_TXID
- - ti,fifo-depth - Transmitt FIFO depth- see dt-bindings/net/ti-dp83867.h
- for applicable values
Note: If the interface type is PHY_INTERFACE_MODE_RGMII the TX/RX clock delays
will be left at their default values, as set by the PHY's pin strapping.
@@ -42,6 +40,14 @@ Optional property:
Some MACs work with differential SGMII clock.
See data manual for details.
+ - ti,fifo-depth - Transmit FIFO depth - see dt-bindings/net/ti-dp83867.h
+ for applicable values (deprecated)
+
+ - tx-fifo-depth - As defined in ethernet-controller.yaml. Values for
+ the depth can be found in dt-bindings/net/ti-dp83867.h
+ - rx-fifo-depth - As defined in ethernet-controller.yaml. Values for
+ the depth can be found in dt-bindings/net/ti-dp83867.h
+
Note: ti,min-output-impedance and ti,max-output-impedance are mutually
exclusive. When both properties are present ti,max-output-impedance
takes precedence.
@@ -55,7 +61,7 @@ Example:
reg = <0>;
ti,rx-internal-delay = <DP83867_RGMIIDCTL_2_25_NS>;
ti,tx-internal-delay = <DP83867_RGMIIDCTL_2_75_NS>;
- ti,fifo-depth = <DP83867_PHYCR_FIFO_DEPTH_4_B_NIB>;
+ tx-fifo-depth = <DP83867_PHYCR_FIFO_DEPTH_4_B_NIB>;
};
Datasheet can be found:
diff --git a/Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml b/Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml
new file mode 100644
index 000000000000..a1717db36dba
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/wireless/qcom,ath11k.yaml
@@ -0,0 +1,273 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+# Copyright (c) 2018-2019 The Linux Foundation. All rights reserved.
+
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/wireless/qcom,ath11k.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm Technologies ath11k wireless devices Generic Binding
+
+maintainers:
+ - Kalle Valo <kvalo@codeaurora.org>
+
+description: |
+ These are dt entries for Qualcomm Technologies, Inc. IEEE 802.11ax
+ devices, for example the AHB-based IPQ8074.
+
+properties:
+ compatible:
+ const: qcom,ipq8074-wifi
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ items:
+ - description: misc-pulse1 interrupt events
+ - description: misc-latch interrupt events
+ - description: sw exception interrupt events
+ - description: watchdog interrupt events
+ - description: interrupt event for ring CE0
+ - description: interrupt event for ring CE1
+ - description: interrupt event for ring CE2
+ - description: interrupt event for ring CE3
+ - description: interrupt event for ring CE4
+ - description: interrupt event for ring CE5
+ - description: interrupt event for ring CE6
+ - description: interrupt event for ring CE7
+ - description: interrupt event for ring CE8
+ - description: interrupt event for ring CE9
+ - description: interrupt event for ring CE10
+ - description: interrupt event for ring CE11
+ - description: interrupt event for ring host2wbm-desc-feed
+ - description: interrupt event for ring host2reo-re-injection
+ - description: interrupt event for ring host2reo-command
+ - description: interrupt event for ring host2rxdma-monitor-ring3
+ - description: interrupt event for ring host2rxdma-monitor-ring2
+ - description: interrupt event for ring host2rxdma-monitor-ring1
+ - description: interrupt event for ring reo2ost-exception
+ - description: interrupt event for ring wbm2host-rx-release
+ - description: interrupt event for ring reo2host-status
+ - description: interrupt event for ring reo2host-destination-ring4
+ - description: interrupt event for ring reo2host-destination-ring3
+ - description: interrupt event for ring reo2host-destination-ring2
+ - description: interrupt event for ring reo2host-destination-ring1
+ - description: interrupt event for ring rxdma2host-monitor-destination-mac3
+ - description: interrupt event for ring rxdma2host-monitor-destination-mac2
+ - description: interrupt event for ring rxdma2host-monitor-destination-mac1
+ - description: interrupt event for ring ppdu-end-interrupts-mac3
+ - description: interrupt event for ring ppdu-end-interrupts-mac2
+ - description: interrupt event for ring ppdu-end-interrupts-mac1
+ - description: interrupt event for ring rxdma2host-monitor-status-ring-mac3
+ - description: interrupt event for ring rxdma2host-monitor-status-ring-mac2
+ - description: interrupt event for ring rxdma2host-monitor-status-ring-mac1
+ - description: interrupt event for ring host2rxdma-host-buf-ring-mac3
+ - description: interrupt event for ring host2rxdma-host-buf-ring-mac2
+ - description: interrupt event for ring host2rxdma-host-buf-ring-mac1
+ - description: interrupt event for ring rxdma2host-destination-ring-mac3
+ - description: interrupt event for ring rxdma2host-destination-ring-mac2
+ - description: interrupt event for ring rxdma2host-destination-ring-mac1
+ - description: interrupt event for ring host2tcl-input-ring4
+ - description: interrupt event for ring host2tcl-input-ring3
+ - description: interrupt event for ring host2tcl-input-ring2
+ - description: interrupt event for ring host2tcl-input-ring1
+ - description: interrupt event for ring wbm2host-tx-completions-ring3
+ - description: interrupt event for ring wbm2host-tx-completions-ring2
+ - description: interrupt event for ring wbm2host-tx-completions-ring1
+ - description: interrupt event for ring tcl2host-status-ring
+
+
+ interrupt-names:
+ items:
+ - const: misc-pulse1
+ - const: misc-latch
+ - const: sw-exception
+ - const: watchdog
+ - const: ce0
+ - const: ce1
+ - const: ce2
+ - const: ce3
+ - const: ce4
+ - const: ce5
+ - const: ce6
+ - const: ce7
+ - const: ce8
+ - const: ce9
+ - const: ce10
+ - const: ce11
+ - const: host2wbm-desc-feed
+ - const: host2reo-re-injection
+ - const: host2reo-command
+ - const: host2rxdma-monitor-ring3
+ - const: host2rxdma-monitor-ring2
+ - const: host2rxdma-monitor-ring1
+ - const: reo2ost-exception
+ - const: wbm2host-rx-release
+ - const: reo2host-status
+ - const: reo2host-destination-ring4
+ - const: reo2host-destination-ring3
+ - const: reo2host-destination-ring2
+ - const: reo2host-destination-ring1
+ - const: rxdma2host-monitor-destination-mac3
+ - const: rxdma2host-monitor-destination-mac2
+ - const: rxdma2host-monitor-destination-mac1
+ - const: ppdu-end-interrupts-mac3
+ - const: ppdu-end-interrupts-mac2
+ - const: ppdu-end-interrupts-mac1
+ - const: rxdma2host-monitor-status-ring-mac3
+ - const: rxdma2host-monitor-status-ring-mac2
+ - const: rxdma2host-monitor-status-ring-mac1
+ - const: host2rxdma-host-buf-ring-mac3
+ - const: host2rxdma-host-buf-ring-mac2
+ - const: host2rxdma-host-buf-ring-mac1
+ - const: rxdma2host-destination-ring-mac3
+ - const: rxdma2host-destination-ring-mac2
+ - const: rxdma2host-destination-ring-mac1
+ - const: host2tcl-input-ring4
+ - const: host2tcl-input-ring3
+ - const: host2tcl-input-ring2
+ - const: host2tcl-input-ring1
+ - const: wbm2host-tx-completions-ring3
+ - const: wbm2host-tx-completions-ring2
+ - const: wbm2host-tx-completions-ring1
+ - const: tcl2host-status-ring
+
+ qcom,rproc:
+ $ref: /schemas/types.yaml#definitions/phandle
+ description:
+ DT entry of q6v5-wcss remoteproc driver.
+ Phandle to a node that can contain the following properties
+ * compatible
+ * reg
+ * reg-names
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - interrupt-names
+ - qcom,rproc
+
+additionalProperties: false
+
+examples:
+ - |
+
+ q6v5_wcss: q6v5_wcss@CD00000 {
+ compatible = "qcom,ipq8074-wcss-pil";
+ reg = <0xCD00000 0x4040>,
+ <0x4AB000 0x20>;
+ reg-names = "qdsp6",
+ "rmb";
+ };
+
+ wifi0: wifi@c000000 {
+ compatible = "qcom,ipq8074-wifi";
+ reg = <0xc000000 0x2000000>;
+ interrupts = <0 320 1>,
+ <0 319 1>,
+ <0 318 1>,
+ <0 317 1>,
+ <0 316 1>,
+ <0 315 1>,
+ <0 314 1>,
+ <0 311 1>,
+ <0 310 1>,
+ <0 411 1>,
+ <0 410 1>,
+ <0 40 1>,
+ <0 39 1>,
+ <0 302 1>,
+ <0 301 1>,
+ <0 37 1>,
+ <0 36 1>,
+ <0 296 1>,
+ <0 295 1>,
+ <0 294 1>,
+ <0 293 1>,
+ <0 292 1>,
+ <0 291 1>,
+ <0 290 1>,
+ <0 289 1>,
+ <0 288 1>,
+ <0 239 1>,
+ <0 236 1>,
+ <0 235 1>,
+ <0 234 1>,
+ <0 233 1>,
+ <0 232 1>,
+ <0 231 1>,
+ <0 230 1>,
+ <0 229 1>,
+ <0 228 1>,
+ <0 224 1>,
+ <0 223 1>,
+ <0 203 1>,
+ <0 183 1>,
+ <0 180 1>,
+ <0 179 1>,
+ <0 178 1>,
+ <0 177 1>,
+ <0 176 1>,
+ <0 163 1>,
+ <0 162 1>,
+ <0 160 1>,
+ <0 159 1>,
+ <0 158 1>,
+ <0 157 1>,
+ <0 156 1>;
+ interrupt-names = "misc-pulse1",
+ "misc-latch",
+ "sw-exception",
+ "watchdog",
+ "ce0",
+ "ce1",
+ "ce2",
+ "ce3",
+ "ce4",
+ "ce5",
+ "ce6",
+ "ce7",
+ "ce8",
+ "ce9",
+ "ce10",
+ "ce11",
+ "host2wbm-desc-feed",
+ "host2reo-re-injection",
+ "host2reo-command",
+ "host2rxdma-monitor-ring3",
+ "host2rxdma-monitor-ring2",
+ "host2rxdma-monitor-ring1",
+ "reo2ost-exception",
+ "wbm2host-rx-release",
+ "reo2host-status",
+ "reo2host-destination-ring4",
+ "reo2host-destination-ring3",
+ "reo2host-destination-ring2",
+ "reo2host-destination-ring1",
+ "rxdma2host-monitor-destination-mac3",
+ "rxdma2host-monitor-destination-mac2",
+ "rxdma2host-monitor-destination-mac1",
+ "ppdu-end-interrupts-mac3",
+ "ppdu-end-interrupts-mac2",
+ "ppdu-end-interrupts-mac1",
+ "rxdma2host-monitor-status-ring-mac3",
+ "rxdma2host-monitor-status-ring-mac2",
+ "rxdma2host-monitor-status-ring-mac1",
+ "host2rxdma-host-buf-ring-mac3",
+ "host2rxdma-host-buf-ring-mac2",
+ "host2rxdma-host-buf-ring-mac1",
+ "rxdma2host-destination-ring-mac3",
+ "rxdma2host-destination-ring-mac2",
+ "rxdma2host-destination-ring-mac1",
+ "host2tcl-input-ring4",
+ "host2tcl-input-ring3",
+ "host2tcl-input-ring2",
+ "host2tcl-input-ring1",
+ "wbm2host-tx-completions-ring3",
+ "wbm2host-tx-completions-ring2",
+ "wbm2host-tx-completions-ring1",
+ "tcl2host-status-ring";
+ qcom,rproc = <&q6v5_wcss>;
+ };
diff --git a/Documentation/devicetree/bindings/ptp/ptp-ines.txt b/Documentation/devicetree/bindings/ptp/ptp-ines.txt
new file mode 100644
index 000000000000..4c242bd1ce9c
--- /dev/null
+++ b/Documentation/devicetree/bindings/ptp/ptp-ines.txt
@@ -0,0 +1,35 @@
+ZHAW InES PTP time stamping IP core
+
+The IP core needs two different kinds of nodes. The control node
+lives somewhere in the memory map and specifies the address of the
+control registers. There can be up to three port handles placed as
+attributes of PHY nodes. These associate a particular MII bus with a
+port index within the IP core.
+
+Required properties of the control node:
+
+- compatible: "ines,ptp-ctrl"
+- reg: physical address and size of the register bank
+
+Required format of the port handle within the PHY node:
+
+- timestamper: provides control node reference and
+ the port channel within the IP core
+
+Example:
+
+ tstamper: timestamper@60000000 {
+ compatible = "ines,ptp-ctrl";
+ reg = <0x60000000 0x80>;
+ };
+
+ ethernet@80000000 {
+ ...
+ mdio {
+ ...
+ ethernet-phy@3 {
+ ...
+ timestamper = <&tstamper 0>;
+ };
+ };
+ };
diff --git a/Documentation/devicetree/bindings/ptp/timestamper.txt b/Documentation/devicetree/bindings/ptp/timestamper.txt
new file mode 100644
index 000000000000..fc550ce4d4ea
--- /dev/null
+++ b/Documentation/devicetree/bindings/ptp/timestamper.txt
@@ -0,0 +1,42 @@
+Time stamps from MII bus snooping devices
+
+This binding supports non-PHY devices that snoop the MII bus and
+provide time stamps. In contrast to PHY time stamping drivers (which
+can simply attach their interface directly to the PHY instance), stand
+alone MII time stamping drivers use this binding to specify the
+connection between the snooping device and a given network interface.
+
+Non-PHY MII time stamping drivers typically talk to the control
+interface over another bus like I2C, SPI, UART, or via a memory mapped
+peripheral. This controller device is associated with one or more
+time stamping channels, each of which snoops on a MII bus.
+
+The "timestamper" property lives in a phy node and links a time
+stamping channel from the controller device to that phy's MII bus.
+
+Example:
+
+ tstamper: timestamper@10000000 {
+ compatible = "ines,ptp-ctrl";
+ reg = <0x10000000 0x80>;
+ };
+
+ ethernet@20000000 {
+ mdio {
+ ethernet-phy@1 {
+ timestamper = <&tstamper 0>;
+ };
+ };
+ };
+
+ ethernet@30000000 {
+ mdio {
+ ethernet-phy@2 {
+ timestamper = <&tstamper 1>;
+ };
+ };
+ };
+
+In this example, time stamps from the MII bus attached to phy@1 will
+appear on time stamp channel 0 (zero), and those from phy@2 appear on
+channel 1.
diff --git a/Documentation/networking/device_drivers/index.rst b/Documentation/networking/device_drivers/index.rst
index c1f7f75e5fd9..4bc6ff29976a 100644
--- a/Documentation/networking/device_drivers/index.rst
+++ b/Documentation/networking/device_drivers/index.rst
@@ -25,6 +25,7 @@ Contents:
mellanox/mlx5
netronome/nfp
pensando/ionic
+ stmicro/stmmac
.. only:: subproject and html
diff --git a/Documentation/networking/device_drivers/netronome/nfp.rst b/Documentation/networking/device_drivers/netronome/nfp.rst
index 6c08ac8b5147..ada611fb427c 100644
--- a/Documentation/networking/device_drivers/netronome/nfp.rst
+++ b/Documentation/networking/device_drivers/netronome/nfp.rst
@@ -131,3 +131,119 @@ abi_drv_reset
abi_drv_load_ifc
Defines a list of PF devices allowed to load FW on the device.
This variable is not currently user configurable.
+
+Statistics
+==========
+
+The following device statistics are available through the ``ethtool -S`` interface:
+
+.. flat-table:: NFP device statistics
+ :header-rows: 1
+ :widths: 3 1 11
+
+ * - Name
+ - ID
+ - Meaning
+
+ * - dev_rx_discards
+ - 1
+ - A packet can be discarded on the RX path for one of the following reasons:
+
+ * The NIC is not in promisc mode, and the destination MAC address
+ doesn't match the interface's MAC address.
+ * The received packet is larger than the max buffer size on the host.
+ I.e. it exceeds the Layer 3 MRU.
+ * There is no freelist descriptor available on the host for the packet.
+ It is likely that the NIC couldn't cache one in time.
+ * A BPF program discarded the packet.
+ * The datapath drop action was executed.
+ * The MAC discarded the packet due to lack of ingress buffer space
+ on the NIC.
+
+ * - dev_rx_errors
+ - 2
+ - A packet can be counted (and dropped) as RX error for the following
+ reasons:
+
+ * A problem with the VEB lookup (only when SR-IOV is used).
+ * A physical layer problem that causes Ethernet errors, like FCS or
+ alignment errors. The cause is usually faulty cables or SFPs.
+
+ * - dev_rx_bytes
+ - 3
+ - Total number of bytes received.
+
+ * - dev_rx_uc_bytes
+ - 4
+ - Unicast bytes received.
+
+ * - dev_rx_mc_bytes
+ - 5
+ - Multicast bytes received.
+
+ * - dev_rx_bc_bytes
+ - 6
+ - Broadcast bytes received.
+
+ * - dev_rx_pkts
+ - 7
+ - Total number of packets received.
+
+ * - dev_rx_mc_pkts
+ - 8
+ - Multicast packets received.
+
+ * - dev_rx_bc_pkts
+ - 9
+ - Broadcast packets received.
+
+ * - dev_tx_discards
+ - 10
+ - A packet can be discarded in the TX direction if the MAC is
+ being flow controlled and the NIC runs out of TX queue space.
+
+ * - dev_tx_errors
+ - 11
+ - A packet can be counted as TX error (and dropped) for one of the
+ following reasons:
+
+ * The packet is an LSO segment, but the Layer 3 or Layer 4 offset
+ could not be determined. Therefore LSO could not continue.
+ * An invalid packet descriptor was received over PCIe.
+ * The packet Layer 3 length exceeds the device MTU.
+ * An error on the MAC/physical layer. Usually due to faulty cables or
+ SFPs.
+ * A CTM buffer could not be allocated.
+ * The packet offset was incorrect and could not be fixed by the NIC.
+
+ * - dev_tx_bytes
+ - 12
+ - Total number of bytes transmitted.
+
+ * - dev_tx_uc_bytes
+ - 13
+ - Unicast bytes transmitted.
+
+ * - dev_tx_mc_bytes
+ - 14
+ - Multicast bytes transmitted.
+
+ * - dev_tx_bc_bytes
+ - 15
+ - Broadcast bytes transmitted.
+
+ * - dev_tx_pkts
+ - 16
+ - Total number of packets transmitted.
+
+ * - dev_tx_mc_pkts
+ - 17
+ - Multicast packets transmitted.
+
+ * - dev_tx_bc_pkts
+ - 18
+ - Broadcast packets transmitted.
+
+Note that statistics unknown to the driver will be displayed as
+``dev_unknown_stat$ID``, where ``$ID`` refers to the second column
+above.
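As a rough illustration of how the `dev_unknown_stat$ID` naming could be handled in a monitoring script, the sketch below parses `ethtool -S` style text into a dictionary and lists the IDs of counters the driver did not recognise. It assumes the usual `name: value` line layout of `ethtool -S` output; the function names are hypothetical and any sample input is made up, not captured from a real device.

```python
import re

# Matches the fallback name described above, capturing the numeric ID.
_UNKNOWN_STAT = re.compile(r"^dev_unknown_stat(\d+)$")


def parse_ethtool_stats(text):
    """Parse 'name: value' lines into {name: int}, skipping other lines."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        value = value.strip()
        # Header lines such as "NIC statistics:" have no numeric value.
        if sep and name and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def unknown_stat_ids(stats):
    """Return the sorted IDs of statistics reported as dev_unknown_stat<ID>."""
    ids = []
    for name in stats:
        match = _UNKNOWN_STAT.match(name)
        if match:
            ids.append(int(match.group(1)))
    return sorted(ids)
```

An ID returned by `unknown_stat_ids` that is missing from the table above would suggest the firmware exposes a counter newer than the driver knows about.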
diff --git a/Documentation/networking/device_drivers/stmicro/stmmac.rst b/Documentation/networking/device_drivers/stmicro/stmmac.rst
new file mode 100644
index 000000000000..c34bab3d2df0
--- /dev/null
+++ b/Documentation/networking/device_drivers/stmicro/stmmac.rst
@@ -0,0 +1,697 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+==============================================================
+Linux Driver for the Synopsys(R) Ethernet Controllers "stmmac"
+==============================================================
+
+Authors: Giuseppe Cavallaro <peppe.cavallaro@st.com>,
+Alexandre Torgue <alexandre.torgue@st.com>, Jose Abreu <joabreu@synopsys.com>
+
+Contents
+========
+
+- In This Release
+- Feature List
+- Kernel Configuration
+- Command Line Parameters
+- Driver Information and Notes
+- Debug Information
+- Support
+
+In This Release
+===============
+
+This file describes the stmmac Linux Driver for all the Synopsys(R) Ethernet
+Controllers.
+
+Currently, this network device driver is for all STi embedded MAC/GMAC
+(i.e. 7xxx/5xxx SoCs), SPEAr (arm), Loongson1B (mips) and XILINX XC2V3000
+FF1152AMT0221 D1215994A VIRTEX FPGA board. The Synopsys Ethernet QoS 5.0 IPK
+is also supported.
+
+DesignWare(R) Cores Ethernet MAC 10/100/1000 Universal version 3.70a
+(and older), DesignWare(R) Cores Ethernet Quality-of-Service version 4.0
+(and newer), and DesignWare(R) Cores XGMAC - 10G Ethernet MAC were used to
+develop this driver.
+
+This driver supports both the platform bus and PCI.
+
+This driver includes support for the following Synopsys(R) DesignWare(R)
+Cores Ethernet Controllers and corresponding minimum and maximum versions:
+
++-------------------------------+--------------+--------------+--------------+
+| Controller Name | Min. Version | Max. Version | Abbrev. Name |
++===============================+==============+==============+==============+
+| Ethernet MAC Universal | N/A | 3.73a | GMAC |
++-------------------------------+--------------+--------------+--------------+
+| Ethernet Quality-of-Service | 4.00a | N/A | GMAC4+ |
++-------------------------------+--------------+--------------+--------------+
+| XGMAC - 10G Ethernet MAC | 2.10a | N/A | XGMAC2+ |
++-------------------------------+--------------+--------------+--------------+
+
+For questions related to hardware requirements, refer to the documentation
+supplied with your Ethernet adapter. All hardware requirements listed apply
+to use with Linux.
+
+Feature List
+============
+
+The following features are available in this driver:
+ - GMII/MII/RGMII/SGMII/RMII/XGMII Interface
+ - Half-Duplex / Full-Duplex Operation
+ - Energy Efficient Ethernet (EEE)
+ - IEEE 802.3x PAUSE Packets (Flow Control)
+ - RMON/MIB Counters
+ - IEEE 1588 Timestamping (PTP)
+ - Pulse-Per-Second Output (PPS)
+ - MDIO Clause 22 / Clause 45 Interface
+ - MAC Loopback
+ - ARP Offloading
+ - Automatic CRC / PAD Insertion and Checking
+ - Checksum Offload for Received and Transmitted Packets
+ - Standard or Jumbo Ethernet Packets
+ - Source Address Insertion / Replacement
+ - VLAN TAG Insertion / Replacement / Deletion / Filtering (HASH and PERFECT)
+ - Programmable TX and RX Watchdog and Coalesce Settings
+ - Destination Address Filtering (PERFECT)
+ - HASH Filtering (Multicast)
+ - Layer 3 / Layer 4 Filtering
+ - Remote Wake-Up Detection
+ - Receive Side Scaling (RSS)
+ - Frame Preemption for TX and RX
+ - Programmable Burst Length, Threshold, Queue Size
+ - Multiple Queues (up to 8)
+ - Multiple Scheduling Algorithms (TX: WRR, DWRR, WFQ, SP, CBS, EST, TBS;
+ RX: WRR, SP)
+ - Flexible RX Parser
+ - TCP / UDP Segmentation Offload (TSO, USO)
+ - Split Header (SPH)
+ - Safety Features (ECC Protection, Data Parity Protection)
+ - Selftests using Ethtool
+
+Kernel Configuration
+====================
+
+The kernel configuration option is ``CONFIG_STMMAC_ETH``:
+ - ``CONFIG_STMMAC_PLATFORM``: enables the platform driver.
+ - ``CONFIG_STMMAC_PCI``: enables the PCI driver.
+
+Command Line Parameters
+=======================
+
+If the driver is built as a module, the following optional parameters can be
+passed on the command line with the modprobe command using this syntax
+(e.g. for the PCI module)::
+
+ modprobe stmmac_pci [<option>=<VAL1>,<VAL2>,...]
+
+Driver parameters can also be passed on the kernel command line by using::
+
+ stmmaceth=watchdog:100,chain_mode=1
+
+The default value for each parameter is generally the recommended setting,
+unless otherwise noted.
+
+watchdog
+--------
+:Valid Range: 5000-None
+:Default Value: 5000
+
+This parameter overrides the transmit timeout in milliseconds.
+
+debug
+-----
+:Valid Range: 0-16 (0=none,...,16=all)
+:Default Value: 0
+
+This parameter adjusts the level of debug messages displayed in the system
+logs.
+
+phyaddr
+-------
+:Valid Range: 0-31
+:Default Value: -1
+
+This parameter overrides the physical address of the PHY device.
+
+flow_ctrl
+---------
+:Valid Range: 0-3 (0=off,1=rx,2=tx,3=rx/tx)
+:Default Value: 3
+
+This parameter changes the default Flow Control ability.
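+
+The encoding above can be read as a two-bit mask (bit 0 for RX, bit 1 for TX).
+A minimal userspace sketch of that decoding, using hypothetical ``DEMO_*`` and
+``demo_*`` names that are not part of the driver::
+
+    #include <stdbool.h>
+
+    /* Values as listed above: 0=off, 1=rx, 2=tx, 3=rx/tx. */
+    #define DEMO_FLOW_RX 0x1
+    #define DEMO_FLOW_TX 0x2
+
+    /* True when the flow_ctrl value enables flow control on receive. */
+    static bool demo_flow_rx_enabled(int flow_ctrl)
+    {
+        return (flow_ctrl & DEMO_FLOW_RX) != 0;
+    }
+
+    /* True when the flow_ctrl value enables flow control on transmit. */
+    static bool demo_flow_tx_enabled(int flow_ctrl)
+    {
+        return (flow_ctrl & DEMO_FLOW_TX) != 0;
+    }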
+
+pause
+-----
+:Valid Range: 0-65535
+:Default Value: 65535
+
+This parameter changes the default Flow Control Pause time.
+
+tc
+--
+:Valid Range: 64-256
+:Default Value: 64
+
+This parameter changes the default HW FIFO Threshold control value.
+
+buf_sz
+------
+:Valid Range: 1536-16384
+:Default Value: 1536
+
+This parameter changes the default RX DMA packet buffer size.
+
+eee_timer
+---------
+:Valid Range: 0-None
+:Default Value: 1000
+
+This parameter changes the default LPI TX Expiration time in milliseconds.
+
+chain_mode
+----------
+:Valid Range: 0-1 (0=off,1=on)
+:Default Value: 0
+
+This parameter changes the default mode of operation from Ring Mode to
+Chain Mode.
+
+Driver Information and Notes
+============================
+
+Transmit Process
+----------------
+
+The xmit method is invoked when the kernel needs to transmit a packet; it sets
+the descriptors in the ring and informs the DMA engine that there is a packet
+ready to be transmitted.
+
+By default, the driver sets the ``NETIF_F_SG`` bit in the features field of
+the ``net_device`` structure, enabling the scatter-gather feature. This is
+true on chips and configurations where the checksum can be done in hardware.
+
+Once the controller has finished transmitting the packet, a timer will be
+scheduled to release the transmit resources.
+
+Receive Process
+---------------
+
+When one or more packets are received, an interrupt happens. The interrupts
+are not queued, so the driver has to scan all the descriptors in the ring
+during the receive process.
+
+This is based on NAPI, so the interrupt handler signals only if there is work
+to be done, and it exits. Then the poll method will be scheduled at some
+future point.
+
+The incoming packets are stored, by the DMA, in a list of pre-allocated socket
+buffers in order to avoid the memcpy (zero-copy).
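+
+The receive flow above can be modelled in userspace as a budgeted scan of the
+ring. This is only a sketch: ``demo_rx_desc``, its ``own`` flag and
+``demo_rx_poll`` are hypothetical stand-ins, not the driver's NAPI poll
+routine or the real descriptor layout::
+
+    #include <stdbool.h>
+    #include <stddef.h>
+
+    #define DEMO_RX_RING 8
+
+    /* Hypothetical RX descriptor: 'own' set means the DMA still owns it. */
+    struct demo_rx_desc {
+        bool own;
+    };
+
+    /*
+     * Scan the ring starting at *cur and consume completed descriptors,
+     * stopping at the first one still owned by the DMA or when the budget
+     * is exhausted. Returns the number of packets processed.
+     */
+    static int demo_rx_poll(struct demo_rx_desc *ring, size_t *cur, int budget)
+    {
+        int done = 0;
+
+        while (done < budget && !ring[*cur].own) {
+            ring[*cur].own = true;            /* hand the buffer back to DMA */
+            *cur = (*cur + 1) % DEMO_RX_RING;
+            done++;
+        }
+        return done;
+    }
+
+A return value smaller than the budget means the ring has been drained, which
+is the point at which a NAPI-style poll would stop and re-enable interrupts.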
+
+Interrupt Mitigation
+--------------------
+
+The driver is able to mitigate the number of its DMA interrupts using NAPI for
+the reception on chips older than version 3.50. Newer chips have a HW RX
+Watchdog used for this mitigation.
+
+Mitigation parameters can be tuned by ethtool.
+
+WoL
+---
+
+Wake-on-LAN (WoL) through Magic and Unicast frames is supported for the
+GMAC, GMAC4/5 and XGMAC cores.
+
+DMA Descriptors
+---------------
+
+The driver handles both normal and alternate descriptors. The latter have only
+been tested on DesignWare(R) Cores Ethernet MAC Universal version 3.41a and later.
+
+stmmac supports DMA descriptors operating in both dual-buffer (RING) and
+linked-list (CHAINED) mode. In RING mode each descriptor points to two data
+buffers, whereas in CHAINED mode each descriptor points to only one data buffer.
+RING mode is the default.
+
+In CHAINED mode each descriptor holds a pointer to the next descriptor in the
+list, creating explicit chaining in the descriptor itself, whereas such
+explicit chaining is not possible in RING mode.
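+
+The difference between the two modes can be sketched with a simplified
+userspace model. The ``demo_desc`` layout and the helper names are
+hypothetical and do not match the real hardware descriptor format or the
+driver code; they only contrast implicit chaining (index wrap-around) with
+explicit chaining (a stored next pointer)::
+
+    #include <stddef.h>
+
+    #define DEMO_RING_SIZE 4
+
+    /* Hypothetical, simplified descriptor; NOT the real hardware layout. */
+    struct demo_desc {
+        void *buf1;          /* first data buffer pointer */
+        void *buf2_or_next;  /* RING: second buffer; CHAINED: next descriptor */
+    };
+
+    /* RING mode: the next descriptor is implied by the array position. */
+    static size_t demo_ring_next(size_t cur)
+    {
+        return (cur + 1) % DEMO_RING_SIZE;
+    }
+
+    /* CHAINED mode: link every descriptor to its successor, last to first. */
+    static void demo_chain_init(struct demo_desc *d, size_t n)
+    {
+        for (size_t i = 0; i < n; i++)
+            d[i].buf2_or_next = &d[(i + 1) % n];
+    }
+
+    /* CHAINED mode: follow the explicit pointer stored in the descriptor. */
+    static struct demo_desc *demo_chain_next(const struct demo_desc *cur)
+    {
+        return cur->buf2_or_next;
+    }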
+
+Extended Descriptors
+--------------------
+
+The extended descriptors give us information about the Ethernet payload when
+it is carrying PTP packets or TCP/UDP/ICMP over IP. These are not available on
+GMAC Synopsys(R) chips older than version 3.50. At probe time the driver
+decides whether they can actually be used. This support is also mandatory for
+PTPv2 because the extra descriptors are used for saving the hardware
+timestamps and Extended Status.
+
+Ethtool Support
+---------------
+
+Ethtool is supported. For example, driver statistics (including RMON) and
+internal errors can be retrieved using::
+
+ ethtool -S ethX
+
+Ethtool selftests are also supported. These allow some early sanity checks
+of the HW using MAC and PHY loopback mechanisms::
+
+ ethtool -t ethX
+
+Jumbo and Segmentation Offloading
+---------------------------------
+
+Jumbo frames are supported and tested for the GMAC. GSO has also been added
+but it is performed in software. LRO is not supported.
+
+TSO Support
+-----------
+
+The TSO (TCP Segmentation Offload) feature is supported by the GMAC 4.x and
+newer and XGMAC chip families. When a packet is sent through the TCP protocol,
+the TCP stack ensures that the SKB provided to the low level driver (stmmac in
+our case) matches the maximum frame length (IP header + TCP header + payload
+<= 1500 bytes for an MTU of 1500). This means that if an application using TCP
+wants to send a packet whose length (after adding headers) exceeds 1514 bytes,
+the packet is split into several TCP packets: the data payload is split and
+headers (TCP/IP ...) are added. This is done in software.
+
+When TSO is enabled, the TCP stack does not care about the maximum frame length
+and provides the SKB to stmmac as it is. The GMAC IP then has to perform
+the segmentation by itself to match the maximum frame length.
+
+This feature can be enabled in the device tree through the ``snps,tso`` entry.
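+
+The segmentation arithmetic described in this section can be illustrated with
+a small, self-contained calculation. The function name and the 20-byte IPv4
+and TCP header sizes (no options) are assumptions made for the example; this
+is not driver code::
+
+    /* Assumed header sizes without options; illustrative only. */
+    #define DEMO_IP_HDR_LEN   20u
+    #define DEMO_TCP_HDR_LEN  20u
+
+    /*
+     * Number of TCP segments needed to carry payload_len bytes when each
+     * frame is limited to mtu bytes of IP header + TCP header + payload.
+     * Returns 0 if the MTU leaves no room for payload.
+     */
+    static unsigned int demo_tcp_segments(unsigned int payload_len,
+                                          unsigned int mtu)
+    {
+        unsigned int mss;
+
+        if (mtu <= DEMO_IP_HDR_LEN + DEMO_TCP_HDR_LEN)
+            return 0;
+
+        mss = mtu - DEMO_IP_HDR_LEN - DEMO_TCP_HDR_LEN; /* 1460 for MTU 1500 */
+        return (payload_len + mss - 1) / mss;           /* round up */
+    }
+
+Under these assumptions, a 14600-byte payload with an MTU of 1500 needs 10
+segments. Without TSO the stack produces them in software; with TSO the
+hardware performs the split.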
+
+Energy Efficient Ethernet
+-------------------------
+
+Energy Efficient Ethernet (EEE) enables the IEEE 802.3 MAC sublayer, along with
+a family of Physical layers, to operate in the Low Power Idle (LPI) mode. The
+EEE mode supports IEEE 802.3 MAC operation at 100Mbps, 1000Mbps and 10Gbps.
+
+The LPI mode allows power saving by switching off parts of the communication
+device functionality when there is no data to be transmitted and received.
+The systems on both sides of the link can disable some functionality and
+save power during periods of low link utilization. The MAC controls whether
+the system should enter or exit the LPI mode and communicates this to the PHY.
+
+As soon as the interface is opened, the driver verifies whether EEE can be
+supported. This is done by looking at both the DMA HW capability register and
+the PHY device's MMD registers.
+
+To enter TX LPI mode, the driver needs a software timer that enables and
+disables the LPI mode when there is nothing to be transmitted.
+
+Precision Time Protocol (PTP)
+-----------------------------
+
+The driver supports the IEEE 1588-2002, Precision Time Protocol (PTP), which
+enables precise synchronization of clocks in measurement and control systems
+implemented with technologies such as network communication.
+
+In addition to the basic timestamp features described in IEEE 1588-2002,
+new GMAC cores support the advanced timestamp features of IEEE 1588-2008,
+which can be enabled when configuring the kernel.
+
+SGMII/RGMII Support
+-------------------
+
+New GMAC devices provide their own way to manage RGMII/SGMII. This information
+is available at run-time by looking at the HW capability register. This means
+that stmmac can manage auto-negotiation and link status without using
+PHYLIB. In fact, the HW provides a subset of extended registers to
+restart the ANE, verify Full/Half duplex mode and Speed. Thanks to these
+registers, it is possible to look at the Auto-negotiated Link Partner Ability.
+
+Physical
+--------
+
+The driver is compatible with the Physical Abstraction Layer to be connected
+with PHY and GPHY devices.
+
+Platform Information
+--------------------
+
+Several pieces of information can be passed through the platform and device-tree.
+
+::
+
+ struct plat_stmmacenet_data {
+
+1) Bus identifier::
+
+ int bus_id;
+
+2) PHY Physical Address. If set to -1 the driver will pick the first PHY it
+finds::
+
+ int phy_addr;
+
+3) PHY Device Interface::
+
+ int interface;
+
+4) Specific platform fields for the MDIO bus::
+
+ struct stmmac_mdio_bus_data *mdio_bus_data;
+
+5) Internal DMA parameters::
+
+ struct stmmac_dma_cfg *dma_cfg;
+
+6) Fixed CSR Clock Range selection::
+
+ int clk_csr;
+
+7) HW uses the GMAC core::
+
+ int has_gmac;
+
+8) If set the MAC will use Enhanced Descriptors::
+
+ int enh_desc;
+
+9) Core is able to perform TX Checksum and/or RX Checksum in HW::
+
+ int tx_coe;
+ int rx_coe;
+
+11) Some HWs are not able to perform the csum in HW for over-sized frames due
+to limited buffer sizes. When this flag is set, the csum is done in SW on
+JUMBO frames::
+
+ int bugged_jumbo;
+
+12) Core has the embedded power module::
+
+ int pmt;
+
+13) Force DMA to use the Store and Forward mode or Threshold mode::
+
+ int force_sf_dma_mode;
+ int force_thresh_dma_mode;
+
+15) Force to disable the RX Watchdog feature and switch to NAPI mode::
+
+ int riwt_off;
+
+16) Limit the maximum operating speed and MTU::
+
+ int max_speed;
+ int maxmtu;
+
+18) Number of Multicast/Unicast filters::
+
+ int multicast_filter_bins;
+ int unicast_filter_entries;
+
+20) Limit the maximum TX and RX FIFO size::
+
+ int tx_fifo_size;
+ int rx_fifo_size;
+
+21) Use the specified number of TX and RX Queues::
+
+ u32 rx_queues_to_use;
+ u32 tx_queues_to_use;
+
+22) Use the specified TX and RX scheduling algorithm::
+
+ u8 rx_sched_algorithm;
+ u8 tx_sched_algorithm;
+
+23) Internal TX and RX Queue parameters::
+
+ struct stmmac_rxq_cfg rx_queues_cfg[MTL_MAX_RX_QUEUES];
+ struct stmmac_txq_cfg tx_queues_cfg[MTL_MAX_TX_QUEUES];
+
+24) This callback is used for modifying some syscfg registers (on ST SoCs)
+according to the link speed negotiated by the physical layer::
+
+ void (*fix_mac_speed)(void *priv, unsigned int speed);
+
+25) Callbacks used for calling a custom initialization; this is sometimes
+necessary on some platforms (e.g. ST boxes) where the HW needs to have
+some PIO lines or system cfg registers set. init/exit callbacks should not use
+or modify platform data::
+
+ int (*init)(struct platform_device *pdev, void *priv);
+ void (*exit)(struct platform_device *pdev, void *priv);
+
+26) Perform HW setup of the bus. For example, on some ST platforms this field
+is used to configure the AMBA bridge to generate more efficient STBus traffic::
+
+ struct mac_device_info *(*setup)(void *priv);
+ void *bsp_priv;
+
+27) Internal clocks and rates::
+
+ struct clk *stmmac_clk;
+ struct clk *pclk;
+ struct clk *clk_ptp_ref;
+ unsigned int clk_ptp_rate;
+ unsigned int clk_ref_rate;
+ s32 ptp_max_adj;
+
+28) Main reset::
+
+ struct reset_control *stmmac_rst;
+
+29) AXI Internal Parameters::
+
+ struct stmmac_axi *axi;
+
+30) HW uses GMAC>4 cores::
+
+ int has_gmac4;
+
+31) HW is sun8i based::
+
+ bool has_sun8i;
+
+32) Enables TSO feature::
+
+ bool tso_en;
+
+33) Enables Receive Side Scaling (RSS) feature::
+
+ int rss_en;
+
+34) MAC Port selection::
+
+ int mac_port_sel_speed;
+
+35) Enables TX LPI Clock Gating::
+
+ bool en_tx_lpi_clockgating;
+
+36) HW uses XGMAC>2.10 cores::
+
+ int has_xgmac;
+
+::
+
+ }
+
+For MDIO bus data, we have:
+
+::
+
+ struct stmmac_mdio_bus_data {
+
+1) PHY mask passed when MDIO bus is registered::
+
+ unsigned int phy_mask;
+
+2) List of IRQs, one per PHY::
+
+ int *irqs;
+
+3) If IRQs is NULL, use this for probed PHY::
+
+ int probed_phy_irq;
+
+4) Set to true if PHY needs reset::
+
+ bool needs_reset;
+
+::
+
+ }
+
+For DMA engine configuration, we have:
+
+::
+
+ struct stmmac_dma_cfg {
+
+1) Programmable Burst Length (TX and RX)::
+
+ int pbl;
+
+2) If set, DMA TX / RX will use this value rather than pbl::
+
+ int txpbl;
+ int rxpbl;
+
+3) Enable 8xPBL::
+
+ bool pblx8;
+
+4) Enable Fixed or Mixed burst::
+
+ int fixed_burst;
+ int mixed_burst;
+
+5) Enable Address Aligned Beats::
+
+ bool aal;
+
+6) Enable Enhanced Addressing (> 32 bits)::
+
+ bool eame;
+
+::
+
+ }
+
+For DMA AXI parameters, we have:
+
+::
+
+ struct stmmac_axi {
+
+1) Enable AXI LPI::
+
+ bool axi_lpi_en;
+ bool axi_xit_frm;
+
+2) Set AXI Write / Read maximum outstanding requests::
+
+ u32 axi_wr_osr_lmt;
+ u32 axi_rd_osr_lmt;
+
+3) Set AXI 4KB bursts::
+
+ bool axi_kbbe;
+
+4) Set AXI maximum burst length map::
+
+ u32 axi_blen[AXI_BLEN];
+
+5) Set AXI Fixed burst / mixed burst::
+
+ bool axi_fb;
+ bool axi_mb;
+
+6) Set AXI rebuild incrx mode::
+
+ bool axi_rb;
+
+::
+
+ }
+
+For the RX Queues configuration, we have:
+
+::
+
+ struct stmmac_rxq_cfg {
+
+1) Mode to use (DCB or AVB)::
+
+ u8 mode_to_use;
+
+2) DMA channel to use::
+
+ u32 chan;
+
+3) Packet routing, if applicable::
+
+ u8 pkt_route;
+
+4) Use priority routing, and priority to route::
+
+ bool use_prio;
+ u32 prio;
+
+::
+
+ }
+
+For the TX Queues configuration, we have:
+
+::
+
+ struct stmmac_txq_cfg {
+
+1) Queue weight in scheduler::
+
+ u32 weight;
+
+2) Mode to use (DCB or AVB)::
+
+ u8 mode_to_use;
+
+3) Credit Base Shaper Parameters::
+
+ u32 send_slope;
+ u32 idle_slope;
+ u32 high_credit;
+ u32 low_credit;
+
+4) Use priority scheduling, and priority::
+
+ bool use_prio;
+ u32 prio;
+
+::
+
+ }
+
+Device Tree Information
+-----------------------
+
+Please refer to the following document:
+Documentation/devicetree/bindings/net/snps,dwmac.yaml
+
+HW Capabilities
+---------------
+
+Note that on new chips, where the HW capability register is available, many
+configurations are discovered at run-time, for example to determine whether
+EEE, HW csum, PTP, enhanced descriptors etc. are actually available. As a
+strategy adopted in this driver, the information from the HW capability
+register can replace what has been passed from the platform.
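+
+As a rough illustration of that strategy, the sketch below lets values decoded
+from a capability register override the platform-supplied ones. All structure,
+field and function names here are hypothetical, and the real driver logic
+differs in detail::
+
+    #include <stdbool.h>
+
+    /* Hypothetical subset of platform-provided settings. */
+    struct demo_plat {
+        bool enh_desc;
+        bool tx_coe;
+    };
+
+    /* Hypothetical decoded view of a HW capability register. */
+    struct demo_hw_cap {
+        bool valid;     /* false on old cores without the register */
+        bool enh_desc;
+        bool tx_coe;
+    };
+
+    /* If the capability register exists, trust it over the platform data. */
+    static void demo_apply_hw_cap(struct demo_plat *plat,
+                                  const struct demo_hw_cap *cap)
+    {
+        if (!cap->valid)
+            return;     /* keep what the platform passed */
+
+        plat->enh_desc = cap->enh_desc;
+        plat->tx_coe = cap->tx_coe;
+    }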
+
+Debug Information
+=================
+
+The driver exports a lot of information, i.e. internal statistics, debug
+information, MAC and DMA registers etc.
+
+These can be read in several ways depending on the type of the information
+actually needed.
+
+For example, a user can use the ethtool support to get statistics, e.g.
+using ``ethtool -S ethX`` (which shows the Management counters (MMC) if
+supported), or to see the MAC/DMA registers, e.g. using ``ethtool -d ethX``.
+
+When the kernel is compiled with ``CONFIG_DEBUG_FS``, the driver exports the
+following debugfs entries:
+
+ - ``descriptors_status``: To show the DMA TX/RX descriptor rings
+ - ``dma_cap``: To show the HW Capabilities
+
+Developers can also use the ``debug`` module parameter to get further debug
+information (please see: NETIF Msg Level).
+
+Support
+=======
+
+If an issue is identified with the released source code on a supported kernel
+with a supported adapter, email the specific information related to the
+issue to netdev@vger.kernel.org
diff --git a/Documentation/networking/device_drivers/stmicro/stmmac.txt b/Documentation/networking/device_drivers/stmicro/stmmac.txt
deleted file mode 100644
index 1ae979fd90d2..000000000000
--- a/Documentation/networking/device_drivers/stmicro/stmmac.txt
+++ /dev/null
@@ -1,401 +0,0 @@
- STMicroelectronics 10/100/1000 Synopsys Ethernet driver
-
-Copyright (C) 2007-2015 STMicroelectronics Ltd
-Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
-
-This is the driver for the MAC 10/100/1000 on-chip Ethernet controllers
-(Synopsys IP blocks).
-
-Currently this network device driver is for all STi embedded MAC/GMAC
-(i.e. 7xxx/5xxx SoCs), SPEAr (arm), Loongson1B (mips) and XLINX XC2V3000
-FF1152AMT0221 D1215994A VIRTEX FPGA board.
-
-DWC Ether MAC 10/100/1000 Universal version 3.70a (and older) and DWC Ether
-MAC 10/100 Universal version 4.0 have been used for developing this driver.
-
-This driver supports both the platform bus and PCI.
-
-Please, for more information also visit: www.stlinux.com
-
-1) Kernel Configuration
-The kernel configuration option is STMMAC_ETH:
- Device Drivers ---> Network device support ---> Ethernet (1000 Mbit) --->
- STMicroelectronics 10/100/1000 Ethernet driver (STMMAC_ETH)
-
-CONFIG_STMMAC_PLATFORM: is to enable the platform driver.
-CONFIG_STMMAC_PCI: is to enable the pci driver.
-
-2) Driver parameters list:
- debug: message level (0: no output, 16: all);
- phyaddr: to manually provide the physical address to the PHY device;
- buf_sz: DMA buffer size;
- tc: control the HW FIFO threshold;
- watchdog: transmit timeout (in milliseconds);
- flow_ctrl: Flow control ability [on/off];
- pause: Flow Control Pause Time;
- eee_timer: tx EEE timer;
- chain_mode: select chain mode instead of ring.
-
-3) Command line options
-Driver parameters can be also passed in command line by using:
- stmmaceth=watchdog:100,chain_mode=1
-
-4) Driver information and notes
-
-4.1) Transmit process
-The xmit method is invoked when the kernel needs to transmit a packet; it sets
-the descriptors in the ring and informs the DMA engine, that there is a packet
-ready to be transmitted.
-By default, the driver sets the NETIF_F_SG bit in the features field of the
-net_device structure, enabling the scatter-gather feature. This is true on
-chips and configurations where the checksum can be done in hardware.
-Once the controller has finished transmitting the packet, timer will be
-scheduled to release the transmit resources.
-
-4.2) Receive process
-When one or more packets are received, an interrupt happens. The interrupts
-are not queued, so the driver has to scan all the descriptors in the ring during
-the receive process.
-This is based on NAPI, so the interrupt handler signals only if there is work
-to be done, and it exits.
-Then the poll method will be scheduled at some future point.
-The incoming packets are stored, by the DMA, in a list of pre-allocated socket
-buffers in order to avoid the memcpy (zero-copy).
-
-4.3) Interrupt mitigation
-The driver is able to mitigate the number of its DMA interrupts
-using NAPI for the reception on chips older than the 3.50.
-New chips have an HW RX-Watchdog used for this mitigation.
-Mitigation parameters can be tuned by ethtool.
-
-4.4) WOL
-Wake up on Lan feature through Magic and Unicast frames are supported for the
-GMAC core.
-
-4.5) DMA descriptors
-Driver handles both normal and alternate descriptors. The latter has been only
-tested on DWC Ether MAC 10/100/1000 Universal version 3.41a and later.
-
-STMMAC supports DMA descriptor to operate both in dual buffer (RING)
-and linked-list(CHAINED) mode. In RING each descriptor points to two
-data buffer pointers whereas in CHAINED mode they point to only one data
-buffer pointer. RING mode is the default.
-
-In CHAINED mode each descriptor will have pointer to next descriptor in
-the list, hence creating the explicit chaining in the descriptor itself,
-whereas such explicit chaining is not possible in RING mode.
-
-4.5.1) Extended descriptors
-The extended descriptors give us information about the Ethernet payload
-when it is carrying PTP packets or TCP/UDP/ICMP over IP.
-These are not available on GMAC Synopsys chips older than the 3.50.
-At probe time the driver will decide if these can be actually used.
-This support also is mandatory for PTPv2 because the extra descriptors
-are used for saving the hardware timestamps and Extended Status.
-
-4.6) Ethtool support
-Ethtool is supported.
-
-For example, driver statistics (including RMON), internal errors can be taken
-using:
- # ethtool -S ethX
-command
-
-4.7) Jumbo and Segmentation Offloading
-Jumbo frames are supported and tested for the GMAC.
-The GSO has been also added but it's performed in software.
-LRO is not supported.
-
-4.8) Physical
-The driver is compatible with Physical Abstraction Layer to be connected with
-PHY and GPHY devices.
-
-4.9) Platform information
-Several information can be passed through the platform and device-tree.
-
-struct plat_stmmacenet_data {
- char *phy_bus_name;
- int bus_id;
- int phy_addr;
- int interface;
- struct stmmac_mdio_bus_data *mdio_bus_data;
- struct stmmac_dma_cfg *dma_cfg;
- int clk_csr;
- int has_gmac;
- int enh_desc;
- int tx_coe;
- int rx_coe;
- int bugged_jumbo;
- int pmt;
- int force_sf_dma_mode;
- int force_thresh_dma_mode;
- int riwt_off;
- int max_speed;
- int maxmtu;
- void (*fix_mac_speed)(void *priv, unsigned int speed);
- void (*bus_setup)(void __iomem *ioaddr);
- int (*init)(struct platform_device *pdev, void *priv);
- void (*exit)(struct platform_device *pdev, void *priv);
- void *bsp_priv;
- int has_gmac4;
- bool tso_en;
-};
-
-Where:
- o phy_bus_name: phy bus name to attach to the stmmac.
- o bus_id: bus identifier.
- o phy_addr: the physical address can be passed from the platform.
- If it is set to -1 the driver will automatically
- detect it at run-time by probing all the 32 addresses.
- o interface: PHY device's interface.
- o mdio_bus_data: specific platform fields for the MDIO bus.
- o dma_cfg: internal DMA parameters
- o pbl: the Programmable Burst Length is maximum number of beats to
- be transferred in one DMA transaction.
- GMAC also enables the 4xPBL by default. (8xPBL for GMAC 3.50 and newer)
- o txpbl/rxpbl: GMAC and newer supports independent DMA pbl for tx/rx.
- o pblx8: Enable 8xPBL (4xPBL for core rev < 3.50). Enabled by default.
- o fixed_burst/mixed_burst/aal
- o clk_csr: fixed CSR Clock range selection.
- o has_gmac: uses the GMAC core.
- o enh_desc: if sets the MAC will use the enhanced descriptor structure.
- o tx_coe: core is able to perform the tx csum in HW.
- o rx_coe: the supports three check sum offloading engine types:
- type_1, type_2 (full csum) and no RX coe.
- o bugged_jumbo: some HWs are not able to perform the csum in HW for
- over-sized frames due to limited buffer sizes.
- Setting this flag the csum will be done in SW on
- JUMBO frames.
- o pmt: core has the embedded power module (optional).
- o force_sf_dma_mode: force DMA to use the Store and Forward mode
- instead of the Threshold.
- o force_thresh_dma_mode: force DMA to use the Threshold mode other than
- the Store and Forward mode.
- o riwt_off: force to disable the RX watchdog feature and switch to NAPI mode.
- o fix_mac_speed: this callback is used for modifying some syscfg registers
- (on ST SoCs) according to the link speed negotiated by the
- physical layer .
- o bus_setup: perform HW setup of the bus. For example, on some ST platforms
- this field is used to configure the AMBA bridge to generate more
- efficient STBus traffic.
- o init/exit: callbacks used for calling a custom initialization;
- this is sometime necessary on some platforms (e.g. ST boxes)
- where the HW needs to have set some PIO lines or system cfg
- registers. init/exit callbacks should not use or modify
- platform data.
- o bsp_priv: another private pointer.
- o has_gmac4: uses GMAC4 core.
- o tso_en: Enables TSO (TCP Segmentation Offload) feature.
-
-For MDIO bus The we have:
-
- struct stmmac_mdio_bus_data {
- int (*phy_reset)(void *priv);
- unsigned int phy_mask;
- int *irqs;
- int probed_phy_irq;
- };
-
-Where:
- o phy_reset: hook to reset the phy device attached to the bus.
- o phy_mask: phy mask passed when register the MDIO bus within the driver.
- o irqs: list of IRQs, one per PHY.
- o probed_phy_irq: if irqs is NULL, use this for probed PHY.
-
-For DMA engine we have the following internal fields that should be
-tuned according to the HW capabilities.
-
-struct stmmac_dma_cfg {
- int pbl;
- int txpbl;
- int rxpbl;
- bool pblx8;
- int fixed_burst;
- int mixed_burst;
- bool aal;
-};
-
-Where:
- o pbl: Programmable Burst Length (tx and rx)
- o txpbl: Transmit Programmable Burst Length. Only for GMAC and newer.
- If set, DMA tx will use this value rather than pbl.
- o rxpbl: Receive Programmable Burst Length. Only for GMAC and newer.
- If set, DMA rx will use this value rather than pbl.
- o pblx8: Enable 8xPBL (4xPBL for core rev < 3.50). Enabled by default.
- o fixed_burst: program the DMA to use the fixed burst mode
- o mixed_burst: program the DMA to use the mixed burst mode
- o aal: Address-Aligned Beats
-
----
-
-Below an example how the structures above are using on ST platforms.
-
- static struct plat_stmmacenet_data stxYYY_ethernet_platform_data = {
- .has_gmac = 0,
- .enh_desc = 0,
- .fix_mac_speed = stxYYY_ethernet_fix_mac_speed,
- |
- |-> to write an internal syscfg
- | on this platform when the
- | link speed changes from 10 to
- | 100 and viceversa
- .init = &stmmac_claim_resource,
- |
- |-> On ST SoC this calls own "PAD"
- | manager framework to claim
- | all the resources necessary
- | (GPIO ...). The .custom_cfg field
- | is used to pass a custom config.
-};
-
-Below the usage of the stmmac_mdio_bus_data: on this SoC, in fact,
-there are two MAC cores: one MAC is for MDIO Bus/PHY emulation
-with fixed_link support.
-
-static struct stmmac_mdio_bus_data stmmac1_mdio_bus = {
- .phy_reset = phy_reset;
- |
- |-> function to provide the phy_reset on this board
- .phy_mask = 0,
-};
-
-static struct fixed_phy_status stmmac0_fixed_phy_status = {
- .link = 1,
- .speed = 100,
- .duplex = 1,
-};
-
-During the board's device_init we can configure the first
-MAC for fixed_link by calling:
- fixed_phy_add(PHY_POLL, 1, &stmmac0_fixed_phy_status);
-and the second one, with a real PHY device attached to the bus,
-by using the stmmac_mdio_bus_data structure (to provide the id, the
-reset procedure etc).
-
-Note that, starting from new chips, where it is available the HW capability
-register, many configurations are discovered at run-time for example to
-understand if EEE, HW csum, PTP, enhanced descriptor etc are actually
-available. As strategy adopted in this driver, the information from the HW
-capability register can replace what has been passed from the platform.
-
-4.10) Device-tree support.
-
-Please see the following document:
- Documentation/devicetree/bindings/net/stmmac.txt
-
-4.11) This is a summary of the content of some relevant files:
- o stmmac_main.c: implements the main network device driver;
- o stmmac_mdio.c: provides MDIO functions;
- o stmmac_pci: this is the PCI driver;
- o stmmac_platform.c: this the platform driver (OF supported);
- o stmmac_ethtool.c: implements the ethtool support;
- o stmmac.h: private driver structure;
- o common.h: common definitions and VFTs;
- o mmc_core.c/mmc.h: Management MAC Counters;
- o stmmac_hwtstamp.c: HW timestamp support for PTP;
- o stmmac_ptp.c: PTP 1588 clock;
- o stmmac_pcs.h: Physical Coding Sublayer common implementation;
- o dwmac-<XXX>.c: these are for the platform glue-logic file; e.g. dwmac-sti.c
- for STMicroelectronics SoCs.
-
-- GMAC 3.x
- o descs.h: descriptor structure definitions;
- o dwmac1000_core.c: dwmac GiGa core functions;
- o dwmac1000_dma.c: dma functions for the GMAC chip;
- o dwmac1000.h: specific header file for the dwmac GiGa;
- o dwmac100_core: dwmac 100 core code;
- o dwmac100_dma.c: dma functions for the dwmac 100 chip;
- o dwmac1000.h: specific header file for the MAC;
- o dwmac_lib.c: generic DMA functions;
- o enh_desc.c: functions for handling enhanced descriptors;
- o norm_desc.c: functions for handling normal descriptors;
- o chain_mode.c/ring_mode.c:: functions to manage RING/CHAINED modes;
-
-- GMAC4.x generation
- o dwmac4_core.c: dwmac GMAC4.x core functions;
- o dwmac4_desc.c: functions for handling GMAC4.x descriptors;
- o dwmac4_descs.h: descriptor definitions;
- o dwmac4_dma.c: dma functions for the GMAC4.x chip;
- o dwmac4_dma.h: dma definitions for the GMAC4.x chip;
- o dwmac4.h: core definitions for the GMAC4.x chip;
- o dwmac4_lib.c: generic GMAC4.x functions;
-
-4.12) TSO support (GMAC4.x)
-
-TSO (Tcp Segmentation Offload) feature is supported by GMAC 4.x chip family.
-When a packet is sent through TCP protocol, the TCP stack ensures that
-the SKB provided to the low level driver (stmmac in our case) matches with
-the maximum frame len (IP header + TCP header + payload <= 1500 bytes (for
-MTU set to 1500)). It means that if an application using TCP want to send a
-packet which will have a length (after adding headers) > 1514 the packet
-will be split in several TCP packets: The data payload is split and headers
-(TCP/IP ..) are added. It is done by software.
-
-When TSO is enabled, the TCP stack doesn't care about the maximum frame
-length and provide SKB packet to stmmac as it is. The GMAC IP will have to
-perform the segmentation by it self to match with maximum frame length.
-
-This feature can be enabled in device tree through "snps,tso" entry.
-
-5) Debug Information
-
-The driver exports many information i.e. internal statistics,
-debug information, MAC and DMA registers etc.
-
-These can be read in several ways depending on the
-type of the information actually needed.
-
-For example a user can be use the ethtool support
-to get statistics: e.g. using: ethtool -S ethX
-(that shows the Management counters (MMC) if supported)
-or sees the MAC/DMA registers: e.g. using: ethtool -d ethX
-
-Compiling the Kernel with CONFIG_DEBUG_FS the driver will export the following
-debugfs entries:
-
-/sys/kernel/debug/stmmaceth/descriptors_status
- To show the DMA TX/RX descriptor rings
-
-Developer can also use the "debug" module parameter to get further debug
-information (please see: NETIF Msg Level).
-
-6) Energy Efficient Ethernet
-
-Energy Efficient Ethernet(EEE) enables IEEE 802.3 MAC sublayer along
-with a family of Physical layer to operate in the Low power Idle(LPI)
-mode. The EEE mode supports the IEEE 802.3 MAC operation at 100Mbps,
-1000Mbps & 10Gbps.
-
-The LPI mode allows power saving by switching off parts of the
-communication device functionality when there is no data to be
-transmitted & received. The system on both the side of the link can
-disable some functionalities & save power during the period of low-link
-utilization. The MAC controls whether the system should enter or exit
-the LPI mode & communicate this to PHY.
-
-As soon as the interface is opened, the driver verifies if the EEE can
-be supported. This is done by looking at both the DMA HW capability
-register and the PHY devices MCD registers.
-To enter in Tx LPI mode the driver needs to have a software timer
-that enable and disable the LPI mode when there is nothing to be
-transmitted.
-
-7) Precision Time Protocol (PTP)
-The driver supports the IEEE 1588-2002, Precision Time Protocol (PTP),
-which enables precise synchronization of clocks in measurement and
-control systems implemented with technologies such as network
-communication.
-
-In addition to the basic timestamp features mentioned in IEEE 1588-2002
-Timestamps, new GMAC cores support the advanced timestamp features.
-IEEE 1588-2008 that can be enabled when configure the Kernel.
-
-8) SGMII/RGMII support
-New GMAC devices provide own way to manage RGMII/SGMII.
-This information is available at run-time by looking at the
-HW capability register. This means that the stmmac can manage
-auto-negotiation and link status w/o using the PHYLIB stuff.
-In fact, the HW provides a subset of extended registers to
-restart the ANE, verify Full/Half duplex mode and Speed.
-Thanks to these registers, it is possible to look at the
-Auto-negotiated Link Parter Ability.
diff --git a/Documentation/networking/device_drivers/ti/cpsw_switchdev.txt b/Documentation/networking/device_drivers/ti/cpsw_switchdev.txt
index 5c8cee17fca9..12855ab268b8 100644
--- a/Documentation/networking/device_drivers/ti/cpsw_switchdev.txt
+++ b/Documentation/networking/device_drivers/ti/cpsw_switchdev.txt
@@ -39,7 +39,7 @@ but without enabling "switch" mode, or to different bridges.
Devlink configuration parameters
====================
-See Documentation/networking/devlink-params-ti-cpsw-switch.txt
+See Documentation/networking/devlink/ti-cpsw-switch.rst
====================
# Bridging in dual mac mode
diff --git a/Documentation/networking/devlink-health.txt b/Documentation/networking/devlink-health.txt
deleted file mode 100644
index 1db3fbea0831..000000000000
--- a/Documentation/networking/devlink-health.txt
+++ /dev/null
@@ -1,86 +0,0 @@
-The health mechanism is targeted for Real Time Alerting, in order to know when
-something bad had happened to a PCI device
-- Provide alert debug information
-- Self healing
-- If problem needs vendor support, provide a way to gather all needed debugging
- information.
-
-The main idea is to unify and centralize driver health reports in the
-generic devlink instance and allow the user to set different
-attributes of the health reporting and recovery procedures.
-
-The devlink health reporter:
-Device driver creates a "health reporter" per each error/health type.
-Error/Health type can be a known/generic (eg pci error, fw error, rx/tx error)
-or unknown (driver specific).
-For each registered health reporter a driver can issue error/health reports
-asynchronously. All health reports handling is done by devlink.
-Device driver can provide specific callbacks for each "health reporter", e.g.
- - Recovery procedures
- - Diagnostics and object dump procedures
- - OOB initial parameters
-Different parts of the driver can register different types of health reporters
-with different handlers.
-
-Once an error is reported, devlink health will do the following actions:
- * A log is being send to the kernel trace events buffer
- * Health status and statistics are being updated for the reporter instance
- * Object dump is being taken and saved at the reporter instance (as long as
- there is no other dump which is already stored)
- * Auto recovery attempt is being done. Depends on:
- - Auto-recovery configuration
- - Grace period vs. time passed since last recover
-
-The user interface:
-User can access/change each reporter's parameters and driver specific callbacks
-via devlink, e.g per error type (per health reporter)
- - Configure reporter's generic parameters (like: disable/enable auto recovery)
- - Invoke recovery procedure
- - Run diagnostics
- - Object dump
-
-The devlink health interface (via netlink):
-DEVLINK_CMD_HEALTH_REPORTER_GET
- Retrieves status and configuration info per DEV and reporter.
-DEVLINK_CMD_HEALTH_REPORTER_SET
- Allows reporter-related configuration setting.
-DEVLINK_CMD_HEALTH_REPORTER_RECOVER
- Triggers a reporter's recovery procedure.
-DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE
- Retrieves diagnostics data from a reporter on a device.
-DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET
- Retrieves the last stored dump. Devlink health
- saves a single dump. If an dump is not already stored by the devlink
- for this reporter, devlink generates a new dump.
- dump output is defined by the reporter.
-DEVLINK_CMD_HEALTH_REPORTER_DUMP_CLEAR
- Clears the last saved dump file for the specified reporter.
-
-
-                                                netlink
-                                      +--------------------------+
-                                      |                          |
-                                      |            +             |
-                                      |            |             |
-                                      +--------------------------+
-                                                   |request for ops
-                                                   |(diagnose,
-mlx5_core                 devlink                  |recover,
-                                                   |dump)
-+--------+                            +--------------------------+
-|        |                            |    reporter|             |
-|        |                            |  +---------v----------+  |
-|        |   ops execution            |  |                    |  |
-|     <----------------------------------+                    |  |
-|        |                            |  |                    |  |
-|        |                            |  + ^------------------+  |
-|        |                            |    | request for ops     |
-|        |                            |    | (recover, dump)     |
-|        |                            |    |                     |
-|        |                            |  +-+------------------+  |
-|        |     health report          |  | health handler     |  |
-|        +------------------------------->                    |  |
-|        |                            |  +--------------------+  |
-|        |  health reporter create    |                          |
-|        +---------------------------->                          |
-+--------+                            +--------------------------+
diff --git a/Documentation/networking/devlink-info-versions.rst b/Documentation/networking/devlink-info-versions.rst
deleted file mode 100644
index 4914f581b1fd..000000000000
--- a/Documentation/networking/devlink-info-versions.rst
+++ /dev/null
@@ -1,64 +0,0 @@
-.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
-
-=====================
-Devlink info versions
-=====================
-
-board.id
-========
-
-Unique identifier of the board design.
-
-board.rev
-=========
-
-Board design revision.
-
-asic.id
-=======
-
-ASIC design identifier.
-
-asic.rev
-========
-
-ASIC design revision.
-
-board.manufacture
-=================
-
-An identifier of the company or the facility which produced the part.
-
-fw
-==
-
-Overall firmware version, often representing the collection of
-fw.mgmt, fw.app, etc.
-
-fw.mgmt
-=======
-
-Control unit firmware version. This firmware is responsible for house
-keeping tasks, PHY control etc. but not the packet-by-packet data path
-operation.
-
-fw.app
-======
-
-Data path microcode controlling high-speed packet processing.
-
-fw.undi
-=======
-
-UNDI software, may include the UEFI driver, firmware or both.
-
-fw.ncsi
-=======
-
-Version of the software responsible for supporting/handling the
-Network Controller Sideband Interface.
-
-fw.psid
-=======
-
-Unique identifier of the firmware parameter set.
diff --git a/Documentation/networking/devlink-params-bnxt.txt b/Documentation/networking/devlink-params-bnxt.txt
deleted file mode 100644
index 481aa303d5b4..000000000000
--- a/Documentation/networking/devlink-params-bnxt.txt
+++ /dev/null
@@ -1,18 +0,0 @@
-enable_sriov [DEVICE, GENERIC]
- Configuration mode: Permanent
-
-ignore_ari [DEVICE, GENERIC]
- Configuration mode: Permanent
-
-msix_vec_per_pf_max [DEVICE, GENERIC]
- Configuration mode: Permanent
-
-msix_vec_per_pf_min [DEVICE, GENERIC]
- Configuration mode: Permanent
-
-gre_ver_check [DEVICE, DRIVER-SPECIFIC]
- Generic Routing Encapsulation (GRE) version check will
- be enabled in the device. If disabled, device skips
- version checking for incoming packets.
- Type: Boolean
- Configuration mode: Permanent
diff --git a/Documentation/networking/devlink-params-mlx5.txt b/Documentation/networking/devlink-params-mlx5.txt
deleted file mode 100644
index 5071467118bd..000000000000
--- a/Documentation/networking/devlink-params-mlx5.txt
+++ /dev/null
@@ -1,17 +0,0 @@
-flow_steering_mode [DEVICE, DRIVER-SPECIFIC]
- Controls the flow steering mode of the driver.
- Two modes are supported:
- 1. 'dmfs' - Device managed flow steering.
- 2. 'smfs - Software/Driver managed flow steering.
- In DMFS mode, the HW steering entities are created and
- managed through the Firmware.
- In SMFS mode, the HW steering entities are created and
- managed though by the driver directly into Hardware
- without firmware intervention.
- Type: String
- Configuration mode: runtime
-
-enable_roce [DEVICE, GENERIC]
- Enable handling of RoCE traffic in the device.
- Defaultly enabled.
- Configuration mode: driverinit
diff --git a/Documentation/networking/devlink-params-mlxsw.txt b/Documentation/networking/devlink-params-mlxsw.txt
deleted file mode 100644
index c63ea9fc7009..000000000000
--- a/Documentation/networking/devlink-params-mlxsw.txt
+++ /dev/null
@@ -1,10 +0,0 @@
-fw_load_policy [DEVICE, GENERIC]
- Configuration mode: driverinit
-
-acl_region_rehash_interval [DEVICE, DRIVER-SPECIFIC]
- Sets an interval for periodic ACL region rehashes.
- The value is in milliseconds, minimal value is "3000".
- Value "0" disables the periodic work.
- The first rehash will be run right after value is set.
- Type: u32
- Configuration mode: runtime
diff --git a/Documentation/networking/devlink-params-mv88e6xxx.txt b/Documentation/networking/devlink-params-mv88e6xxx.txt
deleted file mode 100644
index 21c4b3556ef2..000000000000
--- a/Documentation/networking/devlink-params-mv88e6xxx.txt
+++ /dev/null
@@ -1,7 +0,0 @@
-ATU_hash [DEVICE, DRIVER-SPECIFIC]
- Select one of four possible hashing algorithms for
- MAC addresses in the Address Translation Unit.
- A value of 3 seems to work better than the default of
- 1 when many MAC addresses have the same OUI.
- Configuration mode: runtime
- Type: u8. 0-3 valid.
diff --git a/Documentation/networking/devlink-params-nfp.txt b/Documentation/networking/devlink-params-nfp.txt
deleted file mode 100644
index 43e4d4034865..000000000000
--- a/Documentation/networking/devlink-params-nfp.txt
+++ /dev/null
@@ -1,5 +0,0 @@
-fw_load_policy [DEVICE, GENERIC]
- Configuration mode: permanent
-
-reset_dev_on_drv_probe [DEVICE, GENERIC]
- Configuration mode: permanent
diff --git a/Documentation/networking/devlink-params-ti-cpsw-switch.txt b/Documentation/networking/devlink-params-ti-cpsw-switch.txt
deleted file mode 100644
index 4037458499f7..000000000000
--- a/Documentation/networking/devlink-params-ti-cpsw-switch.txt
+++ /dev/null
@@ -1,10 +0,0 @@
-ale_bypass [DEVICE, DRIVER-SPECIFIC]
- Allows to enable ALE_CONTROL(4).BYPASS mode for debug purposes.
- All packets will be sent to the Host port only if enabled.
- Type: bool
- Configuration mode: runtime
-
-switch_mode [DEVICE, DRIVER-SPECIFIC]
- Enable switch mode
- Type: bool
- Configuration mode: runtime
diff --git a/Documentation/networking/devlink-params.txt b/Documentation/networking/devlink-params.txt
deleted file mode 100644
index 04e234e9acc9..000000000000
--- a/Documentation/networking/devlink-params.txt
+++ /dev/null
@@ -1,71 +0,0 @@
-Devlink configuration parameters
-================================
-Following is the list of configuration parameters via devlink interface.
-Each parameter can be generic or driver specific and are device level
-parameters.
-
-Note that the driver-specific files should contain the generic params
-they support to, with supported config modes.
-
-Each parameter can be set in different configuration modes:
- runtime - set while driver is running, no reset required.
- driverinit - applied while driver initializes, requires restart
- driver by devlink reload command.
- permanent - written to device's non-volatile memory, hard reset
- required.
-
-Following is the list of parameters:
-====================================
-enable_sriov [DEVICE, GENERIC]
- Enable Single Root I/O Virtualisation (SRIOV) in
- the device.
- Type: Boolean
-
-ignore_ari [DEVICE, GENERIC]
- Ignore Alternative Routing-ID Interpretation (ARI)
- capability. If enabled, adapter will ignore ARI
- capability even when platforms has the support
- enabled and creates same number of partitions when
- platform does not support ARI.
- Type: Boolean
-
-msix_vec_per_pf_max [DEVICE, GENERIC]
- Provides the maximum number of MSIX interrupts that
- a device can create. Value is same across all
- physical functions (PFs) in the device.
- Type: u32
-
-msix_vec_per_pf_min [DEVICE, GENERIC]
- Provides the minimum number of MSIX interrupts required
- for the device initialization. Value is same across all
- physical functions (PFs) in the device.
- Type: u32
-
-fw_load_policy [DEVICE, GENERIC]
- Controls the device's firmware loading policy.
- Valid values:
- * DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_DRIVER (0)
- Load firmware version preferred by the driver.
- * DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_FLASH (1)
- Load firmware currently stored in flash.
- * DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_DISK (2)
- Load firmware currently available on host's disk.
- Type: u8
-
-reset_dev_on_drv_probe [DEVICE, GENERIC]
- Controls the device's reset policy on driver probe.
- Valid values:
- * DEVLINK_PARAM_RESET_DEV_ON_DRV_PROBE_VALUE_UNKNOWN (0)
- Unknown or invalid value.
- * DEVLINK_PARAM_RESET_DEV_ON_DRV_PROBE_VALUE_ALWAYS (1)
- Always reset device on driver probe.
- * DEVLINK_PARAM_RESET_DEV_ON_DRV_PROBE_VALUE_NEVER (2)
- Never reset device on driver probe.
- * DEVLINK_PARAM_RESET_DEV_ON_DRV_PROBE_VALUE_DISK (3)
- Reset only if device firmware can be found in the
- filesystem.
- Type: u8
-
-enable_roce [DEVICE, GENERIC]
- Enable handling of RoCE traffic in the device.
- Type: Boolean
diff --git a/Documentation/networking/devlink-trap-netdevsim.rst b/Documentation/networking/devlink-trap-netdevsim.rst
deleted file mode 100644
index b721c9415473..000000000000
--- a/Documentation/networking/devlink-trap-netdevsim.rst
+++ /dev/null
@@ -1,20 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-======================
-Devlink Trap netdevsim
-======================
-
-Driver-specific Traps
-=====================
-
-.. list-table:: List of Driver-specific Traps Registered by ``netdevsim``
- :widths: 5 5 90
-
- * - Name
- - Type
- - Description
- * - ``fid_miss``
- - ``exception``
- - When a packet enters the device it is classified to a filtering
- indentifier (FID) based on the ingress port and VLAN. This trap is used
- to trap packets for which a FID could not be found
diff --git a/Documentation/networking/devlink/bnxt.rst b/Documentation/networking/devlink/bnxt.rst
new file mode 100644
index 000000000000..79e746d22a53
--- /dev/null
+++ b/Documentation/networking/devlink/bnxt.rst
@@ -0,0 +1,41 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+bnxt devlink support
+====================
+
+This document describes the devlink features implemented by the ``bnxt``
+device driver.
+
+Parameters
+==========
+
+.. list-table:: Generic parameters implemented
+
+ * - Name
+ - Mode
+ * - ``enable_sriov``
+ - Permanent
+ * - ``ignore_ari``
+ - Permanent
+ * - ``msix_vec_per_pf_max``
+ - Permanent
+ * - ``msix_vec_per_pf_min``
+ - Permanent
+
+The ``bnxt`` driver also implements the following driver-specific
+parameters.
+
+.. list-table:: Driver-specific parameters implemented
+ :widths: 5 5 5 85
+
+ * - Name
+ - Type
+ - Mode
+ - Description
+ * - ``gre_ver_check``
+ - Boolean
+ - Permanent
+ - Generic Routing Encapsulation (GRE) version check will be enabled in
+ the device. If disabled, the device will skip the version check for
+ incoming packets.
diff --git a/Documentation/networking/devlink/devlink-dpipe.rst b/Documentation/networking/devlink/devlink-dpipe.rst
new file mode 100644
index 000000000000..468fe1001b74
--- /dev/null
+++ b/Documentation/networking/devlink/devlink-dpipe.rst
@@ -0,0 +1,252 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=============
+Devlink DPIPE
+=============
+
+Background
+==========
+
+While performing the hardware offloading process, much of the hardware
+specifics cannot be presented. These details are useful for debugging, and
+``devlink-dpipe`` provides a standardized way to provide visibility into the
+offloading process.
+
+For example, the routing longest prefix match (LPM) algorithm used by the
+Linux kernel may differ from the hardware implementation. The pipeline debug
+API (DPIPE) is aimed at providing the user visibility into the ASIC's
+pipeline in a generic way.
+
+The hardware offload process is expected to be done in a way that the user
+should not be able to distinguish between the hardware vs. software
+implementation. In this process, hardware specifics are neglected. In
+reality those details can have lots of meaning and should be exposed in some
+standard way.
+
+This problem is made even more complex when one wishes to offload the
+control path of the whole networking stack to a switch ASIC. Due to
+differences in the hardware and software models some processes cannot be
+represented correctly.
+
+One example is the kernel's LPM algorithm which in many cases differs
+greatly from the hardware implementation. The configuration API is the same,
+but one cannot rely on the Forwarding Information Base (FIB) to look like the
+Level Path Compression trie (LPC-trie) in hardware.
+
+In many situations, trying to analyze a system failure solely based on the
+kernel's dump may not be enough. By combining this data with complementary
+information about the underlying hardware, this debugging can be made
+easier; additionally, the information can be useful when debugging
+performance issues.
+
+Overview
+========
+
+The ``devlink-dpipe`` interface closes this gap. The hardware's pipeline is
+modeled as a graph of match/action tables. Each table represents a specific
+hardware block. This model is not new, first being used by the P4 language.
+
+Traditionally it has been used as an alternative model for hardware
+configuration, but the ``devlink-dpipe`` interface uses it for visibility
+purposes as a standard complementary tool. The system's view from
+``devlink-dpipe`` should change according to the changes done by the
+standard configuration tools.
+
+For example, it is quite common to implement Access Control Lists (ACL)
+using Ternary Content Addressable Memory (TCAM). The TCAM memory can be
+divided into TCAM regions. Complex TC filters can have multiple rules with
+different priorities and different lookup keys. On the other hand, hardware
+TCAM regions have a predefined lookup key. Offloading the TC filter rules
+using the TCAM engine can result in multiple TCAM regions being interconnected
+in a chain (which may affect the data path latency). In response to a new TC
+filter, new tables should be created describing those regions.
+
+Model
+=====
+
+The ``DPIPE`` model introduces several objects:
+
+ * headers
+ * tables
+ * entries
+
+A ``header`` describes packet formats and provides names for fields within
+the packet. A ``table`` describes hardware blocks. An ``entry`` describes
+the actual content of a specific table.
+
+The hardware pipeline is not port specific, but rather describes the whole
+ASIC. Thus it is tied to the top of the ``devlink`` infrastructure.
+
+Drivers can register and unregister tables at run time, in order to support
+dynamic behavior. This dynamic behavior is mandatory for describing hardware
+blocks like TCAM regions which can be allocated and freed dynamically.
+
+``devlink-dpipe`` generally is not intended for configuration. The exception
+is hardware counting for a specific table.
+
+The following commands are used to obtain the ``dpipe`` objects from
+userspace:
+
+ * ``table_get``: Receive a table's description.
+ * ``headers_get``: Receive a device's supported headers.
+ * ``entries_get``: Receive a table's current entries.
+ * ``counters_set``: Enable or disable counters on a table.
+
+Table
+-----
+
+The driver should implement the following operations for each table:
+
+ * ``matches_dump``: Dump the supported matches.
+ * ``actions_dump``: Dump the supported actions.
+ * ``entries_dump``: Dump the actual content of the table.
+ * ``counters_set_update``: Synchronize hardware with counters enabled or
+ disabled.
+
+Header/Field
+------------
+
+In a similar way to P4, headers and fields are used to describe a table's
+behavior. There is a slight difference between the standard protocol headers
+and specific ASIC metadata. The protocol headers should be declared in the
+``devlink`` core API. On the other hand, ASIC metadata is driver specific
+and should be defined in the driver. Additionally, each driver-specific
+devlink documentation file should document the driver-specific ``dpipe``
+headers it implements. The headers and fields are identified by enumeration.
+
+In order to provide further visibility some ASIC metadata fields could be
+mapped to kernel objects. For example, internal router interface indexes can
+be directly mapped to the net device ifindex. FIB table indexes used by
+different Virtual Routing and Forwarding (VRF) tables can be mapped to
+internal routing table indexes.
+
+Match
+-----
+
+Matches are kept primitive and close to hardware operation. Match types like
+LPM are not supported due to the fact that this is exactly a process we wish
+to describe in full detail. Examples of matches:
+
+ * ``field_exact``: Exact match on a specific field.
+ * ``field_exact_mask``: Exact match on a specific field after masking.
+ * ``field_range``: Match on a specific range.
+
+The IDs of the header and the field should be specified in order to
+identify the specific field. Furthermore, the header index should be
+specified in order to distinguish multiple headers of the same type in a
+packet (tunneling).
+
+Action
+------
+
+Similar to match, the actions are kept primitive and close to hardware
+operation. For example:
+
+ * ``field_modify``: Modify the field value.
+ * ``field_inc``: Increment the field value.
+ * ``push_header``: Add a header.
+ * ``pop_header``: Remove a header.
+
+Entry
+-----
+
+Entries of a specific table can be dumped on demand. Each entry is
+identified with an index and its properties are described by a list of
+match/action values and a specific counter. By dumping the tables' content, the
+interactions between tables can be resolved.
+
+Abstraction Example
+===================
+
+The following is an example of the abstraction model of the L3 part of
+Mellanox Spectrum ASIC. The blocks are described in the order they appear in
+the pipeline. The table sizes in the following examples are not real
+hardware sizes and are provided for demonstration purposes.
+
+LPM
+---
+
+The LPM algorithm can be implemented as a list of hash tables. Each hash
+table contains routes with the same prefix length. The root of the list is
+/32, and in case of a miss the hardware will continue to the next hash
+table. The depth of the search will affect the data path latency.
+
+In case of a hit the entry contains information about the next stage of the
+pipeline which resolves the MAC address. The next stage can be either the local
+host table for directly connected routes, or the adjacency table for next-hops.
+The ``meta.lpm_prefix`` field is used to connect two LPM tables.
+
+.. code::
+
+    table lpm_prefix_16 {
+      size: 4096,
+      counters_enabled: true,
+      match: { meta.vr_id: exact,
+               ipv4.dst_addr: exact_mask,
+               ipv6.dst_addr: exact_mask,
+               meta.lpm_prefix: exact },
+      action: { meta.adj_index: set,
+                meta.adj_group_size: set,
+                meta.rif_port: set,
+                meta.lpm_prefix: set },
+    }
+
+Local Host
+----------
+
+In the case of local routes the LPM lookup already resolves the egress
+router interface (RIF), yet the exact MAC address is not known. The local
+host table is a hash table combining the output interface id with
+destination IP address as a key. The result is the MAC address.
+
+.. code::
+
+    table local_host {
+      size: 4096,
+      counters_enabled: true,
+      match: { meta.rif_port: exact,
+               ipv4.dst_addr: exact },
+      action: { ethernet.daddr: set }
+    }
+
+Adjacency
+---------
+
+In case of remote routes this table does the ECMP. The LPM lookup results in
+ECMP group size and index that serves as a global offset into this table.
+Concurrently a hash of the packet is generated. Based on the ECMP group size
+and the packet's hash a local offset is generated. Multiple LPM entries can
+point to the same adjacency group.
+
+.. code::
+
+    table adjacency {
+      size: 4096,
+      counters_enabled: true,
+      match: { meta.adj_index: exact,
+               meta.adj_group_size: exact,
+               meta.packet_hash_index: exact },
+      action: { ethernet.daddr: set,
+                meta.erif: set }
+    }
+
+ERIF
+----
+
+In case the egress RIF and destination MAC have been resolved by previous
+tables this table does multiple operations like TTL decrease and MTU check.
+Then the decision of forward/drop is taken and the port L3 statistics are
+updated based on the packet's type (broadcast, unicast, multicast).
+
+.. code::
+
+    table erif {
+      size: 800,
+      counters_enabled: true,
+      match: { meta.rif_port: exact,
+               meta.is_l3_unicast: exact,
+               meta.is_l3_broadcast: exact,
+               meta.is_l3_multicast: exact },
+      action: { meta.l3_drop: set,
+                meta.l3_forward: set }
+    }
diff --git a/Documentation/networking/devlink/devlink-health.rst b/Documentation/networking/devlink/devlink-health.rst
new file mode 100644
index 000000000000..0c99b11f05f9
--- /dev/null
+++ b/Documentation/networking/devlink/devlink-health.rst
@@ -0,0 +1,114 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============
+Devlink Health
+==============
+
+Background
+==========
+
+The ``devlink`` health mechanism is targeted for Real Time Alerting, in
+order to know when something bad happened to a PCI device.
+
+ * Provide alert debug information.
+ * Self healing.
+ * If a problem needs vendor support, provide a way to gather all needed
+ debugging information.
+
+Overview
+========
+
+The main idea is to unify and centralize driver health reports in the
+generic ``devlink`` instance and allow the user to set different
+attributes of the health reporting and recovery procedures.
+
+The ``devlink`` health reporter:
+A device driver creates a "health reporter" for each error/health type.
+An error/health type can be known/generic (e.g. PCI error, FW error, RX/TX
+error) or unknown (driver specific).
+For each registered health reporter, a driver can issue error/health reports
+asynchronously. All health report handling is done by ``devlink``.
+A device driver can provide specific callbacks for each "health reporter", e.g.:
+
+ * Recovery procedures
+ * Diagnostics procedures
+ * Object dump procedures
+ * OOB initial parameters
+
+Different parts of the driver can register different types of health reporters
+with different handlers.
+
+Actions
+=======
+
+Once an error is reported, devlink health will perform the following actions:
+
+  * A log is sent to the kernel trace events buffer
+  * Health status and statistics are updated for the reporter instance
+  * An object dump is taken and saved at the reporter instance (as long as
+    no other dump is already stored)
+  * An auto recovery attempt is made, depending on:
+    - Auto-recovery configuration
+    - Grace period vs. time passed since last recovery
+
+User Interface
+==============
+
+Users can access/change each reporter's parameters and driver-specific callbacks
+via ``devlink``, e.g. per error type (per health reporter):
+
+ * Configure reporter's generic parameters (like: disable/enable auto recovery)
+ * Invoke recovery procedure
+ * Run diagnostics
+ * Object dump
+
+.. list-table:: List of devlink health interfaces
+ :widths: 10 90
+
+ * - Name
+ - Description
+ * - ``DEVLINK_CMD_HEALTH_REPORTER_GET``
+ - Retrieves status and configuration info per DEV and reporter.
+ * - ``DEVLINK_CMD_HEALTH_REPORTER_SET``
+ - Allows reporter-related configuration setting.
+ * - ``DEVLINK_CMD_HEALTH_REPORTER_RECOVER``
+ - Triggers a reporter's recovery procedure.
+ * - ``DEVLINK_CMD_HEALTH_REPORTER_DIAGNOSE``
+ - Retrieves diagnostics data from a reporter on a device.
+ * - ``DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET``
+ - Retrieves the last stored dump. Devlink health
+ saves a single dump. If a dump is not already stored by devlink
+ for this reporter, devlink generates a new dump.
+ The dump output is defined by the reporter.
+ * - ``DEVLINK_CMD_HEALTH_REPORTER_DUMP_CLEAR``
+ - Clears the last saved dump file for the specified reporter.
+
+The following diagram provides a general overview of ``devlink-health``::
+
+                                                    netlink
+                                          +--------------------------+
+                                          |                          |
+                                          |            +             |
+                                          |            |             |
+                                          +--------------------------+
+                                                       |request for ops
+                                                       |(diagnose,
+    mlx5_core                 devlink                  |recover,
+                                                       |dump)
+    +--------+                            +--------------------------+
+    |        |                            |    reporter|             |
+    |        |                            |  +---------v----------+  |
+    |        |   ops execution            |  |                    |  |
+    |     <----------------------------------+                    |  |
+    |        |                            |  |                    |  |
+    |        |                            |  + ^------------------+  |
+    |        |                            |    | request for ops     |
+    |        |                            |    | (recover, dump)     |
+    |        |                            |    |                     |
+    |        |                            |  +-+------------------+  |
+    |        |     health report          |  | health handler     |  |
+    |        +------------------------------->                    |  |
+    |        |                            |  +--------------------+  |
+    |        |  health reporter create    |                          |
+    |        +---------------------------->                          |
+    +--------+                            +--------------------------+
diff --git a/Documentation/networking/devlink/devlink-info.rst b/Documentation/networking/devlink/devlink-info.rst
new file mode 100644
index 000000000000..0385f15028b1
--- /dev/null
+++ b/Documentation/networking/devlink/devlink-info.rst
@@ -0,0 +1,94 @@
+.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+
+============
+Devlink Info
+============
+
+The ``devlink-info`` mechanism enables device drivers to report device
+information in a generic fashion. It is extensible, and enables exporting
+even device or driver specific information.
+
+``devlink`` supports representing the following types of versions:
+
+.. list-table:: List of version types
+ :widths: 5 95
+
+ * - Type
+ - Description
+ * - ``fixed``
+ - Represents fixed versions, which cannot change. For example,
+ component identifiers or the board version reported in the PCI VPD.
+ * - ``running``
+ - Represents the version of the currently running component. For
+ example the running version of firmware. These versions generally
+ only update after a reboot.
+ * - ``stored``
+ - Represents the version of a component as stored, such as after a
+ flash update. Stored values should update to reflect changes in the
+ flash even if a reboot has not yet occurred.
+
+Generic Versions
+================
+
+It is expected that drivers use the following generic names for exporting
+version information. Other information may be exposed using driver-specific
+names, but these should be documented in the driver-specific file.
+
+board.id
+--------
+
+Unique identifier of the board design.
+
+board.rev
+---------
+
+Board design revision.
+
+asic.id
+-------
+
+ASIC design identifier.
+
+asic.rev
+--------
+
+ASIC design revision.
+
+board.manufacture
+-----------------
+
+An identifier of the company or the facility which produced the part.
+
+fw
+--
+
+Overall firmware version, often representing the collection of
+fw.mgmt, fw.app, etc.
+
+fw.mgmt
+-------
+
+Control unit firmware version. This firmware is responsible for
+housekeeping tasks, PHY control etc. but not the packet-by-packet data path
+operation.
+
+fw.app
+------
+
+Data path microcode controlling high-speed packet processing.
+
+fw.undi
+-------
+
+UNDI software, may include the UEFI driver, firmware or both.
+
+fw.ncsi
+-------
+
+Version of the software responsible for supporting/handling the
+Network Controller Sideband Interface.
+
+fw.psid
+-------
+
+Unique identifier of the firmware parameter set.
diff --git a/Documentation/networking/devlink/devlink-params.rst b/Documentation/networking/devlink/devlink-params.rst
new file mode 100644
index 000000000000..da2f85c0fa21
--- /dev/null
+++ b/Documentation/networking/devlink/devlink-params.rst
@@ -0,0 +1,108 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============
+Devlink Params
+==============
+
+``devlink`` provides capability for a driver to expose device parameters for low
+level device functionality. Since devlink can operate at the device-wide
+level, it can be used to provide configuration that may affect multiple
+ports on a single device.
+
+This document describes a number of generic parameters that are supported
+across multiple drivers. Each driver is also free to add their own
+parameters. Each driver must document the specific parameters they support,
+whether generic or not.
+
+Configuration modes
+===================
+
+Parameters may be set in different configuration modes.
+
+.. list-table:: Possible configuration modes
+ :widths: 5 90
+
+ * - Name
+ - Description
+ * - ``runtime``
+ - set while the driver is running, and takes effect immediately. No
+ reset is required.
+ * - ``driverinit``
+ - applied while the driver initializes. Requires the user to restart
+ the driver using the ``devlink`` reload command.
+ * - ``permanent``
+ - written to the device's non-volatile memory. A hard reset is required
+ for it to take effect.
+
+Reloading
+---------
+
+In order for ``driverinit`` parameters to take effect, the driver must
+support reloading via the ``devlink-reload`` command. This command will
+request a reload of the device driver.
+
+Generic configuration parameters
+================================
+The following is a list of generic configuration parameters that drivers may
+add. Use of generic parameters is preferred over each driver creating their
+own name.
+
+.. list-table:: List of generic parameters
+ :widths: 5 5 90
+
+ * - Name
+ - Type
+ - Description
+ * - ``enable_sriov``
+ - Boolean
+ - Enable Single Root I/O Virtualization (SRIOV) in the device.
+ * - ``ignore_ari``
+ - Boolean
+ - Ignore Alternative Routing-ID Interpretation (ARI) capability. If
+ enabled, the adapter will ignore ARI capability even when the
+ platform has support enabled. The device will create the same number
+ of partitions as when the platform does not support ARI.
+ * - ``msix_vec_per_pf_max``
+ - u32
+ - Provides the maximum number of MSI-X interrupts that a device can
+ create. Value is the same across all physical functions (PFs) in the
+ device.
+ * - ``msix_vec_per_pf_min``
+ - u32
+ - Provides the minimum number of MSI-X interrupts required for the
+ device to initialize. Value is the same across all physical functions
+ (PFs) in the device.
+ * - ``fw_load_policy``
+ - u8
+ - Control the device's firmware loading policy.
+ - ``DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_DRIVER`` (0)
+ Load firmware version preferred by the driver.
+ - ``DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_FLASH`` (1)
+ Load firmware currently stored in flash.
+ - ``DEVLINK_PARAM_FW_LOAD_POLICY_VALUE_DISK`` (2)
+ Load firmware currently available on host's disk.
+ * - ``reset_dev_on_drv_probe``
+ - u8
+ - Controls the device's reset policy on driver probe.
+ - ``DEVLINK_PARAM_RESET_DEV_ON_DRV_PROBE_VALUE_UNKNOWN`` (0)
+ Unknown or invalid value.
+ - ``DEVLINK_PARAM_RESET_DEV_ON_DRV_PROBE_VALUE_ALWAYS`` (1)
+ Always reset device on driver probe.
+ - ``DEVLINK_PARAM_RESET_DEV_ON_DRV_PROBE_VALUE_NEVER`` (2)
+ Never reset device on driver probe.
+ - ``DEVLINK_PARAM_RESET_DEV_ON_DRV_PROBE_VALUE_DISK`` (3)
+ Reset the device only if firmware can be found in the filesystem.
+ * - ``enable_roce``
+ - Boolean
+ - Enable handling of RoCE traffic in the device.
+ * - ``internal_err_reset``
+ - Boolean
+ - When enabled, the device driver will reset the device on internal
+ errors.
+ * - ``max_macs``
+ - u32
+ - Specifies the maximum number of MAC addresses per ethernet port of
+ this device.
+ * - ``region_snapshot_enable``
+ - Boolean
+ - Enable capture of ``devlink-region`` snapshots.
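As a rough illustration of how the configuration modes above map onto the command line, the sketch below builds (but does not run) a `devlink dev param set` invocation. The device name `pci/0000:01:00.0` and the helper `param_set_argv` are hypothetical; whether a given driver accepts a parameter in a given mode is documented per driver.

```python
import shlex

# Configuration modes listed in the "Configuration modes" table.
CMODES = ("runtime", "driverinit", "permanent")

def param_set_argv(dev, name, value, cmode):
    """Build argv for `devlink dev param set`; nothing is executed here."""
    if cmode not in CMODES:
        raise ValueError(f"unknown configuration mode: {cmode!r}")
    # Booleans are spelled true/false on the devlink command line.
    if isinstance(value, bool):
        value = "true" if value else "false"
    return ["devlink", "dev", "param", "set", dev,
            "name", name, "value", str(value), "cmode", cmode]

# Hypothetical device; enable_sriov is one of the generic parameters.
argv = param_set_argv("pci/0000:01:00.0", "enable_sriov", True, "permanent")
print(shlex.join(argv))
```

A `driverinit` value set this way would only take effect after a reload, and a `permanent` one after a hard reset, as described in the table.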
diff --git a/Documentation/networking/devlink/devlink-region.rst b/Documentation/networking/devlink/devlink-region.rst
new file mode 100644
index 000000000000..1a7683e7acb2
--- /dev/null
+++ b/Documentation/networking/devlink/devlink-region.rst
@@ -0,0 +1,60 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============
+Devlink Region
+==============
+
+``devlink`` regions enable access to driver defined address regions using
+devlink.
+
+Each device can create and register its own supported address regions. The
+region can then be accessed via the devlink region interface.
+
+Region snapshots are collected by the driver, and can be accessed via read
+or dump commands. This allows future analysis on the created snapshots.
+Regions may optionally support triggering snapshots on demand.
+
+The major benefit to creating a region is to provide access to internal
+address regions that are otherwise inaccessible to the user.
+
+Regions may also be used to provide an additional way to debug complex error
+states, but see also :doc:`devlink-health`.
+
+example usage
+-------------
+
+.. code:: shell
+
+ $ devlink region help
+ $ devlink region show [ DEV/REGION ]
+ $ devlink region del DEV/REGION snapshot SNAPSHOT_ID
+ $ devlink region dump DEV/REGION [ snapshot SNAPSHOT_ID ]
+ $ devlink region read DEV/REGION [ snapshot SNAPSHOT_ID ]
+ address ADDRESS length LENGTH
+
+ # Show all of the exposed regions with region sizes:
+ $ devlink region show
+ pci/0000:00:05.0/cr-space: size 1048576 snapshot [1 2]
+ pci/0000:00:05.0/fw-health: size 64 snapshot [1 2]
+
+ # Delete a snapshot using:
+ $ devlink region del pci/0000:00:05.0/cr-space snapshot 1
+
+ # Trigger (request) a snapshot be taken:
+ $ devlink region trigger pci/0000:00:05.0/cr-space
+
+ # Dump a snapshot:
+ $ devlink region dump pci/0000:00:05.0/fw-health snapshot 1
+ 0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30
+ 0000000000000010 0000 0000 ffff ff04 0029 8c00 0028 8cc8
+ 0000000000000020 0016 0bb8 0016 1720 0000 0000 c00f 3ffc
+ 0000000000000030 bada cce5 bada cce5 bada cce5 bada cce5
+
+ # Read a specific part of a snapshot:
+ $ devlink region read pci/0000:00:05.0/fw-health snapshot 1 address 0
+ length 16
+ 0000000000000000 0014 95dc 0014 9514 0035 1670 0034 db30
+
+As regions are likely very device or driver specific, no generic regions are
+defined. See the driver-specific documentation files for information on the
+specific regions a driver supports.
diff --git a/Documentation/networking/devlink/devlink-resource.rst b/Documentation/networking/devlink/devlink-resource.rst
new file mode 100644
index 000000000000..93e92d2f0752
--- /dev/null
+++ b/Documentation/networking/devlink/devlink-resource.rst
@@ -0,0 +1,62 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+================
+Devlink Resource
+================
+
+``devlink`` provides the ability for drivers to register resources, which
+can allow administrators to see the device restrictions for a given
+resource, as well as how much of the given resource is currently
+in use. Additionally, these resources can optionally have configurable size.
+This could enable the administrator to limit the number of resources that
+are used.
+
+For example, the ``netdevsim`` driver enables ``/IPv4/fib`` and
+``/IPv4/fib-rules`` as resources to limit the number of IPv4 FIB entries and
+rules for a given device.
+
+Resource Ids
+============
+
+Each resource is represented by an id, and contains information about its
+current size and related sub resources. To access a sub resource, you
+specify the path of the resource. For example ``/IPv4/fib`` is the id for
+the ``fib`` sub-resource under the ``IPv4`` resource.
+
+example usage
+-------------
+
+The resources exposed by the driver can be observed, for example:
+
+.. code:: shell
+
+ $ devlink resource show pci/0000:03:00.0
+ pci/0000:03:00.0:
+ name kvd size 245760 unit entry
+ resources:
+ name linear size 98304 occ 0 unit entry size_min 0 size_max 147456 size_gran 128
+ name hash_double size 60416 unit entry size_min 32768 size_max 180224 size_gran 128
+ name hash_single size 87040 unit entry size_min 65536 size_max 212992 size_gran 128
+
+The size of some resources can be changed. For example:
+
+.. code:: shell
+
+ $ devlink resource set pci/0000:03:00.0 path /kvd/hash_single size 73088
+ $ devlink resource set pci/0000:03:00.0 path /kvd/hash_double size 74368
+
+The changes do not apply immediately. This can be verified through the
+``size_new`` attribute, which represents the pending change in size. For example:
+
+.. code:: shell
+
+ $ devlink resource show pci/0000:03:00.0
+ pci/0000:03:00.0:
+ name kvd size 245760 unit entry size_valid false
+ resources:
+ name linear size 98304 size_new 147456 occ 0 unit entry size_min 0 size_max 147456 size_gran 128
+ name hash_double size 60416 unit entry size_min 32768 size_max 180224 size_gran 128
+ name hash_single size 87040 unit entry size_min 65536 size_max 212992 size_gran 128
+
+Note that changes in resource size may require a device reload to properly
+take effect.
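The `size_min`, `size_max` and `size_gran` attributes in the output above suggest a simple sanity check before requesting a new size. The sketch below assumes the limits are inclusive bounds and that the size must be a multiple of the granularity; the text above does not spell this out, and `check_resource_size` is a hypothetical helper, not part of devlink.

```python
def check_resource_size(size, size_min, size_max, size_gran):
    """Return the reasons `size` would likely be rejected (empty list if none)."""
    problems = []
    if size < size_min:
        problems.append("below size_min")
    if size > size_max:
        problems.append("above size_max")
    # Assumption: a valid size is a whole number of granularity units.
    if size_gran and size % size_gran:
        problems.append("not a multiple of size_gran")
    return problems

# Limits for hash_single, taken from the sample output above.
print(check_resource_size(73088, 65536, 212992, 128))  # []
print(check_resource_size(73000, 65536, 212992, 128))  # ['not a multiple of size_gran']
```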
diff --git a/Documentation/networking/devlink-trap.rst b/Documentation/networking/devlink/devlink-trap.rst
index 03311849bfb1..47a429bb8658 100644
--- a/Documentation/networking/devlink-trap.rst
+++ b/Documentation/networking/devlink/devlink-trap.rst
@@ -223,6 +223,21 @@ be added to the following table:
* - ``ipv6_lpm_miss``
- ``exception``
- Traps unicast IPv6 packets that did not match any route
+ * - ``non_routable_packet``
+ - ``drop``
+ - Traps packets that the device decided to drop because they are not
+ supposed to be routed. For example, IGMP queries can be flooded by the
+ device in layer 2 and reach the router. Such packets should not be
+ routed and instead dropped
+ * - ``decap_error``
+ - ``exception``
+ - Traps NVE and IPinIP packets that the device decided to drop because of
+ failure during decapsulation (e.g., packet being too short, reserved
+ bits set in VXLAN header)
+ * - ``overlay_smac_is_mc``
+ - ``drop``
+ - Traps NVE packets that the device decided to drop because their overlay
+ source MAC is multicast
Driver-specific Packet Traps
============================
@@ -233,7 +248,8 @@ help debug packet drops caused by these exceptions. The following list includes
links to the description of driver-specific traps registered by various device
drivers:
- * :doc:`devlink-trap-netdevsim`
+ * :doc:`netdevsim`
+ * :doc:`mlxsw`
Generic Packet Trap Groups
==========================
@@ -258,6 +274,9 @@ narrow. The description of these groups must be added to the following table:
* - ``buffer_drops``
- Contains packet traps for packets that were dropped by the device due to
an enqueue decision
+ * - ``tunnel_drops``
+ - Contains packet traps for packets that were dropped by the device during
+ tunnel encapsulation / decapsulation
Testing
=======
diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst
new file mode 100644
index 000000000000..087ff54d53fc
--- /dev/null
+++ b/Documentation/networking/devlink/index.rst
@@ -0,0 +1,42 @@
+Linux Devlink Documentation
+===========================
+
+devlink is an API to expose device information and resources not directly
+related to any device class, such as chip-wide/switch-ASIC-wide configuration.
+
+Interface documentation
+-----------------------
+
+The following pages describe various interfaces available through devlink in
+general.
+
+.. toctree::
+ :maxdepth: 1
+
+ devlink-dpipe
+ devlink-health
+ devlink-info
+ devlink-params
+ devlink-region
+ devlink-resource
+ devlink-trap
+
+Driver-specific documentation
+-----------------------------
+
+Each driver that implements ``devlink`` is expected to document what
+parameters, info versions, and other features it supports.
+
+.. toctree::
+ :maxdepth: 1
+
+ bnxt
+ ionic
+ mlx4
+ mlx5
+ mlxsw
+ mv88e6xxx
+ netdevsim
+ nfp
+ qed
+ ti-cpsw-switch
diff --git a/Documentation/networking/devlink/ionic.rst b/Documentation/networking/devlink/ionic.rst
new file mode 100644
index 000000000000..48da9c92d584
--- /dev/null
+++ b/Documentation/networking/devlink/ionic.rst
@@ -0,0 +1,29 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================
+ionic devlink support
+=====================
+
+This document describes the devlink features implemented by the ``ionic``
+device driver.
+
+Info versions
+=============
+
+The ``ionic`` driver reports the following versions
+
+.. list-table:: devlink info versions implemented
+ :widths: 5 5 90
+
+ * - Name
+ - Type
+ - Description
+ * - ``fw``
+ - running
+ - Version of firmware running on the device
+ * - ``asic.id``
+ - fixed
+ - The ASIC type for this device
+ * - ``asic.rev``
+ - fixed
+ - The revision of the ASIC for this device
diff --git a/Documentation/networking/devlink/mlx4.rst b/Documentation/networking/devlink/mlx4.rst
new file mode 100644
index 000000000000..7b2d17ea5471
--- /dev/null
+++ b/Documentation/networking/devlink/mlx4.rst
@@ -0,0 +1,56 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+mlx4 devlink support
+====================
+
+This document describes the devlink features implemented by the ``mlx4``
+device driver.
+
+Parameters
+==========
+
+.. list-table:: Generic parameters implemented
+
+ * - Name
+ - Mode
+ * - ``internal_err_reset``
+ - driverinit, runtime
+ * - ``max_macs``
+ - driverinit
+ * - ``region_snapshot_enable``
+ - driverinit, runtime
+
+The ``mlx4`` driver also implements the following driver-specific
+parameters.
+
+.. list-table:: Driver-specific parameters implemented
+ :widths: 5 5 5 85
+
+ * - Name
+ - Type
+ - Mode
+ - Description
+ * - ``enable_64b_cqe_eqe``
+ - Boolean
+ - driverinit
+ - Enable 64 byte CQEs/EQEs, if the FW supports it.
+ * - ``enable_4k_uar``
+ - Boolean
+ - driverinit
+ - Enable using the 4k UAR.
+
+The ``mlx4`` driver supports reloading via ``DEVLINK_CMD_RELOAD``
+
+Regions
+=======
+
+The ``mlx4`` driver supports dumping the firmware PCI crspace and health
+buffer during a critical firmware issue.
+
+A snapshot will be taken by the driver when a firmware command times out,
+the firmware gets stuck, or the catastrophic buffer holds a non-zero value.
+
+The ``cr-space`` region will contain the firmware PCI crspace contents. The
+``fw-health`` region will contain the device firmware's health buffer.
+Snapshots for both of these regions are taken on the same event triggers.
diff --git a/Documentation/networking/devlink/mlx5.rst b/Documentation/networking/devlink/mlx5.rst
new file mode 100644
index 000000000000..629a6e69c036
--- /dev/null
+++ b/Documentation/networking/devlink/mlx5.rst
@@ -0,0 +1,59 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+mlx5 devlink support
+====================
+
+This document describes the devlink features implemented by the ``mlx5``
+device driver.
+
+Parameters
+==========
+
+.. list-table:: Generic parameters implemented
+
+ * - Name
+ - Mode
+ * - ``enable_roce``
+ - driverinit
+
+The ``mlx5`` driver also implements the following driver-specific
+parameters.
+
+.. list-table:: Driver-specific parameters implemented
+ :widths: 5 5 5 85
+
+ * - Name
+ - Type
+ - Mode
+ - Description
+ * - ``flow_steering_mode``
+ - string
+ - runtime
+ - Controls the flow steering mode of the driver
+
+ * ``dmfs`` Device managed flow steering. In DMFS mode, the HW
+ steering entities are created and managed through firmware.
+ * ``smfs`` Software managed flow steering. In SMFS mode, the HW
+ steering entities are created and managed through the driver without
+ firmware intervention.
+
+The ``mlx5`` driver supports reloading via ``DEVLINK_CMD_RELOAD``
+
+Info versions
+=============
+
+The ``mlx5`` driver reports the following versions
+
+.. list-table:: devlink info versions implemented
+ :widths: 5 5 90
+
+ * - Name
+ - Type
+ - Description
+ * - ``fw.psid``
+ - fixed
+ - Used to represent the board id of the device.
+ * - ``fw.version``
+ - stored, running
+ - Three digit major.minor.subminor firmware version number.
diff --git a/Documentation/networking/devlink/mlxsw.rst b/Documentation/networking/devlink/mlxsw.rst
new file mode 100644
index 000000000000..cf857cb4ba8f
--- /dev/null
+++ b/Documentation/networking/devlink/mlxsw.rst
@@ -0,0 +1,81 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================
+mlxsw devlink support
+=====================
+
+This document describes the devlink features implemented by the ``mlxsw``
+device driver.
+
+Parameters
+==========
+
+.. list-table:: Generic parameters implemented
+
+ * - Name
+ - Mode
+ * - ``fw_load_policy``
+ - driverinit
+
+The ``mlxsw`` driver also implements the following driver-specific
+parameters.
+
+.. list-table:: Driver-specific parameters implemented
+ :widths: 5 5 5 85
+
+ * - Name
+ - Type
+ - Mode
+ - Description
+ * - ``acl_region_rehash_interval``
+ - u32
+ - runtime
+ - Sets an interval for periodic ACL region rehashes. The value is
+ specified in milliseconds, with a minimum of ``3000``. The value of
+ ``0`` disables periodic work entirely. The first rehash will be run
+ immediately after the value is set.
+
+The ``mlxsw`` driver supports reloading via ``DEVLINK_CMD_RELOAD``
+
+Info versions
+=============
+
+The ``mlxsw`` driver reports the following versions
+
+.. list-table:: devlink info versions implemented
+ :widths: 5 5 90
+
+ * - Name
+ - Type
+ - Description
+ * - ``hw.revision``
+ - fixed
+ - The hardware revision for this board
+ * - ``fw.psid``
+ - fixed
+ - Firmware PSID
+ * - ``fw.version``
+ - running
+ - Three digit firmware version
+
+Driver-specific Traps
+=====================
+
+.. list-table:: List of Driver-specific Traps Registered by ``mlxsw``
+ :widths: 5 5 90
+
+ * - Name
+ - Type
+ - Description
+ * - ``irif_disabled``
+ - ``drop``
+ - Traps packets that the device decided to drop because they need to be
+ routed from a disabled router interface (RIF). This can happen during
+ RIF dismantle, when the RIF is first disabled before being removed
+ completely
+ * - ``erif_disabled``
+ - ``drop``
+ - Traps packets that the device decided to drop because they need to be
+ routed through a disabled router interface (RIF). This can happen during
+ RIF dismantle, when the RIF is first disabled before being removed
+ completely
diff --git a/Documentation/networking/devlink/mv88e6xxx.rst b/Documentation/networking/devlink/mv88e6xxx.rst
new file mode 100644
index 000000000000..c621212a47a1
--- /dev/null
+++ b/Documentation/networking/devlink/mv88e6xxx.rst
@@ -0,0 +1,28 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================
+mv88e6xxx devlink support
+=========================
+
+This document describes the devlink features implemented by the ``mv88e6xxx``
+device driver.
+
+Parameters
+==========
+
+The ``mv88e6xxx`` driver implements the following driver-specific parameters.
+
+.. list-table:: Driver-specific parameters implemented
+ :widths: 5 5 5 85
+
+ * - Name
+ - Type
+ - Mode
+ - Description
+ * - ``ATU_hash``
+ - u8
+ - runtime
+ - Select one of four possible hashing algorithms for MAC addresses in
+ the Address Translation Unit. A value of 3 may work better than the
+ default of 1 when many MAC addresses have the same OUI. Only the
+ values 0 to 3 are valid for this parameter.
diff --git a/Documentation/networking/devlink/netdevsim.rst b/Documentation/networking/devlink/netdevsim.rst
new file mode 100644
index 000000000000..2a266b7e7b38
--- /dev/null
+++ b/Documentation/networking/devlink/netdevsim.rst
@@ -0,0 +1,72 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================
+netdevsim devlink support
+=========================
+
+This document describes the ``devlink`` features supported by the
+``netdevsim`` device driver.
+
+Parameters
+==========
+
+.. list-table:: Generic parameters implemented
+
+ * - Name
+ - Mode
+ * - ``max_macs``
+ - driverinit
+
+The ``netdevsim`` driver also implements the following driver-specific
+parameters.
+
+.. list-table:: Driver-specific parameters implemented
+ :widths: 5 5 5 85
+
+ * - Name
+ - Type
+ - Mode
+ - Description
+ * - ``test1``
+ - Boolean
+ - driverinit
+ - Test parameter used to show how a driver-specific devlink parameter
+ can be implemented.
+
+The ``netdevsim`` driver supports reloading via ``DEVLINK_CMD_RELOAD``
+
+Regions
+=======
+
+The ``netdevsim`` driver exposes a ``dummy`` region as an example of how the
+devlink-region interfaces work. A snapshot is taken whenever the
+``take_snapshot`` debugfs file is written to.
+
+Resources
+=========
+
+The ``netdevsim`` driver exposes resources to control the number of FIB
+entries and FIB rule entries that the driver will allow.
+
+.. code:: shell
+
+ $ devlink resource set netdevsim/netdevsim0 path /IPv4/fib size 96
+ $ devlink resource set netdevsim/netdevsim0 path /IPv4/fib-rules size 16
+ $ devlink resource set netdevsim/netdevsim0 path /IPv6/fib size 64
+ $ devlink resource set netdevsim/netdevsim0 path /IPv6/fib-rules size 16
+ $ devlink dev reload netdevsim/netdevsim0
+
+Driver-specific Traps
+=====================
+
+.. list-table:: List of Driver-specific Traps Registered by ``netdevsim``
+ :widths: 5 5 90
+
+ * - Name
+ - Type
+ - Description
+ * - ``fid_miss``
+ - ``exception``
+ - When a packet enters the device it is classified to a filtering
+ identifier (FID) based on the ingress port and VLAN. This trap is used
+ to trap packets for which a FID could not be found
diff --git a/Documentation/networking/devlink/nfp.rst b/Documentation/networking/devlink/nfp.rst
new file mode 100644
index 000000000000..a1717db0dfcc
--- /dev/null
+++ b/Documentation/networking/devlink/nfp.rst
@@ -0,0 +1,65 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================
+nfp devlink support
+===================
+
+This document describes the devlink features implemented by the ``nfp``
+device driver.
+
+Parameters
+==========
+
+.. list-table:: Generic parameters implemented
+
+ * - Name
+ - Mode
+ * - ``fw_load_policy``
+ - permanent
+ * - ``reset_dev_on_drv_probe``
+ - permanent
+
+Info versions
+=============
+
+The ``nfp`` driver reports the following versions
+
+.. list-table:: devlink info versions implemented
+ :widths: 5 5 90
+
+ * - Name
+ - Type
+ - Description
+ * - ``board.id``
+ - fixed
+ - Part number identifying the board design
+ * - ``board.rev``
+ - fixed
+ - Revision of the board design
+ * - ``board.manufacture``
+ - fixed
+ - Vendor of the board design
+ * - ``board.model``
+ - fixed
+ - Model name of the board design
+ * - ``fw.bundle_id``
+ - stored, running
+ - Firmware bundle id
+ * - ``fw.mgmt``
+ - stored, running
+ - Version of the management firmware
+ * - ``fw.cpld``
+ - stored, running
+ - The CPLD firmware component version
+ * - ``fw.app``
+ - stored, running
+ - The APP firmware component version
+ * - ``fw.undi``
+ - stored, running
+ - The UNDI firmware component version
+ * - ``fw.ncsi``
+ - stored, running
+ - The NCSI firmware component version
+ * - ``chip.init``
+ - stored, running
+ - The CFGR firmware component version
diff --git a/Documentation/networking/devlink/qed.rst b/Documentation/networking/devlink/qed.rst
new file mode 100644
index 000000000000..805c6f63621a
--- /dev/null
+++ b/Documentation/networking/devlink/qed.rst
@@ -0,0 +1,26 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================
+qed devlink support
+===================
+
+This document describes the devlink features implemented by the ``qed`` core
+device driver.
+
+Parameters
+==========
+
+The ``qed`` driver implements the following driver-specific parameters.
+
+.. list-table:: Driver-specific parameters implemented
+ :widths: 5 5 5 85
+
+ * - Name
+ - Type
+ - Mode
+ - Description
+ * - ``iwarp_cmt``
+ - Boolean
+ - runtime
+ - Enable iWARP functionality for 100g devices. Note that this impacts
+ L2 performance, and is therefore not enabled by default.
diff --git a/Documentation/networking/devlink/ti-cpsw-switch.rst b/Documentation/networking/devlink/ti-cpsw-switch.rst
new file mode 100644
index 000000000000..dc399e32abaa
--- /dev/null
+++ b/Documentation/networking/devlink/ti-cpsw-switch.rst
@@ -0,0 +1,31 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================
+ti-cpsw-switch devlink support
+==============================
+
+This document describes the devlink features implemented by the ``ti-cpsw-switch``
+device driver.
+
+Parameters
+==========
+
+The ``ti-cpsw-switch`` driver implements the following driver-specific
+parameters.
+
+.. list-table:: Driver-specific parameters implemented
+ :widths: 5 5 5 85
+
+ * - Name
+ - Type
+ - Mode
+ - Description
+ * - ``ale_bypass``
+ - Boolean
+ - runtime
+ - Enables ALE_CONTROL(4).BYPASS mode for debugging purposes. In this
+ mode, all packets will be sent to the host port only.
+ * - ``switch_mode``
+ - Boolean
+ - runtime
+ - Enable switch mode
diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
new file mode 100644
index 000000000000..c60afba69e3c
--- /dev/null
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -0,0 +1,520 @@
+=============================
+Netlink interface for ethtool
+=============================
+
+
+Basic information
+=================
+
+Netlink interface for ethtool uses generic netlink family ``ethtool``
+(userspace application should use macros ``ETHTOOL_GENL_NAME`` and
+``ETHTOOL_GENL_VERSION`` defined in ``<linux/ethtool_netlink.h>`` uapi
+header). This family does not use a specific header; all information in
+requests and replies is passed using netlink attributes.
+
+The ethtool netlink interface uses extended ACK for error and warning
+reporting. Userspace application developers are encouraged to make these
+messages available to the user in a suitable way.
+
+Requests can be divided into three categories: "get" (retrieving information),
+"set" (setting parameters) and "action" (invoking an action).
+
+All "set" and "action" type requests require admin privileges
+(``CAP_NET_ADMIN`` in the namespace). Most "get" type requests are allowed for
+anyone but there are exceptions (where the response contains sensitive
+information). In some cases, the request as such is allowed for anyone but
+unprivileged users have attributes with sensitive information (e.g.
+wake-on-lan password) omitted.
+
+
+Conventions
+===========
+
+Attributes which represent a boolean value usually use NLA_U8 type so that we
+can distinguish three states: "on", "off" and "not present" (meaning the
+information is not available in "get" requests or value is not to be changed
+in "set" requests). For these attributes, the "true" value should be passed as
+number 1 but any non-zero value should be understood as "true" by recipient.
+In the tables below, "bool" denotes NLA_U8 attributes interpreted in this way.
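The three-state convention for "bool" attributes can be sketched as follows. This is only an illustration of the rule stated above; `decode_bool` and `encode_bool` are hypothetical helper names, and an absent attribute is modelled as `None`.

```python
def decode_bool(payload):
    """payload: the NLA_U8 value as an int, or None if the attribute is absent."""
    if payload is None:
        return None          # "not present": unavailable / leave unchanged
    return payload != 0      # any non-zero value is understood as "true"

def encode_bool(state):
    """Return the u8 value to send, or None to omit the attribute entirely."""
    if state is None:
        return None
    return 1 if state else 0  # "true" should be passed as number 1

print(decode_bool(None), decode_bool(0), decode_bool(7))
```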
+
+In the message structure descriptions below, if an attribute name is suffixed
+with "+", parent nest can contain multiple attributes of the same type. This
+implements an array of entries.
+
+
+Request header
+==============
+
+Each request or reply message contains a nested attribute with common header.
+Structure of this header is
+
+ ============================== ====== =============================
+ ``ETHTOOL_A_HEADER_DEV_INDEX`` u32 device ifindex
+ ``ETHTOOL_A_HEADER_DEV_NAME`` string device name
+ ``ETHTOOL_A_HEADER_FLAGS`` u32 flags common for all requests
+ ============================== ====== =============================
+
+``ETHTOOL_A_HEADER_DEV_INDEX`` and ``ETHTOOL_A_HEADER_DEV_NAME`` identify the
+device message relates to. One of them is sufficient in requests, if both are
+used, they must identify the same device. Some requests, e.g. global string
+sets, do not require device identification. Most ``GET`` requests also allow
+dump requests without device identification to query the same information for
+all devices providing it (each device in a separate message).
+
+``ETHTOOL_A_HEADER_FLAGS`` is a bitmap of request flags common for all request
+types. The interpretation of these flags is the same for all request types but
+the flags may not apply to every request type. Recognized flags are:
+
+ ================================= ===================================
+ ``ETHTOOL_FLAG_COMPACT_BITSETS`` use compact format bitsets in reply
+ ``ETHTOOL_FLAG_OMIT_REPLY`` omit optional reply (_SET and _ACT)
+ ================================= ===================================
+
+New request flags should follow the general idea that if the flag is not set,
+the behaviour is backward compatible, i.e. requests from old clients not aware
+of the flag should be interpreted the way the client expects. A client must
+not set flags it does not understand.
+
+
+Bit sets
+========
+
+For short bitmaps of (reasonably) fixed length, standard ``NLA_BITFIELD32``
+type is used. For arbitrary length bitmaps, ethtool netlink uses a nested
+attribute with contents of one of two forms: compact (two binary bitmaps
+representing bit values and mask of affected bits) and bit-by-bit (list of
+bits identified by either index or name).
+
+Verbose (bit-by-bit) bitsets allow sending symbolic names for bits together
+with their values which saves a round trip (when the bitset is passed in a
+request) or at least a second request (when the bitset is in a reply). This is
+useful for one shot applications like traditional ethtool command. On the
+other hand, long running applications like ethtool monitor (displaying
+notifications) or network management daemons may prefer fetching the names
+only once and using compact form to save message size. Notifications from
+ethtool netlink interface always use compact form for bitsets.
+
+A bitset can represent either a value/mask pair (``ETHTOOL_A_BITSET_NOMASK``
+not set) or a single bitmap (``ETHTOOL_A_BITSET_NOMASK`` set). In requests
+modifying a bitmap, the former changes the bits set in mask to values set in
+value and preserves the rest; the latter sets the bits set in the bitmap and
+clears the rest.
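The difference between the two interpretations can be shown with plain integers standing in for bitmaps. This is a minimal sketch of the semantics described above, not kernel code; `apply_bitset` is a hypothetical name and `mask=None` models `ETHTOOL_A_BITSET_NOMASK` being set.

```python
def apply_bitset(current, value, mask=None):
    """Return the bitmap that results from a modifying request."""
    if mask is None:
        # Single bitmap: bits in `value` are set, everything else is cleared.
        return value
    # Value/mask pair: bits selected by `mask` take their value from `value`,
    # all other bits keep their current state.
    return (current & ~mask) | (value & mask)

print(bin(apply_bitset(0b1010, 0b0001, 0b0011)))  # 0b1001
print(bin(apply_bitset(0b1010, 0b0001)))          # 0b1
```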
+
+Compact form: nested (bitset) attribute contents:
+
+ ============================ ====== ============================
+ ``ETHTOOL_A_BITSET_NOMASK`` flag no mask, only a list
+ ``ETHTOOL_A_BITSET_SIZE`` u32 number of significant bits
+ ``ETHTOOL_A_BITSET_VALUE`` binary bitmap of bit values
+ ``ETHTOOL_A_BITSET_MASK`` binary bitmap of valid bits
+ ============================ ====== ============================
+
+Value and mask must have length at least ``ETHTOOL_A_BITSET_SIZE`` bits
+rounded up to a multiple of 32 bits. They consist of 32-bit words in host byte
+order, words ordered from least significant to most significant (i.e. the same
+way as bitmaps are passed with ioctl interface).
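The layout rule above (length rounded up to whole 32-bit words, host byte order, least significant word first) can be sketched as below. `pack_bitmap` is a hypothetical helper that takes the bitmap as a Python integer; it only illustrates the payload layout and does not build a netlink attribute.

```python
import struct

def pack_bitmap(bits, size):
    """Pack `bits` into the byte layout used for a bitset of `size` bits."""
    if bits >> size:
        raise ValueError("bit set beyond the declared size")
    nwords = (size + 31) // 32            # round up to a multiple of 32 bits
    # Word 0 holds bits 0..31, word 1 holds bits 32..63, and so on.
    words = [(bits >> (32 * i)) & 0xFFFFFFFF for i in range(nwords)]
    # "=" selects native (host) byte order with standard 4-byte unsigned ints.
    return struct.pack(f"={nwords}I", *words)

payload = pack_bitmap(1 << 40, 41)
print(len(payload), struct.unpack("=2I", payload))  # 8 (0, 256)
```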
+
+For compact form, ``ETHTOOL_A_BITSET_SIZE`` and ``ETHTOOL_A_BITSET_VALUE`` are
+mandatory. ``ETHTOOL_A_BITSET_MASK`` attribute is mandatory if
+``ETHTOOL_A_BITSET_NOMASK`` is not set (bitset represents a value/mask pair);
+if ``ETHTOOL_A_BITSET_NOMASK`` is set, ``ETHTOOL_A_BITSET_MASK`` is not
+allowed (bitset represents a single bitmap).
+
+Kernel bit set length may differ from userspace length if older application is
+used on newer kernel or vice versa. If userspace bitmap is longer, an error is
+issued only if the request actually tries to set values of some bits not
+recognized by kernel.
+
+Bit-by-bit form: nested (bitset) attribute contents:
+
+ +------------------------------------+--------+-----------------------------+
+ | ``ETHTOOL_A_BITSET_NOMASK`` | flag | no mask, only a list |
+ +------------------------------------+--------+-----------------------------+
+ | ``ETHTOOL_A_BITSET_SIZE`` | u32 | number of significant bits |
+ +------------------------------------+--------+-----------------------------+
+ | ``ETHTOOL_A_BITSET_BITS`` | nested | array of bits |
+ +-+----------------------------------+--------+-----------------------------+
+ | | ``ETHTOOL_A_BITSET_BITS_BIT+`` | nested | one bit |
+ +-+-+--------------------------------+--------+-----------------------------+
+ | | | ``ETHTOOL_A_BITSET_BIT_INDEX`` | u32 | bit index (0 for LSB) |
+ +-+-+--------------------------------+--------+-----------------------------+
+ | | | ``ETHTOOL_A_BITSET_BIT_NAME`` | string | bit name |
+ +-+-+--------------------------------+--------+-----------------------------+
+ | | | ``ETHTOOL_A_BITSET_BIT_VALUE`` | flag | present if bit is set |
+ +-+-+--------------------------------+--------+-----------------------------+
+
+Bit size is optional for bit-by-bit form. ``ETHTOOL_A_BITSET_BITS`` nest can
+only contain ``ETHTOOL_A_BITSET_BITS_BIT`` attributes but there can be an
+arbitrary number of them. A bit may be identified by its index or by its
+name. When used in requests, listed bits are set to 0 or 1 according to
+``ETHTOOL_A_BITSET_BIT_VALUE``, the rest is preserved. A request fails if
+index exceeds kernel bit length or if name is not recognized.
+
+When ``ETHTOOL_A_BITSET_NOMASK`` flag is present, bitset is interpreted as
+a simple bitmap. ``ETHTOOL_A_BITSET_BIT_VALUE`` attributes are not used in
+such case. Such bitset represents a bitmap with listed bits set and the rest
+zero.
+
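As a rough sketch of the bit-by-bit semantics described above, the following Python function applies a list of bits to an integer bitmap. The dictionary layout (``index``, ``name``, ``value`` keys), the ``names`` lookup table, and the function name are hypothetical stand-ins for the nested netlink attributes; the real kernel parser is not shown here.

```python
from typing import Dict, Iterable, Optional


def apply_bit_list(current: int, kernel_bits: int, bits: Iterable[dict],
                   names: Optional[Dict[str, int]] = None,
                   nomask: bool = False) -> int:
    """Apply a bit-by-bit bitset to `current` (bitmaps are plain integers).

    Each entry of `bits` stands for one ETHTOOL_A_BITSET_BITS_BIT nest and
    may carry "index", "name" and "value" keys (hypothetical layout).
    """
    names = names or {}
    # With NOMASK the result is a fresh bitmap; otherwise unlisted bits
    # keep their current value.
    result = 0 if nomask else current
    for bit in bits:
        index = bit.get("index")
        if index is None:
            name = bit.get("name")
            if name not in names:
                raise ValueError(f"bit name not recognized: {name!r}")
            index = names[name]
        if index >= kernel_bits:
            raise ValueError(f"bit index {index} exceeds kernel bit length")
        # With NOMASK every listed bit is set and the value flag is unused.
        if nomask or bit.get("value", False):
            result |= 1 << index
        else:
            result &= ~(1 << index)
    return result
```

For example, with ``names={"autoneg": 2}``, clearing bit 0 by index and setting ``autoneg`` by name turns ``0b1001`` into ``0b1100``, leaving bit 3 untouched.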
+In requests, an application can use either form. The form used by kernel in
+a reply is determined by the ``ETHTOOL_FLAG_COMPACT_BITSETS`` flag in the
+flags field of the request header. Semantics of value and mask depend on the
+attribute.
+
+
+List of message types
+=====================
+
+All constants identifying message types use ``ETHTOOL_MSG_`` prefix and suffix
+according to message purpose:
+
+ ============== ======================================
+ ``_GET``       userspace request to retrieve data
+ ``_SET``       userspace request to set data
+ ``_ACT``       userspace request to perform an action
+ ``_GET_REPLY`` kernel reply to a ``GET`` request
+ ``_SET_REPLY`` kernel reply to a ``SET`` request
+ ``_ACT_REPLY`` kernel reply to an ``ACT`` request
+ ``_NTF``       kernel notification
+ ============== ======================================
+
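The naming convention can be illustrated with a small lookup. This is a sketch only: the table and function names are made up for the example, and the purpose strings simply repeat the descriptions listed above.

```python
# Suffix-to-purpose table, mirroring the list above (illustrative only).
SUFFIX_PURPOSE = {
    "_GET": "userspace request to retrieve data",
    "_SET": "userspace request to set data",
    "_ACT": "userspace request to perform an action",
    "_GET_REPLY": "kernel reply to a GET request",
    "_SET_REPLY": "kernel reply to a SET request",
    "_ACT_REPLY": "kernel reply to an ACT request",
    "_NTF": "kernel notification",
}


def message_purpose(name: str) -> str:
    """Describe a message type constant based on its suffix."""
    # Try longer suffixes first so that "_GET_REPLY" wins over "_REPLY"-less
    # entries regardless of dictionary order.
    for suffix in sorted(SUFFIX_PURPOSE, key=len, reverse=True):
        if name.endswith(suffix):
            return SUFFIX_PURPOSE[suffix]
    raise ValueError(f"unrecognized message type name: {name}")
```

Under this convention ``ETHTOOL_MSG_LINKINFO_NTF`` is classified as a kernel notification and ``ETHTOOL_MSG_STRSET_GET_REPLY`` as a kernel reply to a ``GET`` request.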
+Userspace to kernel:
+
+ ===================================== ================================
+ ``ETHTOOL_MSG_STRSET_GET``            get string set
+ ``ETHTOOL_MSG_LINKINFO_GET``          get link settings
+ ``ETHTOOL_MSG_LINKINFO_SET``          set link settings
+ ``ETHTOOL_MSG_LINKMODES_GET``         get link modes info
+ ``ETHTOOL_MSG_LINKMODES_SET``         set link modes info
+ ``ETHTOOL_MSG_LINKSTATE_GET``         get link state
+ ===================================== ================================
+
+Kernel to userspace:
+
+ ===================================== ================================
+ ``ETHTOOL_MSG_STRSET_GET_REPLY``      string set contents
+ ``ETHTOOL_MSG_LINKINFO_GET_REPLY``    link settings
+ ``ETHTOOL_MSG_LINKINFO_NTF``          link settings notification
+ ``ETHTOOL_MSG_LINKMODES_GET_REPLY``   link modes info
+ ``ETHTOOL_MSG_LINKMODES_NTF``         link modes notification
+ ``ETHTOOL_MSG_LINKSTATE_GET_REPLY``   link state info
+ ===================================== ================================
+
+``GET`` requests are sent by userspace applications to retrieve device
+information. They usually do not contain any message specific attributes.
+Kernel replies with the corresponding ``GET_REPLY`` message. For most types,
+a ``GET`` request with ``NLM_F_DUMP`` and no device identification can be
+used to query the information for all devices supporting the request.
+
+If the data can also be modified, a corresponding ``SET`` message with the
+same layout as the corresponding ``GET_REPLY`` is used to request changes.
+Only attributes where a change is requested are included in such a request
+(also, not all attributes may be changed). Replies to most ``SET`` requests
+consist only of error code and extack; if kernel provides additional data, it
+is sent in the form of a corresponding ``SET_REPLY`` message which can be
+suppressed by setting ``ETHTOOL_FLAG_OMIT_REPLY`` flag in request header.
+
+Data modification also triggers sending a ``NTF`` message with a notification.
+These usually bear only a subset of attributes which was affected by the
+change. The same notification is issued if the data is modified using other
+means (mostly ioctl ethtool interface). Unlike notifications from ethtool
+netlink code which are only sent if something actually changed, notifications
+triggered by ioctl interface may be sent even if the request did not actually
+change any data.
+
+``ACT`` messages request kernel (driver) to perform a specific action. If some
+information is reported by kernel (which can be suppressed by setting
+``ETHTOOL_FLAG_OMIT_REPLY`` flag in request header), the reply takes form of
+an ``ACT_REPLY`` message. Performing an action also triggers a notification
+(``NTF`` message).
+
+Later sections describe the format and semantics of these messages.
+
+
+STRSET_GET
+==========
+
+Requests contents of a string set as provided by ioctl commands
+``ETHTOOL_GSSET_INFO`` and ``ETHTOOL_GSTRINGS``. String sets are not user
+writeable so that the corresponding ``STRSET_SET`` message is only used in
+kernel replies. There are two types of string sets: global (independent of
+a device, e.g. device feature names) and device specific (e.g. device private
+flags).
+
+Request contents:
+
+ +---------------------------------------+--------+------------------------+
+ | ``ETHTOOL_A_STRSET_HEADER`` | nested | request header |
+ +---------------------------------------+--------+------------------------+
+ | ``ETHTOOL_A_STRSET_STRINGSETS`` | nested | string set to request |
+ +-+-------------------------------------+--------+------------------------+
+ | | ``ETHTOOL_A_STRINGSETS_STRINGSET+`` | nested | one string set |
+ +-+-+-----------------------------------+--------+------------------------+
+ | | | ``ETHTOOL_A_STRINGSET_ID`` | u32 | set id |
+ +-+-+-----------------------------------+--------+------------------------+
+
+Kernel response contents:
+
+ +---------------------------------------+--------+-----------------------+
+ | ``ETHTOOL_A_STRSET_HEADER`` | nested | reply header |
+ +---------------------------------------+--------+-----------------------+
+ | ``ETHTOOL_A_STRSET_STRINGSETS`` | nested | array of string sets |
+ +-+-------------------------------------+--------+-----------------------+
+ | | ``ETHTOOL_A_STRINGSETS_STRINGSET+`` | nested | one string set |
+ +-+-+-----------------------------------+--------+-----------------------+
+ | | | ``ETHTOOL_A_STRINGSET_ID`` | u32 | set id |
+ +-+-+-----------------------------------+--------+-----------------------+
+ | | | ``ETHTOOL_A_STRINGSET_COUNT`` | u32 | number of strings |
+ +-+-+-----------------------------------+--------+-----------------------+
+ | | | ``ETHTOOL_A_STRINGSET_STRINGS`` | nested | array of strings |
+ +-+-+-+---------------------------------+--------+-----------------------+
+ | | | | ``ETHTOOL_A_STRINGS_STRING+`` | nested | one string |
+ +-+-+-+-+-------------------------------+--------+-----------------------+
+ | | | | | ``ETHTOOL_A_STRING_INDEX`` | u32 | string index |
+ +-+-+-+-+-------------------------------+--------+-----------------------+
+ | | | | | ``ETHTOOL_A_STRING_VALUE`` | string | string value |
+ +-+-+-+-+-------------------------------+--------+-----------------------+
+ | ``ETHTOOL_A_STRSET_COUNTS_ONLY`` | flag | return only counts |
+ +---------------------------------------+--------+-----------------------+
+
+Device identification in request header is optional. Depending on its presence
+and the ``NLM_F_DUMP`` flag, there are three types of ``STRSET_GET`` requests:
+
+ - no ``NLM_F_DUMP``, no device: get "global" string sets
+ - no ``NLM_F_DUMP``, with device: get string sets related to the device
+ - ``NLM_F_DUMP``, no device: get device related string sets for all devices
+
+If there is no ``ETHTOOL_A_STRSET_STRINGSETS`` array, all string sets of
+requested type are returned, otherwise only those specified in the request.
+Flag ``ETHTOOL_A_STRSET_COUNTS_ONLY`` tells kernel to only return string
+counts of the sets, not the actual strings.
+
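The three request types listed above can be summarized as a small decision function. This is a sketch: the function name and the returned labels are hypothetical, and rejecting the fourth combination (``NLM_F_DUMP`` together with a device) is an assumption, since the text does not describe that case.

```python
def strset_request_kind(dump: bool, has_device: bool) -> str:
    """Classify a STRSET_GET request by NLM_F_DUMP and device presence."""
    if not dump and not has_device:
        return "global string sets"
    if not dump and has_device:
        return "string sets of one device"
    if dump and not has_device:
        return "device string sets for all devices"
    # NLM_F_DUMP with a device is not covered by the description above;
    # this sketch treats it as an error (assumption).
    raise ValueError("NLM_F_DUMP with device identification is not described")
```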
+
+LINKINFO_GET
+============
+
+Requests link settings as provided by ``ETHTOOL_GLINKSETTINGS`` except for
+link modes and autonegotiation related information. The request does not use
+any attributes.
+
+Request contents:
+
+ ==================================== ====== ==========================
+ ``ETHTOOL_A_LINKINFO_HEADER`` nested request header
+ ==================================== ====== ==========================
+
+Kernel response contents:
+
+ ==================================== ====== ==========================
+ ``ETHTOOL_A_LINKINFO_HEADER`` nested reply header
+ ``ETHTOOL_A_LINKINFO_PORT`` u8 physical port
+ ``ETHTOOL_A_LINKINFO_PHYADDR`` u8 phy MDIO address
+ ``ETHTOOL_A_LINKINFO_TP_MDIX`` u8 MDI(-X) status
+ ``ETHTOOL_A_LINKINFO_TP_MDIX_CTRL`` u8 MDI(-X) control
+ ``ETHTOOL_A_LINKINFO_TRANSCEIVER`` u8 transceiver
+ ==================================== ====== ==========================
+
+Attributes and their values have the same meaning as matching members of the
+corresponding ioctl structures.
+
+``LINKINFO_GET`` allows dump requests (kernel returns reply messages for all
+devices supporting the request).
+
+
+LINKINFO_SET
+============
+
+``LINKINFO_SET`` request allows setting some of the attributes reported by
+``LINKINFO_GET``.
+
+Request contents:
+
+ ==================================== ====== ==========================
+ ``ETHTOOL_A_LINKINFO_HEADER`` nested request header
+ ``ETHTOOL_A_LINKINFO_PORT`` u8 physical port
+ ``ETHTOOL_A_LINKINFO_PHYADDR`` u8 phy MDIO address
+ ``ETHTOOL_A_LINKINFO_TP_MDIX_CTRL`` u8 MDI(-X) control
+ ==================================== ====== ==========================
+
+MDI(-X) status and transceiver cannot be set; a request with the corresponding
+attributes is rejected.
+
+
+LINKMODES_GET
+=============
+
+Requests link modes (supported, advertised and peer advertised) and related
+information (autonegotiation status, link speed and duplex) as provided by
+``ETHTOOL_GLINKSETTINGS``. The request does not use any attributes.
+
+Request contents:
+
+ ==================================== ====== ==========================
+ ``ETHTOOL_A_LINKMODES_HEADER`` nested request header
+ ==================================== ====== ==========================
+
+Kernel response contents:
+
+ ==================================== ====== ==========================
+ ``ETHTOOL_A_LINKMODES_HEADER`` nested reply header
+ ``ETHTOOL_A_LINKMODES_AUTONEG`` u8 autonegotiation status
+ ``ETHTOOL_A_LINKMODES_OURS`` bitset advertised link modes
+ ``ETHTOOL_A_LINKMODES_PEER`` bitset partner link modes
+ ``ETHTOOL_A_LINKMODES_SPEED`` u32 link speed (Mb/s)
+ ``ETHTOOL_A_LINKMODES_DUPLEX`` u8 duplex mode
+ ==================================== ====== ==========================
+
+For ``ETHTOOL_A_LINKMODES_OURS``, value represents advertised modes and mask
+represents supported modes. ``ETHTOOL_A_LINKMODES_PEER`` in the reply is a bit
+list.
+
+``LINKMODES_GET`` allows dump requests (kernel returns reply messages for all
+devices supporting the request).
+
+
+LINKMODES_SET
+=============
+
+Request contents:
+
+ ==================================== ====== ==========================
+ ``ETHTOOL_A_LINKMODES_HEADER`` nested request header
+ ``ETHTOOL_A_LINKMODES_AUTONEG`` u8 autonegotiation status
+ ``ETHTOOL_A_LINKMODES_OURS`` bitset advertised link modes
+ ``ETHTOOL_A_LINKMODES_PEER`` bitset partner link modes
+ ``ETHTOOL_A_LINKMODES_SPEED`` u32 link speed (Mb/s)
+ ``ETHTOOL_A_LINKMODES_DUPLEX`` u8 duplex mode
+ ==================================== ====== ==========================
+
+``ETHTOOL_A_LINKMODES_OURS`` bit set allows setting advertised link modes. If
+autonegotiation is on (either set now or kept from before), advertised modes
+are not changed (no ``ETHTOOL_A_LINKMODES_OURS`` attribute) and at least one
+of speed and duplex is specified, kernel adjusts advertised modes to all
+supported modes matching speed, duplex or both (whatever is specified). With
+the ioctl interface, this autoselection is done on the ethtool side; the
+netlink interface is supposed to allow requesting changes without knowing
+what exactly kernel supports.
+
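The autoselection rule described above can be sketched as follows. This is an illustration of the documented behaviour rather than kernel code: the ``LinkMode`` tuple, the function name, and the example mode list are assumptions made for the example.

```python
from typing import Iterable, NamedTuple, Optional, Set


class LinkMode(NamedTuple):
    """Hypothetical description of one supported link mode."""
    name: str
    speed: int   # Mb/s
    duplex: str  # "half" or "full"


def adjust_advertised(autoneg: bool, ours_given: bool,
                      supported: Iterable[LinkMode], advertised: Set[str],
                      speed: Optional[int] = None,
                      duplex: Optional[str] = None) -> Set[str]:
    """Return the advertised mode names after a LINKMODES_SET request."""
    # Autoselection applies only if autonegotiation is on, the request did
    # not carry ETHTOOL_A_LINKMODES_OURS, and speed or duplex was given.
    if not autoneg or ours_given or (speed is None and duplex is None):
        return set(advertised)
    # Advertise every supported mode matching whatever was specified.
    return {m.name for m in supported
            if (speed is None or m.speed == speed)
            and (duplex is None or m.duplex == duplex)}


# Example mode list (illustrative values only).
EXAMPLE_MODES = [
    LinkMode("100baseT/Half", 100, "half"),
    LinkMode("100baseT/Full", 100, "full"),
    LinkMode("1000baseT/Full", 1000, "full"),
]
```

With ``EXAMPLE_MODES`` and autonegotiation on, requesting only ``speed=100`` selects both 100 Mb/s modes, and requesting only ``duplex="full"`` selects the two full duplex modes.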
+
+LINKSTATE_GET
+=============
+
+Requests link state information. At the moment, only link up/down flag (as
+provided by ``ETHTOOL_GLINK`` ioctl command) is provided but some future
+extensions are planned (e.g. link down reason). This request does not have any
+attributes.
+
+Request contents:
+
+ ==================================== ====== ==========================
+ ``ETHTOOL_A_LINKSTATE_HEADER`` nested request header
+ ==================================== ====== ==========================
+
+Kernel response contents:
+
+ ==================================== ====== ==========================
+ ``ETHTOOL_A_LINKSTATE_HEADER`` nested reply header
+ ``ETHTOOL_A_LINKSTATE_LINK`` bool link state (up/down)
+ ==================================== ====== ==========================
+
+For most NIC drivers, the value of ``ETHTOOL_A_LINKSTATE_LINK`` is the
+carrier flag provided by ``netif_carrier_ok()`` but there are drivers which
+define their own handler.
+
+``LINKSTATE_GET`` allows dump requests (kernel returns reply messages for all
+devices supporting the request).
+
+
+Request translation
+===================
+
+The following table maps ioctl commands to netlink commands providing their
+functionality. Entries with "n/a" in the right column are commands which do
+not have a netlink replacement yet.
+
+ =================================== =====================================
+ ioctl command netlink command
+ =================================== =====================================
+ ``ETHTOOL_GSET`` ``ETHTOOL_MSG_LINKINFO_GET``
+ ``ETHTOOL_MSG_LINKMODES_GET``
+ ``ETHTOOL_SSET`` ``ETHTOOL_MSG_LINKINFO_SET``
+ ``ETHTOOL_MSG_LINKMODES_SET``
+ ``ETHTOOL_GDRVINFO`` n/a
+ ``ETHTOOL_GREGS`` n/a
+ ``ETHTOOL_GWOL`` n/a
+ ``ETHTOOL_SWOL`` n/a
+ ``ETHTOOL_GMSGLVL`` n/a
+ ``ETHTOOL_SMSGLVL`` n/a
+ ``ETHTOOL_NWAY_RST`` n/a
+ ``ETHTOOL_GLINK`` ``ETHTOOL_MSG_LINKSTATE_GET``
+ ``ETHTOOL_GEEPROM`` n/a
+ ``ETHTOOL_SEEPROM`` n/a
+ ``ETHTOOL_GCOALESCE`` n/a
+ ``ETHTOOL_SCOALESCE`` n/a
+ ``ETHTOOL_GRINGPARAM`` n/a
+ ``ETHTOOL_SRINGPARAM`` n/a
+ ``ETHTOOL_GPAUSEPARAM`` n/a
+ ``ETHTOOL_SPAUSEPARAM`` n/a
+ ``ETHTOOL_GRXCSUM`` n/a
+ ``ETHTOOL_SRXCSUM`` n/a
+ ``ETHTOOL_GTXCSUM`` n/a
+ ``ETHTOOL_STXCSUM`` n/a
+ ``ETHTOOL_GSG`` n/a
+ ``ETHTOOL_SSG`` n/a
+ ``ETHTOOL_TEST`` n/a
+ ``ETHTOOL_GSTRINGS`` ``ETHTOOL_MSG_STRSET_GET``
+ ``ETHTOOL_PHYS_ID`` n/a
+ ``ETHTOOL_GSTATS`` n/a
+ ``ETHTOOL_GTSO`` n/a
+ ``ETHTOOL_STSO`` n/a
+ ``ETHTOOL_GPERMADDR`` rtnetlink ``RTM_GETLINK``
+ ``ETHTOOL_GUFO`` n/a
+ ``ETHTOOL_SUFO`` n/a
+ ``ETHTOOL_GGSO`` n/a
+ ``ETHTOOL_SGSO`` n/a
+ ``ETHTOOL_GFLAGS`` n/a
+ ``ETHTOOL_SFLAGS`` n/a
+ ``ETHTOOL_GPFLAGS`` n/a
+ ``ETHTOOL_SPFLAGS`` n/a
+ ``ETHTOOL_GRXFH`` n/a
+ ``ETHTOOL_SRXFH`` n/a
+ ``ETHTOOL_GGRO`` n/a
+ ``ETHTOOL_SGRO`` n/a
+ ``ETHTOOL_GRXRINGS`` n/a
+ ``ETHTOOL_GRXCLSRLCNT`` n/a
+ ``ETHTOOL_GRXCLSRULE`` n/a
+ ``ETHTOOL_GRXCLSRLALL`` n/a
+ ``ETHTOOL_SRXCLSRLDEL`` n/a
+ ``ETHTOOL_SRXCLSRLINS`` n/a
+ ``ETHTOOL_FLASHDEV`` n/a
+ ``ETHTOOL_RESET`` n/a
+ ``ETHTOOL_SRXNTUPLE`` n/a
+ ``ETHTOOL_GRXNTUPLE`` n/a
+ ``ETHTOOL_GSSET_INFO`` ``ETHTOOL_MSG_STRSET_GET``
+ ``ETHTOOL_GRXFHINDIR`` n/a
+ ``ETHTOOL_SRXFHINDIR`` n/a
+ ``ETHTOOL_GFEATURES`` n/a
+ ``ETHTOOL_SFEATURES`` n/a
+ ``ETHTOOL_GCHANNELS`` n/a
+ ``ETHTOOL_SCHANNELS`` n/a
+ ``ETHTOOL_SET_DUMP`` n/a
+ ``ETHTOOL_GET_DUMP_FLAG`` n/a
+ ``ETHTOOL_GET_DUMP_DATA`` n/a
+ ``ETHTOOL_GET_TS_INFO`` n/a
+ ``ETHTOOL_GMODULEINFO`` n/a
+ ``ETHTOOL_GMODULEEEPROM`` n/a
+ ``ETHTOOL_GEEE`` n/a
+ ``ETHTOOL_SEEE`` n/a
+ ``ETHTOOL_GRSSH`` n/a
+ ``ETHTOOL_SRSSH`` n/a
+ ``ETHTOOL_GTUNABLE`` n/a
+ ``ETHTOOL_STUNABLE`` n/a
+ ``ETHTOOL_GPHYSTATS`` n/a
+ ``ETHTOOL_PERQUEUE`` n/a
+ ``ETHTOOL_GLINKSETTINGS`` ``ETHTOOL_MSG_LINKINFO_GET``
+ ``ETHTOOL_MSG_LINKMODES_GET``
+ ``ETHTOOL_SLINKSETTINGS`` ``ETHTOOL_MSG_LINKINFO_SET``
+ ``ETHTOOL_MSG_LINKMODES_SET``
+ ``ETHTOOL_PHY_GTUNABLE`` n/a
+ ``ETHTOOL_PHY_STUNABLE`` n/a
+ ``ETHTOOL_GFECPARAM`` n/a
+ ``ETHTOOL_SFECPARAM`` n/a
+ =================================== =====================================
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 5acab1290e03..d07d9855dcd3 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -13,9 +13,8 @@ Contents:
can_ucan_protocol
device_drivers/index
dsa/index
- devlink-info-versions
- devlink-trap
- devlink-trap-netdevsim
+ devlink/index
+ ethtool-netlink
ieee802154
j1939
kapi
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 48ccb1b31160..5f53faff4e25 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -479,6 +479,10 @@ tcp_no_metrics_save - BOOLEAN
degradation. If set, TCP will not cache metrics on closing
connections.
+tcp_no_ssthresh_metrics_save - BOOLEAN
+ Controls whether TCP saves ssthresh metrics in the route cache.
+ Default is 1, which disables ssthresh metrics.
+
tcp_orphan_retries - INTEGER
This value influences the timeout of a locally closed TCP connection,
when RTO retransmissions remain unacknowledged.
diff --git a/Documentation/networking/phy.rst b/Documentation/networking/phy.rst
index e0a7c7af6525..1e4735cc0553 100644
--- a/Documentation/networking/phy.rst
+++ b/Documentation/networking/phy.rst
@@ -267,6 +267,24 @@ Some of the interface modes are described below:
duplex, pause or other settings. This is dependent on the MAC and/or
PHY behaviour.
+``PHY_INTERFACE_MODE_10GBASER``
+ This is the IEEE 802.3 Clause 49 defined 10GBASE-R protocol used with
+ various different mediums. Please refer to the IEEE standard for a
+ definition of this.
+
+ Note: 10GBASE-R is just one protocol that can be used with XFI and SFI.
+ XFI and SFI permit multiple protocols over a single SERDES lane, and
+ also define the electrical characteristics of the signals with a host
+ compliance board plugged into the host XFP/SFP connector. Therefore,
+ XFI and SFI are not PHY interface types in their own right.
+
+``PHY_INTERFACE_MODE_10GKR``
+ This is the IEEE 802.3 Clause 49 defined 10GBASE-R with Clause 73
+ autonegotiation. Please refer to the IEEE standard for further
+ information.
+
+ Note: due to legacy usage, some 10GBASE-R usage incorrectly makes
+ use of this definition.
Pause frames / flow control
===========================
diff --git a/Documentation/networking/sfp-phylink.rst b/Documentation/networking/sfp-phylink.rst
index a5e00a159d21..d753a309f9d1 100644
--- a/Documentation/networking/sfp-phylink.rst
+++ b/Documentation/networking/sfp-phylink.rst
@@ -251,7 +251,8 @@ this documentation.
phylink_mac_change(priv->phylink, link_is_up);
where ``link_is_up`` is true if the link is currently up or false
- otherwise.
+ otherwise. If a MAC is unable to provide these interrupts, then
+ it should set ``priv->phylink_config.pcs_poll = true;`` in step 9.
11. Verify that the driver does not call::