aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/testing/sysfs-class-net15
-rw-r--r--Documentation/ABI/testing/sysfs-class-net-dsa11
-rw-r--r--Documentation/ABI/testing/sysfs-class-net-qmi10
-rw-r--r--Documentation/bpf/bpf_design_QA.rst6
-rw-r--r--Documentation/bpf/bpf_devel_QA.rst11
-rw-r--r--Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml55
-rw-r--r--Documentation/devicetree/bindings/net/brcm,bcm4908-enet.yaml48
-rw-r--r--Documentation/devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt101
-rw-r--r--Documentation/devicetree/bindings/net/btusb.txt2
-rw-r--r--Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml11
-rw-r--r--Documentation/devicetree/bindings/net/dsa/arrow,xrs700x.yaml73
-rw-r--r--Documentation/devicetree/bindings/net/dsa/brcm,sf2.yaml173
-rw-r--r--Documentation/devicetree/bindings/net/dsa/mt7530.txt6
-rw-r--r--Documentation/devicetree/bindings/net/ethernet-controller.yaml1
-rw-r--r--Documentation/devicetree/bindings/net/marvell-pp2.txt6
-rw-r--r--Documentation/devicetree/bindings/net/qca,ar803x.yaml16
-rw-r--r--Documentation/devicetree/bindings/net/qcom,ipa.yaml15
-rw-r--r--Documentation/devicetree/bindings/net/renesas,etheravb.yaml2
-rw-r--r--Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml50
-rw-r--r--Documentation/devicetree/bindings/net/ti,k3-am654-cpts.yaml7
-rw-r--r--Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml85
-rw-r--r--Documentation/devicetree/bindings/net/xilinx_axienet.txt4
-rw-r--r--Documentation/driver-api/auxiliary_bus.rst2
-rw-r--r--Documentation/networking/bonding.rst13
-rw-r--r--Documentation/networking/device_drivers/ethernet/index.rst1
-rw-r--r--Documentation/networking/device_drivers/ethernet/intel/ice.rst1027
-rw-r--r--Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst70
-rw-r--r--Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst217
-rw-r--r--Documentation/networking/device_drivers/ethernet/ti/am65_nuss_cpsw_switchdev.rst143
-rw-r--r--Documentation/networking/devlink/am65-nuss-cpsw-switch.rst26
-rw-r--r--Documentation/networking/devlink/devlink-port.rst199
-rw-r--r--Documentation/networking/devlink/devlink-resource.rst14
-rw-r--r--Documentation/networking/devlink/devlink-trap.rst5
-rw-r--r--Documentation/networking/devlink/index.rst2
-rw-r--r--Documentation/networking/dsa/dsa.rst4
-rw-r--r--Documentation/networking/ethtool-netlink.rst11
-rw-r--r--Documentation/networking/filter.rst67
-rw-r--r--Documentation/networking/ip-sysctl.rst78
-rw-r--r--Documentation/networking/netdev-FAQ.rst16
-rw-r--r--Documentation/networking/netdev-features.rst21
-rw-r--r--Documentation/networking/phy.rst13
-rw-r--r--Documentation/networking/sfp-phylink.rst2
-rw-r--r--Documentation/networking/snmp_counter.rst28
-rw-r--r--Documentation/networking/timestamping.rst3
44 files changed, 2460 insertions, 210 deletions
diff --git a/Documentation/ABI/testing/sysfs-class-net b/Documentation/ABI/testing/sysfs-class-net
index 1f2002df5ba2..1419103d11f9 100644
--- a/Documentation/ABI/testing/sysfs-class-net
+++ b/Documentation/ABI/testing/sysfs-class-net
@@ -337,3 +337,18 @@ Contact: netdev@vger.kernel.org
Description:
32-bit unsigned integer counting the number of times the link has
been down
+
+What: /sys/class/net/<iface>/threaded
+Date: Jan 2021
+KernelVersion: 5.12
+Contact: netdev@vger.kernel.org
+Description:
+ Boolean value to control the threaded mode per device. User could
+ set this value to enable/disable threaded mode for all napi
+ belonging to this device, without the need to do device up/down.
+
+ Possible values:
+ == ==================================
+ 0 threaded mode disabled for this dev
+ 1 threaded mode enabled for this dev
+ == ==================================
diff --git a/Documentation/ABI/testing/sysfs-class-net-dsa b/Documentation/ABI/testing/sysfs-class-net-dsa
index 985d84c585c6..e2da26b44dd0 100644
--- a/Documentation/ABI/testing/sysfs-class-net-dsa
+++ b/Documentation/ABI/testing/sysfs-class-net-dsa
@@ -3,5 +3,12 @@ Date: August 2018
KernelVersion: 4.20
Contact: netdev@vger.kernel.org
Description:
- String indicating the type of tagging protocol used by the
- DSA slave network device.
+ On read, this file returns a string indicating the type of
+ tagging protocol used by the DSA network devices that are
+ attached to this master interface.
+ On write, this file changes the tagging protocol of the
+ attached DSA switches, if this operation is supported by the
+ driver. Changing the tagging protocol must be done with the DSA
+ interfaces and the master interface all administratively down.
+ See the "name" field of each registered struct dsa_device_ops
+ for a list of valid values.
diff --git a/Documentation/ABI/testing/sysfs-class-net-qmi b/Documentation/ABI/testing/sysfs-class-net-qmi
index c310db4ccbc2..ed79f5893421 100644
--- a/Documentation/ABI/testing/sysfs-class-net-qmi
+++ b/Documentation/ABI/testing/sysfs-class-net-qmi
@@ -48,3 +48,13 @@ Description:
Write a number ranging from 1 to 254 to delete a previously
created qmap mux based network device.
+
+What: /sys/class/net/<qmimux iface>/qmap/mux_id
+Date: January 2021
+KernelVersion: 5.12
+Contact: Daniele Palmas <dnlplm@gmail.com>
+Description:
+ Unsigned integer
+
+ Indicates the mux id associated to the qmimux network interface
+ during its creation.
diff --git a/Documentation/bpf/bpf_design_QA.rst b/Documentation/bpf/bpf_design_QA.rst
index 2df7b067ab93..0e15f9b05c9d 100644
--- a/Documentation/bpf/bpf_design_QA.rst
+++ b/Documentation/bpf/bpf_design_QA.rst
@@ -208,6 +208,12 @@ data structures and compile with kernel internal headers. Both of these
kernel internals are subject to change and can break with newer kernels
such that the program needs to be adapted accordingly.
+Q: Are tracepoints part of the stable ABI?
+------------------------------------------
+A: NO. Tracepoints are tied to internal implementation details hence they are
+subject to change and can break with newer kernels. BPF programs need to change
+accordingly when this happens.
+
Q: How much stack space a BPF program uses?
-------------------------------------------
A: Currently all program types are limited to 512 bytes of stack
diff --git a/Documentation/bpf/bpf_devel_QA.rst b/Documentation/bpf/bpf_devel_QA.rst
index 5b613d2a5f1a..2ed89abbf9a4 100644
--- a/Documentation/bpf/bpf_devel_QA.rst
+++ b/Documentation/bpf/bpf_devel_QA.rst
@@ -501,16 +501,19 @@ All LLVM releases can be found at: http://releases.llvm.org/
Q: Got it, so how do I build LLVM manually anyway?
--------------------------------------------------
-A: You need cmake and gcc-c++ as build requisites for LLVM. Once you have
-that set up, proceed with building the latest LLVM and clang version
+A: We recommend that developers who want the fastest incremental builds
+use the Ninja build system, you can find it in your system's package
+manager, usually the package is ninja or ninja-build.
+
+You need ninja, cmake and gcc-c++ as build requisites for LLVM. Once you
+have that set up, proceed with building the latest LLVM and clang version
from the git repositories::
$ git clone https://github.com/llvm/llvm-project.git
- $ mkdir -p llvm-project/llvm/build/install
+ $ mkdir -p llvm-project/llvm/build
$ cd llvm-project/llvm/build
$ cmake .. -G "Ninja" -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
-DLLVM_ENABLE_PROJECTS="clang" \
- -DBUILD_SHARED_LIBS=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_BUILD_RUNTIME=OFF
$ ninja
diff --git a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
index 1f133f4a2924..0467441d7037 100644
--- a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
+++ b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
@@ -74,17 +74,60 @@ allOf:
Any configuration is ignored when the phy-mode is set to "rmii".
amlogic,rx-delay-ns:
+ deprecated: true
enum:
- 0
- 2
default: 0
description:
- The internal RGMII RX clock delay (provided by this IP block) in
- nanoseconds. When phy-mode is set to "rgmii" then the RX delay
- should be explicitly configured. When the phy-mode is set to
- either "rgmii-id" or "rgmii-rxid" the RX clock delay is already
- provided by the PHY. Any configuration is ignored when the
- phy-mode is set to "rmii".
+ The internal RGMII RX clock delay in nanoseconds. Deprecated, use
+ rx-internal-delay-ps instead.
+
+ rx-internal-delay-ps:
+ default: 0
+
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - amlogic,meson8b-dwmac
+ - amlogic,meson8m2-dwmac
+ - amlogic,meson-gxbb-dwmac
+ - amlogic,meson-axg-dwmac
+ then:
+ properties:
+ rx-internal-delay-ps:
+ enum:
+ - 0
+ - 2000
+
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - amlogic,meson-g12a-dwmac
+ then:
+ properties:
+ rx-internal-delay-ps:
+ enum:
+ - 0
+ - 200
+ - 400
+ - 600
+ - 800
+ - 1000
+ - 1200
+ - 1400
+ - 1600
+ - 1800
+ - 2000
+ - 2200
+ - 2400
+ - 2600
+ - 2800
+ - 3000
properties:
compatible:
diff --git a/Documentation/devicetree/bindings/net/brcm,bcm4908-enet.yaml b/Documentation/devicetree/bindings/net/brcm,bcm4908-enet.yaml
new file mode 100644
index 000000000000..79c38ea14237
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/brcm,bcm4908-enet.yaml
@@ -0,0 +1,48 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/brcm,bcm4908-enet.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Broadcom BCM4908 Ethernet controller
+
+description: Broadcom's Ethernet controller integrated into BCM4908 family SoCs
+
+maintainers:
+ - Rafał Miłecki <rafal@milecki.pl>
+
+allOf:
+ - $ref: ethernet-controller.yaml#
+
+properties:
+ compatible:
+ const: brcm,bcm4908-enet
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ description: RX interrupt
+
+ interrupt-names:
+ const: rx
+
+required:
+ - reg
+ - interrupts
+ - interrupt-names
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/irq.h>
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+
+ ethernet@80002000 {
+ compatible = "brcm,bcm4908-enet";
+ reg = <0x80002000 0x1000>;
+
+ interrupts = <GIC_SPI 86 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-names = "rx";
+ };
diff --git a/Documentation/devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt b/Documentation/devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt
index 97ca62b0e14d..d0935d2afef8 100644
--- a/Documentation/devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt
+++ b/Documentation/devicetree/bindings/net/brcm,bcm7445-switch-v4.0.txt
@@ -1,108 +1,13 @@
* Broadcom Starfighter 2 integrated swich
-Required properties:
+See dsa/brcm,bcm7445-switch-v4.0.yaml for the documentation.
-- compatible: should be one of
- "brcm,bcm7445-switch-v4.0"
- "brcm,bcm7278-switch-v4.0"
- "brcm,bcm7278-switch-v4.8"
-- reg: addresses and length of the register sets for the device, must be 6
- pairs of register addresses and lengths
-- interrupts: interrupts for the devices, must be two interrupts
-- #address-cells: must be 1, see dsa/dsa.txt
-- #size-cells: must be 0, see dsa/dsa.txt
-
-Deprecated binding required properties:
+*Deprecated* binding required properties:
- dsa,mii-bus: phandle to the MDIO bus controller, see dsa/dsa.txt
- dsa,ethernet: phandle to the CPU network interface controller, see dsa/dsa.txt
- #address-cells: must be 2, see dsa/dsa.txt
-Subnodes:
-
-The integrated switch subnode should be specified according to the binding
-described in dsa/dsa.txt.
-
-Optional properties:
-
-- reg-names: litteral names for the device base register addresses, when present
- must be: "core", "reg", "intrl2_0", "intrl2_1", "fcb", "acb"
-
-- interrupt-names: litternal names for the device interrupt lines, when present
- must be: "switch_0" and "switch_1"
-
-- brcm,num-gphy: specify the maximum number of integrated gigabit PHYs in the
- switch
-
-- brcm,num-rgmii-ports: specify the maximum number of RGMII interfaces supported
- by the switch
-
-- brcm,fcb-pause-override: boolean property, if present indicates that the switch
- supports Failover Control Block pause override capability
-
-- brcm,acb-packets-inflight: boolean property, if present indicates that the switch
- Admission Control Block supports reporting the number of packets in-flight in a
- switch queue
-
-- resets: a single phandle and reset identifier pair. See
- Documentation/devicetree/bindings/reset/reset.txt for details.
-
-- reset-names: If the "reset" property is specified, this property should have
- the value "switch" to denote the switch reset line.
-
-- clocks: when provided, the first phandle is to the switch's main clock and
- is valid for both BCM7445 and BCM7278. The second phandle is only applicable
- to BCM7445 and is to support dividing the switch core clock.
-
-- clock-names: when provided, the first phandle must be "sw_switch", and the
- second must be named "sw_switch_mdiv".
-
-Port subnodes:
-
-Optional properties:
-
-- brcm,use-bcm-hdr: boolean property, if present, indicates that the switch
- port has Broadcom tags enabled (per-packet metadata)
-
-Example:
-
-switch_top@f0b00000 {
- compatible = "simple-bus";
- #size-cells = <1>;
- #address-cells = <1>;
- ranges = <0 0xf0b00000 0x40804>;
-
- ethernet_switch@0 {
- compatible = "brcm,bcm7445-switch-v4.0";
- #size-cells = <0>;
- #address-cells = <1>;
- reg = <0x0 0x40000
- 0x40000 0x110
- 0x40340 0x30
- 0x40380 0x30
- 0x40400 0x34
- 0x40600 0x208>;
- reg-names = "core", "reg", intrl2_0", "intrl2_1",
- "fcb, "acb";
- interrupts = <0 0x18 0
- 0 0x19 0>;
- brcm,num-gphy = <1>;
- brcm,num-rgmii-ports = <2>;
- brcm,fcb-pause-override;
- brcm,acb-packets-inflight;
-
- ports {
- #address-cells = <1>;
- #size-cells = <0>;
-
- port@0 {
- label = "gphy";
- reg = <0>;
- };
- };
- };
-};
-
Example using the old DSA DeviceTree binding:
switch_top@f0b00000 {
@@ -132,7 +37,7 @@ switch_top@f0b00000 {
switch@0 {
reg = <0 0>;
#size-cells = <0>;
- #address-cells <1>;
+ #address-cells = <1>;
port@0 {
label = "gphy";
diff --git a/Documentation/devicetree/bindings/net/btusb.txt b/Documentation/devicetree/bindings/net/btusb.txt
index b1ad6ee68e90..c51dd99dc0d3 100644
--- a/Documentation/devicetree/bindings/net/btusb.txt
+++ b/Documentation/devicetree/bindings/net/btusb.txt
@@ -38,7 +38,7 @@ Following example uses irq pin number 3 of gpio0 for out of band wake-on-bt:
compatible = "usb1286,204e";
reg = <1>;
interrupt-parent = <&gpio0>;
- interrupt-name = "wakeup";
+ interrupt-names = "wakeup";
interrupts = <3 IRQ_TYPE_LEVEL_LOW>;
};
};
diff --git a/Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml b/Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml
index 0d2df30f19db..fe6a949a2eab 100644
--- a/Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml
+++ b/Documentation/devicetree/bindings/net/can/fsl,flexcan.yaml
@@ -110,6 +110,16 @@ properties:
description:
Enable CAN remote wakeup.
+ fsl,scu-index:
+ description: |
+ The scu index of CAN instance.
+ For SoCs with SCU support, need setup stop mode via SCU firmware, so this
+ property can help indicate a resource. It supports up to 3 CAN instances
+ now.
+ $ref: /schemas/types.yaml#/definitions/uint8
+ minimum: 0
+ maximum: 2
+
required:
- compatible
- reg
@@ -137,4 +147,5 @@ examples:
clocks = <&clks 1>, <&clks 2>;
clock-names = "ipg", "per";
fsl,stop-mode = <&gpr 0x34 28>;
+ fsl,scu-index = /bits/ 8 <1>;
};
diff --git a/Documentation/devicetree/bindings/net/dsa/arrow,xrs700x.yaml b/Documentation/devicetree/bindings/net/dsa/arrow,xrs700x.yaml
new file mode 100644
index 000000000000..3f01b65f3b22
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/dsa/arrow,xrs700x.yaml
@@ -0,0 +1,73 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/dsa/arrow,xrs700x.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Arrow SpeedChips XRS7000 Series Switch Device Tree Bindings
+
+allOf:
+ - $ref: dsa.yaml#
+
+maintainers:
+ - George McCollister <george.mccollister@gmail.com>
+
+description:
+ The Arrow SpeedChips XRS7000 Series of single chip gigabit Ethernet switches
+ are designed for critical networking applications. They have up to three
+ RGMII ports and one RMII port and are managed via i2c or mdio.
+
+properties:
+ compatible:
+ oneOf:
+ - enum:
+ - arrow,xrs7003e
+ - arrow,xrs7003f
+ - arrow,xrs7004e
+ - arrow,xrs7004f
+
+ reg:
+ maxItems: 1
+
+required:
+ - compatible
+ - reg
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ i2c {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ switch@8 {
+ compatible = "arrow,xrs7004e";
+ reg = <0x8>;
+
+ ethernet-ports {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ ethernet-port@1 {
+ reg = <1>;
+ label = "lan0";
+ phy-handle = <&swphy0>;
+ phy-mode = "rgmii-id";
+ };
+ ethernet-port@2 {
+ reg = <2>;
+ label = "lan1";
+ phy-handle = <&swphy1>;
+ phy-mode = "rgmii-id";
+ };
+ ethernet-port@3 {
+ reg = <3>;
+ label = "cpu";
+ ethernet = <&fec1>;
+ fixed-link {
+ speed = <1000>;
+ full-duplex;
+ };
+ };
+ };
+ };
+ };
diff --git a/Documentation/devicetree/bindings/net/dsa/brcm,sf2.yaml b/Documentation/devicetree/bindings/net/dsa/brcm,sf2.yaml
new file mode 100644
index 000000000000..d730fe5a4355
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/dsa/brcm,sf2.yaml
@@ -0,0 +1,173 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/net/dsa/brcm,sf2.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Broadcom Starfighter 2 integrated swich
+
+maintainers:
+ - Florian Fainelli <f.fainelli@gmail.com>
+
+properties:
+ compatible:
+ items:
+ - enum:
+ - brcm,bcm4908-switch
+ - brcm,bcm7278-switch-v4.0
+ - brcm,bcm7278-switch-v4.8
+ - brcm,bcm7445-switch-v4.0
+
+ reg:
+ minItems: 6
+ maxItems: 6
+
+ reg-names:
+ items:
+ - const: core
+ - const: reg
+ - const: intrl2_0
+ - const: intrl2_1
+ - const: fcb
+ - const: acb
+
+ interrupts:
+ minItems: 2
+ maxItems: 2
+
+ interrupt-names:
+ items:
+ - const: switch_0
+ - const: switch_1
+
+ resets:
+ maxItems: 1
+
+ reset-names:
+ const: switch
+
+ clocks:
+ minItems: 1
+ maxItems: 2
+ items:
+ - description: switch's main clock
+ - description: dividing of the switch core clock
+
+ clock-names:
+ minItems: 1
+ maxItems: 2
+ items:
+ - const: sw_switch
+ - const: sw_switch_mdiv
+
+ brcm,num-gphy:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description: maximum number of integrated gigabit PHYs in the switch
+
+ brcm,num-rgmii-ports:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description: maximum number of RGMII interfaces supported by the switch
+
+ brcm,fcb-pause-override:
+ description: if present indicates that the switch supports Failover Control
+ Block pause override capability
+ type: boolean
+
+ brcm,acb-packets-inflight:
+ description: if present indicates that the switch Admission Control Block
+ supports reporting the number of packets in-flight in a switch queue
+ type: boolean
+
+ "#address-cells":
+ const: 1
+
+ "#size-cells":
+ const: 0
+
+ ports:
+ type: object
+
+ properties:
+ brcm,use-bcm-hdr:
+ description: if present, indicates that the switch port has Broadcom
+ tags enabled (per-packet metadata)
+ type: boolean
+
+required:
+ - reg
+ - interrupts
+ - "#address-cells"
+ - "#size-cells"
+
+allOf:
+ - $ref: "dsa.yaml#"
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - brcm,bcm7278-switch-v4.0
+ - brcm,bcm7278-switch-v4.8
+ then:
+ properties:
+ clocks:
+ minItems: 1
+ maxItems: 1
+ clock-names:
+ minItems: 1
+ maxItems: 1
+ required:
+ - clocks
+ - clock-names
+ - if:
+ properties:
+ compatible:
+ contains:
+ const: brcm,bcm7445-switch-v4.0
+ then:
+ properties:
+ clocks:
+ minItems: 2
+ maxItems: 2
+ clock-names:
+ minItems: 2
+ maxItems: 2
+ required:
+ - clocks
+ - clock-names
+
+additionalProperties: false
+
+examples:
+ - |
+ switch@f0b00000 {
+ compatible = "brcm,bcm7445-switch-v4.0";
+ #address-cells = <1>;
+ #size-cells = <0>;
+ reg = <0xf0b00000 0x40000>,
+ <0xf0b40000 0x110>,
+ <0xf0b40340 0x30>,
+ <0xf0b40380 0x30>,
+ <0xf0b40400 0x34>,
+ <0xf0b40600 0x208>;
+ reg-names = "core", "reg", "intrl2_0", "intrl2_1",
+ "fcb", "acb";
+ interrupts = <0 0x18 0>,
+ <0 0x19 0>;
+ clocks = <&sw_switch>, <&sw_switch_mdiv>;
+ clock-names = "sw_switch", "sw_switch_mdiv";
+ brcm,num-gphy = <1>;
+ brcm,num-rgmii-ports = <2>;
+ brcm,fcb-pause-override;
+ brcm,acb-packets-inflight;
+
+ ports {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ port@0 {
+ label = "gphy";
+ reg = <0>;
+ };
+ };
+ };
diff --git a/Documentation/devicetree/bindings/net/dsa/mt7530.txt b/Documentation/devicetree/bindings/net/dsa/mt7530.txt
index 560369efad6c..de04626a8e9d 100644
--- a/Documentation/devicetree/bindings/net/dsa/mt7530.txt
+++ b/Documentation/devicetree/bindings/net/dsa/mt7530.txt
@@ -76,6 +76,12 @@ phy-mode must be set, see also example 2 below!
* mt7621: phy-mode = "rgmii-txid";
* mt7623: phy-mode = "rgmii";
+Optional properties:
+
+- gpio-controller: Boolean; if defined, MT7530's LED controller will run on
+ GPIO mode.
+- #gpio-cells: Must be 2 if gpio-controller is defined.
+
See Documentation/devicetree/bindings/net/dsa/dsa.txt for a list of additional
required, optional properties and how the integrated switch subnodes must
be specified.
diff --git a/Documentation/devicetree/bindings/net/ethernet-controller.yaml b/Documentation/devicetree/bindings/net/ethernet-controller.yaml
index dac4aadb6e2e..f599c1d9c961 100644
--- a/Documentation/devicetree/bindings/net/ethernet-controller.yaml
+++ b/Documentation/devicetree/bindings/net/ethernet-controller.yaml
@@ -89,6 +89,7 @@ properties:
- trgmii
- 1000base-x
- 2500base-x
+ - 5gbase-r
- rxaui
- xaui
diff --git a/Documentation/devicetree/bindings/net/marvell-pp2.txt b/Documentation/devicetree/bindings/net/marvell-pp2.txt
index b78397669320..ce15c173f43f 100644
--- a/Documentation/devicetree/bindings/net/marvell-pp2.txt
+++ b/Documentation/devicetree/bindings/net/marvell-pp2.txt
@@ -1,5 +1,6 @@
* Marvell Armada 375 Ethernet Controller (PPv2.1)
Marvell Armada 7K/8K Ethernet Controller (PPv2.2)
+ Marvell CN913X Ethernet Controller (PPv2.3)
Required properties:
@@ -12,10 +13,11 @@ Required properties:
- common controller registers
- LMS registers
- one register area per Ethernet port
- For "marvell,armada-7k-pp2", must contain the following register
+ For "marvell,armada-7k-pp2" used by 7K/8K and CN913X, must contain the following register
sets:
- packet processor registers
- networking interfaces registers
+ - CM3 address space used for TX Flow Control
- clocks: pointers to the reference clocks for this device, consequently:
- main controller clock (for both armada-375-pp2 and armada-7k-pp2)
@@ -81,7 +83,7 @@ Example for marvell,armada-7k-pp2:
cpm_ethernet: ethernet@0 {
compatible = "marvell,armada-7k-pp22";
- reg = <0x0 0x100000>, <0x129000 0xb000>;
+ reg = <0x0 0x100000>, <0x129000 0xb000>, <0x220000 0x800>;
clocks = <&cpm_syscon0 1 3>, <&cpm_syscon0 1 9>,
<&cpm_syscon0 1 5>, <&cpm_syscon0 1 6>, <&cpm_syscon0 1 18>;
clock-names = "pp_clk", "gop_clk", "mg_clk", "mg_core_clk", "axi_clk";
diff --git a/Documentation/devicetree/bindings/net/qca,ar803x.yaml b/Documentation/devicetree/bindings/net/qca,ar803x.yaml
index 64b3357ade8a..b3d4013b7ca6 100644
--- a/Documentation/devicetree/bindings/net/qca,ar803x.yaml
+++ b/Documentation/devicetree/bindings/net/qca,ar803x.yaml
@@ -28,6 +28,10 @@ properties:
$ref: /schemas/types.yaml#/definitions/uint32
enum: [0, 1, 2]
+ qca,disable-smarteee:
+ description: Disable Atheros SmartEEE feature.
+ type: boolean
+
qca,keep-pll-enabled:
description: |
If set, keep the PLL enabled even if there is no link. Useful if you
@@ -36,6 +40,18 @@ properties:
Only supported on the AR8031.
type: boolean
+ qca,smarteee-tw-us-100m:
+ description: EEE Tw parameter for 100M links.
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 1
+ maximum: 255
+
+ qca,smarteee-tw-us-1g:
+ description: EEE Tw parameter for gigabit links.
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 1
+ maximum: 255
+
vddio-supply:
description: |
RGMII I/O voltage regulator (see regulator/regulator.yaml).
diff --git a/Documentation/devicetree/bindings/net/qcom,ipa.yaml b/Documentation/devicetree/bindings/net/qcom,ipa.yaml
index 8a2d12644675..8f86084bf12e 100644
--- a/Documentation/devicetree/bindings/net/qcom,ipa.yaml
+++ b/Documentation/devicetree/bindings/net/qcom,ipa.yaml
@@ -113,13 +113,6 @@ properties:
performing early IPA initialization, including loading and
validating firwmare used by the GSI.
- modem-remoteproc:
- $ref: /schemas/types.yaml#/definitions/phandle
- description:
- This defines the phandle to the remoteproc node representing
- the modem subsystem. This is requied so the IPA driver can
- receive and act on notifications of modem up/down events.
-
memory-region:
maxItems: 1
description:
@@ -135,7 +128,6 @@ required:
- interrupts
- interconnects
- qcom,smem-states
- - modem-remoteproc
oneOf:
- required:
@@ -147,7 +139,7 @@ additionalProperties: false
examples:
- |
- #include <dt-bindings/interrupt-controller/irq.h>
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
#include <dt-bindings/clock/qcom,rpmh.h>
#include <dt-bindings/interconnect/qcom,sdm845.h>
@@ -168,7 +160,6 @@ examples:
compatible = "qcom,sdm845-ipa";
modem-init;
- modem-remoteproc = <&mss_pil>;
iommus = <&apps_smmu 0x720 0x3>;
reg = <0x1e40000 0x7000>,
@@ -178,8 +169,8 @@ examples:
"ipa-shared",
"gsi";
- interrupts-extended = <&intc 0 311 IRQ_TYPE_EDGE_RISING>,
- <&intc 0 432 IRQ_TYPE_LEVEL_HIGH>,
+ interrupts-extended = <&intc GIC_SPI 311 IRQ_TYPE_EDGE_RISING>,
+ <&intc GIC_SPI 432 IRQ_TYPE_LEVEL_HIGH>,
<&ipa_smp2p_in 0 IRQ_TYPE_EDGE_RISING>,
<&ipa_smp2p_in 1 IRQ_TYPE_EDGE_RISING>;
interrupt-names = "ipa",
diff --git a/Documentation/devicetree/bindings/net/renesas,etheravb.yaml b/Documentation/devicetree/bindings/net/renesas,etheravb.yaml
index de9dd574a2f9..91ba96d43c6c 100644
--- a/Documentation/devicetree/bindings/net/renesas,etheravb.yaml
+++ b/Documentation/devicetree/bindings/net/renesas,etheravb.yaml
@@ -40,6 +40,7 @@ properties:
- renesas,etheravb-r8a77980 # R-Car V3H
- renesas,etheravb-r8a77990 # R-Car E3
- renesas,etheravb-r8a77995 # R-Car D3
+ - renesas,etheravb-r8a779a0 # R-Car V3U
- const: renesas,etheravb-rcar-gen3 # R-Car Gen3 and RZ/G2
reg: true
@@ -170,6 +171,7 @@ allOf:
- renesas,etheravb-r8a77965
- renesas,etheravb-r8a77970
- renesas,etheravb-r8a77980
+ - renesas,etheravb-r8a779a0
then:
required:
- tx-internal-delay-ps
diff --git a/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml b/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml
index c47b58f3e3f6..3fae9a5f0c6a 100644
--- a/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml
+++ b/Documentation/devicetree/bindings/net/ti,k3-am654-cpsw-nuss.yaml
@@ -4,7 +4,7 @@
$id: http://devicetree.org/schemas/net/ti,k3-am654-cpsw-nuss.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
-title: The TI AM654x/J721E SoC Gigabit Ethernet MAC (Media Access Controller) Device Tree Bindings
+title: The TI AM654x/J721E/AM642x SoC Gigabit Ethernet MAC (Media Access Controller) Device Tree Bindings
maintainers:
- Grygorii Strashko <grygorii.strashko@ti.com>
@@ -13,19 +13,16 @@ maintainers:
description:
The TI AM654x/J721E SoC Gigabit Ethernet MAC (CPSW2G NUSS) has two ports
(one external) and provides Ethernet packet communication for the device.
- CPSW2G NUSS features - the Reduced Gigabit Media Independent Interface (RGMII),
- Reduced Media Independent Interface (RMII), the Management Data
- Input/Output (MDIO) interface for physical layer device (PHY) management,
- new version of Common Platform Time Sync (CPTS), updated Address Lookup
- Engine (ALE).
- One external Ethernet port (port 1) with selectable RGMII/RMII interfaces and
- an internal Communications Port Programming Interface (CPPI5) (Host port 0).
+ The TI AM642x SoC Gigabit Ethernet MAC (CPSW3G NUSS) has three ports
+ (two external) and provides Ethernet packet communication and switching.
+
+ The internal Communications Port Programming Interface (CPPI5) (Host port 0).
Host Port 0 CPPI Packet Streaming Interface interface supports 8 TX channels
- and one RX channels and operating by TI AM654x/J721E NAVSS Unified DMA
- Peripheral Root Complex (UDMA-P) controller.
- The CPSW2G NUSS is integrated into device MCU domain named MCU_CPSW0.
+ and one RX channels and operating by NAVSS Unified DMA Peripheral Root
+ Complex (UDMA-P) controller.
- Additional features
+ CPSWxG features
+ updated Address Lookup Engine (ALE).
priority level Quality Of Service (QOS) support (802.1p)
Support for Audio/Video Bridging (P802.1Qav/D6.0)
Support for IEEE 1588 Clock Synchronization (2008 Annex D, Annex E and Annex F)
@@ -38,10 +35,18 @@ description:
VLAN support, 802.1Q compliant, Auto add port VLAN for untagged frames on
ingress, Auto VLAN removal on egress and auto pad to minimum frame size.
RX/TX csum offload
+ Management Data Input/Output (MDIO) interface for PHYs management
+ RMII/RGMII Interfaces support
+ new version of Common Platform Time Sync (CPTS)
+
+ The CPSWxG NUSS is integrated into
+ device MCU domain named MCU_CPSW0 on AM654x/J721E SoC.
+ device MAIN domain named CPSW0 on AM642x SoC.
Specifications can be found at
- http://www.ti.com/lit/ug/spruid7e/spruid7e.pdf
- http://www.ti.com/lit/ug/spruil1a/spruil1a.pdf
+ https://www.ti.com/lit/pdf/spruid7
+ https://www.ti.com/lit/zip/spruil1
+ https://www.ti.com/lit/pdf/spruim2
properties:
"#address-cells": true
@@ -51,11 +56,12 @@ properties:
oneOf:
- const: ti,am654-cpsw-nuss
- const: ti,j721e-cpsw-nuss
+ - const: ti,am642-cpsw-nuss
reg:
maxItems: 1
description:
- The physical base address and size of full the CPSW2G NUSS IO range
+ The physical base address and size of full the CPSWxG NUSS IO range
reg-names:
items:
@@ -66,12 +72,16 @@ properties:
dma-coherent: true
clocks:
- description: CPSW2G NUSS functional clock
+ description: CPSWxG NUSS functional clock
clock-names:
items:
- const: fck
+ assigned-clock-parents: true
+
+ assigned-clocks: true
+
power-domains:
maxItems: 1
@@ -99,16 +109,16 @@ properties:
const: 0
patternProperties:
- port@1:
+ port@[1-2]:
type: object
- description: CPSW2G NUSS external ports
+ description: CPSWxG NUSS external ports
$ref: ethernet-controller.yaml#
properties:
reg:
- items:
- - const: 1
+ minimum: 1
+ maximum: 2
description: CPSW port number
phys:
diff --git a/Documentation/devicetree/bindings/net/ti,k3-am654-cpts.yaml b/Documentation/devicetree/bindings/net/ti,k3-am654-cpts.yaml
index 9b7117920d90..ce43a1c58a57 100644
--- a/Documentation/devicetree/bindings/net/ti,k3-am654-cpts.yaml
+++ b/Documentation/devicetree/bindings/net/ti,k3-am654-cpts.yaml
@@ -73,6 +73,13 @@ properties:
items:
- const: cpts
+ assigned-clock-parents: true
+
+ assigned-clocks: true
+
+ power-domains:
+ maxItems: 1
+
ti,cpts-ext-ts-inputs:
$ref: /schemas/types.yaml#/definitions/uint32
maximum: 8
diff --git a/Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml b/Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml
new file mode 100644
index 000000000000..59724d18e6f3
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/toshiba,visconti-dwmac.yaml
@@ -0,0 +1,85 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: "http://devicetree.org/schemas/net/toshiba,visconti-dwmac.yaml#"
+$schema: "http://devicetree.org/meta-schemas/core.yaml#"
+
+title: Toshiba Visconti DWMAC Ethernet controller
+
+maintainers:
+ - Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
+
+select:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - toshiba,visconti-dwmac
+ required:
+ - compatible
+
+allOf:
+ - $ref: "snps,dwmac.yaml#"
+
+properties:
+ compatible:
+ oneOf:
+ - items:
+ - enum:
+ - toshiba,visconti-dwmac
+ - const: snps,dwmac-4.20a
+
+ reg:
+ maxItems: 1
+
+ clocks:
+ items:
+ - description: main clock
+ - description: PHY reference clock
+
+ clock-names:
+ items:
+ - const: stmmaceth
+ - const: phy_ref_clk
+
+required:
+ - compatible
+ - reg
+ - clocks
+ - clock-names
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+
+ soc {
+ #address-cells = <2>;
+ #size-cells = <2>;
+
+ piether: ethernet@28000000 {
+ compatible = "toshiba,visconti-dwmac", "snps,dwmac-4.20a";
+ reg = <0 0x28000000 0 0x10000>;
+ interrupts = <GIC_SPI 156 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-names = "macirq";
+ clocks = <&clk300mhz>, <&clk125mhz>;
+ clock-names = "stmmaceth", "phy_ref_clk";
+ snps,txpbl = <4>;
+ snps,rxpbl = <4>;
+ snps,tso;
+ phy-mode = "rgmii-id";
+ phy-handle = <&phy0>;
+
+ mdio0 {
+ #address-cells = <0x1>;
+ #size-cells = <0x0>;
+ compatible = "snps,dwmac-mdio";
+
+ phy0: ethernet-phy@1 {
+ device_type = "ethernet-phy";
+ reg = <0x1>;
+ };
+ };
+ };
+ };
diff --git a/Documentation/devicetree/bindings/net/xilinx_axienet.txt b/Documentation/devicetree/bindings/net/xilinx_axienet.txt
index 7360617cdedb..2cd452419ed0 100644
--- a/Documentation/devicetree/bindings/net/xilinx_axienet.txt
+++ b/Documentation/devicetree/bindings/net/xilinx_axienet.txt
@@ -38,6 +38,10 @@ Optional properties:
1 to enable partial TX checksum offload,
2 to enable full TX checksum offload
- xlnx,rxcsum : Same values as xlnx,txcsum but for RX checksum offload
+- xlnx,switch-x-sgmii : Boolean to indicate the Ethernet core is configured to
+ support both 1000BaseX and SGMII modes. If set, the phy-mode
+ should be set to match the mode selected on core reset (i.e.
+ by the basex_or_sgmii core input line).
- clocks : AXI bus clock for the device. Refer to common clock bindings.
Used to calculate MDIO clock divisor. If not specified, it is
auto-detected from the CPU clock (but only on platforms where
diff --git a/Documentation/driver-api/auxiliary_bus.rst b/Documentation/driver-api/auxiliary_bus.rst
index 2312506b0674..fff96c7ba7a8 100644
--- a/Documentation/driver-api/auxiliary_bus.rst
+++ b/Documentation/driver-api/auxiliary_bus.rst
@@ -1,5 +1,7 @@
.. SPDX-License-Identifier: GPL-2.0-only
+.. _auxiliary_bus:
+
=============
Auxiliary Bus
=============
diff --git a/Documentation/networking/bonding.rst b/Documentation/networking/bonding.rst
index adc314639085..5f690f0ad0e4 100644
--- a/Documentation/networking/bonding.rst
+++ b/Documentation/networking/bonding.rst
@@ -951,6 +951,19 @@ xmit_hash_policy
packets will be distributed according to the encapsulated
flows.
+ vlan+srcmac
+
+ This policy uses a very rudimentary vlan ID and source mac
+ hash to load-balance traffic per-vlan, with failover
+ should one leg fail. The intended use case is for a bond
+ shared by multiple virtual machines, all configured to
+ use their own vlan, to give lacp-like functionality
+ without requiring lacp-capable switching hardware.
+
+ The formula for the hash is simply
+
+ hash = (vlan ID) XOR (source MAC vendor) XOR (source MAC dev)
+
The default value is layer2. This option was added in bonding
version 2.6.3. In earlier versions of bonding, this parameter
does not exist, and the layer2 policy is the only policy. The
diff --git a/Documentation/networking/device_drivers/ethernet/index.rst b/Documentation/networking/device_drivers/ethernet/index.rst
index cbb75a1818c0..6b5dc203da2b 100644
--- a/Documentation/networking/device_drivers/ethernet/index.rst
+++ b/Documentation/networking/device_drivers/ethernet/index.rst
@@ -49,6 +49,7 @@ Contents:
stmicro/stmmac
ti/cpsw
ti/cpsw_switchdev
+ ti/am65_nuss_cpsw_switchdev
ti/tlan
toshiba/spider_net
diff --git a/Documentation/networking/device_drivers/ethernet/intel/ice.rst b/Documentation/networking/device_drivers/ethernet/intel/ice.rst
index ee43ea57d443..e7d9cbff771b 100644
--- a/Documentation/networking/device_drivers/ethernet/intel/ice.rst
+++ b/Documentation/networking/device_drivers/ethernet/intel/ice.rst
@@ -1,46 +1,1031 @@
.. SPDX-License-Identifier: GPL-2.0+
-==================================================================
-Linux Base Driver for the Intel(R) Ethernet Connection E800 Series
-==================================================================
+=================================================================
+Linux Base Driver for the Intel(R) Ethernet Controller 800 Series
+=================================================================
Intel ice Linux driver.
-Copyright(c) 2018 Intel Corporation.
+Copyright(c) 2018-2021 Intel Corporation.
Contents
========
-- Enabling the driver
-- Support
+- Overview
+- Identifying Your Adapter
+- Important Notes
+- Additional Features & Configurations
+- Performance Optimization
-The driver in this release supports Intel's E800 Series of products. For
-more information, visit Intel's support page at https://support.intel.com.
-Enabling the driver
-===================
-The driver is enabled via the standard kernel configuration system,
-using the make command::
+The associated Virtual Function (VF) driver for this driver is iavf.
- make oldconfig/menuconfig/etc.
+Driver information can be obtained using ethtool and lspci.
-The driver is located in the menu structure at:
+For questions related to hardware requirements, refer to the documentation
+supplied with your Intel adapter. All hardware requirements listed apply to use
+with Linux.
+
+This driver supports XDP (Express Data Path) and AF_XDP zero-copy. Note that
+XDP is blocked for frame sizes larger than 3KB.
+
+
+Identifying Your Adapter
+========================
+For information on how to identify your adapter, and for the latest Intel
+network drivers, refer to the Intel Support website:
+https://www.intel.com/support
+
+
+Important Notes
+===============
+
+Packet drops may occur under receive stress
+-------------------------------------------
+Devices based on the Intel(R) Ethernet Controller 800 Series are designed to
+tolerate a limited amount of system latency during PCIe and DMA transactions.
+If these transactions take longer than the tolerated latency, it can impact the
+length of time the packets are buffered in the device and associated memory,
+which may result in dropped packets. These packets drops typically do not have
+a noticeable impact on throughput and performance under standard workloads.
+
+If these packet drops appear to affect your workload, the following may improve
+the situation:
+
+1) Make sure that your system's physical memory is in a high-performance
+ configuration, as recommended by the platform vendor. A common
+ recommendation is for all channels to be populated with a single DIMM
+ module.
+2) In your system's BIOS/UEFI settings, select the "Performance" profile.
+3) Your distribution may provide tools like "tuned," which can help tweak
+ kernel settings to achieve better standard settings for different workloads.
+
+
+Configuring SR-IOV for improved network security
+------------------------------------------------
+In a virtualized environment, on Intel(R) Ethernet Network Adapters that
+support SR-IOV, the virtual function (VF) may be subject to malicious behavior.
+Software-generated layer two frames, like IEEE 802.3x (link flow control), IEEE
+802.1Qbb (priority based flow-control), and others of this type, are not
+expected and can throttle traffic between the host and the virtual switch,
+reducing performance. To resolve this issue, and to ensure isolation from
+unintended traffic streams, configure all SR-IOV enabled ports for VLAN tagging
+from the administrative interface on the PF. This configuration allows
+unexpected, and potentially malicious, frames to be dropped.
+
+See "Configuring VLAN Tagging on SR-IOV Enabled Adapter Ports" later in this
+README for configuration instructions.
+
+
+Do not unload port driver if VF with active VM is bound to it
+-------------------------------------------------------------
+Do not unload a port's driver if a Virtual Function (VF) with an active Virtual
+Machine (VM) is bound to it. Doing so will cause the port to appear to hang.
+Once the VM shuts down, or otherwise releases the VF, the command will
+complete.
+
+
+Important notes for SR-IOV and Link Aggregation
+-----------------------------------------------
+Link Aggregation is mutually exclusive with SR-IOV.
+
+- If Link Aggregation is active, SR-IOV VFs cannot be created on the PF.
+- If SR-IOV is active, you cannot set up Link Aggregation on the interface.
+
+Bridging and MACVLAN are also affected by this. If you wish to use bridging or
+MACVLAN with SR-IOV, you must set up bridging or MACVLAN before enabling
+SR-IOV. If you are using bridging or MACVLAN in conjunction with SR-IOV, and
+you want to remove the interface from the bridge or MACVLAN, you must follow
+these steps:
+
+1. Destroy SR-IOV VFs if they exist
+2. Remove the interface from the bridge or MACVLAN
+3. Recreate SRIOV VFs as needed
+
+
+Additional Features and Configurations
+======================================
+
+ethtool
+-------
+The driver utilizes the ethtool interface for driver configuration and
+diagnostics, as well as displaying statistical information. The latest ethtool
+version is required for this functionality. Download it at:
+https://kernel.org/pub/software/network/ethtool/
+
+NOTE: The rx_bytes value of ethtool does not match the rx_bytes value of
+Netdev, due to the 4-byte CRC being stripped by the device. The difference
+between the two rx_bytes values will be 4 x the number of Rx packets. For
+example, if Rx packets are 10 and Netdev (software statistics) displays
+rx_bytes as "X", then ethtool (hardware statistics) will display rx_bytes as
+"X+40" (4 bytes CRC x 10 packets).
+
+
+Viewing Link Messages
+---------------------
+Link messages will not be displayed to the console if the distribution is
+restricting system messages. In order to see network driver link messages on
+your console, set dmesg to eight by entering the following::
+
+ # dmesg -n 8
+
+NOTE: This setting is not saved across reboots.
+
+
+Dynamic Device Personalization
+------------------------------
+Dynamic Device Personalization (DDP) allows you to change the packet processing
+pipeline of a device by applying a profile package to the device at runtime.
+Profiles can be used to, for example, add support for new protocols, change
+existing protocols, or change default settings. DDP profiles can also be rolled
+back without rebooting the system.
+
+The DDP package loads during device initialization. The driver looks for
+``intel/ice/ddp/ice.pkg`` in your firmware root (typically ``/lib/firmware/``
+or ``/lib/firmware/updates/``) and checks that it contains a valid DDP package
+file.
+
+NOTE: Your distribution should likely have provided the latest DDP file, but if
+ice.pkg is missing, you can find it in the linux-firmware repository or from
+intel.com.
+
+If the driver is unable to load the DDP package, the device will enter Safe
+Mode. Safe Mode disables advanced and performance features and supports only
+basic traffic and minimal functionality, such as updating the NVM or
+downloading a new driver or DDP package. Safe Mode only applies to the affected
+physical function and does not impact any other PFs. See the "Intel(R) Ethernet
+Adapters and Devices User Guide" for more details on DDP and Safe Mode.
+
+NOTES:
+
+- If you encounter issues with the DDP package file, you may need to download
+ an updated driver or DDP package file. See the log messages for more
+ information.
+
+- The ice.pkg file is a symbolic link to the default DDP package file.
+
+- You cannot update the DDP package if any PF drivers are already loaded. To
+ overwrite a package, unload all PFs and then reload the driver with the new
+ package.
+
+- Only the first loaded PF per device can download a package for that device.
+
+You can install specific DDP package files for different physical devices in
+the same system. To install a specific DDP package file:
+
+1. Download the DDP package file you want for your device.
+
+2. Rename the file ice-xxxxxxxxxxxxxxxx.pkg, where 'xxxxxxxxxxxxxxxx' is the
+ unique 64-bit PCI Express device serial number (in hex) of the device you
+ want the package downloaded on. The filename must include the complete
+ serial number (including leading zeros) and be all lowercase. For example,
+ if the 64-bit serial number is b887a3ffffca0568, then the file name would be
+ ice-b887a3ffffca0568.pkg.
+
+ To find the serial number from the PCI bus address, you can use the
+ following command::
+
+ # lspci -vv -s af:00.0 | grep -i Serial
+ Capabilities: [150 v1] Device Serial Number b8-87-a3-ff-ff-ca-05-68
+
+ You can use the following command to format the serial number without the
+ dashes::
+
+ # lspci -vv -s af:00.0 | grep -i Serial | awk '{print $7}' | sed s/-//g
+ b887a3ffffca0568
+
+3. Copy the renamed DDP package file to
+ ``/lib/firmware/updates/intel/ice/ddp/``. If the directory does not yet
+ exist, create it before copying the file.
+
+4. Unload all of the PFs on the device.
+
+5. Reload the driver with the new package.
+
+NOTE: The presence of a device-specific DDP package file overrides the loading
+of the default DDP package file (ice.pkg).
+
+
+Intel(R) Ethernet Flow Director
+-------------------------------
+The Intel Ethernet Flow Director performs the following tasks:
+
+- Directs receive packets according to their flows to different queues
+- Enables tight control on routing a flow in the platform
+- Matches flows and CPU cores for flow affinity
+
+NOTE: This driver supports the following flow types:
+
+- IPv4
+- TCPv4
+- UDPv4
+- SCTPv4
+- IPv6
+- TCPv6
+- UDPv6
+- SCTPv6
+
+Each flow type supports valid combinations of IP addresses (source or
+destination) and UDP/TCP/SCTP ports (source and destination). You can supply
+only a source IP address, a source IP address and a destination port, or any
+combination of one or more of these four parameters.
+
+NOTE: This driver allows you to filter traffic based on a user-defined flexible
+two-byte pattern and offset by using the ethtool user-def and mask fields. Only
+L3 and L4 flow types are supported for user-defined flexible filters. For a
+given flow type, you must clear all Intel Ethernet Flow Director filters before
+changing the input set (for that flow type).
+
+
+Flow Director Filters
+---------------------
+Flow Director filters are used to direct traffic that matches specified
+characteristics. They are enabled through ethtool's ntuple interface. To enable
+or disable the Intel Ethernet Flow Director and these filters::
+
+ # ethtool -K <ethX> ntuple <off|on>
+
+NOTE: When you disable ntuple filters, all the user programmed filters are
+flushed from the driver cache and hardware. All needed filters must be re-added
+when ntuple is re-enabled.
+
+To display all of the active filters::
+
+ # ethtool -u <ethX>
+
+To add a new filter::
+
+ # ethtool -U <ethX> flow-type <type> src-ip <ip> [m <ip_mask>] dst-ip <ip>
+ [m <ip_mask>] src-port <port> [m <port_mask>] dst-port <port> [m <port_mask>]
+ action <queue>
+
+ Where:
+ <ethX> - the Ethernet device to program
+ <type> - can be ip4, tcp4, udp4, sctp4, ip6, tcp6, udp6, sctp6
+ <ip> - the IP address to match on
+ <ip_mask> - the IPv4 address to mask on
+ NOTE: These filters use inverted masks.
+ <port> - the port number to match on
+ <port_mask> - the 16-bit integer for masking
+ NOTE: These filters use inverted masks.
+ <queue> - the queue to direct traffic toward (-1 discards the
+ matched traffic)
+
+To delete a filter::
+
+ # ethtool -U <ethX> delete <N>
+
+ Where <N> is the filter ID displayed when printing all the active filters,
+ and may also have been specified using "loc <N>" when adding the filter.
+
+EXAMPLES:
+
+To add a filter that directs packet to queue 2::
+
+ # ethtool -U <ethX> flow-type tcp4 src-ip 192.168.10.1 dst-ip \
+ 192.168.10.2 src-port 2000 dst-port 2001 action 2 [loc 1]
+
+To set a filter using only the source and destination IP address::
+
+ # ethtool -U <ethX> flow-type tcp4 src-ip 192.168.10.1 dst-ip \
+ 192.168.10.2 action 2 [loc 1]
+
+To set a filter based on a user-defined pattern and offset::
+
+ # ethtool -U <ethX> flow-type tcp4 src-ip 192.168.10.1 dst-ip \
+ 192.168.10.2 user-def 0x4FFFF action 2 [loc 1]
+
+ where the value of the user-def field contains the offset (4 bytes) and
+ the pattern (0xffff).
+
+To match TCP traffic sent from 192.168.0.1, port 5300, directed to 192.168.0.5,
+port 80, and then send it to queue 7::
+
+ # ethtool -U enp130s0 flow-type tcp4 src-ip 192.168.0.1 dst-ip 192.168.0.5
+ src-port 5300 dst-port 80 action 7
+
+To add a TCPv4 filter with a partial mask for a source IP subnet::
+
+ # ethtool -U <ethX> flow-type tcp4 src-ip 192.168.0.0 m 0.255.255.255 dst-ip
+ 192.168.5.12 src-port 12600 dst-port 31 action 12
+
+NOTES:
+
+For each flow-type, the programmed filters must all have the same matching
+input set. For example, issuing the following two commands is acceptable::
+
+ # ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.1 src-port 5300 action 7
+ # ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.5 src-port 55 action 10
+
+Issuing the next two commands, however, is not acceptable, since the first
+specifies src-ip and the second specifies dst-ip::
+
+ # ethtool -U enp130s0 flow-type ip4 src-ip 192.168.0.1 src-port 5300 action 7
+ # ethtool -U enp130s0 flow-type ip4 dst-ip 192.168.0.5 src-port 55 action 10
+
+The second command will fail with an error. You may program multiple filters
+with the same fields, using different values, but, on one device, you may not
+program two tcp4 filters with different matching fields.
+
+The ice driver does not support matching on a subportion of a field, thus
+partial mask fields are not supported.
+
+
+Flex Byte Flow Director Filters
+-------------------------------
+The driver also supports matching user-defined data within the packet payload.
+This flexible data is specified using the "user-def" field of the ethtool
+command in the following way:
+
+.. table::
+
+ ============================== ============================
+ ``31 28 24 20 16`` ``15 12 8 4 0``
+ ``offset into packet payload`` ``2 bytes of flexible data``
+ ============================== ============================
+
+For example,
+
+::
+
+ ... user-def 0x4FFFF ...
+
+tells the filter to look 4 bytes into the payload and match that value against
+0xFFFF. The offset is based on the beginning of the payload, and not the
+beginning of the packet. Thus
+
+::
+
+ flow-type tcp4 ... user-def 0x8BEAF ...
+
+would match TCP/IPv4 packets which have the value 0xBEAF 8 bytes into the
+TCP/IPv4 payload.
+
+Note that ICMP headers are parsed as 4 bytes of header and 4 bytes of payload.
+Thus to match the first byte of the payload, you must actually add 4 bytes to
+the offset. Also note that ip4 filters match both ICMP frames as well as raw
+(unknown) ip4 frames, where the payload will be the L3 payload of the IP4
+frame.
+
+The maximum offset is 64. The hardware will only read up to 64 bytes of data
+from the payload. The offset must be even because the flexible data is 2 bytes
+long and must be aligned to byte 0 of the packet payload.
+
+The user-defined flexible offset is also considered part of the input set and
+cannot be programmed separately for multiple filters of the same type. However,
+the flexible data is not part of the input set and multiple filters may use the
+same offset but match against different data.
+
+
+RSS Hash Flow
+-------------
+Allows you to set the hash bytes per flow type and any combination of one or
+more options for Receive Side Scaling (RSS) hash byte configuration.
+
+::
+
+ # ethtool -N <ethX> rx-flow-hash <type> <option>
+
+ Where <type> is:
+ tcp4 signifying TCP over IPv4
+ udp4 signifying UDP over IPv4
+ tcp6 signifying TCP over IPv6
+ udp6 signifying UDP over IPv6
+ And <option> is one or more of:
+ s Hash on the IP source address of the Rx packet.
+ d Hash on the IP destination address of the Rx packet.
+ f Hash on bytes 0 and 1 of the Layer 4 header of the Rx packet.
+ n Hash on bytes 2 and 3 of the Layer 4 header of the Rx packet.
+
+
+Accelerated Receive Flow Steering (aRFS)
+----------------------------------------
+Devices based on the Intel(R) Ethernet Controller 800 Series support
+Accelerated Receive Flow Steering (aRFS) on the PF. aRFS is a load-balancing
+mechanism that allows you to direct packets to the same CPU where an
+application is running or consuming the packets in that flow.
+
+NOTES:
+
+- aRFS requires that ntuple filtering is enabled via ethtool.
+- aRFS support is limited to the following packet types:
+
+ - TCP over IPv4 and IPv6
+ - UDP over IPv4 and IPv6
+ - Nonfragmented packets
+
+- aRFS only supports Flow Director filters, which consist of the
+ source/destination IP addresses and source/destination ports.
+- aRFS and ethtool's ntuple interface both use the device's Flow Director. aRFS
+ and ntuple features can coexist, but you may encounter unexpected results if
+ there's a conflict between aRFS and ntuple requests. See "Intel(R) Ethernet
+ Flow Director" for additional information.
+
+To set up aRFS:
+
+1. Enable the Intel Ethernet Flow Director and ntuple filters using ethtool.
+
+::
+
+ # ethtool -K <ethX> ntuple on
+
+2. Set up the number of entries in the global flow table. For example:
+
+::
+
+ # NUM_RPS_ENTRIES=16384
+ # echo $NUM_RPS_ENTRIES > /proc/sys/net/core/rps_sock_flow_entries
+
+3. Set up the number of entries in the per-queue flow table. For example:
+
+::
+
+ # NUM_RX_QUEUES=64
+ # for file in /sys/class/net/$IFACE/queues/rx-*/rps_flow_cnt; do
+ # echo $(($NUM_RPS_ENTRIES/$NUM_RX_QUEUES)) > $file;
+ # done
+
+4. Disable the IRQ balance daemon (this is only a temporary stop of the service
+ until the next reboot).
+
+::
+
+ # systemctl stop irqbalance
+
+5. Configure the interrupt affinity.
+
+ See ``/Documentation/core-api/irq/irq-affinity.rst``
+
+
+To disable aRFS using ethtool::
+
+ # ethtool -K <ethX> ntuple off
+
+NOTE: This command will disable ntuple filters and clear any aRFS filters in
+software and hardware.
+
+Example Use Case:
+
+1. Set the server application on the desired CPU (e.g., CPU 4).
+
+::
+
+ # taskset -c 4 netserver
+
+2. Use netperf to route traffic from the client to CPU 4 on the server with
+ aRFS configured. This example uses TCP over IPv4.
+
+::
+
+ # netperf -H <Host IPv4 Address> -t TCP_STREAM
+
+
+Enabling Virtual Functions (VFs)
+--------------------------------
+Use sysfs to enable virtual functions (VF).
+
+For example, you can create 4 VFs as follows::
+
+ # echo 4 > /sys/class/net/<ethX>/device/sriov_numvfs
+
+To disable VFs, write 0 to the same file::
+
+ # echo 0 > /sys/class/net/<ethX>/device/sriov_numvfs
+
+The maximum number of VFs for the ice driver is 256 total (all ports). To check
+how many VFs each PF supports, use the following command::
+
+ # cat /sys/class/net/<ethX>/device/sriov_totalvfs
+
+Note: You cannot use SR-IOV when link aggregation (LAG)/bonding is active, and
+vice versa. To enforce this, the driver checks for this mutual exclusion.
+
+
+Displaying VF Statistics on the PF
+----------------------------------
+Use the following command to display the statistics for the PF and its VFs::
+
+ # ip -s link show dev <ethX>
+
+NOTE: The output of this command can be very large due to the maximum number of
+possible VFs.
+
+The PF driver will display a subset of the statistics for the PF and for all
+VFs that are configured. The PF will always print a statistics block for each
+of the possible VFs, and it will show zero for all unconfigured VFs.
+
+
+Configuring VLAN Tagging on SR-IOV Enabled Adapter Ports
+--------------------------------------------------------
+To configure VLAN tagging for the ports on an SR-IOV enabled adapter, use the
+following command. The VLAN configuration should be done before the VF driver
+is loaded or the VM is booted. The VF is not aware of the VLAN tag being
+inserted on transmit and removed on received frames (sometimes called "port
+VLAN" mode).
+
+::
+
+ # ip link set dev <ethX> vf <id> vlan <vlan id>
+
+For example, the following will configure PF eth0 and the first VF on VLAN 10::
+
+ # ip link set dev eth0 vf 0 vlan 10
+
+
+Enabling a VF link if the port is disconnected
+----------------------------------------------
+If the physical function (PF) link is down, you can force link up (from the
+host PF) on any virtual functions (VF) bound to the PF.
+
+For example, to force link up on VF 0 bound to PF eth0::
+
+ # ip link set eth0 vf 0 state enable
+
+Note: If the command does not work, it may not be supported by your system.
+
+
+Setting the MAC Address for a VF
+--------------------------------
+To change the MAC address for the specified VF::
+
+ # ip link set <ethX> vf 0 mac <address>
+
+For example::
+
+ # ip link set <ethX> vf 0 mac 00:01:02:03:04:05
+
+This setting lasts until the PF is reloaded.
+
+NOTE: Assigning a MAC address for a VF from the host will disable any
+subsequent requests to change the MAC address from within the VM. This is a
+security feature. The VM is not aware of this restriction, so if this is
+attempted in the VM, it will trigger MDD events.
+
+
+Trusted VFs and VF Promiscuous Mode
+-----------------------------------
+This feature allows you to designate a particular VF as trusted and allows that
+trusted VF to request selective promiscuous mode on the Physical Function (PF).
+
+To set a VF as trusted or untrusted, enter the following command in the
+Hypervisor::
+
+ # ip link set dev <ethX> vf 1 trust [on|off]
+
+NOTE: It's important to set the VF to trusted before setting promiscuous mode.
+If the VM is not trusted, the PF will ignore promiscuous mode requests from the
+VF. If the VM becomes trusted after the VF driver is loaded, you must make a
+new request to set the VF to promiscuous.
+
+Once the VF is designated as trusted, use the following commands in the VM to
+set the VF to promiscuous mode.
+
+For promiscuous all::
+
+ # ip link set <ethX> promisc on
+ Where <ethX> is a VF interface in the VM
+
+For promiscuous Multicast::
+
+ # ip link set <ethX> allmulticast on
+ Where <ethX> is a VF interface in the VM
+
+NOTE: By default, the ethtool private flag vf-true-promisc-support is set to
+"off," meaning that promiscuous mode for the VF will be limited. To set the
+promiscuous mode for the VF to true promiscuous and allow the VF to see all
+ingress traffic, use the following command::
+
+ # ethtool --set-priv-flags <ethX> vf-true-promisc-support on
+
+The vf-true-promisc-support private flag does not enable promiscuous mode;
+rather, it designates which type of promiscuous mode (limited or true) you will
+get when you enable promiscuous mode using the ip link commands above. Note
+that this is a global setting that affects the entire device. However, the
+vf-true-promisc-support private flag is only exposed to the first PF of the
+device. The PF remains in limited promiscuous mode regardless of the
+vf-true-promisc-support setting.
+
+Next, add a VLAN interface on the VF interface. For example::
+
+ # ip link add link eth2 name eth2.100 type vlan id 100
+
+Note that the order in which you set the VF to promiscuous mode and add the
+VLAN interface does not matter (you can do either first). The result in this
+example is that the VF will get all traffic that is tagged with VLAN 100.
+
+
+Malicious Driver Detection (MDD) for VFs
+----------------------------------------
+Some Intel Ethernet devices use Malicious Driver Detection (MDD) to detect
+malicious traffic from the VF and disable Tx/Rx queues or drop the offending
+packet until a VF driver reset occurs. You can view MDD messages in the PF's
+system log using the dmesg command.
+
+- If the PF driver logs MDD events from the VF, confirm that the correct VF
+ driver is installed.
+- To restore functionality, you can manually reload the VF or VM or enable
+ automatic VF resets.
+- When automatic VF resets are enabled, the PF driver will immediately reset
+ the VF and reenable queues when it detects MDD events on the receive path.
+- If automatic VF resets are disabled, the PF will not automatically reset the
+ VF when it detects MDD events.
+
+To enable or disable automatic VF resets, use the following command::
+
+ # ethtool --set-priv-flags <ethX> mdd-auto-reset-vf on|off
+
+
+MAC and VLAN Anti-Spoofing Feature for VFs
+------------------------------------------
+When a malicious driver on a Virtual Function (VF) interface attempts to send a
+spoofed packet, it is dropped by the hardware and not transmitted.
+
+NOTE: This feature can be disabled for a specific VF::
+
+ # ip link set <ethX> vf <vf id> spoofchk {off|on}
+
+
+Jumbo Frames
+------------
+Jumbo Frames support is enabled by changing the Maximum Transmission Unit (MTU)
+to a value larger than the default value of 1500.
+
+Use the ifconfig command to increase the MTU size. For example, enter the
+following where <ethX> is the interface number::
+
+ # ifconfig <ethX> mtu 9000 up
+
+Alternatively, you can use the ip command as follows::
+
+ # ip link set mtu 9000 dev <ethX>
+ # ip link set up dev <ethX>
+
+This setting is not saved across reboots.
+
+
+NOTE: The maximum MTU setting for jumbo frames is 9702. This corresponds to the
+maximum jumbo frame size of 9728 bytes.
+
+NOTE: This driver will attempt to use multiple page sized buffers to receive
+each jumbo packet. This should help to avoid buffer starvation issues when
+allocating receive packets.
+
+NOTE: Packet loss may have a greater impact on throughput when you use jumbo
+frames. If you observe a drop in performance after enabling jumbo frames,
+enabling flow control may mitigate the issue.
+
+
+Speed and Duplex Configuration
+------------------------------
+In addressing speed and duplex configuration issues, you need to distinguish
+between copper-based adapters and fiber-based adapters.
+
+In the default mode, an Intel(R) Ethernet Network Adapter using copper
+connections will attempt to auto-negotiate with its link partner to determine
+the best setting. If the adapter cannot establish link with the link partner
+using auto-negotiation, you may need to manually configure the adapter and link
+partner to identical settings to establish link and pass packets. This should
+only be needed when attempting to link with an older switch that does not
+support auto-negotiation or one that has been forced to a specific speed or
+duplex mode. Your link partner must match the setting you choose. 1 Gbps speeds
+and higher cannot be forced. Use the autonegotiation advertising setting to
+manually set devices for 1 Gbps and higher.
+
+Speed, duplex, and autonegotiation advertising are configured through the
+ethtool utility. For the latest version, download and install ethtool from the
+following website:
+
+ https://kernel.org/pub/software/network/ethtool/
+
+To see the speed configurations your device supports, run the following::
+
+ # ethtool <ethX>
+
+Caution: Only experienced network administrators should force speed and duplex
+or change autonegotiation advertising manually. The settings at the switch must
+always match the adapter settings. Adapter performance may suffer or your
+adapter may not operate if you configure the adapter differently from your
+switch.
+
+
+Data Center Bridging (DCB)
+--------------------------
+NOTE: The kernel assumes that TC0 is available, and will disable Priority Flow
+Control (PFC) on the device if TC0 is not available. To fix this, ensure TC0 is
+enabled when setting up DCB on your switch.
+
+DCB is a configuration Quality of Service implementation in hardware. It uses
+the VLAN priority tag (802.1p) to filter traffic. That means that there are 8
+different priorities that traffic can be filtered into. It also enables
+priority flow control (802.1Qbb) which can limit or eliminate the number of
+dropped packets during network stress. Bandwidth can be allocated to each of
+these priorities, which is enforced at the hardware level (802.1Qaz).
+
+DCB is normally configured on the network using the DCBX protocol (802.1Qaz), a
+specialization of LLDP (802.1AB). The ice driver supports the following
+mutually exclusive variants of DCBX support:
+
+1) Firmware-based LLDP Agent
+2) Software-based LLDP Agent
+
+In firmware-based mode, firmware intercepts all LLDP traffic and handles DCBX
+negotiation transparently for the user. In this mode, the adapter operates in
+"willing" DCBX mode, receiving DCB settings from the link partner (typically a
+switch). The local user can only query the negotiated DCB configuration. For
+information on configuring DCBX parameters on a switch, please consult the
+switch manufacturer's documentation.
+
+In software-based mode, LLDP traffic is forwarded to the network stack and user
+space, where a software agent can handle it. In this mode, the adapter can
+operate in either "willing" or "nonwilling" DCBX mode and DCB configuration can
+be both queried and set locally. This mode requires the FW-based LLDP Agent to
+be disabled.
+
+NOTE:
+
+- You can enable and disable the firmware-based LLDP Agent using an ethtool
+ private flag. Refer to the "FW-LLDP (Firmware Link Layer Discovery Protocol)"
+ section in this README for more information.
+- In software-based DCBX mode, you can configure DCB parameters using software
+ LLDP/DCBX agents that interface with the Linux kernel's DCB Netlink API. We
+ recommend using OpenLLDP as the DCBX agent when running in software mode. For
+ more information, see the OpenLLDP man pages and
+ https://github.com/intel/openlldp.
+- The driver implements the DCB netlink interface layer to allow the user space
+ to communicate with the driver and query DCB configuration for the port.
+- iSCSI with DCB is not supported.
+
+
+FW-LLDP (Firmware Link Layer Discovery Protocol)
+------------------------------------------------
+Use ethtool to change FW-LLDP settings. The FW-LLDP setting is per port and
+persists across boots.
+
+To enable LLDP::
+
+ # ethtool --set-priv-flags <ethX> fw-lldp-agent on
+
+To disable LLDP::
+
+ # ethtool --set-priv-flags <ethX> fw-lldp-agent off
+
+To check the current LLDP setting::
+
+ # ethtool --show-priv-flags <ethX>
+
+NOTE: You must enable the UEFI HII "LLDP Agent" attribute for this setting to
+take effect. If "LLDP AGENT" is set to disabled, you cannot enable it from the
+OS.
+
+
+Flow Control
+------------
+Ethernet Flow Control (IEEE 802.3x) can be configured with ethtool to enable
+receiving and transmitting pause frames for ice. When transmit is enabled,
+pause frames are generated when the receive packet buffer crosses a predefined
+threshold. When receive is enabled, the transmit unit will halt for the time
+delay specified when a pause frame is received.
+
+NOTE: You must have a flow control capable link partner.
+
+Flow Control is disabled by default.
+
+Use ethtool to change the flow control settings.
+
+To enable or disable Rx or Tx Flow Control::
+
+ # ethtool -A <ethX> rx <on|off> tx <on|off>
+
+Note: This command only enables or disables Flow Control if auto-negotiation is
+disabled. If auto-negotiation is enabled, this command changes the parameters
+used for auto-negotiation with the link partner.
+
+Note: Flow Control auto-negotiation is part of link auto-negotiation. Depending
+on your device, you may not be able to change the auto-negotiation setting.
+
+NOTE:
+
+- The ice driver requires flow control on both the port and link partner. If
+ flow control is disabled on one of the sides, the port may appear to hang on
+ heavy traffic.
+- You may encounter issues with link-level flow control (LFC) after disabling
+ DCB. The LFC status may show as enabled but traffic is not paused. To resolve
+ this issue, disable and reenable LFC using ethtool::
+
+ # ethtool -A <ethX> rx off tx off
+ # ethtool -A <ethX> rx on tx on
+
+
+NAPI
+----
+This driver supports NAPI (Rx polling mode).
+For more information on NAPI, see
+https://www.linuxfoundation.org/collaborate/workgroups/networking/napi
+
+
+MACVLAN
+-------
+This driver supports MACVLAN. Kernel support for MACVLAN can be tested by
+checking if the MACVLAN driver is loaded. You can run 'lsmod | grep macvlan' to
+see if the MACVLAN driver is loaded or run 'modprobe macvlan' to try to load
+the MACVLAN driver.
+
+NOTE:
+
+- In passthru mode, you can only set up one MACVLAN device. It will inherit the
+ MAC address of the underlying PF (Physical Function) device.
+
+
+IEEE 802.1ad (QinQ) Support
+---------------------------
+The IEEE 802.1ad standard, informally known as QinQ, allows for multiple VLAN
+IDs within a single Ethernet frame. VLAN IDs are sometimes referred to as
+"tags," and multiple VLAN IDs are thus referred to as a "tag stack." Tag stacks
+allow L2 tunneling and the ability to segregate traffic within a particular
+VLAN ID, among other uses.
+
+NOTES:
+
+- Receive checksum offloads and VLAN acceleration are not supported for 802.1ad
+ (QinQ) packets.
+
+- 0x88A8 traffic will not be received unless VLAN stripping is disabled with
+ the following command::
+
+ # ethool -K <ethX> rxvlan off
+
+- 0x88A8/0x8100 double VLANs cannot be used with 0x8100 or 0x8100/0x8100 VLANS
+ configured on the same port. 0x88a8/0x8100 traffic will not be received if
+ 0x8100 VLANs are configured.
+
+- The VF can only transmit 0x88A8/0x8100 (i.e., 802.1ad/802.1Q) traffic if:
+
+ 1) The VF is not assigned a port VLAN.
+ 2) spoofchk is disabled from the PF. If you enable spoofchk, the VF will
+ not transmit 0x88A8/0x8100 traffic.
+
+- The VF may not receive all network traffic based on the Inner VLAN header
+ when VF true promiscuous mode (vf-true-promisc-support) and double VLANs are
+ enabled in SR-IOV mode.
+
+The following are examples of how to configure 802.1ad (QinQ)::
+
+ # ip link add link eth0 eth0.24 type vlan proto 802.1ad id 24
+ # ip link add link eth0.24 eth0.24.371 type vlan proto 802.1Q id 371
+
+ Where "24" and "371" are example VLAN IDs.
+
+
+Tunnel/Overlay Stateless Offloads
+---------------------------------
+Supported tunnels and overlays include VXLAN, GENEVE, and others depending on
+hardware and software configuration. Stateless offloads are enabled by default.
+
+To view the current state of all offloads::
+
+ # ethtool -k <ethX>
+
+
+UDP Segmentation Offload
+------------------------
+Allows the adapter to offload transmit segmentation of UDP packets with
+payloads up to 64K into valid Ethernet frames. Because the adapter hardware is
+able to complete data segmentation much faster than operating system software,
+this feature may improve transmission performance.
+In addition, the adapter may use fewer CPU resources.
+
+NOTE:
+
+- The application sending UDP packets must support UDP segmentation offload.
+
+To enable/disable UDP Segmentation Offload, issue the following command::
+
+ # ethtool -K <ethX> tx-udp-segmentation [off|on]
+
+
+Performance Optimization
+========================
+Driver defaults are meant to fit a wide variety of workloads, but if further
+optimization is required, we recommend experimenting with the following
+settings.
+
+
+Rx Descriptor Ring Size
+-----------------------
+To reduce the number of Rx packet discards, increase the number of Rx
+descriptors for each Rx ring using ethtool.
+
+ Check if the interface is dropping Rx packets due to buffers being full
+ (rx_dropped.nic can mean that there is no PCIe bandwidth)::
+
+ # ethtool -S <ethX> | grep "rx_dropped"
+
+ If the previous command shows drops on queues, it may help to increase
+ the number of descriptors using 'ethtool -G'::
+
+ # ethtool -G <ethX> rx <N>
+ Where <N> is the desired number of ring entries/descriptors
+
+ This can provide temporary buffering for issues that create latency while
+ the CPUs process descriptors.
+
+
+Interrupt Rate Limiting
+-----------------------
+This driver supports an adaptive interrupt throttle rate (ITR) mechanism that
+is tuned for general workloads. The user can customize the interrupt rate
+control for specific workloads, via ethtool, adjusting the number of
+microseconds between interrupts.
+
+To set the interrupt rate manually, you must disable adaptive mode::
+
+ # ethtool -C <ethX> adaptive-rx off adaptive-tx off
+
+For lower CPU utilization:
+
+ Disable adaptive ITR and lower Rx and Tx interrupts. The examples below
+ affect every queue of the specified interface.
+
+ Setting rx-usecs and tx-usecs to 80 will limit interrupts to about
+ 12,500 interrupts per second per queue::
+
+ # ethtool -C <ethX> adaptive-rx off adaptive-tx off rx-usecs 80 tx-usecs 80
+
+For reduced latency:
+
+ Disable adaptive ITR and ITR by setting rx-usecs and tx-usecs to 0
+ using ethtool::
+
+ # ethtool -C <ethX> adaptive-rx off adaptive-tx off rx-usecs 0 tx-usecs 0
+
+Per-queue interrupt rate settings:
+
+ The following examples are for queues 1 and 3, but you can adjust other
+ queues.
+
+ To disable Rx adaptive ITR and set static Rx ITR to 10 microseconds or
+ about 100,000 interrupts/second, for queues 1 and 3::
+
+ # ethtool --per-queue <ethX> queue_mask 0xa --coalesce adaptive-rx off
+ rx-usecs 10
+
+ To show the current coalesce settings for queues 1 and 3::
+
+ # ethtool --per-queue <ethX> queue_mask 0xa --show-coalesce
+
+Bounding interrupt rates using rx-usecs-high:
+
+ :Valid Range: 0-236 (0=no limit)
+
+ The range of 0-236 microseconds provides an effective range of 4,237 to
+ 250,000 interrupts per second. The value of rx-usecs-high can be set
+ independently of rx-usecs and tx-usecs in the same ethtool command, and is
+ also independent of the adaptive interrupt moderation algorithm. The
+ underlying hardware supports granularity in 4-microsecond intervals, so
+ adjacent values may result in the same interrupt rate.
+
+ The following command would disable adaptive interrupt moderation, and allow
+ a maximum of 5 microseconds before indicating a receive or transmit was
+ complete. However, instead of resulting in as many as 200,000 interrupts per
+ second, it limits total interrupts per second to 50,000 via the rx-usecs-high
+ parameter.
+
+ ::
+
+ # ethtool -C <ethX> adaptive-rx off adaptive-tx off rx-usecs-high 20
+ rx-usecs 5 tx-usecs 5
+
+
+Virtualized Environments
+------------------------
+In addition to the other suggestions in this section, the following may be
+helpful to optimize performance in VMs.
+
+ Using the appropriate mechanism (vcpupin) in the VM, pin the CPUs to
+ individual LCPUs, making sure to use a set of CPUs included in the
+ device's local_cpulist: ``/sys/class/net/<ethX>/device/local_cpulist``.
+
+ Configure as many Rx/Tx queues in the VM as available. (See the iavf driver
+ documentation for the number of queues supported.) For example::
+
+ # ethtool -L <virt_interface> rx <max> tx <max>
- -> Device Drivers
- -> Network device support (NETDEVICES [=y])
- -> Ethernet driver support
- -> Intel devices
- -> Intel(R) Ethernet Connection E800 Series Support
Support
=======
For general information, go to the Intel support website at:
-
https://www.intel.com/support/
or the Intel Wired Networking project hosted by Sourceforge at:
-
https://sourceforge.net/projects/e1000
If an issue is identified with the released source code on a supported kernel
with a supported adapter, email the specific information related to the issue
to e1000-devel@lists.sf.net.
+
+
+Trademarks
+==========
+Intel is a trademark or registered trademark of Intel Corporation or its
+subsidiaries in the United States and/or other countries.
+
+* Other names and brands may be claimed as the property of others.
diff --git a/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst b/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst
index 61e850460e18..dd5cd69467be 100644
--- a/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst
+++ b/Documentation/networking/device_drivers/ethernet/marvell/octeontx2.rst
@@ -217,3 +217,73 @@ For example::
NPA_AF_ERR:
NPA Error Interrupt Reg : 4096
AQ Doorbell Error
+
+
+NIX Reporters
+-------------
+The NIX reporters are responsible for reporting and recovering the following group of errors:
+
+1. GENERAL events
+
+ - Receive mirror/multicast packet drop due to insufficient buffer.
+ - SMQ Flush operation.
+
+2. ERROR events
+
+ - Memory Fault due to WQE read/write from multicast/mirror buffer.
+ - Receive multicast/mirror replication list error.
+ - Receive packet on an unmapped PF.
+ - Fault due to NIX_AQ_INST_S read or NIX_AQ_RES_S write.
+ - AQ Doorbell Error.
+
+3. RAS events
+
+ - RAS Error Reporting for NIX Receive Multicast/Mirror Entry Structure.
+ - RAS Error Reporting for WQE/Packet Data read from Multicast/Mirror Buffer..
+ - RAS Error Reporting for NIX_AQ_INST_S/NIX_AQ_RES_S.
+
+4. RVU events
+
+ - Error due to unmapped slot.
+
+Sample Output::
+
+ ~# ./devlink health
+ pci/0002:01:00.0:
+ reporter hw_npa_intr
+ state healthy error 0 recover 0 grace_period 0 auto_recover true auto_dump true
+ reporter hw_npa_gen
+ state healthy error 0 recover 0 grace_period 0 auto_recover true auto_dump true
+ reporter hw_npa_err
+ state healthy error 0 recover 0 grace_period 0 auto_recover true auto_dump true
+ reporter hw_npa_ras
+ state healthy error 0 recover 0 grace_period 0 auto_recover true auto_dump true
+ reporter hw_nix_intr
+ state healthy error 1121 recover 1121 last_dump_date 2021-01-19 last_dump_time 05:42:26 grace_period 0 auto_recover true auto_dump true
+ reporter hw_nix_gen
+ state healthy error 949 recover 949 last_dump_date 2021-01-19 last_dump_time 05:42:43 grace_period 0 auto_recover true auto_dump true
+ reporter hw_nix_err
+ state healthy error 1147 recover 1147 last_dump_date 2021-01-19 last_dump_time 05:42:59 grace_period 0 auto_recover true auto_dump true
+ reporter hw_nix_ras
+ state healthy error 409 recover 409 last_dump_date 2021-01-19 last_dump_time 05:43:16 grace_period 0 auto_recover true auto_dump true
+
+Each reporter dumps the
+
+ - Error Type
+ - Error Register value
+ - Reason in words
+
+For example::
+
+ ~# devlink health dump show pci/0002:01:00.0 reporter hw_nix_intr
+ NIX_AF_RVU:
+ NIX RVU Interrupt Reg : 1
+ Unmap Slot Error
+ ~# devlink health dump show pci/0002:01:00.0 reporter hw_nix_gen
+ NIX_AF_GENERAL:
+ NIX General Interrupt Reg : 1
+ Rx multicast pkt drop
+ ~# devlink health dump show pci/0002:01:00.0 reporter hw_nix_err
+ NIX_AF_ERR:
+ NIX Error Interrupt Reg : 64
+ Rx on unmapped PF_FUNC
diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
index e9b65035cd47..1b7e32d8a61b 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5.rst
@@ -12,11 +12,13 @@ Contents
- `Enabling the driver and kconfig options`_
- `Devlink info`_
- `Devlink parameters`_
+- `mlx5 subfunction`_
+- `mlx5 function attributes`_
- `Devlink health reporters`_
- `mlx5 tracepoints`_
Enabling the driver and kconfig options
-================================================
+=======================================
| mlx5 core is modular and most of the major mlx5 core driver features can be selected (compiled in/out)
| at build time via kernel Kconfig flags.
@@ -97,6 +99,11 @@ Enabling the driver and kconfig options
| Provides low-level InfiniBand/RDMA and `RoCE <https://community.mellanox.com/s/article/recommended-network-configuration-examples-for-roce-deployment>`_ support.
+**CONFIG_MLX5_SF=(y/n)**
+
+| Build support for subfunction.
+| Subfunctons are more light weight than PCI SRIOV VFs. Choosing this option
+| will enable support for creating subfunction devices.
**External options** ( Choose if the corresponding mlx5 feature is required )
@@ -176,6 +183,214 @@ User command examples:
values:
cmode driverinit value true
+mlx5 subfunction
+================
+mlx5 supports subfunction management using devlink port (see :ref:`Documentation/networking/devlink/devlink-port.rst <devlink_port>`) interface.
+
+A Subfunction has its own function capabilities and its own resources. This
+means a subfunction has its own dedicated queues (txq, rxq, cq, eq). These
+queues are neither shared nor stolen from the parent PCI function.
+
+When a subfunction is RDMA capable, it has its own QP1, GID table and rdma
+resources neither shared nor stolen from the parent PCI function.
+
+A subfunction has a dedicated window in PCI BAR space that is not shared
+with ther other subfunctions or the parent PCI function. This ensures that all
+devices (netdev, rdma, vdpa etc.) of the subfunction accesses only assigned
+PCI BAR space.
+
+A Subfunction supports eswitch representation through which it supports tc
+offloads. The user configures eswitch to send/receive packets from/to
+the subfunction port.
+
+Subfunctions share PCI level resources such as PCI MSI-X IRQs with
+other subfunctions and/or with its parent PCI function.
+
+Example mlx5 software, system and device view::
+
+ _______
+ | admin |
+ | user |----------
+ |_______| |
+ | |
+ ____|____ __|______ _________________
+ | | | | | |
+ | devlink | | tc tool | | user |
+ | tool | |_________| | applications |
+ |_________| | |_________________|
+ | | | |
+ | | | | Userspace
+ +---------|-------------|-------------------|----------|--------------------+
+ | | +----------+ +----------+ Kernel
+ | | | netdev | | rdma dev |
+ | | +----------+ +----------+
+ (devlink port add/del | ^ ^
+ port function set) | | |
+ | | +---------------|
+ _____|___ | | _______|_______
+ | | | | | mlx5 class |
+ | devlink | +------------+ | | drivers |
+ | kernel | | rep netdev | | |(mlx5_core,ib) |
+ |_________| +------------+ | |_______________|
+ | | | ^
+ (devlink ops) | | (probe/remove)
+ _________|________ | | ____|________
+ | subfunction | | +---------------+ | subfunction |
+ | management driver|----- | subfunction |---| driver |
+ | (mlx5_core) | | auxiliary dev | | (mlx5_core) |
+ |__________________| +---------------+ |_____________|
+ | ^
+ (sf add/del, vhca events) |
+ | (device add/del)
+ _____|____ ____|________
+ | | | subfunction |
+ | PCI NIC |---- activate/deactive events---->| host driver |
+ |__________| | (mlx5_core) |
+ |_____________|
+
+Subfunction is created using devlink port interface.
+
+- Change device to switchdev mode::
+
+ $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev
+
+- Add a devlink port of subfunction flaovur::
+
+ $ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88
+ pci/0000:06:00.0/32768: type eth netdev eth6 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false
+ function:
+ hw_addr 00:00:00:00:00:00 state inactive opstate detached
+
+- Show a devlink port of the subfunction::
+
+ $ devlink port show pci/0000:06:00.0/32768
+ pci/0000:06:00.0/32768: type eth netdev enp6s0pf0sf88 flavour pcisf pfnum 0 sfnum 88
+ function:
+ hw_addr 00:00:00:00:00:00 state inactive opstate detached
+
+- Delete a devlink port of subfunction after use::
+
+ $ devlink port del pci/0000:06:00.0/32768
+
+mlx5 function attributes
+========================
+The mlx5 driver provides a mechanism to setup PCI VF/SF function attributes in
+a unified way for SmartNIC and non-SmartNIC.
+
+This is supported only when the eswitch mode is set to switchdev. Port function
+configuration of the PCI VF/SF is supported through devlink eswitch port.
+
+Port function attributes should be set before PCI VF/SF is enumerated by the
+driver.
+
+MAC address setup
+-----------------
+mlx5 driver provides mechanism to setup the MAC address of the PCI VF/SF.
+
+The configured MAC address of the PCI VF/SF will be used by netdevice and rdma
+device created for the PCI VF/SF.
+
+- Get the MAC address of the VF identified by its unique devlink port index::
+
+ $ devlink port show pci/0000:06:00.0/2
+ pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
+ function:
+ hw_addr 00:00:00:00:00:00
+
+- Set the MAC address of the VF identified by its unique devlink port index::
+
+ $ devlink port function set pci/0000:06:00.0/2 hw_addr 00:11:22:33:44:55
+
+ $ devlink port show pci/0000:06:00.0/2
+ pci/0000:06:00.0/2: type eth netdev enp6s0pf0vf1 flavour pcivf pfnum 0 vfnum 1
+ function:
+ hw_addr 00:11:22:33:44:55
+
+- Get the MAC address of the SF identified by its unique devlink port index::
+
+ $ devlink port show pci/0000:06:00.0/32768
+ pci/0000:06:00.0/32768: type eth netdev enp6s0pf0sf88 flavour pcisf pfnum 0 sfnum 88
+ function:
+ hw_addr 00:00:00:00:00:00
+
+- Set the MAC address of the VF identified by its unique devlink port index::
+
+ $ devlink port function set pci/0000:06:00.0/32768 hw_addr 00:00:00:00:88:88
+
+ $ devlink port show pci/0000:06:00.0/32768
+ pci/0000:06:00.0/32768: type eth netdev enp6s0pf0sf88 flavour pcivf pfnum 0 sfnum 88
+ function:
+ hw_addr 00:00:00:00:88:88
+
+SF state setup
+--------------
+To use the SF, the user must active the SF using the SF function state
+attribute.
+
+- Get the state of the SF identified by its unique devlink port index::
+
+ $ devlink port show ens2f0npf0sf88
+ pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false
+ function:
+ hw_addr 00:00:00:00:88:88 state inactive opstate detached
+
+- Activate the function and verify its state is active::
+
+ $ devlink port function set ens2f0npf0sf88 state active
+
+ $ devlink port show ens2f0npf0sf88
+ pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false
+ function:
+ hw_addr 00:00:00:00:88:88 state active opstate detached
+
+Upon function activation, the PF driver instance gets the event from the device
+that a particular SF was activated. It's the cue to put the device on bus, probe
+it and instantiate the devlink instance and class specific auxiliary devices
+for it.
+
+- Show the auxiliary device and port of the subfunction::
+
+ $ devlink dev show
+ devlink dev show auxiliary/mlx5_core.sf.4
+
+ $ devlink port show auxiliary/mlx5_core.sf.4/1
+ auxiliary/mlx5_core.sf.4/1: type eth netdev p0sf88 flavour virtual port 0 splittable false
+
+ $ rdma link show mlx5_0/1
+ link mlx5_0/1 state ACTIVE physical_state LINK_UP netdev p0sf88
+
+ $ rdma dev show
+ 8: rocep6s0f1: node_type ca fw 16.29.0550 node_guid 248a:0703:00b3:d113 sys_image_guid 248a:0703:00b3:d112
+ 13: mlx5_0: node_type ca fw 16.29.0550 node_guid 0000:00ff:fe00:8888 sys_image_guid 248a:0703:00b3:d112
+
+- Subfunction auxiliary device and class device hierarchy::
+
+ mlx5_core.sf.4
+ (subfunction auxiliary device)
+ /\
+ / \
+ / \
+ / \
+ / \
+ mlx5_core.eth.4 mlx5_core.rdma.4
+ (sf eth aux dev) (sf rdma aux dev)
+ | |
+ | |
+ p0sf88 mlx5_0
+ (sf netdev) (sf rdma device)
+
+Additionally, the SF port also gets the event when the driver attaches to the
+auxiliary device of the subfunction. This results in changing the operational
+state of the function. This provides visiblity to the user to decide when is it
+safe to delete the SF port for graceful termination of the subfunction.
+
+- Show the SF port operational state::
+
+ $ devlink port show ens2f0npf0sf88
+ pci/0000:06:00.0/32768: type eth netdev ens2f0npf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 external false splittable false
+ function:
+ hw_addr 00:00:00:00:88:88 state active opstate attached
+
Devlink health reporters
========================
diff --git a/Documentation/networking/device_drivers/ethernet/ti/am65_nuss_cpsw_switchdev.rst b/Documentation/networking/device_drivers/ethernet/ti/am65_nuss_cpsw_switchdev.rst
new file mode 100644
index 000000000000..f24adfab6a1b
--- /dev/null
+++ b/Documentation/networking/device_drivers/ethernet/ti/am65_nuss_cpsw_switchdev.rst
@@ -0,0 +1,143 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================================================
+Texas Instruments K3 AM65 CPSW NUSS switchdev based ethernet driver
+===================================================================
+
+:Version: 1.0
+
+Port renaming
+=============
+
+In order to rename via udev::
+
+ ip -d link show dev sw0p1 | grep switchid
+
+ SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}==<switchid>, \
+ ATTR{phys_port_name}!="", NAME="sw0$attr{phys_port_name}"
+
+
+Multi mac mode
+==============
+
+- The driver is operating in multi-mac mode by default, thus
+ working as N individual network interfaces.
+
+Devlink configuration parameters
+================================
+
+See Documentation/networking/devlink/am65-nuss-cpsw-switch.rst
+
+Enabling "switch"
+=================
+
+The Switch mode can be enabled by configuring devlink driver parameter
+"switch_mode" to 1/true::
+
+ devlink dev param set platform/c000000.ethernet \
+ name switch_mode value true cmode runtime
+
+This can be done regardless of the state of Port's netdev devices - UP/DOWN, but
+Port's netdev devices have to be in UP before joining to the bridge to avoid
+overwriting of bridge configuration as CPSW switch driver completely reloads its
+configuration when first port changes its state to UP.
+
+When the both interfaces joined the bridge - CPSW switch driver will enable
+marking packets with offload_fwd_mark flag.
+
+All configuration is implemented via switchdev API.
+
+Bridge setup
+============
+
+::
+
+ devlink dev param set platform/c000000.ethernet \
+ name switch_mode value true cmode runtime
+
+ ip link add name br0 type bridge
+ ip link set dev br0 type bridge ageing_time 1000
+ ip link set dev sw0p1 up
+ ip link set dev sw0p2 up
+ ip link set dev sw0p1 master br0
+ ip link set dev sw0p2 master br0
+
+ [*] bridge vlan add dev br0 vid 1 pvid untagged self
+
+ [*] if vlan_filtering=1. where default_pvid=1
+
+ Note. Steps [*] are mandatory.
+
+
+On/off STP
+==========
+
+::
+
+ ip link set dev BRDEV type bridge stp_state 1/0
+
+VLAN configuration
+==================
+
+::
+
+ bridge vlan add dev br0 vid 1 pvid untagged self <---- add cpu port to VLAN 1
+
+Note. This step is mandatory for bridge/default_pvid.
+
+Add extra VLANs
+===============
+
+ 1. untagged::
+
+ bridge vlan add dev sw0p1 vid 100 pvid untagged master
+ bridge vlan add dev sw0p2 vid 100 pvid untagged master
+ bridge vlan add dev br0 vid 100 pvid untagged self <---- Add cpu port to VLAN100
+
+ 2. tagged::
+
+ bridge vlan add dev sw0p1 vid 100 master
+ bridge vlan add dev sw0p2 vid 100 master
+ bridge vlan add dev br0 vid 100 pvid tagged self <---- Add cpu port to VLAN100
+
+FDBs
+----
+
+FDBs are automatically added on the appropriate switch port upon detection
+
+Manually adding FDBs::
+
+ bridge fdb add aa:bb:cc:dd:ee:ff dev sw0p1 master vlan 100
+ bridge fdb add aa:bb:cc:dd:ee:fe dev sw0p2 master <---- Add on all VLANs
+
+MDBs
+----
+
+MDBs are automatically added on the appropriate switch port upon detection
+
+Manually adding MDBs::
+
+ bridge mdb add dev br0 port sw0p1 grp 239.1.1.1 permanent vid 100
+ bridge mdb add dev br0 port sw0p1 grp 239.1.1.1 permanent <---- Add on all VLANs
+
+Multicast flooding
+==================
+CPU port mcast_flooding is always on
+
+Turning flooding on/off on swithch ports:
+bridge link set dev sw0p1 mcast_flood on/off
+
+Access and Trunk port
+=====================
+
+::
+
+ bridge vlan add dev sw0p1 vid 100 pvid untagged master
+ bridge vlan add dev sw0p2 vid 100 master
+
+
+ bridge vlan add dev br0 vid 100 self
+ ip link add link br0 name br0.100 type vlan id 100
+
+Note. Setting PVID on Bridge device itself works only for
+default VLAN (default_pvid).
diff --git a/Documentation/networking/devlink/am65-nuss-cpsw-switch.rst b/Documentation/networking/devlink/am65-nuss-cpsw-switch.rst
new file mode 100644
index 000000000000..1e589c26abff
--- /dev/null
+++ b/Documentation/networking/devlink/am65-nuss-cpsw-switch.rst
@@ -0,0 +1,26 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================
+am65-cpsw-nuss devlink support
+==============================
+
+This document describes the devlink features implemented by the ``am65-cpsw-nuss``
+device driver.
+
+Parameters
+==========
+
+The ``am65-cpsw-nuss`` driver implements the following driver-specific
+parameters.
+
+.. list-table:: Driver-specific parameters implemented
+ :widths: 5 5 5 85
+
+ * - Name
+ - Type
+ - Mode
+ - Description
+ * - ``switch_mode``
+ - Boolean
+ - runtime
+ - Enable switch mode
diff --git a/Documentation/networking/devlink/devlink-port.rst b/Documentation/networking/devlink/devlink-port.rst
new file mode 100644
index 000000000000..e99b41599465
--- /dev/null
+++ b/Documentation/networking/devlink/devlink-port.rst
@@ -0,0 +1,199 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _devlink_port:
+
+============
+Devlink Port
+============
+
+``devlink-port`` is a port that exists on the device. It has a logically
+separate ingress/egress point of the device. A devlink port can be any one
+of many flavours. A devlink port flavour along with port attributes
+describe what a port represents.
+
+A device driver that intends to publish a devlink port sets the
+devlink port attributes and registers the devlink port.
+
+Devlink port flavours are described below.
+
+.. list-table:: List of devlink port flavours
+ :widths: 33 90
+
+ * - Flavour
+ - Description
+ * - ``DEVLINK_PORT_FLAVOUR_PHYSICAL``
+ - Any kind of physical port. This can be an eswitch physical port or any
+ other physical port on the device.
+ * - ``DEVLINK_PORT_FLAVOUR_DSA``
+ - This indicates a DSA interconnect port.
+ * - ``DEVLINK_PORT_FLAVOUR_CPU``
+ - This indicates a CPU port applicable only to DSA.
+ * - ``DEVLINK_PORT_FLAVOUR_PCI_PF``
+ - This indicates an eswitch port representing a port of PCI
+ physical function (PF).
+ * - ``DEVLINK_PORT_FLAVOUR_PCI_VF``
+ - This indicates an eswitch port representing a port of PCI
+ virtual function (VF).
+ * - ``DEVLINK_PORT_FLAVOUR_PCI_SF``
+ - This indicates an eswitch port representing a port of PCI
+ subfunction (SF).
+ * - ``DEVLINK_PORT_FLAVOUR_VIRTUAL``
+ - This indicates a virtual port for the PCI virtual function.
+
+Devlink port can have a different type based on the link layer described below.
+
+.. list-table:: List of devlink port types
+ :widths: 23 90
+
+ * - Type
+ - Description
+ * - ``DEVLINK_PORT_TYPE_ETH``
+ - Driver should set this port type when a link layer of the port is
+ Ethernet.
+ * - ``DEVLINK_PORT_TYPE_IB``
+ - Driver should set this port type when a link layer of the port is
+ InfiniBand.
+ * - ``DEVLINK_PORT_TYPE_AUTO``
+ - This type is indicated by the user when driver should detect the port
+ type automatically.
+
+PCI controllers
+---------------
+In most cases a PCI device has only one controller. A controller consists of
+potentially multiple physical, virtual functions and subfunctions. A function
+consists of one or more ports. This port is represented by the devlink eswitch
+port.
+
+A PCI device connected to multiple CPUs or multiple PCI root complexes or a
+SmartNIC, however, may have multiple controllers. For a device with multiple
+controllers, each controller is distinguished by a unique controller number.
+An eswitch is on the PCI device which supports ports of multiple controllers.
+
+An example view of a system with two controllers::
+
+ ---------------------------------------------------------
+ | |
+ | --------- --------- ------- ------- |
+ ----------- | | vf(s) | | sf(s) | |vf(s)| |sf(s)| |
+ | server | | ------- ----/---- ---/----- ------- ---/--- ---/--- |
+ | pci rc |=== | pf0 |______/________/ | pf1 |___/_______/ |
+ | connect | | ------- ------- |
+ ----------- | | controller_num=1 (no eswitch) |
+ ------|--------------------------------------------------
+ (internal wire)
+ |
+ ---------------------------------------------------------
+ | devlink eswitch ports and reps |
+ | ----------------------------------------------------- |
+ | |ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 |ctrl-0 | |
+ | |pf0 | pf0vfN | pf0sfN | pf1 | pf1vfN |pf1sfN | |
+ | ----------------------------------------------------- |
+ | |ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 |ctrl-1 | |
+ | |pf0 | pf0vfN | pf0sfN | pf1 | pf1vfN |pf1sfN | |
+ | ----------------------------------------------------- |
+ | |
+ | |
+ ----------- | --------- --------- ------- ------- |
+ | smartNIC| | | vf(s) | | sf(s) | |vf(s)| |sf(s)| |
+ | pci rc |==| ------- ----/---- ---/----- ------- ---/--- ---/--- |
+ | connect | | | pf0 |______/________/ | pf1 |___/_______/ |
+ ----------- | ------- ------- |
+ | |
+ | local controller_num=0 (eswitch) |
+ ---------------------------------------------------------
+
+In the above example, the external controller (identified by controller number = 1)
+doesn't have the eswitch. Local controller (identified by controller number = 0)
+has the eswitch. The Devlink instance on the local controller has eswitch
+devlink ports for both the controllers.
+
+Function configuration
+======================
+
+A user can configure the function attribute before enumerating the PCI
+function. Usually it means, user should configure function attribute
+before a bus specific device for the function is created. However, when
+SRIOV is enabled, virtual function devices are created on the PCI bus.
+Hence, function attribute should be configured before binding virtual
+function device to the driver. For subfunctions, this means user should
+configure port function attribute before activating the port function.
+
+A user may set the hardware address of the function using
+'devlink port function set hw_addr' command. For Ethernet port function
+this means a MAC address.
+
+Subfunction
+============
+
+Subfunction is a lightweight function that has a parent PCI function on which
+it is deployed. Subfunction is created and deployed in unit of 1. Unlike
+SRIOV VFs, a subfunction doesn't require its own PCI virtual function.
+A subfunction communicates with the hardware through the parent PCI function.
+
+To use a subfunction, 3 steps setup sequence is followed.
+(1) create - create a subfunction;
+(2) configure - configure subfunction attributes;
+(3) deploy - deploy the subfunction;
+
+Subfunction management is done using devlink port user interface.
+User performs setup on the subfunction management device.
+
+(1) Create
+----------
+A subfunction is created using a devlink port interface. A user adds the
+subfunction by adding a devlink port of subfunction flavour. The devlink
+kernel code calls down to subfunction management driver (devlink ops) and asks
+it to create a subfunction devlink port. Driver then instantiates the
+subfunction port and any associated objects such as health reporters and
+representor netdevice.
+
+(2) Configure
+-------------
+A subfunction devlink port is created but it is not active yet. That means the
+entities are created on devlink side, the e-switch port representor is created,
+but the subfunction device itself it not created. A user might use e-switch port
+representor to do settings, putting it into bridge, adding TC rules, etc. A user
+might as well configure the hardware address (such as MAC address) of the
+subfunction while subfunction is inactive.
+
+(3) Deploy
+----------
+Once a subfunction is configured, user must activate it to use it. Upon
+activation, subfunction management driver asks the subfunction management
+device to instantiate the subfunction device on particular PCI function.
+A subfunction device is created on the :ref:`Documentation/driver-api/auxiliary_bus.rst <auxiliary_bus>`.
+At this point a matching subfunction driver binds to the subfunction's auxiliary device.
+
+Terms and Definitions
+=====================
+
+.. list-table:: Terms and Definitions
+ :widths: 22 90
+
+ * - Term
+ - Definitions
+ * - ``PCI device``
+ - A physical PCI device having one or more PCI bus consists of one or
+ more PCI controllers.
+ * - ``PCI controller``
+ - A controller consists of potentially multiple physical functions,
+ virtual functions and subfunctions.
+ * - ``Port function``
+ - An object to manage the function of a port.
+ * - ``Subfunction``
+ - A lightweight function that has parent PCI function on which it is
+ deployed.
+ * - ``Subfunction device``
+ - A bus device of the subfunction, usually on a auxiliary bus.
+ * - ``Subfunction driver``
+ - A device driver for the subfunction auxiliary device.
+ * - ``Subfunction management device``
+ - A PCI physical function that supports subfunction management.
+ * - ``Subfunction management driver``
+ - A device driver for PCI physical function that supports
+ subfunction management using devlink port interface.
+ * - ``Subfunction host driver``
+ - A device driver for PCI physical function that hosts subfunction
+ devices. In most cases it is same as subfunction management driver. When
+ subfunction is used on external controller, subfunction management and
+ host drivers are different.
diff --git a/Documentation/networking/devlink/devlink-resource.rst b/Documentation/networking/devlink/devlink-resource.rst
index 93e92d2f0752..3d5ae51e65a2 100644
--- a/Documentation/networking/devlink/devlink-resource.rst
+++ b/Documentation/networking/devlink/devlink-resource.rst
@@ -23,6 +23,20 @@ current size and related sub resources. To access a sub resource, you
specify the path of the resource. For example ``/IPv4/fib`` is the id for
the ``fib`` sub-resource under the ``IPv4`` resource.
+Generic Resources
+=================
+
+Generic resources are used to describe resources that can be shared by multiple
+device drivers and their description must be added to the following table:
+
+.. list-table:: List of Generic Resources
+ :widths: 10 90
+
+ * - Name
+ - Description
+ * - ``physical_ports``
+ - A limited capacity of physical ports that the switch ASIC can support
+
example usage
-------------
diff --git a/Documentation/networking/devlink/devlink-trap.rst b/Documentation/networking/devlink/devlink-trap.rst
index d875f3e1e9cf..935b6397e8cf 100644
--- a/Documentation/networking/devlink/devlink-trap.rst
+++ b/Documentation/networking/devlink/devlink-trap.rst
@@ -480,6 +480,11 @@ be added to the following table:
- ``drop``
- Traps packets that the device decided to drop in case they hit a
blackhole nexthop
+ * - ``dmac_filter``
+ - ``drop``
+ - Traps incoming packets that the device decided to drop because
+ the destination MAC is not configured in the MAC table and
+ the interface is not in promiscuous mode
Driver-specific Packet Traps
============================
diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst
index d82874760ae2..8428a1220723 100644
--- a/Documentation/networking/devlink/index.rst
+++ b/Documentation/networking/devlink/index.rst
@@ -18,6 +18,7 @@ general.
devlink-info
devlink-flash
devlink-params
+ devlink-port
devlink-region
devlink-resource
devlink-reload
@@ -44,3 +45,4 @@ parameters, info versions, and other features it supports.
sja1105
qed
ti-cpsw-switch
+ am65-nuss-cpsw-switch
diff --git a/Documentation/networking/dsa/dsa.rst b/Documentation/networking/dsa/dsa.rst
index a8d15dd2b42b..e9517af5fe02 100644
--- a/Documentation/networking/dsa/dsa.rst
+++ b/Documentation/networking/dsa/dsa.rst
@@ -273,10 +273,6 @@ will not make us go through the switch tagging protocol transmit function, so
the Ethernet switch on the other end, expecting a tag will typically drop this
frame.
-Slave network devices check that the master network device is UP before allowing
-you to administratively bring UP these slave network devices. A common
-configuration mistake is forgetting to bring UP the master network device first.
-
Interactions with other subsystems
==================================
diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 30b98245979f..05073482db05 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -431,16 +431,17 @@ Request contents:
``ETHTOOL_A_LINKMODES_SPEED`` u32 link speed (Mb/s)
``ETHTOOL_A_LINKMODES_DUPLEX`` u8 duplex mode
``ETHTOOL_A_LINKMODES_MASTER_SLAVE_CFG`` u8 Master/slave port mode
+ ``ETHTOOL_A_LINKMODES_LANES`` u32 lanes
========================================== ====== ==========================
``ETHTOOL_A_LINKMODES_OURS`` bit set allows setting advertised link modes. If
autonegotiation is on (either set now or kept from before), advertised modes
are not changed (no ``ETHTOOL_A_LINKMODES_OURS`` attribute) and at least one
-of speed and duplex is specified, kernel adjusts advertised modes to all
-supported modes matching speed, duplex or both (whatever is specified). This
-autoselection is done on ethtool side with ioctl interface, netlink interface
-is supposed to allow requesting changes without knowing what exactly kernel
-supports.
+of speed, duplex and lanes is specified, kernel adjusts advertised modes to all
+supported modes matching speed, duplex, lanes or all (whatever is specified).
+This autoselection is done on ethtool side with ioctl interface, netlink
+interface is supposed to allow requesting changes without knowing what exactly
+kernel supports.
LINKSTATE_GET
diff --git a/Documentation/networking/filter.rst b/Documentation/networking/filter.rst
index debb59e374de..251c6bd73d15 100644
--- a/Documentation/networking/filter.rst
+++ b/Documentation/networking/filter.rst
@@ -1006,13 +1006,13 @@ Size modifier is one of ...
Mode modifier is one of::
- BPF_IMM 0x00 /* used for 32-bit mov in classic BPF and 64-bit in eBPF */
- BPF_ABS 0x20
- BPF_IND 0x40
- BPF_MEM 0x60
- BPF_LEN 0x80 /* classic BPF only, reserved in eBPF */
- BPF_MSH 0xa0 /* classic BPF only, reserved in eBPF */
- BPF_XADD 0xc0 /* eBPF only, exclusive add */
+ BPF_IMM 0x00 /* used for 32-bit mov in classic BPF and 64-bit in eBPF */
+ BPF_ABS 0x20
+ BPF_IND 0x40
+ BPF_MEM 0x60
+ BPF_LEN 0x80 /* classic BPF only, reserved in eBPF */
+ BPF_MSH 0xa0 /* classic BPF only, reserved in eBPF */
+ BPF_ATOMIC 0xc0 /* eBPF only, atomic operations */
eBPF has two non-generic instructions: (BPF_ABS | <size> | BPF_LD) and
(BPF_IND | <size> | BPF_LD) which are used to access packet data.
@@ -1044,16 +1044,57 @@ Unlike classic BPF instruction set, eBPF has generic load/store operations::
BPF_MEM | <size> | BPF_STX: *(size *) (dst_reg + off) = src_reg
BPF_MEM | <size> | BPF_ST: *(size *) (dst_reg + off) = imm32
BPF_MEM | <size> | BPF_LDX: dst_reg = *(size *) (src_reg + off)
- BPF_XADD | BPF_W | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg
- BPF_XADD | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg
-Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW. Note that 1 and
-2 byte atomic increments are not supported.
+Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW.
-eBPF has one 16-byte instruction: BPF_LD | BPF_DW | BPF_IMM which consists
+It also includes atomic operations, which use the immediate field for extra
+encoding::
+
+ .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_W | BPF_STX: lock xadd *(u32 *)(dst_reg + off16) += src_reg
+ .imm = BPF_ADD, .code = BPF_ATOMIC | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg
+
+The basic atomic operations supported are::
+
+ BPF_ADD
+ BPF_AND
+ BPF_OR
+ BPF_XOR
+
+Each having equivalent semantics with the ``BPF_ADD`` example, that is: the
+memory location addresed by ``dst_reg + off`` is atomically modified, with
+``src_reg`` as the other operand. If the ``BPF_FETCH`` flag is set in the
+immediate, then these operations also overwrite ``src_reg`` with the
+value that was in memory before it was modified.
+
+The more special operations are::
+
+ BPF_XCHG
+
+This atomically exchanges ``src_reg`` with the value addressed by ``dst_reg +
+off``. ::
+
+ BPF_CMPXCHG
+
+This atomically compares the value addressed by ``dst_reg + off`` with
+``R0``. If they match it is replaced with ``src_reg``. In either case, the
+value that was there before is zero-extended and loaded back to ``R0``.
+
+Note that 1 and 2 byte atomic operations are not supported.
+
+Clang can generate atomic instructions by default when ``-mcpu=v3`` is
+enabled. If a lower version for ``-mcpu`` is set, the only atomic instruction
+Clang can generate is ``BPF_ADD`` *without* ``BPF_FETCH``. If you need to enable
+the atomics features, while keeping a lower ``-mcpu`` version, you can use
+``-Xclang -target-feature -Xclang +alu32``.
+
+You may encounter ``BPF_XADD`` - this is a legacy name for ``BPF_ATOMIC``,
+referring to the exclusive-add operation encoded when the immediate field is
+zero.
+
+eBPF has one 16-byte instruction: ``BPF_LD | BPF_DW | BPF_IMM`` which consists
of two consecutive ``struct bpf_insn`` 8-byte blocks and interpreted as single
instruction that loads 64-bit immediate value into a dst_reg.
-Classic BPF has similar instruction: BPF_LD | BPF_W | BPF_IMM which loads
+Classic BPF has similar instruction: ``BPF_LD | BPF_W | BPF_IMM`` which loads
32-bit immediate value into a register.
eBPF verifier
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index fa544e9037b9..c7952ac5bd2f 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -178,6 +178,27 @@ min_adv_mss - INTEGER
The advertised MSS depends on the first hop route MTU, but will
never be lower than this setting.
+fib_notify_on_flag_change - INTEGER
+ Whether to emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/
+ RTM_F_TRAP/RTM_F_OFFLOAD_FAILED flags are changed.
+
+ After installing a route to the kernel, user space receives an
+ acknowledgment, which means the route was installed in the kernel,
+ but not necessarily in hardware.
+ It is also possible for a route already installed in hardware to change
+ its action and therefore its flags. For example, a host route that is
+ trapping packets can be "promoted" to perform decapsulation following
+ the installation of an IPinIP/VXLAN tunnel.
+ The notifications will indicate to user-space the state of the route.
+
+ Default: 0 (Do not emit notifications.)
+
+ Possible values:
+
+ - 0 - Do not emit notifications.
+ - 1 - Emit notifications.
+ - 2 - Emit notifications only for RTM_F_OFFLOAD_FAILED flag change.
+
IP Fragmentation:
ipfrag_high_thresh - LONG INTEGER
@@ -630,16 +651,15 @@ tcp_rmem - vector of 3 INTEGERs: min, default, max
default: initial size of receive buffer used by TCP sockets.
This value overrides net.core.rmem_default used by other protocols.
- Default: 87380 bytes. This value results in window of 65535 with
- default setting of tcp_adv_win_scale and tcp_app_win:0 and a bit
- less for default tcp_app_win. See below about these variables.
+ Default: 131072 bytes.
+ This value results in initial window of 65535.
max: maximal size of receive buffer allowed for automatically
selected receiver buffers for TCP socket. This value does not override
net.core.rmem_max. Calling setsockopt() with SO_RCVBUF disables
automatic tuning of that socket's receive buffer size, in which
case this value is ignored.
- Default: between 87380B and 6MB, depending on RAM size.
+ Default: between 131072 and 6MB, depending on RAM size.
tcp_sack - BOOLEAN
Enable select acknowledgments (SACKS).
@@ -1425,6 +1445,25 @@ rp_filter - INTEGER
Default value is 0. Note that some distributions enable it
in startup scripts.
+src_valid_mark - BOOLEAN
+ - 0 - The fwmark of the packet is not included in reverse path
+ route lookup. This allows for asymmetric routing configurations
+ utilizing the fwmark in only one direction, e.g., transparent
+ proxying.
+
+ - 1 - The fwmark of the packet is included in reverse path route
+ lookup. This permits rp_filter to function when the fwmark is
+ used for routing traffic in both directions.
+
+ This setting also affects the utilization of fmwark when
+ performing source address selection for ICMP replies, or
+ determining addresses stored for the IPOPT_TS_TSANDADDR and
+ IPOPT_RR IP options.
+
+ The max value from conf/{all,interface}/src_valid_mark is used.
+
+ Default value is 0.
+
arp_filter - BOOLEAN
- 1 - Allows you to have multiple network interfaces on the same
subnet, and have the ARPs for each interface be answered
@@ -1775,6 +1814,27 @@ nexthop_compat_mode - BOOLEAN
and extraneous notifications.
Default: true (backward compat mode)
+fib_notify_on_flag_change - INTEGER
+ Whether to emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/
+ RTM_F_TRAP/RTM_F_OFFLOAD_FAILED flags are changed.
+
+ After installing a route to the kernel, user space receives an
+ acknowledgment, which means the route was installed in the kernel,
+ but not necessarily in hardware.
+ It is also possible for a route already installed in hardware to change
+ its action and therefore its flags. For example, a host route that is
+ trapping packets can be "promoted" to perform decapsulation following
+ the installation of an IPinIP/VXLAN tunnel.
+ The notifications will indicate to user-space the state of the route.
+
+ Default: 0 (Do not emit notifications.)
+
+ Possible values:
+
+ - 0 - Do not emit notifications.
+ - 1 - Emit notifications.
+ - 2 - Emit notifications only for RTM_F_OFFLOAD_FAILED flag change.
+
IPv6 Fragmentation:
ip6frag_high_thresh - INTEGER
@@ -1883,6 +1943,16 @@ accept_ra_defrtr - BOOLEAN
- enabled if accept_ra is enabled.
- disabled if accept_ra is disabled.
+ra_defrtr_metric - UNSIGNED INTEGER
+ Route metric for default route learned in Router Advertisement. This value
+ will be assigned as metric for the default route learned via IPv6 Router
+ Advertisement. Takes affect only if accept_ra_defrtr is enabled.
+
+ Possible values:
+ 1 to 0xFFFFFFFF
+
+ Default: IP6_RT_PRIO_USER i.e. 1024.
+
accept_ra_from_local - BOOLEAN
Accept RA with source-address that is found on local machine
if the RA is otherwise proper and able to be accepted.
diff --git a/Documentation/networking/netdev-FAQ.rst b/Documentation/networking/netdev-FAQ.rst
index ae2ae37cd921..a64c01b52b4c 100644
--- a/Documentation/networking/netdev-FAQ.rst
+++ b/Documentation/networking/netdev-FAQ.rst
@@ -272,6 +272,22 @@ to the mailing list, e.g.::
Posting as one thread is discouraged because it confuses patchwork
(as of patchwork 2.2.2).
+Can I reproduce the checks from patchwork on my local machine?
+--------------------------------------------------------------
+
+Checks in patchwork are mostly simple wrappers around existing kernel
+scripts, the sources are available at:
+
+https://github.com/kuba-moo/nipa/tree/master/tests
+
+Running all the builds and checks locally is a pain, can I post my patches and have the patchwork bot validate them?
+--------------------------------------------------------------------------------------------------------------------
+
+No, you must ensure that your patches are ready by testing them locally
+before posting to the mailing list. The patchwork build bot instance
+gets overloaded very easily and netdev@vger really doesn't need more
+traffic if we can help it.
+
Any other tips to help ensure my net/net-next patch gets OK'd?
--------------------------------------------------------------
Attention to detail. Re-read your own work as if you were the
diff --git a/Documentation/networking/netdev-features.rst b/Documentation/networking/netdev-features.rst
index a2d7d7160e39..d7b15bb64deb 100644
--- a/Documentation/networking/netdev-features.rst
+++ b/Documentation/networking/netdev-features.rst
@@ -182,3 +182,24 @@ stricter than Hardware LRO. A packet stream merged by Hardware GRO must
be re-segmentable by GSO or TSO back to the exact original packet stream.
Hardware GRO is dependent on RXCSUM since every packet successfully merged
by hardware must also have the checksum verified by hardware.
+
+* hsr-tag-ins-offload
+
+This should be set for devices which insert an HSR (High-availability Seamless
+Redundancy) or PRP (Parallel Redundancy Protocol) tag automatically.
+
+* hsr-tag-rm-offload
+
+This should be set for devices which remove HSR (High-availability Seamless
+Redundancy) or PRP (Parallel Redundancy Protocol) tags automatically.
+
+* hsr-fwd-offload
+
+This should be set for devices which forward HSR (High-availability Seamless
+Redundancy) frames from one port to another in hardware.
+
+* hsr-dup-offload
+
+This should be set for devices which duplicate outgoing HSR (High-availability
+Seamless Redundancy) or PRP (Parallel Redundancy Protocol) tags automatically
+frames in hardware.
diff --git a/Documentation/networking/phy.rst b/Documentation/networking/phy.rst
index b2f7ec794bc8..06adfc2afcf0 100644
--- a/Documentation/networking/phy.rst
+++ b/Documentation/networking/phy.rst
@@ -216,7 +216,7 @@ put into an unsupported state.
Lastly, once the controller is ready to handle network traffic, you call
phy_start(phydev). This tells the PAL that you are ready, and configures the
PHY to connect to the network. If the MAC interrupt of your network driver
-also handles PHY status changes, just set phydev->irq to PHY_IGNORE_INTERRUPT
+also handles PHY status changes, just set phydev->irq to PHY_MAC_INTERRUPT
before you call phy_start and use phy_mac_interrupt() from the network
driver. If you don't want to use interrupts, set phydev->irq to PHY_POLL.
phy_start() enables the PHY interrupts (if applicable) and starts the
@@ -267,6 +267,12 @@ Some of the interface modes are described below:
duplex, pause or other settings. This is dependent on the MAC and/or
PHY behaviour.
+``PHY_INTERFACE_MODE_5GBASER``
+ This is the IEEE 802.3 Clause 129 defined 5GBASE-R protocol. It is
+ identical to the 10GBASE-R protocol defined in Clause 49, with the
+ exception that it operates at half the frequency. Please refer to the
+ IEEE standard for the definition.
+
``PHY_INTERFACE_MODE_10GBASER``
This is the IEEE 802.3 Clause 49 defined 10GBASE-R protocol used with
various different mediums. Please refer to the IEEE standard for a
@@ -286,6 +292,11 @@ Some of the interface modes are described below:
Note: due to legacy usage, some 10GBASE-R usage incorrectly makes
use of this definition.
+``PHY_INTERFACE_MODE_100BASEX``
+ This defines IEEE 802.3 Clause 24. The link operates at a fixed data
+ rate of 125Mpbs using a 4B/5B encoding scheme, resulting in an underlying
+ data rate of 100Mpbs.
+
Pause frames / flow control
===========================
diff --git a/Documentation/networking/sfp-phylink.rst b/Documentation/networking/sfp-phylink.rst
index 5aec7c8857d0..328f8d39eeb3 100644
--- a/Documentation/networking/sfp-phylink.rst
+++ b/Documentation/networking/sfp-phylink.rst
@@ -163,7 +163,7 @@ this documentation.
err = phylink_of_phy_connect(priv->phylink, node, flags);
For the most part, ``flags`` can be zero; these flags are passed to
- the of_phy_attach() inside this function call if a PHY is specified
+ the phy_attach_direct() inside this function call if a PHY is specified
in the DT node ``node``.
``node`` should be the DT node which contains the network phy property,
diff --git a/Documentation/networking/snmp_counter.rst b/Documentation/networking/snmp_counter.rst
index 4edd0d38779e..423d138b5ff3 100644
--- a/Documentation/networking/snmp_counter.rst
+++ b/Documentation/networking/snmp_counter.rst
@@ -314,7 +314,7 @@ https://lwn.net/Articles/576263/
* TcpExtTCPOrigDataSent
This counter is explained by `kernel commit f19c29e3e391`_, I pasted the
-explaination below::
+explanation below::
TCPOrigDataSent: number of outgoing packets with original data (excluding
retransmission but including data-in-SYN). This counter is different from
@@ -324,7 +324,7 @@ explaination below::
* TCPSynRetrans
This counter is explained by `kernel commit f19c29e3e391`_, I pasted the
-explaination below::
+explanation below::
TCPSynRetrans: number of SYN and SYN/ACK retransmits to break down
retransmissions into SYN, fast-retransmits, timeout retransmits, etc.
@@ -332,7 +332,7 @@ explaination below::
* TCPFastOpenActiveFail
This counter is explained by `kernel commit f19c29e3e391`_, I pasted the
-explaination below::
+explanation below::
TCPFastOpenActiveFail: Fast Open attempts (SYN/data) failed because
the remote does not accept it or the attempts timed out.
@@ -382,7 +382,7 @@ Defined in `RFC1213 tcpAttemptFails`_.
Defined in `RFC1213 tcpOutRsts`_. The RFC says this counter indicates
the 'segments sent containing the RST flag', but in linux kernel, this
-couner indicates the segments kerenl tried to send. The sending
+counter indicates the segments kernel tried to send. The sending
process might be failed due to some errors (e.g. memory alloc failed).
.. _RFC1213 tcpOutRsts: https://tools.ietf.org/html/rfc1213#page-52
@@ -700,7 +700,7 @@ SACK option could have up to 4 blocks, they are checked
individually. E.g., if 3 blocks of a SACk is invalid, the
corresponding counter would be updated 3 times. The comment of the
`Add counters for discarded SACK blocks`_ patch has additional
-explaination:
+explanation:
.. _Add counters for discarded SACK blocks: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=18f02545a9a16c9a89778b91a162ad16d510bb32
@@ -829,7 +829,7 @@ PAWS check fails or the received sequence number is out of window.
* TcpExtTCPACKSkippedTimeWait
-Tha ACK is skipped in Time-Wait status, the reason would be either
+The ACK is skipped in Time-Wait status, the reason would be either
PAWS check failed or the received sequence number is out of window.
* TcpExtTCPACKSkippedChallenge
@@ -984,7 +984,7 @@ TcpExtSyncookiesRecv counter wont be updated.
Challenge ACK
=============
-For details of challenge ACK, please refer the explaination of
+For details of challenge ACK, please refer the explanation of
TcpExtTCPACKSkippedChallenge.
* TcpExtTCPChallengeACK
@@ -1002,7 +1002,7 @@ prune
=====
When a socket is under memory pressure, the TCP stack will try to
reclaim memory from the receiving queue and out of order queue. One of
-the reclaiming method is 'collapse', which means allocate a big sbk,
+the reclaiming method is 'collapse', which means allocate a big skb,
copy the contiguous skbs to the single big skb, and free these
contiguous skbs.
@@ -1163,7 +1163,7 @@ The server side nstat output::
IpExtOutOctets 52 0.0
IpExtInNoECTPkts 1 0.0
-Input a string in nc client side again ('world' in our exmaple)::
+Input a string in nc client side again ('world' in our example)::
nstatuser@nstat-a:~$ nc -v nstat-b 9000
Connection to nstat-b 9000 port [tcp/*] succeeded!
@@ -1211,7 +1211,7 @@ replied an ACK. But kernel handled them in different ways. When the
TCP window scale option is not used, kernel will try to enable fast
path immediately when the connection comes into the established state,
but if the TCP window scale option is used, kernel will disable the
-fast path at first, and try to enable it after kerenl receives
+fast path at first, and try to enable it after kernel receives
packets. We could use the 'ss' command to verify whether the window
scale option is used. e.g. run below command on either server or
client::
@@ -1343,7 +1343,7 @@ Check TcpExtTCPAbortOnMemory on client::
nstatuser@nstat-a:~$ nstat | grep -i abort
TcpExtTCPAbortOnMemory 54 0.0
-Check orphane socket count on client::
+Check orphaned socket count on client::
nstatuser@nstat-a:~$ ss -s
Total: 131 (kernel 0)
@@ -1685,7 +1685,7 @@ Send 3 SYN repeatly to nstat-b::
nstatuser@nstat-a:~$ for i in {1..3}; do sudo tcpreplay -i ens3 /tmp/syn_fixcsum.pcap; done
-Check snmp cunter on nstat-b::
+Check snmp counter on nstat-b::
nstatuser@nstat-b:~$ nstat | grep -i skip
TcpExtTCPACKSkippedSynRecv 1 0.0
@@ -1770,7 +1770,7 @@ string 'foo' in our example::
Connection from nstat-a 42132 received!
foo
-On nstat-a, the tcpdump should have caputred the ACK. We should check
+On nstat-a, the tcpdump should have captured the ACK. We should check
the source port numbers of the two nc clients::
nstatuser@nstat-a:~$ ss -ta '( dport = :9000 || dport = :9001 )' | tee
@@ -1778,7 +1778,7 @@ the source port numbers of the two nc clients::
ESTAB 0 0 192.168.122.250:50208 192.168.122.251:9000
ESTAB 0 0 192.168.122.250:42132 192.168.122.251:9001
-Run tcprewrite, change port 9001 to port 9000, chagne port 42132 to
+Run tcprewrite, change port 9001 to port 9000, change port 42132 to
port 50208::
nstatuser@nstat-a:~$ tcprewrite --infile /tmp/seq_pre.pcap --outfile /tmp/seq.pcap -r 9001:9000 -r 42132:50208 --fixcsum
diff --git a/Documentation/networking/timestamping.rst b/Documentation/networking/timestamping.rst
index 03f7beade470..f682e88fa87e 100644
--- a/Documentation/networking/timestamping.rst
+++ b/Documentation/networking/timestamping.rst
@@ -55,7 +55,8 @@ struct __kernel_sock_timeval format.
SO_TIMESTAMP_OLD returns incorrect timestamps after the year 2038
on 32 bit machines.
-1.2 SO_TIMESTAMPNS (also SO_TIMESTAMPNS_OLD and SO_TIMESTAMPNS_NEW):
+1.2 SO_TIMESTAMPNS (also SO_TIMESTAMPNS_OLD and SO_TIMESTAMPNS_NEW)
+-------------------------------------------------------------------
This option is identical to SO_TIMESTAMP except for the returned data type.
Its struct timespec allows for higher resolution (ns) timestamps than the