linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2019-12-26	Merge branch 's390-qeth-next'	David S. Miller	4	-187/+200
	Julian Wiedmann says: ==================== s390/qeth: updates 2019-12-23 please apply the following patch series for qeth to your net-next tree. This reworks the RX code to use napi_gro_frags() when building non-linear skbs, along with some consolidation and cleanups. Happy holidays - and many thanks for all the effort & support over the past year, to both Jakub and you. It's much appreciated. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26	s390/qeth: remove QETH_RX_PULL_LEN	Julian Wiedmann	2	-3/+4
	Since commit f677fcb9aeb6 ("s390/qeth: ensure linear access to packet headers"), the CQ-specific skbs are allocated with a slightly bigger linear part than necessary. Shrink it down to the maximum that's needed by qeth_extract_skb(). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26	s390/qeth: use napi_gro_frags() for SG skbs	Julian Wiedmann	1	-17/+50
	For non-linear packets, get the skb for attaching the page fragments from napi_get_frags() so that it can be recycled during GRO. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26	s390/qeth: consolidate RX code	Julian Wiedmann	4	-174/+153
	To reduce the path length and levels of indirection, move the RX processing from the sub-drivers into the core. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26	af_packet: refactoring code for prb_calc_retire_blk_tmo	Mao Wenan	1	-18/+12
	If __ethtool_get_link_ksettings() is failed and with non-zero value, prb_calc_retire_blk_tmo() should return DEFAULT_PRB_RETIRE_TOV firstly. This patch is to refactory code and make it more readable. Signed-off-by: Mao Wenan <maowenan@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26	xen-netback: support dynamic unbind/bind	Paul Durrant	3	-7/+28
	By re-attaching RX, TX, and CTL rings during connect() rather than assuming they are freshly allocated (i.e. assuming the counters are zero), and avoiding forcing state to Closed in netback_remove() it is possible for vif instances to be unbound and re-bound from and to (respectively) a running guest. Dynamic unbind/bind is a highly useful feature for a backend module as it allows it to be unloaded and re-loaded (i.e. updated) without requiring domUs to be halted. This has been tested by running iperf as a server in the test VM and then running a client against it in a continuous loop, whilst also running: while true; do echo vif-$DOMID-$VIF >unbind; echo down; rmmod xen-netback; echo unloaded; modprobe xen-netback; cd $(pwd); brctl addif xenbr0 vif$DOMID.$VIF; ip link set vif$DOMID.$VIF up; echo up; sleep 5; done in dom0 from /sys/bus/xen-backend/drivers/vif to continuously unbind, unload, re-load, re-bind and re-plumb the backend. Clearly a performance drop was seen but no TCP connection resets were observed during this test and moreover a parallel SSH connection into the guest remained perfectly usable throughout. Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26	Merge branch 'RTL8211F-RGMII-RX-TX-delay-configuration-improvements'	David S. Miller	1	-8/+51
	Martin Blumenstingl says: ==================== RTL8211F: RGMII RX/TX delay configuration improvements In discussion with Andrew [0] we figured out that it would be best to make the RX delay of the RTL8211F PHY configurable (just like the TX delay is already configurable). While here I took the opportunity to add some logging to the TX delay configuration as well. There is no public documentation for the RX and TX delay registers. I received this information a while ago (and created this RfC patch back then: [1]). Realtek gave me permission to take the information from the datasheet extracts and phase them in my own words and publish that (I am not allowed to publish the datasheet extracts). I have tested these patches on two boards: - Amlogic Meson8b Odroid-C1 - Amlogic GXM Khadas VIM2 Both still behave as before these changes (iperf3 speeds are the same in both directions: RX and TX), which is expected because they are currently using phy-mode = "rgmii" with the RX delay not being generated by the PHY. [0] https://patchwork.ozlabs.org/patch/1215313/ [1] https://patchwork.ozlabs.org/patch/843946/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26	net: phy: realtek: add support for configuring the RX delay on RTL8211F	Martin Blumenstingl	1	-10/+36
	On RTL8211F the RX and TX delays (2ns) can be configured in two ways: - pin strapping (RXD1 for the TX delay and RXD0 for the RX delay, LOW means "off" and HIGH means "on") which is read during PHY reset - using software to configure the TX and RX delay registers So far only the configuration using pin strapping has been supported. Add support for enabling or disabling the RGMII RX delay based on the phy-mode to be able to get the RX delay into a known state. This is important because the RX delay has to be coordinated between the PHY, MAC and the PCB design (trace length). With an invalid RX delay applied (for example if both PHY and MAC add a 2ns RX delay) Ethernet may not work at all. Also add debug logging when configuring the RX delay (just like the TX delay) because this is a common source of problems. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26	net: phy: realtek: add logging for the RGMII TX delay configuration	Martin Blumenstingl	1	-1/+18
	RGMII requires a delay of 2ns between the data and the clock signal. There are at least three ways this can happen. One possibility is by having the PHY generate this delay. This is a common source for problems (for example with slow TX speeds or packet loss when sending data). The TX delay configuration of the RTL8211F PHY can be set either by pin-strappping the RXD1 pin (HIGH means enabled, LOW means disabled) or through configuring a paged register. The setting from the RXD1 pin is also reflected in the register. Add debug logging to the TX delay configuration on RTL8211F so it's easier to spot these issues (for example if the TX delay is enabled for both, the RTL8211F PHY and the MAC). This is especially helpful because there is no public datasheet for the RTL8211F PHY available with all the RX/TX delay specifics. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26	Merge branch 'mlxsw-spectrum_router-Cleanups'	David S. Miller	1	-217/+101
	Ido Schimmel says: ==================== mlxsw: spectrum_router: Cleanups This patch set removes from mlxsw code that is no longer necessary after the simplification of the IPv4 and IPv6 route offload API. The patches eliminate unnecessary code by taking advantage of the fact that mlxsw no longer needs to maintain a list of identical routes, following recent changes in route offload API. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26	mlxsw: spectrum_router: Remove FIB entry list from FIB node	Ido Schimmel	1	-151/+74
	As explained in previous patches, the driver no longer needs to maintain a list of identical FIB entries (i.e, same {tb_id, prefix, prefix length}) and therefore each FIB node can only store one FIB entry. Remove the FIB entry list and simplify the code. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26	mlxsw: spectrum_router: Consolidate identical functions	Ido Schimmel	1	-49/+22
	After the last patch mlxsw_sp_fib{4,6}_node_entry_link() and mlxsw_sp_fib{4,6}_node_entry_unlink() are identical and can therefore be consolidated into the same common function. Perform the consolidation. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26	mlxsw: spectrum_router: Make route creation and destruction symmetric	Ido Schimmel	1	-3/+15
	Host routes that perform decapsulation of IP in IP tunnels have a special adjacency entry linked to them. This entry stores information such as the expected underlay source IP. When the route is deleted this entry needs to be freed. The allocation of the adjacency entry happens in mlxsw_sp_fib4_entry_type_set(), but it is freed in mlxsw_sp_fib4_node_entry_unlink(). Create a new function - mlxsw_sp_fib4_entry_type_unset() - and free the adjacency entry there. This will allow us to consolidate mlxsw_sp_fib{4,6}_node_entry_unlink() in the next patch. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26	mlxsw: spectrum_router: Eliminate dead code	Ido Schimmel	1	-10/+0
	Since the driver no longer maintains a list of identical routes there is no route to promote when a route is deleted. Remove that code that took care of it. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26	mlxsw: spectrum_router: Remove unnecessary checks	Ido Schimmel	1	-15/+1
	Now that the networking stack takes care of only notifying the routes of interest, we do not need to maintain a list of identical routes. Remove the check that tests if the route is the first route in the FIB node. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26	bonding: rename AD_STATE_* to LACP_STATE_*	Andy Roulin	2	-64/+64
	As the LACP actor/partner state is now part of the uapi, rename the 3ad state defines with LACP prefix. The LACP prefix is preferred over BOND_3AD as the LACP standard moved to 802.1AX. Fixes: 826f66b30c2e3 ("bonding: move 802.3ad port state flags to uapi") Signed-off-by: Andy Roulin <aroulin@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26	sctp: move trace_sctp_probe_path into sctp_outq_sack	Kevin Kou	2	-9/+6
	The original patch bringed in the "SCTP ACK tracking trace event" feature was committed at Dec.20, 2017, it replaced jprobe usage with trace events, and bringed in two trace events, one is TRACE_EVENT(sctp_probe), another one is TRACE_EVENT(sctp_probe_path). The original patch intended to trigger the trace_sctp_probe_path in TRACE_EVENT(sctp_probe) as below code, +TRACE_EVENT(sctp_probe, + + TP_PROTO(const struct sctp_endpoint ep, + const struct sctp_association asoc, + struct sctp_chunk chunk), + + TP_ARGS(ep, asoc, chunk), + + TP_STRUCT__entry( + __field(__u64, asoc) + __field(__u32, mark) + __field(__u16, bind_port) + __field(__u16, peer_port) + __field(__u32, pathmtu) + __field(__u32, rwnd) + __field(__u16, unack_data) + ), + + TP_fast_assign( + struct sk_buff skb = chunk->skb; + + __entry->asoc = (unsigned long)asoc; + __entry->mark = skb->mark; + __entry->bind_port = ep->base.bind_addr.port; + __entry->peer_port = asoc->peer.port; + __entry->pathmtu = asoc->pathmtu; + __entry->rwnd = asoc->peer.rwnd; + __entry->unack_data = asoc->unack_data; + + if (trace_sctp_probe_path_enabled()) { + struct sctp_transport sp; + + list_for_each_entry(sp, &asoc->peer.transport_addr_list, + transports) { + trace_sctp_probe_path(sp, asoc); + } + } + ), But I found it did not work when I did testing, and trace_sctp_probe_path had no output, I finally found that there is trace buffer lock operation(trace_event_buffer_reserve) in include/trace/trace_events.h: static notrace void \ trace_event_raw_event_##call(void __data, proto) \ { \ struct trace_event_file trace_file = __data; \ struct trace_event_data_offsets_##call __maybe_unused __data_offsets;\ struct trace_event_buffer fbuffer; \ struct trace_event_raw_##call entry; \ int __data_size; \ \ if (trace_trigger_soft_disabled(trace_file)) \ return; \ \ __data_size = trace_event_get_offsets_##call(&__data_offsets, args); \ \ entry = trace_event_buffer_reserve(&fbuffer, trace_file, \ sizeof(entry) + __data_size); \ \ if (!entry) \ return; \ \ tstruct \ \ { assign; } \ \ trace_event_buffer_commit(&fbuffer); \ } The reason caused no output of trace_sctp_probe_path is that trace_sctp_probe_path written in TP_fast_assign part of TRACE_EVENT(sctp_probe), and it will be placed( { assign; } ) after the trace_event_buffer_reserve() when compiler expands Macro, entry = trace_event_buffer_reserve(&fbuffer, trace_file, \ sizeof(entry) + __data_size); \ \ if (!entry) \ return; \ \ tstruct \ \ { assign; } \ so trace_sctp_probe_path finally can not acquire trace_event_buffer and return no output, that is to say the nest of tracepoint entry function is not allowed. The function call flow is: trace_sctp_probe() -> trace_event_raw_event_sctp_probe() -> lock buffer -> trace_sctp_probe_path() -> trace_event_raw_event_sctp_probe_path() --nested -> buffer has been locked and return no output. This patch is to remove trace_sctp_probe_path from the TP_fast_assign part of TRACE_EVENT(sctp_probe) to avoid the nest of entry function, and trigger sctp_probe_path_trace in sctp_outq_sack. After this patch, you can enable both events individually, # cd /sys/kernel/debug/tracing # echo 1 > events/sctp/sctp_probe/enable # echo 1 > events/sctp/sctp_probe_path/enable Or, you can enable all the events under sctp. # echo 1 > events/sctp/enable Signed-off-by: Kevin Kou <qdkevin.kou@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25	Merge branch 'Peer-to-Peer-One-Step-time-stamping'	David S. Miller	25	-154/+1437
	Richard Cochran says: ==================== Peer to Peer One-Step time stamping This series adds support for PTP (IEEE 1588) P2P one-step time stamping along with a driver for a hardware device that supports this. If the hardware supports p2p one-step, it subtracts the ingress time stamp value from the Pdelay_Request correction field. The user space software stack then simply copies the correction field into the Pdelay_Response, and on transmission the hardware adds the egress time stamp into the correction field. This new functionality extends CONFIG_NETWORK_PHY_TIMESTAMPING to cover MII snooping devices, but it still depends on phylib, just as that option does. Expanding beyond phylib is not within the scope of the this series. User space support is available in the current linuxptp master branch. - Patch 1 adds phy_device methods for existing time stamping fields. - Patches 2-5 convert the stack and drivers to the new methods. - Patch 6 moves code around the dp83640 driver. - Patches 7-10 add support for MII time stamping in non-PHY devices. - Patch 11 adds the new P2P 1-step option. - Patch 12 adds a driver implementing the new option. Thanks, Richard Changed in v9: ~~~~~~~~~~~~~~ - Fix two more drivers' switch/case blocks WRT the new HWTSTAMP ioctl. - Picked up two more review tags from Andrew. Changed in v8: ~~~~~~~~~~~~~~ - Avoided adding forward functional declarations in the dp83640 driver. - Picked up Florian's new review tags and another one from Andrew. Changed in v7: ~~~~~~~~~~~~~~ - Converted pr_debug\|err to dev_ variants in new driver. - Fixed device tree documentation per Rob's v6 review. - Picked up Andrew's and Rob's review tags. - Silenced sparse warnings in new driver. Changed in v6: ~~~~~~~~~~~~~~ - Added methods for accessing the phy_device time stamping fields. - Adjust the device tree documentation per Rob's v5 review. - Fixed the build failures due to missing exports. Changed in v5: ~~~~~~~~~~~~~~ - Fixed build failure in macvlan. - Fixed latent bug with its gcc warning in the driver. Changed in v4: ~~~~~~~~~~~~~~ - Correct error paths and PTR_ERR return values in the framework. - Expanded KernelDoc comments WRT PHY locking. - Pick up Andrew's review tag. Changed in v3: ~~~~~~~~~~~~~~ - Simplify the device tree binding and document the time stamping phandle by itself. Changed in v2: ~~~~~~~~~~~~~~ - Per the v1 review, changed the modeling of MII time stamping devices. They are no longer a kind of mdio device. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25	ptp: Add a driver for InES time stamping IP core.	Richard Cochran	3	-0/+863
	The InES at the ZHAW offers a PTP time stamping IP core. The FPGA logic recognizes and time stamps PTP frames on the MII bus. This patch adds a driver for the core along with a device tree binding to allow hooking the driver to MII buses. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25	net: Introduce peer to peer one step PTP time stamping.	Richard Cochran	6	-0/+15
	The 1588 standard defines one step operation for both Sync and PDelay_Resp messages. Up until now, hardware with P2P one step has been rare, and kernel support was lacking. This patch adds support of the mode in anticipation of new hardware developments. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25	net: mdio: of: Register discovered MII time stampers.	Richard Cochran	2	-1/+32
	When parsing a PHY node, register its time stamper, if any, and attach the instance to the PHY device. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25	dt-bindings: ptp: Introduce MII time stamping devices.	Richard Cochran	2	-0/+77
	This patch add a new binding that allows non-PHY MII time stamping devices to find their buses. The new documentation covers both the generic binding and one upcoming user. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25	net: Add a layer for non-PHY MII time stamping drivers.	Richard Cochran	4	-3/+194
	While PHY time stamping drivers can simply attach their interface directly to the PHY instance, stand alone drivers require support in order to manage their services. Non-PHY MII time stamping drivers have a control interface over another bus like I2C, SPI, UART, or via a memory mapped peripheral. The controller device will be associated with one or more time stamping channels, each of which sits snoops in on a MII bus. This patch provides a glue layer that will enable time stamping channels to find their controlling device. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25	net: Introduce a new MII time stamping interface.	Richard Cochran	6	-58/+106
	Currently the stack supports time stamping in PHY devices. However, there are newer, non-PHY devices that can snoop an MII bus and provide time stamps. In order to support such devices, this patch introduces a new interface to be used by both PHY and non-PHY devices. In addition, the one and only user of the old PHY time stamping API is converted to the new interface. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25	net: phy: dp83640: Move the probe and remove methods around.	Richard Cochran	1	-90/+90
	An upcoming patch will change how the PHY time stamping functions are registered with the networking stack, and adapting this driver would entail adding forward declarations for four time stamping methods. However, forward declarations are considered to be stylistic defects. This patch avoids the issue by moving the probe and remove methods immediately above the phy_driver interface structure. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25	net: netcp_ethss: Use the PHY time stamping interface.	Richard Cochran	1	-5/+3
	The netcp_ethss driver tests fields of the phy_device in order to determine whether to defer to the PHY's time stamping functionality. This patch replaces the open coded logic with an invocation of the proper methods. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25	net: ethtool: Use the PHY time stamping interface.	Richard Cochran	1	-2/+2
	The ethtool layer tests fields of the phy_device in order to determine whether to invoke the PHY's tsinfo ethtool callback. This patch replaces the open coded logic with an invocation of the proper methods. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25	net: vlan: Use the PHY time stamping interface.	Richard Cochran	1	-2/+2
	The vlan layer tests fields of the phy_device in order to determine whether to invoke the PHY's tsinfo ethtool callback. This patch replaces the open coded logic with an invocation of the proper methods. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25	net: macvlan: Use the PHY time stamping interface.	Richard Cochran	1	-2/+2
	The macvlan layer tests fields of the phy_device in order to determine whether to invoke the PHY's tsinfo ethtool callback. This patch replaces the open coded logic with an invocation of the proper methods. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25	net: phy: Introduce helper functions for time stamping support.	Richard Cochran	1	-0/+60
	Some parts of the networking stack and at least one driver test fields within the 'struct phy_device' in order to query time stamping capabilities and to invoke time stamping methods. This patch adds a functional interface around the time stamping fields. This will allow insulating the callers from future changes to the details of the time stamping implemenation. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24	Merge branch 'Simplify-IPv6-route-offload-API'	David S. Miller	5	-176/+238
	Ido Schimmel says: ==================== Simplify IPv6 route offload API Motivation ========== This is the IPv6 counterpart of "Simplify IPv4 route offload API" [1]. The aim of this patch set is to simplify the IPv6 route offload API by making the stack a bit smarter about the notifications it is generating. This allows driver authors to focus on programming the underlying device instead of having to duplicate the IPv6 route insertion logic in their driver, which is error-prone. Details ======= Today, whenever an IPv6 route is added or deleted a notification is sent in the FIB notification chain and it is up to offload drivers to decide if the route should be programmed to the hardware or not. This is not an easy task as in hardware routes are keyed by {prefix, prefix length, table id}, whereas the kernel can store multiple such routes that only differ in metric / nexthop info. This series makes sure that only routes that are actually used in the data path are notified to offload drivers. This greatly simplifies the work these drivers need to do, as they are now only concerned with programming the hardware and do not need to replicate the IPv6 route insertion logic and store multiple identical routes. The route that is notified is the first route in the IPv6 FIB node, which represents a single prefix and length in a given table. In case the route is deleted and there is another route with the same key, a replace notification is emitted. Otherwise, a delete notification is emitted. Unlike IPv4, in IPv6 it is possible to append individual nexthops to an existing multipath route. Therefore, in addition to the replace and delete notifications present in IPv4, an append notification is also used. Testing ======= To ensure there is no degradation in route insertion rates, I averaged the insertion rate of 512k routes (/64 and /128) over 50 runs. Did not observe any degradation. Functional tests are available here [2]. They rely on route trap indication, which is added in a subsequent patch set. In addition, I have been running syzkaller for the past couple of weeks with debug options enabled. Did not observe any problems. Patch set overview ================== Patches #1-#7 gradually introduce the new FIB notifications Patch #8 converts mlxsw to use the new notifications Patch #9 remove the old notifications [1] https://patchwork.ozlabs.org/cover/1209738/ [2] https://github.com/idosch/linux/tree/fib-notifier ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24	ipv6: Remove old route notifications and convert listeners	Ido Schimmel	5	-70/+21
	Now that mlxsw is converted to use the new FIB notifications it is possible to delete the old ones and use the new replace / append / delete notifications. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24	mlxsw: spectrum_router: Start using new IPv6 route notifications	Ido Schimmel	1	-149/+76
	With the new notifications mlxsw does not need to handle identical routes itself, as this is taken care of by the core IPv6 code. Instead, mlxsw only needs to take care of inserting and removing routes from the device. Convert mlxsw to use the new IPv6 route notifications and simplify the code. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24	ipv6: Handle multipath route deletion notification	Ido Schimmel	1	-0/+26
	When an entire multipath route is deleted, only emit a notification if it is the first route in the node. Emit a replace notification in case the last sibling is followed by another route. Otherwise, emit a delete notification. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24	ipv6: Handle route deletion notification	Ido Schimmel	2	-1/+44
	For the purpose of route offload, when a single route is deleted, it is only of interest if it is the first route in the node or if it is sibling to such a route. In the first case, distinguish between several possibilities: 1. Route is the last route in the node. Emit a delete notification 2. Route is followed by a non-multipath route. Emit a replace notification for the non-multipath route. 3. Route is followed by a multipath route. Emit a replace notification for the multipath route. In the second case, only emit a delete notification to ensure the route is no longer used as a valid nexthop. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24	ipv6: Only Replay routes of interest to new listeners	Ido Schimmel	1	-0/+40
	When a new listener is registered to the FIB notification chain it receives a dump of all the available routes in the system. Instead, make sure to only replay the IPv6 routes that are actually used in the data path and are of any interest to the new listener. This is done by iterating over all the routing tables in the given namespace, but from each traversed node only the first route ('leaf') is notified. Multipath routes are notified in a single notification instead of one for each nexthop. Add fib6_rt_dump_tmp() to do that. Later on in the patch set it will be renamed to fib6_rt_dump() instead of the existing one. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24	ipv6: Notify multipath route if should be offloaded	Ido Schimmel	1	-0/+48
	In a similar fashion to previous patches, only notify the new multipath route if it is the first route in the node or if it was appended to such route. The type of the notification (replace vs. append) is determined based on the number of routes added ('nhn') and the number of sibling routes. If the two do not match, then an append notification should be sent. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24	ipv6: Notify route if replacing currently offloaded one	Ido Schimmel	1	-0/+7
	Similar to the corresponding IPv4 patch, only notify the new route if it is replacing the currently offloaded one. Meaning, the one pointed to by 'fn->leaf'. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24	ipv6: Notify newly added route if should be offloaded	Ido Schimmel	1	-0/+18
	fib6_add_rt2node() takes care of adding a single route ('struct fib6_info') to a FIB node. The route in question should only be notified in case it is added as the first route in the node (lowest metric) or if it is added as a sibling route to the first route in the node. The first criterion can be tested by checking if the route is pointed to by 'fn->leaf'. The second criterion can be tested by checking the new 'notify_sibling_rt' variable that is set when the route is added as a sibling to the first route in the node. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24	net: fib_notifier: Add temporary events to the FIB notification chain	Ido Schimmel	1	-0/+2
	Subsequent patches are going to simplify the IPv6 route offload API, which will only use three events - replace, delete and append. Introduce a temporary version of replace and delete in order to make the conversion easier to review. Note that append does not need a temporary version, as it is currently not used. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24	r8169: move enabling EEE to rtl8169_init_phy	Heiner Kallweit	1	-11/+3
	Simplify the code by moving the call to rtl_enable_eee() from the individual PHY configs to rtl8169_init_phy(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24	r8169: remove MAC workaround in rtl8168e_2_hw_phy_config	Heiner Kallweit	1	-3/+0
	Due to recent changes we don't need the call to rtl_rar_exgmac_set() and longer at this place. It's called from rtl_rar_set() which is called in rtl_init_mac_address() and rtl8169_resume(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24	r8169: factor out rtl8168h_2_get_adc_bias_ioffset	Heiner Kallweit	1	-19/+20
	Simplify and factor out this magic from rtl8168h_2_hw_phy_config() and name it based on the vendor driver. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24	Merge branch 'ovs-mpls-actions'	David S. Miller	4	-9/+96
	Martin Varghese says: ==================== New openvswitch MPLS actions for layer 2 tunnelling The existing PUSH MPLS action inserts MPLS header between ethernet header and the IP header. Though this behaviour is fine for L3 VPN where an IP packet is encapsulated inside a MPLS tunnel, it does not suffice the L2 VPN (l2 tunnelling) requirements. In L2 VPN the MPLS header should encapsulate the ethernet packet. The new mpls action ADD_MPLS inserts MPLS header at the start of the packet or at the start of the l3 header depending on the value of l3 tunnel flag in the ADD_MPLS arguments. POP_MPLS action is extended to support ethertype 0x6558 OVS userspace changes - --------------------- Encap & Decap ovs actions are extended to support MPLS packet type. The encap & decap adds and removes MPLS header at the start of packet as depicted below. Actions - encap(mpls(ether_type=0x8847)),encap(ethernet) Incoming packet -> \| ETH \| IP \| Payload \| 1 Actions - encap(mpls(ether_type=0x8847)) [Kernel action - add_mpls:0x8847] Outgoing packet -> \| MPLS \| ETH \| Payload\| 2 Actions - encap(ethernet) [ Kernel action - push_eth ] Outgoing packet -> \| ETH \| MPLS \| ETH \| Payload\| Decapsulation: Incoming packet -> \| ETH \| MPLS \| ETH \| IP \| Payload \| Actions - decap(),decap(packet_type(ns=0,type=0) 1 Actions - decap() [Kernel action - pop_eth) Outgoing packet -> \| MPLS \| ETH \| IP \| Payload\| 2 Actions - decap(packet_type(ns=0,type=0) [Kernel action - pop_mpls:0x6558] Outgoing packet -> \| ETH \| IP \| Payload ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24	openvswitch: New MPLS actions for layer 2 tunnelling	Martin Varghese	3	-6/+89
	The existing PUSH MPLS action inserts MPLS header between ethernet header and the IP header. Though this behaviour is fine for L3 VPN where an IP packet is encapsulated inside a MPLS tunnel, it does not suffice the L2 VPN (l2 tunnelling) requirements. In L2 VPN the MPLS header should encapsulate the ethernet packet. The new mpls action ADD_MPLS inserts MPLS header at the start of the packet or at the start of the l3 header depending on the value of l3 tunnel flag in the ADD_MPLS arguments. POP_MPLS action is extended to support ethertype 0x6558. Signed-off-by: Martin Varghese <martin.varghese@nokia.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24	net: Rephrased comments section of skb_mpls_pop()	Martin Varghese	1	-1/+1
	Rephrased comments section of skb_mpls_pop() to align it with comments section of skb_mpls_push(). Signed-off-by: Martin Varghese <martin.varghese@nokia.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-24	net: skb_mpls_push() modified to allow MPLS header push at start of packet.	Martin Varghese	1	-2/+6
	The existing skb_mpls_push() implementation always inserts mpls header after the mac header. L2 VPN use cases requires MPLS header to be inserted before the ethernet header as the ethernet packet gets tunnelled inside MPLS header in those cases. Signed-off-by: Martin Varghese <martin.varghese@nokia.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-22	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	David S. Miller	965	-4814/+9516
	Mere overlapping changes in the conflicts here. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-22	Merge tag 'xfs-5.5-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux	Linus Torvalds	13	-104/+341
	Pull xfs fixes from Darrick Wong: "Fix a few bugs that could lead to corrupt files, fsck complaints, and filesystem crashes: - Minor documentation fixes - Fix a file corruption due to read racing with an insert range operation. - Fix log reservation overflows when allocating large rt extents - Fix a buffer log item flags check - Don't allow administrators to mount with sunit= options that will cause later xfs_repair complaints about the root directory being suspicious because the fs geometry appeared inconsistent - Fix a non-static helper that should have been static" * tag 'xfs-5.5-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: Make the symbol 'xfs_rtalloc_log_count' static xfs: don't commit sunit/swidth updates to disk if that would cause repair failures xfs: split the sunit parameter update into two parts xfs: refactor agfl length computation function libxfs: resync with the userspace libxfs xfs: use bitops interface for buf log item AIL flag check xfs: fix log reservation overflows when allocating large rt extents xfs: stabilize insert range start boundary to avoid COW writeback race xfs: fix Sphinx documentation warning
2019-12-22	Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4	Linus Torvalds	8	-104/+116
	Pull ext4 bug fixes from Ted Ts'o: "Ext4 bug fixes, including a regression fix" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: clarify impact of 'commit' mount option ext4: fix unused-but-set-variable warning in ext4_add_entry() jbd2: fix kernel-doc notation warning ext4: use RCU API in debug_print_tree ext4: validate the debug_want_extra_isize mount option at parse time ext4: reserve revoke credits in __ext4_new_inode ext4: unlock on error in ext4_expand_extra_isize() ext4: optimize __ext4_check_dir_entry() ext4: check for directory entries too close to block end ext4: fix ext4_empty_dir() for directories with holes