linux-dev - Linux kernel development work

Age	Commit message	Author	Files	Lines
2017-08-31	net: dccp: Add handling of IPV6_PKTOPTIONS to dccp_v6_do_rcv()	Andrii	1	-12/+35
	Add handling of IPV6_PKTOPTIONS to dccp_v6_do_rcv() in net/dccp/ipv6.c, similar to the handling in net/ipv6/tcp_ipv6.c Signed-off-by: Andrii Vladyka <tulup@mail.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-31	bridge: add tracepoint in br_fdb_update	Roopa Prabhu	3	-1/+36
	This extends bridge fdb table tracepoints to also cover learned fdb entries in the br_fdb_update path. Note that unlike other tracepoints I have moved this to when the fdb is modified because this is in the datapath and can generate a lot of noise in the trace output. br_fdb_update is also called from added_by_user context in the NTF_USE case, which is already traced, hence the !added_by_user check. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-31	net_sched: add reverse binding for tc class	Cong Wang	11	-2/+148
	TC filters, when used as classifiers, are bound to TC classes. However, there is a hidden difference when adding them in different orders: 1. If we add tc classes before their filters, everything is fine. Logically, the classes exist before we specify their IDs in filters, so it is easy to bind them together, just as in the current code base. 2. If we add tc filters before the tc classes they bind, we have to do a dynamic lookup in the fast path. What's worse, this happens all the time, not just once, because on the fast path tcf_result is passed on the stack and there is no way to propagate it back to the one in tc filters. This hidden difference hurts performance silently if we have many tc classes in the hierarchy. This patch intends to close this gap by doing the reverse binding when we create a new class: in this case we can actually search all the filters in its parent, match, and fix up by classid. And because tcf_result is specific to each type of tc filter, we have to introduce a new op for each filter to tell how to bind the class. Note, we still can NOT totally get rid of those class lookups in ->enqueue() because cgroup and flow filters have no way to determine the classid at setup time; they still have to go through dynamic lookup. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
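The ordering problem in this commit message can be illustrated with a small user-space model. This is only a sketch: `Filter`, `Qdisc`, `add_class`, and `classify` are hypothetical names chosen for illustration, not the kernel's structures or the new per-filter op the patch introduces.

```python
# Toy model of binding tc filters to classes; all names are hypothetical.
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Filter:
    classid: int                       # class ID the filter refers to
    bound_class: Optional[str] = None  # resolved class, if any


@dataclass
class Qdisc:
    classes: Dict[int, str] = field(default_factory=dict)
    filters: List[Filter] = field(default_factory=list)
    slow_lookups: int = 0

    def add_filter(self, flt: Filter) -> None:
        # Forward binding: succeeds only if the class already exists.
        flt.bound_class = self.classes.get(flt.classid)
        self.filters.append(flt)

    def add_class(self, classid: int, name: str) -> None:
        self.classes[classid] = name
        # Reverse binding: fix up filters that were added before the class.
        for flt in self.filters:
            if flt.classid == classid and flt.bound_class is None:
                flt.bound_class = name

    def classify(self, flt: Filter) -> Optional[str]:
        if flt.bound_class is not None:
            return flt.bound_class     # fast path, no lookup needed
        self.slow_lookups += 1         # per-packet dynamic lookup
        return self.classes.get(flt.classid)
```

Without the loop in `add_class`, a filter added before its class would take the `slow_lookups` branch on every call to `classify`, which is the silent cost the message describes.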
2017-08-30	Merge tag 'mlx5-GRE-Offload' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux	David S. Miller	6	-40/+384
	Saeed Mahameed says: ==================== mlx5-updates-2017-08-31 (GRE Offloads support) This series provides support for MPLS RSS, GRE TX offloads, and GRE RSS. The first patch from Gal and Ariel provides the mlx5 driver support for the ConnectX capability to perform IP version identification and matching in order to distinguish between IPv4 and IPv6 without the need to specify the encapsulation type, thus performing RSS in MPLS automatically without specifying the MPLS ethertype. This patch will also serve for inner GRE IPv4/6 classification for inner GRE RSS. 2nd patch from Gal adds the TX offloads support for GRE tunneled packets, by reporting the needed netdev features. 3rd patch from Gal adds GRE inner RSS support by creating the needed device resources (Steering Tables/rules and traffic classifiers) to match GRE traffic and perform RSS hashing on the inner headers. Improvement: Testing 8 TCP streams bandwidth over GRE: System: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz NIC: Mellanox Technologies MT28800 Family [ConnectX-5 Ex] Before: 21.3 Gbps (Single RQ) Now : 90.5 Gbps (RSS spread on 8 RQs) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	liquidio: fix crash in presence of zeroed-out base address regs	Rick Farrington	1	-0/+20
	Fix crash in linux PF driver when BARs have been cleared/de-programmed; fail early init (prior to mapping BARs) if the BAR0 or BAR1 registers are zero. This situation can arise when the PF is added to a VM (PCI pass-through), then a PF FLR is issued (in the VM). After this occurs, the BAR registers will be zero. If we attempt to load the PF driver in the host (after VM has been shutdown), the host can reset. Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com> Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	devlink: Maintain consistency in mac field name	David Ahern	1	-1/+1
	IPv4 name uses "destination ip" as does the IPv6 patch set. Make the mac field consistent. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	hv_netvsc: Fix typos in the document of UDP hashing	Haiyang Zhang	1	-2/+2
	There are two typos in the document, netvsc.txt, regarding UDP hashing level. This patch fixes them. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	xen-netfront: be more drop monitor friendly	Eric Dumazet	1	-1/+1
	xennet_start_xmit() might copy skb with inappropriate layout into a fresh one. Old skb is freed, and at this point it is not a drop, but a consume. New skb will then be either consumed or dropped. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-31	net/mlx5e: Support RSS for GRE tunneled packets	Gal Pressman	5	-17/+321
	Introduce a new flow table and indirect TIRs which are used to hash the inner packet headers of GRE tunneled packets. When a GRE tunneled packet is received, the TTC flow table will match the new IPv4/6->GRE rules which will forward it to the inner TTC table. The inner TTC is similar to its counterpart outer TTC table, but matching the inner packet headers instead of the outer ones (and does not include the new IPv4/6->GRE rules). The new rules will not add steering hops since they are added to an already existing flow group which will be matched regardless of this patch. Non GRE traffic will not be affected. The inner flow table will forward the packet to inner indirect TIRs which hash the inner packet and thus result in RSS for the tunneled packets. Testing 8 TCP streams bandwidth over GRE: System: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz NIC: Mellanox Technologies MT28800 Family [ConnectX-5 Ex] Before: 21.3 Gbps (Single RQ) Now : 90.5 Gbps (RSS spread on 8 RQs) Signed-off-by: Gal Pressman <galp@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-08-31	net/mlx5e: Support TSO and TX checksum offloads for GRE tunnels	Gal Pressman	2	-19/+34
	Add TX offloads support for GRE tunneled packets by reporting the needed netdev features. Signed-off-by: Gal Pressman <galp@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-08-31	net/mlx5e: Use IP version matching to classify IP traffic	Gal Pressman	1	-4/+29
	This change adds the ability for flow steering to classify IPv4/6 packets with MPLS tag (Ethertype 0x8847 and 0x8848) as standard IP packets and hit IPv4/6 classification steering rules. Since IP packets with MPLS tag header have MPLS ethertype, they missed the IPv4/6 ethertype rule and ended up hitting the default filter forwarding all the packets to the same single RQ (No RSS). Since our device is able to look past the MPLS tag and identify the next protocol we introduce this solution which replaces ethertype matching by the device's capability to perform IP version identification and matching in order to distinguish between IPv4 and IPv6. Therefore, when driver is performing flow steering configuration on the device it will use IP version matching in IP classified rules instead of ethertype matching which will cause relevant MPLS tagged packets to hit this rule as well. If the device doesn't support IP version matching the driver will fall back to use legacy ethertype matching in the steering as before. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
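The difference between ethertype matching and IP version matching can be sketched in user space. This is an illustrative model only, not the device's steering logic: the function names are hypothetical, VLAN tags are not handled, and the frame is assumed to be well formed.

```python
# Sketch: ethertype matching vs. IP version matching on a raw Ethernet frame.
import struct
from typing import Optional

ETH_P_IP = 0x0800
ETH_P_IPV6 = 0x86DD
ETH_P_MPLS_UC = 0x8847
ETH_P_MPLS_MC = 0x8848


def classify_by_ethertype(frame: bytes) -> Optional[int]:
    """Return 4 or 6 only when the ethertype itself says IPv4/IPv6."""
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    return {ETH_P_IP: 4, ETH_P_IPV6: 6}.get(ethertype)


def classify_by_ip_version(frame: bytes) -> Optional[int]:
    """Look past any MPLS label stack and read the IP version nibble."""
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    offset = 14
    if ethertype in (ETH_P_MPLS_UC, ETH_P_MPLS_MC):
        # Each label entry is 4 bytes; bit 0 of its third byte marks
        # the bottom of the label stack.
        while True:
            bottom = frame[offset + 2] & 0x01
            offset += 4
            if bottom:
                break
    elif ethertype not in (ETH_P_IP, ETH_P_IPV6):
        return None
    version = frame[offset] >> 4
    return version if version in (4, 6) else None
```

For an MPLS-tagged IPv4 frame, `classify_by_ethertype` finds nothing (the packet would fall through to the default rule), while `classify_by_ip_version` still identifies it as IPv4.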
2017-08-30	bpf: test_maps: fix typos, "conenct" and "listeen"	Colin Ian King	1	-2/+2
	Trivial fix to typos in printf error messages: "conenct" -> "connect" "listeen" -> "listen" thanks to Daniel Borkmann for spotting one of these mistakes Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	qed: fix spelling mistake: "calescing" -> "coalescing"	Colin Ian King	1	-1/+1
	Trivial fix to spelling mistake in DP_NOTICE message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	net: hns3: Fixes the wrong IS_ERR check on the returned phydev value	Salil Mehta	1	-1/+1
	This patch removes the wrong check being done for the phy device being returned by the mdiobus_get_phy() function. This function never returns the error pointers. Fixes: 256727da7395 ("net: hns3: Add MDIO support to HNS3 Ethernet Driver for hip08 SoC") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	net: bcm63xx_enet: make bcm_enetsw_ethtool_ops const	Bhumika Goyal	1	-1/+1
	Make this const as it is never modified. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	ipv6: sr: fix get_srh() to comply with IPv6 standard "RFC 8200"	Ahmed Abdelsalam	1	-6/+12
	An IPv6 packet may carry more than one extension header, and IPv6 nodes must accept and attempt to process extension headers in any order and occurring any number of times in the same packet. Hence, there should be no assumption that the Segment Routing extension header appears immediately after the IPv6 header. Moreover, section 4.1 of RFC 8200 gives a recommendation on the order of appearance of those extension headers within an IPv6 packet. According to this recommendation, the Segment Routing extension header should appear after the Hop-by-Hop and Destination Options headers (if they are present). This patch fixes get_srh() so that it gets the segment routing header regardless of its position in the chain of extension headers in the IPv6 packet, and makes sure that the IPv6 routing extension header is of Type 4. Signed-off-by: Ahmed Abdelsalam <amsalam20@gmail.com> Acked-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
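Walking the extension header chain, rather than assuming the routing header comes first, can be sketched as follows. This is a user-space illustration, not the kernel's get_srh(): `find_srh_offset` is a hypothetical name, the constants are defined locally, and only the Hop-by-Hop, Destination Options, Routing, and Fragment headers are followed.

```python
# Sketch: walk IPv6 extension headers to locate a Segment Routing header.
from typing import Optional

NEXTHDR_HOP = 0        # Hop-by-Hop Options
NEXTHDR_ROUTING = 43   # Routing header
NEXTHDR_FRAGMENT = 44  # Fragment header (fixed 8 bytes)
NEXTHDR_DEST = 60      # Destination Options
IPV6_SRCRT_TYPE_4 = 4  # Routing Type 4: Segment Routing
IPV6_HDR_LEN = 40

EXT_HEADERS = (NEXTHDR_HOP, NEXTHDR_ROUTING, NEXTHDR_FRAGMENT, NEXTHDR_DEST)


def find_srh_offset(packet: bytes) -> Optional[int]:
    """Return the byte offset of the SRH, or None if the packet has none."""
    if len(packet) < IPV6_HDR_LEN:
        return None
    nexthdr = packet[6]            # Next Header field of the fixed header
    offset = IPV6_HDR_LEN
    while nexthdr in EXT_HEADERS:
        if offset + 8 > len(packet):
            return None            # truncated extension header
        if nexthdr == NEXTHDR_ROUTING and packet[offset + 2] == IPV6_SRCRT_TYPE_4:
            return offset
        if nexthdr == NEXTHDR_FRAGMENT:
            hdrlen = 8
        else:
            # Hdr Ext Len counts 8-octet units beyond the first 8 octets.
            hdrlen = (packet[offset + 1] + 1) * 8
        nexthdr = packet[offset]
        offset += hdrlen
    return None
```

A packet carrying Hop-by-Hop and Destination Options headers ahead of the routing header is handled the same way as one where the routing header comes first; a routing header of another type is skipped rather than mistaken for an SRH.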
2017-08-30	Merge branch 'mvpp2-comphy'	David S. Miller	6	-33/+821
	Antoine Tenart says: ==================== net: mvpp2: comphy configuration This series, following up the one on the GoP/MAC configuration, aims to stop depending on the firmware/bootloader configuration when using the PPv2 engine. With this series the PPv2 driver does not need to rely on a previous configuration, and dynamic reconfiguration while the kernel is running can be done (i.e. switch one port from SGMII to 10G, or the opposite). A port can now be configured in a different mode than what's done in the firmware/bootloader as well. The series first contains patches in the generic PHY framework to support what is called the comphy (common PHYs), which is an h/w block providing PHYs that can be configured in various modes ranging from SGMII, 10G to SATA and others. As of now only the SGMII and 10G modes are supported by the comphy driver. Then patches are modifying the PPv2 driver to first add the comphy initialization sequence (i.e. calls to the generic PHY framework) and to then take advantage of this to allow dynamic reconfiguration (i.e. configuring the mode of a port given what's connected, between sgmii and 10G). Note the use of the comphy in the PPv2 driver is kept optional (i.e. if not described in dt the driver still works as before and relies on the firmware/bootloader configuration). Finally there are dt/defconfig patches to describe and take advantage of this. This was tested on a range of devices: 8040-db, 8040-mcbin and 7040-db. @Dave: the dt patches should go through the mvebu tree (patches 9-13). Thanks! Antoine Since v3: - Now use of_phy_simple_xlate() to retrieve the phy. - Added an owner in the phy_ops structure. - Now allow the module to be selected with COMPILE_TEST. - Removed unused parameter in the comphy set_mode functions. - Added Kishon Acked-by in patch 1. Since v2: - Kept the link mode enforcement. - Removed the netif_running() check. - Reworded the "dynamic reconfiguration of the PHY mode" commit log. - Added one patch not to force the GMAC autoneg parameters when using the XLG MAC. Since v1: - Updated the mode settings variable name in the comphy driver to have 'cp110' in it. - Documented the PHY cell argument in the dt documentation. - New patch adding comphy phandles for the 7040-db board. - Checked if the carrier_on/off functions were needed. They are. - s/PHY/generic PHY/ in commit log of patch 1. - Rebased on the latest net-next/master. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	net: mvpp2: dynamic reconfiguration of the comphy/GoP/MAC	Antoine Tenart	1	-1/+20
	This patch adds logic to reconfigure the comphy/GoP/MAC when the link state is updated at runtime. This is very useful on boards where many link speeds are supported: depending on what is negotiated, the PPv2 driver will automatically reconfigure the link between the PHY and the MAC. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	net: mvpp2: do not set GMAC autoneg when using XLG MAC	Antoine Tenart	1	-22/+42
	When using the XLG MAC, it does not make sense to force the GMAC autoneg parameters. This patch adds checks to only set the GMAC autoneg parameters when needed (i.e. when not using the XLG MAC). Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	net: mvpp2: improve the link management function	Antoine Tenart	1	-0/+11
	When the link status changes, the phylib calls the link_event function in the mvpp2 driver. Before this patch only the egress/ingress transmit was enabled/disabled. This patch adds more functionality to the link status management code by enabling/disabling the port per-cpu interrupts, and the port itself. The queues are now stopped as well, and the netif carrier helpers are called. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	net: mvpp2: simplify the link_event function	Antoine Tenart	1	-9/+4
	The link_event function is somewhat complicated. This cosmetic patch simplifies it. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	net: mvpp2: initialize the comphy	Antoine Tenart	1	-1/+43
	On some platforms, the comphy is between the MAC GoP and the PHYs. The mvpp2 driver currently relies on the firmware/bootloader to configure the comphy. As a comphy driver was added to the generic PHY framework, this patch uses it in the mvpp2 driver to configure the comphy at boot time to avoid relying on the bootloader. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	Documentation/bindings: phy: document the Marvell comphy driver	Antoine Tenart	1	-0/+43
	The Marvell Armada 7K/8K SoCs contain a hardware block called COMPHY that provides a number of shared PHYs used by various interfaces in the SoC: network, SATA, PCIe, etc. This Device Tree binding describes this COMPHY hardware block. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	phy: add the mvebu cp110 comphy driver	Antoine Tenart	3	-0/+656
	On the CP110 unit, which can be found on various Marvell platforms such as the 7k and 8k (currently), a comphy (common PHYs) hardware block can be found. This block provides a number of PHYs which can be used in various modes by other controllers (network, SATA ...). These common PHYs must be configured for the controllers using them to work correctly, either at boot time or when the system runs, to switch the mode used. This patch adds a driver for this comphy hardware block, providing callbacks for its PHYs so that consumers can configure the modes used. As of this commit, two modes are supported by the comphy driver: sgmii and 10gkr. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	phy: add sgmii and 10gkr modes to the phy_mode enum	Antoine Tenart	1	-0/+2
	This patch adds more generic PHY modes to the phy_mode enum, to allow configuring generic PHYs to the SGMII and/or the 10GKR mode by using the set_mode callback. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	dp83640: don't hold spinlock while calling netif_rx_ni	Stefan Sørensen	1	-2/+5
	We should not hold a spinlock while pushing the skb into the networking stack, so move the call to netif_rx_ni out of the critical region to where we have dropped the spinlock. Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com> Signed-off-by: David S. Miller <davem@davemloft.net>
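The general pattern behind this fix, detaching work under the lock and delivering it only after the lock is dropped, can be sketched in user space. This is not the dp83640 driver code: `RxQueue` and its methods are hypothetical, and the `deliver` callback merely stands in for netif_rx_ni().

```python
# Sketch of "collect under the lock, deliver after dropping it".
import threading
from collections import deque
from typing import Callable, Deque, List


class RxQueue:
    def __init__(self, deliver: Callable[[bytes], None]) -> None:
        self._lock = threading.Lock()
        self._pending: Deque[bytes] = deque()
        self._deliver = deliver      # stands in for netif_rx_ni()

    def enqueue(self, pkt: bytes) -> None:
        with self._lock:
            self._pending.append(pkt)

    def flush(self) -> int:
        # Critical region: only detach the packets from the shared queue.
        with self._lock:
            batch: List[bytes] = list(self._pending)
            self._pending.clear()
        # The lock is released here, so delivery may take arbitrarily long
        # or call back into enqueue() without deadlocking.
        for pkt in batch:
            self._deliver(pkt)
        return len(batch)
```

Because `threading.Lock` is not reentrant, a `deliver` callback that calls `enqueue()` would deadlock if delivery happened inside the `with` block; keeping the critical region minimal avoids that.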
2017-08-30	Merge branch 'net_sched-idr'	David S. Miller	23	-413/+426
	Chris Mi says: ==================== net/sched: Improve getting objects by indexes Using current TC code, it is very slow to insert a lot of rules. In order to improve the rules update rate in TC, we introduced the following two changes: 1) changed cls_flower to use IDR to manage the filters. 2) changed all act_xxx modules to use IDR instead of a small hash table But IDR has a limitation that it uses int. TC handle uses u32. To make sure there is no regression, we add several new IDR APIs to support unsigned long. v2 == Addressed Hannes's comment: express idr_alloc in terms of idr_alloc_ext and most of the other functions ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	net/sched: Change act_api and act_xxx modules to use IDR	Chris Mi	18	-344/+278
	Typically, each TC filter has its own action. All the actions of the same type are saved in that type's hash table. But the hash table has so few buckets that it degrades to a list, and performance suffers greatly. For example, it takes about 0m11.914s to insert 64K rules. If we convert the hash table to IDR, it only takes about 0m1.500s. The improvement is huge. But please note that the test result is based on the previous patch, in which cls_flower uses IDR. Signed-off-by: Chris Mi <chrism@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	net/sched: Change cls_flower to use IDR	Chris Mi	1	-32/+23
	Currently, all filters with the same priority are linked in a doubly linked list. Every filter should have a unique handle. To make the handle unique, we need to iterate the list every time to see if the handle exists or not when inserting a new filter. It is time-consuming. For example, it takes about 5m3.169s to insert 64K rules. This patch changes cls_flower to use IDR. With this patch, it takes about 0m1.127s to insert 64K rules. The improvement is huge. But please note that in this testing, all filters share the same action. If every filter has a unique action, that is another bottleneck. Follow-up patch in this patchset addresses that. Signed-off-by: Chris Mi <chrism@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
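The cost difference between scanning a list for a handle clash and asking an ID map can be shown with a toy model. This is an assumption-laden sketch: `alloc_handle_by_scan` and `IdMap` are hypothetical names, and the dict-backed map only imitates the behaviour of an ID allocator; it is not the kernel's radix-tree-based IDR.

```python
# Toy comparison: unique-handle allocation by list scan vs. by ID map.
from typing import Dict, List, Optional


def alloc_handle_by_scan(handles: List[int], wanted: int) -> Optional[int]:
    """O(n) per insert: walk every existing filter to check for a clash."""
    for h in handles:
        if h == wanted:
            return None
    handles.append(wanted)
    return wanted


class IdMap:
    """Dict-backed stand-in for an ID allocator (not the kernel's IDR)."""

    def __init__(self) -> None:
        self._items: Dict[int, object] = {}
        self._next_free = 1           # every ID in [1, _next_free) is in use

    def alloc(self, ptr: object, wanted: Optional[int] = None) -> Optional[int]:
        if wanted is None:
            while self._next_free in self._items:
                self._next_free += 1
            wanted = self._next_free
        elif wanted in self._items:
            return None               # handle already taken
        self._items[wanted] = ptr
        return wanted

    def find(self, handle: int) -> Optional[object]:
        return self._items.get(handle)

    def remove(self, handle: int) -> Optional[object]:
        if handle not in self._items:
            return None
        self._next_free = min(self._next_free, handle)
        return self._items.pop(handle)
```

With the list, inserting N filters costs on the order of N² comparisons in total, which is consistent with the minutes-long insert time quoted above; the map answers each uniqueness check directly.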
2017-08-30	idr: Add new APIs to support unsigned long	Chris Mi	4	-37/+125
	The following new APIs are added: int idr_alloc_ext(struct idr *idr, void *ptr, unsigned long *index, unsigned long start, unsigned long end, gfp_t gfp); void *idr_remove_ext(struct idr *idr, unsigned long id); void *idr_find_ext(const struct idr *idr, unsigned long id); void *idr_replace_ext(struct idr *idr, void *ptr, unsigned long id); void *idr_get_next_ext(struct idr *idr, unsigned long *nextid); Signed-off-by: Chris Mi <chrism@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	Merge branch 'add-rmnet-driver'	David S. Miller	17	-0/+1428
	Subash Abhinov Kasiviswanathan says: ==================== net: Add support for rmnet driver This patch series adds support for the rmnet driver which is required to support recent chipsets using Qualcomm Technologies, Inc. modems. The data from hardware follows the multiplexing and aggregation protocol (MAP). This driver can be used to register onto any physical network device in IP mode. Physical transports include USB, HSIC, PCIe and IP accelerator. The rmnet driver helps to decode these packets and queue them to the network stack (and encode and transmit them to the physical device). v1: Same as the RFC patch with some minor fixes for issues reported by kbuild test robot. v1->v2: Change datatypes and remove config IOCTL as mentioned by David. Also fix checkpatch issues and remove some unused code. v2->v3: Move location to drivers/net and rename to rmnet. Change the userspace - netlink communication from custom netlink to rtnl_link_ops. Refactor some code. Use a fixed config for ingress and egress. v3->v4: Move location to drivers/net/ethernet/qualcomm/. Fix comments from Stephen and Jiri - Split the ether and arp type changes into separate patches. Remove debug and custom logging and switch to standard netdevice log. Remove module parameters. Refactor and change some code style issues. v4->v5: Rename some structs and variables. Move the initializer before the for loop start. Put the arp type in correct sequence. v5->v6: Fix comments from Dan - Use the upper link API. As a result, remove all the refcounting logic. Device refcount is explicitly held on real_dev on rx_handler registration only. Modify the flow control struct. Remove the unused ethernet mode handling. v6->v7: Fix comments from David - Add newline to end of Makefile. Remove inline from .c files. Move the module init/exit to rmnet config. Fix an error reported by kbuild test robot for an unused file. v7->v8: Use a smaller value for ETH_P_MAP as mentioned by David. Change netdev_info to netdev_dbg as mentioned by Andrew. Fix comments from Stephen regarding netdev_priv and sparse related errors of using 0 as NULL v8->v9: Fix comments from David - Remove the CFLAG rule. Change the way rmnet devices are freed. Instead of using a workqueue to unregister devices individually, go through the list and free all devices within the rtnl_lock(). v9->v10: Actually fix the locking as mentioned by David. The locking scheme is mentioned in a comment in rmnet_config.c. Change comment near MAP type definition as mentioned by Dan. Refactor some code. v10->v11: Allow RMNET to compile as a module as mentioned by David ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	drivers: net: ethernet: qualcomm: rmnet: Initial implementation	Subash Abhinov Kasiviswanathan	15	-0/+1424
	RmNet driver provides a transport agnostic MAP (multiplexing and aggregation protocol) support in embedded module. Module provides virtual network devices which can be attached to any IP-mode physical device. This will be used to provide all MAP functionality on future hardware in a single consistent location. Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	net: arp: Add support for raw IP device	Subash Abhinov Kasiviswanathan	1	-0/+1
	Define the raw IP type. This is needed for raw IP net devices like rmnet. Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	net: ether: Add support for multiplexing and aggregation type	Subash Abhinov Kasiviswanathan	1	-0/+3
	Define the Qualcomm multiplexing and aggregation (MAP) ether type 0x00F9. This is needed for receiving data in the MAP protocol like RMNET. This is not an officially registered ID. Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	Merge branch 'tcp-readd-hp'	David S. Miller	9	-27/+271
	Florian Westphal says: ==================== tcp: re-add header prediction Eric reported a performance regression caused by header prediction removal. We now call tcp_ack() much more frequently, for some workloads this brings in enough cache line misses to become noticeable. We could possibly still kill HP provided we find a different way to suppress unneeded tcp_ack, but given we're late in the cycle it seems preferable to revert. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	tcp: Revert "tcp: remove header prediction"	Florian Westphal	8	-6/+223
	This reverts commit 45f119bf936b1f9f546a0b139c5b56f9bb2bdc78. Eric Dumazet says: We found at Google a significant regression caused by 45f119bf936b1f9f546a0b139c5b56f9bb2bdc78 tcp: remove header prediction In typical RPC (TCP_RR), when a TCP socket receives data, we now call tcp_ack() while we used to not call it. This touches enough cache lines to cause a slowdown. so problem does not seem to be HP removal itself but the tcp_ack() call. Therefore, it might be possible to remove HP after all, provided one finds a way to elide tcp_ack for most cases. Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	tcp: Revert "tcp: remove CA_ACK_SLOWPATH"	Florian Westphal	3	-22/+49
	This change was a followup to the header prediction removal, so first revert this as a prerequisite to back out hp removal. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-30	staging: irda: fix init level for irda core	Greg KH	1	-1/+1
	When moving the IRDA code out of net/ into drivers/staging/irda/net, the link order changes when IRDA is built into the kernel. That causes a kernel crash at boot time as netfilter isn't initialized yet. To fix this, move the init call level of the irda core to be device_initcall() as the link order keeps this being initialized at the correct time. Reported-by: kernel test robot <fengguang.wu@intel.com> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29	net: bcmgenet: Do not return from void function	Florian Fainelli	1	-1/+1
	A stray return was added in the macro bcmgenet_##name##_writel where it should not, drop it. Reported-by: kbuild test robot <fengguang.wu@intel.com> Fixes: 69d2ea9c7989 ("net: bcmgenet: Use correct I/O accessors") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29	neigh: increase queue_len_bytes to match wmem_default	Eric Dumazet	7	-16/+19
	Florian reported UDP xmit drops that could be root caused to the too small neigh limit. Current limit is 64 KB, meaning that even a single UDP socket would hit it, since its default sk_sndbuf comes from net.core.wmem_default (~212992 bytes on 64bit arches). Once ARP/ND resolution is in progress, we should allow a little more packets to be queued, at least for one producer. Once neigh arp_queue is filled, a rogue socket should hit its sk_sndbuf limit and either block in sendmsg() or return -EAGAIN. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29	net: remove dmaengine.h inclusion from netdevice.h	Dave Jiang	1	-1/+0
	Since the removal of NET_DMA, dmaengine.h header file shouldn't be needed by netdevice.h anymore. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29	net: bcmgenet: Use correct I/O accessors	Florian Fainelli	2	-30/+58
	The GENET driver currently uses __raw_{read,write}l which means native I/O endian. This works correctly for an ARM LE kernel (default) but fails miserably on an ARM BE (BE8) kernel where registers are kept little endian, so replace uses with {read,write}l_relaxed here which is what we want because this is all performance sensitive code. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
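Why a native-endian load breaks on a big-endian kernel can be shown with plain byte buffers. This is a user-space sketch only: `raw_read` and `le_read` are hypothetical helpers that imitate the byte-order behaviour of __raw_readl() and readl_relaxed() respectively, and the register value is made up for illustration.

```python
# Sketch: why a native-endian read of a little-endian register breaks on BE.
import struct

# The device keeps its registers little-endian; the value is illustrative.
REGISTER_BYTES = struct.pack("<I", 0x12345678)


def raw_read(buf: bytes, cpu_big_endian: bool) -> int:
    """Native-endian load, in the spirit of __raw_readl()."""
    fmt = ">I" if cpu_big_endian else "<I"
    return struct.unpack(fmt, buf)[0]


def le_read(buf: bytes) -> int:
    """Always-little-endian load, in the spirit of readl_relaxed()."""
    return struct.unpack("<I", buf)[0]
```

On a little-endian CPU both helpers agree; on a big-endian CPU `raw_read` returns the byte-swapped value 0x78563412, while `le_read` still returns 0x12345678.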
2017-08-29	liquidio: show NIC's U-Boot version in a dev_info() message	Weilin Chang	1	-0/+78
	Signed-off-by: Weilin Chang <weilin.chang@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29	net: dsa: make some structures const	Bhumika Goyal	2	-2/+2
	Make these const as they are not modified anywhere. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29	ipv6: Use rt6i_idev index for echo replies to a local address	David Ahern	2	-13/+30
	Tariq reported that local pings to a link-local address are failing: $ ifconfig ens8 ens8: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet 11.141.16.6 netmask 255.255.0.0 broadcast 11.141.255.255 inet6 fe80::7efe:90ff:fecb:7502 prefixlen 64 scopeid 0x20<link> ether 7c:fe:90:cb:75:02 txqueuelen 1000 (Ethernet) RX packets 12 bytes 1164 (1.1 KiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 30 bytes 2484 (2.4 KiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 $ /bin/ping6 -c 3 fe80::7efe:90ff:fecb:7502%ens8 PING fe80::7efe:90ff:fecb:7502%ens8(fe80::7efe:90ff:fecb:7502) 56 data bytes Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29	amd-xgbe: Interrupt summary bits are h/w version dependent	Tom Lendacky	2	-5/+16
	There is a difference in the bit position of the normal interrupt summary enable (NIE) and abnormal interrupt summary enable (AIE) between revisions of the hardware. For older revisions the NIE and AIE bits are positions 16 and 15 respectively. For newer revisions the NIE and AIE bits are positions 15 and 14. The effect in changing the bit position is that newer hardware won't receive AIE interrupts in the current version of the driver. Specifically, the driver uses this interrupt to collect statistics on when a receive buffer unavailable event occurs and to restart the driver/device when a fatal bus error occurs. Update the driver to set the interrupt enable bit based on the reported version of the hardware. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
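The version-dependent bit selection can be sketched as a small helper. The bit positions come from the commit message; everything else is an assumption: `summary_enable_mask` is a hypothetical name, and the `newer_revision` flag stands in for whatever hardware version check the driver actually performs.

```python
# Sketch: choose interrupt-summary enable bits by hardware revision.
def summary_enable_mask(newer_revision: bool) -> int:
    """Return a mask with the NIE and AIE bits set for the given revision."""
    if newer_revision:
        nie_bit, aie_bit = 15, 14   # newer revisions
    else:
        nie_bit, aie_bit = 16, 15   # older revisions
    return (1 << nie_bit) | (1 << aie_bit)
```

The old-revision mask never sets bit 14, so applying it to newer hardware leaves AIE disabled, which matches the symptom described above.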
2017-08-29	Merge branch 'nsh-headers-GSO'	David S. Miller	10	-31/+467
	Jiri Benc says: ==================== nsh: headers, GSO This adds header structs and helpers for NSH together with GSO support. Note there is no code in this patchset that actually manipulates the NSH headers. That was sent to netdev by Yi Yang ("[PATCH net-next v6 0/3] openvswitch: add NSH support"). The aim of this series is to lay the groundwork and ease the implementation for him. In addition to openvswitch, the NSH support should be added to tc (flower to match, act_nsh to push/pop NSH headers). That will come later. There's currently no plan to support NSH by other means than those two. The patch 3 in this patchset was written by Yi Yang, I took it from the aforementioned series and slightly modified it - see the note in the patch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29	nsh: add GSO support	Jiri Benc	5	-0/+103
	Add a new nsh/ directory. It currently holds only GSO functions but more will come: in particular, code shared by openvswitch and tc to manipulate NSH headers. For now, assume there's no hardware support for NSH segmentation. We can always introduce netdev->nsh_features later. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29	net: add NSH header structures and helpers	Yi Yang	1	-0/+307
	NSH (Network Service Header)[1] is a new protocol for service function chaining, it can be handled as a L3 protocol like IPv4 and IPv6, Eth + NSH + Inner packet or VxLAN-gpe + NSH + Inner packet are two typical use cases. This patch adds NSH header structures and helpers for NSH GSO support and Open vSwitch NSH support. [1] https://datatracker.ietf.org/doc/draft-ietf-sfc-nsh/ [Jiri: added nsh_hdr() helper and renamed the header struct to "struct nshhdr" to match the usual pattern. Removed packet type defines, these are now shared with VXLAN-GPE.] Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-29	vxlan: factor out VXLAN-GPE next protocol	Jiri Benc	3	-31/+56
	The values are shared between VXLAN-GPE and NSH. Originally probably by coincidence but I notified both working groups about this last year and they seem to keep the values in sync since then. Hopefully they'll get a single IANA registry for the values, too. (I asked them for that.) Factor out the code to be shared by the NSH implementation. NSH and MPLS values are added in this patch, too. For MPLS, the drafts incorrectly assign only a single value, while we have two MPLS ethertypes. I raised the problem with both groups. For now, I assume the value is for unicast. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>