linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2016-03-14	net: add a hardware buffer management helper API	Gregory CLEMENT	4	-0/+119
	This basic implementation allows to share code between driver using hardware buffer management. As the code is hardware agnostic, there is few helpers, most of the optimization brought by the an HW BM has to be done at driver level. Tested-by: Sebastian Careba <nitroshift@yahoo.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14	net: mvneta: bm: add support for hardware buffer management	Marcin Wojtas	7	-39/+1285
	Buffer manager (BM) is a dedicated hardware unit that can be used by all ethernet ports of Armada XP and 38x SoC's. It allows to offload CPU on RX path by sparing DRAM access on refilling buffer pool, hardware-based filling of descriptor ring data and better memory utilization due to HW arbitration for using 'short' pools for small packets. Tests performed with A388 SoC working as a network bridge between two packet generators showed increase of maximum processed 64B packets by ~20k (~555k packets with BM enabled vs ~535 packets without BM). Also when pushing 1500B-packets with a line rate achieved, CPU load decreased from around 25% without BM to 20% with BM. BM comprise up to 4 buffer pointers' (BP) rings kept in DRAM, which are called external BP pools - BPPE. Allocating and releasing buffer pointers (BP) to/from BPPE is performed indirectly by write/read access to a dedicated internal SRAM, where internal BP pools (BPPI) are placed. BM hardware controls status of BPPE automatically, as well as assigning proper buffers to RX descriptors. For more details please refer to Functional Specification of Armada XP or 38x SoC. In order to enable support for a separate hardware block, common for all ports, a new driver has to be implemented ('mvneta_bm'). It provides initialization sequence of address space, clocks, registers, SRAM, empty pools' structures and also obtaining optional configuration from DT (please refer to device tree binding documentation). mvneta_bm exposes also a necessary API to mvneta driver, as well as a dedicated structure with BM information (bm_priv), whose presence is used as a flag notifying of BM usage by port. It has to be ensured that mvneta_bm probe is executed prior to the ones in ports' driver. In case BM is not used or its probe fails, mvneta falls back to use software buffer management. A sequence executed in mvneta_probe function is modified in order to have an access to needed resources before possible port's BM initialization is done. According to port-pools mapping provided by DT appropriate registers are configured and the buffer pools are filled. RX path is modified accordingly. Becaues the hardware allows a wide variety of configuration options, following assumptions are made: * using BM mechanisms can be selectively disabled/enabled basing on DT configuration among the ports * 'long' pool's single buffer size is tied to port's MTU * using 'long' pool by port is obligatory and it cannot be shared * using 'short' pool for smaller packets is optional * one 'short' pool can be shared among all ports This commit enables hardware buffer management operation cooperating with existing mvneta driver. New device tree binding documentation is added and the one of mvneta is updated accordingly. [gregory.clement@free-electrons.com: removed the suspend/resume part] Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14	bus: mvebu-mbus: provide api for obtaining IO and DRAM window information	Marcin Wojtas	2	-0/+55
	This commit enables finding appropriate mbus window and obtaining its target id and attribute for given physical address in two separate routines, both for IO and DRAM windows. This functionality is needed for Armada XP/38x Network Controller's Buffer Manager and PnC configuration. [gregory.clement@free-electrons.com: Fix size test for mvebu_mbus_get_dram_win_info] Signed-off-by: Marcin Wojtas <mw@semihalf.com> [DRAM window information reference in LKv3.10] Signed-off-by: Evan Wang <xswang@marvell.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14	ARM: dts: armada-xp-openblocks-ax3-4: Add BM support	Gregory CLEMENT	1	-1/+18
	Allow Openblock AX3 using hardware buffer management with mvneta. Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14	ARM: dts: armada-xp: enable buffer manager support on Armada XP boards	Marcin Wojtas	2	-2/+36
	Since mvneta driver supports using hardware buffer management (BM), in order to use it, board files have to be adjusted accordingly. This commit enables BM on AXP-DB and AXP-GP in same manner - because number of ports on those boards is the same as number of possible pools, each port is supposed to use single pool for all kind of packets. Moreover appropriate entry is added to 'soc' node ranges, as well as "okay" status for 'bm' and 'bm-bppi' (internal SRAM) nodes. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14	ARM: dts: armada-xp: add buffer manager nodes	Marcin Wojtas	1	-0/+19
	Armada XP network controller supports hardware buffer management (BM). Since it is now enabled in mvneta driver, appropriate nodes can be added to armada-xp.dtsi - for the actual common BM unit (bm@c0000) and its internal SRAM (bm-bppi), which is used for indirect access to buffer pointer ring residing in DRAM. Pools - ports mapping, bm-bppi entry in 'soc' node's ranges and optional parameters are supposed to be set in board files. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14	ARM: dts: armada-38x: enable buffer manager support on Armada 38x boards	Marcin Wojtas	5	-4/+71
	Since mvneta driver supports using hardware buffer management (BM), in order to use it, board files have to be adjusted accordingly. This commit enables BM on: * A385-DB-AP - each port has its own pool for long and common pool for short packets, * A388-ClearFog - same as above, * A388-DB - to each port unique 'short' and 'long' pools are mapped, * A388-GP - same as above. Moreover appropriate entry is added to 'soc' node ranges, as well as "okay" status for 'bm' and 'bm-bppi' (internal SRAM) nodes. [gregory.clement@free-electrons.com: add suppport for the ClearFog board] Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14	ARM: dts: armada-38x: add buffer manager nodes	Marcin Wojtas	1	-0/+19
	Armada 38x network controller supports hardware buffer management (BM). Since it is now enabled in mvneta driver, appropriate nodes can be added to armada-38x.dtsi - for the actual common BM unit (bm@c8000) and its internal SRAM (bm-bppi), which is used for indirect access to buffer pointer ring residing in DRAM. Pools - ports mapping, bm-bppi entry in 'soc' node's ranges and optional parameters are supposed to be set in board files. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-14	misc: sram: add optional ioremap without write combining	Marcin Wojtas	2	-1/+9
	Some SRAM users may require non-bufferable access to the memory, which is impossible, because devm_ioremap_wc() is used for setting sram->virt_base. This commit adds optional flag 'no-memory-wc', which allow to choose remap method, using DT property. Documentation is updated accordingly. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	GSO/UDP: Use skb->len instead of udph->len to determine length of original skb	Alexander Duyck	1	-5/+10
	It is possible for tunnels to end up generating IP or IPv6 datagrams that are larger than 64K and expecting to be segmented. As such we need to deal with length values greater than 64K. In order to accommodate this we need to update the code to work with a 32b length value instead of a 16b one. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	ipv6: Pass proto to csum_ipv6_magic as __u8 instead of unsigned short	Alexander Duyck	18	-29/+21
	This patch updates csum_ipv6_magic so that it correctly recognizes that protocol is a unsigned 8 bit value. This will allow us to better understand what limitations may or may not be present in how we handle the data. For example there are a number of places that call htonl on the protocol value. This is likely not necessary and can be replaced with a multiplication by ntohl(1) which will be converted to a shift by the compiler. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	ipv4: Update parameters for csum_tcpudp_magic to their original types	Alexander Duyck	34	-150/+106
	This patch updates all instances of csum_tcpudp_magic and csum_tcpudp_nofold to reflect the types that are usually used as the source inputs. For example the protocol field is populated based on nexthdr which is actually an unsigned 8 bit value. The length is usually populated based on skb->len which is an unsigned integer. This addresses an issue in which the IPv6 function csum_ipv6_magic was generating a checksum using the full 32b of skb->len while csum_tcpudp_magic was only using the lower 16 bits. As a result we could run into issues when attempting to adjust the checksum as there was no protocol agnostic way to update it. With this change the value is still truncated as many architectures use "(len + proto) << 8", however this truncation only occurs for values greater than 16776960 in length and as such is unlikely to occur as we stop the inner headers at ~64K in size. I did have to make a few minor changes in the arm, mn10300, nios2, and score versions of the function in order to support these changes as they were either using things such as an OR to combine the protocol and length, or were using ntohs to convert the length which would have truncated the value. I also updated a few spots in terms of whitespace and type differences for the addresses. Most of this was just to make sure all of the definitions were in sync going forward. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	ipv4: Don't do expensive useless work during inetdev destroy.	David S. Miller	3	-2/+18
	When an inetdev is destroyed, every address assigned to the interface is removed. And in this scenerio we do two pointless things which can be very expensive if the number of assigned interfaces is large: 1) Address promotion. We are deleting all addresses, so there is no point in doing this. 2) A full nf conntrack table purge for every address. We only need to do this once, as is already caught by the existing masq_dev_notifier so masq_inet_event() can skip this. Reported-by: Solar Designer <solar@openwall.com> Signed-off-by: David S. Miller <davem@davemloft.net> Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
2016-03-13	macsec: introduce IEEE 802.1AE driver	Sabrina Dubroca	3	-0/+3305
	This is an implementation of MACsec/IEEE 802.1AE. This driver provides authentication and encryption of traffic in a LAN, typically with GCM-AES-128, and optional replay protection. http://standards.ieee.org/getieee802/download/802.1AE-2006.pdf Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	net: add MACsec netdevice priv_flags and helper	Sabrina Dubroca	1	-0/+8
	Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	uapi: add MACsec bits	Sabrina Dubroca	4	-0/+192
	Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	net: socket: use pr_info_once to tip the obsolete usage of PF_PACKET	liping.zhang	1	-6/+2
	There is no need to use the static variable here, pr_info_once is more concise. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	at803x: fix suspend/resume for SGMII link	Zefir Kurtisi	1	-0/+26
	When operating the at803x in SGMII mode, resuming the chip from power down brings up the copper-side link but leaves the SGMII link in unconnected state (tested with at8031 attached to gianfar). In effect, this caused a permanent link loss once the related interface was put down. This patch ensures that power down handling in supspend() and resume() is also applied to the SGMII link. Signed-off-by: Zefir Kurtisi <zefir.kurtisi@neratec.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	mlx5: use napi_consume_skb API to get bulk free operations	Jesper Dangaard Brouer	3	-4/+4
	Bulk free of SKBs happen transparently by the API call napi_consume_skb(). The napi budget parameter is needed by napi_consume_skb() to detect if called from netpoll. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	mlx4: use napi_consume_skb API to get bulk free operations	Jesper Dangaard Brouer	1	-6/+9
	Bulk free of SKBs happen transparently by the API call napi_consume_skb(). The napi budget parameter is usually needed by napi_consume_skb() to detect if called from netpoll. In this patch it has an extra meaning. For mlx4 driver, the mlx4_en_stop_port() call is done outside NAPI/softirq context, and cleanup the entire TX ring via mlx4_en_free_tx_buf(). The code mlx4_en_free_tx_desc() for freeing SKBs are shared with NAPI calls. To handle this shared use the zero budget indication is reused, and handled appropriately in napi_consume_skb(). To reflect this, variable is called napi_mode for the function call that needed this distinction. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	net: adjust napi_consume_skb to handle non-NAPI callers	Jesper Dangaard Brouer	1	-2/+2
	Some drivers reuse/share code paths that free SKBs between NAPI and non-NAPI calls. Adjust napi_consume_skb to handle this use-case. Before, calls from netpoll (w/ IRQs disabled) was handled and indicated with a budget zero indication. Use the same zero indication to handle calls not originating from NAPI/softirq. Simply handled by using dev_consume_skb_any(). This adds an extra branch+call for the netpoll case (checking in_irq() + irqs_disabled()), but that is okay as this is a slowpath. Suggested-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	r8169:Remove unnecessary phy reset for pcie nic when setting link spped.	Chun-Hao Lin	1	-1/+2
	For pcie nic, after setting link speed and there is no link driver does not need to do phy reset until link up. For some pcie nics, to do this will also reset phy speed down counter and prevent phy from auto speed down. This patch fix the issue reported in following link. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1547151 Signed-off-by: Chunhao Lin <hau@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	mlxsw: pci: Implement reset done check	Jiri Pirko	2	-4/+14
	Firmware now tells us that the reset is done by passing a magic value via register. Use it to shorten the wait in case this is supported. With old firmware, we still wait until the timeout is reached. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	sctp: allow sctp_transmit_packet and others to use gfp	Marcelo Ricardo Leitner	9	-72/+89
	Currently sctp_sendmsg() triggers some calls that will allocate memory with GFP_ATOMIC even when not necessary. In the case of sctp_packet_transmit it will allocate a linear skb that will be used to construct the packet and this may cause sends to fail due to ENOMEM more often than anticipated specially with big MTUs. This patch thus allows it to inherit gfp flags from upper calls so that it can use GFP_KERNEL if it was triggered by a sctp_sendmsg call or similar. All others, like retransmits or flushes started from BH, are still allocated using GFP_ATOMIC. In netperf tests this didn't result in any performance drawbacks when memory is not too fragmented and made it trigger ENOMEM way less often. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	ovs: allow nl 'flow set' to use ufid without flow key	Samuel Gauthier	1	-11/+17
	When we want to change a flow using netlink, we have to identify it to be able to perform a lookup. Both the flow key and unique flow ID (ufid) are valid identifiers, but we always have to specify the flow key in the netlink message. When both attributes are there, the ufid is used. The flow key is used to validate the actions provided by the userland. This commit allows to use the ufid without having to provide the flow key, as it is already done in the netlink 'flow get' and 'flow del' path. The flow key remains mandatory when an action is provided. Signed-off-by: Samuel Gauthier <samuel.gauthier@6wind.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	net: macb: fix default configuration for GMAC on AT91	Nicolas Ferre	2	-7/+8
	On AT91 SoCs, the User Register (USRIO) exposes a switch to configure the "Reduced" or "Traditional" version of the Media Independent Interface (RMII vs. MII or RGMII vs. GMII). As on the older EMAC version, on GMAC, this switch is set by default to the non-reduced type of interface, so use the existing capability and extend it to GMII as well. We then keep the current logic in the macb_init() function. The capabilities of sama5d2, sama5d4 and sama5d3 GEM interface are updated in the macb_config structure to be able to properly enable them with a traditional interface (GMII or MII). Reported-by: Romain HENRIET <romain.henriet@l-acoustics.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	phy: remove documentation of removed members of phy_device structure	LABBE Corentin	1	-3/+0
	Commit e5a03bfd873c ("phy: Add an mdio_device structure") removed addr, bus and dev member of the phy_device structure. This patch remove the documentation about those members. Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	xen-netback: reduce log spam	Paul Durrant	1	-2/+0
	Remove the "prepare for reconnect" pr_info in xenbus.c. It's largely uninteresting and the states of the frontend and backend can easily be observed by watching the (o)xenstored log. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	xen-netback: support multiple extra info fragments passed from frontend	Paul Durrant	2	-23/+43
	The code does not currently support a frontend passing multiple extra info fragments to the backend in a tx request. The xenvif_get_extras() function handles multiple extra_info fragments but make_tx_response() assumes there is only ever a single extra info fragment. This patch modifies xenvif_get_extras() to pass back a count of extra info fragments, which is then passed to make_tx_response() (after possibly being stashed in pending_tx_info for deferred responses). Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	xen-netback: re-import canonical netif header	Paul Durrant	1	-95/+766
	The canonical netif header (in the Xen source repo) and the Linux variant have diverged significantly. Recently much documentation has been added to the canonical header which is highly useful for developers making modifications to either xen-netfront or xen-netback. This patch therefore re-imports the canonical header in its entirity. To maintain compatibility and some style consistency with the old Linux variant, the header was stripped of its emacs boilerplate, and post-processed and copied into place with the following commands: ed -s netif.h << EOF H ,s/NETTXF_/XEN_NETTXF_/g ,s/NETRXF_/XEN_NETRXF_/g ,s/NETIF_/XEN_NETIF_/g ,s/XEN_XEN_/XEN_/g ,s/netif/xen_netif/g ,s/xen_xen_/xen_/g ,s/^typedef.*$//g ,s/^ /${TAB}/g w $ w EOF indent --line-length 80 --linux-style netif.h \ -o include/xen/interface/io/netif.h Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	netconf: add macro to represent all attributes	Zhang Shengju	3	-32/+45
	This patch adds macro NETCONFA_ALL to represent all type of netconf attributes for IPv4 and IPv6. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	sctp: fix the transports round robin issue when init is retransmitted	Xin Long	2	-2/+2
	prior to this patch, at the beginning if we have two paths in one assoc, they may have the same params other than the last_time_heard, it will try the paths like this: 1st cycle try trans1 fail. then trans2 is selected.(cause it's last_time_heard is after trans1). 2nd cycle: try trans2 fail then trans2 is selected.(cause it's last_time_heard is after trans1). 3rd cycle: try trans2 fail then trans2 is selected.(cause it's last_time_heard is after trans1). .... trans1 will never have change to be selected, which is not what we expect. we should keeping round robin all the paths if they are just added at the beginning. So at first every tranport's last_time_heard should be initialized 0, so that we ensure they have the same value at the beginning, only by this, all the transports could get equal chance to be selected. Then for sctp_trans_elect_best, it should return the trans_next one when trans == trans_next, so that we can try next if it fails, but now it always return trans. so we can fix it by exchanging these two params when we calls sctp_trans_elect_tie(). Fixes: 4c47af4d5eb2 ('net: sctp: rework multihoming retransmission path selection to rfc4960') Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	rxrpc: Replace all unsigned with unsigned int	David Howells	8	-39/+39
	Replace all "unsigned" types with "unsigned int" types. Reported-by: David Miller <davem@davemloft.net> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	csum: Update csum_block_add to use rotate instead of byteswap	Alexander Duyck	1	-6/+6
	The code for csum_block_add was doing a funky byteswap to swap the even and odd bytes of the checksum if the offset was odd. Instead of doing this we can save ourselves some trouble and just shift by 8 as this should have the same effect in terms of the final checksum value and only requires one instruction. In addition we can update csum_block_sub to just use csum_block_add with a inverse value for csum2. This way we follow the same code path as csum_block_add without having to duplicate it. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-13	gro: Defer clearing of flush bit in tunnel paths	Alexander Duyck	4	-11/+5
	This patch updates the GRO handlers for GRE, VXLAN, GENEVE, and FOU so that we do not clear the flush bit until after we have called the next level GRO handler. Previously this was being cleared before parsing through the list of frames, however this resulted in several paths where either the bit needed to be reset but wasn't as in the case of FOU, or cases where it was being set as in GENEVE. By just deferring the clearing of the bit until after the next level protocol has been parsed we can avoid any unnecessary bit twiddling and avoid bugs. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-12	rocker: move ageing_time from struct rocker to struct ofdpa	Jiri Pirko	3	-7/+7
	This is OF-DPA specific, used only there, similar to ofdpa_port->ageing_time. So move it to OF-DPA code. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11	qed: Enlrage the drain timeout	Yuval Mintz	1	-2/+2
	In the scenario where slowpath configuration isn't passing due to various pause configurations affecting the chip, the theoretical time required in worst-case-scenario to empty hw fifos sufficiently to guarantee that slowpath configuration would flow is currently insufficient. This increases such a drain request to the theoretical maximum. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11	qed: Notify of transciever changes	Zvi Nachmani	2	-0/+41
	Handle a new message from the MFW, one that indicate that the transciever state has changed, and log that into the system logs. Signed-off-by: Zvi Nachmani <Zvi.Nachmani@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11	qed: Major changes to MB locking	Tomer Tayar	2	-97/+167
	Driver interaction with the managemnt firmware is done via mailbox commands which the management firmware periodically sample, as well as placing of additional data in set places in the shared memory. Each PF has a single designated mailbox address, and all flows that require messaging to the management should use it. This patch does 2 things: 1. It re-defines the critical section surrounding the mailbox sending - that section should include the setting of the shared memory as well as the sending of the command [otherwise a race might send a command with the data of a different command]. 2. It moves the locking scheme from using mutices into using spinlocks. This lays the groundwork for sending MFW commands from non-sleepable contexts. Signed-off-by: Tomer Tayar <Tomer.Tayar@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11	qed: Prevent MF link notifications	Sudarsana Reddy Kalluru	2	-1/+9
	When device is configured for Multi-function mode, some older management firmware might incorrectly notify interfaces of link changes while they haven't requested the physical link configuration to be set. This can create bizzare race conditions where unloading interfaces are getting notified that the link is up. Let the driver compensate - store the logical requested state of the link and don't propagate notifications after protocol driver explicitly requires the link to be unset. Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11	bpf: support flow label for bpf_skb_{set, get}_tunnel_key	Daniel Borkmann	2	-2/+13
	This patch extends bpf_tunnel_key with a tunnel_label member, that maps to ip_tunnel_key's label so underlying backends like vxlan and geneve can propagate the label to udp_tunnel6_xmit_skb(), where it's being set in the IPv6 header. It allows for having 20 more bits to encode/decode flow related meta information programmatically. Tested with vxlan and geneve. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11	geneve: support setting IPv6 flow label	Daniel Borkmann	2	-8/+28
	This work adds support for setting the IPv6 flow label for geneve per device and through collect metadata (ip_tunnel_key) frontends. Also here, the geneve dst cache does not need any special considerations, for the cases where caches can be used, the label is static per cache. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11	vxlan: support setting IPv6 flow label	Daniel Borkmann	3	-5/+23
	This work adds support for setting the IPv6 flow label for vxlan per device and through collect metadata (ip_tunnel_key) frontends. The vxlan dst cache does not need any special considerations here, for the cases where caches can be used, the label is static per cache. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11	ip_tunnel: add support for setting flow label via collect metadata	Daniel Borkmann	7	-10/+15
	This patch extends udp_tunnel6_xmit_skb() to pass in the IPv6 flow label from call sites. Currently, there's no such option and it's always set to zero when writing ip6_flow_hdr(). Add a label member to ip_tunnel_key, so that flow-based tunnels via collect metadata frontends can make use of it. vxlan and geneve will be converted to add flow label support separately. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11	cisco: enic: Update logging macros and uses	Joe Perches	6	-37/+43
	Don't hide varibles used by the logging macros. Miscellanea: o Use the more common ##__VA_ARGS__ extension o Add missing newlines to formats o Realign arguments Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11	bridge: allow zero ageing time	Stephen Hemminger	2	-7/+8
	This fixes a regression in the bridge ageing time caused by: commit c62987bbd8a1 ("bridge: push bridge setting ageing_time down to switchdev") There are users of Linux bridge which use the feature that if ageing time is set to 0 it causes entries to never expire. See: https://www.linuxfoundation.org/collaborate/workgroups/networking/bridge For a pure software bridge, it is unnecessary for the code to have arbitrary restrictions on what values are allowable. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11	rocker: set FDB cleanup timer according to lowest ageing time	Ido Schimmel	3	-1/+7
	In rocker, ageing time is a per-port attribute, so the next time the FDB cleanup timer fires should be set according to the lowest ageing time. This will later allow us to delete the BR_MIN_AGEING_TIME macro, which was added to guarantee minimum ageing time in the bridge layer, thereby breaking existing behavior. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11	mlxsw: spectrum: Check requested ageing time is valid	Ido Schimmel	2	-2/+9
	Commit c62987bbd8a1 ("bridge: push bridge setting ageing_time down to switchdev") added a check for minimum and maximum ageing time, but this breaks existing behaviour where one can set ageing time to 0 for a non-learning bridge. Push this check down to the driver and allow the check in the bridge layer to be removed. Currently ageing time 0 is refused by the driver, but we can later add support for this functionality. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11	macvtap: always pass ethernet header in linear	Willem de Bruijn	1	-3/+6
	The stack expects link layer headers in the skb linear section. Macvtap can create skbs with llheader in frags in edge cases: when (IFF_VNET_HDR is off or vnet_hdr.hdr_len < ETH_HLEN) and prepad + len > PAGE_SIZE and vnet_hdr.flags has no or bad csum. Add checks to ensure linear is always at least ETH_HLEN. At this point, len is already ensured to be >= ETH_HLEN. For backwards compatiblity, rounds up short vnet_hdr.hdr_len. This differs from tap and packet, which return an error. Fixes b9fb9ee07e67 ("macvtap: add GSO/csum offload support") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-11	net/flower: Fix pointer cast	Amir Vadai	2	-7/+7
	Cast pointer to unsigned long instead of u64, to fix compilation warning on 32 bit arch, spotted by 0day build. Fixes: 5b33f48 ("net/flower: Introduce hardware offload support") Signed-off-by: Amir Vadai <amir@vadai.me> Signed-off-by: David S. Miller <davem@davemloft.net>