Age | Commit message | Author | Files | Lines
2015-12-15net: diag: Support destroying TCP sockets.Lorenzo Colitti6-0/+68
This implements SOCK_DESTROY for TCP sockets. It causes all blocking calls on the socket to fail fast with ECONNABORTED and causes a protocol close of the socket. It informs the other end of the connection by sending a RST, i.e., initiating a TCP ABORT as per RFC 793. ECONNABORTED was chosen for consistency with FreeBSD. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15net: diag: Support SOCK_DESTROY for inet sockets.Lorenzo Colitti2-8/+19
This passes the SOCK_DESTROY operation to the underlying protocol diag handler, or returns -EOPNOTSUPP if that handler does not define a destroy operation. Most of this patch is just renaming functions. This is not strictly necessary, but it would be fairly counterintuitive to have the code to destroy inet sockets be in a function whose name starts with inet_diag_get. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
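The dispatch rule described above (hand the operation to the protocol handler, or fail with -EOPNOTSUPP when the handler has no destroy operation) can be sketched in plain userspace C. This is only an illustrative model: struct demo_diag_handler, enum demo_cmd and every demo_* function are hypothetical names and are not the kernel's inet_diag structures or functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-protocol diag handler table. */
struct demo_diag_handler {
	int (*dump_one)(int sock_id);
	int (*destroy)(int sock_id);	/* NULL when the protocol cannot destroy */
};

enum demo_cmd { DEMO_CMD_GET, DEMO_CMD_DESTROY };

static int demo_dump(int sock_id) { (void)sock_id; return 0; }
static int demo_tcp_destroy(int sock_id) { (void)sock_id; return 0; }

/* One handler that implements destroy, one that does not. */
const struct demo_diag_handler demo_tcp = { demo_dump, demo_tcp_destroy };
const struct demo_diag_handler demo_udp = { demo_dump, NULL };

/* Route a command to the handler; destroy is optional per protocol. */
int demo_diag_cmd_exact(const struct demo_diag_handler *h,
			enum demo_cmd cmd, int sock_id)
{
	assert(h != NULL);

	switch (cmd) {
	case DEMO_CMD_GET:
		return h->dump_one(sock_id);
	case DEMO_CMD_DESTROY:
		if (!h->destroy)
			return -EOPNOTSUPP;
		return h->destroy(sock_id);
	}
	return -EINVAL;
}
```

With this shape, a single entry point serves both the get and the destroy commands, which is presumably why a name starting with inet_diag_get would have been misleading.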
2015-12-15net: diag: Add the ability to destroy a socket.Lorenzo Colitti4-3/+24
This patch adds a SOCK_DESTROY operation, a destroy function pointer to sock_diag_handler, and a diag_destroy function pointer. It does not include any implementation code. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15net: diag: split inet_diag_dump_one_icsk into twoLorenzo Colitti2-15/+32
Currently, inet_diag_dump_one_icsk finds a socket and then dumps its information to userspace. Split it into a part that finds the socket and a part that dumps the information. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15Merge branch 'ila-early-demux'David S. Miller13-81/+988
Tom Herbert says:

====================
ila: Optimization to preserve value of early demux

In the current implementation of ILA, LWT is used to perform translation on both the input and output paths. This is functional, however there is a big performance hit in the receive path. Early demux occurs before the routing lookup (a hit actually obviates the route lookup). Therefore the stack currently performs early demux before translation so that a local connection with ILA addresses is never matched. Note that this issue is not just with ILA, but pretty much any translated or encapsulated packet handled by LWT would miss the opportunity for early demux. Solving the general problem seems non-trivial since we would need to move the route lookup before early demux, thereby mitigating the value.

This patch set addresses the issue for ILA by adding a fast locator lookup that occurs before early demux. This is done by hooking into NF_INET_PRE_ROUTING.

For the backend we implement an rhashtable that contains identifier to locator mappings. The table also allows more specific matches that include original locator and interface.

This patch set:
- Add an rhashtable function to atomically replace an element. This is useful to implement sub-trees from a table entry without needing to use a special anchor structure as the table entry.
- Add a start callback for starting a netlink dump.
- Creates an ila directory under net/ipv6 and moves ila.c to it. ila.c is split into ila_common.c and ila_lwt.c.
- Implement a table to do identifier->locator mapping. This is an rhashtable (in ila_xlat.c).
- Configuration for the table with netlink.
- Add a hook into NF_INET_PRE_ROUTING to perform ILA translation before early demux.

Changes in v2:
- Use iptables targets instead of a new xfrm function

Changes in v3:
- Add __rcu to next pointer in struct ila_map

Changes in v4:
- Use hook for NF_INET_PRE_ROUTING

Changed in v5:
- Register hooks per namespace using nf_register_net_hooks
- Only register hooks when first mapping is actually added

Changed in v6:
- Remove gfp argument in alloc_ila_locks, it is unnecessary
- Set registered_hooks properly when hooks are registered

Testing:

Running 200 netperf TCP_RR streams

No ILA, baseline
  79.26% CPU utilization
  1678282 tps
  104/189/390 50/90/99% latencies

ILA before fix (LWT on both input and output)
  81.91% CPU utilization
  1464723 tps (-14.5% from baseline)
  121/215/411 50/90/99% latencies

ILA after fix
  80.62% CPU utilization
  1622985 tps (-3.4% from baseline)
  110/191/347 50/90/99% latencies
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15ila: Add generic ILA translation facilityTom Herbert6-1/+731
This patch implements an ILA translation table. This table can be configured with identifier to locator mappings, and can be queried to resolve a mapping. Queries can be parameterized based on interface, direction (incoming or outgoing), and matching locator. The table is implemented using rhashtable and is configured via netlink (through "ip ila .." in iproute). The table may be used as an alternative means to do ILA translations other than the lw tunnels. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15netlink: add a start callback for starting a netlink dumpTom Herbert4-0/+24
The start callback allows the caller to set up a context for the dump callbacks. Presumably, the context can then be destroyed in the done callback. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15rhashtable: add function to replace an elementTom Herbert1-0/+82
Add the rhashtable_replace_fast function. This replaces one object in the table with another atomically. The hashes of the new and old objects must be equal. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15ila: Create net/ipv6/ila directoryTom Herbert5-81/+152
Create ila directory in preparation for supporting other hooks in the kernel than LWT for doing ILA. This includes: - Moving ila.c to ila/ila_lwt.c - Splitting out some common functions into ila_common.c Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15Merge branch 'stmmac-mdio-compat'David S. Miller8-28/+66
Phil Reid says:

====================
stmmac: create of compatible mdio bus for stmmac driver

Provide ability to specify a fixed phy in the device tree and retain the mdio bus if no phy is found. This is needed where a dsa is connected via a fixed phy and uses the mdio bus for config.

Fixed ptp ref clock calculations for the stmmac when ptp ref clock is running at <= 50MHz. Also add device tree setting to config ptp clk source on socfpga platforms.

Changes from V5:
- Restore the behaviour of unregistering the mdio bus when no phys are found, if there is no device tree node to create the bus.
- Modify condition to allocate mdio_base_data conditional on fixed phy presence as well. Maintains existing behaviour in conditions where a fixed phy is not present.

Changes from V4:
- Restore #ifdef CONFIG_OF around setting of reset_gpio. Member doesn't exist when this isn't defined.

Changes from V3:
- Use if (IS_ENABLED(CONFIG_OF)) instead of #if. Reorder some code to reduce if statements.
- of_mdiobus_register already falls back to mdiobus_register
- Tested on system with CONFIG_OF

Changes from V2:
- Formatting, spaces & lines > 80 chars. Using checkpatch
- Drop PTP register debugfs patch.

Changes from V1:
- Fixed mismatch doc / code for ptp_ref_clk dt node.
- Remove unit address from doc example.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15stmmac: socfpga: Provide dt node to config ptp clk source.Phil Reid2-0/+11
Provides an option to use the ptp clock routed from the Altera FPGA fabric, instead of the default eosc1 clock connected to the ARM HPS core. This setting affects all emacs in the core as the ptp clock is common. Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Phil Reid <preid@electromag.com.au> Acked-by: Dinh Nguyen <dinguyen@opensource.altera.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15stmmac: Fix calculations for ptp counters when clock input = 50Mhz.Phil Reid3-15/+15
stmmac_config_sub_second_increment set the sub second increment to 20ns. The driver is configured to use the fine adjustment method, where the sub second register is incremented when the accumulator, which is incremented by the addend register, overflows. This accumulator is updated on every ptp clk cycle. If a ptp clk with a period of greater than 20ns was used, the sub second register would not get updated correctly. Instead set the sub sec increment to twice the period of the ptp clk. This results in the addend register being set mid range and overflowing the accumulator every 2 clock cycles. Signed-off-by: Phil Reid <preid@electromag.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
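The arithmetic behind the fine adjustment method can be worked through in a few lines of C. This is a sketch under stated assumptions, not the driver code: the accumulator is assumed to be 32 bits wide, and demo_sub_second_inc_ns and demo_addend are hypothetical helper names. The accumulator gains the addend once per ptp clk cycle, so it overflows ptp_clk * addend / 2^32 times per second, and each overflow should advance time by the increment.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000ULL

/* Sub-second increment in ns, chosen as twice the ptp clk period. */
uint32_t demo_sub_second_inc_ns(uint32_t ptp_clk_hz)
{
	assert(ptp_clk_hz > 0);
	return (uint32_t)(2 * (DEMO_NSEC_PER_SEC / ptp_clk_hz));
}

/*
 * Addend needed so that a 32-bit accumulator overflows once per
 * inc_ns nanoseconds:  addend = 2^32 * (1e9 / inc_ns) / ptp_clk_hz.
 * Returned as 64 bits so callers can see when it would not fit in a
 * 32-bit register.
 */
uint64_t demo_addend(uint32_t ptp_clk_hz, uint32_t inc_ns)
{
	uint64_t overflow_hz;

	assert(ptp_clk_hz > 0 && inc_ns > 0);
	overflow_hz = DEMO_NSEC_PER_SEC / inc_ns;
	return (overflow_hz << 32) / ptp_clk_hz;
}
```

For a 50 MHz ptp clk the period is 20ns, so the increment becomes 40ns and the addend is 2^31, the middle of the 32-bit range. With the old fixed 20ns increment the same clock would need an addend of 2^32, which no longer fits in 32 bits, and slower clocks would need even larger values.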
2015-12-15stmmac: Correct documentation on stmmac clocks.Phil Reid1-9/+8
devm_clk_get looks in the clock-names property for a matching clock. The ptp_ref_clk property is ignored. Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Phil Reid <preid@electromag.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15stmmac: create of compatible mdio bus for stmmac driverPhil Reid3-4/+32
The DSA driver needs to be passed a reference to an mdio bus. Typically the mac is configured to use a fixed link but the mdio bus still needs to be registered so that it can configure the switch. This patch follows the same process as the altera tse ethernet driver for creation of the mdio bus. Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Phil Reid <preid@electromag.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15Merge branch 'end-of-ip-csum'David S. Miller41-114/+437
Tom Herbert says:

====================
net: The beginning of the end for NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM

Background:

This patch set starts to address one front in the battle against protocol ossification. Protocol ossification describes the state that we have arrived at in the evolution of the Internet where we are materially limited to only using a very narrow range of protocols and protocol features. For instance, only TCP and UDP are sufficiently supported on the Internet, so deploying alternative protocols, such as SCTP and DCCP, is a non-starter. Similarly, IP options and IPv6 extension headers are typically not considered feasible for wide deployment, so we have lost the extensibility of IP protocols.

Protocol ossification is not only a problem on the Internet, but in the data center as well. A root cause of this seems to be narrow, protocol specific optimizations implemented in switches (for doing ECMP) and in NICs (NIC offloads). These tend to be performance optimizations around TCP and UDP packets, and these have become requirements to implement performant network solutions at scale.

Attempts to deal with protocol ossification in the data center have yielded ad hoc, sub-optimal solutions. A main driver of foo-over-UDP (e.g. GRE/UDP, MPLS/UDP) is to leverage the existing ECMP and RSS support for UDP by setting the source port as an entropy value. This has seen some success, but the cost of additional overhead and layering limits its usefulness. An even more extreme solution is STT where non-TCP packets are spoofed as TCP to leverage NIC offloads.

This patch set endeavours to address protocol ossification caused by techniques used in transmit checksum offload for NICs. Future work will address protocol ossification in the other primary NIC offloads, namely receive checksum offload, LSO, LRO, and RSS.

NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM:

NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM exemplify the problem of protocol ossification. These features are relics from a simpler time in the Internet, before encapsulation, before GRE and IPIP. Many hardware vendors only saw the need to provide checksum offload for simple UDP and TCP packets over IPv4 (IPv6 support is an afterthought also). In today's Internet and data centers, checksum offload is well established as a valuable feature, but we can no longer afford to be constrained to use a handful of protocols and features that are supported at the discretion of NIC vendors. Generic and protocol agnostic methods are needed.

The actual interface that the stack uses with drivers for checksum offload is CHECKSUM_PARTIAL. This is a generic and protocol agnostic interface. A driver for a device that supports this generic interface advertises NETIF_F_HW_CSUM.

Goals of this patch set:

We propose that drivers advertise NETIF_F_HW_CSUM instead of the protocol specific values NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM. If the driver's device is constrained (for instance it can only offload simple IPv4 and IPv6 packets) then these constraints can be checked in the transmit path and skb_checksum_help would be called for packets that the driver is unable to offload. In order to facilitate this, we add some helper functions that take a specification argument indicating the type of packets a device is able to offload. If a packet does not match the specification, the helper function calls skb_checksum_help.

Benefits of this approach are:
- Simplify the stack and clarify the interface for checksum offload
- Encourage NIC vendors to implement the generic, protocol agnostic checksum offload methods in hardware
- Encourage feature parity in NIC offloads for IPv4 and IPv6

Many drivers advertise NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM and it probably isn't feasible to convert them all in a given time frame (although if we could this would be a great simplification to the stack). A reasonable direction may be to declare that new drivers must use NETIF_F_HW_CSUM as NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM are considered deprecated.

There is a class of drivers that should now be converted to advertise NETIF_F_HW_CSUM, namely those that support offload of encapsulated checksums. These drivers have to date been using skb->encapsulation to infer that checksum offload is being performed for an encapsulated checksum. This is strictly not correct. skb->encapsulation indicates that the inner headers are valid in the skbuff, whereas the stack indicates checksum offload arguments exclusively in csum_start and csum_offset. At some point we may want to set the inner headers for an skbuff but offload the outer transport checksum, so this needs to be fixed.

In this patch set:
- Rename some of the constants involved in checksum offload to be more reflective of their function
- Eliminate NETIF_F_GEN_CSUM and NETIF_F_V[46]_CSUM entirely as unnecessary convolutions
- Fix conditions in tcp_sendpage and tcp_sendmsg to take IP protocol into account when determining if checksum offload can be done
- Add driver helper functions for determining if a checksum can be offloaded to a device. If not, the helper function can call skb_checksum_help
- Document the checksum offload interface between the stack and drivers with detail and specifics

Testing:

Have been testing ixgbe and mlx4. No noticeable regressions seen yet.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15net: Elaborate on checksum offload interface descriptionTom Herbert1-29/+109
Add specifics and details to the description of the interface between the stack and drivers for doing checksum offload. This description is meant to be as specific and complete as possible. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15net: Add driver helper functions to determine checksum offloadabilityTom Herbert2-0/+214
Add skb_csum_offload_chk driver helper function to determine if a device with limited checksum offload capabilities is able to offload the checksum for a given packet. This patch includes: - The skb_csum_offload_chk function. Returns true if checksum is offloadable, else false. Optionally, in the case that the checksum is not offloadable, the function can call skb_checksum_help to resolve the checksum. skb_csum_offload_chk also returns whether the checksum refers to an encapsulated checksum. - Definition of skb_csum_offl_spec structure that caller uses to indicate rules about what it can offload (e.g. IPv4/v6, TCP/UDP only, whether encapsulated checksums can be offloaded, whether checksum with IPv6 extension headers can be offloaded). - Ancillary functions called skb_csum_offload_chk_help, skb_csum_off_chk_help_cmn, skb_csum_off_chk_help_cmn_v4_only. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
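The idea of checking a packet against a device's offload rules, with an optional software fallback, can be modelled in userspace C as below. This is a simplified sketch only: struct demo_pkt, struct demo_spec, their fields and demo_csum_offload_chk are hypothetical and do not reproduce the kernel's skb_csum_offl_spec layout or the real helper's signature. The software fallback is represented by a flag rather than an actual call to skb_checksum_help.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the properties the check cares about. */
struct demo_pkt {
	bool ipv6;			/* false means IPv4 */
	bool tcp;
	bool udp;
	bool encapsulated;
	bool ipv6_ext_hdrs;
	bool csum_resolved_in_sw;	/* set when the fallback ran */
};

/* Hypothetical device rules: what the hardware can offload. */
struct demo_spec {
	bool ipv4_okay;
	bool ipv6_okay;
	bool tcp_okay;
	bool udp_okay;
	bool encap_okay;
	bool ext_hdrs_okay;
};

/*
 * Return true if the device can offload this packet's checksum.
 * If it cannot and 'help' is set, mark the checksum as resolved in
 * software (standing in for the software fallback).
 */
bool demo_csum_offload_chk(struct demo_pkt *p, const struct demo_spec *s,
			   bool help)
{
	bool ok;

	assert(p != NULL && s != NULL);

	ok = (p->ipv6 ? s->ipv6_okay : s->ipv4_okay) &&
	     ((p->tcp && s->tcp_okay) || (p->udp && s->udp_okay)) &&
	     (!p->encapsulated || s->encap_okay) &&
	     (!p->ipv6_ext_hdrs || s->ext_hdrs_okay);

	if (!ok && help)
		p->csum_resolved_in_sw = true;

	return ok;
}
```

A driver-like caller would run this check in its transmit path and only program the hardware checksum fields when it returns true.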
2015-12-15tcp: Fix conditions to determine checksum offloadTom Herbert2-2/+11
In tcp_sendpage and tcp_sendmsg we check the route capabilities to determine if checksum offload can be performed. This check currently does not take the IP protocol into account for devices that advertise only one of NETIF_F_IPV6_CSUM or NETIF_F_IP_CSUM. This patch adds a function to check capabilities for checksum offload with a socket called sk_check_csum_caps. This function checks for specific IPv4 or IPv6 offload support based on the family of the socket. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
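A family-aware capability check of the kind described here might look like the following userspace sketch. The DEMO_F_* bit values and demo_check_csum_caps are hypothetical; the real NETIF_F_* flags have different values and the real function operates on a struct sock.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/socket.h>		/* AF_INET, AF_INET6 */

/* Hypothetical capability bits; the real NETIF_F_* values differ. */
#define DEMO_F_HW_CSUM		(1u << 0)	/* generic offload */
#define DEMO_F_IP_CSUM		(1u << 1)	/* IPv4 TCP/UDP only */
#define DEMO_F_IPV6_CSUM	(1u << 2)	/* IPv6 TCP/UDP only */

/* Can a socket of this address family use checksum offload? */
bool demo_check_csum_caps(unsigned int route_caps, int family)
{
	assert(family == AF_INET || family == AF_INET6);

	/* Generic offload covers every family. */
	if (route_caps & DEMO_F_HW_CSUM)
		return true;

	/* Otherwise the protocol specific bit must match the family. */
	if (family == AF_INET)
		return (route_caps & DEMO_F_IP_CSUM) != 0;
	return (route_caps & DEMO_F_IPV6_CSUM) != 0;
}
```

The case the commit message calls out is the second branch: a device advertising only the IPv4 bit must not be treated as offload capable for an IPv6 socket.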
2015-12-15net: Eliminate NETIF_F_GEN_CSUM and NETIF_F_V[46]_CSUMTom Herbert13-39/+50
These netif flags are unnecessary convolutions. It is more straightforward to just use NETIF_F_HW_CSUM, NETIF_F_IP_CSUM, and NETIF_F_IPV6_CSUM directly. This patch also: - Cleans up can_checksum_protocol - Simplifies netdev_intersect_features Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15net: Rename NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASKTom Herbert25-39/+43
The name NETIF_F_ALL_CSUM is a misnomer. This does not correspond to the set of features for offloading all checksums. This is a mask of the checksum offload related feature bits. It is incorrect to set both NETIF_F_HW_CSUM and NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM at the same time for features of a device. This patch: - Changes instances of NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK (where NETIF_F_ALL_CSUM is being used as a mask). - Changes bonding, sfc/efx, ipvlan, macvlan, vlan, and team drivers to use NETIF_F_HW_CSUM in features list instead of NETIF_F_ALL_CSUM. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15fcoe: Use CHECKSUM_PARTIAL to indicate CRC offloadTom Herbert1-1/+1
When setting up CRC offload set ip_summed to CHECKSUM_PARTIAL instead of CHECKSUM_UNNECESSARY. This is consistent with the definition of CHECKSUM_PARTIAL. The only driver that seems to be advertising NETIF_F_FCOE_CRC is ixgbe. AFAICT the driver does not look at ip_summed for FCOE and just assumes that CRC is being offloaded. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15sctp: Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRCTom Herbert10-14/+14
The SCTP checksum is really a CRC and is very different from the standard 1's complement checksum that serves as the checksum for IP protocols. This offload interface is also very different. Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC to highlight these differences. The term CSUM should be reserved in the stack to refer to the standard 1's complement IP checksum. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
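For readers less familiar with the distinction, the standard 1's complement Internet checksum (as described in RFC 1071) is a folded 16-bit sum, not a polynomial CRC. A minimal userspace sketch follows; demo_inet_checksum is a hypothetical name and this is not the kernel's optimized implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * 1's complement checksum over a byte buffer, treating the data as
 * big-endian 16-bit words. Returns the complemented sum in host order.
 */
uint16_t demo_inet_checksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;

	/* Keep the 32-bit running sum far from overflow. */
	assert(data != NULL && len <= 65535);

	while (len > 1) {
		sum += ((uint32_t)data[0] << 8) | data[1];
		data += 2;
		len -= 2;
	}
	if (len)			/* odd trailing byte, zero padded */
		sum += (uint32_t)data[0] << 8;

	while (sum >> 16)		/* fold carries back into 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

Because the sum is commutative and can be updated incrementally, it suits the csum_start/csum_offset style of offload, whereas a CRC such as SCTP's has to be computed by a different mechanism.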
2015-12-15net: Add skb_inner_transport_offset functionTom Herbert1-0/+5
Same thing as skb_transport_offset but returns the offset of the inner transport header (when skb->encapsulation is set). Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15ravb: Add fixed-link supportKazuya Mizuguchi1-0/+12
This patch adds support of the fixed PHY. This patch is based on commit 87009814cdbb ("ucc_geth: use the new fixed PHY helpers"). Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com> Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15Merge branch 'mlxsw-bridge-vlan-offloading'David S. Miller14-162/+1001
Ido Schimmel says: ==================== This patchset introduces support for the offloading of 802.1D bridges between VLAN devices. These can either be VLAN devices configured on top of the physical ports or on top of LAG devices. Patches 1-2 deal with the necessary infrastructure changes needed in order to enable the above. The main change is that switchdev drivers can now know the device from which the switchdev op originated from. Patches 3-10 lay the groundwork for 802.1D bridges support in the mlxsw driver, with patch 4 doing most of the heavy lifting. Patch 11 finally offloads these bridges to hardware by listening to the notifications sent when the VLAN device joins or leaves a bridge. It is very similar to the already existing 802.1Q bridge we support. Patches 12-14 add minor modifications to allow one to bridge a VLAN device configured on top of LAG. ==================== Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15mlxsw: spectrum: Add support for VLAN devices on top of LAGIdo Schimmel1-3/+29
When creating a VLAN device on top of LAG, we are basically creating a vPort on top of each of the member port netdevs in the LAG. Therefore, these vPorts should inherit both the LAG status and LAG ID from the underlying port netdevs. In addition, when the VLAN device joins or leaves a bridge each of the underlying vPorts should know about it and act accordingly. This is achieved by propagating the VLAN event down to the lower devices. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15mlxsw: spectrum: Enable FDB records for VLAN devices on top of LAGIdo Schimmel1-7/+14
When adding or removing FDB records of VLAN devices on top of LAG we should set the lag_vid parameter to the VLAN ID of the VLAN device. It is reserved otherwise. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15mlxsw: reg: Add lag_vid field to SFD registerIdo Schimmel2-2/+10
Unicast LAG records in the Switch Filtering Database (SFD) register have a lag_vid field indicating the VLAN ID in case of vFIDs. This field is no longer reserved since we are going to add support for VLAN devices on top of LAG. Add the lag_vid field to be used by VLAN devices on top of LAG. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15mlxsw: spectrum: Add support for VLAN devices bridgingIdo Schimmel3-1/+378
All the member VLAN devices in a bridge need to share the same vFID. To achieve that, expand the vFID struct to include the associated bridge device (or lack thereof) and allow one to look up a vFID based on a bridge device. When joining a bridge, look up the relevant vFID or create one if none exists. Next, make the VLAN device use the vFID. Leaving a bridge can either occur because a user removed the VLAN device from a bridge or because the VLAN device was deleted by the user. In the latter case the bridge's teardown sequence is invoked after the hardware vPort is already gone. Therefore, when unlinking the VLAN device from the real device, check if the associated vPort is bridged and act accordingly. The bridge's notification will be ignored in this case. Note that bridging a VLAN interface with an ordinary port netdev is currently not supported, but not forbidden. This will be addressed in a follow-up patchset. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15mlxsw: spectrum: Handle VLAN devices linking / unlinkingIdo Schimmel1-3/+51
When a VLAN interface is configured on top of a physical port we should associate the VLAN device with the matching vPort. Likewise, when it's removed, we should revert back to the underlying port netdev. While not a must, this is consistent with port netdevs and also provides a more accurate error printing via netdev_err() and friends. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15mlxsw: spectrum: Adjust FDB notifications for VLAN devicesIdo Schimmel2-4/+61
FDB notifications contain the FID and port (or LAG ID) on which the MAC was learned. In the case of the 802.1Q bridge one can easily derive the matching VID, as FID equals VID, and generate the appropriate notification for the software bridge. With VLAN devices this is no longer the case, as these are associated with a vFID. Solve that by converting the FID to a vFID and looking up the matching VLAN device. From that derive the VID and whether learning (and learning sync) should occur. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15mlxsw: spectrum: Adjust switchdev ops for VLAN devicesIdo Schimmel2-4/+98
switchdev ops can now be called for VLAN devices and we need to be prepared for it. Until now they were only called for the port netdev. Use the newly propagated orig_dev passed as part of the switchdev attr/obj and determine whether the original device is a VLAN device. If so, act accordingly, otherwise continue as usual. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15mlxsw: spectrum: Use FID instead of VID when accessing FDBIdo Schimmel2-28/+34
In the Spectrum ASIC - unlike SwitchX-2 - FDB access is done by specifying FID as parameter and not VID. Change the relevant variables and parameters names to reflect that. Note that this was OK up until now, since FID was always equal to VID, but with the introduction of VLAN interfaces this is no longer the case. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15mlxsw: spectrum: Add another flood table for vFIDsIdo Schimmel3-27/+16
We previously used only one flood table for packets classified to vFIDs. However, since we are going to add support for bridges between VLAN interfaces (mapped to vFIDs) we need to add one more flood table. That way we can separate the flooding domain of unknown unicast traffic from all the rest and support flood control (as we do with the 802.1Q bridge). Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15mlxsw: spectrum: Use appropriate parameter nameIdo Schimmel1-4/+4
The __mlxsw_sp_port_flood_set function is now used to configure flooding for both FIDs and vFIDs, so change the parameter name to 'idx' instead of 'fid'. This is also consistent with hardware documentation. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15mlxsw: spectrum: Split vFID range in twoIdo Schimmel3-99/+287
Up until now we used a 1:1 mapping - based on VID - to map a VLAN interface to a vFID. However, a different scheme is needed in order to support bridges between VLAN interfaces, as all the member interfaces - which can have different VIDs - need to share the same vFID.

Solve that by splitting the vFID range in two:
1. Non-bridged VLAN interfaces
2. Bridged VLAN interfaces

When a VLAN interface is created, assign it the next available vFID in the first range, unless one already exists for that VID or the number of vFIDs in the range was exceeded. When the interface is removed, free the vFID, unless other interfaces are mapped to it.

To accomplish the above:
1. Store the VID to vFID mapping in a new struct (mlxsw_sp_vfid), which has a global context and holds a reference count.
2. Create a vPort (dummy in case of bridge SELF invocation) on top of the physical port and hold a reference to the associated vFID.

                             vfid                    vfid
                        +-------------+         +-------------+
                        | vfid        |         | vfid        |
                        | vid         +---> ... | vid         |
                        | nr_vports   |         | nr_vports   |
                        +------+------+         +------+------+
                               |
       +-----------------------+-------+
       |                               |
     vport                           vport
+-------------+                 +-------------+
| ...         |                 | ...         |
| *vfid       +---> ...         | *vfid       +---> ...
| ...         |                 | ...         |
+------+------+                 +------+------+
       |                               |
     port                            port
+-------------+                 +-------------+
| ...         |                 | ...         |
| vports_list |                 | vports_list |
| ...         |                 | ...         |
+-------------+                 +-------------+
     swXpY                           swXpZ

Next patches in the series will add the missing infrastructure for the second range and transfer vPorts between the two ranges according to the received notifications.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
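The allocation rule for the first (non-bridged) range can be sketched as a small reference-counted table in userspace C. This is an illustration under stated assumptions, not the mlxsw code: the range size DEMO_VFID_MAX, struct demo_vfid and the demo_vfid_get / demo_vfid_put helpers are hypothetical, and a zero reference count is taken to mean that a slot is free.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_VFID_MAX 4		/* hypothetical size of the first range */

struct demo_vfid {
	uint16_t vid;
	unsigned int nr_vports;	/* reference count; 0 means slot is free */
};

static struct demo_vfid demo_vfids[DEMO_VFID_MAX];

/*
 * Take a reference on the vFID for 'vid'. Reuse the existing entry if
 * one is already mapped to this VID, otherwise claim the lowest free
 * slot. Returns the vFID index, or -1 if the range is exhausted.
 */
int demo_vfid_get(uint16_t vid)
{
	int free_idx = -1;

	for (int i = 0; i < DEMO_VFID_MAX; i++) {
		if (demo_vfids[i].nr_vports && demo_vfids[i].vid == vid) {
			demo_vfids[i].nr_vports++;
			return i;
		}
		if (!demo_vfids[i].nr_vports && free_idx < 0)
			free_idx = i;
	}
	if (free_idx < 0)
		return -1;

	demo_vfids[free_idx].vid = vid;
	demo_vfids[free_idx].nr_vports = 1;
	return free_idx;
}

/* Drop a reference; the slot becomes free when the count hits zero. */
void demo_vfid_put(int idx)
{
	assert(idx >= 0 && idx < DEMO_VFID_MAX);
	assert(demo_vfids[idx].nr_vports > 0);
	demo_vfids[idx].nr_vports--;
}
```

Two VLAN interfaces with the same VID on different ports would share one entry here, and the entry is only released once both have been removed.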
2015-12-15mlxsw: spectrum: Allocate active VLANs only for port netdevsIdo Schimmel2-1/+11
When adding support for bridges between VLAN interfaces, we'll introduce a new entity called a vPort, which is a representation of the VLAN interface in the hardware. The main difference between a vPort and a physical port is that several FIDs can be bound to the latter, whereas only one (called a vFID) can be bound to the former. Therefore, it makes sense to use the same struct to represent the two, but to only allocate the 'active_vlans' bitmap in case of a physical port. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15switchdev: Pass original device to port netdev driverIdo Schimmel8-0/+22
switchdev drivers need to know the netdev on which the switchdev op was invoked. For example, the STP state of a VLAN interface configured on top of a port can change while it is a member of a bridge. In this case, the underlying driver should only change the STP state of that particular VLAN and not of all the VLANs configured on the port. However, current switchdev infrastructure only passes the port netdev down to the driver. Solve that by passing the original device down to the driver as part of the required switchdev object / attribute. This doesn't entail any change in current switchdev drivers. It simply enables those supporting stacked devices to know the originating device and act accordingly. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15switchdev: vlan: Use switchdev_port* in vlan_netdev_opsIdo Schimmel1-0/+7
We need to be able to propagate static FDB entries and certain bridge port attributes (e.g. learning, flooding) down to the port netdev driver when bridge port is a VLAN interface. Achieve that by setting ndo_bridge* and ndo_fdb* in vlan_netdev_ops to the corresponding switchdev_port* functions. This is consistent with team and bond devices. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queueDavid S. Miller11-179/+120
Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2015-12-14

This series contains updates to e1000e and igb.

Alex Duyck changes e1000_up() to void since it always returned 0. Making it void also lets us drop some code, since we no longer have to worry about non-zero return values.

Aaron Sierra removes GS40G specific defines and functions since the i210 internal PHY can be accessed with the access functions shared by 82580, i350 and i354 devices. Also removes the code to add the PHY address into the PCDL register address, since there is no real reason to do so.

Joe updates the cable length function to report the true min, max and average cable length of all four pairs for i210. Also updated ethtool to use enum-based labels instead of hard coded values.

Benjamin Poirier cleans up code that is never reachable since MSI-X interrupts are not shared in e1000e. Also removes the ICR read in the other interrupt handler, since the information is not needed and IMS is configured such that only a link status change can trigger the other interrupt handler. Also stops writing LSC to ICS in MSI-X mode: there is no handler for the LSC interrupt, so there is no point in writing it now that we always assume other interrupts are caused by LSC.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14e1000e: Fix msi-x interrupt automaskBenjamin Poirier1-6/+5
Since the introduction of 82574 support in e1000e, the driver has worked on the assumption that msi-x interrupt generation is automatically disabled after each irq. As it turns out, this is not the case. Currently, rx interrupts can fire multiple times before and during napi processing. This can be a problem for users because frames that arrive in a certain window (after adapter->clean_rx() but before napi_complete_done() has cleared NAPI_STATE_SCHED) generate an interrupt which does not lead to napi_schedule(). These frames sit in the rx queue until another frame arrives (a tcp retransmit for example). While the EIAC and CTRL_EXT registers are properly configured for irq automask, the modification of IAM in e1000_configure_msix() is what prevents automask from working as intended. This patch removes that erroneous write and fixes interrupt rearming for tx interrupts. It also clears IAME from CTRL_EXT. This is not strictly necessary for operation of the driver but it is to avoid disruption from potential programs that access the registers directly, like `ethregs -c`. Reported-by: Frank Steiner <steiner-reg@bio.ifi.lmu.de> Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-14e1000e: Do not write lsc to ics in msi-x modeBenjamin Poirier2-12/+19
In msi-x mode, there is no handler for the lsc interrupt so there is no point in writing that to ics now that we always assume Other interrupts are caused by lsc. Reviewed-by: Jasna Hodzic <jhodzic@ucdavis.edu> Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-14e1000e: Do not read ICR in Other interruptBenjamin Poirier1-15/+7
Removes the ICR read in the Other interrupt handler and uses EIAC to autoclear the Other bit from ICR and IMS. This allows us to avoid interference with Rx and Tx interrupts in the Other interrupt handler. The information read from ICR is not needed. IMS is configured such that the only interrupt cause that can trigger the Other interrupt is Link Status Change. Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-14e1000e: Remove unreachable codeBenjamin Poirier1-6/+0
msi-x interrupts are not shared so there's no need to check if the interrupt was really from this adapter. Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-14net/macb: add support for resetting PHY using GPIOGregory CLEMENT3-0/+12
With device tree, it is no longer possible to reset the PHY at board level. Furthermore, doing the reset in the driver allows the PHY to be powered down when the network interface is no longer in use. This reset can't be done at the PHY driver level: the PHY must be able to answer the MII bus scan so that the kernel can create a PHY device. The patch introduces a new optional property "phy-reset-gpios", inspired by the one used for the FEC. Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14igb: Explicitly label self-test result indicesJoe Schultz1-14/+24
Previously, the ethtool self-test gstrings/data arrays were accessed via hardcoded indices, which made the code difficult to follow. This patch replaces the hardcoded values with enum-based labels. Signed-off-by: Joe Schultz <jschultz@xes-inc.com> Signed-off-by: Aaron Sierra <asierra@xes-inc.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
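To illustrate the change described above, here is a minimal userspace sketch of replacing hardcoded array indices with enum-based labels. The enum names, the label strings, and the `selftest_record()` helper are hypothetical and chosen for illustration; they are not taken from the igb source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the self-test slots (not the igb names). */
enum selftest_index {
    TEST_REG = 0,
    TEST_EEP,
    TEST_IRQ,
    TEST_LOOP,
    TEST_LINK,
    TEST_COUNT /* number of slots; sizes the arrays below */
};

/* Designated initializers tie each string to its label, so the
 * strings and the result array cannot silently drift apart. */
static const char *const selftest_names[TEST_COUNT] = {
    [TEST_REG]  = "Register test",
    [TEST_EEP]  = "Eeprom test",
    [TEST_IRQ]  = "Interrupt test",
    [TEST_LOOP] = "Loopback test",
    [TEST_LINK] = "Link test",
};

/* Store one result by label instead of by a bare number such as data[4]. */
static void selftest_record(uint64_t data[TEST_COUNT],
                            enum selftest_index idx, uint64_t result)
{
    assert(idx < TEST_COUNT);
    data[idx] = result;
}
```

A caller would write `selftest_record(data, TEST_LINK, 1)` rather than `data[4] = 1`, which makes the intent visible at the call site and keeps the code correct if a slot is added or reordered.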
2015-12-14igb: Improve cable length function for I210, etc.Joe Schultz3-9/+51
Previously, the PHY-specific code to get the cable length for the I210 internal and related PHYs was reporting the cable length of a single pair and reporting it as the min, max, and total cable length. Update it so that all four pairs are checked so the true min, max, and average cable lengths are reported. Signed-off-by: Joe Schultz <jschultz@xes-inc.com> Signed-off-by: Aaron Sierra <asierra@xes-inc.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
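The aggregation described above can be sketched as follows. This is a standalone illustration that assumes the four per-pair lengths have already been read from the PHY; the `cable_length` struct, the `summarize_pairs()` function, and the use of integer division for the average are assumptions, not the driver's actual code.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_PAIRS 4 /* a 1000BASE-T cable has four twisted pairs */

/* Hypothetical result container. */
struct cable_length {
    uint16_t min;
    uint16_t max;
    uint16_t avg;
};

/* Derive min, max and average from all four per-pair lengths,
 * instead of reporting one pair's length for every field. */
static struct cable_length summarize_pairs(const uint16_t len[NUM_PAIRS])
{
    struct cable_length out = { len[0], len[0], 0 };
    uint32_t sum = 0;

    for (size_t i = 0; i < NUM_PAIRS; i++) {
        if (len[i] < out.min)
            out.min = len[i];
        if (len[i] > out.max)
            out.max = len[i];
        sum += len[i];
    }
    out.avg = (uint16_t)(sum / NUM_PAIRS); /* integer average, rounds down */
    return out;
}
```

With per-pair lengths of 10, 12, 8 and 14, this yields a minimum of 8, a maximum of 14 and an average of 11.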
2015-12-14igb: Don't add PHY address to PCDL addressAaron Sierra1-2/+1
There is no reason to add the PHY address into the PCDL register address. Signed-off-by: Aaron Sierra <asierra@xes-inc.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-14net: Fix typo in skb_fclone_busyMasanari Iida1-1/+1
This patch fixes a typo found in a comment in skb_fclone_busy. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-14Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-nextDavid S. Miller39-1513/+3101
Johan Hedberg says: ==================== pull request: bluetooth-next 2015-12-11 Here's another set of Bluetooth & 802.15.4 patches for the 4.5 kernel: - 6LoWPAN debugfs support - New 802.15.4 driver for ADF7242 MAC IEEE802154 - Initial code for 6LoWPAN Generic Header Compression (GHC) support - Refactor Bluetooth LE scan & advertising behind dedicated workqueue - Cleanups to Bluetooth H:5 HCI driver - Support for Toshiba Broadcom based Bluetooth controllers - Use continuous scanning when establishing Bluetooth LE connections Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>