aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2018-08-25Revert "net: stmmac: fix build failure due to missing COMMON_CLK dependency"Geert Uytterhoeven1-5/+5
This reverts commit bde4975310eb1982bd0bbff673989052d92fd481. All legacy clock implementations now implement clk_set_rate() (Some implementations may be dummies, though). Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Arnd Bergmann <arnd@arnd.de> Acked-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-25net: macb: Fix regression breaking non-MDIO fixed-link PHYsAhmad Fatoum1-10/+17
commit 739de9a1563a ("net: macb: Reorganize macb_mii bringup") broke initializing macb on the EVB-KSZ9477 eval board. There, of_mdiobus_register was called even for the fixed-link representing the RGMII-link to the switch with the result that the driver attempts to enumerate PHYs on a non-existent MDIO bus: libphy: MACB_mii_bus: probed mdio_bus f0028000.ethernet-ffffffff: fixed-link has invalid PHY address mdio_bus f0028000.ethernet-ffffffff: scan phy fixed-link at address 0 [snip] mdio_bus f0028000.ethernet-ffffffff: scan phy fixed-link at address 31 The "MDIO" bus registration succeeds regardless, having claimed the reset GPIO, and calling of_phy_register_fixed_link later on fails because it tries to claim the same GPIO: macb f0028000.ethernet: broken fixed-link specification Fix this by registering the fixed-link before calling mdiobus_register. Fixes: 739de9a1563a ("net: macb: Reorganize macb_mii bringup") Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-25mlxsw: spectrum_switchdev: Do not leak RIFs when removing bridgeIdo Schimmel3-0/+33
When a bridge device is removed, the VLANs are flushed from each configured port. This causes the ports to decrement the reference count on the associated FIDs (filtering identifier). If the reference count of a FID is 1 and it has a RIF (router interface), then this RIF is destroyed. However, if no port is member in the VLAN for which a RIF exists, then the RIF will continue to exist after the removal of the bridge. To reproduce: # ip link add name br0 type bridge vlan_filtering 1 # ip link set dev swp1 master br0 # ip link add link br0 name br0.10 type vlan id 10 # ip address add 192.0.2.0/24 dev br0.10 # ip link del dev br0 The RIF associated with br0.10 continues to exist. Fix this by iterating over all the bridge device uppers when it is destroyed and take care of destroying their RIFs. Fixes: 99f44bb3527b ("mlxsw: spectrum: Enable L3 interfaces on top of bridge devices") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-24Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queueDavid S. Miller10-23/+81
Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2018-08-24 This series contains fixes to e1000, igb, ixgb, ixgbe and i40e. YueHaibing from Huawei provides a change to use dma_zalloc_coherent() instead of calls to allocator followed by a memset for ixgb. Bo Chen provides a couple of fixes for e1000, first by adding a check to prevent a NULL pointer dereference. The second change is to clean up a possible resource leak on old transmit and receive rings when the device is not up. Jesus fixes an issue in the usage of an advanced transmit context descriptor for retrieving the timestamp of a packet for AF_PACKET if the IGB_TX_FLAGS_VLAN is not set in igb. Jia-Ju Bai provides several patches which replace GFP_ATOMIC with GFP_KERNEL, when using kzalloc() and kcalloc() which is not necessary. Also found an instance of mdelay() call which could be replaced with msleep(). Tony fixes ixgbe to allow MTU changes with XDP, by adding checks to ensure only supported values and return -EINVAL for when it is not supported. Sebastian fixed an issue that was not clearing VF mailbox memory and ensure queues are re-enabled correctly. Martyna fixes a transmit timeout when DCB is configured when bringing up an interface. Jake fixes a previous commit which accidentally reversed the check of the data pointer, so we can accurately count the size of the stats. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-24i40e: fix condition of WARN_ONCE for stat stringsJacob Keller1-1/+1
Commit 9b10df596bd4 ("i40e: use WARN_ONCE to replace the commented BUG_ON size check") introduced a warning check to make sure that the size of the stat strings was always the expected value. This code accidentally inverted the check of the data pointer. Fix this so that we accurately count the size of the stats we copied in. This fixes an erroneous WARN kernel splat that occurs when requesting ethtool statistics. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Tested-by: Mauro S M Rodrigues <maurosr@linux.vnet.ibm.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24i40e: Fix for Tx timeouts when interface is brought up if DCB is enabledMartyna Szapar1-7/+8
If interface is connected to switch port configured for DCB there are TX timeouts when bringing up interface. Problem started appearing after adding in i40e driver code mqprio hardware offload mode. In function i40e_vsi_configure_bw_alloc was added resetting BW rate which should be executing when mqprio qdisc is removed but was also when there was no mqprio qdisc added and DCB was enabled. In this patch was added additional check for DCB flag so now when DCB is enabled the correct DCB configs from before mqprio patch are restored. Signed-off-by: Martyna Szapar <martyna.szapar@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24ixgbe: fix driver behaviour after issuing VFLRSebastian Basierski2-0/+27
Since VFLR doesn't clear VFMBMEM (VF Mailbox Memory) and is not re-enabling queues correctly we should fix this behavior. Signed-off-by: Sebastian Basierski <sebastianx.basierski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24ixgbe: Prevent unsupported configurations with XDPTony Nguyen2-2/+31
These changes address comments by Jakub Kicinski on commit 38b7e7f8ae82 ("ixgbe: Do not allow LRO or MTU change with XDP"). Change the MTU check with XDP to allow any supported value and only reject those outside of the range as opposed to rejecting any change when XDP is active. In situations where MTU size is not supported, return -EINVAL instead of -EPERM. Add checks when enabling SRIOV, DCB, or adding L2FW offloaded device as they are not supported with XDP. CC: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24ixgbe: Replace GFP_ATOMIC with GFP_KERNELJia-Ju Bai2-3/+3
ixgbe_fcoe_ddp_setup(), ixgbe_setup_fcoe_ddp_resources() and ixgbe_sw_init() are never called in atomic context. They call kmalloc(), dma_pool_alloc() and kzalloc() with GFP_ATOMIC, which is not necessary. GFP_ATOMIC can be replaced with GFP_KERNEL. This is found by a static analysis tool named DCNS written by myself. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Acked-by: Sebastian Basierski <sebastianx.basierski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24igb: Replace mdelay() with msleep() in igb_integrated_phy_loopback()Jia-Ju Bai1-1/+1
igb_integrated_phy_loopback() is never called in atomic context. It calls mdelay() to busily wait, which is not necessary. mdelay() can be replaced with msleep(). This is found by a static analysis tool named DCNS written by myself. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24igb: Replace GFP_ATOMIC with GFP_KERNEL in igb_sw_init()Jia-Ju Bai1-2/+2
igb_sw_init() is never called in atomic context. It calls kzalloc() and kcalloc() with GFP_ATOMIC, which is not necessary. GFP_ATOMIC can be replaced with GFP_KERNEL. This is found by a static analysis tool named DCNS written by myself. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24igb: Use an advanced ctx descriptor for launchtimeJesus Sanchez-Palencia1-1/+2
On i210, Launchtime (TxTime) requires the usage of an "Advanced Transmit Context Descriptor" for retrieving the timestamp of a packet. The igb driver correctly builds such descriptor on the segmentation flow (i.e. igb_tso()) or on the checksum one (i.e. igb_tx_csum()), but the feature is broken for AF_PACKET if the IGB_TX_FLAGS_VLAN is not set, which happens due to an early return on igb_tx_csum(). This flag is only set by the kernel when a VLAN interface is used, thus we can't just rely on it. Here we are fixing this issue by checking if launchtime is enabled for the current tx_ring before performing the early return. Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24e1000: ensure to free old tx/rx rings in set_ringparam()Bo Chen1-2/+2
In 'e1000_set_ringparam()', the tx_ring and rx_ring are updated with new value and the old tx/rx rings are freed only when the device is up. There are resource leaks on old tx/rx rings when the device is not up. This bug is reported by COD, a tool for testing kernel module binaries I am building. This patch fixes the bug by always calling 'kfree()' on old tx/rx rings in 'e1000_set_ringparam()'. Signed-off-by: Bo Chen <chenbo@pdx.edu> Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24e1000: check on netif_running() before calling e1000_up()Bo Chen1-1/+2
When the device is not up, the call to 'e1000_up()' from the error handling path of 'e1000_set_ringparam()' causes a kernel oops with a null-pointer dereference. The null-pointer dereference is triggered in function 'e1000_alloc_rx_buffers()' at line 'buffer_info = &rx_ring->buffer_info[i]'. This bug was reported by COD, a tool for testing kernel module binaries I am building. This bug was also detected by KFI from Dr. Kai Cong. This patch fixes the bug by checking on 'netif_running()' before calling 'e1000_up()' in 'e1000_set_ringparam()'. Signed-off-by: Bo Chen <chenbo@pdx.edu> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-24ixgb: use dma_zalloc_coherent instead of allocator/memsetYueHaibing1-3/+2
Use dma_zalloc_coherent instead of dma_alloc_coherent followed by memset 0. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller5-17/+35
Daniel Borkmann says: ==================== pull-request: bpf 2018-08-24 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix BPF sockmap and tls where we get a hang in do_tcp_sendpages() when sndbuf is full due to missing calls into underlying socket's sk_write_space(), from John. 2) Two BPF sockmap fixes to reject invalid parameters on map creation and to fix a map element miscount on allocation failure. Another fix for BPF hash tables to use per hash table salt for jhash(), from Daniel. 3) Fix for bpftool's command line parsing in order to terminate on bad arguments instead of keeping looping in some border cases, from Quentin. 4) Fix error value of xdp_umem_assign_dev() in order to comply with expected bind ops error codes, from Prashant. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-23Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queueDavid S. Miller14-126/+185
Jeff Kirsher says: ==================== Intel Wired LAN Driver Fixes 2018-08-23 This series contains bug fixes to the ice driver. Anirudh provides several fixes, starting with static analysis fixes by replacing a bitwise-and with a constant value and replace "magic" numbers with defines. Fixed the control queue processing by removing unnecessary read/writes to registers, as well as getting a accurate value for "pending". Added additional checks to avoid NULL pointer dereferences. Fixed up code formatting issues, by cleaning up code comments and coding style. Bruce cleans up a duplicate check for owner, within the same function. Also cleans up interrupt causes that are not handled or applicable. Fix checkpatch warning about the use of bool in structures due to the wasted space and size of bool, so convert struct members to u8 instead. Jake fixes a number of potential bugs in the reporting of stats via ethtool, by simply reporting all the queue statistics, even for the queues that are not activated. Fixed a compiler warning, as well as make the code a bit cleaner but just using order_base_2() for calculating the power-of-2. Preethi adds a check to avoid a NULL pointer dereference crash during initialization. Brett clarifies the code when it comes to port VLANs and regular VLANs, by renaming defines and field values to match their intended use and purpose. Jesse initializes a variable to avoid garbage values being returned to the caller. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-23ice: Trivial formatting fixesAnirudh Venkataramanan3-6/+8
1) Add missing "\n" when printing link event error message. 2) Update dev_err statement in probe. 3) Add function description for ice_clear_pf_cfg. 4) Fix coding style for ice_acquire_nvm. 5) netdev->mtu is unsigned so use %u. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23ice: Change struct members from bool to u8Bruce Allan4-16/+16
Recent versions of checkpatch have a new warning based on a documented preference of Linus to not use bool in structures due to wasted space and the size of bool is implementation dependent. For more information, see the email thread at https://lkml.org/lkml/2017/11/21/384. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23ice: Fix potential return of uninitialized valueJesse Brandeburg1-2/+2
In ice_vsi_setup_[tx|rx]_rings, err is uninitialized which can result in a garbage value return to the caller. Fix that. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23ice: Fix a few null pointer dereference issuesAnirudh Venkataramanan2-8/+11
1) When ice_ena_msix_range() fails to reserve vectors, a devm_kfree() warning was seen in the error flow path. So check pf->irq_tracker before use in ice_clear_interrupt_scheme(). 2) In ice_vsi_cfg(), check vsi->netdev before use. 3) In ice_get_link_status, check link_up before use. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23ice: Update to interrupts enabled in OICRBruce Allan2-14/+3
Remove the following interrupt causes that are not applicable or not handled: - PFINT_OICR_HLP_RDY_M - PFINT_OICR_CPM_RDY_M - PFINT_OICR_GPIO_M - PFINT_OICR_STORM_DETECT_M Add the following interrupt cause that's actually handled in ice_misc_intr: - PFINT_OICR_PE_CRITERR_M Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23tools: bpftool: return from do_event_pipe() on bad argumentsQuentin Monnet1-1/+4
When command line parsing fails in the while loop in do_event_pipe() because the number of arguments is incorrect or because the keyword is unknown, an error message is displayed, but bpftool remains stuck in the loop. Make sure we exit the loop upon failure. Fixes: f412eed9dfde ("tools: bpftool: add simple perf event output reader") Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-23ice: Set VLAN flags correctlyBrett Creeley2-25/+30
In the struct ice_aqc_vsi_props the field port_vlan_flags is an overloaded term because it is used for both port VLANs (PVLANs) and regular VLANs. This is an issue and is very confusing especially when dealing with VFs because normal VLANs and port VLANs are not the same. To fix this the field was renamed to vlan_flags and all of the #define's labeled *_PVLAN_* were renamed to *_VLAN_* if they are not specific to port VLANs. Also in ice_vsi_manage_vlan_stripping, set the ICE_AQ_VSI_VLAN_MODE_ALL bit to allow the driver to add a VLAN tag to all packets it sends. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23ice: Use order_base_2 to calculate higher power of 2Jacob Keller1-5/+2
Currently, we use a combination of ilog2 and is_power_of_2() to calculate the next power of 2 for the qcount. This appears to be causing a warning on some combinations of GCC and the Linux kernel: MODPOST 1 modules WARNING: "____ilog2_NaN" [ice.ko] undefined! This appears to because because GCC realizes that qcount could be zero in some circumstances and thus attempts to link against the intentionally undefined ___ilog2_NaN function. The order_base_2 function is intentionally defined to return 0 when passed 0 as an argument, and thus will be safe to use here. This not only fixes the warning but makes the resulting code slightly cleaner, and is really what we should have used originally. Also update the comment to make it more clear that we are rounding up, not just incrementing the ilog2 of qcount unconditionally. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23ice: Fix bugs in control queue processingAnirudh Venkataramanan2-5/+26
This patch is a consolidation of multiple bug fixes for control queue processing. 1) In ice_clean_adminq_subtask() remove unnecessary reads/writes to registers. The bits PFINT_FW_CTL, PFINT_MBX_CTL and PFINT_SB_CTL are not set when an interrupt arrives, which means that clearing them again can be omitted. 2) Get an accurate value in "pending" by re-reading the control queue head register from the hardware. 3) Fix a corner case involving lost control queue messages by checking for new control messages (using ice_ctrlq_pending) before exiting the cleanup routine. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23ice: Clean control queues only when they are initializedPreethi Banala1-8/+16
Clean control queues only when they are initialized. One of the ways to validate if the basic initialization is done is by checking value of cq->sq.head and cq->rq.head variables that specify the register address. This patch adds a check to avoid NULL pointer dereference crash when tried to shutdown uninitialized control queue. Signed-off-by: Preethi Banala <preethi.banala@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23ice: Report stats for allocated queues via ethtool statsJacob Keller2-13/+46
It is not safe to have the string table for statistics change order or size over the lifetime of a given netdevice. This is because of the nature of the 3-step process for obtaining stats. First, user space performs a request for the size of the strings table. Second it performs a separate request for the strings themselves, after allocating space for the table. Third, it requests the stats themselves, also allocating space for the table. If the size decreased, there is potential to see garbage data or stats values. In the worst case, we could potentially see stats values become mis-aligned with their strings, so that it looks like a statistic is being reported differently than it actually is. Even worse, if the size increased, there is potential that the strings table or stats table was not allocated large enough and the stats code could access and write to memory it should not, potentially resulting in undefined behavior and system crashes. It isn't even safe if the size always changes under the RTNL lock. This is because the calls take place over multiple user space commands, so it is not possible to hold the RTNL lock for the entire duration of obtaining strings and stats. Further, not all consumers of the ethtool API are the user space ethtool program, and it is possible that one assumes the strings will not change (valid under the current contract), and thus only requests the stats values when requesting stats in a loop. Finally, it's not possible in the general case to detect when the size changes, because it is quite possible that one value which could impact the stat size increased, while another decreased. This would result in the same total number of stats, but reordering them so that stats no longer line up with the strings they belong to. Since only size changes aren't enough, we would need some sort of hash or token to determine when the strings no longer match. This would require extending the ethtool stats commands, but there is no more space in the relevant structures. The real solution to resolve this would be to add a completely new API for stats, probably over netlink. In the ice driver, the only thing impacting the stats that is not constant is the number of queues. Instead of reporting stats for each used queue, report stats for each allocated queue. We do not change the number of queues allocated for a given netdevice, as we pass this into the alloc_etherdev_mq() function to set the num_tx_queues and num_rx_queues. This resolves the potential bugs at the slight cost of displaying many queue statistics which will not be activated. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23ice: Cleanup magic numberAnirudh Venkataramanan2-1/+2
Use define for the unit size shift of the Rx LAN context descriptor base address instead of the magic number 7. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23bpf: use per htab salt for bucket hashDaniel Borkmann1-10/+13
All BPF hash and LRU maps currently have a known and global seed we feed into jhash() which is 0. This is suboptimal, thus fix it by generating a random seed upon hashtab setup time which we can later on feed into jhash() on lookup, update and deletions. Fixes: 0f8e4bd8a1fc8 ("bpf: add hashtable type of eBPF maps") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Reviewed-by: Eduardo Valentin <eduval@amazon.com>
2018-08-23ice: Remove unnecessary node owner checkBruce Allan1-2/+1
There is already a check for owner == ICE_SCHED_NODE_OWNER_LAN at the beginning of ice_sched_update_vsi_child_nodes. Remove the additional check to address the static analysis tool smatch issue "warn: we tested 'owner' before and it was 'false'". Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-23ice: Fix multiple static analyser warningsAnirudh Venkataramanan4-24/+25
This patch fixes the following smatch errors: 1) Fix "odd binop '0x0 & 0xc'" when performing the bitwise-and with a constant value of zero (ICE_AQC_GSET_RSS_LUT_TABLE_SIZE_128_FLAG). Remove a similar bitwise-and with 0 in ice_add_marker_act() and use the right mask ICE_LG_ACT_GENERIC_OFFSET_M in the expression. 2) Fix a similar issue "odd binop '0x0 & 0x1800' in ice_req_irq_msix_misc. 3) Fix "odd binop '0x380000 & 0x7fff8'" in ice_add_marker_act(). Also, use a new define ICE_LG_ACT_GENERIC_OFF_RX_DESC_PROF_IDX instead of magic number '7'. 4) Fix warn: odd binop '0x0 & 0x18' in ice_set_dflt_vsi_ctx() by removing unnecessary logic to explicitly unset bits 3 and 4 in port_vlan_bits. These bits are unset already by the memset on ctxt->info. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-08-22Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetoothDavid S. Miller2-3/+6
Johan Hedberg says: ==================== pull request: bluetooth 2018-08-23 Here are two important Bluetooth fixes for the MediaTek and RealTek HCI drivers. Please let me know if there are any issues pulling, thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22Merge branch 'hns3-fixes'David S. Miller2-4/+5
Huazhong Tan says: ==================== net: hns3: bug fix & optimization for HNS3 driver This patchset presents a bug fix found out when CONFIG_ARM64_64K_PAGES enable and an optimization for HNS3 driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22net: hns3: modify variable type in hns3_nic_reuse_pageHuazhong Tan1-1/+2
'truesize' is supposed to be u32, not int, so fix it. Signed-off-by: Huazhong tan <tanhuazhong@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22net: hns3: fix page_offset overflow when CONFIG_ARM64_64K_PAGESHuazhong Tan1-3/+3
When enable the config item "CONFIG_ARM64_64K_PAGES", the size of PAGE_SIZE is 65536(64K). But the type of page_offset is u16, it will overflow. So change it to u32, when "CONFIG_ARM64_64K_PAGES" enabled. Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22net/ipv6: init ip6 anycast rt->dst.input as ip6_inputHangbin Liu1-1/+1
Commit 6edb3c96a5f02 ("net/ipv6: Defer initialization of dst to data path") forgot to handle anycast route and init anycast rt->dst.input to ip6_forward. Fix it by setting anycast rt->dst.input back to ip6_input. Fixes: 6edb3c96a5f02 ("net/ipv6: Defer initialization of dst to data path") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22Merge branch 'hns-fixes'David S. Miller2-107/+7
Huazhong Tan says: ==================== net: hns: bug fixes & optimization for HNS driver This patchset presents some bug fixes found out when CONFIG_ARM64_64K_PAGES enable and an optimization for HNS driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22net: hns: use eth_get_headlen interface instead of hns_nic_get_headlenHuazhong Tan1-102/+1
Update hns to drop the hns_nic_get_headlen function in favour of eth_get_headlen, and hence also removes now redundant hns_nic_get_headlen. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22net: hns: fix skb->truesize underestimationHuazhong Tan1-1/+1
skb->truesize is not meant to be tracking amount of used bytes in a skb, but amount of reserved/consumed bytes in memory. For instance, if we use a single byte in last page fragment, we have to account the full size of the fragment. So skb_add_rx_frag needs to calculate the length of the entire buffer into turesize. Fixes: 9cbe9fd5214e ("net: hns: optimize XGE capability by reducing cpu usage") Signed-off-by: Huazhong tan <tanhuazhong@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22net: hns: modify variable type in hns_nic_reuse_pageHuazhong Tan1-1/+2
'truesize' is supposed to be u32, not int, so fix it. Signed-off-by: Huazhong tan <tanhuazhong@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22net: hns: fix length and page_offset overflow when CONFIG_ARM64_64K_PAGESHuazhong Tan1-3/+3
When enable the config item "CONFIG_ARM64_64K_PAGES", the size of PAGE_SIZE is 65536(64K). But the type of length and page_offset are u16, they will overflow. So change them to u32. Fixes: 6fe6611ff275 ("net: add Hisilicon Network Subsystem hnae framework support") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22Merge branch 'tcp_bbr-PROBE_RTT-minor-bug-fixes'David S. Miller1-18/+24
Kevin Yang says: ==================== tcp_bbr: PROBE_RTT minor bug fixes This series includes two minor bug fixes for the TCP BBR PROBE_RTT mechanism, and one preparatory patch: (1) A preparatory patch to reorganize the PROBE_RTT logic by refactoring (into its own function) the code to exit PROBE_RTT, since the next patch will be using that code in a new context. (2) Fix: When BBR restarts from idle and if BBR is in PROBE_RTT mode, BBR should check if it's time to exit PROBE_RTT. If yes, then BBR should exit PROBE_RTT mode and restore the cwnd to its full value. (3) Fix: Apply the PROBE_RTT cwnd cap even if the count of fully-ACKed packets is 0. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22tcp_bbr: apply PROBE_RTT cwnd cap even if acked==0Kevin Yang1-2/+2
This commit fixes a corner case where TCP BBR would enter PROBE_RTT mode but not reduce its cwnd. If a TCP receiver ACKed less than one full segment, the number of delivered/acked packets was 0, so that bbr_set_cwnd() would short-circuit and exit early, without cutting cwnd to the value we want for PROBE_RTT. The fix is to instead make sure that even when 0 full packets are ACKed, we do apply all the appropriate caps, including the cap that applies in PROBE_RTT mode. Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") Signed-off-by: Kevin Yang <yyd@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22tcp_bbr: in restart from idle, see if we should exit PROBE_RTTKevin Yang1-0/+4
This patch fix the case where BBR does not exit PROBE_RTT mode when it restarts from idle. When BBR restarts from idle and if BBR is in PROBE_RTT mode, BBR should check if it's time to exit PROBE_RTT. If yes, then BBR should exit PROBE_RTT mode and restore the cwnd to its full value. Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") Signed-off-by: Kevin Yang <yyd@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22tcp_bbr: add bbr_check_probe_rtt_done() helperKevin Yang1-16/+18
This patch add a helper function bbr_check_probe_rtt_done() to 1. check the condition to see if bbr should exit probe_rtt mode; 2. process the logic of exiting probe_rtt mode. Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") Signed-off-by: Kevin Yang <yyd@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22ipv4: tcp: send zero IPID for RST and ACK sent in SYN-RECV and TIME-WAIT stateEric Dumazet1-0/+6
tcp uses per-cpu (and per namespace) sockets (net->ipv4.tcp_sk) internally to send some control packets. 1) RST packets, through tcp_v4_send_reset() 2) ACK packets in SYN-RECV and TIME-WAIT state, through tcp_v4_send_ack() These packets assert IP_DF, and also use the hashed IP ident generator to provide an IPv4 ID number. Geoff Alexander reported this could be used to build off-path attacks. These packets should not be fragmented, since their size is smaller than IPV4_MIN_MTU. Only some tunneled paths could eventually have to fragment, regardless of inner IPID. We really can use zero IPID, to address the flaw, and as a bonus, avoid a couple of atomic operations in ip_idents_reserve() Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Geoff Alexander <alexandg@cs.unm.edu> Tested-by: Geoff Alexander <alexandg@cs.unm.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22addrconf: reduce unnecessary atomic allocationsCong Wang1-3/+3
All the 3 callers of addrconf_add_mroute() assert RTNL lock, they don't take any additional lock either, so it is safe to convert it to GFP_KERNEL. Same for sit_add_v4_addrs(). Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22net_sched: fix unused variable warning in stmmacArnd Bergmann1-1/+1
The new tcf_exts_for_each_action() macro doesn't reference its arguments when CONFIG_NET_CLS_ACT is disabled, which leads to a harmless warning in at least one driver: drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c: In function 'tc_fill_actions': drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c:64:6: error: unused variable 'i' [-Werror=unused-variable] Adding a cast to void lets us avoid this kind of warning. To be on the safe side, do it for all three arguments, not just the one that caused the warning. Fixes: 244cd96adb5f ("net_sched: remove list_head from tc_action") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-22sch_cake: Fix TC filter flow override and expand it to hosts as wellToke Høiland-Jørgensen1-4/+19
The TC filter flow mapping override completely skipped the call to cake_hash(); however that meant that the internal state was not being updated, which ultimately leads to deadlocks in some configurations. Fix that by passing the overridden flow ID into cake_hash() instead so it can react appropriately. In addition, the major number of the class ID can now be set to override the host mapping in host isolation mode. If both host and flow are overridden (or if the respective modes are disabled), flow dissection and hashing will be skipped entirely; otherwise, the hashing will be kept for the portions that are not set by the filter. Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David S. Miller <davem@davemloft.net>