aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/intel (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-03-20net: remove 'fallback' argument from dev->ndo_select_queue()Paolo Abeni1-3/+2
After the previous patch, all the callers of ndo_select_queue() provide as a 'fallback' argument netdev_pick_tx. The only exceptions are nested calls to ndo_select_queue(), which pass down the 'fallback' available in the current scope - still netdev_pick_tx. We can drop such argument and replace fallback() invocation with netdev_pick_tx(). This avoids an indirect call per xmit packet in some scenarios (TCP syn, UDP unconnected, XDP generic, pktgen) with device drivers implementing such ndo. It also clean the code a bit. Tested with ixgbe and CONFIG_FCOE=m With pktgen using queue xmit: threads vanilla patched (kpps) (kpps) 1 2334 2428 2 4166 4278 4 7895 8100 v1 -> v2: - rebased after helper's name change Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-20Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queueDavid S. Miller11-116/+197
Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2019-03-19 This series contains updates to ice driver only. Michal adds support for the pruning enable flag to avoid seeing broadcast packets on different VLANs. Akeem fixes an issue with VF queues being disabled and the VF netdev network carrier being lost after reset. Fixed an issue issue when doing PFR and CORER resets, where all VF VSIs need to be reset and rebuilt with the main VSIs before replaying all VSIs. Resolved an issue to properly initialize VFs in the guest OS via PCI passthrough. Bruce adds a local variable to avoid unnecessary de-references throughout ice_probe(). Brett cleans up the code a bit by removing the need for a local variable and re-designs the loop to simply return when get a successful result. Cleans up the code to replace loop calls with a predefined macro to make the code more consistent. Updated the driver to ensure ITR granularity is always 2 usecs. Refactors the calculation of VSIs per PF into a general function that can calculate per PF allocations for not just VSIs but across multiple resource types. Improve the driver performance of the driver when using the default settings by determining the ring size and the number of descriptors for transmit and receive based on a calculation with the PAGE_SIZE, ICE_MAX_NUM_DESC, and ICE_REQ_DESC_MULTIPLE. Chinh fixes an issue, where a reserved bit was possibly being set when it should never be set. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-19ice: Determine descriptor count and ring size based on PAGE_SIZEBrett Creeley3-11/+43
Currently we set the default number of Tx and Rx descriptors to 128 by default. For Rx this amounts to a full page (assuming 4K pages) because each Rx descriptor is 32 Bytes, but for Tx it only amounts to a half page because each Tx descriptor is 16 Bytes (assuming 4K pages). Instead of assuming 4K pages, determine the ring size and the number of descriptors for Tx and Rx based on a calculation using the PAGE_SIZE, ICE_MAX_NUM_DESC, and ICE_REQ_DESC_MULTIPLE. This change is being made to improve the performance of the driver when using the default settings. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19ice: Reset all VFs with VFLR during SR-IOV init flowAkeem G Abodunrin1-1/+1
During SR-IOV initialization, we allocate and setup VFs with reset, and since we were going to inform Firmware about our intention to do VFLR by disabling LAN TX Queue, then we really have to complete VF reset flow with VFLR using appropriate registers - Otherwise, reset status bit for VF in the Guest OS might returns DEADBEEF. This resolves issue to properly initialize VFs in the Guest OS via PCI passthrough. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19ice: Get resources per functionBrett Creeley1-5/+8
ice_get_guar_num_vsi currently calculates the number of VSIs per PF. Rework this into a general function ice_get_num_per_func, that can calculate per PF allocations for not just VSIs but across multiple resource types. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19ice: Implement flow to reset VFs with PFR and other resetsAkeem G Abodunrin1-6/+2
All VF VSIs need to be reset and rebuild with the main VSIs before replaying all VSIs, so that all existing switch filters, scheduler tree and other configuration could be replayed at once. This fixes issues when doing PFR and CORER reset. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19ice: configure GLINT_ITR to always have an ITR gran of 2Brett Creeley4-18/+44
Instead of hoping that our ITR granularity will be 2 usec program the GLINT_CTL register to make sure the ITR granularity is always 2 usecs. Now that we know what the ITR granularity will be get rid of the check in ice_probe() to verify our previous assumption. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19ice: use ice_for_each_vsi macro when possibleBrett Creeley3-9/+8
Replace all instances of: for (i = 0; i < pf->num_alloc_vsi; i++) with the following macro: ice_for_each_vsi(pf, i) This will allow the code to be consistent since there are currently cases of using both. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19ice : Ensure only valid bits are set in ice_aq_set_phy_cfgChinh T Cao3-2/+15
In the ice_aq_set_phy_cfg AQ command, the 16.4 bit is reserved. This patch will make sure that this bit will never be set to 1. Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19ice: remove redundant variable and if conditionBrett Creeley1-7/+4
In ice_pf_rxq_wait we are using an unnecessary local variable and also we are checking if the timeout time was reached after the loop. Get rid of the local variable and return 0 right when we get a successful result. This makes it so we can return -ETIMEDOUT if we ever exit the loop because we know the timeout time has been hit. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19ice: avoid multiple unnecessary de-references in probeBruce Allan1-18/+15
Add a local variable struct device *dev to avoid unnecessary de-references throughout ice_probe(). Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19ice: Fix issue with VF reset and multiple VFs support on PFsAkeem G Abodunrin1-7/+13
This patch fixes issues with VF queues being disabled, and VF netdev network carrier being lost after reset. Basically, we need to check if VF is enabled, and queue configured in reset_all_vfs flow, and disable/enable those queues appropriately whenever the function is called after Global/CORER/PFR reset/rebuild/replay. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19ice: Fix broadcast traffic in port VLAN modeMichal Swiatkowski1-32/+44
Set egress (Rx) pruning enable flag for VF VSI in VSI ctxt to enable prune action. To avoid seeing broadcast packet in different VLAN, pruning enable flag in VSI ctxt should be set. Write new functions (fill VSI ctx) to not repeat send ctxt code. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19igc: Remove unneeded hw_dbg printsSasha Neftin1-4/+0
Remove unneeded hw_dbg prints from igc_ethtool.c file. Clean up code. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19igc: Fix the typo in igc_base.h header definitionSasha Neftin1-2/+2
Add the underline for the _IGC_BASE_H_. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19igc: Add support for the ntuple featureSasha Neftin2-0/+89
Copy the ntuple feature into list of user selectable features. Enable the ntuple feature. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19igc: Add support for statisticsSasha Neftin3-1/+403
Add support for statistics and show basic counters. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19igc: Extend the ethtool supportingSasha Neftin5-3/+814
Add show and configure network flow classification (NFC) methods to the ethtool. Show the specifies Rx ntuple filters. Configures receive network flow classification option or rules. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19igc: Add multiple receive queues control supportingSasha Neftin4-0/+73
Enable the multi queues to receive. Program the direction of packets to specified queues according to the mode selected in the MRQC register. Multiple receive queues defined by filters and RSS for 4 queues. Enable/disable RSS hashing and also to enable multiple receive queues. This patch will allow further ethtool support development. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19e1000e: Disable runtime PM on CNP+Kai-Heng Feng1-1/+1
There are some new e1000e devices can only be woken up from D3 one time, by plugging Ethernet cable. Subsequent cable plugging does set PME bit correctly, but it still doesn't get woken up. Since e1000e connects to the root complex directly, we rely on ACPI to wake it up. In this case, the GPE from _PRW only works once and stops working after that. Though it appears to be a platform bug, e1000e maintainers confirmed that I219 does not support D3. So disable runtime PM on CNP+ chips. We may need to disable earlier generations if this bug also hit older platforms. Bugzilla: https://bugzilla.kernel.org/attachment.cgi?id=280819 Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19igb: fix various indentation issuesColin Ian King1-2/+2
There are some lines that have indentation issues, fix these Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19igb: Exclude device from suspend direct complete optimizationKai-Heng Feng1-0/+3
igb sets different WoL settings in system suspend callback and runtime suspend callback. The suspend direct complete optimization leaves igb in runtime suspended state with wrong WoL setting during system suspend. To fix this, we need to disable suspend direct complete optimization to let igb always use suspend callback to set correct WoL during system suspend. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-19intel: correct return from set features callbackSerhey Popovych5-5/+5
According to comments in <linux/netdevice.h> we should return either >0 or -errno from ->ndo_set_features() if changing dev->features by itself. Return 1 in such places to notify netdev_update_features() about applied changes in dev->features. Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-03-05mm: replace all open encodings for NUMA_NO_NODEAnshuman Khandual1-2/+3
Patch series "Replace all open encodings for NUMA_NO_NODE", v3. All these places for replacement were found by running the following grep patterns on the entire kernel code. Please let me know if this might have missed some instances. This might also have replaced some false positives. I will appreciate suggestions, inputs and review. 1. git grep "nid == -1" 2. git grep "node == -1" 3. git grep "nid = -1" 4. git grep "node = -1" This patch (of 2): At present there are multiple places where invalid node number is encoded as -1. Even though implicitly understood it is always better to have macros in there. Replace these open encodings for an invalid node number with the global macro NUMA_NO_NODE. This helps remove NUMA related assumptions like 'invalid node' from various places redirecting them to a common definition. Link: http://lkml.kernel.org/r/1545127933-10711-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> [ixgbe] Acked-by: Jens Axboe <axboe@kernel.dk> [mtip32xx] Acked-by: Vinod Koul <vkoul@kernel.org> [dmaengine.c] Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Acked-by: Doug Ledford <dledford@redhat.com> [drivers/infiniband] Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Hans Verkuil <hverkuil@xs4all.nl> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-25ice: fix overlong string, update stats outputJesse Brandeburg1-40/+40
A test started warning on a string truncation. This led to an unfortunate realization that we are likely not accounting for the stats length correctly before this patch, so fix the issue by putting "port." in front of all the PF stats, instead of magically prepending it at runtime. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25ice: Fix for FC get rx/tx pause paramsLukasz Czapnik1-11/+26
Ethtool reported pause params based on the currently negotiated link settings instead of current PHY config. User was not able to turn off pause params because ethtool was incorrectly reporting parameters as off when link was down even though PHY was configured to support pause frames. Now pause params are taken from PHY config instead of link status. Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25ice: use absolute vector ID for VFsMitch Williams1-2/+4
When the PF driver sets up the VF MSI-X vector allocation, it needs to use the hardware absolute vector ID, not the per-PF vector ID. Without this change we see (apparent) TX hangs when using VFs on multiple PFs. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25ice: check for a leaf node presenceVictor Raj2-0/+24
Check for a leaf node presence for a given VSI. This check is required before removing a VSI since VSIs can't be removed with enabled queues (with leaf nodes) from the FW scheduler tree unless its a reset. Signed-off-by: Victor Raj <victor.raj@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25ice: flush Tx pipe on disable queue timeoutVictor Raj1-2/+19
Set the flush Tx pipe flag instead of getting an EAGAIN error when FW times out in processing the disable Tx queue command. Signed-off-by: Victor Raj <victor.raj@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25ice: clear VF ARQLEN register on resetMitch Williams2-0/+6
On older devices like X710 and X722, the VF's ARQLEN register is cleared on reset, so the VF driver uses that register to detect an unannounced reset. Unfortunately, on devices controlled by ice, this register is NOT cleared on reset. This causes the VF to miss resets, and even on properly-announced resets, the VF driver complains that it didn't see the reset. To fix this, we'll do it in software. When we handle a VF reset (whether triggered by software or VFLR), clear this register after the HW reset is complete. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25ice: don't spam VFs with link messagesMitch Williams1-1/+2
Don't send a link message to the VFs unless link actually changes state. This avoids a small timing hole in some VF drivers that can cause an apparent TX hang if they receive a link status message at the wrong time. Although we have fixed the timing hole in the current VF driver, there are still lots of drivers in the field that have this timing hole. Let's not fall into it if we can avoid it. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25ice: only use the VF for ICE_VSI_VF in ice_vsi_releaseBrett Creeley1-2/+4
In ice_vsi_release we are always assigning a value to the local VF variable. Change this to only be assigned if the VSI is a VF VSI. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25ice: fix numeric overflow warningBruce Allan2-4/+5
When compiling and analyzing the driver on newer kernels, a static analyzer warns about the following "numeric overflow" issues: "The result of expression: 'budget-1' generates 4-byte type while casting to a bigger size of 8-byte". "The result of expression: '*words-words_read' generates 4-byte type while casting to a bigger size of 8-byte". Fix them both. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25ice: fix issue where host reboots on unload when iommu=onBrett Creeley1-17/+54
Currently if the kernel has the intel_iommu=on parameter set, on some platforms removing the driver causes a system reboot. In initialization we associate the control queue interrupts with the pf->hw_oicr_idx and enable the interrupts by setting the CAUSE_ENA bit. The problem comes on teardown because we are not clearing the CAUSE_ENA bit for the control queues, but the vector at pf->hw_oicr_idx (miscellaneous interrupt vector) gets disabled. Fix this by clearing the CAUSE_ENA bit in the appropriate control queue registers on when freeing the miscellaneous interrupt vector. Also, move the call to ice_free_irq_msix_misc() to after ice_deinit_sw() in ice_remove() because ice_deinit_sw() makes an AQ call, but ice_free_irq_msix_misc() disables the miscellaneous vector and it's associated interrupts. Also, create two small helper functions to enable and disable the control queue interrupts respectively. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25ice: fix ice_remove_rule_internal vsi_list handlingJacob Keller1-2/+13
When adding multiple VLANs to the same VSI, the ice_add_vlan code will share the VSI list, so as not to create multiple unnecessary VSI lists. Consider the following flow ice_add_vlan(hw, <VSI 0 VID 7, VSI 0 VID 8, VSI 0 VID 9>) Where we add three VLAN filters for VIDs 7, 8, and 9, all for VSI 0. The ice_add_vlan will create a single vsi_list and share it among all the filters. Later, if we try to remove a VLAN, ice_remove_vlan(hw, <VSI 0 VID 7>) Then the removal code will update the vsi_list and remove VSI 0 from it. But, since the vsi_list is shared, this breaks the list for the other users who reference it. We actually even free the VSI list memory, and may result in segmentation faults. This is due to the way that VLAN rule share VSI lists with reference counts, and is caused because we call ice_rem_update_vsi_list even when the ref_cnt is greater than one. To fix this, handle the case where ref_cnt is greater than one separately. In this case, we need to remove the associated rule without modifying the vsi_list, since it is currently being referenced by another rule. Instead, we just need to decrement the VSI list ref_cnt. The case for handling sharing of VSI lists with multiple VSIs is not currently supported by this code. No such rules will be created today, and this code will require changes if/when such code is added. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25ice: fix stack hogs from struct ice_vsi_ctx structuresBruce Allan3-67/+117
struct ice_vsi_ctx has gotten large enough that function local declarations of it on the stack are causing stack hogs. Fix that by allocating the structs on heap. Cleanup some formatting issues in the code around these changes and fix incorrect data type uses of returned functions in a couple places. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25ice: sizeof(<type>) should be avoidedBruce Allan6-45/+38
With sizeof(), it is preferable to use the variable of type <type> instead of sizeof(<type>). There are multiple places where a temporary variable is used to hold a 'size' value which is then used for a subsequent alloc/memset. Get rid of the temporary variable by calculating size as part of the alloc/memset statement. Also remove unnecessary type-cast. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25ice: Fix added in VSI supported nodes calcVictor Raj1-2/+7
VSI supported nodes are calculated in order to add the VSI parent or intermediate nodes to the scheduler tree. If one of the node in below layers (from VSI layer) has space to add the new VSI or intermediate node above that layer then it's not required to continue the calculation further for below layers. Signed-off-by: Victor Raj <victor.raj@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25ice: Fix the calculation of ICE_MAX_MTUMaciej Fijalkowski1-1/+1
Currently ICE_MAX_MTU subtracts only ETH_HLEN from max frame size and adds ETH_FCS_LEN and VLAN_HLEN, which is not what was intended. The ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN expression should be surrounded with parentheses. Wrap mentioned expression and take into account VLAN double tagging. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-25ice: Mark extack argument as __always_unusedBruce Allan1-4/+5
Commit 87b0984ebfab ("net: Add extack argument to ndo_fdb_add()") in net-next added an extended parameter to the .ndo_fdb_add op and changed ice_fdb_add() accordingly. Update the function header and add the __always_unused attribute to the new parameter to avoid -Wunused-parameter warnings. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller5-10/+60
Three conflicts, one of which, for marvell10g.c is non-trivial and requires some follow-up from Heiner or someone else. The issue is that Heiner converted the marvell10g driver over to use the generic c45 code as much as possible. However, in 'net' a bug fix appeared which makes sure that a new local mask (MDIO_AN_10GBT_CTRL_ADV_NBT_MASK) with value 0x01e0 is cleared. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23e1000e: Fix -Wformat-truncation warningsFlorian Fainelli1-2/+2
Provide precision hints to snprintf() since we know the destination buffer size of the RX/TX ring names are IFNAMSIZ + 5 - 1. This fixes the following warnings: drivers/net/ethernet/intel/e1000e/netdev.c: In function 'e1000_request_msix': drivers/net/ethernet/intel/e1000e/netdev.c:2109:13: warning: 'snprintf' output may be truncated before the last format character [-Wformat-truncation=] "%s-rx-0", netdev->name); ^ drivers/net/ethernet/intel/e1000e/netdev.c:2107:3: note: 'snprintf' output between 6 and 21 bytes into a destination of size 20 snprintf(adapter->rx_ring->name, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ sizeof(adapter->rx_ring->name) - 1, ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ "%s-rx-0", netdev->name); ~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/intel/e1000e/netdev.c:2125:13: warning: 'snprintf' output may be truncated before the last format character [-Wformat-truncation=] "%s-tx-0", netdev->name); ^ drivers/net/ethernet/intel/e1000e/netdev.c:2123:3: note: 'snprintf' output between 6 and 21 bytes into a destination of size 20 snprintf(adapter->tx_ring->name, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ sizeof(adapter->tx_ring->name) - 1, ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ "%s-tx-0", netdev->name); ~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-21ixgbe: don't do any AF_XDP zero-copy transmit if netif is not OKJan Sokolowski1-1/+2
An issue has been found while testing zero-copy XDP that causes a reset to be triggered. As it takes some time to turn the carrier on after setting zc, and we already start trying to transmit some packets, watchdog considers this as an erroneous state and triggers a reset. Don't do any work if netif carrier is not OK. Fixes: 8221c5eba8c13 (ixgbe: add AF_XDP zero-copy Tx support) Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-21i40e: fix XDP_REDIRECT/XDP xmit ring cleanup raceBjörn Töpel2-3/+15
When the driver clears the XDP xmit ring due to re-configuration or teardown, in-progress ndo_xdp_xmit must be taken into consideration. The ndo_xdp_xmit function is typically called from a NAPI context that the driver does not control. Therefore, we must be careful not to clear the XDP ring, while the call is on-going. This patch adds a synchronize_rcu() to wait for napi(s) (preempt-disable regions and softirqs), prior clearing the queue. Further, the __I40E_CONFIG_BUSY flag is checked in the ndo_xdp_xmit implementation to avoid touching the XDP xmit queue during re-configuration. Fixes: d9314c474d4f ("i40e: add support for XDP_REDIRECT") Fixes: 123cecd427b6 ("i40e: added queue pair disable/enable functions") Reported-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-21ixgbe: fix potential RX buffer starvation for AF_XDPMagnus Karlsson2-3/+21
When the RX rings are created they are also populated with buffers so that packets can be received. Usually these are kernel buffers, but for AF_XDP in zero-copy mode, these are user-space buffers and in this case the application might not have sent down any buffers to the driver at this point. And if no buffers are allocated at ring creation time, no packets can be received and no interrupts will be generated so the NAPI poll function that allocates buffers to the rings will never get executed. To rectify this, we kick the NAPI context of any queue with an attached AF_XDP zero-copy socket in two places in the code. Once after an XDP program has loaded and once after the umem is registered. This take care of both cases: XDP program gets loaded first then AF_XDP socket is created, and the reverse, AF_XDP socket is created first, then XDP program is loaded. Fixes: d0bcacd0a130 ("ixgbe: add AF_XDP zero-copy Rx support") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-21i40e: fix potential RX buffer starvation for AF_XDPMagnus Karlsson2-1/+17
When the RX rings are created they are also populated with buffers so that packets can be received. Usually these are kernel buffers, but for AF_XDP in zero-copy mode, these are user-space buffers and in this case the application might not have sent down any buffers to the driver at this point. And if no buffers are allocated at ring creation time, no packets can be received and no interrupts will be generated so the NAPI poll function that allocates buffers to the rings will never get executed. To rectify this, we kick the NAPI context of any queue with an attached AF_XDP zero-copy socket in two places in the code. Once after an XDP program has loaded and once after the umem is registered. This take care of both cases: XDP program gets loaded first then AF_XDP socket is created, and the reverse, AF_XDP socket is created first, then XDP program is loaded. Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-21ixgbe: fix older devices that do not support IXGBE_MRQC_L3L4TXSWENJeff Kirsher1-2/+5
The enabling L3/L4 filtering for transmit switched packets for all devices caused unforeseen issue on older devices when trying to send UDP traffic in an ordered sequence. This bit was originally intended for X550 devices, which supported this feature, so limit the scope of this bit to only X550 devices. Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
2019-02-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller6-55/+0
Alexei Starovoitov says: ==================== pull-request: bpf-next 2019-02-16 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) numerous libbpf API improvements, from Andrii, Andrey, Yonghong. 2) test all bpf progs in alu32 mode, from Jiong. 3) skb->sk access and bpf_sk_fullsock(), bpf_tcp_sock() helpers, from Martin. 4) support for IP encap in lwt bpf progs, from Peter. 5) remove XDP_QUERY_XSK_UMEM dead code, from Jan. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-15net: bpf: remove XDP_QUERY_XSK_UMEM enumeratorJan Sokolowski6-55/+0
Commit c9b47cc1fabc ("xsk: fix bug when trying to use both copy and zero-copy on one queue id") moved the umem query code to the AF_XDP core, and therefore removed the need to query the netdevice for a umem. This patch removes XDP_QUERY_XSK_UMEM and all code that implement that behavior, which is just dead code. Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-02-08ixgbe: Use struct_size() helperGustavo A. R. Silva1-5/+5
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; struct boo entry[]; }; size = sizeof(struct foo) + count * sizeof(struct boo); instance = kzalloc(size, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL); Notice that, in this case, variable size is not necessary, hence it is removed. This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>