aboutsummaryrefslogtreecommitdiffstatshomepage
diff options
context:
space:
mode:
authorJakub Kicinski <kuba@kernel.org>2025-02-06 17:34:36 -0800
committerJakub Kicinski <kuba@kernel.org>2025-02-06 17:34:36 -0800
commit26db4dbb747813b5946aff31485873f071a10332 (patch)
treed1b01bcd3099f55fa77da5538f4e8d5e4c20db19
parentnet: pcs: rzn1-miic: Convert to for_each_available_child_of_node() helper (diff)
parentice: init flow director before RDMA (diff)
downloadwireguard-linux-26db4dbb747813b5946aff31485873f071a10332.tar.xz
wireguard-linux-26db4dbb747813b5946aff31485873f071a10332.zip
Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says: ==================== ice: managing MSI-X in driver Michal Swiatkowski says: It is another try to allow user to manage amount of MSI-X used for each feature in ice. First was via devlink resources API, it wasn't accepted in upstream. Also static MSI-X allocation using devlink resources isn't really user friendly. This try is using more dynamic way. "Dynamic" across whole kernel when platform supports it and "dynamic" across the driver when not. To achieve that reuse global devlink parameter pf_msix_max and pf_msix_min. It fits how ice hardware counts MSI-X. In case of ice amount of MSI-X reported on PCI is a whole MSI-X for the card (with MSI-X for VFs also). Having pf_msix_max allow user to statically set how many MSI-X he wants on PF and how many should be reserved for VFs. pf_msix_min is used to set minimum number of MSI-X with which ice driver should probe correctly. Meaning of this field in case of dynamic vs static allocation: - on system with dynamic MSI-X allocation support * alloc pf_msix_min as static, rest will be allocated dynamically - on system without dynamic MSI-X allocation support * try alloc pf_msix_max as static, minimum acceptable result is pf_msix_min As Jesse and Piotr suggested pf_msix_max and pf_msix_min can (an probably should) be stored in NVM. This patchset isn't implementing that. Dynamic (kernel or driver) way means that splitting MSI-X across the RDMA and eth in case there is a MSI-X shortage isn't correct. Can work when dynamic is only on driver site, but can't when dynamic is on kernel site. Let's remove this code and move to MSI-X allocation feature by feature. If there is no more MSI-X for a feature, a feature is working with less MSI-X or it is turned off. There is a regression here. With MSI-X splitting user can run RDMA and eth even on system with not enough MSI-X. Now only eth will work. RDMA can be turned on by changing number of PF queues (lowering) and reprobe RDMA driver. Example: 72 CPU number, eth, RDMA and flow director (1 MSI-X), 1 MSI-X for OICR on PF, and 1 more for RDMA. Card is using 1 + 72 + 1 + 72 + 1 = 147. We set pf_msix_min = 2, pf_msix_max = 128 OICR: 1 eth: 72 flow director: 1 RDMA: 128 - 74 = 54 We can change number of queues on pf to 36 and do devlink reinit OICR: 1 eth: 36 RDMA: 73 flow director: 1 We can also (implemented in "ice: enable_rdma devlink param") turned RDMA off. OICR: 1 eth: 72 RDMA: 0 (turned off) flow director: 1 After this changes we have a static base vector for SRIOV (SIOV probably in the feature). Last patch from this series is simplifying managing VF MSI-X code based on static vector. Now changing queues using ethtool is also changing MSI-X. If there is enough MSI-X it is always one to one. When there is not enough there will be more queues than MSI-X. * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: init flow director before RDMA ice: simplify VF MSI-X managing ice: enable_rdma devlink param ice: treat dyn_allowed only as suggestion ice, irdma: move interrupts code to irdma ice: get rid of num_lan_msix field ice: remove splitting MSI-X between features ice: devlink PF MSI-X max and min parameter ice: count combined queues using Rx/Tx count ==================== Link: https://patch.msgid.link/20250205185512.895887-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Diffstat (limited to '')
-rw-r--r--Documentation/networking/devlink/ice.rst11
-rw-r--r--drivers/infiniband/hw/irdma/hw.c2
-rw-r--r--drivers/infiniband/hw/irdma/main.c46
-rw-r--r--drivers/infiniband/hw/irdma/main.h3
-rw-r--r--drivers/net/ethernet/intel/ice/devlink/devlink.c102
-rw-r--r--drivers/net/ethernet/intel/ice/ice.h21
-rw-r--r--drivers/net/ethernet/intel/ice/ice_base.c10
-rw-r--r--drivers/net/ethernet/intel/ice/ice_ethtool.c9
-rw-r--r--drivers/net/ethernet/intel/ice/ice_idc.c64
-rw-r--r--drivers/net/ethernet/intel/ice/ice_irq.c275
-rw-r--r--drivers/net/ethernet/intel/ice/ice_irq.h13
-rw-r--r--drivers/net/ethernet/intel/ice/ice_lib.c35
-rw-r--r--drivers/net/ethernet/intel/ice/ice_main.c6
-rw-r--r--drivers/net/ethernet/intel/ice/ice_sriov.c154
-rw-r--r--include/linux/net/intel/iidc.h2
15 files changed, 329 insertions, 424 deletions
diff --git a/Documentation/networking/devlink/ice.rst b/Documentation/networking/devlink/ice.rst
index e3972d03cea0..792e9f8c846a 100644
--- a/Documentation/networking/devlink/ice.rst
+++ b/Documentation/networking/devlink/ice.rst
@@ -69,6 +69,17 @@ Parameters
To verify that value has been set:
$ devlink dev param show pci/0000:16:00.0 name tx_scheduling_layers
+ * - ``msix_vec_per_pf_max``
+ - driverinit
+ - Set the max MSI-X that can be used by the PF, rest can be utilized for
+ SRIOV. The range is from min value set in msix_vec_per_pf_min to
+ 2k/number of ports.
+ * - ``msix_vec_per_pf_min``
+ - driverinit
+ - Set the min MSI-X that will be used by the PF. This value inform how many
+ MSI-X will be allocated statically. The range is from 2 to value set
+ in msix_vec_per_pf_max.
+
.. list-table:: Driver specific parameters implemented
:widths: 5 5 90
diff --git a/drivers/infiniband/hw/irdma/hw.c b/drivers/infiniband/hw/irdma/hw.c
index ad50b77282f8..69ce1862eabe 100644
--- a/drivers/infiniband/hw/irdma/hw.c
+++ b/drivers/infiniband/hw/irdma/hw.c
@@ -498,8 +498,6 @@ static int irdma_save_msix_info(struct irdma_pci_f *rf)
iw_qvlist->num_vectors = rf->msix_count;
if (rf->msix_count <= num_online_cpus())
rf->msix_shared = true;
- else if (rf->msix_count > num_online_cpus() + 1)
- rf->msix_count = num_online_cpus() + 1;
pmsix = rf->msix_entries;
for (i = 0, ceq_idx = 0; i < rf->msix_count; i++, iw_qvinfo++) {
diff --git a/drivers/infiniband/hw/irdma/main.c b/drivers/infiniband/hw/irdma/main.c
index 3f13200ff71b..1ee8969595d3 100644
--- a/drivers/infiniband/hw/irdma/main.c
+++ b/drivers/infiniband/hw/irdma/main.c
@@ -206,6 +206,43 @@ static void irdma_lan_unregister_qset(struct irdma_sc_vsi *vsi,
ibdev_dbg(&iwdev->ibdev, "WS: LAN free_res for rdma qset failed.\n");
}
+static int irdma_init_interrupts(struct irdma_pci_f *rf, struct ice_pf *pf)
+{
+ int i;
+
+ rf->msix_count = num_online_cpus() + IRDMA_NUM_AEQ_MSIX;
+ rf->msix_entries = kcalloc(rf->msix_count, sizeof(*rf->msix_entries),
+ GFP_KERNEL);
+ if (!rf->msix_entries)
+ return -ENOMEM;
+
+ for (i = 0; i < rf->msix_count; i++)
+ if (ice_alloc_rdma_qvector(pf, &rf->msix_entries[i]))
+ break;
+
+ if (i < IRDMA_MIN_MSIX) {
+ for (; i > 0; i--)
+ ice_free_rdma_qvector(pf, &rf->msix_entries[i]);
+
+ kfree(rf->msix_entries);
+ return -ENOMEM;
+ }
+
+ rf->msix_count = i;
+
+ return 0;
+}
+
+static void irdma_deinit_interrupts(struct irdma_pci_f *rf, struct ice_pf *pf)
+{
+ int i;
+
+ for (i = 0; i < rf->msix_count; i++)
+ ice_free_rdma_qvector(pf, &rf->msix_entries[i]);
+
+ kfree(rf->msix_entries);
+}
+
static void irdma_remove(struct auxiliary_device *aux_dev)
{
struct iidc_auxiliary_dev *iidc_adev = container_of(aux_dev,
@@ -216,6 +253,7 @@ static void irdma_remove(struct auxiliary_device *aux_dev)
irdma_ib_unregister_device(iwdev);
ice_rdma_update_vsi_filter(pf, iwdev->vsi_num, false);
+ irdma_deinit_interrupts(iwdev->rf, pf);
pr_debug("INIT: Gen2 PF[%d] device remove success\n", PCI_FUNC(pf->pdev->devfn));
}
@@ -230,9 +268,7 @@ static void irdma_fill_device_info(struct irdma_device *iwdev, struct ice_pf *pf
rf->gen_ops.unregister_qset = irdma_lan_unregister_qset;
rf->hw.hw_addr = pf->hw.hw_addr;
rf->pcidev = pf->pdev;
- rf->msix_count = pf->num_rdma_msix;
rf->pf_id = pf->hw.pf_id;
- rf->msix_entries = &pf->msix_entries[pf->rdma_base_vector];
rf->default_vsi.vsi_idx = vsi->vsi_num;
rf->protocol_used = pf->rdma_mode & IIDC_RDMA_PROTOCOL_ROCEV2 ?
IRDMA_ROCE_PROTOCOL_ONLY : IRDMA_IWARP_PROTOCOL_ONLY;
@@ -281,6 +317,10 @@ static int irdma_probe(struct auxiliary_device *aux_dev, const struct auxiliary_
irdma_fill_device_info(iwdev, pf, vsi);
rf = iwdev->rf;
+ err = irdma_init_interrupts(rf, pf);
+ if (err)
+ goto err_init_interrupts;
+
err = irdma_ctrl_init_hw(rf);
if (err)
goto err_ctrl_init;
@@ -311,6 +351,8 @@ err_ibreg:
err_rt_init:
irdma_ctrl_deinit_hw(rf);
err_ctrl_init:
+ irdma_deinit_interrupts(rf, pf);
+err_init_interrupts:
kfree(iwdev->rf);
ib_dealloc_device(&iwdev->ibdev);
diff --git a/drivers/infiniband/hw/irdma/main.h b/drivers/infiniband/hw/irdma/main.h
index 9f0ed6e84471..ef9a9b79d711 100644
--- a/drivers/infiniband/hw/irdma/main.h
+++ b/drivers/infiniband/hw/irdma/main.h
@@ -117,6 +117,9 @@ extern struct auxiliary_driver i40iw_auxiliary_drv;
#define IRDMA_IRQ_NAME_STR_LEN (64)
+#define IRDMA_NUM_AEQ_MSIX 1
+#define IRDMA_MIN_MSIX 2
+
enum init_completion_state {
INVALID_STATE = 0,
INITIAL_STATE,
diff --git a/drivers/net/ethernet/intel/ice/devlink/devlink.c b/drivers/net/ethernet/intel/ice/devlink/devlink.c
index dbdb83567364..fcb199efbea5 100644
--- a/drivers/net/ethernet/intel/ice/devlink/devlink.c
+++ b/drivers/net/ethernet/intel/ice/devlink/devlink.c
@@ -1205,6 +1205,25 @@ static int ice_devlink_set_parent(struct devlink_rate *devlink_rate,
return status;
}
+static void ice_set_min_max_msix(struct ice_pf *pf)
+{
+ struct devlink *devlink = priv_to_devlink(pf);
+ union devlink_param_value val;
+ int err;
+
+ err = devl_param_driverinit_value_get(devlink,
+ DEVLINK_PARAM_GENERIC_ID_MSIX_VEC_PER_PF_MIN,
+ &val);
+ if (!err)
+ pf->msix.min = val.vu32;
+
+ err = devl_param_driverinit_value_get(devlink,
+ DEVLINK_PARAM_GENERIC_ID_MSIX_VEC_PER_PF_MAX,
+ &val);
+ if (!err)
+ pf->msix.max = val.vu32;
+}
+
/**
* ice_devlink_reinit_up - do reinit of the given PF
* @pf: pointer to the PF struct
@@ -1220,6 +1239,9 @@ static int ice_devlink_reinit_up(struct ice_pf *pf)
return err;
}
+ /* load MSI-X values */
+ ice_set_min_max_msix(pf);
+
err = ice_init_dev(pf);
if (err)
goto unroll_hw_init;
@@ -1533,6 +1555,43 @@ static int ice_devlink_local_fwd_validate(struct devlink *devlink, u32 id,
return 0;
}
+static int
+ice_devlink_msix_max_pf_validate(struct devlink *devlink, u32 id,
+ union devlink_param_value val,
+ struct netlink_ext_ack *extack)
+{
+ struct ice_pf *pf = devlink_priv(devlink);
+
+ if (val.vu32 > pf->hw.func_caps.common_cap.num_msix_vectors)
+ return -EINVAL;
+
+ return 0;
+}
+
+static int
+ice_devlink_msix_min_pf_validate(struct devlink *devlink, u32 id,
+ union devlink_param_value val,
+ struct netlink_ext_ack *extack)
+{
+ if (val.vu32 < ICE_MIN_MSIX)
+ return -EINVAL;
+
+ return 0;
+}
+
+static int ice_devlink_enable_rdma_validate(struct devlink *devlink, u32 id,
+ union devlink_param_value val,
+ struct netlink_ext_ack *extack)
+{
+ struct ice_pf *pf = devlink_priv(devlink);
+ bool new_state = val.vbool;
+
+ if (new_state && !test_bit(ICE_FLAG_RDMA_ENA, pf->flags))
+ return -EOPNOTSUPP;
+
+ return 0;
+}
+
enum ice_param_id {
ICE_DEVLINK_PARAM_ID_BASE = DEVLINK_PARAM_GENERIC_ID_MAX,
ICE_DEVLINK_PARAM_ID_TX_SCHED_LAYERS,
@@ -1548,6 +1607,17 @@ static const struct devlink_param ice_dvl_rdma_params[] = {
ice_devlink_enable_iw_get,
ice_devlink_enable_iw_set,
ice_devlink_enable_iw_validate),
+ DEVLINK_PARAM_GENERIC(ENABLE_RDMA, BIT(DEVLINK_PARAM_CMODE_DRIVERINIT),
+ NULL, NULL, ice_devlink_enable_rdma_validate),
+};
+
+static const struct devlink_param ice_dvl_msix_params[] = {
+ DEVLINK_PARAM_GENERIC(MSIX_VEC_PER_PF_MAX,
+ BIT(DEVLINK_PARAM_CMODE_DRIVERINIT),
+ NULL, NULL, ice_devlink_msix_max_pf_validate),
+ DEVLINK_PARAM_GENERIC(MSIX_VEC_PER_PF_MIN,
+ BIT(DEVLINK_PARAM_CMODE_DRIVERINIT),
+ NULL, NULL, ice_devlink_msix_min_pf_validate),
};
static const struct devlink_param ice_dvl_sched_params[] = {
@@ -1651,6 +1721,7 @@ void ice_devlink_unregister(struct ice_pf *pf)
int ice_devlink_register_params(struct ice_pf *pf)
{
struct devlink *devlink = priv_to_devlink(pf);
+ union devlink_param_value value;
struct ice_hw *hw = &pf->hw;
int status;
@@ -1659,10 +1730,39 @@ int ice_devlink_register_params(struct ice_pf *pf)
if (status)
return status;
+ status = devl_params_register(devlink, ice_dvl_msix_params,
+ ARRAY_SIZE(ice_dvl_msix_params));
+ if (status)
+ goto unregister_rdma_params;
+
if (hw->func_caps.common_cap.tx_sched_topo_comp_mode_en)
status = devl_params_register(devlink, ice_dvl_sched_params,
ARRAY_SIZE(ice_dvl_sched_params));
+ if (status)
+ goto unregister_msix_params;
+
+ value.vu32 = pf->msix.max;
+ devl_param_driverinit_value_set(devlink,
+ DEVLINK_PARAM_GENERIC_ID_MSIX_VEC_PER_PF_MAX,
+ value);
+ value.vu32 = pf->msix.min;
+ devl_param_driverinit_value_set(devlink,
+ DEVLINK_PARAM_GENERIC_ID_MSIX_VEC_PER_PF_MIN,
+ value);
+
+ value.vbool = test_bit(ICE_FLAG_RDMA_ENA, pf->flags);
+ devl_param_driverinit_value_set(devlink,
+ DEVLINK_PARAM_GENERIC_ID_ENABLE_RDMA,
+ value);
+
+ return 0;
+unregister_msix_params:
+ devl_params_unregister(devlink, ice_dvl_msix_params,
+ ARRAY_SIZE(ice_dvl_msix_params));
+unregister_rdma_params:
+ devl_params_unregister(devlink, ice_dvl_rdma_params,
+ ARRAY_SIZE(ice_dvl_rdma_params));
return status;
}
@@ -1673,6 +1773,8 @@ void ice_devlink_unregister_params(struct ice_pf *pf)
devl_params_unregister(devlink, ice_dvl_rdma_params,
ARRAY_SIZE(ice_dvl_rdma_params));
+ devl_params_unregister(devlink, ice_dvl_msix_params,
+ ARRAY_SIZE(ice_dvl_msix_params));
if (hw->func_caps.common_cap.tx_sched_topo_comp_mode_en)
devl_params_unregister(devlink, ice_dvl_sched_params,
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 71e05d30f0fd..2a6de2115193 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -97,9 +97,6 @@
#define ICE_MIN_LAN_OICR_MSIX 1
#define ICE_MIN_MSIX (ICE_MIN_LAN_TXRX_MSIX + ICE_MIN_LAN_OICR_MSIX)
#define ICE_FDIR_MSIX 2
-#define ICE_RDMA_NUM_AEQ_MSIX 4
-#define ICE_MIN_RDMA_MSIX 2
-#define ICE_ESWITCH_MSIX 1
#define ICE_NO_VSI 0xffff
#define ICE_VSI_MAP_CONTIG 0
#define ICE_VSI_MAP_SCATTER 1
@@ -542,6 +539,14 @@ struct ice_agg_node {
u8 valid;
};
+struct ice_pf_msix {
+ u32 cur;
+ u32 min;
+ u32 max;
+ u32 total;
+ u32 rest;
+};
+
struct ice_pf {
struct pci_dev *pdev;
struct ice_adapter *adapter;
@@ -556,13 +561,7 @@ struct ice_pf {
/* OS reserved IRQ details */
struct msix_entry *msix_entries;
struct ice_irq_tracker irq_tracker;
- /* First MSIX vector used by SR-IOV VFs. Calculated by subtracting the
- * number of MSIX vectors needed for all SR-IOV VFs from the number of
- * MSIX vectors allowed on this PF.
- */
- u16 sriov_base_vector;
- unsigned long *sriov_irq_bm; /* bitmap to track irq usage */
- u16 sriov_irq_size; /* size of the irq_bm bitmap */
+ struct ice_virt_irq_tracker virt_irq_tracker;
u16 ctrl_vsi_idx; /* control VSI index in pf->vsi array */
@@ -612,7 +611,7 @@ struct ice_pf {
struct msi_map ll_ts_irq; /* LL_TS interrupt MSIX vector */
u16 max_pf_txqs; /* Total Tx queues PF wide */
u16 max_pf_rxqs; /* Total Rx queues PF wide */
- u16 num_lan_msix; /* Total MSIX vectors for base driver */
+ struct ice_pf_msix msix;
u16 num_lan_tx; /* num LAN Tx queues setup */
u16 num_lan_rx; /* num LAN Rx queues setup */
u16 next_vsi; /* Next free slot in pf->vsi[] - 0-based! */
diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
index b2af8e3586f7..0e862f20427a 100644
--- a/drivers/net/ethernet/intel/ice/ice_base.c
+++ b/drivers/net/ethernet/intel/ice/ice_base.c
@@ -801,13 +801,11 @@ int ice_vsi_alloc_q_vectors(struct ice_vsi *vsi)
return 0;
err_out:
- while (v_idx--)
- ice_free_q_vector(vsi, v_idx);
- dev_err(dev, "Failed to allocate %d q_vector for VSI %d, ret=%d\n",
- vsi->num_q_vectors, vsi->vsi_num, err);
- vsi->num_q_vectors = 0;
- return err;
+ dev_info(dev, "Failed to allocate %d q_vectors for VSI %d, new value %d",
+ vsi->num_q_vectors, vsi->vsi_num, v_idx);
+ vsi->num_q_vectors = v_idx;
+ return v_idx ? 0 : err;
}
/**
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
index f241493a6ac8..b0805704834d 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -3788,8 +3788,7 @@ ice_get_ts_info(struct net_device *dev, struct kernel_ethtool_ts_info *info)
*/
static int ice_get_max_txq(struct ice_pf *pf)
{
- return min3(pf->num_lan_msix, (u16)num_online_cpus(),
- (u16)pf->hw.func_caps.common_cap.num_txq);
+ return min(num_online_cpus(), pf->hw.func_caps.common_cap.num_txq);
}
/**
@@ -3798,8 +3797,7 @@ static int ice_get_max_txq(struct ice_pf *pf)
*/
static int ice_get_max_rxq(struct ice_pf *pf)
{
- return min3(pf->num_lan_msix, (u16)num_online_cpus(),
- (u16)pf->hw.func_caps.common_cap.num_rxq);
+ return min(num_online_cpus(), pf->hw.func_caps.common_cap.num_rxq);
}
/**
@@ -3817,8 +3815,7 @@ static u32 ice_get_combined_cnt(struct ice_vsi *vsi)
ice_for_each_q_vector(vsi, q_idx) {
struct ice_q_vector *q_vector = vsi->q_vectors[q_idx];
- if (q_vector->rx.rx_ring && q_vector->tx.tx_ring)
- combined++;
+ combined += min(q_vector->num_ring_tx, q_vector->num_ring_rx);
}
return combined;
diff --git a/drivers/net/ethernet/intel/ice/ice_idc.c b/drivers/net/ethernet/intel/ice/ice_idc.c
index 145b27f2a4ce..bab3e81cad5d 100644
--- a/drivers/net/ethernet/intel/ice/ice_idc.c
+++ b/drivers/net/ethernet/intel/ice/ice_idc.c
@@ -228,61 +228,34 @@ void ice_get_qos_params(struct ice_pf *pf, struct iidc_qos_params *qos)
}
EXPORT_SYMBOL_GPL(ice_get_qos_params);
-/**
- * ice_alloc_rdma_qvectors - Allocate vector resources for RDMA driver
- * @pf: board private structure to initialize
- */
-static int ice_alloc_rdma_qvectors(struct ice_pf *pf)
+int ice_alloc_rdma_qvector(struct ice_pf *pf, struct msix_entry *entry)
{
- if (ice_is_rdma_ena(pf)) {
- int i;
-
- pf->msix_entries = kcalloc(pf->num_rdma_msix,
- sizeof(*pf->msix_entries),
- GFP_KERNEL);
- if (!pf->msix_entries)
- return -ENOMEM;
+ struct msi_map map = ice_alloc_irq(pf, true);
- /* RDMA is the only user of pf->msix_entries array */
- pf->rdma_base_vector = 0;
-
- for (i = 0; i < pf->num_rdma_msix; i++) {
- struct msix_entry *entry = &pf->msix_entries[i];
- struct msi_map map;
+ if (map.index < 0)
+ return -ENOMEM;
- map = ice_alloc_irq(pf, false);
- if (map.index < 0)
- break;
+ entry->entry = map.index;
+ entry->vector = map.virq;
- entry->entry = map.index;
- entry->vector = map.virq;
- }
- }
return 0;
}
+EXPORT_SYMBOL_GPL(ice_alloc_rdma_qvector);
/**
* ice_free_rdma_qvector - free vector resources reserved for RDMA driver
* @pf: board private structure to initialize
+ * @entry: MSI-X entry to be removed
*/
-static void ice_free_rdma_qvector(struct ice_pf *pf)
+void ice_free_rdma_qvector(struct ice_pf *pf, struct msix_entry *entry)
{
- int i;
-
- if (!pf->msix_entries)
- return;
-
- for (i = 0; i < pf->num_rdma_msix; i++) {
- struct msi_map map;
+ struct msi_map map;
- map.index = pf->msix_entries[i].entry;
- map.virq = pf->msix_entries[i].vector;
- ice_free_irq(pf, map);
- }
-
- kfree(pf->msix_entries);
- pf->msix_entries = NULL;
+ map.index = entry->entry;
+ map.virq = entry->vector;
+ ice_free_irq(pf, map);
}
+EXPORT_SYMBOL_GPL(ice_free_rdma_qvector);
/**
* ice_adev_release - function to be mapped to AUX dev's release op
@@ -382,12 +355,6 @@ int ice_init_rdma(struct ice_pf *pf)
return -ENOMEM;
}
- /* Reserve vector resources */
- ret = ice_alloc_rdma_qvectors(pf);
- if (ret < 0) {
- dev_err(dev, "failed to reserve vectors for RDMA\n");
- goto err_reserve_rdma_qvector;
- }
pf->rdma_mode |= IIDC_RDMA_PROTOCOL_ROCEV2;
ret = ice_plug_aux_dev(pf);
if (ret)
@@ -395,8 +362,6 @@ int ice_init_rdma(struct ice_pf *pf)
return 0;
err_plug_aux_dev:
- ice_free_rdma_qvector(pf);
-err_reserve_rdma_qvector:
pf->adev = NULL;
xa_erase(&ice_aux_id, pf->aux_idx);
return ret;
@@ -412,6 +377,5 @@ void ice_deinit_rdma(struct ice_pf *pf)
return;
ice_unplug_aux_dev(pf);
- ice_free_rdma_qvector(pf);
xa_erase(&ice_aux_id, pf->aux_idx);
}
diff --git a/drivers/net/ethernet/intel/ice/ice_irq.c b/drivers/net/ethernet/intel/ice/ice_irq.c
index ad82ff7d1995..cbae3d81f0f1 100644
--- a/drivers/net/ethernet/intel/ice/ice_irq.c
+++ b/drivers/net/ethernet/intel/ice/ice_irq.c
@@ -20,6 +20,19 @@ ice_init_irq_tracker(struct ice_pf *pf, unsigned int max_vectors,
xa_init_flags(&pf->irq_tracker.entries, XA_FLAGS_ALLOC);
}
+static int
+ice_init_virt_irq_tracker(struct ice_pf *pf, u32 base, u32 num_entries)
+{
+ pf->virt_irq_tracker.bm = bitmap_zalloc(num_entries, GFP_KERNEL);
+ if (!pf->virt_irq_tracker.bm)
+ return -ENOMEM;
+
+ pf->virt_irq_tracker.num_entries = num_entries;
+ pf->virt_irq_tracker.base = base;
+
+ return 0;
+}
+
/**
* ice_deinit_irq_tracker - free xarray tracker
* @pf: board private structure
@@ -29,6 +42,11 @@ static void ice_deinit_irq_tracker(struct ice_pf *pf)
xa_destroy(&pf->irq_tracker.entries);
}
+static void ice_deinit_virt_irq_tracker(struct ice_pf *pf)
+{
+ bitmap_free(pf->virt_irq_tracker.bm);
+}
+
/**
* ice_free_irq_res - free a block of resources
* @pf: board private structure
@@ -45,7 +63,7 @@ static void ice_free_irq_res(struct ice_pf *pf, u16 index)
/**
* ice_get_irq_res - get an interrupt resource
* @pf: board private structure
- * @dyn_only: force entry to be dynamically allocated
+ * @dyn_allowed: allow entry to be dynamically allocated
*
* Allocate new irq entry in the free slot of the tracker. Since xarray
* is used, always allocate new entry at the lowest possible index. Set
@@ -53,11 +71,12 @@ static void ice_free_irq_res(struct ice_pf *pf, u16 index)
*
* Returns allocated irq entry or NULL on failure.
*/
-static struct ice_irq_entry *ice_get_irq_res(struct ice_pf *pf, bool dyn_only)
+static struct ice_irq_entry *ice_get_irq_res(struct ice_pf *pf,
+ bool dyn_allowed)
{
- struct xa_limit limit = { .max = pf->irq_tracker.num_entries,
+ struct xa_limit limit = { .max = pf->irq_tracker.num_entries - 1,
.min = 0 };
- unsigned int num_static = pf->irq_tracker.num_static;
+ unsigned int num_static = pf->irq_tracker.num_static - 1;
struct ice_irq_entry *entry;
unsigned int index;
int ret;
@@ -66,9 +85,9 @@ static struct ice_irq_entry *ice_get_irq_res(struct ice_pf *pf, bool dyn_only)
if (!entry)
return NULL;
- /* skip preallocated entries if the caller says so */
- if (dyn_only)
- limit.min = num_static;
+ /* only already allocated if the caller says so */
+ if (!dyn_allowed)
+ limit.max = num_static;
ret = xa_alloc(&pf->irq_tracker.entries, &index, entry, limit,
GFP_KERNEL);
@@ -78,161 +97,18 @@ static struct ice_irq_entry *ice_get_irq_res(struct ice_pf *pf, bool dyn_only)
entry = NULL;
} else {
entry->index = index;
- entry->dynamic = index >= num_static;
+ entry->dynamic = index > num_static;
}
return entry;
}
-/**
- * ice_reduce_msix_usage - Reduce usage of MSI-X vectors
- * @pf: board private structure
- * @v_remain: number of remaining MSI-X vectors to be distributed
- *
- * Reduce the usage of MSI-X vectors when entire request cannot be fulfilled.
- * pf->num_lan_msix and pf->num_rdma_msix values are set based on number of
- * remaining vectors.
- */
-static void ice_reduce_msix_usage(struct ice_pf *pf, int v_remain)
+#define ICE_RDMA_AEQ_MSIX 1
+static int ice_get_default_msix_amount(struct ice_pf *pf)
{
- int v_rdma;
-
- if (!ice_is_rdma_ena(pf)) {
- pf->num_lan_msix = v_remain;
- return;
- }
-
- /* RDMA needs at least 1 interrupt in addition to AEQ MSIX */
- v_rdma = ICE_RDMA_NUM_AEQ_MSIX + 1;
-
- if (v_remain < ICE_MIN_LAN_TXRX_MSIX + ICE_MIN_RDMA_MSIX) {
- dev_warn(ice_pf_to_dev(pf), "Not enough MSI-X vectors to support RDMA.\n");
- clear_bit(ICE_FLAG_RDMA_ENA, pf->flags);
-
- pf->num_rdma_msix = 0;
- pf->num_lan_msix = ICE_MIN_LAN_TXRX_MSIX;
- } else if ((v_remain < ICE_MIN_LAN_TXRX_MSIX + v_rdma) ||
- (v_remain - v_rdma < v_rdma)) {
- /* Support minimum RDMA and give remaining vectors to LAN MSIX
- */
- pf->num_rdma_msix = ICE_MIN_RDMA_MSIX;
- pf->num_lan_msix = v_remain - ICE_MIN_RDMA_MSIX;
- } else {
- /* Split remaining MSIX with RDMA after accounting for AEQ MSIX
- */
- pf->num_rdma_msix = (v_remain - ICE_RDMA_NUM_AEQ_MSIX) / 2 +
- ICE_RDMA_NUM_AEQ_MSIX;
- pf->num_lan_msix = v_remain - pf->num_rdma_msix;
- }
-}
-
-/**
- * ice_ena_msix_range - Request a range of MSIX vectors from the OS
- * @pf: board private structure
- *
- * Compute the number of MSIX vectors wanted and request from the OS. Adjust
- * device usage if there are not enough vectors. Return the number of vectors
- * reserved or negative on failure.
- */
-static int ice_ena_msix_range(struct ice_pf *pf)
-{
- int num_cpus, hw_num_msix, v_other, v_wanted, v_actual;
- struct device *dev = ice_pf_to_dev(pf);
- int err;
-
- hw_num_msix = pf->hw.func_caps.common_cap.num_msix_vectors;
- num_cpus = num_online_cpus();
-
- /* LAN miscellaneous handler */
- v_other = ICE_MIN_LAN_OICR_MSIX;
-
- /* Flow Director */
- if (test_bit(ICE_FLAG_FD_ENA, pf->flags))
- v_other += ICE_FDIR_MSIX;
-
- /* switchdev */
- v_other += ICE_ESWITCH_MSIX;
-
- v_wanted = v_other;
-
- /* LAN traffic */
- pf->num_lan_msix = num_cpus;
- v_wanted += pf->num_lan_msix;
-
- /* RDMA auxiliary driver */
- if (ice_is_rdma_ena(pf)) {
- pf->num_rdma_msix = num_cpus + ICE_RDMA_NUM_AEQ_MSIX;
- v_wanted += pf->num_rdma_msix;
- }
-
- if (v_wanted > hw_num_msix) {
- int v_remain;
-
- dev_warn(dev, "not enough device MSI-X vectors. wanted = %d, available = %d\n",
- v_wanted, hw_num_msix);
-
- if (hw_num_msix < ICE_MIN_MSIX) {
- err = -ERANGE;
- goto exit_err;
- }
-
- v_remain = hw_num_msix - v_other;
- if (v_remain < ICE_MIN_LAN_TXRX_MSIX) {
- v_other = ICE_MIN_MSIX - ICE_MIN_LAN_TXRX_MSIX;
- v_remain = ICE_MIN_LAN_TXRX_MSIX;
- }
-
- ice_reduce_msix_usage(pf, v_remain);
- v_wanted = pf->num_lan_msix + pf->num_rdma_msix + v_other;
-
- dev_notice(dev, "Reducing request to %d MSI-X vectors for LAN traffic.\n",
- pf->num_lan_msix);
- if (ice_is_rdma_ena(pf))
- dev_notice(dev, "Reducing request to %d MSI-X vectors for RDMA.\n",
- pf->num_rdma_msix);
- }
-
- /* actually reserve the vectors */
- v_actual = pci_alloc_irq_vectors(pf->pdev, ICE_MIN_MSIX, v_wanted,
- PCI_IRQ_MSIX);
- if (v_actual < 0) {
- dev_err(dev, "unable to reserve MSI-X vectors\n");
- err = v_actual;
- goto exit_err;
- }
-
- if (v_actual < v_wanted) {
- dev_warn(dev, "not enough OS MSI-X vectors. requested = %d, obtained = %d\n",
- v_wanted, v_actual);
-
- if (v_actual < ICE_MIN_MSIX) {
- /* error if we can't get minimum vectors */
- pci_free_irq_vectors(pf->pdev);
- err = -ERANGE;
- goto exit_err;
- } else {
- int v_remain = v_actual - v_other;
-
- if (v_remain < ICE_MIN_LAN_TXRX_MSIX)
- v_remain = ICE_MIN_LAN_TXRX_MSIX;
-
- ice_reduce_msix_usage(pf, v_remain);
-
- dev_notice(dev, "Enabled %d MSI-X vectors for LAN traffic.\n",
- pf->num_lan_msix);
-
- if (ice_is_rdma_ena(pf))
- dev_notice(dev, "Enabled %d MSI-X vectors for RDMA.\n",
- pf->num_rdma_msix);
- }
- }
-
- return v_actual;
-
-exit_err:
- pf->num_rdma_msix = 0;
- pf->num_lan_msix = 0;
- return err;
+ return ICE_MIN_LAN_OICR_MSIX + num_online_cpus() +
+ (test_bit(ICE_FLAG_FD_ENA, pf->flags) ? ICE_FDIR_MSIX : 0) +
+ (ice_is_rdma_ena(pf) ? num_online_cpus() + ICE_RDMA_AEQ_MSIX : 0);
}
/**
@@ -243,6 +119,7 @@ void ice_clear_interrupt_scheme(struct ice_pf *pf)
{
pci_free_irq_vectors(pf->pdev);
ice_deinit_irq_tracker(pf);
+ ice_deinit_virt_irq_tracker(pf);
}
/**
@@ -252,27 +129,38 @@ void ice_clear_interrupt_scheme(struct ice_pf *pf)
int ice_init_interrupt_scheme(struct ice_pf *pf)
{
int total_vectors = pf->hw.func_caps.common_cap.num_msix_vectors;
- int vectors, max_vectors;
+ int vectors;
- vectors = ice_ena_msix_range(pf);
+ /* load default PF MSI-X range */
+ if (!pf->msix.min)
+ pf->msix.min = ICE_MIN_MSIX;
- if (vectors < 0)
- return -ENOMEM;
+ if (!pf->msix.max)
+ pf->msix.max = min(total_vectors,
+ ice_get_default_msix_amount(pf));
+
+ pf->msix.total = total_vectors;
+ pf->msix.rest = total_vectors - pf->msix.max;
if (pci_msix_can_alloc_dyn(pf->pdev))
- max_vectors = total_vectors;
+ vectors = pf->msix.min;
else
- max_vectors = vectors;
+ vectors = pf->msix.max;
+
+ vectors = pci_alloc_irq_vectors(pf->pdev, pf->msix.min, vectors,
+ PCI_IRQ_MSIX);
+ if (vectors < pf->msix.min)
+ return -ENOMEM;
- ice_init_irq_tracker(pf, max_vectors, vectors);
+ ice_init_irq_tracker(pf, pf->msix.max, vectors);
- return 0;
+ return ice_init_virt_irq_tracker(pf, pf->msix.max, pf->msix.rest);
}
/**
* ice_alloc_irq - Allocate new interrupt vector
* @pf: board private structure
- * @dyn_only: force dynamic allocation of the interrupt
+ * @dyn_allowed: allow dynamic allocation of the interrupt
*
* Allocate new interrupt vector for a given owner id.
* return struct msi_map with interrupt details and track
@@ -285,27 +173,22 @@ int ice_init_interrupt_scheme(struct ice_pf *pf)
* interrupt will be allocated with pci_msix_alloc_irq_at.
*
* Some callers may only support dynamically allocated interrupts.
- * This is indicated with dyn_only flag.
+ * This is indicated with dyn_allowed flag.
*
* On failure, return map with negative .index. The caller
* is expected to check returned map index.
*
*/
-struct msi_map ice_alloc_irq(struct ice_pf *pf, bool dyn_only)
+struct msi_map ice_alloc_irq(struct ice_pf *pf, bool dyn_allowed)
{
- int sriov_base_vector = pf->sriov_base_vector;
struct msi_map map = { .index = -ENOENT };
struct device *dev = ice_pf_to_dev(pf);
struct ice_irq_entry *entry;
- entry = ice_get_irq_res(pf, dyn_only);
+ entry = ice_get_irq_res(pf, dyn_allowed);
if (!entry)
return map;
- /* fail if we're about to violate SRIOV vectors space */
- if (sriov_base_vector && entry->index >= sriov_base_vector)
- goto exit_free_res;
-
if (pci_msix_can_alloc_dyn(pf->pdev) && entry->dynamic) {
map = pci_msix_alloc_irq_at(pf->pdev, entry->index, NULL);
if (map.index < 0)
@@ -353,26 +236,40 @@ void ice_free_irq(struct ice_pf *pf, struct msi_map map)
}
/**
- * ice_get_max_used_msix_vector - Get the max used interrupt vector
- * @pf: board private structure
+ * ice_virt_get_irqs - get irqs for SR-IOV usacase
+ * @pf: pointer to PF structure
+ * @needed: number of irqs to get
*
- * Return index of maximum used interrupt vectors with respect to the
- * beginning of the MSIX table. Take into account that some interrupts
- * may have been dynamically allocated after MSIX was initially enabled.
+ * This returns the first MSI-X vector index in PF space that is used by this
+ * VF. This index is used when accessing PF relative registers such as
+ * GLINT_VECT2FUNC and GLINT_DYN_CTL.
+ * This will always be the OICR index in the AVF driver so any functionality
+ * using vf->first_vector_idx for queue configuration_id: id of VF which will
+ * use this irqs
*/
-int ice_get_max_used_msix_vector(struct ice_pf *pf)
+int ice_virt_get_irqs(struct ice_pf *pf, u32 needed)
{
- unsigned long start, index, max_idx;
- void *entry;
+ int res = bitmap_find_next_zero_area(pf->virt_irq_tracker.bm,
+ pf->virt_irq_tracker.num_entries,
+ 0, needed, 0);
- /* Treat all preallocated interrupts as used */
- start = pf->irq_tracker.num_static;
- max_idx = start - 1;
+ if (res >= pf->virt_irq_tracker.num_entries)
+ return -ENOENT;
- xa_for_each_start(&pf->irq_tracker.entries, index, entry, start) {
- if (index > max_idx)
- max_idx = index;
- }
+ bitmap_set(pf->virt_irq_tracker.bm, res, needed);
+
+ /* conversion from number in bitmap to global irq index */
+ return res + pf->virt_irq_tracker.base;
+}
- return max_idx;
+/**
+ * ice_virt_free_irqs - free irqs used by the VF
+ * @pf: pointer to PF structure
+ * @index: first index to be free
+ * @irqs: number of irqs to free
+ */
+void ice_virt_free_irqs(struct ice_pf *pf, u32 index, u32 irqs)
+{
+ bitmap_clear(pf->virt_irq_tracker.bm, index - pf->virt_irq_tracker.base,
+ irqs);
}
diff --git a/drivers/net/ethernet/intel/ice/ice_irq.h b/drivers/net/ethernet/intel/ice/ice_irq.h
index f35efc08575e..b2f9dbafd57e 100644
--- a/drivers/net/ethernet/intel/ice/ice_irq.h
+++ b/drivers/net/ethernet/intel/ice/ice_irq.h
@@ -15,11 +15,22 @@ struct ice_irq_tracker {
u16 num_static; /* preallocated entries */
};
+struct ice_virt_irq_tracker {
+ unsigned long *bm; /* bitmap to track irq usage */
+ u32 num_entries;
+ /* First MSIX vector used by SR-IOV VFs. Calculated by subtracting the
+ * number of MSIX vectors needed for all SR-IOV VFs from the number of
+ * MSIX vectors allowed on this PF.
+ */
+ u32 base;
+};
+
int ice_init_interrupt_scheme(struct ice_pf *pf);
void ice_clear_interrupt_scheme(struct ice_pf *pf);
struct msi_map ice_alloc_irq(struct ice_pf *pf, bool dyn_only);
void ice_free_irq(struct ice_pf *pf, struct msi_map map);
-int ice_get_max_used_msix_vector(struct ice_pf *pf);
+int ice_virt_get_irqs(struct ice_pf *pf, u32 needed);
+void ice_virt_free_irqs(struct ice_pf *pf, u32 index, u32 irqs);
#endif
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index 38a1c8372180..16c419809849 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -157,6 +157,16 @@ static void ice_vsi_set_num_desc(struct ice_vsi *vsi)
}
}
+static u16 ice_get_rxq_count(struct ice_pf *pf)
+{
+ return min(ice_get_avail_rxq_count(pf), num_online_cpus());
+}
+
+static u16 ice_get_txq_count(struct ice_pf *pf)
+{
+ return min(ice_get_avail_txq_count(pf), num_online_cpus());
+}
+
/**
* ice_vsi_set_num_qs - Set number of queues, descriptors and vectors for a VSI
* @vsi: the VSI being configured
@@ -178,9 +188,7 @@ static void ice_vsi_set_num_qs(struct ice_vsi *vsi)
vsi->alloc_txq = vsi->req_txq;
vsi->num_txq = vsi->req_txq;
} else {
- vsi->alloc_txq = min3(pf->num_lan_msix,
- ice_get_avail_txq_count(pf),
- (u16)num_online_cpus());
+ vsi->alloc_txq = ice_get_txq_count(pf);
}
pf->num_lan_tx = vsi->alloc_txq;
@@ -193,17 +201,13 @@ static void ice_vsi_set_num_qs(struct ice_vsi *vsi)
vsi->alloc_rxq = vsi->req_rxq;
vsi->num_rxq = vsi->req_rxq;
} else {
- vsi->alloc_rxq = min3(pf->num_lan_msix,
- ice_get_avail_rxq_count(pf),
- (u16)num_online_cpus());
+ vsi->alloc_rxq = ice_get_rxq_count(pf);
}
}
pf->num_lan_rx = vsi->alloc_rxq;
- vsi->num_q_vectors = min_t(int, pf->num_lan_msix,
- max_t(int, vsi->alloc_rxq,
- vsi->alloc_txq));
+ vsi->num_q_vectors = max(vsi->alloc_rxq, vsi->alloc_txq);
break;
case ICE_VSI_SF:
vsi->alloc_txq = 1;
@@ -567,6 +571,8 @@ ice_vsi_alloc_def(struct ice_vsi *vsi, struct ice_channel *ch)
return -ENOMEM;
}
+ vsi->irq_dyn_alloc = pci_msix_can_alloc_dyn(vsi->back->pdev);
+
switch (vsi->type) {
case ICE_VSI_PF:
case ICE_VSI_SF:
@@ -827,7 +833,13 @@ bool ice_is_safe_mode(struct ice_pf *pf)
*/
bool ice_is_rdma_ena(struct ice_pf *pf)
{
- return test_bit(ICE_FLAG_RDMA_ENA, pf->flags);
+ union devlink_param_value value;
+ int err;
+
+ err = devl_param_driverinit_value_get(priv_to_devlink(pf),
+ DEVLINK_PARAM_GENERIC_ID_ENABLE_RDMA,
+ &value);
+ return err ? test_bit(ICE_FLAG_RDMA_ENA, pf->flags) : value.vbool;
}
/**
@@ -1173,12 +1185,11 @@ static void ice_set_rss_vsi_ctx(struct ice_vsi_ctx *ctxt, struct ice_vsi *vsi)
static void
ice_chnl_vsi_setup_q_map(struct ice_vsi *vsi, struct ice_vsi_ctx *ctxt)
{
- struct ice_pf *pf = vsi->back;
u16 qcount, qmap;
u8 offset = 0;
int pow;
- qcount = min_t(int, vsi->num_rxq, pf->num_lan_msix);
+ qcount = vsi->num_rxq;
pow = order_base_2(qcount);
qmap = FIELD_PREP(ICE_AQ_VSI_TC_Q_OFFSET_M, offset);
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index c3a0fb97c5ee..d7037de29545 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -5186,11 +5186,12 @@ int ice_load(struct ice_pf *pf)
ice_napi_add(vsi);
+ ice_init_features(pf);
+
err = ice_init_rdma(pf);
if (err)
goto err_init_rdma;
- ice_init_features(pf);
ice_service_task_restart(pf);
clear_bit(ICE_DOWN, pf->state);
@@ -5198,6 +5199,7 @@ int ice_load(struct ice_pf *pf)
return 0;
err_init_rdma:
+ ice_deinit_features(pf);
ice_tc_indir_block_unregister(vsi);
err_tc_indir_block_register:
ice_unregister_netdev(vsi);
@@ -5221,8 +5223,8 @@ void ice_unload(struct ice_pf *pf)
devl_assert_locked(priv_to_devlink(pf));
- ice_deinit_features(pf);
ice_deinit_rdma(pf);
+ ice_deinit_features(pf);
ice_tc_indir_block_unregister(vsi);
ice_unregister_netdev(vsi);
ice_devlink_destroy_pf_port(pf);
diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.c b/drivers/net/ethernet/intel/ice/ice_sriov.c
index b83f99c01d91..33eac29b6a50 100644
--- a/drivers/net/ethernet/intel/ice/ice_sriov.c
+++ b/drivers/net/ethernet/intel/ice/ice_sriov.c
@@ -123,27 +123,6 @@ static void ice_dis_vf_mappings(struct ice_vf *vf)
}
/**
- * ice_sriov_free_msix_res - Reset/free any used MSIX resources
- * @pf: pointer to the PF structure
- *
- * Since no MSIX entries are taken from the pf->irq_tracker then just clear
- * the pf->sriov_base_vector.
- *
- * Returns 0 on success, and -EINVAL on error.
- */
-static int ice_sriov_free_msix_res(struct ice_pf *pf)
-{
- if (!pf)
- return -EINVAL;
-
- bitmap_free(pf->sriov_irq_bm);
- pf->sriov_irq_size = 0;
- pf->sriov_base_vector = 0;
-
- return 0;
-}
-
-/**
* ice_free_vfs - Free all VFs
* @pf: pointer to the PF structure
*/
@@ -177,6 +156,7 @@ void ice_free_vfs(struct ice_pf *pf)
ice_eswitch_detach_vf(pf, vf);
ice_dis_vf_qs(vf);
+ ice_virt_free_irqs(pf, vf->first_vector_idx, vf->num_msix);
if (test_bit(ICE_VF_STATE_INIT, vf->vf_states)) {
/* disable VF qp mappings and set VF disable state */
@@ -200,9 +180,6 @@ void ice_free_vfs(struct ice_pf *pf)
mutex_unlock(&vf->cfg_lock);
}
- if (ice_sriov_free_msix_res(pf))
- dev_err(dev, "Failed to free MSIX resources used by SR-IOV\n");
-
vfs->num_qps_per = 0;
ice_free_vf_entries(pf);
@@ -372,40 +349,6 @@ void ice_calc_vf_reg_idx(struct ice_vf *vf, struct ice_q_vector *q_vector)
}
/**
- * ice_sriov_set_msix_res - Set any used MSIX resources
- * @pf: pointer to PF structure
- * @num_msix_needed: number of MSIX vectors needed for all SR-IOV VFs
- *
- * This function allows SR-IOV resources to be taken from the end of the PF's
- * allowed HW MSIX vectors so that the irq_tracker will not be affected. We
- * just set the pf->sriov_base_vector and return success.
- *
- * If there are not enough resources available, return an error. This should
- * always be caught by ice_set_per_vf_res().
- *
- * Return 0 on success, and -EINVAL when there are not enough MSIX vectors
- * in the PF's space available for SR-IOV.
- */
-static int ice_sriov_set_msix_res(struct ice_pf *pf, u16 num_msix_needed)
-{
- u16 total_vectors = pf->hw.func_caps.common_cap.num_msix_vectors;
- int vectors_used = ice_get_max_used_msix_vector(pf);
- int sriov_base_vector;
-
- sriov_base_vector = total_vectors - num_msix_needed;
-
- /* make sure we only grab irq_tracker entries from the list end and
- * that we have enough available MSIX vectors
- */
- if (sriov_base_vector < vectors_used)
- return -EINVAL;
-
- pf->sriov_base_vector = sriov_base_vector;
-
- return 0;
-}
-
-/**
* ice_set_per_vf_res - check if vectors and queues are available
* @pf: pointer to the PF structure
* @num_vfs: the number of SR-IOV VFs being configured
@@ -429,11 +372,9 @@ static int ice_sriov_set_msix_res(struct ice_pf *pf, u16 num_msix_needed)
*/
static int ice_set_per_vf_res(struct ice_pf *pf, u16 num_vfs)
{
- int vectors_used = ice_get_max_used_msix_vector(pf);
u16 num_msix_per_vf, num_txq, num_rxq, avail_qs;
int msix_avail_per_vf, msix_avail_for_sriov;
struct device *dev = ice_pf_to_dev(pf);
- int err;
lockdep_assert_held(&pf->vfs.table_lock);
@@ -441,8 +382,7 @@ static int ice_set_per_vf_res(struct ice_pf *pf, u16 num_vfs)
return -EINVAL;
/* determine MSI-X resources per VF */
- msix_avail_for_sriov = pf->hw.func_caps.common_cap.num_msix_vectors -
- vectors_used;
+ msix_avail_for_sriov = pf->virt_irq_tracker.num_entries;
msix_avail_per_vf = msix_avail_for_sriov / num_vfs;
if (msix_avail_per_vf >= ICE_NUM_VF_MSIX_MED) {
num_msix_per_vf = ICE_NUM_VF_MSIX_MED;
@@ -481,13 +421,6 @@ static int ice_set_per_vf_res(struct ice_pf *pf, u16 num_vfs)
return -ENOSPC;
}
- err = ice_sriov_set_msix_res(pf, num_msix_per_vf * num_vfs);
- if (err) {
- dev_err(dev, "Unable to set MSI-X resources for %d VFs, err %d\n",
- num_vfs, err);
- return err;
- }
-
/* only allow equal Tx/Rx queue count (i.e. queue pairs) */
pf->vfs.num_qps_per = min_t(int, num_txq, num_rxq);
pf->vfs.num_msix_per = num_msix_per_vf;
@@ -498,52 +431,6 @@ static int ice_set_per_vf_res(struct ice_pf *pf, u16 num_vfs)
}
/**
- * ice_sriov_get_irqs - get irqs for SR-IOV usacase
- * @pf: pointer to PF structure
- * @needed: number of irqs to get
- *
- * This returns the first MSI-X vector index in PF space that is used by this
- * VF. This index is used when accessing PF relative registers such as
- * GLINT_VECT2FUNC and GLINT_DYN_CTL.
- * This will always be the OICR index in the AVF driver so any functionality
- * using vf->first_vector_idx for queue configuration_id: id of VF which will
- * use this irqs
- *
- * Only SRIOV specific vectors are tracked in sriov_irq_bm. SRIOV vectors are
- * allocated from the end of global irq index. First bit in sriov_irq_bm means
- * last irq index etc. It simplifies extension of SRIOV vectors.
- * They will be always located from sriov_base_vector to the last irq
- * index. While increasing/decreasing sriov_base_vector can be moved.
- */
-static int ice_sriov_get_irqs(struct ice_pf *pf, u16 needed)
-{
- int res = bitmap_find_next_zero_area(pf->sriov_irq_bm,
- pf->sriov_irq_size, 0, needed, 0);
- /* conversion from number in bitmap to global irq index */
- int index = pf->sriov_irq_size - res - needed;
-
- if (res >= pf->sriov_irq_size || index < pf->sriov_base_vector)
- return -ENOENT;
-
- bitmap_set(pf->sriov_irq_bm, res, needed);
- return index;
-}
-
-/**
- * ice_sriov_free_irqs - free irqs used by the VF
- * @pf: pointer to PF structure
- * @vf: pointer to VF structure
- */
-static void ice_sriov_free_irqs(struct ice_pf *pf, struct ice_vf *vf)
-{
- /* Move back from first vector index to first index in bitmap */
- int bm_i = pf->sriov_irq_size - vf->first_vector_idx - vf->num_msix;
-
- bitmap_clear(pf->sriov_irq_bm, bm_i, vf->num_msix);
- vf->first_vector_idx = 0;
-}
-
-/**
* ice_init_vf_vsi_res - initialize/setup VF VSI resources
* @vf: VF to initialize/setup the VSI for
*
@@ -556,7 +443,7 @@ static int ice_init_vf_vsi_res(struct ice_vf *vf)
struct ice_vsi *vsi;
int err;
- vf->first_vector_idx = ice_sriov_get_irqs(pf, vf->num_msix);
+ vf->first_vector_idx = ice_virt_get_irqs(pf, vf->num_msix);
if (vf->first_vector_idx < 0)
return -ENOMEM;
@@ -856,16 +743,10 @@ err_free_entries:
*/
static int ice_ena_vfs(struct ice_pf *pf, u16 num_vfs)
{
- int total_vectors = pf->hw.func_caps.common_cap.num_msix_vectors;
struct device *dev = ice_pf_to_dev(pf);
struct ice_hw *hw = &pf->hw;
int ret;
- pf->sriov_irq_bm = bitmap_zalloc(total_vectors, GFP_KERNEL);
- if (!pf->sriov_irq_bm)
- return -ENOMEM;
- pf->sriov_irq_size = total_vectors;
-
/* Disable global interrupt 0 so we don't try to handle the VFLR. */
wr32(hw, GLINT_DYN_CTL(pf->oicr_irq.index),
ICE_ITR_NONE << GLINT_DYN_CTL_ITR_INDX_S);
@@ -918,7 +799,6 @@ err_unroll_intr:
/* rearm interrupts here */
ice_irq_dynamic_ena(hw, NULL, NULL);
clear_bit(ICE_OICR_INTR_DIS, pf->state);
- bitmap_free(pf->sriov_irq_bm);
return ret;
}
@@ -992,16 +872,7 @@ u32 ice_sriov_get_vf_total_msix(struct pci_dev *pdev)
{
struct ice_pf *pf = pci_get_drvdata(pdev);
- return pf->sriov_irq_size - ice_get_max_used_msix_vector(pf);
-}
-
-static int ice_sriov_move_base_vector(struct ice_pf *pf, int move)
-{
- if (pf->sriov_base_vector - move < ice_get_max_used_msix_vector(pf))
- return -ENOMEM;
-
- pf->sriov_base_vector -= move;
- return 0;
+ return pf->virt_irq_tracker.num_entries;
}
static void ice_sriov_remap_vectors(struct ice_pf *pf, u16 restricted_id)
@@ -1020,7 +891,8 @@ static void ice_sriov_remap_vectors(struct ice_pf *pf, u16 restricted_id)
continue;
ice_dis_vf_mappings(tmp_vf);
- ice_sriov_free_irqs(pf, tmp_vf);
+ ice_virt_free_irqs(pf, tmp_vf->first_vector_idx,
+ tmp_vf->num_msix);
vf_ids[to_remap] = tmp_vf->vf_id;
to_remap += 1;
@@ -1032,7 +904,7 @@ static void ice_sriov_remap_vectors(struct ice_pf *pf, u16 restricted_id)
continue;
tmp_vf->first_vector_idx =
- ice_sriov_get_irqs(pf, tmp_vf->num_msix);
+ ice_virt_get_irqs(pf, tmp_vf->num_msix);
/* there is no need to rebuild VSI as we are only changing the
* vector indexes not amount of MSI-X or queues
*/
@@ -1105,20 +977,15 @@ int ice_sriov_set_msix_vec_count(struct pci_dev *vf_dev, int msix_vec_count)
prev_msix = vf->num_msix;
prev_queues = vf->num_vf_qs;
- if (ice_sriov_move_base_vector(pf, msix_vec_count - prev_msix)) {
- ice_put_vf(vf);
- return -ENOSPC;
- }
-
ice_dis_vf_mappings(vf);
- ice_sriov_free_irqs(pf, vf);
+ ice_virt_free_irqs(pf, vf->first_vector_idx, vf->num_msix);
/* Remap all VFs beside the one is now configured */
ice_sriov_remap_vectors(pf, vf->vf_id);
vf->num_msix = msix_vec_count;
vf->num_vf_qs = queues;
- vf->first_vector_idx = ice_sriov_get_irqs(pf, vf->num_msix);
+ vf->first_vector_idx = ice_virt_get_irqs(pf, vf->num_msix);
if (vf->first_vector_idx < 0)
goto unroll;
@@ -1147,7 +1014,8 @@ unroll:
vf->num_msix = prev_msix;
vf->num_vf_qs = prev_queues;
- vf->first_vector_idx = ice_sriov_get_irqs(pf, vf->num_msix);
+
+ vf->first_vector_idx = ice_virt_get_irqs(pf, vf->num_msix);
if (vf->first_vector_idx < 0) {
ice_put_vf(vf);
return -EINVAL;
diff --git a/include/linux/net/intel/iidc.h b/include/linux/net/intel/iidc.h
index 1c1332e4df26..13274c3def66 100644
--- a/include/linux/net/intel/iidc.h
+++ b/include/linux/net/intel/iidc.h
@@ -78,6 +78,8 @@ int ice_del_rdma_qset(struct ice_pf *pf, struct iidc_rdma_qset_params *qset);
int ice_rdma_request_reset(struct ice_pf *pf, enum iidc_reset_type reset_type);
int ice_rdma_update_vsi_filter(struct ice_pf *pf, u16 vsi_id, bool enable);
void ice_get_qos_params(struct ice_pf *pf, struct iidc_qos_params *qos);
+int ice_alloc_rdma_qvector(struct ice_pf *pf, struct msix_entry *entry);
+void ice_free_rdma_qvector(struct ice_pf *pf, struct msix_entry *entry);
/* Structure representing auxiliary driver tailored information about the core
* PCI dev, each auxiliary driver using the IIDC interface will have an