path: root/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
Age | Commit message | Author | Files | Lines
2020-04-30 | net/mlx5e: Fetch WQE: reuse code and enforce typing | Maxim Mikityanskiy | 1 | -4/+6
Multiple mlx5{e,i}_*_fetch_wqe functions repeat the same code because they operate on different SQ struct types. mlx5e_sq_fetch_wqe also returns void *, instead of the concrete WQE type. This commit generalizes the fetch WQE operation by putting this code into a single function. To simplify calls of the generic function in concrete use cases, macros are provided that substitute the right WQE size and cast the return type. Before this patch, fetch_wqe used to calculate pi itself, but the value was often known to the caller. This calculation is moved outside to eliminate this unnecessary step and prepare for the fill_frag_edge refactoring in the next patch. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
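As a rough illustration of the pattern this entry describes, a minimal standalone C sketch of one generic fetch function plus per-type macros; the struct layouts and names below are simplified stand-ins, not the actual mlx5 definitions.

#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the driver's cyclic work queue. */
struct wq_cyc {
	char  *buf;
	size_t stride;      /* bytes per WQE basic block */
};

/* One shared implementation: zero the WQE at producer index 'pi' and return it.
 * The caller already knows pi, so it is passed in instead of recomputed here.
 */
static void *fetch_wqe(struct wq_cyc *wq, unsigned int pi, size_t wqe_size)
{
	void *wqe = wq->buf + (size_t)pi * wq->stride;

	memset(wqe, 0, wqe_size);
	return wqe;
}

/* Concrete users get typed results through macros that substitute the WQE
 * size and cast the return value, instead of duplicating the function body.
 */
struct tx_wqe  { char data[64]; };
struct nop_wqe { char data[16]; };

#define FETCH_TX_WQE(wq, pi)  ((struct tx_wqe *)fetch_wqe((wq), (pi), sizeof(struct tx_wqe)))
#define FETCH_NOP_WQE(wq, pi) ((struct nop_wqe *)fetch_wqe((wq), (pi), sizeof(struct nop_wqe)))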
2020-04-30 | net/mlx5e: TX, Generalise code and usage of error CQE dump | Tariq Toukan | 1 | -17/+1
Error CQE was dumped only for TXQ SQs. Generalise the function, and add usage for error completions on ICO SQs and XDP SQs. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-25 | net/mlx5e: Define one flow for TXQ selection when TCs are configured | Eran Ben Elisha | 1 | -7/+6
We shall always extract the channel index out of the txq, regardless of the relation between txq_ix and the number of channels. The extraction is always valid: if the txq index is smaller than the number of channels, txq_ix == priv->txq2sq[txq_ix]->ch_ix. By doing so, we can remove an if clause from the select queue method, and have one flow for all packets. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
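A standalone model of the single select-queue flow, assuming the txq2sq and channel_tc2realtxq tables mentioned in these log entries; array sizes and field names are illustrative only.

#define NUM_CHANNELS 8
#define NUM_TCS      4

struct sq_info { int ch_ix; };                           /* channel owning the SQ */
static struct sq_info *txq2sq[NUM_CHANNELS * NUM_TCS];   /* txq index -> SQ       */
static int channel_tc2realtxq[NUM_CHANNELS][NUM_TCS];    /* (channel, tc) -> txq  */

/* One flow for all packets: always derive the channel from the fallback txq,
 * then look up the real txq for (channel, user priority). No special case is
 * needed when QoS is off, because then fallback_txq already equals ch_ix.
 */
static int select_queue(int fallback_txq, int up)
{
	int ch_ix = txq2sq[fallback_txq]->ch_ix;

	return channel_tc2realtxq[ch_ix][up];
}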
2020-02-06 | net/mlx5e: TX, Error completion is for last WQE in batch | Tariq Toukan | 1 | -19/+14
For a cyclic work queue, when not requesting a completion per WQE, a single CQE might indicate the completion of several WQEs. However, in case some WQE in the batch causes an error, then an error completion is issued, breaking the batch, and pointing to the offending WQE in the wqe_counter field. Hence, WQE-specific error CQE handling (like printing, breaking, etc...) should be performed only for the last WQE in batch. Fixes: 130c7b46c93d ("net/mlx5e: TX, Dump WQs wqe descriptors on CQE with error events") Fixes: fd9b4be8002c ("net/mlx5e: RX, Support multiple outstanding UMR posts") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
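A sketch of the batched completion walk implied above: one CQE may retire several WQEs, and only the WQE that wqe_counter points at (the last of the batch) gets the error-specific handling. Struct fields and names are simplified stand-ins.

struct wqe_info { unsigned int num_wqebbs; };

static void poll_tx_cqe(struct wqe_info *wqe_info, unsigned int wq_mask,
			unsigned int *sqcc,          /* SW consumer counter */
			unsigned int wqe_counter,    /* taken from the CQE  */
			int is_err_cqe)
{
	int last_wqe;

	do {
		unsigned int ci = *sqcc & wq_mask;
		struct wqe_info *wi = &wqe_info[ci];

		last_wqe = (ci == wqe_counter);
		*sqcc += wi->num_wqebbs;             /* release this WQE's room */

		if (is_err_cqe && last_wqe) {
			/* WQE-specific error handling (dump, print, counters)
			 * belongs here, for the offending WQE only.
			 */
		}
	} while (!last_wqe);
}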
2019-12-05 | net/mlx5e: Fix TXQ indices to be sequential | Eran Ben Elisha | 1 | -1/+1
Cited patch changed the (channel index, tc) => (TXQ index) mapping to be a static one, in order to keep indices consistent when changing the number of channels or TCs.

For 32 channels (OOB) and 8 TCs, the real number of TXQs is 256. When reducing the number of channels to 8, the real number of TXQs changes to 64. This indexing method is buggy:
- For channel #0, TC 3, the TXQ index is 96.
- Index 8 is not valid, as there is no such TXQ from the driver's perspective (it represents channel #8, TC 0, which is not valid with the above configuration).

As part of the driver's select queue, it calls netdev_pick_tx, which returns an index in the range of the real number of TXQs. Depending on the return value, with the examples above, the driver could have returned an index larger than the real number of TX queues, or crashed the kernel as it tried to read the invalid address of an SQ that was never allocated.

Fix that by allocating sequential TXQ indices, and hold a new mapping between (channel index, tc) => (real TXQ index). This mapping is updated as part of priv channels activation, and is used in mlx5e_select_queue to find the selected queue index. The existing indices mapping (channel_tc2txq) is no longer needed, as it is used only for statistics structures and can be calculated at run time. Delete its definition and updates.

Fixes: 8bfaf07f7806 ("net/mlx5e: Present SW stats when state is not opened") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
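A standalone model, under assumed names, of the sequential mapping the fix describes: netdev TXQ indices become a dense range, and a (channel, tc) -> real TXQ table is rebuilt whenever the channels are activated.

#define MAX_CH 32
#define MAX_TC 8

static int channel_tc2realtxq[MAX_CH][MAX_TC];

static void build_txq_maps(int num_channels, int num_tcs)
{
	int ch, tc;

	for (ch = 0; ch < num_channels; ch++)
		for (tc = 0; tc < num_tcs; tc++)
			/* Dense range [0, num_channels * num_tcs): every index
			 * the stack can pick is backed by an allocated SQ, even
			 * after the number of channels is reduced.
			 */
			channel_tc2realtxq[ch][tc] = ch + tc * num_channels;
}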
2019-11-03 | Merge tag 'mlx5-updates-2019-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux | David S. Miller | 1 | -0/+6
Saeed Mahameed says:
====================
mlx5-updates-2019-11-01

Misc updates for mlx5 netdev and core driver
1) Steering Core: Replace CRC32 internal implementation with standard kernel lib.
2) Steering Core: Support IPv4 and IPv6 mixed matcher.
3) Steering Core: Lockless FTE read lookups
4) TC: Bit sized fields rewrite support.
5) FPGA: Standalone FPGA support.
6) SRIOV: Reset VF parameters configurations on SRIOV disable.
7) netdev: Dump WQs wqe descriptors on CQE with error events.
8) MISC Cleanups.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01 | net/mlx5e: TX, Dump WQs wqe descriptors on CQE with error events | Saeed Mahameed | 1 | -0/+6
Dump the Work Queue's TX WQE descriptor when a completion with error is received. Example:
[5.331832] mlx5_core 0000:00:04.0 enp0s4: Error cqe on cqn 0xa, ci 0x1, TXQ-SQ qpn 0xe, opcode 0xd, syndrome 0x2, vendor syndrome 0x0
[5.333127] 00000000: 55 65 02 75 31 fe c2 d2 6b 6c 62 1e f9 e1 d8 5c
[5.333837] 00000010: d3 b2 6c b8 89 e4 84 20 0b f4 3c e0 f3 75 41 ca
[5.334568] 00000020: 46 00 00 00 cd 70 a0 92 18 3a 01 de 00 00 00 00
[5.335313] 00000030: 7d bc 05 89 b2 e9 00 02 1e 00 00 0e 00 00 30 d2
[5.335972] WQE DUMP: WQ size 1024 WQ cur size 0, WQE index 0x0, len: 64
[5.336710] 00000000: 00 00 00 1e 00 00 0e 04 00 00 00 08 00 00 00 00
[5.337524] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 12 33 33
[5.338151] 00000020: 00 00 00 16 52 54 00 00 00 01 86 dd 60 00 00 00
[5.338740] 00000030: 00 00 00 48 00 00 00 00 00 00 00 00 66 ba 58 14
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-10-18 | net/mlx5e: TX, Fix consumer index of error cqe dump | Tariq Toukan | 1 | -1/+4
The completion queue consumer index increments upon a call to mlx5_cqwq_pop(). When dumping an error CQE, the index is already incremented. Decrease it by one for the print. Fixes: 16cc14d81733 ("net/mlx5e: Dump xmit error completions") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-10-18 | net/mlx5e: kTLS, Release reference on DUMPed fragments in shutdown flow | Tariq Toukan | 1 | -13/+15
A call to kTLS completion handler was missing in the TXQSQ release flow. Add it. Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-10-18 | net/mlx5e: Tx, Fix assumption of single WQEBB of NOP in cleanup flow | Tariq Toukan | 1 | -2/+2
Cited patch removed the assumption only in the datapath. Here we remove it also from the control/cleanup flow. Fixes: 9ab0233728ca ("net/mlx5e: Tx, Don't implicitly assume SKB-less wqe has one WQEBB") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-08-01 | net/mlx5e: Tx, Soften inline mode VLAN dependencies | Tariq Toukan | 1 | -3/+4
If capable, use zero inline mode in TX WQE for non-VLAN packets. For VLAN ones, keep the enforcement of at least L2 inline mode, unless the WQE VLAN insertion offload cap is on.

Performance: Tested single core packet rate of 64Bytes.
NIC: ConnectX-5
CPU: Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz

pktgen:
Before: 12.46 Mpps
After:  14.65 Mpps (+17.5%)

XDP_TX:
The MPWQE flow is not affected, as it already has this optimization. So we test with priv-flag xdp_tx_mpwqe: off.
Before: 9.90 Mpps
After:  10.20 Mpps (+3%)

Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Tested-by: Noam Stolero <noams@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
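A sketch of the decision this entry describes, with illustrative enum values and parameters rather than the driver's real MLX5_INLINE_MODE_* definitions.

enum inline_mode { INLINE_MODE_NONE = 0, INLINE_MODE_L2 = 1 };

static enum inline_mode tx_inline_mode(enum inline_mode sq_min_inline,
					int vlan_present,
					int hw_vlan_insert_cap)
{
	/* Non-VLAN packets, or HW that inserts the VLAN tag itself:
	 * the configured minimum is enough, possibly zero inline.
	 */
	if (!vlan_present || hw_vlan_insert_cap)
		return sq_min_inline;

	/* Otherwise the VLAN tag must be built in the inline header,
	 * so enforce at least L2 inline mode.
	 */
	return sq_min_inline > INLINE_MODE_L2 ? sq_min_inline : INLINE_MODE_L2;
}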
2019-07-22 | net: Use skb accessors in network drivers | Matthew Wilcox (Oracle) | 1 | -1/+1
In preparation for unifying the skb_frag and bio_vec, use the fine accessors which already exist and use skb_frag_t instead of struct skb_frag_struct. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05 | net/mlx5e: Add kTLS TX HW offload support | Tariq Toukan | 1 | -0/+15
Add support for transmit side kernel-TLS acceleration. Offload the crypto encryption to HW.

Per TLS connection:
- Use a separate TIS to maintain the HW context.
- Use a separate encryption key.
- Maintain static and progress HW contexts by posting the proper WQEs at creation time, or upon resync.
- Use a special DUMP opcode to replay the previous frags and sync the HW context.

To make sure the SQ is able to serve an xmit request, increase SQ stop room to cover:
- static params WQE,
- progress params WQE, and
- resync DUMP per frag.

Currently supporting TLS 1.2, and key size 128bit. Tested over SimX simulator.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
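A sketch of the stop-room accounting described here; the WQEBB counts below are illustrative placeholders, not the hardware's real sizes.

#define MAX_SKB_FRAGS           17   /* matches the common kernel value      */
#define MAX_TX_WQEBBS           16   /* room for the largest regular TX WQE  */
#define STATIC_PARAMS_WQEBBS     1   /* placeholder                          */
#define PROGRESS_PARAMS_WQEBBS   1   /* placeholder                          */
#define DUMP_WQEBBS              1   /* placeholder, per replayed fragment   */

/* The SQ is stopped once fewer than stop_room WQEBBs remain, so a single
 * xmit can always post the kTLS control WQEs plus one DUMP per fragment.
 */
static unsigned int sq_stop_room(int ktls_enabled)
{
	unsigned int room = MAX_TX_WQEBBS;

	if (ktls_enabled)
		room += STATIC_PARAMS_WQEBBS +
			PROGRESS_PARAMS_WQEBBS +
			MAX_SKB_FRAGS * DUMP_WQEBBS;

	return room;
}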
2019-07-05 | net/mlx5e: Tx, Unconstify SQ stop room | Tariq Toukan | 1 | -16/+2
Use an SQ field for stop_room, and use the larger value only if TLS is supported. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05 | net/mlx5e: Tx, Don't implicitly assume SKB-less wqe has one WQEBB | Eran Ben Elisha | 1 | -2/+2
When polling a CQE of an SKB-less WQE, don't assume it consumed only one WQEBB. Use wi->num_wqebbs directly instead. In the downstream patch, SKB-less WQEs might have more than one WQEBB, thus this change is needed. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05 | net/mlx5e: Tx, Make SQ WQE fetch function type generic | Tariq Toukan | 1 | -2/+2
Change mlx5e_sq_fetch_wqe to be agnostic to the Work Queue Element (WQE) type. Before this patch, it was specific for struct mlx5e_tx_wqe. In order to allow the change, the function now returns the generic void pointer, and gets the WQE size to do the zero memset. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-05 | net/mlx5e: Tx, Enforce L4 inline copy when needed | Tariq Toukan | 1 | -1/+4
When ctrl->tisn field exists, this indicates an operation (HW offload) on the TCP payload. For such WQEs, inline the headers up to L4. This is in preparation for kTLS HW offload support, added in a downstream patch. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
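A small sketch of the rule stated above, assuming a TCP packet; the parameter names are stand-ins for values such as skb_transport_offset() and tcp_hdrlen(), not the driver's actual helpers.

/* When the control segment carries a TIS number, the HW operates on the TCP
 * payload, so the inline part must include everything up to the end of L4.
 */
static unsigned int tx_inline_header_size(int tisn_set,
					   unsigned int transport_offset,
					   unsigned int tcp_header_len,
					   unsigned int default_ihs)
{
	if (tisn_set)
		return transport_offset + tcp_header_len;

	return default_ihs;
}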
2019-07-05 | net/mlx5e: Move helper functions to a new txrx datapath header | Tariq Toukan | 1 | -51/+1
Take datapath helper functions to a new header file en/txrx.h. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-17 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net | David S. Miller | 1 | -6/+6
Honestly all the conflicts were simple overlapping changes, nothing really interesting to report. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-07 | net/mlx5e: Replace reciprocal_scale in TX select queue function | Shay Agroskin | 1 | -6/+6
The TX queue index returned by the fallback function ranges between [0, NUM CHANNELS - 1] if QoS isn't set and [0, (NUM CHANNELS)*(NUM TCs) - 1] otherwise. Our HW uses different TC mapping than the fallback function (which is denoted as 'up', user priority) so we only need to extract a channel number out of the returned value. Since (NUM CHANNELS)*(NUM TCs) is a relatively small number, using reciprocal scale almost always returns zero. We instead access the 'txq2sq' table to extract the sq (and with it the channel number) associated with the tx queue, thus getting a more evenly distributed channel number.

Perf: Rx/Tx side with Intel(R) Xeon(R) Silver 4108 CPU @ 1.80GHz and ConnectX-5. Used 'iperf' UDP traffic, 10 threads, and priority 5.
Before: 0.566 Mpps
After:  2.37 Mpps

As expected, releasing the existing bottleneck of steering all traffic to TX queue zero significantly improves transmission rates.

Fixes: 7ccdd0841b30 ("net/mlx5e: Fix select queue callback") Signed-off-by: Shay Agroskin <shayag@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
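A standalone re-implementation of the reciprocal_scale() arithmetic (the high 32 bits of a 64-bit product), only to show why feeding it a small txq index collapses everything onto channel 0; the channel and TC counts are examples.

#include <stdint.h>
#include <stdio.h>

static uint32_t reciprocal_scale(uint32_t val, uint32_t ep_ro)
{
	return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
}

int main(void)
{
	uint32_t num_channels = 16;

	/* The fallback returns a small index, e.g. below 8 channels * 8 TCs = 64,
	 * not a full-range 32-bit hash, so the scaled result is always 0.
	 */
	for (uint32_t txq = 0; txq < 64; txq++)
		printf("txq %2u -> channel %u\n", txq, reciprocal_scale(txq, num_channels));

	return 0;   /* every line prints "channel 0": one queue gets all traffic */
}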
2019-05-31 | net/mlx5e: TX, Improve performance under GSO workload | Erez Alfasi | 1 | -3/+4
__netdev_tx_sent_queue() was introduced by: commit 3e59020abf0f ("net: bql: add __netdev_tx_sent_queue()") BQL counters should be updated without flipping/caring about BQL status, if the current skb has xmit_more set. Using __netdev_tx_sent_queue() avoids messing with BQL stop flag, increases performance on GSO workload by keeping doorbells to the minimum required and also sparing atomic operations. Signed-off-by: Erez Alfasi <ereza@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
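Roughly how a driver moves from open-coded BQL accounting to the helper; 'txq', 'num_bytes' and 'ring_doorbell' are placeholders around the netdevice.h API, and mlx5e's actual code differs in detail.

#include <linux/netdevice.h>

static void tx_finish_skb(struct netdev_queue *txq, unsigned int num_bytes,
			  bool xmit_more, void (*ring_doorbell)(void))
{
	/* Old pattern: always update BQL, then re-derive the doorbell decision.
	 *
	 *	netdev_tx_sent_queue(txq, num_bytes);
	 *	if (!xmit_more || netif_xmit_stopped(txq))
	 *		ring_doorbell();
	 */

	/* New pattern: the helper folds xmit_more into the BQL update and
	 * returns true when this skb must flush the doorbell.
	 */
	if (__netdev_tx_sent_queue(txq, num_bytes, xmit_more))
		ring_doorbell();
}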
2019-05-17 | net/mlx5e: Fix wrong xmit_more application | Tariq Toukan | 1 | -4/+5
Cited patch refactored the xmit_more indication while not preserving its functionality. Fix it. Fixes: 3c31ff22b25f ("drivers: mellanox: use netdev_xmit_more() helper") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-04-23 | net: pass net_device argument to the eth_get_headlen | Stanislav Fomichev | 1 | -1/+1
Update all users of eth_get_headlen to pass network device, fetch network namespace from it and pass it down to the flow dissector. This commit is a noop until administrator inserts BPF flow dissector program. Cc: Maxim Krasnyansky <maxk@qti.qualcomm.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: intel-wired-lan@lists.osuosl.org Cc: Yisen Zhuang <yisen.zhuang@huawei.com> Cc: Salil Mehta <salil.mehta@huawei.com> Cc: Michael Chan <michael.chan@broadcom.com> Cc: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-04-01 | drivers: mellanox: use netdev_xmit_more() helper | Florian Westphal | 1 | -8/+9
skb->xmit_more hint is now always 0. This switches the mellanox drivers to the netdev_xmit_more() helper. Cc: Saeed Mahameed <saeedm@mellanox.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Boris Pismenny <borisp@mellanox.com> Cc: Ilya Lesokhin <ilyal@mellanox.com> Cc: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
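A minimal sketch of the conversion inside a generic ndo_start_xmit handler; the descriptor-posting step is elided and the function name is hypothetical.

#include <linux/netdevice.h>

static netdev_tx_t example_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
	struct netdev_queue *txq = netdev_get_tx_queue(dev, skb_get_queue_mapping(skb));
	bool xmit_more = netdev_xmit_more();   /* was: skb->xmit_more */

	/* ... build and post the TX descriptor for skb here ... */

	netdev_tx_sent_queue(txq, skb->len);
	if (!xmit_more || netif_xmit_stopped(txq))
		;   /* ring the doorbell here: no more skbs queued behind this one */

	return NETDEV_TX_OK;
}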
2019-03-22 | net/mlx5e: TX, Add geneve tunnel stateless offload support | Moshe Shemesh | 1 | -0/+5
Currently support only default geneve udp port (6081). For the tx side, the HW is assisted by SW parsing, which sets the headers offset to offload tunneled LSO and csum. Note that for udp tunnels, we don't use special rx offloads, as rss on the outer headers is enough, we support checksum complete and GRO takes care of aggregation.

Geneve TSO BW and CPU load results (tested using iperf single tcp stream). In this patch we add TSO support over Geneve, so the "before" result doesn't actually get to using the TSO HW offload even when turned on. Tested on ConnectX-5, Intel(R) Xeon(R) CPU E5-2660 v2 @2.20GHz.
 __________________________________
|     Before     |      After      |
|________________|_________________|
| 12.6 Gbits/sec | 21.7 Gbits/sec  |
| 100% CPU load  | 61.5% CPU load  |
|________________|_________________|

Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-20 | net: remove 'fallback' argument from dev->ndo_select_queue() | Paolo Abeni | 1 | -3/+2
After the previous patch, all the callers of ndo_select_queue() provide as a 'fallback' argument netdev_pick_tx. The only exceptions are nested calls to ndo_select_queue(), which pass down the 'fallback' available in the current scope - still netdev_pick_tx. We can drop such argument and replace fallback() invocation with netdev_pick_tx(). This avoids an indirect call per xmit packet in some scenarios (TCP syn, UDP unconnected, XDP generic, pktgen) with device drivers implementing such ndo. It also cleans the code a bit.

Tested with ixgbe and CONFIG_FCOE=m

With pktgen using queue xmit:
threads    vanilla    patched
           (kpps)     (kpps)
1          2334       2428
2          4166       4278
4          7895       8100

v1 -> v2:
- rebased after helper's name change

Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-23 | Merge tag 'mlx5-updates-2019-02-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux | David S. Miller | 1 | -3/+2
Saeed Mahameed says:
====================
mlx5-updates-2019-02-21

This series adds some misc updates to mlx5 driver,
1) Eli Britstein, Introduces tunnel entropy control from PCMR register and fixes GRE key by controlling port tunnel entropy calculation.
2) Eran Ben Elisha, provides some mlx5 fixes to the latest tx devlink health reporting mechanism.
3) Huy Nguyen, Added the support for ndo bridge_setlink to allow VEPA/VEB E-Switch legacy mode configurations.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 | net/mlx5e: Re-add support for TX timeout when TX reporter is not valid | Eran Ben Elisha | 1 | -3/+2
When TX reporter was introduced, it took ownership over TX timeout error handling. This introduced a regression in case the TX reporter is not valid (NET_DEVLINK is not set, or devlink_health_reporter_create failure). Fix mlx5e_tx_reporter_timeout function so it can be called at all times. In addition, remove a warning print that indicates that a TX timeout won't be handled in case of no valid TX reporter. Fixes: 7d91126b1aea ("net/mlx5e: Add tx timeout support for mlx5e tx reporter") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-22 | net/mlx5e: Trust kernel regarding transport offset | Maxim Mikityanskiy | 1 | -4/+0
After AF_PACKET is fixed to calculate the transport header offset correctly, trust the value set by the kernel. If the offset wasn't set, it means there is no transport header in the packet. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-22 | net/mlx5e: Remove the wrong assumption about transport offset | Maxim Mikityanskiy | 1 | -9/+2
skb_transport_offset() == 0 is not a special value. The only special value is when skb->transport_header is ~0U, and it's checked by skb_transport_header_was_set(). Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
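A sketch of the check both of these entries converge on, assuming a TCP packet for the header-length arithmetic; only skb_transport_header_was_set() distinguishes "no transport header", while an offset of 0 is an ordinary value.

#include <linux/skbuff.h>
#include <linux/tcp.h>

static int transport_header_end(const struct sk_buff *skb)
{
	if (!skb_transport_header_was_set(skb))
		return -1;   /* packet carries no transport header at all */

	/* Trust the kernel-provided offset, even when it happens to be 0. */
	return skb_transport_offset(skb) + tcp_hdrlen(skb);
}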
2019-02-08 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net | David S. Miller | 1 | -0/+6
An ipvlan bug fix in 'net' conflicted with the abstraction away of the IPV6 specific support in 'net-next'. Similarly, a bug fix for mlx5 in 'net' conflicted with the flow action conversion in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07 | net/mlx5e: Add tx reporter support | Eran Ben Elisha | 1 | -2/+3
Add mlx5e tx reporter to devlink health reporters. This reporter will be responsible for diagnosing, reporting and recovering of tx errors. This patch declares the TX reporter operations and creates it using the devlink health API. Currently, this reporter supports reporting and recovering from send error CQE only. In addition, it adds diagnose information for the open SQs.

For a local SQ recover (due to driver error report), in case of SQ recover failure, the recover operation will be considered as a failure. For a full tx recover, an attempt to close and open the channels will be done. If this one passed successfully, it will be considered as a successful recover.

The SQ recover from error CQE flow is not a new feature in the driver; this patch re-organizes the functions and adapts them for the devlink health API. For this purpose, move code from en_main.c to a new file named reporter_tx.c.

Diagnose output:

$ devlink health diagnose pci/0000:00:09.0 reporter tx -j -p
{
    "SQs": [ {
            "sqn": 138,
            "HW state": 1,
            "stopped": false
        },{
            "sqn": 142,
            "HW state": 1,
            "stopped": false
        } ]
}

$ devlink health diagnose pci/0000:00:09.0 reporter tx
SQs:
  sqn: 138
  HW state: 1
  stopped: false
  sqn: 142
  HW state: 1
  stopped: false

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-05 | net/mlx5e: FPGA, fix Innova IPsec TX offload data path performance | Raed Salem | 1 | -0/+6
At Innova IPsec TX offload data path a special software parser metadata is used to pass some packet attributes to the hardware, this metadata is passed using the Ethernet control segment of a WQE (a HW descriptor) header. The cited commit might nullify this header, hence the metadata is lost, this caused a significant performance drop during hw offloading operation. Fix by restoring the metadata at the Ethernet control segment in case it was nullified. Fixes: 37fdffb217a4 ("net/mlx5: WQ, fixes for fragmented WQ buffers API") Signed-off-by: Raed Salem <raeds@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-01-25 | net: Revert devlink health changes. | David S. Miller | 1 | -1/+1
This reverts the devlink health changes from 9/17/2019, Jiri wants things to be designed differently and it was agreed that the easiest way to do this is start from the beginning again.

Commits reverted:
cb5ccfbe73b389470e1dc11061bb185ef4bc9aec
880ee82f0313453ec5a6cb122866ac057263066b
c7af343b4e33578b7de91786a3f639c8cfa0d97b
ff253fedab961b22117a73ab808fcfa9e6852b50
6f9d56132eb6d2603d4273cfc65bed914ec47acb
fcd852c69d776c0f46c8f79e8e431e5cc6ddc7b7
8a66704a13d9713593342e29b4f0c19762f5746b
12bd0dcefe88782ac1c9fff632958dd1b71d27e5
aba25279c10094c5c97d09c3491ca86d00b4ad5e
ce019faa70f81555fa17ebc1d5a03651f2e7e15a
b8c45a033acc607201588f7665ba84207e5149e0

And the follow-on build fix:
o33a0efa4baecd689da9474ce0e8b673eb6931c60

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-18 | net/mlx5e: Add TX reporter support | Eran Ben Elisha | 1 | -1/+1
Add mlx5e tx reporter to devlink health reporters. This reporter will be responsible for diagnosing, reporting and recovering of TX errors. This patch declares the TX reporter operations and allocates it using the devlink health API. Currently, this reporter supports reporting and recovering from send error CQE only. In addition, it adds diagnose information for the open SQs. For a local SQ recover (due to driver error report), in case of SQ recover failure, the recover operation will be considered as a failure. For a full TX recover, an attempt to close and open the channels will be done. If this one passed successfully, it will be considered as a successful recover. The SQ recover from error CQE flow is not a new feature in the driver; this patch re-organizes the functions and adapts them for the devlink health API. For this purpose, move code from en_main.c to a new file named reporter_tx.c. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20 | net/mlx5e: TX, Print opcode in error CQE warning | Tariq Toukan | 1 | -3/+4
The opcode indicates the error reason. Printing it helps in debug. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10 | Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux | Saeed Mahameed | 1 | -1/+1
mlx5-next shared branch with rdma subtree to avoid mlx5 rdma v.s. netdev conflicts.

Highlights:
1) RDMA ODP (On Demand Paging) improvements and moving ODP logic to mlx5 RDMA driver
2) Improved mlx5 core driver and device events handling and provided API for upper layers to subscribe to device events.
3) RDMA only code cleanup from mlx5 core
4) Add helper to get CQE opcode
5) Rework handling of port module events
6) shared mlx5_ifc.h updates to avoid conflicts

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-09 | net/mlx5: Use helper to get CQE opcode | Tariq Toukan | 1 | -1/+1
Introduce and use a helper that extracts the opcode from a CQE (completion queue entry) structure. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
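A standalone illustration of what such a helper hides: the opcode sits in the upper nibble of the CQE's op_own byte. The stand-in struct below is not the real mlx5_cqe64 layout.

#include <stdint.h>

struct cqe_stub { uint8_t op_own; };

static inline uint8_t cqe_opcode(const struct cqe_stub *cqe)
{
	/* The "opcode 0xd" printed in the error dump further up this log
	 * comes out of this upper nibble.
	 */
	return cqe->op_own >> 4;
}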
2018-11-21 | mlx5: use skb_vlan_tag_get_prio() | Michał Mirosław | 1 | -1/+1
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-10 | net/mlx5: WQ, fixes for fragmented WQ buffers API | Tariq Toukan | 1 | -11/+11
mlx5e netdevice used to calculate fragment edges by a call to mlx5_wq_cyc_get_frag_size(). This calculation did not give the correct indication for queues smaller than a PAGE_SIZE, (broken by default on PowerPC, where PAGE_SIZE == 64KB). Here it is replaced by the correct new calls/API.

Since (TX/RX) Work Queues buffers are fragmented, here we introduce changes to the API in core driver, so that it gets a stride index and returns the index of last stride on same fragment, and an additional wrapping function that returns the number of physically contiguous strides that can be written contiguously to the work queue.

This obsoletes the following API functions, and their buggy usage in EN driver:
* mlx5_wq_cyc_get_frag_size()
* mlx5_wq_cyc_ctr2fragix()

The new API improves modularity and hides the details of such calculation for mlx5e netdevice and mlx5_ib rdma drivers.

New calculation is also more efficient, and improves performance as follows:
Packet rate test: pktgen, UDP / IPv4, 64byte, single ring, 8K ring size.
Before: 16,477,619 pps
After:  17,085,793 pps
3.7% improvement

Fixes: 3a2f70331226 ("net/mlx5: Use order-0 allocations for all WQ types") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
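A standalone model of the fragment-aware calculation the new API provides: given a stride index and a power-of-two fragment capacity, how many strides can still be written contiguously before the fragment edge. The name is illustrative, not the actual mlx5_wq_cyc_* API.

/* E.g. with 8 strides per fragment, stride index 6 leaves room for 2
 * contiguous strides (6 and 7) before the next fragment begins.
 */
static unsigned int contig_strides_to_frag_end(unsigned int stride_ix,
					       unsigned int strides_per_frag)
{
	return strides_per_frag - (stride_ix & (strides_per_frag - 1));
}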
2018-07-26 | net/mlx5e: TX, Use function to access sq_dma object in fifo | Tariq Toukan | 1 | -10/+9
Use designated function mlx5e_dma_get() to get the mlx5e_sq_dma object to be pushed into fifo. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-09 | net: allow fallback function to pass netdev | Alexander Duyck | 1 | -1/+1
For most of these calls we can just pass NULL through to the fallback function as the sb_dev. The only cases where we cannot are the cases where we might be dealing with either an upper device or a driver that would have configured things to support an sb_dev itself. The only driver that has any significant change in this patch set should be ixgbe as we can drop the redundant functionality that existed in both the ndo_select_queue function and the fallback function that was passed through to us. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-09 | net: allow ndo_select_queue to pass netdev | Alexander Duyck | 1 | -1/+2
This patch makes it so that instead of passing a void pointer as the accel_priv we instead pass a net_device pointer as sb_dev. Making this change allows us to pass the subordinate device through to the fallback function eventually so that we can keep the actual code in the ndo_select_queue call as focused as possible on the exception cases. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-06-28 | net/mlx5e: Add TX completions statistics | Tariq Toukan | 1 | -2/+7
Add per-ring and global ethtool counters for TX completions. This helps us monitor and analyze TX flow performance. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-06-28 | net/mlx5e: Add UDP GSO support | Boris Pismenny | 1 | -3/+5
This patch enables UDP GSO support. We enable this by using two WQEs: the first is a UDP LSO WQE for all segments with equal length, and the second is for the last segment in case it has a different length. Due to HW limitation, before sending, we must adjust the packet length fields.

We measure performance between two Intel(R) Xeon(R) CPU E5-2643 v2 @3.50GHz machines connected back-to-back with ConnectX4-Lx (40Gbps) NICs. We compare single stream UDP, UDP GSO and UDP GSO with offload.

Performance:
                | MSS (bytes) | Throughput (Gbps) | CPU utilization (%)
UDP GSO offload | 1472        | 35.6              | 8%
UDP GSO         | 1472        | 25.5              | 17%
UDP             | 1472        | 10.2              | 17%
UDP GSO offload | 1024        | 35.6              | 8%
UDP GSO         | 1024        | 19.2              | 17%
UDP             | 1024        | 5.7               | 17%
UDP GSO offload | 512         | 33.8              | 16%
UDP GSO         | 512         | 10.4              | 17%
UDP             | 512         | 3.5               | 17%

Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
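A small arithmetic sketch of the equal-length/remainder split described above; it only illustrates the segment math, not the driver's actual WQE construction.

#include <stdio.h>

int main(void)
{
	unsigned int payload = 5000;            /* UDP payload bytes (example) */
	unsigned int mss     = 1472;
	unsigned int full    = payload / mss;   /* equal-length segments       */
	unsigned int rem     = payload % mss;   /* shorter trailing segment    */

	printf("UDP LSO WQE: %u segments of %u bytes\n", full, mss);
	if (rem)
		printf("second WQE: 1 segment of %u bytes\n", rem);

	return 0;
}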
2018-06-01 | net/mlx5e: TX, Obsolete maintaining local copies of skb->len/data | Tariq Toukan | 1 | -30/+12
Instead of maintaining a local copy of skb->len/data and updating it upon every copy to the WQE inline part, just calculate it once when needed, using the ihs. This obsoletes the function mlx5e_tx_skb_pull_inline. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-06-01 | net/mlx5e: IPOIB, Add a missing skb_pull | Tariq Toukan | 1 | -0/+1
A call to mlx5e_tx_skb_pull_inline was mistakenly dropped in the cited patch. Get it back. Fixes: 043dc78ecf07 ("net/mlx5e: TX, Use actual WQE size for SQ edge fill") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-06-01 | net/mlx5e: IPOIB, Fix overflowing SQ WQE memset | Tariq Toukan | 1 | -3/+3
IPoIB WQE size is larger than a single WQEBB. Must not fetch the WQE, and surely not memset it, until it is guaranteed that there are enough WQEBBs available before getting to SQ/frag edge. Fixes: 043dc78ecf07 ("net/mlx5e: TX, Use actual WQE size for SQ edge fill") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-25 | net/mlx5e: Avoid reset netdev stats on configuration changes | Eran Ben Elisha | 1 | -23/+26
Move all RQ, SQ and channel counters from the channel objects into the priv structure. With this change, counters will not be reset upon channel configuration changes. Channel's statistics for SQs which are associated with TCs higher than zero will be presented in ethtool -S, only for SQs which were opened at least once since the module was loaded (regardless of their open/close current status). This is done in order to decrease the total amount of statistics presented and calculated for the common out of box use (no QoS). mlx5e_channel_stats is a compound of CH,RQ,SQs stats in order to create locality for the NAPI when handling TX and RX of the same channel. Align the new statistics struct per ring to avoid several channels update to the same cache line at the same time. Packet rate was tested, no degradation sensed. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> CC: Qing Huang <qing.huang@oracle.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-25 | net/mlx5: Use order-0 allocations for all WQ types | Tariq Toukan | 1 | -11/+13
Complete the transition of all WQ types to use fragmented order-0 coherent memory instead of high-order allocations. CQ-WQ already uses order-0. Here we do the same for cyclic and linked-list WQs. This allows the driver to load cleanly on systems with a highly fragmented coherent memory. Performance tests: ConnectX-5 100Gbps, CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz Packet rate of 64B packets, single transmit ring, size 8K. No degradation is sensed. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>