linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2022-10-03	net/mlx5e: Expose rx_oversize_pkts_buffer counter	Gal Pressman	1	-1/+20
	Add the rx_oversize_pkts_buffer counter to ethtool statistics. This counter exposes the number of dropped received packets due to length which arrived to RQ and exceed software buffer size allocated by the device for incoming traffic. It might imply that the device MTU is larger than the software buffers size. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-07	net/mlx5e: Add MACsec stats support for Rx/Tx flows	Lior Nahmanson	1	-0/+3
	Add the following statistics: RX successfully decrypted MACsec packets: macsec_rx_pkts : Number of packets decrypted successfully macsec_rx_bytes : Number of bytes decrypted successfully Rx dropped MACsec packets: macsec_rx_pkts_drop : Number of MACsec packets dropped macsec_rx_bytes_drop : Number of MACsec bytes dropped TX successfully encrypted MACsec packets: macsec_tx_pkts : Number of packets encrypted/authenticated successfully macsec_tx_bytes : Number of bytes encrypted/authenticated successfully Tx dropped MACsec packets: macsec_tx_pkts_drop : Number of MACsec packets dropped macsec_tx_bytes_drop : Number of MACsec bytes dropped The above can be seen using: ethtool -S <ifc> \|grep macsec Signed-off-by: Lior Nahmanson <liorna@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-19	net/mlx5e: Add resiliency for PTP TX port timestamp	Aya Levin	1	-0/+2
	PTP TX port timestamp relies on receiving 2 CQEs for each outgoing packet (WQE). The regular CQE has a less accurate timestamp than the wire CQE. On link change, the wire CQE may get lost. Let the driver detect and restore the relation between the CQEs, and re-sync after timeout. Add resiliency for this as follows: add id (producer counter) into the WQE's metadata. This id will be received in the wire CQE (in wqe_counter field). On handling the wire CQE, if there is no match, replay the PTP application with the time-stamp from the regular CQE and restore the sync between the CQEs and their SKBs. This patch adds 2 ptp counters: 1) ptp_cq0_resync_event: number of times a mismatch was detected between the regular CQE and the wire CQE. 2) ptp_cq0_resync_cqe: total amount of missing wire CQEs. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-19	net/mlx5e: HTB, move stats and max_sqs to priv	Moshe Tal	1	-6/+6
	Preparation for dynamic allocation of the HTB struct. The statistics should be preserved even when the struct is de-allocated. Signed-off-by: Moshe Tal <moshet@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-07-06	net/mlx5e: Fix capability check for updating vnic env counters	Gal Pressman	1	-1/+1
	The existing capability check for vnic env counters only checks for receive steering discards, although we need the counters update for the exposed internal queue oob counter as well. This could result in the latter counter not being updated correctly when the receive steering discards counter is not supported. Fix that by checking whether any counter is supported instead of only the steering counter capability. Fixes: 0cfafd4b4ddf ("net/mlx5e: Add device out of buffer counter") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-04-09	net/mlx5: Remove FPGA ipsec specific statistics	Leon Romanovsky	1	-1/+0
	Delete the statistics that is not used anymore. Link: https://lore.kernel.org/r/3f194752881e095910c887dd5cede1dcba6acaf3.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2022-04-06	net/mlx5: Remove tls vs. ktls separation as it is the same	Leon Romanovsky	1	-4/+4
	After removal FPGA TLS, we can remove tls->ktls indirection too, as it is the same thing. Link: https://lore.kernel.org/r/67e596599edcffb0de43f26551208dfd34ac777e.1649073691.git.leonro@nvidia.com Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2022-03-14	net/mlx5e: Fix use-after-free in mlx5e_stats_grp_sw_update_stats	Saeed Mahameed	1	-2/+3
	We need to sync page pool stats only for active channels. Reading ethtool stats on a down netdev or a netdev with modified number of channels will result in a user-after-free, trying to access page pools that are freed already. BUG: KASAN: use-after-free in mlx5e_stats_grp_sw_update_stats+0x465/0xf80 Read of size 8 at addr ffff888004835e40 by task ethtool/720 Fixes: cc10e84b2ec3 ("mlx5: add support for page_pool_get_stats") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reported-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Joe Damato <jdamato@fastly.com> Link: https://lore.kernel.org/r/20220312005353.786255-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03	mlx5: add support for page_pool_get_stats	Joe Damato	1	-0/+75
	This change adds support for the page_pool_get_stats API to mlx5. If the user has enabled CONFIG_PAGE_POOL_STATS in their kernel, ethtool will output page pool stats. Signed-off-by: Joe Damato <jdamato@fastly.com> Acked-by: Saeed Mahameed <saeed@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-24	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	1	-3/+3
	tools/testing/selftests/net/mptcp/mptcp_join.sh 34aa6e3bccd8 ("selftests: mptcp: add ip mptcp wrappers") 857898eb4b28 ("selftests: mptcp: add missing join check") 6ef84b1517e0 ("selftests: mptcp: more robust signal race test") https://lore.kernel.org/all/20220221131842.468893-1-broonie@kernel.org/ drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c fb7e76ea3f3b6 ("net/mlx5e: TC, Skip redundant ct clear actions") c63741b426e11 ("net/mlx5e: Fix MPLSoUDP encap to use MPLS action information") 09bf97923224f ("net/mlx5e: TC, Move pedit_headers_action to parse_attr") 84ba8062e383 ("net/mlx5e: Test CT and SAMPLE on flow attr") efe6f961cd2e ("net/mlx5e: CT, Don't set flow flag CT for ct clear flow") 3b49a7edec1d ("net/mlx5e: TC, Reject rules with multiple CT actions") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-23	net/mlx5e: Add feature check for set fec counters	Lama Kayal	1	-3/+3
	Fec counters support is checked via the PCAM feature_cap_mask, bit 0: PPCNT_counter_group_Phy_statistical_counter_group. Add feature check to avoid faulty behavior. Fixes: 0a1498ebfa55 ("net/mlx5e: Expose FEC counters via ethtool") Signed-off-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-16	net/mlx5e: E-Switch, Add PTP counters for uplink representor	Aya Levin	1	-1/+1
	There is a configuration where the uplink interface is the synchronizer. Add PTP counters for this interface for monitoring. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-01-09	net/mlx5e: Fix build error in fec_set_block_stats()	Jakub Kicinski	1	-1/+1
	Build bot reports: drivers/net/ethernet/mellanox/mlx5/core/en_stats.c: In function 'fec_set_block_stats': drivers/net/ethernet/mellanox/mlx5/core/en_stats.c:1235:48: error: 'outl' undeclared (first use in this function); did you mean 'out'? 1235 \| if (mlx5_core_access_reg(mdev, in, sz, outl, sz, MLX5_REG_PPCNT, 0, 0)) \| ^~~~ \| out Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20220109213321.2292830-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-06	net/mlx5e: Expose FEC counters via ethtool	Lama Kayal	1	-3/+98
	Add FEC counters' statistics of corrected_blocks and uncorrectable_blocks, along with their lanes via ethtool. HW supports corrected_blocks and uncorrectable_blocks counters both for RS-FEC mode and FC-FEC mode. In FC mode these counters are accumulated per lane, while in RS mode the correction method crosses lanes, thus only total corrected_blocks and uncorrectable_blocks are reported in this mode. Signed-off-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-21	net/mlx5e: Use dynamic per-channel allocations in stats	Tariq Toukan	1	-8/+8
	Make stats array an array of pointer. This patch comes in to prepare for the next patch where allocations of the stats are to be performed dynamically on first usage. Signed-off-by: Lama Kayal <lkayal@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5: Fix format-security build warnings	Saeed Mahameed	1	-1/+1
	Treat the string as an argument to avoid this. drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c:482:5: error: format string is not a string literal (potentially insecure) name); ^~~~ drivers/net/ethernet/mellanox/mlx5/core/en_stats.c:2079:4: error: format string is not a string literal (potentially insecure) ptp_ch_stats_desc[i].format); ^~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
2021-10-26	net/mlx5e: Add HW_GRO statistics	Khalid Manaa	1	-0/+15
	This patch adds HW_GRO counters to RX packets statistics: - gro_match_packets: counter of received packets with set match flag. - gro_packets: counter of received packets over the HW_GRO feature, this counter is increased by one for every received HW_GRO cqe. - gro_bytes: counter of received bytes over the HW_GRO feature, this counter is increased by the received bytes for every received HW_GRO cqe. - gro_skbs: counter of built HW_GRO skbs, increased by one when we flush HW_GRO skb (when we call a napi_gro_receive with hw_gro skb). - gro_large_hds: counter of received packets with large headers size, in case the packet needs new SKB, the driver will allocate new one and will not use the headers entry to build it. Signed-off-by: Khalid Manaa <khalidm@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-09-30	net/mlx5e: Fix the presented RQ index in PTP stats	Lama Kayal	1	-1/+2
	PTP-RQ counters title format contains PTP-RQ identifier, which is mistakenly not passed to sprinft(). This leads to unexpected garbage values instead. This patch fixes it. Before applying the patch: ethtool -S eth3 \| grep ptp_rq ptp_rq15_packets: 0 ptp_rq8_bytes: 0 ptp_rq6_csum_complete: 0 ptp_rq14_csum_complete_tail: 0 ptp_rq3_csum_complete_tail_slow : 0 ptp_rq9_csum_unnecessary: 0 ptp_rq1_csum_unnecessary_inner: 0 ptp_rq7_csum_none: 0 ptp_rq10_xdp_drop: 0 ptp_rq9_xdp_redirect: 0 ptp_rq13_lro_packets: 0 ptp_rq12_lro_bytes: 0 ptp_rq10_ecn_mark: 0 ptp_rq9_removed_vlan_packets: 0 ptp_rq5_wqe_err: 0 ptp_rq8_mpwqe_filler_cqes: 0 ptp_rq2_mpwqe_filler_strides: 0 ptp_rq5_oversize_pkts_sw_drop: 0 ptp_rq6_buff_alloc_err: 0 ptp_rq15_cqe_compress_blks: 0 ptp_rq2_cqe_compress_pkts: 0 ptp_rq2_cache_reuse: 0 ptp_rq12_cache_full: 0 ptp_rq11_cache_empty: 256 ptp_rq12_cache_busy: 0 ptp_rq11_cache_waive: 0 ptp_rq12_congst_umr: 0 ptp_rq11_arfs_err: 0 ptp_rq9_recover: 0 After applying the patch: ethtool -S eth3 \| grep ptp_rq ptp_rq0_packets: 0 ptp_rq0_bytes: 0 ptp_rq0_csum_complete: 0 ptp_rq0_csum_complete_tail: 0 ptp_rq0_csum_complete_tail_slow : 0 ptp_rq0_csum_unnecessary: 0 ptp_rq0_csum_unnecessary_inner: 0 ptp_rq0_csum_none: 0 ptp_rq0_xdp_drop: 0 ptp_rq0_xdp_redirect: 0 ptp_rq0_lro_packets: 0 ptp_rq0_lro_bytes: 0 ptp_rq0_ecn_mark: 0 ptp_rq0_removed_vlan_packets: 0 ptp_rq0_wqe_err: 0 ptp_rq0_mpwqe_filler_cqes: 0 ptp_rq0_mpwqe_filler_strides: 0 ptp_rq0_oversize_pkts_sw_drop: 0 ptp_rq0_buff_alloc_err: 0 ptp_rq0_cqe_compress_blks: 0 ptp_rq0_cqe_compress_pkts: 0 ptp_rq0_cache_reuse: 0 ptp_rq0_cache_full: 0 ptp_rq0_cache_empty: 256 ptp_rq0_cache_busy: 0 ptp_rq0_cache_waive: 0 ptp_rq0_congst_umr: 0 ptp_rq0_arfs_err: 0 ptp_rq0_recover: 0 Fixes: a28359e922c6 ("net/mlx5e: Add PTP-RX statistics") Signed-off-by: Lama Kayal <lkayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-09-30	net/mlx5e: Keep the value for maximum number of channels in-sync	Tariq Toukan	1	-4/+4
	The value for maximum number of channels is first calculated based on the netdev's profile and current function resources (specifically, number of MSIX vectors, which depends among other things on the number of online cores in the system). This value is then used to calculate the netdev's number of rxqs/txqs. Once created (by alloc_etherdev_mqs), the number of netdev's rxqs/txqs is constant and we must not exceed it. To achieve this, keep the maximum number of channels in sync upon any netdevice re-attach. Use mlx5e_get_max_num_channels() for calculating the number of netdev's rxqs/txqs. After netdev is created, use mlx5e_calc_max_nch() (which coinsiders core device resources, profile, and netdev) to init or update priv->max_nch. Before this patch, the value of priv->max_nch might get out of sync, mistakenly allowing accesses to out-of-bounds objects, which would crash the system. Track the number of channels stats structures used in a separate field, as they are persistent to suspend/resume operations. All the collected stats of every channel index that ever existed should be preserved. They are reset only when struct mlx5e_priv is, in mlx5e_priv_cleanup(), which is part of the profile changing flow. There is no point anymore in blocking a profile change due to max_nch mismatch in mlx5e_netdev_change_profile(). Remove the limitation. Fixes: a1f240f18017 ("net/mlx5e: Adjust to max number of channles when re-attaching") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-16	mlx5: implement ethtool standard stats	Jakub Kicinski	1	-7/+135
	Add support for PHY/MAC/Ctrl/RMON stats. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-16	net/mlx5e: kTLS, Add resiliency to RX resync failures	Tariq Toukan	1	-0/+3
	When the TLS logic finds a tcp seq match for a kTLS RX resync request, it calls the driver callback function mlx5e_ktls_resync() to handle it and communicate it to the device. Errors might occur during mlx5e_ktls_resync(), however, they are not reported to the stack. Moreover, there is no error handling in the stack for these errors. In this patch, the driver obtains responsibility on errors handling, adding queue and retry mechanisms to these resyncs. We maintain a linked list of resync matches, and try posting them to the async ICOSQ in the NAPI context. Only possible failure that demands driver handling is ICOSQ being full. By relying on the NAPI mechanism, we make sure that the entries in list will be handled when ICOSQ completions arrive and make some room available. Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-15	mlx5: implement ethtool::get_fec_stats	Jakub Kicinski	1	-2/+27
	Report corrected bits. v2: catch reg access errors (Saeed) Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-09	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	1	-10/+0
	Conflicts: MAINTAINERS - keep Chandrasekar drivers/net/ethernet/mellanox/mlx5/core/en_main.c - simple fix + trust the code re-added to param.c in -next is fine include/linux/bpf.h - trivial include/linux/ethtool.h - trivial, fix kdoc while at it include/linux/skmsg.h - move to relevant place in tcp.c, comment re-wrapped net/core/skmsg.c - add the sk = sk // sk = NULL around calls net/tipc/crypto.c - trivial Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-03-31	net/mlx5e: kTLS, Fix RX counters atomicity	Tariq Toukan	1	-6/+0
	Some TLS RX counters increment per socket/connection, and are not protected against parallel modifications from several cores. Switch them to atomic counters by taking them out of the RQ stats into the global atomic TLS stats. In this patch, we touch 'rx_tls_ctx/del' that count the number of device-offloaded RX TLS connections added/deleted. These counters are updated in the add/del callbacks, out of the fast data-path. This change is not needed for counters that increment only in NAPI context, as they are protected by the NAPI mechanism. Keep them as tls_* counters under 'struct mlx5e_rq_stats'. Fixes: 76c1e1ac2aae ("net/mlx5e: kTLS, Add kTLS RX stats") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-31	net/mlx5e: kTLS, Fix TX counters atomicity	Tariq Toukan	1	-4/+0
	Some TLS TX counters increment per socket/connection, and are not protected against parallel modifications from several cores. Switch them to atomic counters by taking them out of the SQ stats into the global atomic TLS stats. In this patch, we touch a single counter 'tx_tls_ctx' that counts the number of device-offloaded TX TLS connections added. Now that this counter can be increased without the for having the SQ context in hand, move it to the mlx5e_ktls_add_tx() callback where it really belongs, out of the fast data-path. This change is not needed for counters that increment only in NAPI context or under the TX lock, as they are already protected. Keep them as tls_* counters under 'struct mlx5e_sq_stats'. Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-29	net/mlx5e: Add PTP-RX statistics	Aya Levin	1	-26/+88
	Like PTP-TX, once the PTP-RX is opened, corresponding statistics appear. Add indication that PTP-RX was ever opened: rx_ptp_opened. If any of the PTP RX or TX were opened, display the PTP channel's statistics. Signed-off-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-25	net/mlx5e: Generalize PTP implementation	Aya Levin	1	-9/+9
	Following patches in the set add support for RX PTP. Rename PTP prefix from %s/port_ptp/ptp/g to include RX PTP too. In addition rename indication (used in statistics context) that PTP-SQ was opened: %s/port_ptp_opened/tx_ptp_opened/g. This will simplify adding indication that PTP-RQ was opened. Signed-off-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-01-22	net/mlx5e: Support HTB offload	Maxim Mikityanskiy	1	-0/+100
	This commit adds support for HTB offload in the mlx5e driver. Performance: NIC: Mellanox ConnectX-6 Dx CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (24 cores with HT) 100 Gbit/s line rate, 500 UDP streams @ ~200 Mbit/s each 48 traffic classes, flower used for steering No shaping (rate limits set to 4 Gbit/s per TC) - checking for max throughput. Baseline: 98.7 Gbps, 8.25 Mpps HTB: 6.7 Gbps, 0.56 Mpps HTB offload: 95.6 Gbps, 8.00 Mpps Limitations: 1. 256 leaf nodes, 3 levels of depth. 2. Granularity for ceil is 1 Mbit/s. Rates are converted to weights, and the bandwidth is split among the siblings according to these weights. Other parameters for classes are not supported. Ethtool statistics support for QoS SQs are also added. The counters are called qos_txN_, where N is the QoS queue number (starting from 0, the numeration is separate from the normal SQs), and is the counter name (the counters are the same as for the normal SQs). Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-08	net/mlx5e: Add TX port timestamp support	Eran Ben Elisha	1	-1/+20
	Transmitted packet timestamping accuracy can be improved when using timestamp from the port, instead of packet CQE creation timestamp, as it better reflects the actual time of a packet's transmit. TX port timestamping is supported starting from ConnectX6-DX hardware. Although at the original completion, only CQE timestamp can be attached, we are able to get TX port timestamping via an additional completion over a special CQ associated with the SQ (in addition to the regular CQ). Driver to ignore the original packet completion timestamp, and report back the timestamp of the special CQ completion. If the absolute timestamp diff between the two completions is greater than 1 / 128 second, ignore the TX port timestamp as it has a jitter which is too big. No skb will be generate out of the extra completion. Allocate additional CQ per ptpsq, to receive the TX port timestamp. Driver to hold an skb FIFO in order to map between transmitted skb to the two expected completions. When using ptpsq, hold double refcount on the skb, to gaurantee it will not get released before both completions arrive. Expose dedicated counters of the ptp additional CQ and connect it to the TX health reporter. This patch improves TX Hardware timestamping offset to be less than 40ns at a 100Gbps line rate, compared to 600ns before. With that, making our HW compliant with G.8273.2 class C, and allow Linux systems to be deployed in the 5G telco edge, where this standard is a must. Signed-off-by: Eran Ben Elisha <eranbe@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-12-08	net/mlx5e: Add TX PTP port object support	Eran Ben Elisha	1	-0/+96
	Add TX PTP port object support for better TX timestamping accuracy. Currently, driver supports CQE based TX port timestamp. Device also offers TX port timestamp, which has less jitter and better reflects the actual time of a packet's transmit. Define new driver layout called ptpsq, on which driver will create SQs that will support TX port timestamp for their transmitted packets. Driver to identify PTP TX skbs and steer them to these dedicated SQs as part of the select queue ndo. Driver to hold ptpsq per TC and report them at netif_set_real_num_tx_queues(). Add support for all needed functionality in order to xmit and poll completions received via ptpsq. Add ptpsq to the TX reporter recover, diagnose and dump methods. Creation of ptpsqs is disabled by default, and can be enabled via tx_port_ts private flag. This patch steer all timestamp related packets to a ptpsq, but it does not open the port timestamp support for it. The support will be added in the following patch. Signed-off-by: Eran Ben Elisha <eranbe@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-12-08	net/mlx5e: Split SW group counters update function	Eran Ben Elisha	1	-127/+161
	SW group counter update function aggregates sw stats out of many mlx5e__stats resides in a given mlx5e_channel_stats struct. Split the function into a few helper functions. This will be used later in the series to calculate specific mlx5e__stats which are not defined inside mlx5e_channel_stats. Signed-off-by: Eran Ben Elisha <eranbe@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-09-22	Merge tag 'mlx5-updates-2020-09-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux	David S. Miller	1	-0/+6
	Saeed Mahameed says: ==================== mlx5-updates-2020-09-21 Multi packet TX descriptor support for SKBs. This series introduces some refactoring of the regular TX data path in mlx5 and adds the Enhanced TX MPWQE feature support. MPWQE stands for multi-packet work queue element, and it can serve multiple packets, reducing the PCI bandwidth spent on control traffic. It should improve performance in scenarios where PCI is the bottleneck, and xmit_more is signaled by the kernel. The refactoring done in this series also improves the packet rate on its own. MPWQE is already implemented in the XDP tx path, this series adds the support of MPWQE for regular kernel SKB tx path. MPWQE is supported from ConnectX-5 and onward, for legacy devices we need to keep backward compatibility for regular (Single packet) WQE descriptor. MPWQE is not compatible with certain offloads and features, such as TLS offload, TSO, nonlinear SKBs. If such incompatible features are in use, the driver gracefully falls back to non-MPWQE per SKB. Prior to the final patch "net/mlx5e: Enhanced TX MPWQE for SKBs" that adds the actual support, Maxim did some refactoring to the tx data path to split it into stages and smaller helper functions that can be utilized and reused for both legacy and new MPWQE feature. Performance testing: UDP performance is improved in a single stream pktgen test: Packet rate: 16.86 Mpps (±0.15 Mpps) -> 20.94 Mpps (±0.33 Mpps) Instructions per packet: 434 -> 329 Cycles per packet: 158 -> 123 Instructions per cycle: 2.75 -> 2.67 TCP and XDP_TX single stream tests show no performance difference. MPWQE can reduce PCI bandwidth: PCI Gen2, pktgen at fixed rate of 36864000 pps on 24 CPU cores: Inbound PCI utilization with MPWQE off: 80.3% Inbound PCI utilization with MPWQE on: 59.0% PCI Gen3, pktgen at fixed rate of 56064000 pps on 24 CPU cores: Inbound PCI utilization with MPWQE off: 65.4% Inbound PCI utilization with MPWQE on: 49.3% MPWQE can also reduce CPU load, increasing the packet rate in case of CPU bottleneck: PCI Gen2, pktgen at full rate on 24 CPU cores: Packet rate with MPWQE off: 37.5 Mpps Packet rate with MPWQE on: 49.0 Mpps PCI Gen3, pktgen at full rate on 24 CPU cores: Packet rate with MPWQE off: 57.0 Mpps Packet rate with MPWQE on: 66.8 Mpps Burst size in all pktgen tests is 32. CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (x86_64) NIC: Mellanox ConnectX-6 Dx GCC 10.2.0 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-22	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	David S. Miller	1	-0/+12
	Two minor conflicts: 1) net/ipv4/route.c, adding a new local variable while moving another local variable and removing it's initial assignment. 2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes. One pretty prints the port mode differently, whilst another changes the driver to try and obtain the port mode from the port node rather than the switch node. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-21	net/mlx5e: Enhanced TX MPWQE for SKBs	Maxim Mikityanskiy	1	-0/+6
	This commit adds support for Enhanced TX MPWQE feature in the regular (SKB) data path. A MPWQE (multi-packet work queue element) can serve multiple packets, reducing the PCI bandwidth on control traffic. Two new stats (tx_mpwqe_blks and tx_mpwqe_pkts) are added. The feature is on by default and controlled by the skb_tx_mpwqe private flag. In a MPWQE, eseg is shared among all packets, so eseg-based offloads (IPSEC, GENEVE, checksum) run on a separate eseg that is compared to the eseg of the current MPWQE session to decide if the new packet can be added to the same session. MPWQE is not compatible with certain offloads and features, such as TLS offload, TSO, nonlinear SKBs. If such incompatible features are in use, the driver gracefully falls back to non-MPWQE. This change has no performance impact in TCP single stream test and XDP_TX single stream test. UDP pktgen, 64-byte packets, single stream, MPWQE off: Packet rate: 16.96 Mpps (±0.12 Mpps) -> 17.01 Mpps (±0.20 Mpps) Instructions per packet: 421 -> 429 Cycles per packet: 156 -> 161 Instructions per cycle: 2.70 -> 2.67 UDP pktgen, 64-byte packets, single stream, MPWQE on: Packet rate: 16.96 Mpps (±0.12 Mpps) -> 20.94 Mpps (±0.33 Mpps) Instructions per packet: 421 -> 329 Cycles per packet: 156 -> 123 Instructions per cycle: 2.70 -> 2.67 Enabling MPWQE can reduce PCI bandwidth: PCI Gen2, pktgen at fixed rate of 36864000 pps on 24 CPU cores: Inbound PCI utilization with MPWQE off: 80.3% Inbound PCI utilization with MPWQE on: 59.0% PCI Gen3, pktgen at fixed rate of 56064000 pps on 24 CPU cores: Inbound PCI utilization with MPWQE off: 65.4% Inbound PCI utilization with MPWQE on: 49.3% Enabling MPWQE can also reduce CPU load, increasing the packet rate in case of CPU bottleneck: PCI Gen2, pktgen at full rate on 24 CPU cores: Packet rate with MPWQE off: 37.5 Mpps Packet rate with MPWQE on: 49.0 Mpps PCI Gen3, pktgen at full rate on 24 CPU cores: Packet rate with MPWQE off: 57.0 Mpps Packet rate with MPWQE on: 66.8 Mpps Burst size in all pktgen tests is 32. CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (x86_64) NIC: Mellanox ConnectX-6 Dx GCC 10.2.0 Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-09-21	net/mlx5e: Fix using wrong stats_grps in mlx5e_update_ndo_stats()	Alaa Hleihel	1	-0/+12
	The cited commit started to reuse function mlx5e_update_ndo_stats() for the representors as well. However, the function is hard-coded to work on mlx5e_nic_stats_grps only. Due to this issue, the representors statistics were not updated in the output of "ip -s". Fix it to work with the correct group by extracting it from the caller's profile. Also, while at it and since this function became generic, move it to en_stats.c and rename it accordingly. Fixes: 8a236b15144b ("net/mlx5e: Convert rep stats to mlx5e_stats_grp-based infra") Signed-off-by: Alaa Hleihel <alaa@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-09-15	mlx5: add pause frame stats	Jakub Kicinski	1	-0/+29
	Plumb through all the indirection and copy some code from ethtool -S. The names of the group indicate that these are the stats we are after (and Saeed confirms it). v3: - fix build in mlx5_rep v2: - drop the ethool helper and call stats directly - don't pass 0 as initialized to in buffer - use local buffer Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-27	net/mlx5e: kTLS, Add kTLS RX stats	Tariq Toukan	1	-0/+39
	Add global and per-channel ethtool SW stats for the device offload. Document the new counters in tls-offload.rst. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-30	Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux	Saeed Mahameed	1	-23/+27
	mlx5 updates for both net-next and rdma-next: 1) HW bits and definitions for TLS and IPsec offlaods 2) Release all pages capability bits 3) New command interface helpers and some code cleanup as a result 4) Move qp.c out of mlx5 core driver into mlx5_ib rdma driver Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-23	net/mlx5: Update statistics to new cmd interface	Leon Romanovsky	1	-12/+5
	Do mass update of statistics to reuse newly introduced mlx5_cmd_exec_in*() interfaces. Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-04-20	net/mlx5e: IPSec, Expose IPsec HW stat only for supporting HW	Raed Salem	1	-24/+5
	The current HW counters are supported only by Innova, split the ipsec stats group into two groups, one for HW and one for SW. And expose the HW counters to ethtool only if Innova HW is used for IPsec offload. Signed-off-by: Raed Salem <raeds@mellanox.com> Reviewed-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-19	net/mlx5: Remove Q counter low level helper APIs	Leon Romanovsky	1	-12/+23
	mlx5 core users are encouraged to use low level API (mlx5_cmd_exec) without the need of helper functions, do this for q counters, remove helper functions and call mlx5_cmd_exec directly from users. This will help reduce the total amount of code and reduction of the mlx5_core symbol table. Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-22	net/mlx5e: Enable all available stats for uplink reps	Vlad Buslov	1	-2/+2
	Extend stats group array of uplink representor with all stats that are available for PF in legacy mode, besides ipsec and TLS which are not supported. Don't output vport stats for uplink representor because they are already handled by 802_3 group (with different names: {tx\|rx}_{bytes\|packets}_phy). Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22	net/mlx5e: IPoIB, use separate stats groups	Saeed Mahameed	1	-13/+13
	Don't copy all of the stats groups used for mlx5e ethernet NIC profile, have a separate stats groups for IPoIB with the set of the needed stats only. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22	net/mlx5e: Convert stats groups array to array of group pointers	Saeed Mahameed	1	-26/+43
	Convert stats groups array to array of "stats group" pointers to allow sharing and individual selection of groups per profile as illustrated in the next patches. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2020-01-22	net/mlx5e: Declare stats groups via macro	Saeed Mahameed	1	-183/+83
	Introduce new macros to declare stats callbacks and groups, for better code reuse and for individual groups selection per profile which will be introduced in next patches. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2020-01-22	net/mlx5e: Profile specific stats groups	Saeed Mahameed	1	-2/+57
	Attach stats groups array to the profiles and make the stats utility functions (get_num, update, fill, fill_strings) generic and use the profile->stats_grps rather the hardcoded NIC stats groups. This will allow future extension to have per profile stats groups. In this patch mlx5e NIC and IPoIB will still share the same stats groups. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2020-01-16	Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux	Saeed Mahameed	1	-0/+1
	This merge syncs with mlx5-next latest HW bits and layout updates for next features, in addition one patch that improves mlx5_create_auto_grouped_flow_table() API across all mlx5 users. * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: Refactor mlx5_create_auto_grouped_flow_table net/mlx5e: Add discard counters per priority net/mlx5e: Expose FEC feilds and related capability bit net/mlx5: Add mlx5_ifc definitions for connection tracking support net/mlx5: Add copy header action struct layout net/mlx5: Expose resource dump register mapping net/mlx5: Add structures and defines for MIRC register net/mlx5: Read MCAM register groups 1 and 2 net/mlx5: Add structures layout for new MCAM access reg groups net/mlx5: Expose vDPA emulation device capabilities net/mlx5: Add Virtio Emulation related device capabilities Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16	net/mlx5e: Add discard counters per priority	Aharon Landau	1	-0/+1
	Add counters that count (per priority) the number of received packets that dropped due to lack of buffers on a physical port. If this counter is increasing, it implies that the adapter is congested and cannot absorb the traffic coming from the network. Signed-off-by: Aharon Landau <aharonl@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-07	mlx5: work around high stack usage with gcc	Arnd Bergmann	1	-0/+3
	In some configurations, gcc tries too hard to optimize this code: drivers/net/ethernet/mellanox/mlx5/core/en_stats.c: In function 'mlx5e_grp_sw_update_stats': drivers/net/ethernet/mellanox/mlx5/core/en_stats.c:302:1: error: the frame size of 1336 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] As was stated in the bug report, the reason is that gcc runs into a corner case in the register allocator that is rather hard to fix in a good way. As there is an easy way to work around it, just add a comment and the barrier that stops gcc from trying to overoptimize the function. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92657 Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-12-05	net/mlx5e: Fix TXQ indices to be sequential	Eran Ben Elisha	1	-1/+1
	Cited patch changed (channel index, tc) => (TXQ index) mapping to be a static one, in order to keep indices consistent when changing number of channels or TCs. For 32 channels (OOB) and 8 TCs, real num of TXQs is 256. When reducing the amount of channels to 8, the real num of TXQs will be changed to 64. This indices method is buggy: - Channel #0, TC 3, the TXQ index is 96. - Index 8 is not valid, as there is no such TXQ from driver perspective (As it represents channel #8, TC 0, which is not valid with the above configuration). As part of driver's select queue, it calls netdev_pick_tx which returns an index in the range of real number of TXQs. Depends on the return value, with the examples above, driver could have returned index larger than the real number of tx queues, or crash the kernel as it tries to read invalid address of SQ which was not allocated. Fix that by allocating sequential TXQ indices, and hold a new mapping between (channel index, tc) => (real TXQ index). This mapping will be updated as part of priv channels activation, and is used in mlx5e_select_queue to find the selected queue index. The existing indices mapping (channel_tc2txq) is no longer needed, as it is used only for statistics structures and can be calculated on run time. Delete its definintion and updates. Fixes: 8bfaf07f7806 ("net/mlx5e: Present SW stats when state is not opened") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>