aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/mellanox (follow)
AgeCommit message (Collapse)AuthorFilesLines
2020-02-27mlxsw: pci: Wait longer before accessing the device after resetAmit Cohen1-1/+1
During initialization the driver issues a reset to the device and waits for 100ms before checking if the firmware is ready. The waiting is necessary because before that the device is irresponsive and the first read can result in a completion timeout. While 100ms is sufficient for Spectrum-1 and Spectrum-2, it is insufficient for Spectrum-3. Fix this by increasing the timeout to 200ms. Fixes: da382875c616 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC") Signed-off-by: Amit Cohen <amitc@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-27mlx5: register lag notifier for init network namespace onlyJiri Pirko5-14/+6
The current code causes problems when the unregistering netdevice could be different then the registering one. Since the check in mlx5_lag_netdev_event() does not allow any other network namespace anyway, fix this by registerting the lag notifier per init network namespace only. Fixes: d48834f9d4b4 ("mlx5: Use dev_net netdevice notifier registrations") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Tested-by: Aya Levin <ayal@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-18net/mlx5: DR, Handle reformat capability over sw-steering tablesErez Shitrit1-2/+7
On flow table creation, send the relevant flags according to what the FW currently supports. When FW doesn't support reformat option over SW-steering managed table, the driver shouldn't pass this. Fixes: 988fd6b32d07 ("net/mlx5: DR, Pass table flags at creation to lower layer") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5: Fix lowest FDB pool sizePaul Blakey1-1/+1
The pool sizes represent the pool sizes in the fw. when we request a pool size from fw, it will return the next possible group. We track how many pools the fw has left and start requesting groups from the big to the small. When we start request 4k group, which doesn't exists in fw, fw wants to allocate the next possible size, 64k, but will fail since its exhausted. The correct smallest pool size in fw is 128 and not 4k. Fixes: 39ac237ce009 ("net/mlx5: E-Switch, Refactor chains and priorities") Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5e: Don't clear the whole vf config when switching modesDmytro Linkin2-3/+7
There is no need to reset all vf config (except link state) between legacy and switchdev modes changes. Also, set link state to AUTO, when legacy enabled. Fixes: 3b83b6c2e024 ("net/mlx5e: Clear VF config when switching modes") Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5: DR, Fix matching on vport gvmiHamdan Igbaria1-1/+4
Set vport gvmi in the tag, only when source gvmi is set in the bit mask. Fixes: 26d688e3 ("net/mlx5: DR, Add Steering entry (STE) utilities") Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5e: Fix crash in recovery flow without devlink reporterAya Levin1-1/+1
When health reporters are not supported, recovery function is invoked directly, not via devlink health reporters. In this direct flow, the recover function input parameter was passed incorrectly and is causing a kernel oops. This patch is fixing the input parameter. Following call trace is observed on rx error health reporting. Internal error: Oops: 96000007 [#1] PREEMPT SMP Process kworker/u16:4 (pid: 4584, stack limit = 0x00000000c9e45703) Call trace: mlx5e_rx_reporter_err_rq_cqe_recover+0x30/0x164 [mlx5_core] mlx5e_health_report+0x60/0x6c [mlx5_core] mlx5e_reporter_rq_cqe_err+0x6c/0x90 [mlx5_core] mlx5e_rq_err_cqe_work+0x20/0x2c [mlx5_core] process_one_work+0x168/0x3d0 worker_thread+0x58/0x3d0 kthread+0x108/0x134 Fixes: c50de4af1d63 ("net/mlx5e: Generalize tx reporter's functionality") Signed-off-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5e: Reset RQ doorbell counter before moving RQ state from RST to RDYAya Levin4-9/+43
Initialize RQ doorbell counters to zero prior to moving an RQ from RST to RDY state. Per HW spec, when RQ is back to RDY state, the descriptor ID on the completion is reset. The doorbell record must comply. Fixes: 8276ea1353a4 ("net/mlx5e: Report and recover from CQE with error on RQ") Signed-off-by: Aya Levin <ayal@mellanox.com> Reported-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-18net/mlx5: Fix sleep while atomic in mlx5_eswitch_get_vepaHuy Nguyen1-11/+3
rtnl_bridge_getlink is protected by rcu lock, so mlx5_eswitch_get_vepa cannot take mutex lock. Two possible issues can happen: 1. User at the same time change vepa mode via RTM_SETLINK command. 2. User at the same time change the switchdev mode via devlink netlink interface. Case 1 cannot happen because rtnl executes one message in order. Case 2 can happen but we do not expect user to change the switchdev mode when changing vepa. Even if a user does it, so he will read a value which is no longer valid. Fixes: 8da202b24913 ("net/mlx5: E-Switch, Add support for VEPA in legacy mode.") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-07mlxsw: spectrum_dpipe: Add missing error pathIdo Schimmel1-1/+2
In case devlink_dpipe_entry_ctx_prepare() failed, release RTNL that was previously taken and free the memory allocated by mlxsw_sp_erif_entry_prepare(). Fixes: 2ba5999f009d ("mlxsw: spectrum: Add Support for erif table entries access") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-07mlxsw: core: Add validation of hardware device types for MGPIR registerVadim Pasternak2-4/+10
When reading the number of gearboxes from the hardware, the driver does not validate the returned 'device type' field. The driver can therefore wrongly assume that the queried devices are gearboxes. On Spectrum-3 systems that support different types of devices, this can prevent the driver from loading, as it will try to query the temperature sensors from devices which it assumes are gearboxes and in fact are not. For example: [ 218.129230] mlxsw_minimal 2-0048: Reg cmd access status failed (status=7(bad parameter)) [ 218.138282] mlxsw_minimal 2-0048: Reg cmd access failed (reg_id=900a(mtmp),type=write) [ 218.147131] mlxsw_minimal 2-0048: Failed to setup temp sensor number 256 [ 218.534480] mlxsw_minimal 2-0048: Fail to register core bus [ 218.540714] mlxsw_minimal: probe of 2-0048 failed with error -5 Fix this by validating the 'device type' field. Fixes: 2e265a8b6c094 ("mlxsw: core: Extend hwmon interface with inter-connect temperature attributes") Fixes: f14f4e621b1b4 ("mlxsw: core: Extend thermal core with per inter-connect device thermal zones") Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-07mlxsw: spectrum_router: Clear offload indication from IPv6 nexthops on abortIdo Schimmel1-0/+3
Unlike IPv4, in IPv6 there is no unique structure to represent the nexthop and both the route and nexthop information are squashed to the same structure ('struct fib6_info'). In order to improve resource utilization the driver consolidates identical nexthop groups to the same internal representation of a nexthop group. Therefore, when the offload indication of a nexthop changes, the driver needs to iterate over all the linked fib6_info and toggle their offload flag accordingly. During abort, all the routes are removed from the device and unlinked from their nexthop group. The offload indication is cleared just before the group is destroyed, but by that time no fib6_info is linked to the group and the offload indication remains set. Fix this by clearing the offload indication just before dropping the reference from the nexthop. Fixes: ee5a0448e72b ("mlxsw: spectrum_router: Set hardware flags for routes") Reported-by: Alex Kushnarov <alexanderk@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Tested-by: Alex Kushnarov <alexanderk@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-07mlxsw: spectrum_router: Prevent incorrect replacement of local table routesIdo Schimmel1-1/+51
The driver uses the same table to represent both the main and local routing tables. Prevent routes in the main table from replacing routes in the local table to reflect the fact that the local table is consulted first during lookup. Fixes: b6a1d871d37a ("mlxsw: spectrum_router: Start using new IPv4 route notifications") Fixes: dacad7b34b59 ("mlxsw: spectrum_router: Start using new IPv6 route notifications") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-06net/mlx5: Deprecate usage of generic TLS HW capability bitTariq Toukan3-3/+3
Deprecate the generic TLS cap bit, use the new TX-specific TLS cap bit instead. Fixes: a12ff35e0fb7 ("net/mlx5: Introduce TLS TX offload hardware bits and structures") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-06net/mlx5e: TX, Error completion is for last WQE in batchTariq Toukan2-26/+23
For a cyclic work queue, when not requesting a completion per WQE, a single CQE might indicate the completion of several WQEs. However, in case some WQE in the batch causes an error, then an error completion is issued, breaking the batch, and pointing to the offending WQE in the wqe_counter field. Hence, WQE-specific error CQE handling (like printing, breaking, etc...) should be performed only for the last WQE in batch. Fixes: 130c7b46c93d ("net/mlx5e: TX, Dump WQs wqe descriptors on CQE with error events") Fixes: fd9b4be8002c ("net/mlx5e: RX, Support multiple outstanding UMR posts") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-06net/mlx5: IPsec, fix memory leak at mlx5_fpga_ipsec_delete_sa_ctxRaed Salem1-0/+1
SA context is allocated at mlx5_fpga_ipsec_create_sa_ctx, however the counterpart mlx5_fpga_ipsec_delete_sa_ctx function nullifies sa_ctx pointer without freeing the memory allocated, hence the memory leak. Fix by free SA context when the SA is released. Fixes: d6c4f0298cec ("net/mlx5: Refactor accel IPSec code") Signed-off-by: Raed Salem <raeds@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-06net/mlx5: IPsec, Fix esp modify function attributeRaed Salem1-1/+1
The function mlx5_fpga_esp_validate_xfrm_attrs is wrongly used with negative negation as zero value indicates success but it used as failure return value instead. Fix by remove the unary not negation operator. Fixes: 05564d0ae075 ("net/mlx5: Add flow-steering commands for FPGA IPSec implementation") Signed-off-by: Raed Salem <raeds@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-02-06net/mlx5: Fix deadlock in fs_coreMaor Gottlieb1-7/+8
free_match_list could be called when the flow table is already locked. We need to pass this notation to tree_put_node. It fixes the following lockdep warnning: [ 1797.268537] ============================================ [ 1797.276837] WARNING: possible recursive locking detected [ 1797.285101] 5.5.0-rc5+ #10 Not tainted [ 1797.291641] -------------------------------------------- [ 1797.299917] handler10/9296 is trying to acquire lock: [ 1797.307885] ffff889ad399a0a0 (&node->lock){++++}, at: tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.319694] [ 1797.319694] but task is already holding lock: [ 1797.330904] ffff889ad399a0a0 (&node->lock){++++}, at: nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core] [ 1797.344707] [ 1797.344707] other info that might help us debug this: [ 1797.356952] Possible unsafe locking scenario: [ 1797.356952] [ 1797.368333] CPU0 [ 1797.373357] ---- [ 1797.378364] lock(&node->lock); [ 1797.384222] lock(&node->lock); [ 1797.390031] [ 1797.390031] *** DEADLOCK *** [ 1797.390031] [ 1797.403003] May be due to missing lock nesting notation [ 1797.403003] [ 1797.414691] 3 locks held by handler10/9296: [ 1797.421465] #0: ffff889cf2c5a110 (&block->cb_lock){++++}, at: tc_setup_cb_add+0x70/0x250 [ 1797.432810] #1: ffff88a030081490 (&comp->sem){++++}, at: mlx5_devcom_get_peer_data+0x4c/0xb0 [mlx5_core] [ 1797.445829] #2: ffff889ad399a0a0 (&node->lock){++++}, at: nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core] [ 1797.459913] [ 1797.459913] stack backtrace: [ 1797.469436] CPU: 1 PID: 9296 Comm: handler10 Kdump: loaded Not tainted 5.5.0-rc5+ #10 [ 1797.480643] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017 [ 1797.491480] Call Trace: [ 1797.496701] dump_stack+0x96/0xe0 [ 1797.502864] __lock_acquire.cold.63+0xf8/0x212 [ 1797.510301] ? lockdep_hardirqs_on+0x250/0x250 [ 1797.517701] ? mark_held_locks+0x55/0xa0 [ 1797.524547] ? quarantine_put+0xb7/0x160 [ 1797.531422] ? lockdep_hardirqs_on+0x17d/0x250 [ 1797.538913] lock_acquire+0xd6/0x1f0 [ 1797.545529] ? tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.553701] down_write+0x94/0x140 [ 1797.560206] ? tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.568464] ? down_write_killable_nested+0x170/0x170 [ 1797.576925] ? del_hw_flow_group+0xde/0x1f0 [mlx5_core] [ 1797.585629] tree_put_node+0x1d5/0x210 [mlx5_core] [ 1797.593891] ? free_match_list.part.25+0x147/0x170 [mlx5_core] [ 1797.603389] free_match_list.part.25+0xe0/0x170 [mlx5_core] [ 1797.612654] _mlx5_add_flow_rules+0x17e2/0x20b0 [mlx5_core] [ 1797.621838] ? lock_acquire+0xd6/0x1f0 [ 1797.629028] ? esw_get_prio_table+0xb0/0x3e0 [mlx5_core] [ 1797.637981] ? alloc_insert_flow_group+0x420/0x420 [mlx5_core] [ 1797.647459] ? try_to_wake_up+0x4c7/0xc70 [ 1797.654881] ? lock_downgrade+0x350/0x350 [ 1797.662271] ? __mutex_unlock_slowpath+0xb1/0x3f0 [ 1797.670396] ? find_held_lock+0xac/0xd0 [ 1797.677540] ? mlx5_add_flow_rules+0xdc/0x360 [mlx5_core] [ 1797.686467] mlx5_add_flow_rules+0xdc/0x360 [mlx5_core] [ 1797.695134] ? _mlx5_add_flow_rules+0x20b0/0x20b0 [mlx5_core] [ 1797.704270] ? irq_exit+0xa5/0x170 [ 1797.710764] ? retint_kernel+0x10/0x10 [ 1797.717698] ? mlx5_eswitch_set_rule_source_port.isra.9+0x122/0x230 [mlx5_core] [ 1797.728708] mlx5_eswitch_add_offloaded_rule+0x465/0x6d0 [mlx5_core] [ 1797.738713] ? mlx5_eswitch_get_prio_range+0x30/0x30 [mlx5_core] [ 1797.748384] ? mlx5_fc_stats_work+0x670/0x670 [mlx5_core] [ 1797.757400] mlx5e_tc_offload_fdb_rules.isra.27+0x24/0x90 [mlx5_core] [ 1797.767665] mlx5e_tc_add_fdb_flow+0xaf8/0xd40 [mlx5_core] [ 1797.776886] ? mlx5e_encap_put+0xd0/0xd0 [mlx5_core] [ 1797.785562] ? mlx5e_alloc_flow.isra.43+0x18c/0x1c0 [mlx5_core] [ 1797.795353] __mlx5e_add_fdb_flow+0x2e2/0x440 [mlx5_core] [ 1797.804558] ? mlx5e_tc_update_neigh_used_value+0x8c0/0x8c0 [mlx5_core] [ 1797.815093] ? wait_for_completion+0x260/0x260 [ 1797.823272] mlx5e_configure_flower+0xe94/0x1620 [mlx5_core] [ 1797.832792] ? __mlx5e_add_fdb_flow+0x440/0x440 [mlx5_core] [ 1797.842096] ? down_read+0x11a/0x2e0 [ 1797.849090] ? down_write+0x140/0x140 [ 1797.856142] ? mlx5e_rep_indr_setup_block_cb+0xc0/0xc0 [mlx5_core] [ 1797.866027] tc_setup_cb_add+0x11a/0x250 [ 1797.873339] fl_hw_replace_filter+0x25e/0x320 [cls_flower] [ 1797.882385] ? fl_hw_destroy_filter+0x1c0/0x1c0 [cls_flower] [ 1797.891607] fl_change+0x1d54/0x1fb6 [cls_flower] [ 1797.899772] ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0 [cls_flower] [ 1797.910728] ? lock_downgrade+0x350/0x350 [ 1797.918187] ? __radix_tree_lookup+0xa5/0x130 [ 1797.926046] ? fl_set_key+0x1590/0x1590 [cls_flower] [ 1797.934611] ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0 [cls_flower] [ 1797.945673] tc_new_tfilter+0xcd1/0x1240 [ 1797.953138] ? tc_del_tfilter+0xb10/0xb10 [ 1797.960688] ? avc_has_perm_noaudit+0x92/0x320 [ 1797.968721] ? avc_has_perm_noaudit+0x1df/0x320 [ 1797.976816] ? avc_has_extended_perms+0x990/0x990 [ 1797.985090] ? mark_lock+0xaa/0x9e0 [ 1797.991988] ? match_held_lock+0x1b/0x240 [ 1797.999457] ? match_held_lock+0x1b/0x240 [ 1798.006859] ? find_held_lock+0xac/0xd0 [ 1798.014045] ? symbol_put_addr+0x40/0x40 [ 1798.021317] ? rcu_read_lock_sched_held+0xd0/0xd0 [ 1798.029460] ? tc_del_tfilter+0xb10/0xb10 [ 1798.036810] rtnetlink_rcv_msg+0x4d5/0x620 [ 1798.044236] ? rtnl_bridge_getlink+0x460/0x460 [ 1798.052034] ? lockdep_hardirqs_on+0x250/0x250 [ 1798.059837] ? match_held_lock+0x1b/0x240 [ 1798.067146] ? find_held_lock+0xac/0xd0 [ 1798.074246] netlink_rcv_skb+0xc6/0x1f0 [ 1798.081339] ? rtnl_bridge_getlink+0x460/0x460 [ 1798.089104] ? netlink_ack+0x440/0x440 [ 1798.096061] netlink_unicast+0x2d4/0x3b0 [ 1798.103189] ? netlink_attachskb+0x3f0/0x3f0 [ 1798.110724] ? _copy_from_iter_full+0xda/0x370 [ 1798.118415] netlink_sendmsg+0x3ba/0x6a0 [ 1798.125478] ? netlink_unicast+0x3b0/0x3b0 [ 1798.132705] ? netlink_unicast+0x3b0/0x3b0 [ 1798.139880] sock_sendmsg+0x94/0xa0 [ 1798.146332] ____sys_sendmsg+0x36c/0x3f0 [ 1798.153251] ? copy_msghdr_from_user+0x165/0x230 [ 1798.160941] ? kernel_sendmsg+0x30/0x30 [ 1798.167738] ___sys_sendmsg+0xeb/0x150 [ 1798.174411] ? sendmsg_copy_msghdr+0x30/0x30 [ 1798.181649] ? lock_downgrade+0x350/0x350 [ 1798.188559] ? rcu_read_lock_sched_held+0xd0/0xd0 [ 1798.196239] ? __fget+0x21d/0x320 [ 1798.202335] ? do_dup2+0x2a0/0x2a0 [ 1798.208499] ? lock_downgrade+0x350/0x350 [ 1798.215366] ? __fget_light+0xd6/0xf0 [ 1798.221808] ? syscall_trace_enter+0x369/0x5d0 [ 1798.229112] __sys_sendmsg+0xd3/0x160 [ 1798.235511] ? __sys_sendmsg_sock+0x60/0x60 [ 1798.242478] ? syscall_trace_enter+0x233/0x5d0 [ 1798.249721] ? syscall_slow_exit_work+0x280/0x280 [ 1798.257211] ? do_syscall_64+0x1e/0x2e0 [ 1798.263680] do_syscall_64+0x72/0x2e0 [ 1798.269950] entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel") Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Alaa Hleihel <alaa@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-31mlxsw: spectrum_qdisc: Fix 64-bit division error in mlxsw_sp_qdisc_tbf_rate_kbpsNathan Chancellor1-1/+1
When building arm32 allmodconfig: ERROR: "__aeabi_uldivmod" [drivers/net/ethernet/mellanox/mlxsw/mlxsw_spectrum.ko] undefined! rate_bytes_ps has type u64, we need to use a 64-bit division helper to avoid a build error. Fixes: a44f58c41bfb ("mlxsw: spectrum_qdisc: Support offloading of TBF Qdisc") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-01-27mlxsw: minimal: Fix an error handling path in 'mlxsw_m_port_create()'Christophe JAILLET1-1/+1
An 'alloc_etherdev()' called is not ballanced by a corresponding 'free_netdev()' call in one error handling path. Slighly reorder the error handling code to catch the missed case. Fixes: c100e47caa8e ("mlxsw: minimal: Add ethtool support") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-27mlx5: Use dev_net netdevice notifier registrationsJiri Pirko8-10/+28
Register the dev_net notifier and allow the per-net notifier to follow the device into different namespace. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller8-44/+91
Minor conflict in mlx5 because changes happened to code that has moved meanwhile. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25mlxsw: spectrum_qdisc: Support offloading of TBF QdiscPetr Machata3-0/+203
React to the TC messages that were introduced in a preceding patch and configure egress maximum shaper as appropriate. TBF can be used as a root qdisc or under one of PRIO or strict ETS bands. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25mlxsw: spectrum: Configure shaper rate and burst size togetherPetr Machata3-8/+10
In order to allow configuration of burst size together with shaper rate, extend mlxsw_sp_port_ets_maxrate_set() with a burst_size argument. Convert call sites to pass 0 (for default). Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25mlxsw: spectrum: Add lowest_shaper_bs to struct mlxsw_spPetr Machata2-0/+4
Lower limit of burst size configuration is dependent on system type. Add a datum to track the value. Initialize as appropriate in mlxsw_spX_init(). Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25mlxsw: reg: Increase MLXSW_REG_QEEC_MAS_DISPetr Machata1-2/+2
As the port speeds grow, the current value of "unlimited shaper", 200000000Kbps, might become lower than the actually supported speeds. Bump it to the maximum value that fits in the corresponding QEEC field, which is about 2.1Tbps. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25mlxsw: reg: Add max_shaper_bs to QoS ETS Element ConfigurationPetr Machata1-0/+15
The QEEC register configures scheduling elements. One of the bits of configuration is the burst size to use for the shaper installed on the element. Add the necessary fields to support this configuration. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25mlxsw: spectrum_qdisc: Extract a common leaf unoffload functionPetr Machata1-5/+14
When the RED Qdisc is unoffloaded, it needs to reduce the reported backlog by the amount that is in the HW, so that only the SW backlog is contained in the counter. The same thing will need to be done by TBF, and likely any other leaf Qdisc as well. Extract a helper mlxsw_sp_qdisc_leaf_unoffload() and call it from mlxsw_sp_qdisc_red_unoffload(). Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25mlxsw: spectrum_qdisc: Add mlxsw_sp_qdisc_get_class_stats()Petr Machata1-10/+19
Add a wrapper around mlxsw_sp_qdisc_collect_tc_stats() and mlxsw_sp_qdisc_update_stats() for the simple case of doing both in one go: mlxsw_sp_qdisc_get_class_stats(). Dispatch to that function from mlxsw_sp_qdisc_get_red_stats(). This new function will be useful for other leaf Qdiscs as well. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25mlxsw: spectrum_qdisc: Extract a per-TC stat functionPetr Machata1-49/+70
Extract from mlxsw_sp_qdisc_get_prio_stats() two new functions: mlxsw_sp_qdisc_collect_tc_stats() to accumulate stats for that one TC only, and mlxsw_sp_qdisc_update_stats() that makes the stats relative to base values stored earlier. Use them from mlxsw_sp_qdisc_get_red_stats(). Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-24net/mlx5e: kTLS, Do not send decrypted-marked SKBs via non-accel pathTariq Toukan1-4/+10
When TCP out-of-order is identified (unexpected tcp seq mismatch), driver analyzes the packet and decides what handling should it get: 1. go to accelerated path (to be encrypted in HW), 2. go to regular xmit path (send w/o encryption), 3. drop. Packets marked with skb->decrypted by the TLS stack in the TX flow skips SW encryption, and rely on the HW offload. Verify that such packets are never sent un-encrypted on the wire. Add a WARN to catch such bugs, and prefer dropping the packet in these cases. Fixes: 46a3ea98074e ("net/mlx5e: kTLS, Enhance TX resync flow") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24net/mlx5e: kTLS, Remove redundant posts in TX resync flowTariq Toukan1-2/+0
The call to tx_post_resync_params() is done earlier in the flow, the post of the control WQEs is unnecessarily repeated. Remove it. Fixes: 700ec4974240 ("net/mlx5e: kTLS, Fix missing SQ edge fill") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24net/mlx5e: kTLS, Fix corner-case checks in TX resync flowTariq Toukan1-14/+19
There are the following cases: 1. Packet ends before start marker: bypass offload. 2. Packet starts before start marker and ends after it: drop, not supported, breaks contract with kernel. 3. packet ends before tls record info starts: drop, this packet was already acknowledged and its record info was released. Add the above as comment in code. Mind possible wraparounds of the TCP seq, replace the simple comparison with a call to the TCP before() method. In addition, remove logic that handles negative sync_len values, as it became impossible. Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support") Fixes: 46a3ea98074e ("net/mlx5e: kTLS, Enhance TX resync flow") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24net/mlx5e: Clear VF config when switching modesDmytro Linkin2-4/+11
Currently VF in LEGACY mode are not able to go up. Also in OFFLOADS mode, when switching to it first time, VF can go up independently to his representor, which is not expected. Perform clearing of VF config when switching modes and set link state to AUTO as default value. Also, when switching to OFFLOADS mode set link state to DOWN, which allow VF link state to be controlled by its REP. Fixes: 1ab2068a4c66 ("net/mlx5: Implement vports admin state backup/restore") Fixes: 556b9d16d3f5 ("net/mlx5: Clear VF's configuration on disabling SRIOV") Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24net/mlx5: DR, use non preemptible call to get the current cpu numberErez Shitrit1-1/+2
Use raw_smp_processor_id instead of smp_processor_id() otherwise we will get the following trace in debug-kernel: BUG: using smp_processor_id() in preemptible [00000000] code: devlink caller is dr_create_cq.constprop.2+0x31d/0x970 [mlx5_core] Call Trace: dump_stack+0x9a/0xf0 debug_smp_processor_id+0x1f3/0x200 dr_create_cq.constprop.2+0x31d/0x970 genl_family_rcv_msg+0x5fd/0x1170 genl_rcv_msg+0xb8/0x160 netlink_rcv_skb+0x11e/0x340 Fixes: 297cccebdc5a ("net/mlx5: DR, Expose an internal API to issue RDMA operations") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24net/mlx5: E-Switch, Prevent ingress rate configuration of uplink repEli Cohen1-2/+7
Since the implementation relies on limiting the VF transmit rate to simulate ingress rate limiting, and since either uplink representor or ecpf are not associated with a VF, we limit the rate limit configuration for those ports. Fixes: fcb64c0f5640 ("net/mlx5: E-Switch, add ingress rate support") Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24net/mlx5: DR, Enable counter on non-fwd-dest objectsErez Shitrit1-13/+29
The current code handles only counters that attached to dest, we still have the cases where we have counter on non-dest, like over drop etc. Fixes: 6a48faeeca10 ("net/mlx5: Add direct rule fs_cmd implementation") Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24net/mlx5: Update the list of the PCI supported devicesMeir Lichtinger1-0/+1
Add the upcoming ConnectX-7 device ID. Fixes: 85327a9c4150 ("net/mlx5: Update the list of the PCI supported devices") Signed-off-by: Meir Lichtinger <meirl@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-24net/mlx5: Fix lowest FDB pool sizePaul Blakey1-1/+1
The pool sizes represent the pool sizes in the fw. when we request a pool size from fw, it will return the next possible group. We track how many pools the fw has left and start requesting groups from the big to the small. When we start request 4k group, which doesn't exists in fw, fw wants to allocate the next possible size, 64k, but will fail since its exhausted. The correct smallest pool size in fw is 128 and not 4k. Fixes: e52c28024008 ("net/mlx5: E-Switch, Add chains and priorities") Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-23mlxsw: spectrum_acl: Fix use-after-free during reloadIdo Schimmel1-4/+12
During reload (or module unload), the router block is de-initialized. Among other things, this results in the removal of a default multicast route from each active virtual router (VRF). These default routes are configured during initialization to trap packets to the CPU. In Spectrum-2, unlike Spectrum-1, multicast routes are implemented using ACL rules. Since the router block is de-initialized before the ACL block, it is possible that the ACL rules corresponding to the default routes are deleted while being accessed by the ACL delayed work that queries rules' activity from the device. This can result in a rare use-after-free [1]. Fix this by protecting the rules list accessed by the delayed work with a lock. We cannot use a spinlock as the activity read operation is blocking. [1] [ 123.331662] ================================================================== [ 123.339920] BUG: KASAN: use-after-free in mlxsw_sp_acl_rule_activity_update_work+0x330/0x3b0 [ 123.349381] Read of size 8 at addr ffff8881f3bb4520 by task kworker/0:2/78 [ 123.357080] [ 123.358773] CPU: 0 PID: 78 Comm: kworker/0:2 Not tainted 5.5.0-rc5-custom-33108-gf5df95d3ef41 #2209 [ 123.368898] Hardware name: Mellanox Technologies Ltd. MSN3700C/VMOD0008, BIOS 5.11 10/10/2018 [ 123.378456] Workqueue: mlxsw_core mlxsw_sp_acl_rule_activity_update_work [ 123.385970] Call Trace: [ 123.388734] dump_stack+0xc6/0x11e [ 123.392568] print_address_description.constprop.4+0x21/0x340 [ 123.403236] __kasan_report.cold.8+0x76/0xb1 [ 123.414884] kasan_report+0xe/0x20 [ 123.418716] mlxsw_sp_acl_rule_activity_update_work+0x330/0x3b0 [ 123.444034] process_one_work+0xb06/0x19a0 [ 123.453731] worker_thread+0x91/0xe90 [ 123.467348] kthread+0x348/0x410 [ 123.476847] ret_from_fork+0x24/0x30 [ 123.480863] [ 123.482545] Allocated by task 73: [ 123.486273] save_stack+0x19/0x80 [ 123.490000] __kasan_kmalloc.constprop.6+0xc1/0xd0 [ 123.495379] mlxsw_sp_acl_rule_create+0xa7/0x230 [ 123.500566] mlxsw_sp2_mr_tcam_route_create+0xf6/0x3e0 [ 123.506334] mlxsw_sp_mr_tcam_route_create+0x5b4/0x820 [ 123.512102] mlxsw_sp_mr_table_create+0x3b5/0x690 [ 123.517389] mlxsw_sp_vr_get+0x289/0x4d0 [ 123.521797] mlxsw_sp_fib_node_get+0xa2/0x990 [ 123.526692] mlxsw_sp_router_fib4_event_work+0x54c/0x2d60 [ 123.532752] process_one_work+0xb06/0x19a0 [ 123.537352] worker_thread+0x91/0xe90 [ 123.541471] kthread+0x348/0x410 [ 123.545103] ret_from_fork+0x24/0x30 [ 123.549113] [ 123.550795] Freed by task 518: [ 123.554231] save_stack+0x19/0x80 [ 123.557958] __kasan_slab_free+0x125/0x170 [ 123.562556] kfree+0xd7/0x3a0 [ 123.565895] mlxsw_sp_acl_rule_destroy+0x63/0xd0 [ 123.571081] mlxsw_sp2_mr_tcam_route_destroy+0xd5/0x130 [ 123.576946] mlxsw_sp_mr_tcam_route_destroy+0xba/0x260 [ 123.582714] mlxsw_sp_mr_table_destroy+0x1ab/0x290 [ 123.588091] mlxsw_sp_vr_put+0x1db/0x350 [ 123.592496] mlxsw_sp_fib_node_put+0x298/0x4c0 [ 123.597486] mlxsw_sp_vr_fib_flush+0x15b/0x360 [ 123.602476] mlxsw_sp_router_fib_flush+0xba/0x470 [ 123.607756] mlxsw_sp_vrs_fini+0xaa/0x120 [ 123.612260] mlxsw_sp_router_fini+0x137/0x384 [ 123.617152] mlxsw_sp_fini+0x30a/0x4a0 [ 123.621374] mlxsw_core_bus_device_unregister+0x159/0x600 [ 123.627435] mlxsw_devlink_core_bus_device_reload_down+0x7e/0xb0 [ 123.634176] devlink_reload+0xb4/0x380 [ 123.638391] devlink_nl_cmd_reload+0x610/0x700 [ 123.643382] genl_rcv_msg+0x6a8/0xdc0 [ 123.647497] netlink_rcv_skb+0x134/0x3a0 [ 123.651904] genl_rcv+0x29/0x40 [ 123.655436] netlink_unicast+0x4d4/0x700 [ 123.659843] netlink_sendmsg+0x7c0/0xc70 [ 123.664251] __sys_sendto+0x265/0x3c0 [ 123.668367] __x64_sys_sendto+0xe2/0x1b0 [ 123.672773] do_syscall_64+0xa0/0x530 [ 123.676892] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 123.682552] [ 123.684238] The buggy address belongs to the object at ffff8881f3bb4500 [ 123.684238] which belongs to the cache kmalloc-128 of size 128 [ 123.698261] The buggy address is located 32 bytes inside of [ 123.698261] 128-byte region [ffff8881f3bb4500, ffff8881f3bb4580) [ 123.711303] The buggy address belongs to the page: [ 123.716682] page:ffffea0007ceed00 refcount:1 mapcount:0 mapping:ffff888236403500 index:0x0 [ 123.725958] raw: 0200000000000200 dead000000000100 dead000000000122 ffff888236403500 [ 123.734646] raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000 [ 123.743315] page dumped because: kasan: bad access detected [ 123.749562] [ 123.751241] Memory state around the buggy address: [ 123.756620] ffff8881f3bb4400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 123.764716] ffff8881f3bb4480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 123.772812] >ffff8881f3bb4500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 123.780904] ^ [ 123.785697] ffff8881f3bb4580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 123.793793] ffff8881f3bb4600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 123.801883] ================================================================== Fixes: cf7221a4f5a5 ("mlxsw: spectrum_router: Add Multicast routing support for Spectrum-2") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-22net/mlx5e: Enable all available stats for uplink repsVlad Buslov3-33/+18
Extend stats group array of uplink representor with all stats that are available for PF in legacy mode, besides ipsec and TLS which are not supported. Don't output vport stats for uplink representor because they are already handled by 802_3 group (with different names: {tx|rx}_{bytes|packets}_phy). Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5e: Create q counters on uplink representorsVlad Buslov1-2/+19
Q counters were disabled for all types of representors to prevent an issue where there is not enough resources to init q counters for 127 representor instances. Enable q counters only for uplink representors to support "rx_out_of_buffer", "rx_if_down_packets" counters in ethtool. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5e: Convert rep stats to mlx5e_stats_grp-based infraVlad Buslov1-43/+114
In order to support all of the supported stats that are available in legacy mode for switchdev uplink representors, convert rep stats infrastructure to reuse struct mlx5e_stats_grp that is already used when device is in legacy mode. Refactor rep code to use array of mlx5e_stats_grp structures (constructed using macros provided by stats infra) to fill/update stats, instead of fixed hardcoded set of values. This approach allows to easily extend representors with new stats types. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5e: IPoIB, use separate stats groupsSaeed Mahameed3-15/+51
Don't copy all of the stats groups used for mlx5e ethernet NIC profile, have a separate stats groups for IPoIB with the set of the needed stats only. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5e: Convert stats groups array to array of group pointersSaeed Mahameed4-31/+56
Convert stats groups array to array of "stats group" pointers to allow sharing and individual selection of groups per profile as illustrated in the next patches. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2020-01-22net/mlx5e: Declare stats groups via macroSaeed Mahameed3-185/+108
Introduce new macros to declare stats callbacks and groups, for better code reuse and for individual groups selection per profile which will be introduced in next patches. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2020-01-22net/mlx5e: Profile specific stats groupsSaeed Mahameed6-48/+93
Attach stats groups array to the profiles and make the stats utility functions (get_num, update, fill, fill_strings) generic and use the profile->stats_grps rather the hardcoded NIC stats groups. This will allow future extension to have per profile stats groups. In this patch mlx5e NIC and IPoIB will still share the same stats groups. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2020-01-22net/mlx5e: Move uplink rep init/cleanup code into own functionsRoi Dayan1-30/+51
Clean up the code and allows to call uplink rep init/cleanup from different location later. To be used later for a new uplink representor mode. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5: DR, Allow connecting flow table to a lower/same level tableYevgeny Kliteynik1-2/+5
Allow connecting SW steering source table to a lower/same level destination table. Lifting this limitation is required to support Connection Tracking. Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-22net/mlx5: DR, Modify header copy supportHamdan Igbaria2-15/+151
Modify header supports ADD/SET and from this patch also COPY. Copy allows to copy header fields and metadata. Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com> Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>