linux-dev - Linux kernel development work

Age	Commit message	Author	Files	Lines
2019-03-01	tc-testing: Allow test cases to be skipped	Lucas Bates	4	-13/+27
	By adding a check for an optional key/value pair to the test case data, individual test cases may be skipped to prevent tdc from aborting a test run due to setup or teardown failure. If a test case is skipped, it will still appear in the results output to allow for a consistent number of executed tests in each run. However, the test will be marked as skipped. This support for skipping extends to any plugins that may generate additional results for each executed test. Signed-off-by: Lucas Bates <lucasb@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	geneve: correctly handle ipv6.disable module parameter	Jiri Benc	1	-3/+8
	When IPv6 is compiled but disabled at runtime, geneve_sock_add returns -EAFNOSUPPORT. For metadata based tunnels, this causes failure of the whole operation of bringing up the tunnel. Ignore failure of IPv6 socket creation for metadata based tunnels caused by IPv6 not being available. This is the same fix as what commit d074bf960044 ("vxlan: correctly handle ipv6.disable module parameter") is doing for vxlan. Note there's also commit c0a47e44c098 ("geneve: should not call rt6_lookup() when ipv6 was disabled") which fixes a similar issue but for regular tunnels, while this patch is needed for metadata based tunnels. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
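The error handling described above can be sketched in user-space C. This is a simplified illustration, not the driver code: `open_sockets_sketch()` and its parameters are hypothetical, and the two `*_err` arguments stand in for the results of the IPv6 and IPv4 socket creation calls.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model of bringing up a tunnel that wants an IPv6 socket
 * and an IPv4 socket. ipv6_err and ipv4_err are the (assumed) results of
 * the two socket creation attempts: 0 on success, negative errno on failure.
 */
static int open_sockets_sketch(bool metadata, int ipv6_err, int ipv4_err)
{
	/*
	 * For metadata based tunnels, "IPv6 not available" is tolerated.
	 * Any other IPv6 failure, or any failure on a regular tunnel,
	 * is still reported to the caller.
	 */
	if (ipv6_err < 0 && !(metadata && ipv6_err == -EAFNOSUPPORT))
		return ipv6_err;

	return ipv4_err;
}
```

With this shape, a metadata based tunnel still comes up over IPv4 when IPv6 is disabled at runtime, while genuine errors such as -ENOMEM are not masked.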
2019-03-01	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	David S. Miller	3	-5/+6
	Alexei Starovoitov says: ==================== pull-request: bpf 2019-03-01 The following pull-request contains BPF updates for your net tree. The main changes are: 1) fix sanitation rewrite, from Daniel. 2) fix error path on map_new_fd, from Peng. 3) fix icache flush address, from Paul. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	Merge branch 'mlxsw-rehash-split'	David S. Miller	1	-125/+286
	Ido Schimmel says: ==================== mlxsw: spectrum_acl: Split rehash work into chunks Jiri says: When rehash happens on a vregion with many rules and they are being migrated, it might take significant time to finish the job. During that time vregion->lock is taken which prevents rules from being added/deleted from the vregion. Aim of this patchset is to allow to interrupt migration of rules during rehash, reschedule and give chance for rules to be added/deleted. Then continue migration in another execution of scheduled work. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	mlxsw: spectrum_acl: Make mlxsw_sp_acl_tcam_vregion_rehash() return void	Jiri Pirko	1	-8/+4
	The return value is ignored anyway, so just return void. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	mlxsw: spectrum_acl: Remember where to continue rehash migration	Jiri Pirko	1	-5/+86
	Store a pointer to the vchunk where the migration was interrupted, as well as the ventry pointers to start from and to stop at (during rollback). These saved pointers need to be forgotten when the ventries list or vchunk list changes, which is done by a couple of "changed" helpers. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	mlxsw: spectrum_acl: Allow to interrupt/continue rehash work	Jiri Pirko	1	-20/+62
	Currently, migration of vregions with many entries may take a long time, during which insertions and removals of rules are blocked waiting to acquire vregion->lock. To overcome this, allow the rehash work to be interrupted and continued according to the set credits (the number of rules to migrate). Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
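The credit-based interruption can be illustrated with a small sketch. It assumes a flat array of entries and a saved index instead of the driver's vchunk and ventry lists; `struct rehash_sketch` and `migrate_some_sketch()` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rehash context: remembers where migration stopped. */
struct rehash_sketch {
	size_t next;	/* index of the first entry not yet migrated */
	size_t total;	/* number of entries to migrate */
};

/*
 * Migrate entries until either all are done or the credits run out.
 * Returns true when migration is complete. On false, the caller is
 * expected to drop the lock, reschedule the work, and call again with
 * fresh credits; ctx->next tells the next call where to continue.
 */
static bool migrate_some_sketch(struct rehash_sketch *ctx, int *credits)
{
	while (ctx->next < ctx->total) {
		if (*credits <= 0)
			return false;
		ctx->next++;	/* stand-in for migrating one rule */
		(*credits)--;
	}
	return true;
}
```

Between two calls the lock is free, so rules can be added or deleted; that is also why the saved position has to be invalidated when the underlying lists change.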
2019-03-01	mlxsw: spectrum_acl: Do rollback as another call to mlxsw_sp_acl_tcam_vchunk_migrate_all()	Jiri Pirko	1	-46/+29
	In order to simplify the code and to prepare it for interrupted/continued migration process, do the rollback in case of migration error as another call to mlxsw_sp_acl_tcam_vchunk_migrate_all(). It can be understood as "migrate all back". Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	mlxsw: spectrum_acl: Put vchunk migrate start/end code into separate functions	Jiri Pirko	1	-11/+32
	In preparation for interrupt/continue of rehash work, put the code that is done at the beginning/end of the vchunk migrate function into separate functions. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	mlxsw: spectrum_acl: Put this_is_rollback to rehash context struct	Jiri Pirko	1	-7/+12
	Put the this_is_rollback flag into the rehash context struct in preparation for interrupt/continue of rehash work. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	mlxsw: spectrum_acl: Rename variables in mlxsw_sp_acl_tcam_ventry_migrate()	Jiri Pirko	1	-7/+7
	Rename some of the variables in function mlxsw_sp_acl_tcam_ventry_migrate() so the names are aligned with the rest of the code. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	mlxsw: spectrum_acl: assign vchunk->chunk by the newly created chunk	Jiri Pirko	1	-8/+10
	Make the vchunk->chunk contain pointer of a new chunk we migrate to. In case of a rollback, it contains the original chunk. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	mlxsw: spectrum_acl: assign vregion->region by the newly created region	Jiri Pirko	1	-22/+20
	Make the vregion->region contain pointer of a new region we migrate to. In case of a rollback, it contains the original region. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	mlxsw: spectrum_acl: Push code start/end from mlxsw_sp_acl_tcam_vregion_migrate()	Jiri Pirko	1	-39/+35
	Push code from the beginning and end of function mlxsw_sp_acl_tcam_vregion_migrate() into rehash_start()/end() functions. Then all the things needed to be done before and after the actual migration process will be grouped together. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	mlxsw: spectrum_acl: Push rehash start/end code into separate functions	Jiri Pirko	1	-9/+32
	In preparation for interrupt/continue of rehash work, put the code at the beginning/end of the rehash function into separate functions. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	mlxsw: spectrum_acl: Introduce new rehash context struct and save hint_priv there	Jiri Pirko	1	-4/+12
	Prepare for continued migration. Introduce a new structure to track rehash context and save hint_priv into it. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	mlxsw: spectrum_acl: Don't migrate already migrated entry	Jiri Pirko	1	-0/+4
	Check if the entry is already in a chunk where we want it to be. In that case, skip migration. This is preparation for "per parts" migration where this situation may occur. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	mlxsw: spectrum_acl: Push rehash dw struct into rehash sub-struct	Jiri Pirko	1	-7/+9
	More rehash related fields are going to come. Push "dw" into sub-struct that will accommodate the others as well. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	net: dsa: mv88e6xxx: prevent interrupt storm caused by mv88e6390x_port_set_cmode	Heiner Kallweit	3	-0/+15
	When debugging another issue I faced an interrupt storm in this driver (88E6390, port 9 in SGMII mode), consisting of alternating link-up / link-down interrupts. Analysis showed that the driver wanted to set a cmode that was set already. But so far mv88e6390x_port_set_cmode() doesn't check this and powers down SERDES, which causes the link to break, and eventually results in the described interrupt storm. Fix this by checking whether the cmode actually changes. We want the very first call to mv88e6390x_port_set_cmode() to always configure the registers, therefore initialize port.cmode with a value that is different from any supported cmode value. We have to take care that we only init the ports' cmode once chip->info->num_ports is set. v2: - add small helper and init the number of actual ports only Fixes: 364e9d7776a3 ("net: dsa: mv88e6xxx: Power on/off SERDES on cmode change") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
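A minimal sketch of the "skip if unchanged" pattern with an invalid initial value follows. The struct, the sentinel value and the function names are hypothetical; the counter stands in for the SERDES power-down and register writes.

```c
#include <stdint.h>

/* Hypothetical sentinel: assumed not to match any supported cmode value. */
#define CMODE_INVALID_SKETCH 0xff

struct port_sketch {
	uint8_t cmode;			/* last cmode written to the hardware */
	unsigned int reconfig_count;	/* how often the hardware was touched */
};

/* Initialise with the sentinel so the first set call always configures. */
static void port_init_sketch(struct port_sketch *p)
{
	p->cmode = CMODE_INVALID_SKETCH;
	p->reconfig_count = 0;
}

static int port_set_cmode_sketch(struct port_sketch *p, uint8_t cmode)
{
	/* Nothing to do if the requested mode is already active. */
	if (cmode == p->cmode)
		return 0;

	p->reconfig_count++;	/* stand-in for SERDES power cycle + register writes */
	p->cmode = cmode;
	return 0;
}
```

Repeated requests for the same mode then leave the link untouched, which is what breaks the link-up / link-down cycle.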
2019-03-01	switchdev: Remove unused transaction item queue	Florian Fainelli	3	-129/+2
	There are no more in tree users of the switchdev_trans_item_{dequeue,enqueue} or switchdev_trans_item structure in the kernel since commit 00fc0c51e35b ("rocker: Change world_ops API and implementation to be switchdev independant"). Remove this unused code and update the documentation accordingly. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	bpf: fix sanitation rewrite in case of non-pointers	Daniel Borkmann	1	-1/+2
	Marek reported that he saw an issue with the below snippet in that timing measurements were off when loaded as unpriv while results were reasonable when loaded as privileged: [...] uint64_t a = bpf_ktime_get_ns(); uint64_t b = bpf_ktime_get_ns(); uint64_t delta = b - a; if ((int64_t)delta > 0) { [...] Turns out there is a bug where a corner case is missing in the fix d3bd7413e0ca ("bpf: fix sanitation of alu op with pointer / scalar type from different paths"), namely fixup_bpf_calls() only checks whether aux has a non-zero alu_state, but it also needs to test for the case of BPF_ALU_NON_POINTER since in both occasions we need to skip the masking rewrite (as there is nothing to mask). Fixes: d3bd7413e0ca ("bpf: fix sanitation of alu op with pointer / scalar type from different paths") Reported-by: Marek Majkowski <marek@cloudflare.com> Reported-by: Arthur Fabre <afabre@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/netdev/CAJPywTJqP34cK20iLM5YmUMz9KXQOdu1-+BZrGMAGgLuBWz7fg@mail.gmail.com/T/ Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
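The corrected condition can be stated as a small predicate. This is only a sketch of the logic described in the message: the function name is hypothetical and the constant's value is an assumption made so the example compiles on its own.

```c
#include <stdbool.h>

/* Assumed stand-in for BPF_ALU_NON_POINTER; the value is illustrative. */
#define ALU_NON_POINTER_SKETCH 0x10u

/*
 * The masking rewrite is needed only when some ALU state was recorded
 * and that state is not "non-pointer". Before the fix, only the first
 * half of this condition was checked.
 */
static bool needs_masking_rewrite_sketch(unsigned int alu_state)
{
	return alu_state != 0 && alu_state != ALU_NON_POINTER_SKETCH;
}
```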
2019-03-01	Merge branch 'doc-net-ieee802154-move-from-plain-text-to-rst'	David S. Miller	2	-95/+99
	Stefan Schmidt says: ==================== doc: net: ieee802154: move from plain text to rst The ieee802154 subsystem doc was still in plain text. With the networking book taking shape I thought it was time to do the first step and move it over to rst. This really is only the minimal conversion. I need to take some time to update and extend the docs. The patches are based on net-next, but they only touch the networking book so I would not expect any trouble. From what I have seen they would go through Jonathan's tree after being acked by Dave? If you want these patches against a different tree let me know. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	doc: net: ieee802154: remove old plain text docs after switching to rst	Stefan Schmidt	1	-177/+0
	The plain text docs are converted to rst now, which allows us to remove the old text file from the tree. Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	doc: net: ieee802154: introduce IEEE 802.15.4 subsystem doc in rst style	Stefan Schmidt	2	-0/+181
	Moving the ieee802154 docs from a plain text file into the new rst style. This commit only does the minimal needed change to bring the documentation over. Follow up patches will improve and extend on this. Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	devlink: fix kdoc	Jakub Kicinski	1	-7/+5
	devlink suffers from a few kdoc warnings: net/core/devlink.c:5292: warning: Function parameter or member 'dev' not described in 'devlink_register' net/core/devlink.c:5351: warning: Function parameter or member 'port_index' not described in 'devlink_port_register' net/core/devlink.c:5753: warning: Function parameter or member 'parent_resource_id' not described in 'devlink_resource_register' net/core/devlink.c:5753: warning: Function parameter or member 'size_params' not described in 'devlink_resource_register' net/core/devlink.c:5753: warning: Excess function parameter 'top_hierarchy' description in 'devlink_resource_register' net/core/devlink.c:5753: warning: Excess function parameter 'reload_required' description in 'devlink_resource_register' net/core/devlink.c:5753: warning: Excess function parameter 'parent_reosurce_id' description in 'devlink_resource_register' net/core/devlink.c:6451: warning: Function parameter or member 'region' not described in 'devlink_region_snapshot_create' net/core/devlink.c:6451: warning: Excess function parameter 'devlink_region' description in 'devlink_region_snapshot_create' Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	Merge branch 'net-aquantia-minor-bug-fixes-after-static-analysis'	David S. Miller	11	-83/+197
	Igor Russkikh says: ==================== net: aquantia: minor bug fixes after static analysis This patchset fixes minor errors and warnings found by smatch and kasan. Extra patch is to replace AQ_HW_WAIT_FOR with readx_poll_timeout to improve readability. V2: use readx_poll resubmitted to net-next since the changeset became quite big. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	net: aquantia: use better wrappers for state registers	Nikita Danilov	2	-5/+5
	Replace some direct register reads with better inline functions. Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	net: aquantia: replace AQ_HW_WAIT_FOR with readx_poll_timeout_atomic	Nikita Danilov	8	-72/+184
	David noticed the original define was hiding the 'err' variable reference. That's confusing and counterintuitive. Andrew noted the whole macro could be replaced with the standard readx_poll kernel macro. This makes the code more readable. Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	net: aquantia: fixed instack structure overflow	Igor Russkikh	2	-4/+4
	This is a real stack undercorruption found by kasan build. The issue did no harm normally because it only overflowed 2 bytes after `bitary` array which on most architectures were mapped into `err` local. Fixes: bab6de8fd180 ("net: ethernet: aquantia: Atlantic A0 and B0 specific functions.") Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	net: aquantia: fixed buffer overflow	Nikita Danilov	1	-0/+2
	The overflow is detected by smatch: drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c: 175 aq_pci_func_free_irqs() error: buffer overflow 'self->aq_vec' 8 <= 31 In reality msix_entry_mask always restricts number of iterations. Adding extra condition to make logic clear and smatch happy. Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	net: aquantia: added newline at end of file	Nikita Danilov	1	-1/+1
	drivers/net/ethernet/aquantia/atlantic/aq_nic.c: 991:1: warning: no newline at end of file Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-01	net: aquantia: fixed memcpy size	Nikita Danilov	1	-1/+1
	Not careful array dereference caused analysis tools to think there could be memory overflow. There was actually no corruption because the array is two dimensional. drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c: 140 aq_ethtool_get_strings() error: memcpy() '*aq_ethtool_stat_names' too small (32 vs 704) Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
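The "32 vs 704" in the report is the difference between the size of one row and the size of the whole two-dimensional array. The sketch below assumes a table of 22 names of 32 bytes each (22 * 32 = 704); the array contents and all names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define STAT_NAME_LEN_SKETCH 32	/* assumed row length */
#define STAT_COUNT_SKETCH 22	/* assumed row count: 22 * 32 = 704 */

static const char stat_names_sketch[STAT_COUNT_SKETCH][STAT_NAME_LEN_SKETCH] = {
	"InPackets", "OutPackets",
};

/* Size of a single row: what a checker sees if the source is *array. */
static size_t stat_row_size_sketch(void)
{
	return sizeof(*stat_names_sketch);
}

/*
 * Copy the whole table. Passing the array itself as the source keeps the
 * source object and the size argument in agreement. dst must have room
 * for sizeof(stat_names_sketch) bytes.
 */
static size_t copy_stat_names_sketch(char *dst)
{
	memcpy(dst, stat_names_sketch, sizeof(stat_names_sketch));
	return sizeof(stat_names_sketch);
}
```

Both spellings of the source point at the same address, which is why nothing was actually corrupted; only the type seen by the analysis tool differs.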
2019-03-01	ipv4: Add ICMPv6 support when parse route ipproto	Hangbin Liu	4	-7/+17
	For ip rules, we need to use 'ipproto ipv6-icmp' to match ICMPv6 headers. But for ip -6 route, currently we only support tcp, udp and icmp. Add ICMPv6 support so we can match ipv6-icmp rules for route lookup. v2: As David Ahern and Sabrina Dubroca suggested, Add an argument to rtm_getroute_parse_ip_proto() to handle ICMP/ICMPv6 with different family. Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: eacb9384a3fe ("ipv6: support sport, dport and ip_proto in RTM_GETROUTE") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-02	MIPS: eBPF: Fix icache flush end address	Paul Burton	1	-1/+1
	The MIPS eBPF JIT calls flush_icache_range() in order to ensure the icache observes the code that we just wrote. Unfortunately it gets the end address calculation wrong due to some bad pointer arithmetic. The struct jit_ctx target field is of type pointer to u32, and as such adding one to it will increment the address being pointed to by 4 bytes. Therefore in order to find the address of the end of the code we simply need to add the number of 4 byte instructions emitted, but we mistakenly add the number of instructions multiplied by 4. This results in the call to flush_icache_range() operating on a memory region 4x larger than intended, which is always wasteful and can cause crashes if we overrun into an unmapped page. Fix this by correcting the pointer arithmetic to remove the bogus multiplication, and use braces to remove the need for a set of brackets whilst also making it obvious that the target field is a pointer. Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: b6bd53f9c4e8 ("MIPS: Add missing file for eBPF JIT.") Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: netdev@vger.kernel.org Cc: bpf@vger.kernel.org Cc: linux-mips@vger.kernel.org Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
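The pointer arithmetic at issue can be reproduced in plain C. The struct below is a hypothetical stand-in with just two fields, not the kernel's struct jit_ctx, and the buggy variant is only safe to call when the buffer really holds at least 4 * idx elements.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the JIT context. */
struct jit_ctx_sketch {
	uint32_t *target;	/* emitted code, one u32 per instruction */
	size_t idx;		/* number of 4-byte instructions emitted */
};

/*
 * Buggy end address: adding to a u32 pointer already scales by 4 bytes,
 * so multiplying the instruction count by 4 overshoots by a factor of 4.
 */
static const void *code_end_buggy_sketch(const struct jit_ctx_sketch *ctx)
{
	return ctx->target + ctx->idx * 4;
}

/* Fixed end address: index by the number of instructions only. */
static const void *code_end_fixed_sketch(const struct jit_ctx_sketch *ctx)
{
	return &ctx->target[ctx->idx];
}

/* Number of bytes between start and end. */
static size_t span_bytes_sketch(const void *start, const void *end)
{
	return (size_t)((const char *)end - (const char *)start);
}
```

For 8 emitted instructions the fixed version covers 32 bytes, while the buggy one covers 128 bytes, matching the "4x larger than intended" region in the message.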
2019-03-01	net/mlx5: Update the list of the PCI supported devices	Eran Ben Elisha	1	-0/+2
	Add the upcoming ConnectX-6 Dx. In addition, add "ConnectX Family mlx5Gen Virtual Function" device ID. Every new HCA VF will be identified with this device ID. Different VF models will be distinguished by their revision id. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-01	net/mlx5e: Set peer flow needed also for multipath	Roi Dayan	1	-2/+9
	Update the predicate that determines if to duplicate rules installed on vport reps to account also for the multipath case. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-01	net/mlx5e: Update check for merged eswitch device	Roi Dayan	1	-4/+3
	The current check only validates if both netdevs use the same ops which means both are vf reps or both uplink reps. Unlike the case where the two uplinks are bonded (VF LAG), under multipath scheme the switchdev parent id is not unified between the uplink reps (and all the associated vf reps). However, we still want to duplicate in the driver encap flows, adjust the merged eswitch check for that matter. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-01	net/mlx5e: Use hint to resolve route when in HW multipath mode	Roi Dayan	1	-0/+12
	As part of creating the tunnel headers while offloading TC encap rules, we resolve the route and neighbour in order to get the source / destination mac. Since the way we offload multipath route is by having two HW rules, one per uplink port, doing naive route lookup might get us a "wrong" routing path which goes through the peer uplink and this will get us eventually to create a wrong L2 header for the tunnel. To avoid that, we use a device hint to get the correct route. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-01	net/mlx5e: Always query offloaded tc peer rule counter	Roi Dayan	1	-11/+15
	Under multipath when encap rules are duplicated to HW in the driver, it's possible for one flow to be currently un-offloaded (e.g. lack of next-hop route or neigh entry) while the other flow is offloaded. As such, we move to query the counters of both flows at all times. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-01	net/mlx5e: Re-attempt to offload flows on multipath port affinity events	Roi Dayan	4	-12/+71
	Under multipath it's possible for us to offload the flow only through the e-switch for which proper route through the uplink exists. When the port is up and the next-hop route is set again we want to offload through it as well. We generate SW event from the FIB event handler when multipath port affinity changes. The tc offloads code gets this event, goes over the flows which were marked as of having missing route and attempts to offload them. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-01	net/mlx5: Emit port affinity event for multipath offloads	Roi Dayan	2	-0/+12
	Under multipath offload scheme, as part of handling fib events, emit mlx5 port affinity event on the enabled ports which will be handled by the tc offloads code. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-01	net/mlx5e: Allow one failure when offloading tc encap rules under multipath	Roi Dayan	1	-2/+12
	In a similar manner to uplink/VF LAG, under multipath we add encap peer rule on the second port as well. However, unlike the LAG case, we do want to allow failure for adding one of the rules. This happens due to using a routing hint while doing the route lookup when one path (next hop device) is down. Introduce a new flag to indicate that route lookup failed for encap flow. Note that a flow may still not be offloaded to hw due to missing neighbour, in that case, the neigh update event will take care of it. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-01	net/mlx5e: Don't inherit flow flags on peer flow creation	Roi Dayan	1	-3/+4
	Currently the peer flow inherits the flags from the original flow after we've set it. At this time the flags are set according to the flow state, e.g marked as going to slow path and such. Even if not getting us to real bugs now, this opens the door to get us to troubles later. Future proof the code and avoid the inheritance, use the peer flags as were set on input when we started adding the original flow. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-01	net/mlx5e: Activate HW multipath and handle port affinity based on FIB events	Roi Dayan	6	-0/+326
	To support multipath offload we are going to track SW multipath route and related nexthops. To do that we register to FIB notifier and handle the route and next-hops events and reflect that as port affinity to HW. When there is a new multipath route entry whose next-hops are all ports of an HCA we will activate LAG in HW. Egress wise, we use HW LAG as the means to emulate multipath on current HW which doesn't support port selection based on xmit hash. In the presence of multiple VFs which use multiple SQs (send queues) this yields fairly good distribution. HA wise, HW LAG buys us the ability for a given RQ (receive queue) to receive traffic from both ports and for SQs to migrate xmitting over the active port if their base port fails. When the route entry is being updated to single path we will update the HW port affinity to use that port only. If a next-hop becomes dead we update the HW port affinity to the living port. When all next-hops are alive again we reset the affinity to default. Due to FW/HW limitations, when a route is deleted we are not disabling the HW LAG since doing so will not allow us to enable it again while VFs are bound. Typically this is just a temporary state when a routing daemon removes dead routes and later adds them back as needed. This patch only handles events for AF_INET. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-01	net/mlx5: Add multipath mode	Roi Dayan	4	-2/+28
	In order to offload ecmp-on-host scheme where next-hop routes are used, we will make use of HW LAG. Add accessor function to let upper layers in the driver to realize if the lag acts in multi-path mode. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-01	net/mlx5: Use own workqueue for lag netdev events processing	Roi Dayan	2	-1/+9
	Instead of using the system workqueue, allocate our own workqueue. This workqueue will be used to handle more work in the next patch. This patch doesn't change functionality. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-01	net/mlx5: Expose lag operations in header file	Roi Dayan	2	-48/+68
	The change is a refactoring step towards a multipath use case. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-01	net/mlx5: Use unsigned int bit instead of bool as a struct member	Roi Dayan	1	-1/+1
	This fixes the following checkpatch check: CHECK: Avoid using bool structure members because of possible alignment issues Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-01	net/mlx5e: Don't make internal use of errno to denote missing neigh	Roi Dayan	2	-14/+22
	EAGAIN is treated as a specific case when we consider the attachment successful but wait for neigh event before offloading the flow. This can result in unwanted behavior when sub calls on the offloading path will return EAGAIN and we pass this error up. Instead of attaching to a specific error code return a boolean value from the attach encap operation saying if the encap is valid or not. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
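The difference between overloading an errno value and returning a separate validity flag can be sketched as follows. The function and its parameters are hypothetical; `sub_call_err` stands in for whatever error a helper on the offloading path might return.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical attach operation. The return value carries only real
 * errors; whether the encap entry is usable yet (neighbour resolved)
 * is reported through *encap_valid.
 */
static int attach_encap_sketch(bool neigh_resolved, int sub_call_err,
			       bool *encap_valid)
{
	*encap_valid = false;

	/* A failing sub call is passed up as is, even if it is -EAGAIN. */
	if (sub_call_err)
		return sub_call_err;

	/* Attachment succeeded; offload may still wait for a neigh event. */
	*encap_valid = neigh_resolved;
	return 0;
}
```

A caller can then tell "attached but waiting for the neighbour" (return 0, flag false) apart from a genuine -EAGAIN coming from deeper in the call chain.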
2019-03-01	net/mlx5e: Cleanup attach encap function	Roi Dayan	1	-14/+17
	Remove the tunnel info argument which we can get from the other args. Also reorder the args to have input args first and output args later. This patch doesn't change functionality. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>