wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2017-06-14	i40e: Fix a sleep-in-atomic bug	Jia-Ju Bai	1	-0/+2
	The driver may sleep under a spin lock, and the function call path is: i40e_ndo_set_vf_port_vlan (acquire the lock by spin_lock_bh) i40e_vsi_remove_pvid i40e_vlan_stripping_disable i40e_aq_update_vsi_params i40e_asq_send_command mutex_lock --> may sleep To fixed it, the spin lock is released before "i40e_vsi_remove_pvid", and the lock is acquired again after this function. Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14	net: don't global ICMP rate limit packets originating from loopback	Jesper Dangaard Brouer	2	-3/+7
	Florian Weimer seems to have a glibc test-case which requires that loopback interfaces does not get ICMP ratelimited. This was broken by commit c0303efeab73 ("net: reduce cycles spend on ICMP replies that gets rate limited"). An ICMP response will usually be routed back-out the same incoming interface. Thus, take advantage of this and skip global ICMP ratelimit when the incoming device is loopback. In the unlikely event that the outgoing it not loopback, due to strange routing policy rules, ICMP rate limiting still works via peer ratelimiting via icmpv4_xrlim_allow(). Thus, we should still comply with RFC1812 (section 4.3.2.8 "Rate Limiting"). This seems to fix the reproducer given by Florian. While still avoiding to perform expensive and unneeded outgoing route lookup for rate limited packets (in the non-loopback case). Fixes: c0303efeab73 ("net: reduce cycles spend on ICMP replies that gets rate limited") Reported-by: Florian Weimer <fweimer@redhat.com> Reported-by: "H.J. Lu" <hjl.tools@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14	net/act_pedit: fix an error code	Dan Carpenter	1	-1/+3
	I'm reviewing static checker warnings where we do ERR_PTR(0), which is the same as NULL. I'm pretty sure we intended to return ERR_PTR(-EINVAL) here. Sometimes these bugs lead to a NULL dereference but I don't immediately see that problem here. Fixes: 71d0ed7079df ("net/act_pedit: Support using offset relative to the conventional network headers") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Amir Vadai <amir@vadai.me> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14	net: update undefined ->ndo_change_mtu() comment	Magnus Damm	1	-2/+1
	Update ->ndo_change_mtu() callback comment to remove text about returning error in case of undefined callback. This change makes the comment match the existing code behavior. Signed-off-by: Magnus Damm <damm+renesas@opensource.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14	net_sched: move tcf_lock down after gen_replace_estimator()	WANG Cong	1	-5/+3
	Laura reported a sleep-in-atomic kernel warning inside tcf_act_police_init() which calls gen_replace_estimator() with spinlock protection. It is not necessary in this case, we already have RTNL lock here so it is enough to protect concurrent writers. For the reader, i.e. tcf_act_police(), it needs to make decision based on this rate estimator, in the worst case we drop more/less packets than necessary while changing the rate in parallel, it is still acceptable. Reported-by: Laura Abbott <labbott@redhat.com> Reported-by: Nick Huber <nicholashuber@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13	caif: Add sockaddr length check before accessing sa_family in connect handler	Mateusz Jurczyk	1	-0/+4
	Verify that the caller-provided sockaddr structure is large enough to contain the sa_family field, before accessing it in the connect() handler of the AF_CAIF socket. Since the syscall doesn't enforce a minimum size of the corresponding memory region, very short sockaddrs (zero or one byte long) result in operating on uninitialized memory while referencing sa_family. Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13	qed: fix dump of context data	Tayar, Tomer	1	-1/+1
	Currently when dumping a context data only word number '1' is read for the entire context. Fixes: c965db444629 ("qed: Add support for debug data collection") Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13	qmi_wwan: new Telewell and Sierra device IDs	Bjørn Mork	1	-0/+4
	A new Sierra Wireless EM7305 device ID used in a Toshiba laptop, and two Longcheer device IDs entries used by Telewell TW-3G HSPA+ branded modems. Reported-by: Petr Kloc <petr_kloc@yahoo.com> Reported-by: Teemu Likonen <tlikonen@iki.fi> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13	net: phy: Fix MDIO_THUNDER dependencies	Florian Fainelli	1	-0/+1
	After commit 90eff9096c01 ("net: phy: Allow splitting MDIO bus/device support from PHYs") we could create a configuration where MDIO_DEVICE=y and PHYLIB=m which leads to the following undefined references: drivers/built-in.o: In function `thunder_mdiobus_pci_remove': >> mdio-thunder.c:(.text+0x2a212f): undefined reference to >> `mdiobus_unregister' >> mdio-thunder.c:(.text+0x2a2138): undefined reference to >> `mdiobus_free' drivers/built-in.o: In function `thunder_mdiobus_pci_probe': mdio-thunder.c:(.text+0x2a22e7): undefined reference to `devm_mdiobus_alloc_size' mdio-thunder.c:(.text+0x2a236f): undefined reference to `of_mdiobus_register' Reported-by: kbuild test robot <fengguang.wu@intel.com> Fixes: 90eff9096c01 ("net: phy: Allow splitting MDIO bus/device support from PHYs") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13	netconsole: Remove duplicate "netconsole: " logging prefix	Joe Perches	1	-1/+1
	It's already added by pr_fmt so remove the explicit use. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13	igmp: acquire pmc lock for ip_mc_clear_src()	WANG Cong	1	-8/+13
	Andrey reported a use-after-free in add_grec(): for (psf = *psf_list; psf; psf = psf_next) { ... psf_next = psf->sf_next; where the struct ip_sf_list's were already freed by: kfree+0xe8/0x2b0 mm/slub.c:3882 ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078 ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618 ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609 inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411 sock_release+0x8d/0x1e0 net/socket.c:597 sock_close+0x16/0x20 net/socket.c:1072 This happens because we don't hold pmc->lock in ip_mc_clear_src() and a parallel mr_ifc_timer timer could jump in and access them. The RCU lock is there but it is merely for pmc itself, this spinlock could actually ensure we don't access them in parallel. Thanks to Eric and Long for discussion on this bug. Reported-by: Andrey Konovalov <andreyknvl@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Xin Long <lucien.xin@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13	r8152: give the device version	Oliver Neukum	1	-0/+2
	Getting the device version out of the driver really aids debugging. Signed-off-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13	net: rps: fix uninitialized symbol warning	Ashwanth Goli	1	-1/+1
	This patch fixes uninitialized symbol warning that got introduced by the following commit 773fc8f6e8d6 ("net: rps: send out pending IPI's on CPU hotplug") Signed-off-by: Ashwanth Goli <ashwanth@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-13	HID: let generic driver yield control iff specific driver has been enabled	Jiri Kosina	1	-61/+221
	There are many situations where generic HID driver provides some basic level of support for certain device, but later this support (usually by implementing vendor-specific extensions of HID protocol) is extended and the support moved over to a separate (usually per-vendor) specific driver. This might bring a rather unpleasant suprise for users, as all of a sudden there is a new config option they have to enable in order to get any support for their device whatsoever, although previous kernel versions provided basic support through the generic driver. Which is rightfully seen as a regression. Fix this by including the entry for a particular device in hid_have_special_driver[] iff the specific config option has been specified, and let generic driver handle the device otherwise. Also make the behavior of hid_scan_report() (where the same decision is being taken on a per-report level) consistent. While at it, reshuffle the hid_have_special_driver[] a bit to restore the alphabetical ordering (first order by config option, and within those sections order by VID). This is considered a short-term solution, before generic way of giving precedence to special drivers and falling back to generic driver is figured out. While at it, fixup a missing entry for GFRM driver; thanks to Hans de Geode for spotting this (and for discovering a few issues in the conversion). Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-06-13	mac80211: don't send SMPS action frame in AP mode when not needed	Emmanuel Grumbach	1	-0/+2
	mac80211 allows to modify the SMPS state of an AP both, when it is started, and after it has been started. Such a change will trigger an action frame to all the peers that are currently connected, and will be remembered so that new peers will get notified as soon as they connect (since the SMPS setting in the beacon may not be the right one). This means that we need to remember the SMPS state currently requested as well as the SMPS state that was configured initially (and advertised in the beacon). The former is bss->req_smps and the latter is sdata->smps_mode. Initially, the AP interface could only be started with SMPS_OFF, which means that sdata->smps_mode was SMPS_OFF always. Later, a nl80211 API was added to be able to start an AP with a different AP mode. That code forgot to update bss->req_smps and because of that, if the AP interface was started with SMPS_DYNAMIC, we had: sdata->smps_mode = SMPS_DYNAMIC bss->req_smps = SMPS_OFF That configuration made mac80211 think it needs to fire off an action frame to any new station connecting to the AP in order to let it know that the actual SMPS configuration is SMPS_OFF. Fix that by properly setting bss->req_smps in ieee80211_start_ap. Fixes: f69931748730 ("mac80211: set smps_mode according to ap params") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-06-13	mac80211/wpa: use constant time memory comparison for MACs	Jason A. Donenfeld	1	-4/+5
	Otherwise, we enable all sorts of forgeries via timing attack. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: linux-wireless@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-06-13	mac80211: set bss_info data before configuring the channel	Johannes Berg	1	-10/+28
	When mac80211 changes the channel, it also calls into the driver's bss_info_changed() callback, e.g. with BSS_CHANGED_IDLE. The driver may, like iwlwifi does, access more data from bss_info in that case and iwlwifi accesses the basic_rates bitmap, but if changing from a band with more (basic) rates to one with fewer, an out-of-bounds access of the rate array may result. While we can't avoid having invalid data at some point in time, we can avoid having it while we call the driver - so set up all the data before configuring the channel, and then apply it afterwards. This fixes https://bugzilla.kernel.org/show_bug.cgi?id=195677 Reported-by: Johannes Hirte <johannes.hirte@datenkhaos.de> Tested-by: Johannes Hirte <johannes.hirte@datenkhaos.de> Debugged-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-06-13	mac80211: remove 5/10 MHz rate code from station MLME	Johannes Berg	1	-21/+3
	There's no need for the station MLME code to handle bitrates for 5 or 10 MHz channels when it can't ever create such a configuration. Remove the unnecessary code. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-06-13	mac80211: Fix incorrect condition when checking rx timestamp	Avraham Stern	1	-1/+1
	If the driver reports the rx timestamp at PLCP start, mac80211 can only handle legacy encoding, but the code checks that the encoding is not legacy. Fix this. Fixes: da6a4352e7c8 ("mac80211: separate encoding/bandwidth from flags") Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-06-13	mac80211: don't look at the PM bit of BAR frames	Emmanuel Grumbach	1	-1/+5
	When a peer sends a BAR frame with PM bit clear, we should not modify its PM state as madated by the spec in 802.11-20012 10.2.1.2. Cc: stable@vger.kernel.org Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-06-12	i40e: fix handling of HW ATR eviction	Jacob Keller	4	-7/+9
	A recent commit to refactor the driver and remove the hw_disabled_flags field accidentally introduced two regressions. First, we overwrote pf->flags which removed various key flags including the MSI-X settings. Additionally, it was intended that we have now two flags, HW_ATR_EVICT_CAPABLE and HW_ATR_EVICT_ENABLED, but this was not done, and we accidentally were mis-using HW_ATR_EVICT_CAPABLE everywhere. This patch adds the missing piece, HW_ATR_EVICT_ENABLED, and safely updates pf->flags instead of overwriting it. Without this patch we will have many problems including disabling MSI-X support, and we'll attempt to use HW ATR eviction on devices which do not support it. Fixes: 47994c119a36 ("i40e: remove hw_disabled_flags in favor of using separate flag bits", 2017-04-19) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12	hsr: fix incorrect warning	Karicheri, Muralidharan	3	-5/+9
	When HSR interface is setup using ip link command, an annoying warning appears with the trace as below:- [ 203.019828] hsr_get_node: Non-HSR frame [ 203.019833] Modules linked in: [ 203.019848] CPU: 0 PID: 158 Comm: sd-resolve Tainted: G W 4.12.0-rc3-00052-g9fa6bf70 #2 [ 203.019853] Hardware name: Generic DRA74X (Flattened Device Tree) [ 203.019869] [<c0110280>] (unwind_backtrace) from [<c010c2f4>] (show_stack+0x10/0x14) [ 203.019880] [<c010c2f4>] (show_stack) from [<c04b9f64>] (dump_stack+0xac/0xe0) [ 203.019894] [<c04b9f64>] (dump_stack) from [<c01374e8>] (__warn+0xd8/0x104) [ 203.019907] [<c01374e8>] (__warn) from [<c0137548>] (warn_slowpath_fmt+0x34/0x44) root@am57xx-evm:~# [ 203.019921] [<c0137548>] (warn_slowpath_fmt) from [<c081126c>] (hsr_get_node+0x148/0x170) [ 203.019932] [<c081126c>] (hsr_get_node) from [<c0814240>] (hsr_forward_skb+0x110/0x7c0) [ 203.019942] [<c0814240>] (hsr_forward_skb) from [<c0811d64>] (hsr_dev_xmit+0x2c/0x34) [ 203.019954] [<c0811d64>] (hsr_dev_xmit) from [<c06c0828>] (dev_hard_start_xmit+0xc4/0x3bc) [ 203.019963] [<c06c0828>] (dev_hard_start_xmit) from [<c06c13d8>] (__dev_queue_xmit+0x7c4/0x98c) [ 203.019974] [<c06c13d8>] (__dev_queue_xmit) from [<c0782f54>] (ip6_finish_output2+0x330/0xc1c) [ 203.019983] [<c0782f54>] (ip6_finish_output2) from [<c0788f0c>] (ip6_output+0x58/0x454) [ 203.019994] [<c0788f0c>] (ip6_output) from [<c07b16cc>] (mld_sendpack+0x420/0x744) As this is an expected path to hsr_get_node() with frame coming from the master interface, add a check to ensure packet is not from the master port and then warn. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12	proc: snmp6: Use correct type in memset	Christian Perle	1	-1/+1
	Reading /proc/net/snmp6 yields bogus values on 32 bit kernels. Use "u64" instead of "unsigned long" in sizeof(). Fixes: 4a4857b1c81e ("proc: Reduce cache miss in snmp6_seq_show") Signed-off-by: Christian Perle <christian.perle@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12	cpuidle: dt: Add missing 'of_node_put()'	Christophe Jaillet	1	-1/+3
	'of_node_put()' should be called on pointer returned by 'of_parse_phandle()' when done. In this function this is done in all path except this 'continue', so add it. Fixes: 97735da074fd (drivers: cpuidle: Add status property to ARM idle states) Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-12	cpufreq: conservative: Allow down_threshold to take values from 1 to 10	Tomasz Wilczyński	1	-2/+2
	Commit 27ed3cd2ebf4 (cpufreq: conservative: Fix the logic in frequency decrease checking) removed the 10 point substraction when comparing the load against down_threshold but did not remove the related limit for the down_threshold value. As a result, down_threshold lower than 11 is not allowed even though values from 1 to 10 do work correctly too. The comment ("cannot be lower than 11 otherwise freq will not fall") also does not apply after removing the substraction. For this reason, allow down_threshold to take any value from 1 to 99 and fix the related comment. Fixes: 27ed3cd2ebf4 (cpufreq: conservative: Fix the logic in frequency decrease checking) Signed-off-by: Tomasz Wilczyński <twilczynski@naver.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 3.10+ <stable@vger.kernel.org> # 3.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-12	Revert "cpufreq: schedutil: Reduce frequencies slower"	Rafael J. Wysocki	1	-3/+0
	Revert commit 39b64aa1c007 (cpufreq: schedutil: Reduce frequencies slower) that introduced unintentional changes in behavior leading to adverse effects on some systems. Reported-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-12	ACPICA: Tables: Mechanism to handle late stage acpi_get_table() imbalance	Lv Zheng	2	-9/+39
	Considering this case: 1. A program opens a sysfs table file 65535 times, it can increase validation_count and first increment cause the table to be mapped: validation_count = 65535 2. AML execution causes "Load" to be executed on the same table, this time it cannot increase validation_count, so validation_count remains: validation_count = 65535 3. The program closes sysfs table file 65535 times, it can decrease validation_count and the last decrement cause the table to be unmapped: validation_count = 0 4. AML code still accessing the loaded table, kernel crash can be observed. To prevent that from happening, add a validation_count threashold. When it is reached, the validation_count can no longer be incremented/decremented to invalidate the table descriptor (means preventing table unmappings) Note that code added in acpi_tb_put_table() is actually a no-op but changes the warning message into a "warn once" one. Lv Zheng. Signed-off-by: Lv Zheng <lv.zheng@intel.com> [ rjw: Changelog, comments ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-12	PM / devfreq: exynos-ppmu: Staticize event list	Krzysztof Kozlowski	1	-1/+1
	The ppmu_events array is accessed only in this compilation unit so it can be made static. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
2017-06-12	PM / devfreq: exynos-ppmu: Handle return value of clk_prepare_enable	Arvind Yadav	1	-1/+5
	clk_prepare_enable() can fail here and we must check its return value. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
2017-06-12	PM / devfreq: exynos-nocp: Handle return value of clk_prepare_enable	Arvind Yadav	1	-1/+5
	clk_prepare_enable() can fail here and we must check its return value. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
2017-06-11	Linux 4.12-rc5	Linus Torvalds	1	-1/+1

2017-06-11	compiler, clang: properly override 'inline' for clang	Linus Torvalds	1	-1/+2
	Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused static inline functions") just caused more warnings due to re-defining the 'inline' macro. So undef it before re-defining it, and also add the 'notrace' attribute like the gcc version that this is overriding does. Maybe this makes clang happier. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-11	net: ipmr: Fix some mroute forwarding issues in vrf's	Donald Sharp	1	-17/+15
	This patch fixes two issues: 1) When forwarding on *,G mroutes that are in a vrf, the kernel was dropping information about the actual incoming interface when calling ip_mr_forward from ip_mr_input. This caused ip_mr_forward to send the multicast packet back out the incoming interface. Fix this by modifying ip_mr_forward to be handed the correctly resolved dev. 2) When a unresolved cache entry is created we store the incoming skb on the unresolved cache entry and upon mroute resolution from the user space daemon, we attempt to forward the packet. Again we were not resolving to the correct incoming device for a vrf scenario, before calling ip_mr_forward. Fix this by resolving to the correct interface and calling ip_mr_forward with the result. Fixes: e58e41596811 ("net: Enable support for VRF with ipv4 multicast") Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11	net: ena: update ena driver to version 1.1.7	Netanel Belgazal	1	-1/+1
	Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11	net: ena: bug fix in lost tx packets detection mechanism	Netanel Belgazal	3	-31/+50
	check_for_missing_tx_completions() is called from a timer task and looking for lost tx packets. The old implementation accumulate all the lost tx packets and did not check if those packets were retrieved on a later stage. This cause to a situation where the driver reset the device for no reason. Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11	net: ena: disable admin msix while working in polling mode	Netanel Belgazal	1	-0/+8
	Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11	net: ena: fix theoretical Rx hang on low memory systems	Netanel Belgazal	3	-0/+58
	For the rare case where the device runs out of free rx buffer descriptors (in case of pressure on kernel memory), and the napi handler continuously fail to refill new Rx descriptors until device rx queue totally runs out of all free rx buffers to post incoming packet, leading to a deadlock: * The device won't send interrupts since all the new Rx packets will be dropped. * The napi handler won't try to allocate new Rx descriptors since allocation is part of NAPI that's not being invoked any more The fix involves detecting this scenario and rescheduling NAPI (to refill buffers) by the keepalive/watchdog task. Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11	net: ena: add missing unmap bars on device removal	Netanel Belgazal	1	-4/+11
	This patch also change the mapping functions to devm_ functions Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11	net: ena: fix race condition between submit and completion admin command	Netanel Belgazal	1	-4/+2
	Bug: "Completion context is occupied" error printout will be noticed in dmesg. This error will cause the admin command to fail, which will lead to an ena_probe() failure or a watchdog reset (depends on which admin command failed). Root cause: __ena_com_submit_admin_cmd() is the function that submits new entries to the admin queue. The function have a check that makes sure the queue is not full and the function does not override any outstanding command. It uses head and tail indexes for this check. The head is increased by ena_com_handle_admin_completion() which runs from interrupt context, and the tail index is increased by the submit function (the function is running under ->q_lock, so there is no risk of multithread increment). Each command is associated with a completion context. This context allocated before call to __ena_com_submit_admin_cmd() and freed by ena_com_wait_and_process_admin_cq_interrupts(), right after the command was completed. This can lead to a state where the head was increased, the check passed, but the completion context is still in use. Solution: Use the atomic variable ->outstanding_cmds instead of using the head and the tail indexes. This variable is safe for use since it is bumped in get_comp_ctx() in __ena_com_submit_admin_cmd() and is freed by comp_ctxt_release() Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11	net: ena: add missing return when ena_com_get_io_handlers() fails	Netanel Belgazal	1	-0/+2
	Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11	net: ena: fix bug that might cause hang after consecutive open/close interface.	Netanel Belgazal	1	-15/+26
	Fixing a bug that the driver does not unmask the IO interrupts in ndo_open(): occasionally, the MSI-X interrupt (for one or more IO queues) can be masked when ndo_close() was called. If that is followed by ndo open(), then the MSI-X will be still masked so no interrupt will be received by the driver. Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11	net: ena: fix rare uncompleted admin command false alarm	Netanel Belgazal	1	-10/+11
	The current flow to detect admin completion is: while (command_not_completed) { if (timeout) error check_for_completion() sleep() } So in case the sleep took more than the timeout (in case the thread/workqueue was not scheduled due to higher priority task or prolonged VMexit), the driver can detect a stall even if the completion is present. The fix changes the order of this function to first check for completion and only after that check if the timeout expired. Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11	net/mlx5: Enable 4K UAR only when page size is bigger than 4K	Majd Dibbiny	1	-2/+4
	When the page size isn't bigger than 4K, there is no added value of enabling 4K UAR feature in the Firmware. Modified the condition of enabling the 4K UAR accordingly. Fixes: f502d834950a ("net/mlx5: Activate support for 4K UARs") Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11	net/mlx5e: Fix wrong indications in DIM due to counter wraparound	Tal Gilboa	2	-7/+11
	DIM (Dynamically-tuned Interrupt Moderation) is a mechanism designed for changing the channel interrupt moderation values in order to reduce CPU overhead for all traffic types. Each iteration of the algorithm, DIM calculates the difference in throughput, packet rate and interrupt rate from last iteration in order to make a decision. DIM relies on counters for each metric. When these counters get to their type's max value they wraparound. In this case the delta between 'end' and 'start' samples is negative and when translated to unsigned integers - very high. This results in a false indication to the algorithm and might result in a wrong decision. The fix calculates the 'distance' between 'end' and 'start' samples in a cyclic way around the relevant type's max value. It can also be viewed as an absolute value around the type's max value instead of around 0. Testing show higher stability in DIM profile selection and no wraparound issues. Fixes: cb3c7fd4f839 ("net/mlx5e: Support adaptive RX coalescing") Signed-off-by: Tal Gilboa <talgi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11	net/mlx5e: Added BW check for DIM decision mechanism	Tal Gilboa	2	-17/+22
	DIM (Dynamically-tuned Interrupt Moderation) is a mechanism designed for changing the channel interrupt moderation values in order to reduce CPU overhead for all traffic types. Until now only interrupt and packet rate were sampled. We found a scenario on which we get a false indication since a change in DIM caused more aggregation and reduced packet rate while increasing BW. We now regard a change as succesfull iff: current_BW > (prev_BW + threshold) or current_BW ~= prev_BW and current_PR > (prev_PR + threshold) or current_BW ~= prev_BW and current_PR ~= prev_PR and current_IR < (prev_IR - threshold) Where BW = Bandwidth, PR = Packet rate and IR = Interrupt rate Improvements (ConnectX-4Lx 25GbE, single RX queue, LRO off) -------------------------------------------------- packet size \| before[Mb/s] \| after[Mb/s] \| gain \| 2B \| 343.4 \| 359.4 \| 4.5% \| 16B \| 2739.7 \| 2814.8 \| 2.7% \| 64B \| 9739 \| 10185.3 \| 4.5% \| Fixes: cb3c7fd4f839 ("net/mlx5e: Support adaptive RX coalescing") Signed-off-by: Tal Gilboa <talgi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11	net/mlx5: Remove several module events out of ethtool stats	Huy Nguyen	1	-9/+2
	Remove the following module event counters out of ethtool stats. The reason for removing these event counters is that these events do not occur without techinician's intervention. module_pwr_budget_exd module_long_range module_no_eeprom module_enforce_part module_unknown_id module_unknown_status module_plug Fixes: bedb7c909c19 ("net/mlx5e: Add port module event counters to ethtool stats") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11	net/mlx5: Continue health polling until it is explicitly stopped	Mohamad Haj Yahia	1	-6/+5
	The issue is that when we get an assert we will stop polling the health and thus we cant enter error state when we have a real health issue. Fixes: fd76ee4da55a ('net/mlx5_core: Fix internal error detection conditions') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11	net/mlx5: Fix create vport flow table flow	Mohamad Haj Yahia	1	-1/+1
	Send vport number to the create flow table inner method instead of ignoring the vport argument and sending always 0. Fixes: b3ba51498bdd ('net/mlx5: Refactor create flow table method to accept underlay QP') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11	KVM: async_pf: avoid async pf injection when in guest mode	Wanpeng Li	3	-4/+7
	INFO: task gnome-terminal-:1734 blocked for more than 120 seconds. Not tainted 4.12.0-rc4+ #8 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. gnome-terminal- D 0 1734 1015 0x00000000 Call Trace: __schedule+0x3cd/0xb30 schedule+0x40/0x90 kvm_async_pf_task_wait+0x1cc/0x270 ? __vfs_read+0x37/0x150 ? prepare_to_swait+0x22/0x70 do_async_page_fault+0x77/0xb0 ? do_async_page_fault+0x77/0xb0 async_page_fault+0x28/0x30 This is triggered by running both win7 and win2016 on L1 KVM simultaneously, and then gives stress to memory on L1, I can observed this hang on L1 when at least ~70% swap area is occupied on L0. This is due to async pf was injected to L2 which should be injected to L1, L2 guest starts receiving pagefault w/ bogus %cr2(apf token from the host actually), and L1 guest starts accumulating tasks stuck in D state in kvm_async_pf_task_wait() since missing PAGE_READY async_pfs. This patch fixes the hang by doing async pf when executing L1 guest. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-06-10	hexagon: Use raw_copy_to_user	Guenter Roeck	1	-3/+2
	Commit ac4691fac8ad ("hexagon: switch to RAW_COPY_USER") replaced __copy_to_user_hexagon() with raw_copy_to_user(), but did not catch all callers, resulting in the following build error. arch/hexagon/mm/uaccess.c: In function '__clear_user_hexagon': arch/hexagon/mm/uaccess.c:40:3: error: implicit declaration of function '__copy_to_user_hexagon' Fixes: ac4691fac8ad ("hexagon: switch to RAW_COPY_USER") Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Richard Kuo <rkuo@codeaurora.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>