linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2016-07-03	Linux 4.7-rc6	Linus Torvalds	1	-1/+1

2016-07-03	ovl: warn instead of error if d_type is not supported	Vivek Goyal	1	-5/+7
	overlay needs underlying fs to support d_type. Recently I put in a patch in to detect this condition and started failing mount if underlying fs did not support d_type. But this breaks existing configurations over kernel upgrade. Those who are running docker (partially broken configuration) with xfs not supporting d_type, are surprised that after kernel upgrade docker does not run anymore. https://github.com/docker/docker/issues/22937#issuecomment-229881315 So instead of erroring out, detect broken configuration and warn about it. This should allow existing docker setups to continue working after kernel upgrade. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 45aebeaf4f67 ("ovl: Ensure upper filesystem supports d_type") Cc: <stable@vger.kernel.org> 4.6
2016-07-02	MIPS: Fix possible corruption of cache mode by mprotect.	Ralf Baechle	1	-4/+6
	The following testcase may result in a page table entries with a invalid CCA field being generated: static void bindstack; static int sysrqfd; static void protect_low(int protect) { mprotect(bindstack, BINDSTACK_SIZE, protect); } static void sigbus_handler(int signal, siginfo_t info, void context) { void addr = info->si_addr; write(sysrqfd, "x", 1); printf("sigbus, fault address %p (should not happen, but might)\n", addr); abort(); } static void run_bind_test(void) { unsigned int p = bindstack; p[0] = 0xf001f001; write(sysrqfd, "x", 1); / Set trap on access to p[0] / protect_low(PROT_NONE); write(sysrqfd, "x", 1); / Clear trap on access to p[0] / protect_low(PROT_READ \| PROT_WRITE \| PROT_EXEC); write(sysrqfd, "x", 1); / Check the contents of p[0] / if (p[0] != 0xf001f001) { write(sysrqfd, "x", 1); / Reached, but shouldn't be / printf("badness, shouldn't happen but does\n"); abort(); } } int main(void) { struct sigaction sa; sysrqfd = open("/proc/sysrq-trigger", O_WRONLY); if (sigprocmask(SIG_BLOCK, NULL, &sa.sa_mask)) { perror("sigprocmask"); return 0; } sa.sa_sigaction = sigbus_handler; sa.sa_flags = SA_SIGINFO \| SA_NODEFER \| SA_RESTART; if (sigaction(SIGBUS, &sa, NULL)) { perror("sigaction"); return 0; } bindstack = mmap(NULL, BINDSTACK_SIZE, PROT_READ \| PROT_WRITE \| PROT_EXEC, MAP_PRIVATE \| MAP_ANONYMOUS, -1, 0); if (bindstack == MAP_FAILED) { perror("mmap bindstack"); return 0; } printf("bindstack: %p\n", bindstack); run_bind_test(); printf("done\n"); return 0; } There are multiple ingredients for this: 1) PAGE_NONE is defined to _CACHE_CACHABLE_NONCOHERENT, which is CCA 3 on all platforms except SB1 where it's CCA 5. 2) _page_cachable_default must have bits set which are not set _CACHE_CACHABLE_NONCOHERENT. 3) Either the defective version of pte_modify for XPA or the standard version must be in used. However pte_modify for the 36 bit address space support is no affected. In that case additional bits in the final CCA mode may generate an invalid value for the CCA field. On the R10000 system where this was tracked down for example a CCA 7 has been observed, which is Uncached Accelerated. Fixed by: 1) Using the proper CCA mode for PAGE_NONE just like for all the other PAGE_ pte/pmd bits. 2) Fix the two affected variants of pte_modify. Further code inspection also shows the same issue to exist in pmd_modify which would affect huge page systems. Issue in pte_modify tracked down by Alastair Bridgewater, PAGE_NONE and pmd_modify issue found by me. The history of this goes back beyond Linus' git history. Chris Dearman's commit 351336929ccf222ae38ff0cb7a8dd5fd5c6236a0 ("[MIPS] Allow setting of the cache attribute at run time.") missed the opportunity to fix this but it was originally introduced in lmo commit d523832cf12007b3242e50bb77d0c9e63e0b6518 ("Missing from last commit.") and 32cc38229ac7538f2346918a09e75413e8861f87 ("New configuration option CONFIG_MIPS_UNCACHED.") Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Reported-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
2016-07-01	locks: use file_inode()	Miklos Szeredi	1	-1/+1
	(Another one for the f_path debacle.) ltp fcntl33 testcase caused an Oops in selinux_file_send_sigiotask. The reason is that generic_add_lease() used filp->f_path.dentry->inode while all the others use file_inode(). This makes a difference for files opened on overlayfs since the former will point to the overlay inode the latter to the underlying inode. So generic_add_lease() added the lease to the overlay inode and generic_delete_lease() removed it from the underlying inode. When the file was released the lease remained on the overlay inode's lock list, resulting in use after free. Reported-by: Eryu Guan <eguan@redhat.com> Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay") Cc: <stable@vger.kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-07-01	usb: dwc3: st: Use explicit reset_control_get_exclusive() API	Lee Jones	1	-1/+2
	We're making all reset line users specify whether their lines are shared with other IP or they operate them exclusively. In this case the line is exclusively used only by this IP, so use the *_exclusive() API accordingly. Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-07-01	phy: phy-stih407-usb: Use explicit reset_control_get_exclusive() API	Lee Jones	1	-1/+1
	We're making all reset line users specify whether their lines are shared with other IP or they operate them exclusively. In this case the line is exclusively used only by this IP, so use the *_exclusive() API accordingly. Acked-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-07-01	phy: miphy28lp: Inform the reset framework that our reset line may be shared	Lee Jones	1	-1/+2
	On the STiH410 B2120 development board the MiPHY28lp shares its reset line with the Synopsys DWC3 SuperSpeed (SS) USB 3.0 Dual-Role-Device (DRD). New functionality in the reset subsystems forces consumers to be explicit when requesting shared/exclusive reset lines. Acked-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-06-30	namespace: update event counter when umounting a deleted dentry	Andrey Ulanov	1	-0/+1
	- m_start() in fs/namespace.c expects that ns->event is incremented each time a mount added or removed from ns->list. - umount_tree() removes items from the list but does not increment event counter, expecting that it's done before the function is called. - There are some codepaths that call umount_tree() without updating "event" counter. e.g. from __detach_mounts(). - When this happens m_start may reuse a cached mount structure that no longer belongs to ns->list (i.e. use after free which usually leads to infinite loop). This change fixes the above problem by incrementing global event counter before invoking umount_tree(). Change-Id: I622c8e84dcb9fb63542372c5dbf0178ee86bb589 Cc: stable@vger.kernel.org Signed-off-by: Andrey Ulanov <andreyu@google.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-06-30	9p: use file_dentry()	Miklos Szeredi	1	-3/+3
	v9fs may be used as lower layer of overlayfs and accessing f_path.dentry can lead to a crash. In this case it's a NULL pointer dereference in p9_fid_create(). Fix by replacing direct access of file->f_path.dentry with the file_dentry() accessor, which will always return a native object. Reported-by: Alessio Igor Bogani <alessioigorbogani@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Tested-by: Alessio Igor Bogani <alessioigorbogani@gmail.com> Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay") Cc: <stable@vger.kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-06-30	lockd: unregister notifier blocks if the service fails to come up completely	Scott Mayhew	1	-3/+10
	If the lockd service fails to start up then we need to be sure that the notifier blocks are not registered, otherwise a subsequent start of the service could cause the same notifier to be registered twice, leading to soft lockups. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Cc: stable@vger.kernel.org Fixes: 0751ddf77b6a "lockd: Register callbacks on the inetaddr_chain..." Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2016-06-30	ACPI,PCI,IRQ: correct operator precedence	Sinan Kaya	1	-1/+1
	The omitted parenthesis prevents the addition operation when acpi_penalize_isa_irq function is called. Fixes: 103544d86976 (ACPI,PCI,IRQ: reduce resource requirements) Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-30	fuse: serialize dirops by default	Miklos Szeredi	4	-2/+37
	Negotiate with userspace filesystems whether they support parallel readdir and lookup. Disable parallelism by default for fear of breaking fuse filesystems. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 9902af79c01a ("parallel lookups: actual switch to rwsem") Fixes: d9b3dbdcfd62 ("fuse: switch to ->iterate_shared()")
2016-06-30	drm/i915: Fix missing unlock on error in i915_ppgtt_info()	Wei Yongjun	1	-2/+2
	Add the missing unlock before return from function i915_ppgtt_info() in the error handling case. Fixes: 1d2ac403ae3b(drm: Protect dev->filelist with its own mutex) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1465861320-26221-1-git-send-email-weiyj_lk@163.com (cherry picked from commit b0212486909de4f239ca9f20d032de1b1f2dc52e) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-06-30	powerpc: Initialise pci_io_base as early as possible	Darren Stevens	4	-1/+10
	Commit d6a9996e84ac ("powerpc/mm: vmalloc abstraction in preparation for radix") turned kernel memory and IO addresses from #defined constants to variables initialised at runtime. On PA6T (pasemi) systems the setup_arch() machine call initialises the onboard PCI-e root-ports, and uses pci_io_base to do this, which is now before its value has been set, resulting in a panic early in boot before console IO is initialised. Move the pci_io_base initialisation to the same place as vmalloc ranges are set (hash__early_init_mmu()/radix__early_init_mmu()) - this is the earliest possible place we can initialise it. Fixes: d6a9996e84ac ("powerpc/mm: vmalloc abstraction in preparation for radix") Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Signed-off-by: Darren Stevens <darren@stevens-zone.net> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Add #ifdef CONFIG_PCI, massage change log slightly] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-30	mfd: da9053: Fix compiler warning message for uninitialised variable	Steve Twiss	1	-1/+1
	Fix compiler warning caused by an uninitialised variable inside da9052_group_write() function. Defaulting the value to zero covers the trivial case. Signed-off-by: Steve Twiss <stwiss.opensource@diasemi.com> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-06-30	mfd: max77620: Fix FPS switch statements	Rhyland Klein	1	-0/+2
	When configuring FPS during probe, assuming a DT node is present for FPS, the code can run into a problem with the switch statements in max77620_config_fps() and max77620_get_fps_period_reg_value(). Namely, in the case of chip->chip_id == MAX77620, it will set fps_[mix\|max]_period but then fall through to the default switch case and return -EINVAL. Returning this from max77620_config_fps() will cause probe to fail. Signed-off-by: Rhyland Klein <rklein@nvidia.com> Reviewed-by: Laxman Dewangan <ldewangan@nvidia.com> Reviewed-by: Thierry Reding <treding@nvidia.com> Tested-by: Thierry Reding <treding@nvidia.com> Tested-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-06-30	phy: phy-stih407-usb: Inform the reset framework that our reset line may be shared	Lee Jones	1	-1/+1
	On the STiH410 B2120 development board the ports on the Generic PHY share their reset lines with each other. New functionality in the reset subsystems forces consumers to be explicit when requesting shared/exclusive reset lines. Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-06-30	usb: dwc3: st: Inform the reset framework that our reset line may be shared	Lee Jones	1	-1/+2
	On the STiH410 B2120 development board the MiPHY28lp shares its reset line with the Synopsys DWC3 SuperSpeed (SS) USB 3.0 Dual-Role-Device (DRD). New functionality in the reset subsystems forces consumers to be explicit when requesting shared/exclusive reset lines. Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-06-30	usb: host: ehci-st: Inform the reset framework that our reset line may be shared	Lee Jones	1	-2/+4
	On the STiH410 B2120 development board the ST EHCI IP shares its reset line with the OHCI IP. New functionality in the reset subsystems forces consumers to be explicit when requesting shared/exclusive reset lines. Acked-by: Peter Griffin <peter.griffin@linaro.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-06-30	usb: host: ohci-st: Inform the reset framework that our reset line may be shared	Lee Jones	1	-2/+4
	On the STiH410 B2120 development board the ST EHCI IP shares its reset line with the OHCI IP. New functionality in the reset subsystems forces consumers to be explicit when requesting shared/exclusive reset lines. Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2016-06-29	reset: TRIVIAL: Add line break at same place for similar APIs	Lee Jones	1	-2/+2
	Standardise the way inline functions: devm_reset_control_get_shared_by_index devm_reset_control_get_exclusive_by_index ... are formatted. Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2016-06-29	reset: Supply _shared variant calls when using _optional APIs	Lee Jones	1	-0/+12
	Consumers need to be able to specify whether they are requesting an 'exclusive' or 'shared' reset line no matter which API (of_, devm_, etc) they are using. This change allows users of the optional_* API in particular to specify that their request is for a 'shared' line. Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2016-06-29	reset: Supply _shared variant calls when using of_ API	Lee Jones	1	-0/+53
	Consumers need to be able to specify whether they are requesting an 'exclusive' or 'shared' reset line no matter which API (of_, devm_, etc) they are using. This change allows users of the of_* API in particular to specify that their request is for a 'shared' line. Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2016-06-29	reset: Ensure drivers are explicit when requesting reset lines	Lee Jones	1	-24/+82
	Phasing out generic reset line requests enables us to make some better decisions on when and how to (de)assert said lines. If an 'exclusive' line is requested, we know a device requires a reset and that it's preferable to act upon a request right away. However, if a 'shared' reset line is requested, we can reasonably assume sure that placing a device into reset isn't a hard requirement, but probably a measure to save power and is thus able to cope with not being asserted if another device is still in use. In order allow gentle adoption and not to forcing all consumers to move to the API immediately, causing administration headache between subsystems, this patch adds some temporary stand-in shim-calls. This will ease the burden at merge time and allow subsystems to migrate over to the new API in a more realistic time-frame. Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2016-06-29	reset: Reorder inline reset_control_get*() wrappers	Lee Jones	1	-21/+21
	We're about to split the current API into two, where consumers will be forced to be explicit when requesting reset lines. The choice will be to either the call the _exclusive or _shared variant depending on whether they can actually tolorate not being asserted when that request is made. The new API will look like this once reorded and complete: reset_control_get_exclusive() reset_control_get_shared() reset_control_get_optional_exclusive() reset_control_get_optional_shared() of_reset_control_get_exclusive() of_reset_control_get_shared() of_reset_control_get_exclusive_by_index() of_reset_control_get_shared_by_index() devm_reset_control_get_exclusive() devm_reset_control_get_shared() devm_reset_control_get_optional_exclusive() devm_reset_control_get_optional_shared() devm_reset_control_get_exclusive_by_index() devm_reset_control_get_shared_by_index() Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2016-06-29	nfit: fix format interface code byte order	Dan Williams	2	-8/+8
	Per JEDEC Annex L Release 3 the SPD data is: Bits 9~5 00 000 = Function Undefined 00 001 = Byte addressable energy backed 00 010 = Block addressed 00 011 = Byte addressable, no energy backed All other codes reserved Bits 4~0 0 0000 = Proprietary interface 0 0001 = Standard interface 1 All other codes reserved; see Definitions of Functions ...and per the ACPI 6.1 spec: byte0: Bits 4~0 (0 or 1) byte1: Bits 9~5 (1, 2, or 3) ...so a format interface code displayed as 0x301 should be stored in the nfit as (0x1, 0x3), little-endian. Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Robert Moore <robert.moore@intel.com> Cc: Robert Elliott <elliott@hpe.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=121161 Fixes: 30ec5fd464d5 ("nfit: fix format interface code byte order per ACPI6.1") Fixes: 5ad9a7fde07a ("acpi/nfit: Update nfit driver to comply with ACPI 6.1") Reported-by: Kristin Jacque <kristin.jacque@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-06-29	regulator: max77620: check for valid regulator info	Venkat Reddy Talla	1	-1/+6
	SD4 regulator is not registered with regulator core framework in probe as there is no support in MAX77620 PMIC, removing SD4 entry from MAX77620 regulator information list and checking for valid regulator information data before configuring FPS source and FPS power up/down period to avoid NULL pointer exception if regulator not registered with core. Signed-off-by: Venkat Reddy Talla <vreddytalla@nvidia.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2016-06-29	drm/amd/powerplay: workaround for UVD clock issue	Rex Zhu	1	-0/+4
	workaround issue that when uvd dpm disabled, uvd clock remain high on polaris10. Manually turn off the clocks. Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Reviewed-by: Ken Wang <Qingqing.Wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-06-29	drm/amdgpu: add ACLK_CNTL setting for polaris10	Ken Wang	1	-0/+3
	This is a temporary workaround for early boards. Signed-off-by: Ken Wang <Qingqing.Wang@amd.com> Reviewed-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-06-29	drm/amd/powerplay: fix issue uvd dpm can't enabled on Polaris11.	Rex Zhu	1	-41/+60
	1. Populate correct value of VDDCI voltage for SMC SAMU, VCE, and UVD levels depending on whether VDDCi control is SVI2 or GPIO. 2. Populate SMC ACPI minimum voltage using VBIOS boot SCLK and MCLK When static voltage is configured as VDDCI, driver still tries to program a voltage for MM minVoltage using VDDC-VDDCI delta requirement. minVoltage should be set as boot up voltage. Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-06-29	drm/amd/powerplay: Workaround for Memory EDC Error on Polaris10.	Rex Zhu	1	-0/+26
	Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-06-29	ovl: get_write_access() in truncate	Miklos Szeredi	1	-0/+21
	When truncating a file we should check write access on the underlying inode. And we should do so on the lower file as well (before copy-up) for consistency. Original patch and test case by Aihua Zhang. - - >o >o - - test.c - - >o >o - - #include <stdio.h> #include <errno.h> #include <unistd.h> int main(int argc, char *argv[]) { int ret; ret = truncate(argv[0], 4096); if (ret != -1) { fprintf(stderr, "truncate(argv[0]) should have failed\n"); return 1; } if (errno != ETXTBSY) { perror("truncate(argv[0])"); return 1; } return 0; } - - >o >o - - >o >o - - >o >o - - Reported-by: Aihua Zhang <zhangaihua1@huawei.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org>
2016-06-29	openvswitch: fix conntrack netlink event delivery	Samuel Gauthier	1	-2/+12
	Only the first and last netlink message for a particular conntrack are actually sent. The first message is sent through nf_conntrack_confirm when the conntrack is committed. The last one is sent when the conntrack is destroyed on timeout. The other conntrack state change messages are not advertised. When the conntrack subsystem is used from netfilter, nf_conntrack_confirm is called for each packet, from the postrouting hook, which in turn calls nf_ct_deliver_cached_events to send the state change netlink messages. This commit fixes the problem by calling nf_ct_deliver_cached_events in the non-commit case as well. Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action") CC: Joe Stringer <joestringer@nicira.com> CC: Justin Pettit <jpettit@nicira.com> CC: Andy Zhou <azhou@nicira.com> CC: Thomas Graf <tgraf@suug.ch> Signed-off-by: Samuel Gauthier <samuel.gauthier@6wind.com> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29	qed: Protect the doorbell BAR with the write barriers.	Sudarsana Reddy Kalluru	1	-7/+3
	SPQ doorbell is currently protected with the compilation barrier. Under the stress scenarios, we may get into a state where (due to the weak ordering) several ramrod doorbells were written to the BAR with an out-of-order producer values. Need to change the barrier type to a write barrier to make sure that the write buffer is flushed after each doorbell. Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29	neigh: Explicitly declare RCU-bh read side critical section in neigh_xmit()	David Barroso	1	-1/+5
	neigh_xmit() expects to be called inside an RCU-bh read side critical section, and while one of its two current callers gets this right, the other one doesn't. More specifically, neigh_xmit() has two callers, mpls_forward() and mpls_output(), and while both callers call neigh_xmit() under rcu_read_lock(), this provides sufficient protection for neigh_xmit() only in the case of mpls_forward(), as that is always called from softirq context and therefore doesn't need explicit BH protection, while mpls_output() can be called from process context with softirqs enabled. When mpls_output() is called from process context, with softirqs enabled, we can be preempted by a softirq at any time, and RCU-bh considers the completion of a softirq as signaling the end of any pending read-side critical sections, so if we do get a softirq while we are in the part of neigh_xmit() that expects to be run inside an RCU-bh read side critical section, we can end up with an unexpected RCU grace period running right in the middle of that critical section, making things go boom. This patch fixes this impedance mismatch in the callee, by making neigh_xmit() always take rcu_read_{,un}lock_bh() around the code that expects to be treated as an RCU-bh read side critical section, as this seems a safer option than fixing it in the callers. Fixes: 4fd3d7d9e868f ("neigh: Add helper function neigh_xmit") Signed-off-by: David Barroso <dbarroso@fastly.com> Signed-off-by: Lennert Buytenhek <lbuytenhek@fastly.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29	e1000e: keep VLAN interfaces functional after rxvlan off	Jarod Wilson	1	-2/+13
	I've got a bug report about an e1000e interface, where a VLAN interface is set up on top of it: $ ip link add link ens1f0 name ens1f0.99 type vlan id 99 $ ip link set ens1f0 up $ ip link set ens1f0.99 up $ ip addr add 192.168.99.92 dev ens1f0.99 At this point, I can ping another host on vlan 99, ip 192.168.99.91. However, if I do the following: $ ethtool -K ens1f0 rxvlan off Then no traffic passes on ens1f0.99. It comes back if I toggle rxvlan on again. I'm not sure if this is actually intended behavior, or if there's a lack of software VLAN stripping fallback, or what, but things continue to work if I simply don't call e1000e_vlan_strip_disable() if there are active VLANs (plagiarizing a function from the e1000 driver here) on the interface. Also slipped a related-ish fix to the kerneldoc text for e1000e_vlan_strip_disable here... Signed-off-by: Jarod Wilson <jarod@redhat.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29	cfg80211: fix proto in ieee80211_data_to_8023 for frames without LLC header	Felix Fietkau	1	-1/+1
	The PDU length of incoming LLC frames is set to the total skb payload size in __ieee80211_data_to_8023() of net/wireless/util.c which incorrectly includes the length of the IEEE 802.11 header. The resulting LLC frame header has a too large PDU length, causing the llc_fixup_skb() function of net/llc/llc_input.c to reject the incoming skb, effectively breaking STP. Solve the problem by properly substracting the IEEE 802.11 frame header size from the PDU length, allowing the LLC processor to pick up the incoming control messages. Special thanks to Gerry Rozema for tracking down the regression and proposing a suitable patch. Fixes: 2d1c304cb2d5 ("cfg80211: add function for 802.3 conversion with separate output buffer") Cc: stable@vger.kernel.org Reported-by: Gerry Rozema <gerryr@rozeware.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2016-06-29	qlcnic: use the correct ring in qlcnic_83xx_process_rcv_ring_diag()	Dan Carpenter	1	-1/+1
	There is a static checker warning here "warn: mask and shift to zero" and the code sets "ring" to zero every time. From looking at how QLCNIC_FETCH_RING_ID() is used in qlcnic_83xx_process_rcv_ring() the qlcnic_83xx_hndl() should be removed. Fixes: 4be41e92f7c6 ('qlcnic: 83xx data path routines') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29	bpf, perf: delay release of BPF prog after grace period	Daniel Borkmann	2	-1/+5
	Commit dead9f29ddcc ("perf: Fix race in BPF program unregister") moved destruction of BPF program from free_event_rcu() callback to __free_event(), which is problematic if used with tail calls: if prog A is attached as trace event directly, but at the same time present in a tail call map used by another trace event program elsewhere, then we need to delay destruction via RCU grace period since it can still be in use by the program doing the tail call (the prog first needs to be dropped from the tail call map, then trace event with prog A attached destroyed, so we get immediate destruction). Fixes: dead9f29ddcc ("perf: Fix race in BPF program unregister") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Jann Horn <jann@thejh.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29	net: bridge: fix vlan stats continue counter	Nikolay Aleksandrov	1	-1/+1
	I made a dumb off-by-one mistake when I added the vlan stats counter dumping code. The increment should happen before the check, not after otherwise we miss one entry when we continue dumping. Fixes: a60c090361ea ("bridge: netlink: export per-vlan stats") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29	tcp: do not send too big packets at retransmit time	Eric Dumazet	1	-1/+6
	Arjun reported a bug in TCP stack and bisected it to a recent commit. In case where we process SACK, we can coalesce multiple skbs into fat ones (tcp_shift_skb_data()), to lower write queue overhead, because we do not expect to retransmit these packets. However, SACK reneging can happen, forcing the sender to retransmit all these packets. If skb->len is above 64KB, we then send buggy IP packets that could hang TSO engine on cxgb4. Neal suggested to use tcp_tso_autosize() instead of tp->gso_segs so that we cook packets of optimal size vs TCP/pacing. Thanks to Arjun for reporting the bug and running the tests ! Fixes: 10d3be569243 ("tcp-tso: do not split TSO packets at retransmit time") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Arjun V <arjun@chelsio.com> Tested-by: Arjun V <arjun@chelsio.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29	ibmvnic: fix to use list_for_each_safe() when delete items	Wei Yongjun	1	-7/+7
	Since we will remove items off the list using list_del() we need to use a safe version of the list_for_each() macro aptly named list_for_each_safe(). Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29	drm/i915: Removing PCI IDs that are no longer listed as Kabylake.	Rodrigo Vivi	1	-7/+2
	This is unusual. Usually IDs listed on early stages of platform definition are kept there as reserved for later use. However these IDs here are not listed anymore in any of steppings and devices IDs tables for Kabylake on configurations overview section of BSpec. So it is better removing them before they become used in any other future platform. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1466718636-19675-2-git-send-email-rodrigo.vivi@intel.com (cherry picked from commit a922eb8d4581c883c37ce6e12dca9ff2cb1ea723) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-06-29	drm/i915: Add more Kabylake PCI IDs.	Rodrigo Vivi	1	-0/+3
	The spec has been updated adding new PCI IDs. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1466718636-19675-1-git-send-email-rodrigo.vivi@intel.com (cherry picked from commit 33d9391d3020e069dca98fa87a604c037beb2b9e) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2016-06-29	net: thunderx: Fix TL4 configuration for secondary Qsets	Sunil Goutham	1	-3/+13
	TL4 calculation for a given SQ of secondary Qsets is incorrect and goes out of bounds and also for some SQ's TL4 chosen will transmit data via a different BGX interface and not same as primary Qset's interface. This patch fixes this issue. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29	net: thunderx: Fix link status reporting	Sunil Goutham	2	-31/+62
	Check for SMU RX local/remote faults along with SPU LINK status. Otherwise at times link is UP at our end but DOWN at link partner's side. Also due to an issue in BGX it's rarely seen that initialization doesn't happen properly and SMU RX reports faults with everything fine at SPU. This patch tries to reinitialize LMAC to fix it. Also fixed LMAC disable sequence to properly bring down link. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Tao Wang <tao.wang@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29	net/mlx5e: Reorganize ethtool statistics	Gal Pressman	5	-163/+130
	Categorize and reorganize ethtool statistics counters by renaming to "rx_" and "tx_" and removing redundant and duplicated counters, this way they are easier to grasp and more user friendly. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29	net/mlx5e: Fix number of PFC counters reported to ethtool	Gal Pressman	1	-1/+3
	Number of PFC counters used to count only number of priorities with PFC enabled, but each priority has more than one counter, hence the need to multiply it by the number of PFC counters per priority. Fixes: cf678570d5a1 ('net/mlx5e: Add per priority group to PPort counters') Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29	net/mlx5e: Prevent adding the same vxlan port	Matthew Finlay	1	-0/+3
	Do not allow the same vxlan udp port to be added to the device more than once. Fixes: b3f63c3d5e2c ("net/mlx5e: Add netdev support for VXLAN tunneling") Signed-off-by: Matthew Finlay <matt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29	net/mlx5e: Check for BlueFlame capability before allocating SQ uar	Gal Pressman	1	-1/+1
	Previous to this patch mapping was always set to write combining without checking whether BlueFlame is supported in the device. Fixes: 0ba422410bbf ('net/mlx5: Fix global UAR mapping') Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>