wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2015-07-13	locks: have flock_lock_file take an inode pointer instead of a filp	Jeff Layton	1	-6/+6
	...and rename it to better describe how it works. In order to fix a use-after-free in NFS, we need to be able to remove locks from an inode after the filp associated with them may have already been freed. flock_lock_file already only dereferences the filp to get to the inode, so just change it so the callers do that. All of the callers already pass in a lock request that has the fl_file set properly, so we don't need to pass it in individually. With that change it now only dereferences the filp to get to the inode, so just push that out to the callers. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Reviewed-by: "J. Bruce Fields" <bfields@fieldses.org> Tested-by: "J. Bruce Fields" <bfields@fieldses.org>
2015-07-13	Revert "nfs: take extra reference to fl->fl_file when running a LOCKU operation"	Jeff Layton	1	-2/+0
	This reverts commit db2efec0caba4f81a22d95a34da640b86c313c8e. William reported that he was seeing instability with this patch, which is likely due to the fact that it can cause the kernel to take a new reference to a filp after the last reference has already been put. Revert this patch for now, as we'll need to fix this in another way. Cc: stable@vger.kernel.org Reported-by: William Dauchy <william@gandi.net> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Reviewed-by: "J. Bruce Fields" <bfields@fieldses.org> Tested-by: "J. Bruce Fields" <bfields@fieldses.org>
2015-07-10	parisc: Fix some PTE/TLB race conditions and optimize __flush_tlb_range based on timing results	John David Anglin	5	-168/+212
	The increased use of pdtlb/pitlb instructions seemed to increase the frequency of random segmentation faults building packages. Further, we had a number of cases where TLB inserts would repeatedly fail and all forward progress would stop. The Haskell ghc package caused a lot of trouble in this area. The final indication of a race in pte handling was this syslog entry on sibaris (C8000): swap_free: Unused swap offset entry 00000004 BUG: Bad page map in process mysqld pte:00000100 pmd:019bbec5 addr:00000000ec464000 vm_flags:00100073 anon_vma:0000000221023828 mapping: (null) index:ec464 CPU: 1 PID: 9176 Comm: mysqld Not tainted 4.0.0-2-parisc64-smp #1 Debian 4.0.5-1 Backtrace: [<0000000040173eb0>] show_stack+0x20/0x38 [<0000000040444424>] dump_stack+0x9c/0x110 [<00000000402a0d38>] print_bad_pte+0x1a8/0x278 [<00000000402a28b8>] unmap_single_vma+0x3d8/0x770 [<00000000402a4090>] zap_page_range+0xf0/0x198 [<00000000402ba2a4>] SyS_madvise+0x404/0x8c0 Note that the pte value is 0 except for the accessed bit 0x100. This bit shouldn't be set without the present bit. It should be noted that the madvise system call is probably a trigger for many of the random segmentation faults. In looking at the kernel code, I found the following problems: 1) The pte_clear define didn't take TLB lock when clearing a pte. 2) We didn't test pte present bit inside lock in exception support. 3) The pte and tlb locks needed to merged in order to ensure consistency between page table and TLB. This also has the effect of serializing TLB broadcasts on SMP systems. The attached change implements the above and a few other tweaks to try to improve performance. Based on the timing code, TLB purges are very slow (e.g., ~ 209 cycles per page on rp3440). Thus, I think it beneficial to test the split_tlb variable to avoid duplicate purges. Probably, all PA 2.0 machines have combined TLBs. I dropped using __flush_tlb_range in flush_tlb_mm as I realized all applications and most threads have a stack size that is too large to make this useful. I added some comments to this effect. Since implementing 1 through 3, I haven't had any random segmentation faults on mx3210 (rp3440) in about one week of building code and running as a Debian buildd. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # v3.18+ Signed-off-by: Helge Deller <deller@gmx.de>
2015-07-10	stifb: Implement hardware accelerated copyarea	Alex Ivanov	1	-2/+38
	This patch adds hardware assisted scrolling. The code is based upon the following investigation: https://parisc.wiki.kernel.org/index.php/NGLE#Blitter A simple 'time ls -la /usr/bin' test shows 1.6x speed increase over soft copy and 2.3x increase over FBINFO_READS_FAST (prefer soft copy over screen redraw) on Artist framebuffer. Signed-off-by: Alex Ivanov <lausgans@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de>
2015-07-10	arm64: entry32: remove pointless register assignment	Mark Rutland	1	-2/+0
	We currently set x27 in compat_sys_sigreturn_wrapper and compat_sys_rt_sigreturn_wrapper, similarly to what we do with r8/why on 32-bit ARM, in an attempt to prevent sigreturns from being restarted. However, on arm64 we have always used pt_regs::syscallno for syscall restarting (for both native and compat tasks), and x27 is never inspected again before being overwritten in kernel_exit. This patch removes the pointless register assignments. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-07-10	cxl: Check if afu is not null in cxl_slbia	Daniel Axtens	1	-1/+1
	The pointer to an AFU in the adapter's list of AFUs can be null if we're in the process of removing AFUs. The afu_list_lock doesn't guard against this. Say we have 2 slices, and we're in the process of removing cxl. - We remove the AFUs in order (see cxl_remove). In cxl_remove_afu for AFU 0, we take the lock, set adapter->afu[0] = NULL, and release the lock. - Then we get an slbia. In cxl_slbia we take the lock, and set afu = adapter->afu[0], which is NULL. - Therefore our attempt to check afu->enabled will blow up. Therefore, check if afu is a null pointer before dereferencing it. Cc: stable@vger.kernel.org Signed-off-by: Daniel Axtens <dja@axtens.net> Acked-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-09	hpfs: hpfs_error: Remove static buffer, use vsprintf extension %pV instead	Joe Perches	1	-4/+7
	Removing unnecessary static buffers is good. Use the vsprintf %pV extension instead. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Mikulas Patocka <mikulas@twibright.com> Cc: stable@vger.kernel.org # v2.6.36+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-09	hpfs: kstrdup() out of memory handling	Sanidhya Kashyap	1	-2/+5
	There is a possibility of nothing being allocated to the new_opts in case of memory pressure, therefore return ENOMEM for such case. Signed-off-by: Sanidhya Kashyap <sanidhya.gatech@gmail.com> Signed-off-by: Mikulas Patocka <mikulas@twibright.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-09	hpfs: Remove unessary cast	Firo Yang	1	-1/+1
	Avoid a pointless kmem_cache_alloc() return value cast in fs/hpfs/super.c::hpfs_alloc_inode() Signed-off-by: Firo Yang <firogm@gmail.com> Signed-off-by: Mikulas Patocka <mikulas@twibright.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-09	hpfs: add fstrim support	Mikulas Patocka	5	-0/+128
	This patch adds support for fstrim to the HPFS filesystem. Signed-off-by: Mikulas Patocka <mikulas@twibright.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-09	ioctl_compat: handle FITRIM	Mikulas Patocka	6	-7/+1
	The FITRIM ioctl has the same arguments on 32-bit and 64-bit architectures, so we can add it to the list of compatible ioctls and drop it from compat_ioctl method of various filesystems. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ted Ts'o <tytso@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-09	Fix firmware loader uevent buffer NULL pointer dereference	Linus Torvalds	1	-3/+13
	The firmware class uevent function accessed the "fw_priv->buf" buffer without the proper locking and testing for NULL. This is an old bug (looks like it goes back to 2012 and commit 1244691c73b2: "firmware loader: introduce firmware_buf"), but for some reason it's triggering only now in 4.2-rc1. Shuah Khan is trying to bisect what it is that causes this to trigger more easily, but in the meantime let's just fix the bug since others are hitting it too (at least Ingo reports having seen it as well). Reported-and-tested-by: Shuah Khan <shuahkh@osg.samsung.com> Acked-by: Ming Lei <ming.lei@canonical.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-09	mm: avoid setting up anonymous pages into file mapping	Kirill A. Shutemov	1	-7/+13
	Reading page fault handler code I've noticed that under right circumstances kernel would map anonymous pages into file mappings: if the VMA doesn't have vm_ops->fault() and the VMA wasn't fully populated on ->mmap(), kernel would handle page fault to not populated pte with do_anonymous_page(). Let's change page fault handler to use do_anonymous_page() only on anonymous VMA (->vm_ops == NULL) and make sure that the VMA is not shared. For file mappings without vm_ops->fault() or shred VMA without vm_ops, page fault on pte_none() entry would lead to SIGBUS. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Willy Tarreau <w@1wt.eu> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-09	MAINTAINERS: add secondary tree for ceph modules	Sage Weil	1	-0/+3
	The Ceph kernel code is primarily developed in the github tree, and only pushed to the korg tree before going to Linus. If Sage is unavailable and another maintainer needs to push something upstream, pull requests may originate from the github tree instead of Sage's korg tree. Signed-off-by: Sage Weil <sage@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-07-09	MAINTAINERS: update ceph entries	Sage Weil	1	-4/+15
	- The Ceph common code is used by both fs/ceph and drivers/block/rbd. Add a separate maintainers entry. - Add Ilya as libceph maintainer and cephfs submaintainer. - Attribute Documentation/ABI/testing/sysfs-bus-rbd to rbd. - ceph-devel@vger.kernel.org should be L, not M in rbd entry. Signed-off-by: Sage Weil <sage@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-07-09	libceph: treat sockaddr_storage with uninitialized family as blank	Ilya Dryomov	1	-7/+7
	addr_is_blank() should return true if family is neither AF_INET nor AF_INET6. This is what its counterpart entity_addr_t::is_blank_ip() is doing and it is the right thing to do: in process_banner() we check if our address is blank and if it is "learn" it from our peer. As it is, we never learn our address and always send out a blank one. This goes way back to ceph.git commit dd732cbfc1c9 ("use sockaddr_storage; and some ipv6 support groundwork") from 2009. While at at, do not open-code ipv6_addr_any() and use INADDR_ANY constant instead of 0. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2015-07-09	libceph: enable ceph in a non-default network namespace	Ilya Dryomov	3	-7/+22
	Grab a reference on a network namespace of the 'rbd map' (in case of rbd) or 'mount' (in case of ceph) process and use that to open sockets instead of always using init_net and bailing if network namespace is anything but init_net. Be careful to not share struct ceph_client instances between different namespaces and don't add any code in the !CONFIG_NET_NS case. This is based on a patch from Hong Zhiguo <zhiguohong@tencent.com>. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
2015-07-08	modpost: work correctly with tile coldtext sections	Chris Metcalf	2	-3/+4
	The tilegx and tilepro compilers use .coldtext for their unlikely executed text section name, so an __attribute__((cold)) function will (when compiled with higher optimization levels) land in the .coldtext section. Modify modpost to add .coldtext to the set of OTHER_TEXT_SECTIONS so we don't get warnings about referencing such a section in an __ex_table block, and then also modify arch/tile/lib/memcpy_user_64.c so that it uses plain ".coldtext" instead of ".coldtext.memcpy". The latter naming is a relic of an earlier use of -ffunction-sections, which we no longer use by default. Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au>
2015-07-09	module: Fix load_module() error path	Peter Zijlstra	1	-0/+1
	The load_module() error path frees a module but forgot to take it out of the mod_tree, leaving a dangling entry in the tree, causing havoc. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reported-by: Arthur Marsh <arthur.marsh@internode.on.net> Tested-by: Arthur Marsh <arthur.marsh@internode.on.net> Fixes: 93c2e105f6bc ("module: Optimize __module_address() using a latched RB-tree") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-07-08	arm64: entry: handle debug exceptions in el*_inv	Mark Rutland	1	-2/+2
	Currently we enable debug exceptions before reading ESR_EL1 in both el0_inv and el1_inv. If a debug exception is taken before we read ESR_EL1, the value will have been corrupted. As el*_inv is typically fatal, an intervening debug exception results in misleading debug information being logged to the console, but is not otherwise harmful. As with the other entry paths, we can use the ESR_EL1 value stashed earlier in the exception entry (in x25 for el0_sync{,_compat}, and x1 for el1_sync), giving us better error reporting in this case. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-07-08	Fix broken audit tests for exec arg len	Linus Torvalds	1	-2/+1
	The "fix" in commit 0b08c5e5944 ("audit: Fix check of return value of strnlen_user()") didn't fix anything, it broke things. As reported by Steven Rostedt: "Yes, strnlen_user() returns 0 on fault, but if you look at what len is set to, than you would notice that on fault len would be -1" because we just subtracted one from the return value. So testing against 0 doesn't test for a fault condition, it tests against a perfectly valid empty string. Also fix up the usual braindamage wrt using WARN_ON() inside a conditional - make it part of the conditional and remove the explicit unlikely() (which is already part of the WARN_ON*() logic, exactly so that you don't have to write unreadable code. Reported-and-tested-by: Steven Rostedt <rostedt@goodmis.org> Cc: Jan Kara <jack@suse.cz> Cc: Paul Moore <pmoore@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-08	powerpc: Update MAINTAINERS to point at shared tree	Michael Ellerman	1	-1/+1
	Now that we have a shared powerpc tree on kernel.org, point folks at that as the primary place to look for powerpc stuff. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-08	powerpc/perf/24x7: Fix lockdep warning	Sukadev Bhattiprolu	1	-0/+2
	The sysfs attributes for the 24x7 counters are dynamically allocated. Initialize the attributes using sysfs_attr_init() to fix following warning which occurs when CONFIG_DEBUG_LOCK_VMALLOC=y. [ 0.346249] audit: initializing netlink subsys (disabled) [ 0.346284] audit: type=2000 audit(1436295254.340:1): initialized [ 0.346489] BUG: key c0000000efe90198 not in .data! [ 0.346491] DEBUG_LOCKS_WARN_ON(1) [ 0.346502] ------------[ cut here ]------------ [ 0.346504] WARNING: at ../kernel/locking/lockdep.c:3002 [ 0.346506] Modules linked in: Reported-by: Gustavo Luiz Duarte <gustavold@linux.vnet.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Tested-by: Gustavo Luiz Duarte <gustavold@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-08	cxl: Fix off by one error allowing subsequent mmap page to be accessed	Ian Munsie	1	-2/+2
	It was discovered that if a process mmaped their problem state area they were able to access one page more than expected, potentially allowing them to access the problem state area of an unrelated process. This was due to a simple off by one error in the mmap fault handler introduced in 0712dc7e73e59d79bcead5d5520acf4e9e917e87 ("cxl: Fix issues when unmapping contexts"), which is fixed in this patch. Cc: stable@vger.kernel.org Fixes: 0712dc7e73e5 ("cxl: Fix issues when unmapping contexts") Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-08	cxl: Fail mmap if requested mapping is larger than assigned problem state area	Ian Munsie	1	-1/+9
	This patch makes the mmap call fail outright if the requested region is larger than the problem state area assigned to the context so the error is reported immediately rather than waiting for an attempt to access an address out of bounds. Although we never expect users to map more than the assigned problem state area and are not aware of anyone doing this (other than for testing), this does have the potential to break users if someone has used a larger range regardless. I'm submitting it for consideration, but if this change is not considered acceptable the previous patch is sufficient to prevent access out of bounds without breaking anyone. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-07	mei: nfc: fix deadlock on shutdown/suspend path	Tomas Winkler	3	-19/+2
	In function mei_nfc_host_exit mei_cl_remove_device cannot be called under the device mutex as device removing flow invokes the device driver remove handler that calls in turn to mei_cl_disable_device which naturally acquires the device mutex. Also remove mei_cl_bus_remove_devices which has the same issue, but is never executed as currently the only device on the mei client bus is NFC and a new device cannot be easily added till the bus revamp is completed. This fixes regression caused by commit be9b720a0ccb ("mei_phy: move all nfc logic from mei driver to nfc") Prior to this change the nfc driver remove handler called to no-op disable function while actual nfc device was disabled directly from the mei driver. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-07	arm64: Keep the ARM64 Kconfig selects sorted	Catalin Marinas	1	-1/+1
	Move EDAC_SUPPORT to the right place. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-07-07	ACPI / ARM64 : use the new BAD_MADT_GICC_ENTRY macro	Al Stone	2	-2/+2
	For those parts of the arm64 ACPI code that need to check GICC subtables in the MADT, use the new BAD_MADT_GICC_ENTRY macro instead of the previous BAD_MADT_ENTRY. The new macro takes into account differences in the size of the GICC subtable that the old macro did not; this caused failures even though the subtable entries are valid. Fixes: aeb823bbacc2 ("ACPICA: ACPI 6.0: Add changes for FADT table.") Signed-off-by: Al Stone <al.stone@linaro.org> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-07-07	ACPI / ARM64: add BAD_MADT_GICC_ENTRY() macro	Al Stone	1	-0/+8
	The BAD_MADT_ENTRY() macro is designed to work for all of the subtables of the MADT. In the ACPI 5.1 version of the spec, the struct for the GICC subtable (struct acpi_madt_generic_interrupt) is 76 bytes long; in ACPI 6.0, the struct is 80 bytes long. But, there is only one definition in ACPICA for this struct -- and that is the 6.0 version. Hence, when BAD_MADT_ENTRY() compares the struct size to the length in the GICC subtable, it fails if 5.1 structs are in use, and there are systems in the wild that have them. This patch adds the BAD_MADT_GICC_ENTRY() that checks the GICC subtable only, accounting for the difference in specification versions that are possible. The BAD_MADT_ENTRY() will continue to work as is for all other MADT subtables. This code is being added to an arm64 header file since that is currently the only architecture using the GICC subtable of the MADT. As a GIC is specific to ARM, it is also unlikely the subtable will be used elsewhere. Fixes: aeb823bbacc2 ("ACPICA: ACPI 6.0: Add changes for FADT table.") Signed-off-by: Al Stone <al.stone@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net> [catalin.marinas@arm.com: extra brackets around macro arguments] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-07-07	PM / wakeirq: Avoid setting power.wakeirq too hastily	Rafael J. Wysocki	2	-28/+15
	If dev_pm_attach_wake_irq() fails, the device's power.wakeirq field should not be set to point to the struct wake_irq passed to that function, as that object will be freed going forward. For this reason, make dev_pm_attach_wake_irq() first call device_wakeup_attach_irq() and only set the device's power.wakeirq field if that's successful. That requires device_wakeup_attach_irq() to be called under the device's power.lock lock, but since dev_pm_attach_wake_irq() is the only caller of it, the requisite changes are easy to make. Fixes: 4990d4fe327b (PM / Wakeirq: Add automated device wake IRQ handling) Reported-by: Felipe Balbi <balbi@ti.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-07	cxl: Fix refcounting in kernel API	Michael Neuling	1	-7/+5
	Currently the kernel API AFU dev refcounting is done on context start and stop. This patch moves this refcounting to context init and release, bringing it inline with how the userspace API does it. Without this we've seen the refcounting on the AFU get out of whack between the user and kernel API usage. This causes the AFU structures to be freed when they are actually still in use. This fixes some kref warnings we've been seeing and spurious ErrIVTE IRQs. Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-07	powerpc/powernv: Fix race in updating core_idle_state	Shreyas B. Prabhu	1	-10/+21
	core_idle_state is maintained for each core. It uses 0-7 bits to track whether a thread in the core has entered fastsleep or winkle. 8th bit is used as a lock bit. The lock bit is set in these 2 scenarios- - The thread is first in subcore to wakeup from sleep/winkle. - If its the last thread in the core about to enter sleep/winkle While the lock bit is set, if any other thread in the core wakes up, it loops until the lock bit is cleared before proceeding in the wakeup path. This helps prevent race conditions w.r.t fastsleep workaround and prevents threads from switching to process context before core/subcore resources are restored. But, in the path to sleep/winkle entry, we currently don't check for lock-bit. This exposes us to following race when running with subcore on- First thread in the subcorea Another thread in the same waking up core entering sleep/winkle lwarx r15,0,r14 ori r15,r15,PNV_CORE_IDLE_LOCK_BIT stwcx. r15,0,r14 [Code to restore subcore state] lwarx r15,0,r14 [clear thread bit] stwcx. r15,0,r14 andi. r15,r15,PNV_CORE_IDLE_THREAD_BITS stw r15,0(r14) Here, after the thread entering sleep clears its thread bit in core_idle_state, the value is overwritten by the thread waking up. In such cases when the core enters fastsleep, code mistakes an idle thread as running. Because of this, the first thread waking up from fastsleep which is supposed to resync timebase skips it. So we can end up having a core with stale timebase value. This patch fixes the above race by looping on the lock bit even while entering the idle states. Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Fixes: 7b54e9f213f76 'powernv/powerpc: Add winkle support for offline cpus' Cc: stable@vger.kernel.org # 3.19+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-07	ata: ahci_platform: Add ACPI _CLS matching	Suthikulpanit, Suravee	2	-1/+10
	This patch adds ACPI supports for AHCI platform driver, which uses _CLS method to match the device. The following is an example of ASL structure in DSDT for a SATA controller, which contains _CLS package to be matched by the ahci_platform driver: Device (AHC0) // AHCI Controller { Name(_HID, "AMDI0600") Name (_CCA, 1) Name (_CLS, Package (3) { 0x01, // Base Class: Mass Storage 0x06, // Sub-Class: serial ATA 0x01, // Interface: AHCI }) Name (_CRS, ResourceTemplate () { Memory32Fixed (ReadWrite, 0xE0300000, 0x00010000) Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive,,,) { 387 } }) } Also, since ATA driver should not require PCI support for ATA_ACPI, this patch removes dependency in the driver/ata/Kconfig. Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-07	ACPI / scan: Add support for ACPI _CLS device matching	Suthikulpanit, Suravee	5	-4/+78
	Device drivers typically use ACPI _HIDs/_CIDs listed in struct device_driver acpi_match_table to match devices. However, for generic drivers, we do not want to list _HID for all supported devices. Also, certain classes of devices do not have _CID (e.g. SATA, USB). Instead, we can leverage ACPI _CLS, which specifies PCI-defined class code (i.e. base-class, subclass and programming interface). This patch adds support for matching ACPI devices using the _CLS method. To support loadable module, current design uses _HID or _CID to match device's modalias. With the new way of matching with _CLS this would requires modification to the current ACPI modalias key to include _CLS. This patch appends PCI-defined class-code to the existing ACPI modalias as following. acpi:<HID>:<CID1>:<CID2>:..:<CIDn>:<bbsspp>: E.g: # cat /sys/devices/platform/AMDI0600:00/modalias acpi:AMDI0600:010601: where bb is th base-class code, ss is te sub-class code, and pp is the programming interface code Since there would not be _HID/_CID in the ACPI matching table of the driver, this patch adds a field to acpi_device_id to specify the matching _CLS. static const struct acpi_device_id ahci_acpi_match[] = { { ACPI_DEVICE_CLASS(PCI_CLASS_STORAGE_SATA_AHCI, 0xffffff) }, {}, }; In this case, the corresponded entry in modules.alias file would be: alias acpi:010601: ahci_platform Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-07	PM / hibernate: clarify resume documentation	Uwe Geuder	1	-2/+11
	it was not the whole truth that kernel mode cannot be used with swap on LVM Signed-off-by: Uwe Geuder <linuxkernel2015-ugeuder@snkmail.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-07	PM / Domains: Avoid infinite loops in attach/detach code	Geert Uytterhoeven	1	-2/+11
	If pm_genpd_{add,remove}_device() keeps on failing with -EAGAIN, we end up with an infinite loop in genpd_dev_pm_{at,de}tach(). This may happen due to a genpd.prepared_count imbalance. This is a bug elsewhere, but it will result in a system lock up, possibly during reboot of an otherwise functioning system. To avoid this, put a limit on the maximum number of loop iterations, using an exponential back-off mechanism. If the limit is reached, the operation will just fail. An error message is already printed. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-07	ACPI / LPSS: Fix up acpi_lpss_create_device()	Rafael J. Wysocki	1	-2/+5
	Fix a return value (which should be a negative error code) and a memory leak (the list allocated by acpi_dev_get_resources() needs to be freed on ioremap() errors too) in acpi_lpss_create_device() introduced by commit 4483d59e29fe 'ACPI / LPSS: check the result of ioremap()'. Fixes: 4483d59e29fe 'ACPI / LPSS: check the result of ioremap()' Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: 4.0+ <stable@vger.kernel.org> # 4.0+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-06	ACPI / PNP: Reserve ACPI resources at the fs_initcall_sync stage	Rafael J. Wysocki	4	-201/+18
	This effectively reverts the following three commits: 7bc10388ccdd ACPI / resources: free memory on error in add_region_before() 0f1b414d1907 ACPI / PNP: Avoid conflicting resource reservations b9a5e5e18fbf ACPI / init: Fix the ordering of acpi_reserve_resources() (commit b9a5e5e18fbf introduced regressions some of which, but not all, were addressed by commit 0f1b414d1907 and commit 7bc10388ccdd was a fixup on top of the latter) and causes ACPI fixed hardware resources to be reserved at the fs_initcall_sync stage of system initialization. The story is as follows. First, a boot regression was reported due to an apparent resource reservation ordering change after a commit that shouldn't lead to such changes. Investigation led to the conclusion that the problem happened because acpi_reserve_resources() was executed at the device_initcall() stage of system initialization which wasn't strictly ordered with respect to driver initialization (and with respect to the initialization of the pcieport driver in particular), so a random change causing the device initcalls to be run in a different order might break things. The response to that was to attempt to run acpi_reserve_resources() as soon as we knew that ACPI would be in use (commit b9a5e5e18fbf). However, that turned out to be too early, because it caused resource reservations made by the PNP system driver to fail on at least one system and that failure was addressed by commit 0f1b414d1907. That fix still turned out to be insufficient, though, because calling acpi_reserve_resources() before the fs_initcall stage of system initialization caused a boot regression to happen on the eCAFE EC-800-H20G/S netbook. That meant that we only could call acpi_reserve_resources() at the fs_initcall initialization stage or later, but then we might just as well call it after the PNP initalization in which case commit 0f1b414d1907 wouldn't be necessary any more. For this reason, the changes made by commit 0f1b414d1907 are reverted (along with a memory leak fixup on top of that commit), the changes made by commit b9a5e5e18fbf that went too far are reverted too and acpi_reserve_resources() is changed into fs_initcall_sync, which will cause it to be executed after the PNP subsystem initialization (which is an fs_initcall) and before device initcalls (including the pcieport driver initialization) which should avoid the initial issue. Link: https://bugzilla.kernel.org/show_bug.cgi?id=100581 Link: http://marc.info/?t=143092384600002&r=1&w=2 Link: https://bugzilla.kernel.org/show_bug.cgi?id=99831 Link: http://marc.info/?t=143389402600001&r=1&w=2 Fixes: b9a5e5e18fbf "ACPI / init: Fix the ordering of acpi_reserve_resources()" Reported-by: Roland Dreier <roland@purestorage.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-06	arm64: defconfig: Add Ceva ahci to the defconfig	Suneel Garapati	1	-0/+1
	The Ceva ahci controller is available on the Xilinx Zynq UltraScale+ MPSoC. Signed-off-by: Suneel Garapati <suneel.garapati@xilinx.com> Signed-off-by: Michal Simek <michal.simek@xilinx.com> [catalin.marinas@arm.com: removed unnecessary defconfig changes] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-07-06	arm64: remove another unnecessary libfdt include path	Ard Biesheuvel	1	-2/+0
	Patch 63a4aea55670 ("of: clean-up unnecessary libfdt include paths") removed all explicit libfdt include paths, since those are no longer necessary after the latest dtc upgrade. However, this one snuck in during the same merge window. Remove it. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-07-06	perf/x86: Fix copy_from_user_nmi() return if range is not ok	Yann Droneaud	1	-1/+1
	Commit 0a196848ca36 ("perf: Fix arch_perf_out_copy_user default"), changes copy_from_user_nmi() to return the number of remaining bytes so that it behave like copy_from_user(). Unfortunately, when the range is outside of the process memory, the return value is still the number of byte copied, eg. 0, instead of the remaining bytes. As all users of copy_from_user_nmi() were modified as part of commit 0a196848ca36, the function should be fixed to return the total number of bytes if range is not correct. Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1435001923-30986-1-git-send-email-ydroneaud@opteya.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-06	perf: Fix AUX buffer refcounting	Peter Zijlstra	3	-10/+35
	Its currently possible to drop the last refcount to the aux buffer from NMI context, which results in the expected fireworks. The refcounting needs a bigger overhaul, but to cure the immediate problem, delay the freeing by using an irq_work. Reviewed-and-tested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Reported-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150618103249.GK19282@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-06	powerpc/powernv: Fix opal-elog interrupt handler	Alistair Popple	1	-11/+5
	The conversion of opal events to a proper irqchip means that handlers are called until the relevant opal event has been cleared by processing it. Events that queue work should therefore use a threaded handler to mask the event until processing is complete. Signed-off-by: Alistair Popple <alistair@popple.id.au> Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-06	powerpc/ppc4xx_hsta_msi: Include ppc-pci.h to fix reference to hose_list	Daniel Axtens	1	-0/+1
	An earlier commit referenced 'hose_list' in sysdev/ppc4xx_hsta_msi.c. hose_list is defined in ppc-pci.h, which was not included in that file. Include it, fixing the build for the akebono defconfig used by the kbuild test robot. Fixes: f2c800aaceb6 ("powerpc/ppc4xx_hsta_msi: Move MSI-related ops to pci_controller_ops") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-06	powerpc: Add plain English description for alignment exception oopses	Anton Blanchard	1	-0/+4
	If we take an alignment exception which we cannot fix, the oops currently prints: Unable to handle kernel paging request for unknown fault Lets print something more useful: Unable to handle kernel paging request for unaligned access at address 0xc0000000f77bba8f Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-06	cxl: Test the correct mmio space before unmapping	Daniel Axtens	1	-1/+1
	Before freeing p2n, test p2n, not p1n. Signed-off-by: Daniel Axtens <dja@axtens.net> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-06	powerpc: Set the correct kernel taint on machine check errors.	Daniel Axtens	1	-0/+2
	This means the 'M' flag will work properly when the kernel prints a backtrace. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-06	cxl/vphb.c: Use phb pointer after NULL check	Maninder Singh	1	-1/+2
	static Anlaysis detected below error:- (error) Possible null pointer dereference: phb So, Use phb after NULL check. Signed-off-by: Maninder Singh <maninder1.s@samsung.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-06	powerpc/powernv: Fix vma page prot flags in opal-prd driver	Vaidyanathan Srinivasan	1	-5/+4
	opal-prd driver will mmap() firmware code/data area as private mapping to prd user space daemon. Write to this page will trigger COW faults. The new COW pages are normal kernel RAM pages accounted by the kernel and are not special. vma->vm_page_prot value will be used at page fault time for the new COW pages, while pgprot_t value passed in remap_pfn_range() is used for the initial page table entry. Hence: * Do not add _PAGE_SPECIAL in vma, but only for remap_pfn_range() * Also remap_pfn_range() will add the _PAGE_SPECIAL flag using pte_mkspecial() call, hence no need to specify in the driver This fix resolves the page accounting warning shown below: BUG: Bad rss-counter state mm:c0000007d34ac600 idx:1 val:19 The above warning is triggered since _PAGE_SPECIAL was incorrectly being set for the normal kernel COW pages. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Acked-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-05	tools: Copy rbtree_augmented.h from the kernel	Arnaldo Carvalho de Melo	3	-3/+246
	To complete the transitioning to not to share the same files with the kernel, also moving it from tools/perf/include/linux/ to tools/include/linux to make the whoke rbtree kit to other tools/ living codebases. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-5bxyehixafckqm6ez25alnfo@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>