|
The WnR bit in the HSR/ESR_EL2 indicates whether a data abort was
generated by a read or a write instruction. For stage 2 data aborts
generated by a stage 1 translation table walk (i.e. the actual page
table access faults at EL2), the WnR bit therefore reports whether the
instruction generating the walk was a load or a store, *not* whether the
page table walker was reading or writing the entry.
For page tables marked as read-only at stage 2 (e.g. due to KSM merging
them with the tables from another guest), this could result in livelock,
where a page table walk generated by a load instruction attempts to
set the access flag in the stage 1 descriptor, but fails to trigger
CoW in the host since only a read fault is reported.
This patch modifies the arm64 kvm_vcpu_dabt_iswrite function to
take into account stage 2 faults in stage 1 walks. Since DBM cannot be
disabled at EL2 for CPUs that implement it, we assume that these faults
are always caused by writes, avoiding the livelock situation at the
expense of occasional, spurious CoWs.
We could, in theory, do a bit better by checking the guest TCR
configuration and inspecting the page table to see why the PTE faulted.
However, I doubt this is measurable in practice, and the threat of
livelock is real.
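A minimal sketch of the change, assuming the existing accessors in
arch/arm64/include/asm/kvm_emulate.h (treat the exact helper names as
assumptions):
  static inline bool kvm_vcpu_dabt_iswrite(const struct kvm_vcpu *vcpu)
  {
      return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_WNR) ||
          kvm_vcpu_dabt_iss1tw(vcpu); /* stage 2 fault on a stage 1 walk */
  }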
Cc: <stable@vger.kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Although I am leaving Synopsys, I would like to keep working with the Linux
kernel community and help wherever you might find it useful. For that I am
sending this patch to change my contact e-mail.
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
It turns out we need to ping the watchdog hardware on resume when we
re-program it. Otherwise this causes inadvertent reset to trigger
right after the resume is complete.
Fixes: 058dfc767008 (ACPI / watchdog: Add support for WDAT hardware watchdog)
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Usually a validity intercept is a programming error of the host
because of invalid entries in the state description.
We can get a validity intercept if the mode of the runtime
instrumentation control block is wrong. As the host does not know
which modes are valid, this can be used by userspace to trigger
a WARN.
Instead of printing a WARN let's return an error to userspace as
this can only happen if userspace provides a malformed initial
value (e.g. on migration). The kernel should never warn on bogus
input. Instead let's log it into the s390 debug feature.
While at it, let's return -EINVAL for all validity intercepts as
this will trigger an error in QEMU like
error: kvm run failed Invalid argument
PSW=mask 0404c00180000000 addr 000000000063c226 cc 00
R00=000000000000004f R01=0000000000000004 R02=0000000000760005 R03=000000007fe0a000
R04=000000000064ba2a R05=000000049db73dd0 R06=000000000082c4b0 R07=0000000000000041
R08=0000000000000002 R09=000003e0804042a8 R10=0000000496152c42 R11=000000007fe0afb0
[...]
This will avoid an endless loop of validity intercepts.
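Roughly, the reworked handler could look like this (a sketch based on the
description above; the names taken from arch/s390/kvm/intercept.c are
assumptions here):
  static int handle_validity(struct kvm_vcpu *vcpu)
  {
      int viwhy = vcpu->arch.sie_block->ipb >> 16;

      vcpu->stat.exit_validity++;
      trace_kvm_s390_intercept_validity(vcpu, viwhy);
      /* log to the s390 debug feature instead of WARNing */
      KVM_EVENT(3, "validity intercept 0x%x for pid %u (kvm 0x%pK)",
                viwhy, current->pid, vcpu->kvm);
      return -EINVAL;
  }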
Cc: stable@vger.kernel.org # v4.5+
Fixes: c6e5f166373a ("KVM: s390: implement the RI support of guest")
Acked-by: Fan Zhang <zhangfan@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
|
|
'best' is always less than or equal to 'pos', so `best - pos' returns
a negative value which is then cast to `unsigned int'
and passed to __cpufreq_driver_target()->acpi_cpufreq_target()
for policy->freq_table selection. This results in
BUG: unable to handle kernel paging request at ffff881019b469f8
IP: [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
PGD 267f067
PUD 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 6 PID: 70 Comm: kworker/6:1 Not tainted 4.9.0-rc1-next-20161017-dbg-dirty
Workqueue: events dbs_work_handler
task: ffff88041b808000 task.stack: ffff88041b810000
RIP: 0010:[<ffffffffa00356c1>] [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
RSP: 0018:ffff88041b813c60 EFLAGS: 00010282
RAX: ffff880419b46a00 RBX: ffff88041b848400 RCX: ffff880419b20f80
RDX: 00000000001dff38 RSI: 00000000ffffffff RDI: ffff88041b848400
RBP: ffff88041b813cb0 R08: 0000000000000006 R09: 0000000000000040
R10: ffffffff8207f9e0 R11: ffffffff8173595b R12: 0000000000000000
R13: ffff88041f1dff38 R14: 0000000000262900 R15: 0000000bfffffff4
FS: 0000000000000000(0000) GS:ffff88041f000000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff881019b469f8 CR3: 000000041a2d3000 CR4: 00000000001406e0
Stack:
ffff88041b813cb0 ffffffff813347f9 ffff88041b813ca0 ffffffff81334663
ffff88041f1d4bc0 ffff88041b848400 0000000000000000 0000000000000000
0000000000262900 0000000000000000 ffff88041b813d00 ffffffff813355dc
Call Trace:
[<ffffffff813347f9>] ? cpufreq_freq_transition_begin+0xf1/0xfc
[<ffffffff81334663>] ? get_cpu_idle_time+0x97/0xa6
[<ffffffff813355dc>] __cpufreq_driver_target+0x3b6/0x44e
[<ffffffff81336ca3>] cs_dbs_timer+0x11a/0x135
[<ffffffff81336fda>] dbs_work_handler+0x39/0x62
[<ffffffff81057823>] process_one_work+0x280/0x4a5
[<ffffffff81058719>] worker_thread+0x24f/0x397
[<ffffffff810584ca>] ? rescuer_thread+0x30b/0x30b
[<ffffffff81418380>] ? nl80211_get_key+0x29/0x36a
[<ffffffff8105d2b7>] kthread+0xfc/0x104
[<ffffffff8107ceea>] ? put_lock_stats.isra.9+0xe/0x20
[<ffffffff8105d1bb>] ? kthread_create_on_node+0x3f/0x3f
[<ffffffff814b2092>] ret_from_fork+0x22/0x30
Code: 56 4d 6b ff 0c 41 55 41 54 53 48 83 ec 28 48 8b 15 ad 1e 00 00 44 8b 41
08 48 8b 87 c8 00 00 00 49 89 d5 4e 03 2c c5 80 b2 78 81 <46> 8b 74 38 04 45
3b 75 00 75 11 31 c0 83 39 00 0f 84 1c 01 00
RIP [<ffffffffa00356c1>] acpi_cpufreq_target+0x4f/0x190 [acpi_cpufreq]
RSP <ffff88041b813c60>
CR2: ffff881019b469f8
---[ end trace 16d9fc7a17897d37 ]---
[ rjw: In some cases this bug may also cause incorrect frequencies to
be selected by cpufreq governors. ]
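For illustration, the descending-order helper would then look roughly like
this (a sketch; the exact helper layout in include/linux/cpufreq.h may
differ), with the index computed relative to the table start rather than the
search cursor:
  static inline int cpufreq_table_find_index_dl(struct cpufreq_policy *policy,
                                                unsigned int target_freq)
  {
      struct cpufreq_frequency_table *table = policy->freq_table;
      struct cpufreq_frequency_table *pos, *best = table - 1;
      unsigned int freq;

      cpufreq_for_each_valid_entry(pos, table) {
          freq = pos->frequency;

          if (freq == target_freq)
              return pos - table;

          if (freq > target_freq) {
              best = pos;
              continue;
          }

          /* No frequency found above target_freq, return the closest one */
          if (best == table - 1)
              return pos - table;

          return best - table;    /* was: best - pos */
      }

      return best - table;        /* was: best - pos */
  }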
Fixes: 899bb6642f2a (cpufreq: skip invalid entries when searching the frequency)
Link: http://marc.info/?l=linux-kernel&m=147672030714331&w=2
Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-and-tested-by: Jörg Otte <jrg.otte@gmail.com>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 4.8+ <stable@vger.kernel.org> # 4.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
All the lines printed by mem_init are independent, with each ending with
a newline. While they logically form a large block, none are actually
continuations of previous lines.
The kernel-side printk code and the userspace dmesg tool differ in their
handling of KERN_CONT following a newline, and while this isn't always a
problem kernel-side, it does cause difficulty for userspace. Using
pr_cont causes the userspace tool to not print line prefixes (e.g.
timestamps) even when following a newline, mis-aligning the output and
making it harder to read, e.g.
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] modules : 0xffff000000000000 - 0xffff000008000000 ( 128 MB)
vmalloc : 0xffff000008000000 - 0xffff7dffbfff0000 (129022 GB)
.text : 0xffff000008080000 - 0xffff0000088b0000 ( 8384 KB)
.rodata : 0xffff0000088b0000 - 0xffff000008c50000 ( 3712 KB)
.init : 0xffff000008c50000 - 0xffff000008d50000 ( 1024 KB)
.data : 0xffff000008d50000 - 0xffff000008e25200 ( 853 KB)
.bss : 0xffff000008e25200 - 0xffff000008e6bec0 ( 284 KB)
fixed : 0xffff7dfffe7fd000 - 0xffff7dfffec00000 ( 4108 KB)
PCI I/O : 0xffff7dfffee00000 - 0xffff7dffffe00000 ( 16 MB)
vmemmap : 0xffff7e0000000000 - 0xffff800000000000 ( 2048 GB maximum)
0xffff7e0000000000 - 0xffff7e0026000000 ( 608 MB actual)
memory : 0xffff800000000000 - 0xffff800980000000 ( 38912 MB)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=6, Nodes=1
Fix this by using pr_notice consistently for all lines, which both the
kernel and userspace are happy with.
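A sketch of the intended style (values and field widths are illustrative,
not the exact arm64 mem_init() code):
  pr_notice("Virtual kernel memory layout:\n");
  pr_notice("    modules : 0x%16lx - 0x%16lx   (%6ld MB)\n",
            MODULES_VADDR, MODULES_END,
            (MODULES_END - MODULES_VADDR) >> 20);
  pr_notice("    vmalloc : 0x%16lx - 0x%16lx   (%6ld GB)\n",
            VMALLOC_START, VMALLOC_END,
            (VMALLOC_END - VMALLOC_START) >> 30);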
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Recently in commit 4bcc595ccd80decb ("printk: reinstate KERN_CONT for
printing continuation lines"), the behaviour of printk changed w.r.t.
KERN_CONT. Now, KERN_CONT is mandatory to continue existing lines.
Without this, prefixes are inserted, making output illegible, e.g.
[ 1007.069010] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
[ 1007.076329] sp : ffff000008d53ec0
[ 1007.079606] x29: ffff000008d53ec0 [ 1007.082797] x28: 0000000080c50018
[ 1007.086160]
[ 1007.087630] x27: ffff000008e0c7f8 [ 1007.090820] x26: ffff80097631ca00
[ 1007.094183]
[ 1007.095653] x25: 0000000000000001 [ 1007.098843] x24: 000000ea68b61cac
[ 1007.102206]
... or when dumped with the userspace dmesg tool, which has slightly
different implicit newline behaviour. e.g.
[ 1007.069010] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
[ 1007.076329] sp : ffff000008d53ec0
[ 1007.079606] x29: ffff000008d53ec0
[ 1007.082797] x28: 0000000080c50018
[ 1007.086160]
[ 1007.087630] x27: ffff000008e0c7f8
[ 1007.090820] x26: ffff80097631ca00
[ 1007.094183]
[ 1007.095653] x25: 0000000000000001
[ 1007.098843] x24: 000000ea68b61cac
[ 1007.102206]
We can't simply always use KERN_CONT for lines which may or may not be
continuations. That causes line prefixes (e.g. timestamps) to be
suppressed, and the alignment of all but the first line will be broken.
For even more fun, we can't simply insert some dummy empty-string printk
calls, as GCC warns for an empty printk string, and even if we pass
KERN_DEFAULT explicitly to silence the warning, the prefix gets swallowed
unless there is an additional part to the string.
Instead, we must manually iterate over pairs of registers, which gives
us the legible output we want in either case, e.g.
[ 169.771790] pc : [<ffff00000871898c>] lr : [<ffff000008718948>] pstate: 40000145
[ 169.779109] sp : ffff000008d53ec0
[ 169.782386] x29: ffff000008d53ec0 x28: 0000000080c50018
[ 169.787650] x27: ffff000008e0c7f8 x26: ffff80097631de00
[ 169.792913] x25: 0000000000000001 x24: 00000027827b2cf4
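The loop in question looks roughly like this (a sketch of __show_regs();
treat it as illustrative rather than the exact code):
  i = top_reg;

  while (i >= 0) {
      printk("x%-2d: %016llx ", i, regs->regs[i]);
      i--;

      if (i % 2 == 0) {
          pr_cont("x%-2d: %016llx ", i, regs->regs[i]);
          i--;
      }

      pr_cont("\n");
  }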
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
gcc 7 warns:
arch/x86/kvm/ioapic.c: In function 'kvm_ioapic_reset':
arch/x86/kvm/ioapic.c:597:2: warning: 'memset' used with length equal to number of elements without multiplication by element size [-Wmemset-elt-size]
And it is right: memset the whole array using the sizeof operator.
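The change boils down to something like the following (field name assumed
from arch/x86/kvm/ioapic.c):
  memset(ioapic->irq_eoi, 0x00, sizeof(ioapic->irq_eoi)); /* was: IOAPIC_NUM_PINS */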
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
[Added x86 subject tag]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
When CONFIG_CPU_FREQ is not set, int cpu is unused and gcc rightfully
warns about it:
arch/x86/kvm/x86.c: In function ‘kvm_timer_init’:
arch/x86/kvm/x86.c:5697:6: warning: unused variable ‘cpu’ [-Wunused-variable]
int cpu;
^~~
But since it is used only in the CONFIG_CPU_FREQ block, simply move it
there, thus squashing the warning too.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
GNU ld used to set the ELF file type to ET_DYN for PIE executables, which
is the same file type used for shared libraries. However, this was changed
recently, and now PIE executables are emitted as ET_EXEC instead.
The distinction is only relevant for ELF loaders, and so there is little
reason to care about the difference when building the kernel, which is
why the change has gone unnoticed until now.
However, debuggers do use the ELF binary, and expect ET_EXEC type files
to appear in memory at the exact offset described in the ELF metadata.
This means source level debugging is no longer possible when KASLR is in
effect or when executing the stub.
So add the -shared LD option when building with CONFIG_RELOCATABLE=y. This
forces the ELF file type to be set to ET_DYN (which is what you get when
building with binutils 2.24 and earlier anyway), and has no other ill
effects.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
The suspend/resume path in kernel/sleep.S, as used by cpu-idle, does not
save/restore PSTATE. As a result, cpufeatures that were detected
and have bits in PSTATE get lost when we resume from idle.
UAO gets set appropriately on the next context switch. PAN will be
re-enabled next time we return from user-space, but on a preemptible
kernel we may run work accessing user space before this point.
Add code to re-enable these two features in __cpu_suspend_exit().
We re-use uao_thread_switch() passing current.
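A sketch of the addition to __cpu_suspend_exit() (the macro and helper names
are assumptions based on the existing PAN/UAO code):
  /*
   * PSTATE was not saved over suspend/resume, re-enable any detected
   * features that rely on it.
   */
  asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,
          CONFIG_ARM64_PAN));
  uao_thread_switch(current);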
Signed-off-by: James Morse <james.morse@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Commit 338d4f49d6f7 ("arm64: kernel: Add support for Privileged Access
Never") enabled PAN by enabling the 'SPAN' feature-bit in SCTLR_EL1.
This means the PSTATE.PAN bit won't be set until the next return to the
kernel from userspace. On a preemptible kernel we may schedule work that
accesses userspace on a CPU before it has done this.
Now that cpufeature enable() calls are scheduled via stop_machine(), we
can set PSTATE.PAN from the cpu_enable_pan() call.
Add WARN_ON_ONCE(in_interrupt()) to check the PSTATE value we updated
is not immediately discarded.
Reported-by: Tony Thompson <anthony.thompson@arm.com>
Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
[will: fixed typo in comment]
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
The enable() call for a cpufeature/errata is called using on_each_cpu().
This issues a cross-call IPI to get the work done. Implicitly, this
stashes the running PSTATE in SPSR when the CPU receives the IPI, and
restores it when we return. This means an enable() call can never modify
PSTATE.
To allow PAN to do this, change the on_each_cpu() call to use
stop_machine(). This schedules the work on each CPU which allows
us to modify PSTATE.
This involves changing the prototype of all the enable() functions.
enable_cpu_capabilities() is called during boot and enables the feature
on all online CPUs. This path now uses stop_machine(). CPU features for
hotplug'd CPUs are enabled by verify_local_cpu_features() which only
acts on the local CPU, and can already modify the running PSTATE as it
is called from secondary_start_kernel().
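The boot-time path then becomes roughly (a sketch; names from
arch/arm64/kernel/cpufeature.c are assumptions):
  void __init enable_cpu_capabilities(const struct arm64_cpu_capabilities *caps)
  {
      for (; caps->matches; caps++)
          if (caps->enable && cpus_have_cap(caps->capability))
              /*
               * Run the callback via stop_machine() so it executes in
               * process context and any PSTATE change it makes survives,
               * unlike under an on_each_cpu() IPI.
               */
              stop_machine(caps->enable, NULL, cpu_online_mask);
  }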
Reported-by: Tony Thompson <anthony.thompson@arm.com>
Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
Commit 7dd01aef0557 ("arm64: trap userspace "dc cvau" cache operation on
errata-affected core") adds code to execute cache maintenance instructions
in the kernel on behalf of userland on CPUs with certain ARM CPU errata.
It turns out that the address hasn't been checked to be a valid user
space address, allowing userland to clean cache lines in kernel space.
Fix this by introducing an address check before executing the
instructions on behalf of userland.
Since the address doesn't come via a syscall parameter, we can't just
reject tagged pointers and instead have to remove the tag when checking
against the user address limit.
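Sketched, the check looks something like this (the surrounding code in
arch/arm64/kernel/traps.c differs; the signal helper is an assumption):
  unsigned long address = untagged_addr(regs->regs[rt]);

  if (address >= user_addr_max()) {
      /* not a user address: deliver SIGSEGV instead of maintaining it */
      arm64_notify_segfault(regs, address);
      return;
  }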
Cc: <stable@vger.kernel.org>
Fixes: 7dd01aef0557 ("arm64: trap userspace "dc cvau" cache operation on errata-affected core")
Reported-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
[will: rework commit message + replace access_ok with max_user_addr()]
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
The pixel clock should not be on if the CRTC is not in use, hence
move clock enable/disable calls into CRTC callbacks.
Signed-off-by: Stefan Agner <stefan@agner.ch>
Tested-By: Meng Yi <meng.yi@nxp.com>
|
|
Do not schedule a transfer of mode settings early. Modes should
get applied on CRTC enable, where we also enable the pixel clock.
Signed-off-by: Stefan Agner <stefan@agner.ch>
Tested-By: Meng Yi <meng.yi@nxp.com>
|
|
There is no need to explicitly initiate a register transfer and
turn off the DCU after initializing the plane registers. In fact,
this is harmful and leads to unnecessary flickers if the DCU has
been left on by the bootloader.
Signed-off-by: Stefan Agner <stefan@agner.ch>
Tested-By: Meng Yi <meng.yi@nxp.com>
|
|
Do not use encoder disable/enable callbacks to control bypass
mode, as this seems to mess with the signals in a way that
displays do not like. This also makes more sense since the encoder is
already defined to be parallel RGB/LVDS at creation time.
Signed-off-by: Stefan Agner <stefan@agner.ch>
Tested-By: Meng Yi <meng.yi@nxp.com>
|
|
A bugfix introduced a harmless warning for update_open_stateid:
fs/nfs/nfs4proc.c:1548:2: error: missing braces around initializer [-Werror=missing-braces]
Removing the zero in the initializer will do the right thing here
and initialize the entire structure to zero.
Fixes: 1393d9612ba0 ("NFSv4: Fix a race when updating an open_stateid")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Reviewed-by: Jay Freyensee <james_p_freyensee@linux.intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Ported over from nvme-cli.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
This makes life easier for nvme-cli and we don't really need the uuid
type anyway to start with.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Reviewed-by: Jay Freyensee <james_p_freyensee@linux.intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Import a few updates to nvme.h from nvme-cli. This mostly includes a few
new fields and error codes, but also a few renames that so far are only
used in user space. Also one field is moved from an array of two le64
values to one of 16 u8 values so that we can more easily access it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
The NVMe 1.2.1 specification adds a tertiary element to the version number.
This updates the macro and its callers to include the final number and
fixes up a single place in nvmet where the version was generated manually.
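The updated macro presumably takes the shape (exact definition in
include/linux/nvme.h assumed):
  #define NVME_VS(major, minor, tertiary) \
      (((major) << 16) | ((minor) << 8) | (tertiary))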
Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
We have a fairly common pattern where you print several things as
continuations on one single line in a loop, and then at the end you do
printk(KERN_CONT "\n");
to flush the buffered output.
But if the output was flushed by something else (concurrent printk
activity, or just system logging), we don't want that final flushing to
just print an empty line.
So just suppress empty continuation lines when they couldn't be merged
into the line they are a continuation of.
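For illustration, a typical caller of the pattern described above:
  static void dump_bytes(const u8 *buf, size_t len)
  {
      size_t i;

      printk(KERN_INFO "bytes:");
      for (i = 0; i < len; i++)
          printk(KERN_CONT " %02x", buf[i]);
      /* with this change, prints nothing if the line was already flushed */
      printk(KERN_CONT "\n");
  }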
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This removes the 'write' argument from access_process_vm() and replaces
it with 'gup_flags' as use of this function previously silently implied
FOLL_FORCE, whereas after this patch callers explicitly pass this flag.
We make this explicit as use of FOLL_FORCE can result in surprising
behaviour (and hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This removes the 'write' argument from access_remote_vm() and replaces
it with 'gup_flags' as use of this function previously silently implied
FOLL_FORCE, whereas after this patch callers explicitly pass this flag.
We make this explicit as use of FOLL_FORCE can result in surprising
behaviour (and hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This removes the 'write' argument from __access_remote_vm() and replaces
it with 'gup_flags' as use of this function previously silently implied
FOLL_FORCE, whereas after this patch callers explicitly pass this flag.
We make this explicit as use of FOLL_FORCE can result in surprising
behaviour (and hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This removes the 'write' and 'force' from get_user_pages_remote() and
replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This removes the 'write' and 'force' from get_user_pages() and replaces
them with 'gup_flags' to make the use of FOLL_FORCE explicit in callers
as use of this flag can result in surprising behaviour (and hence bugs)
within the mm subsystem.
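An illustrative caller-side conversion (not a specific call site):
  /* before: write = 1, force = 1 */
  ret = get_user_pages(start, nr_pages, 1, 1, pages, NULL);
  /* after: the same intent, spelled out explicitly */
  ret = get_user_pages(start, nr_pages, FOLL_WRITE | FOLL_FORCE, pages, NULL);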
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This removes the 'write' and 'force' from get_vaddr_frames() and
replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This removes the 'write' and 'force' use from get_user_pages_locked()
and replaces them with 'gup_flags' to make the use of FOLL_FORCE
explicit in callers as use of this flag can result in surprising
behaviour (and hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Writing the outer loop of an LL/SC sequence using do {...} while
constructs potentially allows the compiler to hoist memory accesses
between the STXR and the branch back to the LDXR. On CPUs that do not
guarantee forward progress of LL/SC loops when faced with memory
accesses to the same ERG (up to 2k) between the failed STXR and the
branch back, we may end up livelocking.
This patch avoids this issue in our percpu atomics by rewriting the
outer loop as part of the LL/SC inline assembly block.
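The shape of the change, reduced to a single illustrative operation (the real
code in arch/arm64/include/asm/percpu.h is macro-generated for several sizes
and operations):
  static inline void __percpu_add_8(u64 *ptr, u64 val)
  {
      u64 tmp;
      int loop;

      /*
       * The retry branch lives inside the asm block, so the compiler
       * cannot hoist memory accesses between the failed stxr and the
       * ldxr it branches back to.
       */
      asm volatile(
      "1: ldxr    %[tmp], %[mem]\n"
      "   add     %[tmp], %[tmp], %[val]\n"
      "   stxr    %w[loop], %[tmp], %[mem]\n"
      "   cbnz    %w[loop], 1b"
      : [loop] "=&r" (loop), [tmp] "=&r" (tmp), [mem] "+Q" (*ptr)
      : [val] "r" (val));
  }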
Cc: <stable@vger.kernel.org>
Fixes: f97fc810798c ("arm64: percpu: Implement this_cpu operations")
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
If a CPU does not implement a global monitor for certain memory types,
then userspace can attempt a kernel DoS by issuing SWP instructions
targeting the problematic memory (for example, a framebuffer mapped
with non-cacheable attributes).
The SWP emulation code protects against these sorts of attacks by
checking for pending signals and potentially rescheduling when the STXR
instruction fails during the emulation. Whilst this is good for avoiding
livelock, it harms emulation of legitimate SWP instructions on CPUs
where forward progress is not guaranteed if there are memory accesses to
the same reservation granule (up to 2k) between the failing STXR and
the retry of the LDXR.
This patch solves the problem by retrying the STXR a bounded number of
times (4) before breaking out of the LL/SC loop and looking for
something else to do.
Cc: <stable@vger.kernel.org>
Fixes: bd35a4adc413 ("arm64: Port SWP/SWPB emulation support from arm")
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|
|
A scheduler performance regression has been reported by Joseph Salisbury,
which he bisected back to:
3d30544f0212 ("sched/fair: Apply more PELT fixes)
The regression triggers when several levels of task groups are involved
(read: SystemD) and cpu_possible_mask != cpu_present_mask.
The root cause is that group entity's load (tg_child->se[i]->avg.load_avg)
is initialized to scale_load_down(se->load.weight). During the creation of
a child task group, its group entities on possible CPUs are attached to
parent's cfs_rq (tg_parent) and their loads are added to the parent's load
(tg_parent->load_avg) with update_tg_load_avg().
But only the load on online CPUs will then be updated to reflect real load,
whereas load on other CPUs will stay at the initial value.
The result is a tg_parent->load_avg that is higher than the real load, the
weight of group entities (tg_parent->se[i]->load.weight) on online CPUs is
smaller than it should be, and the task group gets less running time than
it could expect.
( This situation can be detected with /proc/sched_debug. The ".tg_load_avg"
of the task group will be much higher than sum of ".tg_load_avg_contrib"
of online cfs_rqs of the task group. )
The load of group entities doesn't have to be initialized to anything other
than 0, because their load will increase when an entity is attached.
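A sketch of the resulting init_entity_runnable_average() logic (names assumed
from kernel/sched/fair.c):
  /*
   * Tasks start with their full weight so they are seen as heavy until
   * they stabilize to their real load level; group entities start at 0
   * because nothing has been attached to the task group yet.
   */
  if (entity_is_task(se))
      sa->load_avg = scale_load_down(se->load.weight);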
Reported-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@vger.kernel.org> # 4.8.x
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: joonwoop@codeaurora.org
Fixes: 3d30544f0212 ("sched/fair: Apply more PELT fixes")
Link: http://lkml.kernel.org/r/1476881123-10159-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
MIPS KVM uses user memory accessors but mips.c doesn't directly include
uaccess.h, so include it now.
This wasn't too much of a problem before v4.9-rc1 as asm/module.h
included asm/uaccess.h, however since commit 29abfbd9cbba ("mips:
separate extable.h, switch module.h to it") this is no longer the case.
This resulted in build failures when trace points were disabled, as
trace/define_trace.h includes trace/trace_events.h only ifdef
TRACEPOINTS_ENABLED, which goes on to include asm/uaccess.h via a couple
of other headers.
Fixes: 29abfbd9cbba ("mips: separate extable.h, switch module.h to it")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
|
|
Signed-off-by: Rich Felker <dalias@libc.org>
|
|
Signed-off-by: Rich Felker <dalias@libc.org>
|
|
This removes the 'write' and 'force' use from get_user_pages_unlocked()
and replaces them with 'gup_flags' to make the use of FOLL_FORCE
explicit in callers as use of this flag can result in surprising
behaviour (and hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This removes the redundant 'write' and 'force' parameters from
__get_user_pages_unlocked() to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This removes the redundant 'write' and 'force' parameters from
__get_user_pages_locked() to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is an ancient bug that was actually attempted to be fixed once
(badly) by me eleven years ago in commit 4ceb5db9757a ("Fix
get_user_pages() race for write access") but that was then undone due to
problems on s390 by commit f33ea7f404e5 ("fix get_user_pages bug").
In the meantime, the s390 situation has long been fixed, and we can now
fix it by checking the pte_dirty() bit properly (and do it better). The
s390 dirty bit was implemented in abf09bed3cce ("s390/mm: implement
software dirty bits") which made it into v3.9. Earlier kernels will
have to look at the page state itself.
Also, the VM has become more scalable, and what used a purely
theoretical race back then has become easier to trigger.
To fix it, we introduce a new internal FOLL_COW flag to mark the "yes,
we already did a COW" rather than play racy games with FOLL_WRITE that
is very fundamental, and then use the pte dirty flag to validate that
the FOLL_COW flag is still valid.
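The core of the check, as a sketch (mm/gup.c; this is the widely circulated
shape of the fix):
  /*
   * FOLL_FORCE can write to even unwritable PTEs, but only after we have
   * gone through a COW cycle and they are dirty.
   */
  static inline bool can_follow_write_pte(pte_t pte, unsigned int flags)
  {
      return pte_write(pte) ||
          ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte_dirty(pte));
  }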
Reported-and-tested-by: Phil "not Paul" Oester <kernel@linuxace.com>
Acked-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Dell XPS 13 (and maybe some others) uses a GPIO (CPU_GP_1) during suspend
to explicitly disable the USB touchscreen interrupt. This is done to prevent
a situation where the lid is closed but the touchscreen is left functional.
The pinctrl driver (wrongly) assumes it owns all pins which are owned by
host and not locked down. It is perfectly fine for BIOS to use those pins
as it is also considered as host in this context.
What happens is that when the lid of Dell XPS 13 is closed, the BIOS
configures CPU_GP_1 low disabling the touchscreen interrupt. During resume
we restore all host owned pins to the known state which includes CPU_GP_1
and this overwrites what the BIOS has programmed there causing the
touchscreen to fail as no interrupts are reaching the CPU anymore.
Fix this by restoring only those pins we know are explicitly requested by
the kernel one way or another.
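Conceptually, the save/restore paths gain a filter along these lines (a
sketch; helper and field names from drivers/pinctrl/intel/pinctrl-intel.c are
assumptions):
  static bool intel_pinctrl_should_save(struct intel_pinctrl *pctrl,
                                        unsigned int pin)
  {
      const struct pin_desc *pd = pin_desc_get(pctrl->pctldev, pin);

      if (!pd || !intel_pad_usable(pctrl, pin))
          return false;

      /* claimed as a pinmux function or requested as a GPIO */
      if (pd->mux_owner || pd->gpio_owner)
          return true;

      return false;
  }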
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=176361
Reported-by: AceLan Kao <acelan.kao@canonical.com>
Tested-by: AceLan Kao <acelan.kao@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Initialize the spinlock before using it.
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.8.0-dwc-bisect #4
Hardware name: Intel Corp. VALLEYVIEW C0 PLATFORM/BYT-T FFD8, BIOS BLAKFF81.X64.0088.R10.1403240443 FFD8_X64_R_2014_13_1_00 03/24/2014
0000000000000000 ffff8800788ff770 ffffffff8133d597 0000000000000000
0000000000000000 ffff8800788ff7e0 ffffffff810cfb9e 0000000000000002
ffff8800788ff7d0 ffffffff8205b600 0000000000000002 ffff8800788ff7f0
Call Trace:
[<ffffffff8133d597>] dump_stack+0x67/0x90
[<ffffffff810cfb9e>] register_lock_class+0x52e/0x540
[<ffffffff810d2081>] __lock_acquire+0x81/0x16b0
[<ffffffff810cede1>] ? save_trace+0x41/0xd0
[<ffffffff810d33b2>] ? __lock_acquire+0x13b2/0x16b0
[<ffffffff810cf05a>] ? __lock_is_held+0x4a/0x70
[<ffffffff810d3b1a>] lock_acquire+0xba/0x220
[<ffffffff8136f1fe>] ? byt_gpio_get_direction+0x3e/0x80
[<ffffffff81631567>] _raw_spin_lock_irqsave+0x47/0x60
[<ffffffff8136f1fe>] ? byt_gpio_get_direction+0x3e/0x80
[<ffffffff8136f1fe>] byt_gpio_get_direction+0x3e/0x80
[<ffffffff813740a9>] gpiochip_add_data+0x319/0x7d0
[<ffffffff81631723>] ? _raw_spin_unlock_irqrestore+0x43/0x70
[<ffffffff8136fe3b>] byt_pinctrl_probe+0x2fb/0x620
[<ffffffff8142fb0c>] platform_drv_probe+0x3c/0xa0
...
Based on the diff it looks like the problem was introduced in
commit 71e6ca61e826 ("pinctrl: baytrail: Register pin control handling")
but I wasn't able to verify that empirically as the parent commit
just oopsed when I tried to boot it.
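The fix amounts to a reordering along these lines in byt_pinctrl_probe()
(helper names assumed):
  spin_lock_init(&vg->lock);

  ret = byt_gpio_probe(vg);   /* registers the gpiochip, whose callbacks
                               * (e.g. byt_gpio_get_direction()) take
                               * vg->lock */
  if (ret)
      return ret;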
Cc: Cristina Ciocan <cristina.ciocan@intel.com>
Cc: stable@vger.kernel.org
Fixes: 71e6ca61e826 ("pinctrl: baytrail: Register pin control handling")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
The SPI1 function was associated with the wrong pins: The functions that
those pins provide are either an SPI debug or a passthrough function
coupled to SPI1. Make the SPI1 mux function configure the relevant pins
and associate new SPI1DEBUG and SPI1PASSTHRU functions with the pins
that were already defined.
The notation used in the datasheet's multi-function pin table for the SoC is
often creative: here the SYS* signals are enabled by a single bit,
which is nothing unusual on its own, but in this case the bit also
participates in a multi-bit bitfield and therefore represents multiple
functions. This fact was overlooked in the original patch.
Fixes: 56e57cb6c07f (pinctrl: Add pinctrl-aspeed-g5 driver)
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
This prevented C20 from successfully being muxed as GPIO.
Fixes: 56e57cb6c07f (pinctrl: Add pinctrl-aspeed-g5 driver)
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Fixes simple typos in the initial commit. There is no behavioural
change.
Fixes: 56e57cb6c07f (pinctrl: Add pinctrl-aspeed-g5 driver)
Reported-by: Xo Wang <xow@google.com>
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Consider a scenario with one pin P that has two signals A and B, where A
is defined to be higher priority than B: That is, if the mux IP is in a
state that would consider both A and B to be active on P, then A will be
the active signal.
To instead configure B as the active signal we must configure the mux so
that A is inactive. The mux state for signals can be described by
logical operations on one or more bits from one or more registers (a
"signal expression"), which in some cases leads to aliased mux states for
a particular signal. Further, signals described by multi-bit bitfields
often do not only need to record the states that would make them active
(the "enable" expressions), but also the states that makes them inactive
(the "disable" expressions). All of this combined leads to four possible
states for a signal:
1. A signal is active with respect to an "enable" expression
2. A signal is not active with respect to an "enable" expression
3. A signal is inactive with respect to a "disable" expression
4. A signal is not inactive with respect to a "disable" expression
In the case of P, if we are looking to activate B without explicitly
having configured A, it's enough to consider A inactive if all of A's
"enable" signal expressions evaluate to "not active". If any evaluate to
"active" then the corresponding "disable" states must be applied so it
becomes inactive.
For example, on the AST2400 the pins composing GPIO bank H provide
signals ROMD8 through ROMD15 (high priority) and those for UART6 (low
priority). The mux states for ROMD8 through ROMD15 are aliased, i.e.
there are two mux states that result in the respective signals being
configured:
A. SCU90[6]=1
B. Strap[4,1:0]=100
Further, the second mux state is a 3-bit bitfield that explicitly
defines the enabled state but the disabled state is implicit, i.e. if
Strap[4,1:0] is not exactly "100" then ROMD8 through ROMD15 are not
considered active. This requires the mux function evaluation logic to
use approach 2. above, however the existing code was using approach 3.
The problem was brought to light on the Palmetto machines where the
strap register value is 0x120ce416, and prevented GPIO requests in bank
H from succeeding despite the hardware being in a position to allow
them.
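Conceptually, disabling a higher-priority signal then follows approach 2
above (a sketch; the helper names from pinctrl-aspeed.c are assumptions):
  static int aspeed_sig_expr_disable(const struct aspeed_sig_expr *expr,
                                     struct regmap *map)
  {
      int ret = aspeed_sig_expr_eval(expr, true, map);

      if (ret < 0)
          return ret;

      /* only write the "disable" states if an enable expression is active */
      if (ret)
          return aspeed_sig_expr_set(expr, false, map);

      return 0;
  }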
Fixes: 318398c09a8d ("pinctrl: Add core pinctrl support for Aspeed SoCs")
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
Fixes the following sparse warning:
fs/ceph/xattr.c:19:28: warning:
symbol 'ceph_other_xattr_handler' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|