aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel/kvmclock.c (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2017-11-02futex: futex_wake_op, do not fail on invalid opJiri Slaby1-2/+10
In commit 30d6e0a4190d ("futex: Remove duplicated code and fix undefined behaviour"), I let FUTEX_WAKE_OP to fail on invalid op. Namely when op should be considered as shift and the shift is out of range (< 0 or > 31). But strace's test suite does this madness: futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xa0caffee); futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xbadfaced); futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xffffffff); When I pick the first 0xa0caffee, it decodes as: 0x80000000 & 0xa0caffee: oparg is shift 0x70000000 & 0xa0caffee: op is FUTEX_OP_OR 0x0f000000 & 0xa0caffee: cmp is FUTEX_OP_CMP_EQ 0x00fff000 & 0xa0caffee: oparg is sign-extended 0xcaf = -849 0x00000fff & 0xa0caffee: cmparg is sign-extended 0xfee = -18 That means the op tries to do this: (futex |= (1 << (-849))) == -18 which is completely bogus. The new check of op in the code is: if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28)) { if (oparg < 0 || oparg > 31) return -EINVAL; oparg = 1 << oparg; } which results obviously in the "Invalid argument" errno: FAIL: futex =========== futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xa0caffee) = -1: Invalid argument futex.test: failed test: ../futex failed with code 1 So let us soften the failure to print only a (ratelimited) message, crop the value and continue as if it were right. When userspace keeps up, we can switch this to return -EINVAL again. [v2] Do not return 0 immediatelly, proceed with the cropped value. Fixes: 30d6e0a4190d ("futex: Remove duplicated code and fix undefined behaviour") Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Darren Hart <dvhart@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-02MIPS: Update email address for Marcin NowakowskiMarcin Nowakowski3-2/+3
MIPS is no longer part of Imagination Technologies and my @imgtec.com address will soon stop working. Update any files containing my address as well as the .mailmap to point to my new @mips.com address. Signed-off-by: Marcin Nowakowski <marcin.nowakowski@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/17579/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-02License cleanup: add SPDX license identifier to uapi header files with a licenseGreg Kroah-Hartman522-0/+522
Many user space API headers have licensing information, which is either incomplete, badly formatted or just a shorthand for referring to the license under which the file is supposed to be. This makes it hard for compliance tools to determine the correct license. Update these files with an SPDX license identifier. The identifier was chosen based on the license information in the file. GPL/LGPL licensed headers get the matching GPL/LGPL SPDX license identifier with the added 'WITH Linux-syscall-note' exception, which is the officially assigned exception identifier for the kernel syscall exception: NOTE! This copyright does *not* cover user programs that use kernel services by normal system calls - this is merely considered normal use of the kernel, and does *not* fall under the heading of "derived work". This exception makes it possible to include GPL headers into non GPL code, without confusing license compliance tools. Headers which have either explicit dual licensing or are just licensed under a non GPL license are updated with the corresponding SPDX identifier and the GPLv2 with syscall exception identifier. The format is: ((GPL-2.0 WITH Linux-syscall-note) OR SPDX-ID-OF-OTHER-LICENSE) SPDX license identifiers are a legally binding shorthand, which can be used instead of the full boiler plate text. The update does not remove existing license information as this has to be done on a case by case basis and the copyright holders might have to be consulted. This will happen in a separate step. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. See the previous patch in this series for the methodology of how this patch was researched. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02License cleanup: add SPDX license identifier to uapi header files with no licenseGreg Kroah-Hartman930-0/+930
Many user space API headers are missing licensing information, which makes it hard for compliance tools to determine the correct license. By default are files without license information under the default license of the kernel, which is GPLV2. Marking them GPLV2 would exclude them from being included in non GPLV2 code, which is obviously not intended. The user space API headers fall under the syscall exception which is in the kernels COPYING file: NOTE! This copyright does *not* cover user programs that use kernel services by normal system calls - this is merely considered normal use of the kernel, and does *not* fall under the heading of "derived work". otherwise syscall usage would not be possible. Update the files which contain no license information with an SPDX license identifier. The chosen identifier is 'GPL-2.0 WITH Linux-syscall-note' which is the officially assigned identifier for the Linux syscall exception. SPDX license identifiers are a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. See the previous patch in this series for the methodology of how this patch was researched. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman11139-0/+11139
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02KEYS: fix out-of-bounds read during ASN.1 parsingEric Biggers1-0/+3
syzkaller with KASAN reported an out-of-bounds read in asn1_ber_decoder(). It can be reproduced by the following command, assuming CONFIG_X509_CERTIFICATE_PARSER=y and CONFIG_KASAN=y: keyctl add asymmetric desc $'\x30\x30' @s The bug is that the length of an ASN.1 data value isn't validated in the case where it is encoded using the short form, causing the decoder to read past the end of the input buffer. Fix it by validating the length. The bug report was: BUG: KASAN: slab-out-of-bounds in asn1_ber_decoder+0x10cb/0x1730 lib/asn1_decoder.c:233 Read of size 1 at addr ffff88003cccfa02 by task syz-executor0/6818 CPU: 1 PID: 6818 Comm: syz-executor0 Not tainted 4.14.0-rc7-00008-g5f479447d983 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0xb3/0x10b lib/dump_stack.c:52 print_address_description+0x79/0x2a0 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x236/0x340 mm/kasan/report.c:409 __asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:427 asn1_ber_decoder+0x10cb/0x1730 lib/asn1_decoder.c:233 x509_cert_parse+0x1db/0x650 crypto/asymmetric_keys/x509_cert_parser.c:89 x509_key_preparse+0x64/0x7a0 crypto/asymmetric_keys/x509_public_key.c:174 asymmetric_key_preparse+0xcb/0x1a0 crypto/asymmetric_keys/asymmetric_type.c:388 key_create_or_update+0x347/0xb20 security/keys/key.c:855 SYSC_add_key security/keys/keyctl.c:122 [inline] SyS_add_key+0x1cd/0x340 security/keys/keyctl.c:62 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x447c89 RSP: 002b:00007fca7a5d3bd8 EFLAGS: 00000246 ORIG_RAX: 00000000000000f8 RAX: ffffffffffffffda RBX: 00007fca7a5d46cc RCX: 0000000000447c89 RDX: 0000000020006f4a RSI: 0000000020006000 RDI: 0000000020001ff5 RBP: 0000000000000046 R08: fffffffffffffffd R09: 0000000000000000 R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007fca7a5d49c0 R15: 00007fca7a5d4700 Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder") Cc: <stable@vger.kernel.org> # v3.7+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-11-02KEYS: trusted: fix writing past end of buffer in trusted_read()Eric Biggers1-11/+12
When calling keyctl_read() on a key of type "trusted", if the user-supplied buffer was too small, the kernel ignored the buffer length and just wrote past the end of the buffer, potentially corrupting userspace memory. Fix it by instead returning the size required, as per the documentation for keyctl_read(). We also don't even fill the buffer at all in this case, as this is slightly easier to implement than doing a short read, and either behavior appears to be permitted. It also makes it match the behavior of the "encrypted" key type. Fixes: d00a1c72f7f4 ("keys: add new trusted key-type") Reported-by: Ben Hutchings <ben@decadent.org.uk> Cc: <stable@vger.kernel.org> # v2.6.38+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-11-02KEYS: return full count in keyring_read() if buffer is too smallEric Biggers1-20/+19
Commit e645016abc80 ("KEYS: fix writing past end of user-supplied buffer in keyring_read()") made keyring_read() stop corrupting userspace memory when the user-supplied buffer is too small. However it also made the return value in that case be the short buffer size rather than the size required, yet keyctl_read() is actually documented to return the size required. Therefore, switch it over to the documented behavior. Note that for now we continue to have it fill the short buffer, since it did that before (pre-v3.13) and dump_key_tree_aux() in keyutils arguably relies on it. Fixes: e645016abc80 ("KEYS: fix writing past end of user-supplied buffer in keyring_read()") Reported-by: Ben Hutchings <ben@decadent.org.uk> Cc: <stable@vger.kernel.org> # v3.13+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-11-02net: vrf: correct FRA_L3MDEV encode typeJeff Barnhill1-1/+1
FRA_L3MDEV is defined as U8, but is being added as a U32 attribute. On big endian architecture, this results in the l3mdev entry not being added to the FIB rules. Fixes: 1aa6c4f6b8cd8 ("net: vrf: Add l3mdev rules on first device create") Signed-off-by: Jeff Barnhill <0xeffeff@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-02tcp_nv: fix division by zero in tcpnv_acked()Konstantin Khlebnikov1-1/+1
Average RTT could become zero. This happened in real life at least twice. This patch treats zero as 1us. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Lawrence Brakmo <Brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01drm/amdgpu: allow harvesting check for Polaris VCELeo Liu1-6/+6
Fixes init failures on Polaris cards with harvested VCE blocks. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2017-11-01drm/amdgpu: return -ENOENT from uvd 6.0 early init for harvestingLeo Liu1-0/+4
Fixes init failures on polaris cards with harvested UVD. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2017-11-02ARM: add debug ".edata_real" symbolRussell King1-0/+9
Add an additional symbol to the decompressor image, which will allow future debugging of non-bootable problems similar to the one encountered with the EFI stub. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-11-01MIPS: smp-cmp: Fix vpe_id build errorJames Hogan1-2/+2
The smp-cmp build has been (further) broken since commit 856fbcee6099 ("MIPS: Store core & VP IDs in GlobalNumber-style variable") in v4.14-rc1 like so: arch/mips/kernel/smp-cmp.c: In function ‘cmp_init_secondary’: arch/mips/kernel/smp-cmp.c:53:4: error: ‘struct cpuinfo_mips’ has no member named ‘vpe_id’ c->vpe_id = (read_c0_tcbind() >> TCBIND_CURVPE_SHIFT) & ^ Fix by replacing vpe_id with cpu_set_vpe_id(). Fixes: 856fbcee6099 ("MIPS: Store core & VP IDs in GlobalNumber-style variable") Signed-off-by: James Hogan <jhogan@kernel.org> Reviewed-by: Paul Burton <paul.burton@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17569/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-01MAINTAINERS: Update Pistachio platform maintainersJames Hartley1-3/+2
Neither of the current maintainers works for Imagination any more. Removed both imgtec email addresses and added back mine for occasional reviews, also changed from Maintained to Odd Fixes to reflect the time that I will be able to spend on it. Signed-off-by: James Hartley <james.hartley@sondrel.com> Patchwork: https://patchwork.linux-mips.org/patch/17475/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-01MIPS: smp-cmp: Use right include for task_structJason A. Donenfeld1-1/+1
When task_struct was moved, this MIPS code was neglected. Evidently nobody is using it anymore. This fixes this build error: In file included from ./arch/mips/include/asm/thread_info.h:15:0, from ./include/linux/thread_info.h:37, from ./include/asm-generic/current.h:4, from ./arch/mips/include/generated/asm/current.h:1, from ./include/linux/sched.h:11, from arch/mips/kernel/smp-cmp.c:22: arch/mips/kernel/smp-cmp.c: In function ‘cmp_boot_secondary’: ./arch/mips/include/asm/processor.h:384:41: error: implicit declaration of function ‘task_stack_page’ [-Werror=implicit-function-declaration] #define __KSTK_TOS(tsk) ((unsigned long)task_stack_page(tsk) + \ ^ arch/mips/kernel/smp-cmp.c:84:21: note: in expansion of macro ‘__KSTK_TOS’ unsigned long sp = __KSTK_TOS(idle); ^~~~~~~~~~ Fixes: f3ac60671954 ("sched/headers: Move task-stack related APIs from <linux/sched.h> to <linux/sched/task_stack.h>") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: <stable@vger.kernel.org> # 4.11+ Patchwork: https://patchwork.linux-mips.org/patch/17522/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-01signal: Fix name of SIGEMT in #if defined() checkAndrew Clayton1-1/+1
Commit cc731525f26a ("signal: Remove kernel interal si_code magic") added a check for SIGMET and NSIGEMT being defined. That SIGMET should in fact be SIGEMT, with SIGEMT being defined in arch/{alpha,mips,sparc}/include/uapi/asm/signal.h This was actually pointed out by BenHutchings in a lwn.net comment here https://lwn.net/Comments/734608/ Fixes: cc731525f26a ("signal: Remove kernel interal si_code magic") Signed-off-by: Andrew Clayton <andrew@digital-domain.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2017-11-01MIPS: Update Goldfish RTC driver maintainer email addressAleksandar Markovic2-1/+2
Change all relevant instances of miodrag.dinic@imgtec.com email address to miodrag.dinic@mips.com. Signed-off-by: Miodrag Dinic <miodrag.dinic@mips.com> Signed-off-by: Aleksandar Markovic <aleksandar.markovic@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/17515/ [jhogan@kernel.org: Fix .mailmap direction] Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-01MIPS: Update RINT emulation maintainer email addressAleksandar Markovic2-1/+2
Change all relevant instances of aleksandar.markovic@imgtec.com email address to aleksandar.markovic@mips.com. Signed-off-by: Miodrag Dinic <miodrag.dinic@mips.com> Signed-off-by: Aleksandar Markovic <aleksandar.markovic@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/17514/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-01MIPS: CPS: Fix use of current_cpu_data in preemptible codeMatt Redfearn1-1/+1
Commit 1ec9dd80bedc ("MIPS: CPS: Detect CPUs in secondary clusters") added a check in cps_boot_secondary() that the secondary being booted is in the same cluster as the CPU running this code. This check is performed using current_cpu_data without disabling preemption. As such when CONFIG_PREEMPT=y, a BUG is triggered: [ 57.991693] BUG: using smp_processor_id() in preemptible [00000000] code: hotplug/1749 <snip> [ 58.063077] Call Trace: [ 58.065842] [<8040cdb4>] show_stack+0x84/0x114 [ 58.070830] [<80b11b38>] dump_stack+0xf8/0x140 [ 58.075796] [<8079b12c>] check_preemption_disabled+0xec/0x118 [ 58.082204] [<80415110>] cps_boot_secondary+0x84/0x44c [ 58.087935] [<80413a14>] __cpu_up+0x34/0x98 [ 58.092624] [<80434240>] bringup_cpu+0x38/0x114 [ 58.097680] [<80434af0>] cpuhp_invoke_callback+0x168/0x8f0 [ 58.103801] [<804362d0>] _cpu_up+0x154/0x1c8 [ 58.108565] [<804363dc>] do_cpu_up+0x98/0xa8 [ 58.113333] [<808261f8>] device_online+0x84/0xc0 [ 58.118481] [<80826294>] online_store+0x60/0x98 [ 58.123562] [<8062261c>] kernfs_fop_write+0x158/0x1d4 [ 58.129196] [<805a2ae4>] __vfs_write+0x4c/0x168 [ 58.134247] [<805a2dc8>] vfs_write+0xe0/0x190 [ 58.139095] [<805a2fe0>] SyS_write+0x68/0xc4 [ 58.143854] [<80415d58>] syscall_common+0x34/0x58 In reality we don't currently support running the kernel on CPUs not in cluster 0, so the answer to cpu_cluster(&current_cpu_data) will always be 0, even if this task being preempted and continues running on a different CPU. Regardless, the BUG should not be triggered, so fix this by switching to raw_current_cpu_data. When multicluster support lands upstream this check will need removing or changing anyway. Fixes: 1ec9dd80bedc ("MIPS: CPS: Detect CPUs in secondary clusters") Signed-off-by: Matt Redfearn <matt.redfearn@mips.com> Reviewed-by: Paul Burton <paul.burton@mips.com> CC: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/17563/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-01x86/mcelog: Get rid of RCU remnantsBorislav Petkov1-94/+27
Jeremy reported a suspicious RCU usage warning in mcelog. /dev/mcelog is called in process context now as part of the notifier chain and doesn't need any of the fancy RCU and lockless accesses which it did in atomic context. Axe it all in favor of a simple mutex synchronization which cures the problem reported. Fixes: 5de97c9f6d85 ("x86/mce: Factor out and deprecate the /dev/mcelog driver") Reported-by: Jeremy Cline <jcline@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-and-tested-by: Tony Luck <tony.luck@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: linux-edac@vger.kernel.org Cc: Laura Abbott <labbott@redhat.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20171101164754.xzzmskl4ngrqc5br@pd.tnic Link: https://bugzilla.redhat.com/show_bug.cgi?id=1498969
2017-11-01watchdog/hardlockup/perf: Use atomics to track in-use cpu counterDon Zickus1-3/+5
Guenter reported: There is still a problem. When running echo 6 > /proc/sys/kernel/watchdog_thresh echo 5 > /proc/sys/kernel/watchdog_thresh repeatedly, the message NMI watchdog: Enabled. Permanently consumes one hw-PMU counter. stops after a while (after ~10-30 iterations, with fluctuations). Maybe watchdog_cpus needs to be atomic ? That's correct as this again is affected by the asynchronous nature of the smpboot thread unpark mechanism. CPU 0 CPU1 CPU2 write(watchdog_thresh, 6) stop() park() update() start() unpark() thread->unpark() cnt++; write(watchdog_thresh, 5) thread->unpark() stop() park() thread->park() cnt--; cnt++; update() start() unpark() That's not a functional problem, it just affects the informational message. Convert watchdog_cpus to atomic_t to prevent the problem Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20171101181126.j727fqjmdthjz4xk@redhat.com
2017-11-01watchdog/harclockup/perf: Revert a33d44843d45 ("watchdog/hardlockup/perf: Simplify deferred event destroy")Thomas Gleixner1-2/+5
Guenter reported a crash in the watchdog/perf code, which is caused by cleanup() and enable() running concurrently. The reason for this is: The watchdog functions are serialized via the watchdog_mutex and cpu hotplug locking, but the enable of the perf based watchdog happens in context of the unpark callback of the smpboot thread. But that unpark function is not synchronous inside the locking. The unparking of the thread just wakes it up and leaves so there is no guarantee when the thread is executing. If it starts running _before_ the cleanup happened then it will create a event and overwrite the dead event pointer. The new event is then cleaned up because the event is marked dead. lock(watchdog_mutex); lockup_detector_reconfigure(); cpus_read_lock(); stop(); park() update(); start(); unpark() cpus_read_unlock(); thread runs() overwrite dead event ptr cleanup(); free new event, which is active inside perf.... unlock(watchdog_mutex); The park side is safe as that actually waits for the thread to reach parked state. Commit a33d44843d45 removed the protection against this kind of scenario under the stupid assumption that the hotplug serialization and the watchdog_mutex cover everything. Bring it back. Reverts: a33d44843d45 ("watchdog/hardlockup/perf: Simplify deferred event destroy") Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Feels-stupid Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Don Zickus <dzickus@redhat.com> Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1710312145190.1942@nanos
2017-11-01ARM: 8716/1: pass endianness info to sparseLuc Van Oostenryck1-0/+2
ARM depends on the macros '__ARMEL__' & '__ARMEB__' being defined or not to correctly select or define endian-specific macros, structures or pieces of code. These macros are predefined by the compiler but sparse knows nothing about them and thus may pre-process files differently from what gcc would. Fix this by passing '-D__ARMEL__' or '-D__ARMEB__' to sparse, depending on the endianness of the kernel, like defined by GCC. Note: In most case it won't change anything since most ARMs use little-endian (but an allyesconfig would use big-endian!). To: Russell King <linux@armlinux.org.uk> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-11-01drm/i915: Check incoming alignment for unfenced buffers (on i915gm)Chris Wilson1-0/+4
In case the object has changed tiling between calls to execbuf, we need to check if the existing offset inside the GTT matches the new tiling constraint. We even need to do this for "unfenced" tiled objects, where the 3D commands use an implied fence and so the object still needs to match the physical fence restrictions on alignment (only required for gen2 and early gen3). In commit 2889caa92321 ("drm/i915: Eliminate lots of iterations over the execobjects array"), the idea was to remove the second guessing and only set the NEEDS_MAP flag when required. However, the entire check for an unusable offset for fencing was removed and not just the secondary check. I.e. /* avoid costly ping-pong once a batch bo ended up non-mappable */ if (entry->flags & __EXEC_OBJECT_NEEDS_MAP && !i915_vma_is_map_and_fenceable(vma)) return !only_mappable_for_reloc(entry->flags); was entirely removed as the ping-pong between execbuf passes was fixed, but its primary purpose in forcing unaligned unfenced access to be rebound was forgotten. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103502 Fixes: 2889caa92321 ("drm/i915: Eliminate lots of iterations over the execobjects array") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171031103607.17836-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (cherry picked from commit 1d033beb20d6d5885587a02a393b6598d766a382) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2017-11-01x86/mm: fix use-after-free of vma during userfaultfd faultVlastimil Babka1-1/+10
Syzkaller with KASAN has reported a use-after-free of vma->vm_flags in __do_page_fault() with the following reproducer: mmap(&(0x7f0000000000/0xfff000)=nil, 0xfff000, 0x3, 0x32, 0xffffffffffffffff, 0x0) mmap(&(0x7f0000011000/0x3000)=nil, 0x3000, 0x1, 0x32, 0xffffffffffffffff, 0x0) r0 = userfaultfd(0x0) ioctl$UFFDIO_API(r0, 0xc018aa3f, &(0x7f0000002000-0x18)={0xaa, 0x0, 0x0}) ioctl$UFFDIO_REGISTER(r0, 0xc020aa00, &(0x7f0000019000)={{&(0x7f0000012000/0x2000)=nil, 0x2000}, 0x1, 0x0}) r1 = gettid() syz_open_dev$evdev(&(0x7f0000013000-0x12)="2f6465762f696e7075742f6576656e742300", 0x0, 0x0) tkill(r1, 0x7) The vma should be pinned by mmap_sem, but handle_userfault() might (in a return to userspace scenario) release it and then acquire again, so when we return to __do_page_fault() (with other result than VM_FAULT_RETRY), the vma might be gone. Specifically, per Andrea the scenario is "A return to userland to repeat the page fault later with a VM_FAULT_NOPAGE retval (potentially after handling any pending signal during the return to userland). The return to userland is identified whenever FAULT_FLAG_USER|FAULT_FLAG_KILLABLE are both set in vmf->flags" However, since commit a3c4fb7c9c2e ("x86/mm: Fix fault error path using unsafe vma pointer") there is a vma_pkey() read of vma->vm_flags after that point, which can thus become use-after-free. Fix this by moving the read before calling handle_mm_fault(). Reported-by: syzbot <bot+6a5269ce759a7bb12754ed9622076dc93f65a1f6@syzkaller.appspotmail.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Suggested-by: Kirill A. Shutemov <kirill@shutemov.name> Fixes: 3c4fb7c9c2e ("x86/mm: Fix fault error path using unsafe vma pointer") Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-01ide:ide-cd: fix kernel panic resulting from missing scsi_req_initHongxu Jia1-0/+1
Since we split the scsi_request out of struct request, while the standard prep_rq_fn builds 10 byte cmds, it missed to invoke scsi_req_init() to initialize certain fields of a scsi_request structure (.__cmd[], .cmd, .cmd_len and .sense_len but no other members of struct scsi_request). An example panic on virtual machines (qemu/virtualbox) to boot from IDE cdrom: ... [ 8.754381] Call Trace: [ 8.755419] blk_peek_request+0x182/0x2e0 [ 8.755863] blk_fetch_request+0x1c/0x40 [ 8.756148] ? ktime_get+0x40/0xa0 [ 8.756385] do_ide_request+0x37d/0x660 [ 8.756704] ? cfq_group_service_tree_add+0x98/0xc0 [ 8.757011] ? cfq_service_tree_add+0x1e5/0x2c0 [ 8.757313] ? ktime_get+0x40/0xa0 [ 8.757544] __blk_run_queue+0x3d/0x60 [ 8.757837] queue_unplugged+0x2f/0xc0 [ 8.758088] blk_flush_plug_list+0x1f4/0x240 [ 8.758362] blk_finish_plug+0x2c/0x40 ... [ 8.770906] RIP: ide_cdrom_prep_fn+0x63/0x180 RSP: ffff92aec018bae8 [ 8.772329] ---[ end trace 6408481e551a85c9 ]--- ... Fixes: 82ed4db499b8 ("block: split scsi_request out of struct request") Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-11-01mmc: dw_mmc: Fix the DTO timeout calculationDouglas Anderson1-1/+6
Just like the CTO timeout calculation introduced recently, the DTO timeout calculation was incorrect. It used "bus_hz" but, as far as I can tell, it's supposed to use the card clock. Let's account for the div value, which is documented as 2x the value stored in the register, or 1 if the register is 0. NOTE: This was likely not terribly important until commit 16a34574c6ca ("mmc: dw_mmc: remove the quirks flags") landed because "DIV" is documented on Rockchip SoCs (the ones that used to define the quirk) to always be 0 or 1. ...and, in fact, it's documented to only be 1 with EMMC in 8-bit DDR52 mode. Thus before the quirk was applied to everyone it was mostly OK to ignore the DIV value. I haven't personally observed any problems that are fixed by this patch but I also haven't tested this anywhere with a DIV other an 0. AKA: this problem was found simply by code inspection and I have no failing test cases that are fixed by it. Presumably this could fix real bugs for someone out there, though. Fixes: 16a34574c6ca ("mmc: dw_mmc: remove the quirks flags") Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-11-01tcp: fix tcp_mtu_probe() vs highest_sackEric Dumazet2-4/+5
Based on SNMP values provided by Roman, Yuchung made the observation that some crashes in tcp_sacktag_walk() might be caused by MTU probing. Looking at tcp_mtu_probe(), I found that when a new skb was placed in front of the write queue, we were not updating tcp highest sack. If one skb is freed because all its content was copied to the new skb (for MTU probing), then tp->highest_sack could point to a now freed skb. Bad things would then happen, including infinite loops. This patch renames tcp_highest_sack_combine() and uses it from tcp_mtu_probe() to fix the bug. Note that I also removed one test against tp->sacked_out, since we want to replace tp->highest_sack regardless of whatever condition, since keeping a stale pointer to freed skb is a recipe for disaster. Fixes: a47e5a988a57 ("[TCP]: Convert highest_sack to sk_buff to allow direct access") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Reported-by: Roman Gushchin <guro@fb.com> Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01ipv6: addrconf: increment ifp refcount before ipv6_del_addr()Eric Dumazet1-0/+1
In the (unlikely) event fixup_permanent_addr() returns a failure, addrconf_permanent_addr() calls ipv6_del_addr() without the mandatory call to in6_ifa_hold(), leading to a refcount error, spotted by syzkaller : WARNING: CPU: 1 PID: 3142 at lib/refcount.c:227 refcount_dec+0x4c/0x50 lib/refcount.c:227 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 3142 Comm: ip Not tainted 4.14.0-rc4-next-20171009+ #33 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:52 panic+0x1e4/0x41c kernel/panic.c:181 __warn+0x1c4/0x1e0 kernel/panic.c:544 report_bug+0x211/0x2d0 lib/bug.c:183 fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178 do_trap_no_signal arch/x86/kernel/traps.c:212 [inline] do_trap+0x260/0x390 arch/x86/kernel/traps.c:261 do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311 invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905 RIP: 0010:refcount_dec+0x4c/0x50 lib/refcount.c:227 RSP: 0018:ffff8801ca49e680 EFLAGS: 00010286 RAX: 000000000000002c RBX: ffff8801d07cfcdc RCX: 0000000000000000 RDX: 000000000000002c RSI: 1ffff10039493c90 RDI: ffffed0039493cc4 RBP: ffff8801ca49e688 R08: ffff8801ca49dd70 R09: 0000000000000000 R10: ffff8801ca49df58 R11: 0000000000000000 R12: 1ffff10039493cd9 R13: ffff8801ca49e6e8 R14: ffff8801ca49e7e8 R15: ffff8801d07cfcdc __in6_ifa_put include/net/addrconf.h:369 [inline] ipv6_del_addr+0x42b/0xb60 net/ipv6/addrconf.c:1208 addrconf_permanent_addr net/ipv6/addrconf.c:3327 [inline] addrconf_notify+0x1c66/0x2190 net/ipv6/addrconf.c:3393 notifier_call_chain+0x136/0x2c0 kernel/notifier.c:93 __raw_notifier_call_chain kernel/notifier.c:394 [inline] raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401 call_netdevice_notifiers_info+0x32/0x60 net/core/dev.c:1697 call_netdevice_notifiers net/core/dev.c:1715 [inline] __dev_notify_flags+0x15d/0x430 net/core/dev.c:6843 dev_change_flags+0xf5/0x140 net/core/dev.c:6879 do_setlink+0xa1b/0x38e0 net/core/rtnetlink.c:2113 rtnl_newlink+0xf0d/0x1a40 net/core/rtnetlink.c:2661 rtnetlink_rcv_msg+0x733/0x1090 net/core/rtnetlink.c:4301 netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2408 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4313 netlink_unicast_kernel net/netlink/af_netlink.c:1273 [inline] netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1299 netlink_sendmsg+0xa4a/0xe70 net/netlink/af_netlink.c:1862 sock_sendmsg_nosec net/socket.c:633 [inline] sock_sendmsg+0xca/0x110 net/socket.c:643 ___sys_sendmsg+0x75b/0x8a0 net/socket.c:2049 __sys_sendmsg+0xe5/0x210 net/socket.c:2083 SYSC_sendmsg net/socket.c:2094 [inline] SyS_sendmsg+0x2d/0x50 net/socket.c:2090 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x7fa9174d3320 RSP: 002b:00007ffe302ae9e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007ffe302b2ae0 RCX: 00007fa9174d3320 RDX: 0000000000000000 RSI: 00007ffe302aea20 RDI: 0000000000000016 RBP: 0000000000000082 R08: 0000000000000000 R09: 000000000000000f R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe302b32a0 R13: 0000000000000000 R14: 00007ffe302b2ab8 R15: 00007ffe302b32b8 Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Ahern <dsahern@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01tun/tap: sanitize TUNSETSNDBUF inputCraig Gallek2-0/+6
Syzkaller found several variants of the lockup below by setting negative values with the TUNSETSNDBUF ioctl. This patch adds a sanity check to both the tun and tap versions of this ioctl. watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [repro:2389] Modules linked in: irq event stamp: 329692056 hardirqs last enabled at (329692055): [<ffffffff824b8381>] _raw_spin_unlock_irqrestore+0x31/0x75 hardirqs last disabled at (329692056): [<ffffffff824b9e58>] apic_timer_interrupt+0x98/0xb0 softirqs last enabled at (35659740): [<ffffffff824bc958>] __do_softirq+0x328/0x48c softirqs last disabled at (35659731): [<ffffffff811c796c>] irq_exit+0xbc/0xd0 CPU: 0 PID: 2389 Comm: repro Not tainted 4.14.0-rc7 #23 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff880009452140 task.stack: ffff880006a20000 RIP: 0010:_raw_spin_lock_irqsave+0x11/0x80 RSP: 0018:ffff880006a27c50 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff10 RAX: ffff880009ac68d0 RBX: ffff880006a27ce0 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffff880006a27ce0 RDI: ffff880009ac6900 RBP: ffff880006a27c60 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 000000000063ff00 R12: ffff880009ac6900 R13: ffff880006a27cf8 R14: 0000000000000001 R15: ffff880006a27cf8 FS: 00007f4be4838700(0000) GS:ffff88000cc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020101000 CR3: 0000000009616000 CR4: 00000000000006f0 Call Trace: prepare_to_wait+0x26/0xc0 sock_alloc_send_pskb+0x14e/0x270 ? remove_wait_queue+0x60/0x60 tun_get_user+0x2cc/0x19d0 ? __tun_get+0x60/0x1b0 tun_chr_write_iter+0x57/0x86 __vfs_write+0x156/0x1e0 vfs_write+0xf7/0x230 SyS_write+0x57/0xd0 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x7f4be4356df9 RSP: 002b:00007ffc18101c08 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f4be4356df9 RDX: 0000000000000046 RSI: 0000000020101000 RDI: 0000000000000005 RBP: 00007ffc18101c40 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000293 R12: 0000559c75f64780 R13: 00007ffc18101d30 R14: 0000000000000000 R15: 0000000000000000 Fixes: 33dccbb050bb ("tun: Limit amount of queued packets per device") Fixes: 20d29d7a916a ("net: macvtap driver") Signed-off-by: Craig Gallek <kraig@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01mlxsw: i2c: Fix buffer increment counter for write transactionVadim Pasternak1-1/+1
It fixes a problem for the last chunk where 'chunk_size' is smaller than MLXSW_I2C_BLK_MAX and data is copied to the wrong offset, overriding previous data. Fixes: 6882b0aee180 ("mlxsw: Introduce support for I2C bus") Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01netfilter: nf_reject_ipv4: Fix use-after-free in send_resetTejaswi Tanikella1-0/+2
niph is not updated after pskb_expand_head changes the skb head. It still points to the freed data, which is then used to update tot_len and checksum. This could cause use-after-free poison crash. Update niph, if ip_route_me_harder does not fail. This only affects the interaction with REJECT targets and br_netfilter. Signed-off-by: Tejaswi Tanikella <tejaswit@codeaurora.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-11-01futex: Fix more put_pi_state() vs. exit_pi_state_list() racesPeter Zijlstra1-3/+20
Dmitry (through syzbot) reported being able to trigger the WARN in get_pi_state() and a use-after-free on: raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock); Both are due to this race: exit_pi_state_list() put_pi_state() lock(&curr->pi_lock) while() { pi_state = list_first_entry(head); hb = hash_futex(&pi_state->key); unlock(&curr->pi_lock); dec_and_test(&pi_state->refcount); lock(&hb->lock) lock(&pi_state->pi_mutex.wait_lock) // uaf if pi_state free'd lock(&curr->pi_lock); .... unlock(&curr->pi_lock); get_pi_state(); // WARN; refcount==0 The problem is we take the reference count too late, and don't allow it being 0. Fix it by using inc_not_zero() and simply retrying the loop when we fail to get a refcount. In that case put_pi_state() should remove the entry from the list. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Gratian Crisan <gratian.crisan@ni.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: dvhart@infradead.org Cc: syzbot <bot+2af19c9e1ffe4d4ee1d16c56ae7580feaee75765@syzkaller.appspotmail.com> Cc: syzkaller-bugs@googlegroups.com Cc: <stable@vger.kernel.org> Fixes: c74aef2d06a9 ("futex: Fix pi_state->owner serialization") Link: http://lkml.kernel.org/r/20171031101853.xpfh72y643kdfhjs@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-01powerpc/kprobes: Dereference function pointers only if the address does not belong to kernel textNaveen N. Rao1-1/+6
This makes the changes introduced in commit 83e840c770f2c5 ("powerpc64/elfv1: Only dereference function descriptor for non-text symbols") to be specific to the kprobe subsystem. We previously changed ppc_function_entry() to always check the provided address to confirm if it needed to be dereferenced. This is actually only an issue for kprobe blacklisted asm labels (through use of _ASM_NOKPROBE_SYMBOL) and can cause other issues with ftrace. Also, the additional checks are not really necessary for our other uses. As such, move this check to the kprobes subsystem. Fixes: 83e840c770f2 ("powerpc64/elfv1: Only dereference function descriptor for non-text symbols") Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-01Revert "powerpc64/elfv1: Only dereference function descriptor for non-text symbols"Naveen N. Rao1-9/+1
This reverts commit 83e840c770f2c5 ("powerpc64/elfv1: Only dereference function descriptor for non-text symbols"). Chandan reported that on newer kernels, trying to enable function_graph tracer on ppc64 (BE) locks up the system with the following trace: Unable to handle kernel paging request for data at address 0x600000002fa30010 Faulting instruction address: 0xc0000000001f1300 Thread overran stack, or stack corrupted Oops: Kernel access of bad area, sig: 11 [#1] BE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries Modules linked in: CPU: 1 PID: 6586 Comm: bash Not tainted 4.14.0-rc3-00162-g6e51f1f-dirty #20 task: c000000625c07200 task.stack: c000000625c07310 NIP: c0000000001f1300 LR: c000000000121cac CTR: c000000000061af8 REGS: c000000625c088c0 TRAP: 0380 Not tainted (4.14.0-rc3-00162-g6e51f1f-dirty) MSR: 8000000000001032 <SF,ME,IR,DR,RI> CR: 28002848 XER: 00000000 CFAR: c0000000001f1320 SOFTE: 0 ... NIP [c0000000001f1300] .__is_insn_slot_addr+0x30/0x90 LR [c000000000121cac] .kernel_text_address+0x18c/0x1c0 Call Trace: [c000000625c08b40] [c0000000001bd040] .is_module_text_address+0x20/0x40 (unreliable) [c000000625c08bc0] [c000000000121cac] .kernel_text_address+0x18c/0x1c0 [c000000625c08c50] [c000000000061960] .prepare_ftrace_return+0x50/0x130 [c000000625c08cf0] [c000000000061b10] .ftrace_graph_caller+0x14/0x34 [c000000625c08d60] [c000000000121b40] .kernel_text_address+0x20/0x1c0 [c000000625c08df0] [c000000000061960] .prepare_ftrace_return+0x50/0x130 ... [c000000625c0ab30] [c000000000061960] .prepare_ftrace_return+0x50/0x130 [c000000625c0abd0] [c000000000061b10] .ftrace_graph_caller+0x14/0x34 [c000000625c0ac40] [c000000000121b40] .kernel_text_address+0x20/0x1c0 [c000000625c0acd0] [c000000000061960] .prepare_ftrace_return+0x50/0x130 [c000000625c0ad70] [c000000000061b10] .ftrace_graph_caller+0x14/0x34 [c000000625c0ade0] [c000000000121b40] .kernel_text_address+0x20/0x1c0 This is because ftrace is using ppc_function_entry() for obtaining the address of return_to_handler() in prepare_ftrace_return(). The call to kernel_text_address() itself gets traced and we end up in a recursive loop. Fixes: 83e840c770f2 ("powerpc64/elfv1: Only dereference function descriptor for non-text symbols") Cc: stable@vger.kernel.org # v4.13+ Reported-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-01mlxsw: reg: Add high and low temperature thresholdsIdo Schimmel1-0/+25
The ASIC has the ability to generate events whenever a sensor indicates the temperature goes above or below its high or low thresholds, respectively. In new firmware versions the firmware enforces a minimum of 5 degrees Celsius difference between both thresholds. Make the driver conform to this requirement. Note that this is required even when the events are disabled, as in certain systems interrupts are generated via GPIO based on these thresholds. Fixes: 85926f877040 ("mlxsw: reg: Add definition of temperature management registers") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01MAINTAINERS: Remove Yotam from mlxfwYuval Mintz1-1/+1
Provide a mailing list for maintenance of the module instead. Signed-off-by: Yuval Mintz <yuvalm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01MAINTAINERS: Update Yotam's E-mailYotam Gigi4-5/+5
For the time being I will be available in my private mail. Update both the MAINTAINERS file and the individual modules MODULE_AUTHOR directive with the new address. Signed-off-by: Yotam Gigi <yotam.gi@gmail.com> Signed-off-by: Yuval Mintz <yuvalm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01net: hns: set correct return valuePan Bian1-2/+2
The function of_parse_phandle() returns a NULL pointer if it cannot resolve a phandle property to a device_node pointer. In function hns_nic_dev_probe(), its return value is passed to PTR_ERR to extract the error code. However, in this case, the extracted error code will always be zero, which is unexpected. Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01net: lapbether: fix double freePan Bian1-1/+0
The function netdev_priv() returns the private data of the device. The memory to store the private data is allocated in alloc_netdev() and is released in netdev_free(). Calling kfree() on the return value of netdev_priv() after netdev_free() results in a double free bug. Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01bpf: remove SK_REDIRECT from UAPIJohn Fastabend3-7/+13
Now that SK_REDIRECT is no longer a valid return code. Remove it from the UAPI completely. Then do a namespace remapping internal to sockmap so SK_REDIRECT is no longer externally visible. Patchs primary change is to do a namechange from SK_REDIRECT to __SK_REDIRECT Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01net: phy: marvell: Only configure RGMII delays when using RGMIIAndrew Lunn1-3/+5
The fix 5987feb38aa5 ("net: phy: marvell: logical vs bitwise OR typo") uncovered another bug in the Marvell PHY driver, which broke the Marvell OpenRD platform. It relies on the bootloader configuring the RGMII delays and does not specify a phy-mode in its device tree. The PHY driver should only configure RGMII delays if the phy mode indicates it is using RGMII. Without anything in device tree, the mv643xx Ethernet driver defaults to GMII. Fixes: 5987feb38aa5 ("net: phy: marvell: logical vs bitwise OR typo") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-01MIPS: SMP: Fix deadlock & online raceMatt Redfearn1-6/+16
Commit 6f542ebeaee0 ("MIPS: Fix race on setting and getting cpu_online_mask") effectively reverted commit 8f46cca1e6c06 ("MIPS: SMP: Fix possibility of deadlock when bringing CPUs online") and thus has reinstated the possibility of deadlock. The commit was based on testing of kernel v4.4, where the CPU hotplug core code issued a BUG() if the starting CPU is not marked online when the boot CPU returns from __cpu_up. The commit fixes this race (in v4.4), but re-introduces the deadlock situation. As noted in the commit message, upstream differs in this area. Commit 8df3e07e7f21f ("cpu/hotplug: Let upcoming cpu bring itself fully up") adds a completion event in the CPU hotplug core code, making this race impossible. However, people were unhappy with relying on the core code to do the right thing. To address the issues both commits were trying to fix, add a second completion event in the MIPS smp hotplug path. It removes the possibility of a race, since the MIPS smp hotplug code now synchronises both the boot and secondary CPUs before they return to the hotplug core code. It also addresses the deadlock by ensuring that the secondary CPU is not marked online before it's counters are synchronised. This fix should also be backported to fix the race condition introduced by the backport of commit 8f46cca1e6c06 ("MIPS: SMP: Fix possibility of deadlock when bringing CPUs online"), through really that race only existed before commit 8df3e07e7f21f ("cpu/hotplug: Let upcoming cpu bring itself fully up"). Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com> Fixes: 6f542ebeaee0 ("MIPS: Fix race on setting and getting cpu_online_mask") CC: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com> Cc: <stable@vger.kernel.org> # v4.1+: 8f46cca1e6c0: "MIPS: SMP: Fix possibility of deadlock when bringing CPUs online" Cc: <stable@vger.kernel.org> # v4.1+: a00eeede507c: "MIPS: SMP: Use a completion event to signal CPU up" Cc: <stable@vger.kernel.org> # v4.1+: 6f542ebeaee0: "MIPS: Fix race on setting and getting cpu_online_mask" Cc: <stable@vger.kernel.org> # v4.1+ Patchwork: https://patchwork.linux-mips.org/patch/17376/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-01MIPS: bpf: Fix a typo in build_one_insn()Wei Yongjun1-1/+1
Fix a typo in build_one_insn(). Fixes: b6bd53f9c4e8 ("MIPS: Add missing file for eBPF JIT.") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Cc: <stable@vger.kernel.org> # 4.13+ Patchwork: https://patchwork.linux-mips.org/patch/17491/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-01MIPS: microMIPS: Fix incorrect mask in insn_table_MMGustavo A. R. Silva1-1/+1
It seems that this is a typo error and the proper bit masking is "RT | RS" instead of "RS | RS". This issue was detected with the help of Coccinelle. Fixes: d6b3314b49e1 ("MIPS: uasm: Add lh uam instruction") Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: James Hogan <jhogan@kernel.org> Cc: <stable@vger.kernel.org> # 3.16+ Patchwork: https://patchwork.linux-mips.org/patch/17551/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-10-31MIPS: Fix CM region target definitionsPaul Burton1-2/+2
The default CM target field in the GCR_BASE register is encoded with 0 meaning memory & 1 being reserved. However the definitions we use for those bits effectively get these two values backwards - likely because they were copied from the definitions for the CM regions where the target is encoded differently. This results in use setting up GCR_BASE with the reserved target value by default, rather than targeting memory as intended. Although we currently seem to get away with this it's not a great idea to rely upon. Fix this by changing our macros to match the documentated target values. The incorrect encoding became used as of commit 9f98f3dd0c51 ("MIPS: Add generic CM probe & access code") in the Linux v3.15 cycle, and was likely carried forwards from older but unused code introduced by commit 39b8d5254246 ("[MIPS] Add support for MIPS CMP platform.") in the v2.6.26 cycle. Fixes: 9f98f3dd0c51 ("MIPS: Add generic CM probe & access code") Signed-off-by: Paul Burton <paul.burton@mips.com> Reported-by: Matt Redfearn <matt.redfearn@mips.com> Reviewed-by: James Hogan <jhogan@kernel.org> Cc: Matt Redfearn <matt.redfearn@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # v3.15+ Patchwork: https://patchwork.linux-mips.org/patch/17562/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-10-31MIPS: generic: Fix compilation error from include asm/mips-cpc.hMatt Redfearn2-2/+2
Commit e83f7e02af50c ("MIPS: CPS: Have asm/mips-cps.h include CM & CPC headers") adds a #error to arch/mips/include/asm/mips-cpc.h if it is included directly. While this commit replaced almost all direct includes of mips-cm.h and mips-cpc.h, 2 remain. With some defconfigs, mips-cps.h is indirectly included before mips-cpc.h, but in others this results in compilation errors: In file included from arch/mips/generic/init.c:23:0: ./arch/mips/include/asm/mips-cpc.h:12:3: error: #error Please include asm/mips-cps.h rather than asm/mips-cpc.h # error Please include asm/mips-cps.h rather than asm/mips-cpc.h In file included from arch/mips/kernel/smp.c:23:0: ./arch/mips/include/asm/mips-cpc.h:12:3: error: #error Please include asm/mips-cps.h rather than asm/mips-cpc.h # error Please include asm/mips-cps.h rather than asm/mips-cpc.h In both cases, fix this by including mips-cps.h instead. Fixes: e83f7e02af50c ("MIPS: CPS: Have asm/mips-cps.h include CM & CPC headers") Signed-off-by: Matt Redfearn <matt.redfearn@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/17492/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-10-31MIPS: Fix exception entry when CONFIG_EVA enabledMatt Redfearn1-4/+4
Commit 9fef68686317b ("MIPS: Make SAVE_SOME more standard") made several changes to the order in which registers are saved in the SAVE_SOME macro, used by exception handlers to save the processor state. In particular, it removed the move k1, sp in the delay slot of the branch testing if the processor is already in kernel mode. This is replaced later in the macro by a move k0, sp When CONFIG_EVA is disabled, this instruction actually appears in the delay slot of the branch. However, when CONFIG_EVA is enabled, instead the RPS workaround of MFC0 k0, CP0_ENTRYHI appears in the delay slot. This results in k0 not containing the stack pointer, but some unrelated value, which is then saved to the kernel stack. On exit from the exception, this bogus value is restored to the stack pointer, resulting in an OOPS. Fix this by moving the save of SP in k0 explicitly in the delay slot of the branch, outside of the CONFIG_EVA section, restoring the expected instruction ordering when CONFIG_EVA is active. Fixes: 9fef68686317b ("MIPS: Make SAVE_SOME more standard") Signed-off-by: Matt Redfearn <matt.redfearn@mips.com> Reported-by: Vladimir Kondratiev <vladimir.kondratiev@intel.com> Reviewed-by: Corey Minyard <cminyard@mvista.com> Reviewed-by: James Hogan <jhogan@kernel.org> Patchwork: https://patchwork.linux-mips.org/patch/17471/ Signed-off-by: James Hogan <jhogan@kernel.org>
2017-11-01irqchip/irq-mvebu-gicp: Add missing spin_lock initAntoine Tenart1-0/+1
A spin lock is used in the irq-mvebu-gicp driver, but it is never initialized. This patch adds the missing spin_lock_init() call in the driver's probe function. Fixes: a68a63cb4dfc ("irqchip/irq-mvebu-gicp: Add new driver for Marvell GICP") Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: gregory.clement@free-electrons.com Acked-by: marc.zyngier@arm.com Cc: thomas.petazzoni@free-electrons.com Cc: andrew@lunn.ch Cc: jason@lakedaemon.net Cc: nadavh@marvell.com Cc: miquel.raynal@free-electrons.com Cc: linux-arm-kernel@lists.infradead.org Cc: sebastian.hesselbarth@gmail.com Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20171025072326.21030-1-antoine.tenart@free-electrons.com