wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2019-06-20	x86/resctrl: Prevent possible overrun during bitmap operations	Reinette Chatre	1	-19/+16
	While the DOC at the beginning of lib/bitmap.c explicitly states that "The number of valid bits in a given bitmap does _not_ need to be an exact multiple of BITS_PER_LONG.", some of the bitmap operations do indeed access BITS_PER_LONG portions of the provided bitmap no matter the size of the provided bitmap. For example, if find_first_bit() is provided with an 8 bit bitmap the operation will access BITS_PER_LONG bits from the provided bitmap. While the operation ensures that these extra bits do not affect the result, the memory is still accessed. The capacity bitmasks (CBMs) are typically stored in u32 since they can never exceed 32 bits. A few instances exist where a bitmap_* operation is performed on a CBM by simply pointing the bitmap operation to the stored u32 value. The consequence of this pattern is that some bitmap_* operations will access out-of-bounds memory when interacting with the provided CBM. This same issue has previously been addressed with commit 49e00eee0061 ("x86/intel_rdt: Fix out-of-bounds memory access in CBM tests") but at that time not all instances of the issue were fixed. Fix this by using an unsigned long to store the capacity bitmask data that is passed to bitmap functions. Fixes: e651901187ab ("x86/intel_rdt: Introduce "bit_usage" to display cache allocations details") Fixes: f4e80d67a527 ("x86/intel_rdt: Resctrl files reflect pseudo-locked information") Fixes: 95f0b77efa57 ("x86/intel_rdt: Initialize new resource group with sane defaults") Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: stable <stable@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/58c9b6081fd9bf599af0dfc01a6fdd335768efef.1560975645.git.reinette.chatre@intel.com
2019-06-19	x86/microcode: Fix the microcode load on CPU hotplug for real	Thomas Gleixner	1	-5/+10
	A recent change moved the microcode loader hotplug callback into the early startup phase which is running with interrupts disabled. It missed that the callbacks invoke sysfs functions which might sleep causing nice 'might sleep' splats with proper debugging enabled. Split the callbacks and only load the microcode in the early startup phase and move the sysfs handling back into the later threaded and preemptible bringup phase where it was before. Fixes: 78f4e932f776 ("x86/microcode, cpuhotplug: Add a microcode loader CPU hotplug callback") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: stable@vger.kernel.org Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1906182228350.1766@nanos.tec.linutronix.de
2019-06-15	x86/microcode, cpuhotplug: Add a microcode loader CPU hotplug callback	Borislav Petkov	2	-1/+2
	Adric Blake reported the following warning during suspend-resume: Enabling non-boot CPUs ... x86: Booting SMP configuration: smpboot: Booting Node 0 Processor 1 APIC 0x2 unchecked MSR access error: WRMSR to 0x10f (tried to write 0x0000000000000000) \ at rIP: 0xffffffff8d267924 (native_write_msr+0x4/0x20) Call Trace: intel_set_tfa intel_pmu_cpu_starting ? x86_pmu_dead_cpu x86_pmu_starting_cpu cpuhp_invoke_callback ? _raw_spin_lock_irqsave notify_cpu_starting start_secondary secondary_startup_64 microcode: sig=0x806ea, pf=0x80, revision=0x96 microcode: updated to revision 0xb4, date = 2019-04-01 CPU1 is up The MSR in question is MSR_TFA_RTM_FORCE_ABORT and that MSR is emulated by microcode. The log above shows that the microcode loader callback happens after the PMU restoration, leading to the conjecture that because the microcode hasn't been updated yet, that MSR is not present yet, leading to the #GP. Add a microcode loader-specific hotplug vector which comes before the PERF vectors and thus executes earlier and makes sure the MSR is present. Fixes: 400816f60c54 ("perf/x86/intel: Implement support for TSX Force Abort") Reported-by: Adric Blake <promarbler14@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: x86@kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=203637
2019-06-14	x86/kasan: Fix boot with 5-level paging and KASAN	Andrey Ryabinin	1	-1/+1
	Since commit d52888aa2753 ("x86/mm: Move LDT remap out of KASLR region on 5-level paging") kernel doesn't boot with KASAN on 5-level paging machines. The bug is actually in early_p4d_offset() and introduced by commit 12a8cc7fcf54 ("x86/kasan: Use the same shadow offset for 4- and 5-level paging") early_p4d_offset() tries to convert pgd_val(pgd) value to a physical address. This doesn't make sense because pgd_val() already contains the physical address. It did work prior to commit d52888aa2753 because the result of "__pa_nodebug(pgd_val(pgd)) & PTE_PFN_MASK" was the same as "pgd_val(pgd) & PTE_PFN_MASK". __pa_nodebug() just set some high bits which were masked out by applying PTE_PFN_MASK. After the change of the PAGE_OFFSET offset in commit d52888aa2753 __pa_nodebug(pgd_val(pgd)) started to return a value with more high bits set and PTE_PFN_MASK wasn't enough to mask out all of them. So it returns a wrong not even canonical address and crashes on the attempt to dereference it. Switch back to pgd_val() & PTE_PFN_MASK to cure the issue. Fixes: 12a8cc7fcf54 ("x86/kasan: Use the same shadow offset for 4- and 5-level paging") Reported-by: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: kasan-dev@googlegroups.com Cc: stable@vger.kernel.org Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20190614143149.2227-1-aryabinin@virtuozzo.com
2019-06-13	x86/fpu: Don't use current->mm to check for a kthread	Christoph Hellwig	2	-4/+4
	current->mm can be non-NULL if a kthread calls use_mm(). Check for PF_KTHREAD instead to decide when to store user mode FP state. Fixes: 2722146eb784 ("x86/fpu: Remove fpu->initialized") Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Aubrey Li <aubrey.li@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Nicolai Stange <nstange@suse.de> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190604175411.GA27477@lst.de
2019-06-12	x86/kgdb: Return 0 from kgdb_arch_set_breakpoint()	Matt Mullins	1	-1/+1
	err must be nonzero in order to reach text_poke(), which caused kgdb to fail to set breakpoints: (gdb) break __x64_sys_sync Breakpoint 1 at 0xffffffff81288910: file ../fs/sync.c, line 124. (gdb) c Continuing. Warning: Cannot insert breakpoint 1. Cannot access memory at address 0xffffffff81288910 Command aborted. Fixes: 86a22057127d ("x86/kgdb: Avoid redundant comparison of patched code") Signed-off-by: Matt Mullins <mmullins@fb.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Nadav Amit <namit@vmware.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: Douglas Anderson <dianders@chromium.org> Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190531194755.6320-1-mmullins@fb.com
2019-06-12	x86/resctrl: Prevent NULL pointer dereference when local MBM is disabled	Prarit Bhargava	1	-0/+3
	Booting with kernel parameter "rdt=cmt,mbmtotal,memlocal,l3cat,mba" and executing "mount -t resctrl resctrl -o mba_MBps /sys/fs/resctrl" results in a NULL pointer dereference on systems which do not have local MBM support enabled.. BUG: kernel NULL pointer dereference, address: 0000000000000020 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 722 Comm: kworker/0:3 Not tainted 5.2.0-0.rc3.git0.1.el7_UNSUPPORTED.x86_64 #2 Workqueue: events mbm_handle_overflow RIP: 0010:mbm_handle_overflow+0x150/0x2b0 Only enter the bandwith update loop if the system has local MBM enabled. Fixes: de73f38f7680 ("x86/intel_rdt/mba_sc: Feedback loop to dynamically update mem bandwidth") Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Reinette Chatre <reinette.chatre@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190610171544.13474-1-prarit@redhat.com
2019-06-12	x86/resctrl: Don't stop walking closids when a locksetup group is found	James Morse	1	-1/+6
	When a new control group is created __init_one_rdt_domain() walks all the other closids to calculate the sets of used and unused bits. If it discovers a pseudo_locksetup group, it breaks out of the loop. This means any later closid doesn't get its used bits added to used_b. These bits will then get set in unused_b, and added to the new control group's configuration, even if they were marked as exclusive for a later closid. When encountering a pseudo_locksetup group, we should continue. This is because "a resource group enters 'pseudo-locked' mode after the schemata is written while the resource group is in 'pseudo-locksetup' mode." When we find a pseudo_locksetup group, its configuration is expected to be overwritten, we can skip it. Fixes: dfe9674b04ff6 ("x86/intel_rdt: Enable entering of pseudo-locksetup mode") Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Reinette Chatre <reinette.chatre@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H Peter Avin <hpa@zytor.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20190603172531.178830-1-james.morse@arm.com
2019-06-10	x86/resctrl: Use _ASM_BX to avoid ifdeffery	Uros Bizjak	1	-5/+1
	Use the _ASM_BX macro which expands to either %rbx or %ebx, depending on the 32-bit or 64-bit config selected. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Reinette Chatre <reinette.chatre@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190606200044.5730-1-ubizjak@gmail.com
2019-06-08	Linux 5.2-rc4	Linus Torvalds	1	-1/+1

2019-06-08	x86/fpu: Update kernel's FPU state before using for the fsave header	Sebastian Andrzej Siewior	1	-0/+5
	In commit 39388e80f9b0c ("x86/fpu: Don't save fxregs for ia32 frames in copy_fpstate_to_sigframe()") I removed the statement \| if (ia32_fxstate) \| copy_fxregs_to_kernel(fpu); and argued that it was wrongly merged because the content was already saved in kernel's state. This was wrong: It is required to write it back because it is only saved on the user-stack and save_fsave_header() reads it from task's FPU-state. I missed that part… Save x87 FPU state unless thread's FPU registers are already up to date. Fixes: 39388e80f9b0c ("x86/fpu: Don't save fxregs for ia32 frames in copy_fpstate_to_sigframe()") Reported-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Eric Biggers <ebiggers@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: kvm ML <kvm@vger.kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190607142915.y52mfmgk5lvhll7n@linutronix.de
2019-06-08	MAINTAINERS: Karthikeyan Ramasubramanian is MIA	Wolfram Sang	1	-1/+0
	A mail just bounced back with "user unknown": 550 5.1.1 <kramasub@codeaurora.org> User doesn't exist I also couldn't find a more recent address in git history. So, remove this stale entry. Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2019-06-08	i2c: xiic: Add max_read_len quirk	Robert Hancock	1	-0/+5
	This driver does not support reading more than 255 bytes at once because the register for storing the number of bytes to read is only 8 bits. Add a max_read_len quirk to enforce this. This was found when using this driver with the SFP driver, which was previously reading all 256 bytes in the SFP EEPROM in one transaction. This caused a bunch of hard-to-debug errors in the xiic driver since the driver/logic was treating the number of bytes to read as zero. Rejecting transactions that aren't supported at least allows the problem to be diagnosed more easily. Signed-off-by: Robert Hancock <hancock@sedsystems.ca> Reviewed-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org
2019-06-07	x86/mm/KASLR: Compute the size of the vmemmap section properly	Baoquan He	1	-1/+10
	The size of the vmemmap section is hardcoded to 1 TB to support the maximum amount of system RAM in 4-level paging mode - 64 TB. However, 1 TB is not enough for vmemmap in 5-level paging mode. Assuming the size of struct page is 64 Bytes, to support 4 PB system RAM in 5-level, 64 TB of vmemmap area is needed: 4 * 1000^5 PB / 4096 bytes page size * 64 bytes per page struct / 1000^4 TB = 62.5 TB. This hardcoding may cause vmemmap to corrupt the following cpu_entry_area section, if KASLR puts vmemmap very close to it and the actual vmemmap size is bigger than 1 TB. So calculate the actual size of the vmemmap region needed and then align it up to 1 TB boundary. In 4-level paging mode it is always 1 TB. In 5-level it's adjusted on demand. The current code reserves 0.5 PB for vmemmap on 5-level. With this change, the space can be saved and thus used to increase entropy for the randomization. [ bp: Spell out how the 64 TB needed for vmemmap is computed and massage commit message. ] Fixes: eedb92abb9bb ("x86/mm: Make virtual memory layout dynamic for CONFIG_X86_5LEVEL=y") Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Kirill A. Shutemov <kirill@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: kirill.shutemov@linux.intel.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable <stable@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190523025744.3756-1-bhe@redhat.com
2019-06-07	lockref: Limit number of cmpxchg loop retries	Jan Glauber	1	-0/+3
	The lockref cmpxchg loop is unbound as long as the spinlock is not taken. Depending on the hardware implementation of compare-and-swap a high number of loop retries might happen. Add an upper bound to the loop to force the fallback to spinlocks after some time. A retry value of 100 should not impact any hardware that does not have this issue. With the retry limit the performance of an open-close testcase improved between 60-70% on ThunderX2. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jan Glauber <jglauber@marvell.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-07	uaccess: add noop untagged_addr definition	Andrey Konovalov	1	-0/+11
	Architectures that support memory tagging have a need to perform untagging (stripping the tag) in various parts of the kernel. This patch adds an untagged_addr() macro, which is defined as noop for architectures that do not support memory tagging. The oncoming patch series will define it at least for sparc64 and arm64. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-07	x86/insn-eval: Fix use-after-free access to LDT entry	Jann Horn	1	-23/+24
	get_desc() computes a pointer into the LDT while holding a lock that protects the LDT from being freed, but then drops the lock and returns the (now potentially dangling) pointer to its caller. Fix it by giving the caller a copy of the LDT entry instead. Fixes: 670f928ba09b ("x86/insn-eval: Add utility function to get segment descriptor") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-06-08	kbuild: use more portable 'command -v' for cc-cross-prefix	Masahiro Yamada	1	-1/+6
	To print the pathname that will be used by shell in the current environment, 'command -v' is a standardized way. [1] 'which' is also often used in scripts, but it is less portable. When I worked on commit bd55f96fa9fc ("kbuild: refactor cc-cross-prefix implementation"), I was eager to use 'command -v' but it did not work. (The reason is explained below.) I kept 'which' as before but got rid of '> /dev/null 2>&1' as I thought it was no longer needed. Sorry, I was wrong. It works well on my Ubuntu machine, but Alexey Brodkin reports noisy warnings on CentOS7 when 'which' fails to find the given command in the PATH environment. $ which foo which: no foo in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin) Given that behavior of 'which' depends on system (and it may not be installed by default), I want to try 'command -v' once again. The specification [1] clearly describes the behavior of 'command -v' when the given command is not found: Otherwise, no output shall be written and the exit status shall reflect that the name was not found. However, we need a little magic to use 'command -v' from Make. $(shell ...) passes the argument to a subshell for execution, and returns the standard output of the command. Here is a trick. GNU Make may optimize this by executing the command directly instead of forking a subshell, if no shell special characters are found in the command and omitting the subshell will not change the behavior. In this case, no shell special character is used. So, Make will try to run it directly. However, 'command' is a shell-builtin command, then Make would fail to find it in the PATH environment: $ make ARCH=m68k defconfig make: command: Command not found make: command: Command not found make: command: Command not found In fact, Make has a table of shell-builtin commands because it must ask the shell to execute them. Until recently, 'command' was missing in the table. This issue was fixed by the following commit: \| commit 1af314465e5dfe3e8baa839a32a72e83c04f26ef \| Author: Paul Smith <psmith@gnu.org> \| Date: Sun Nov 12 18:10:28 2017 -0500 \| \| * job.c: Add "command" as a known shell built-in. \| \| This is not a POSIX shell built-in but it's common in UNIX shells. \| Reported by Nick Bowler <nbowler@draconx.ca>. Because the latest release is GNU Make 4.2.1 in 2016, this commit is not included in any released versions. (But some distributions may have back-ported it.) We need to trick Make to spawn a subshell. There are various ways to do so: 1) Use a shell special character '~' as dummy $(shell : ~; command -v $(c)gcc) 2) Use a variable reference that always expands to the empty string (suggested by David Laight) $(shell command$${x:+} -v $(c)gcc) 3) Use redirect $(shell command -v $(c)gcc 2>/dev/null) I chose 3) to not confuse people. The stderr would not be polluted anyway, but it will provide extra safety, and is easy to understand. Tested on Make 3.81, 3.82, 4.0, 4.1, 4.2, 4.2.1 [1] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/command.html Fixes: bd55f96fa9fc ("kbuild: refactor cc-cross-prefix implementation") Cc: linux-stable <stable@vger.kernel.org> # 5.1 Reported-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Tested-by: Alexey Brodkin <abrodkin@synopsys.com>
2019-06-07	s390/unwind: correct stack switching during unwind	Vasily Gorbik	1	-1/+1
	Adjust conditions in on_stack function. That fixes backchain unwinder which was unable to read pt_regs at the very bottom of the stack and hence couldn't follow stacks (e.g. from async stack to a task stack). Fixes: 78c98f907413 ("s390/unwind: introduce stack unwind API") Reported-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-06-07	block, bfq: add weight symlink to the bfq.weight cgroup parameter	Angelo Ruocco	1	-2/+4
	Many userspace tools and services use the proportional-share policy of the blkio/io cgroups controller. The CFQ I/O scheduler implemented this policy for the legacy block layer. To modify the weight of a group in case CFQ was in charge, the 'weight' parameter of the group must be modified. On the other hand, the BFQ I/O scheduler implements the same policy in blk-mq, but, with BFQ, the parameter to modify has a different name: bfq.weight (forced choice until legacy block was present, because two different policies cannot share a common parameter in cgroups). Due to CFQ legacy, most if not all userspace configurations still use the parameter 'weight', and for the moment do not seem likely to be changed. But, when CFQ went away with legacy block, such a parameter ceased to exist. So, a simple workaround has been proposed [1] to make all configurations work: add a symlink, named weight, to bfq.weight. This commit adds such a symlink. [1] https://lkml.org/lkml/2019/4/8/555 Suggested-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Angelo Ruocco <angeloruocco90@gmail.com> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-07	cgroup: let a symlink too be created with a cftype file	Angelo Ruocco	2	-4/+32
	This commit enables a cftype to have a symlink (of any name) that points to the file associated with the cftype. Signed-off-by: Angelo Ruocco <angeloruocco90@gmail.com> Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-07	drm/nouveau/secboot/gp10[2467]: support newer FW to fix SEC2 failures on some boards	Ben Skeggs	5	-6/+18
	Some newer boards with these chipsets aren't compatible with the prior version of the SEC2 FW, and fail to load as a result. This newer FW is actually the one we already use on >=GP108. Unfortunately, there are interface differences in GP108's FW, making it impossible to simply move files around in linux-firmware to solve this. We need to be able to keep compatibility with all linux-firmware/kernel combinations, which means supporting both firmwares. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2019-06-07	drm/nouveau/secboot: enable loading of versioned LS PMU/SEC2 ACR msgqueue FW	Ben Skeggs	1	-14/+14
	Some chipsets will be switching to updated SEC2 LS firmware, so we need to plumb that through. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2019-06-07	drm/nouveau/secboot: split out FW version-specific LS function pointers	Ben Skeggs	6	-41/+141
	It's not enough to have per-falcon structures anymore, we have multiple versions of some firmware now that have interface differences. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2019-06-07	drm/nouveau/secboot: pass max supported FW version to LS load funcs	Ben Skeggs	6	-21/+32
	Will be passed to the FW loader function as an upper bound on the supported FW version to attempt to load. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2019-06-07	drm/nouveau/core: support versioned firmware loading	Ben Skeggs	2	-6/+31
	We have a need for this now with updated SEC2 LS FW images that have an incompatible interface from the previous version. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2019-06-07	drm/nouveau/core: pass subdev into nvkm_firmware_get, rather than device	Ben Skeggs	6	-18/+14
	It'd be nice to have FW loading debug messages to appear for the relevant subsystem, when enabled. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2019-06-06	block: free sched's request pool in blk_cleanup_queue	Ming Lei	6	-6/+52
	In theory, IO scheduler belongs to request queue, and the request pool of sched tags belongs to the request queue too. However, the current tags allocation interfaces are re-used for both driver tags and sched tags, and driver tags is definitely host wide, and doesn't belong to any request queue, same with its request pool. So we need tagset instance for freeing request of sched tags. Meantime, blk_mq_free_tag_set() often follows blk_cleanup_queue() in case of non-BLK_MQ_F_TAG_SHARED, this way requires that request pool of sched tags to be freed before calling blk_mq_free_tag_set(). Commit 47cdee29ef9d94e ("block: move blk_exit_queue into __blk_release_queue") moves blk_exit_queue into __blk_release_queue for simplying the fast path in generic_make_request(), then causes oops during freeing requests of sched tags in __blk_release_queue(). Fix the above issue by move freeing request pool of sched tags into blk_cleanup_queue(), this way is safe becasue queue has been frozen and no any in-queue requests at that time. Freeing sched tags has to be kept in queue's release handler becasue there might be un-completed dispatch activity which might refer to sched tags. Cc: Bart Van Assche <bvanassche@acm.org> Cc: Christoph Hellwig <hch@lst.de> Fixes: 47cdee29ef9d94e485eb08f962c74943023a5271 ("block: move blk_exit_queue into __blk_release_queue") Tested-by: Yi Zhang <yi.zhang@redhat.com> Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-06	pktgen: do not sleep with the thread lock held.	Paolo Abeni	1	-0/+11
	Currently, the process issuing a "start" command on the pktgen procfs interface, acquires the pktgen thread lock and never release it, until all pktgen threads are completed. The above can blocks indefinitely any other pktgen command and any (even unrelated) netdevice removal - as the pktgen netdev notifier acquires the same lock. The issue is demonstrated by the following script, reported by Matteo: ip -b - <<'EOF' link add type dummy link add type veth link set dummy0 up EOF modprobe pktgen echo reset >/proc/net/pktgen/pgctrl { echo rem_device_all echo add_device dummy0 } >/proc/net/pktgen/kpktgend_0 echo count 0 >/proc/net/pktgen/dummy0 echo start >/proc/net/pktgen/pgctrl & sleep 1 rmmod veth Fix the above releasing the thread lock around the sleep call. Additionally we must prevent racing with forcefull rmmod - as the thread lock no more protects from them. Instead, acquire a self-reference before waiting for any thread. As a side effect, running rmmod pktgen while some thread is running now fails with "module in use" error, before this patch such command hanged indefinitely. Note: the issue predates the commit reported in the fixes tag, but this fix can't be applied before the mentioned commit. v1 -> v2: - no need to check for thread existence after flipping the lock, pktgen threads are freed only at net exit time - Fixes: 6146e6a43b35 ("[PKTGEN]: Removes thread_{un,}lock() macros.") Reported-and-tested-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06	net: mvpp2: Use strscpy to handle stat strings	Maxime Chevallier	1	-2/+2
	Use a safe strscpy call to copy the ethtool stat strings into the relevant buffers, instead of a memcpy that will be accessing out-of-bound data. Fixes: 118d6298f6f0 ("net: mvpp2: add ethtool GOP statistics") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06	net: rds: fix memory leak in rds_ib_flush_mr_pool	Zhu Yanjun	1	-4/+6
	When the following tests last for several hours, the problem will occur. Server: rds-stress -r 1.1.1.16 -D 1M Client: rds-stress -r 1.1.1.14 -s 1.1.1.16 -D 1M -T 30 The following will occur. " Starting up.... tsks tx/s rx/s tx+rx K/s mbi K/s mbo K/s tx us/c rtt us cpu % 1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00 1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00 1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00 1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00 " >From vmcore, we can find that clean_list is NULL. >From the source code, rds_mr_flushd calls rds_ib_mr_pool_flush_worker. Then rds_ib_mr_pool_flush_worker calls " rds_ib_flush_mr_pool(pool, 0, NULL); " Then in function " int rds_ib_flush_mr_pool(struct rds_ib_mr_pool pool, int free_all, struct rds_ib_mr ibmr_ret) " ibmr_ret is NULL. In the source code, " ... list_to_llist_nodes(pool, &unmap_list, &clean_nodes, &clean_tail); if (ibmr_ret) ibmr_ret = llist_entry(clean_nodes, struct rds_ib_mr, llnode); /* more than one entry in llist nodes */ if (clean_nodes->next) llist_add_batch(clean_nodes->next, clean_tail, &pool->clean_list); ... " When ibmr_ret is NULL, llist_entry is not executed. clean_nodes->next instead of clean_nodes is added in clean_list. So clean_nodes is discarded. It can not be used again. The workqueue is executed periodically. So more and more clean_nodes are discarded. Finally the clean_list is NULL. Then this problem will occur. Fixes: 1bc144b62524 ("net, rds, Replace xlist in net/rds/xlist.h with llist") Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06	ipv6: fix EFAULT on sendto with icmpv6 and hdrincl	Olivier Matz	1	-5/+8
	The following code returns EFAULT (Bad address): s = socket(AF_INET6, SOCK_RAW, IPPROTO_ICMPV6); setsockopt(s, SOL_IPV6, IPV6_HDRINCL, 1); sendto(ipv6_icmp6_packet, addr); /* returns -1, errno = EFAULT */ The IPv4 equivalent code works. A workaround is to use IPPROTO_RAW instead of IPPROTO_ICMPV6. The failure happens because 2 bytes are eaten from the msghdr by rawv6_probe_proto_opt() starting from commit 19e3c66b52ca ("ipv6 equivalent of "ipv4: Avoid reading user iov twice after raw_probe_proto_opt""), but at that time it was not a problem because IPV6_HDRINCL was not yet introduced. Only eat these 2 bytes if hdrincl == 0. Fixes: 715f504b1189 ("ipv6: add IPV6_HDRINCL option for raw sockets") Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06	ipv6: use READ_ONCE() for inet->hdrincl as in ipv4	Olivier Matz	1	-2/+10
	As it was done in commit 8f659a03a0ba ("net: ipv4: fix for a race condition in raw_sendmsg") and commit 20b50d79974e ("net: ipv4: emulate READ_ONCE() on ->hdrincl bit-field in raw_sendmsg()") for ipv4, copy the value of inet->hdrincl in a local variable, to avoid introducing a race condition in the next commit. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-06	x86/fpu: Use fault_in_pages_writeable() for pre-faulting	Hugh Dickins	1	-9/+2
	Since commit d9c9ce34ed5c8 ("x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() fails") get_user_pages_unlocked() pre-faults user's memory if a write generates a page fault while the handler is disabled. This works in general and uncovered a bug as reported by Mike Rapoport¹. It has been pointed out that this function may be fragile and a simple pre-fault as in fault_in_pages_writeable() would be a better solution. Better as in taste and simplicity: that write (as performed by the alternative function) performs exactly the same faulting of memory as before. This was suggested by Hugh Dickins and Andrew Morton. Use fault_in_pages_writeable() for pre-faulting user's stack. [ bigeasy: Write commit message. ] [ bp: Massage some. ] ¹ https://lkml.kernel.org/r/1557844195-18882-1-git-send-email-rppt@linux.ibm.com Fixes: d9c9ce34ed5c8 ("x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() fails") Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: linux-mm <linux-mm@kvack.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190529072540.g46j4kfeae37a3iu@linutronix.de Link: https://lkml.kernel.org/r/1557844195-18882-1-git-send-email-rppt@linux.ibm.com
2019-06-06	nvme-rdma: use dynamic dma mapping per command	Max Gurtovoy	1	-17/+36
	Commit 87fd125344d6 ("nvme-rdma: remove redundant reference between ib_device and tagset") caused a kernel panic when disconnecting from an inaccessible controller (disconnect during re-connection). -- nvme nvme0: Removing ctrl: NQN "testnqn1" nvme_rdma: nvme_rdma_exit_request: hctx 0 queue_idx 1 BUG: unable to handle kernel paging request at 0000000080000228 PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI ... Call Trace: blk_mq_exit_hctx+0x5c/0xf0 blk_mq_exit_queue+0xd4/0x100 blk_cleanup_queue+0x9a/0xc0 nvme_rdma_destroy_io_queues+0x52/0x60 [nvme_rdma] nvme_rdma_shutdown_ctrl+0x3e/0x80 [nvme_rdma] nvme_do_delete_ctrl+0x53/0x80 [nvme_core] nvme_sysfs_delete+0x45/0x60 [nvme_core] kernfs_fop_write+0x105/0x180 vfs_write+0xad/0x1a0 ksys_write+0x5a/0xd0 do_syscall_64+0x55/0x110 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fa215417154 -- The reason for this crash is accessing an already freed ib_device for performing dma_unmap during exit_request commands. The root cause for that is that during re-connection all the queues are destroyed and re-created (and the ib_device is reference counted by the queues and freed as well) but the tagset stays alive and all the DMA mappings (that we perform in init_request) kept in the request context. The original commit fixed a different bug that was introduced during bonding (aka nic teaming) tests that for some scenarios change the underlying ib_device and caused memory leakage and possible segmentation fault. This commit is a complementary commit that also changes the wrong DMA mappings that were saved in the request context and making the request sqe dma mappings dynamic with the command lifetime (i.e. mapped in .queue_rq and unmapped in .complete). It also fixes the above crash of accessing freed ib_device during destruction of the tagset. Fixes: 87fd125344d6 ("nvme-rdma: remove redundant reference between ib_device and tagset") Reported-by: Jim Harris <james.r.harris@intel.com> Suggested-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-06-06	nvme: Fix u32 overflow in the number of namespace list calculation	Jaesoo Lee	1	-1/+2
	The Number of Namespaces (nn) field in the identify controller data structure is defined as u32 and the maximum allowed value in NVMe specification is 0xFFFFFFFEUL. This change fixes the possible overflow of the DIV_ROUND_UP() operation used in nvme_scan_ns_list() by casting the nn to u64. Signed-off-by: Jaesoo Lee <jalee@purestorage.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-06-06	Revert "gfs2: Replace gl_revokes with a GLF flag"	Bob Peterson	6	-31/+15
	Commit 73118ca8baf7 introduced a glock reference counting bug in gfs2_trans_remove_revoke. Given that, replacing gl_revokes with a GLF flag is no longer useful, so revert that commit. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2019-06-06	arm64: Silence gcc warnings about arch ABI drift	Dave Martin	1	-0/+1
	Since GCC 9, the compiler warns about evolution of the platform-specific ABI, in particular relating for the marshaling of certain structures involving bitfields. The kernel is a standalone binary, and of course nobody would be so stupid as to expose structs containing bitfields as function arguments in ABI. (Passing a pointer to such a struct, however inadvisable, should be unaffected by this change. perf and various drivers rely on that.) So these warnings do more harm than good: turn them off. We may miss warnings about future ABI drift, but that's too bad. Future ABI breaks of this class will have to be debugged and fixed the traditional way unless the compiler evolves finer-grained diagnostics. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-06-06	parisc: Fix crash due alternative coding for NP iopdir_fdc bit	Helge Deller	1	-1/+2
	According to the found documentation, data cache flushes and sync instructions are needed on the PCX-U+ (PA8200, e.g. C200/C240) platforms, while PCX-W (PA8500, e.g. C360) platforms aparently don't need those flushes when changing the IO PDIR data structures. We have no documentation for PCX-W+ (PA8600) and PCX-W2 (PA8700) CPUs, but Carlo Pisani reported that his C3600 machine (PA8600, PCX-W+) fails when the fdc instructions were removed. His firmware didn't set the NIOP bit, so one may assume it's a firmware bug since other C3750 machines had the bit set. Even if documentation (as mentioned above) states that PCX-W (PA8500, e.g. J5000) does not need fdc flushes, Sven could show that an Adaptec 29320A PCI-X SCSI controller reliably failed on a dd command during the first five minutes in his J5000 when fdc flushes were missing. Going forward, we will now NOT replace the fdc and sync assembler instructions by NOPS if: a) the NP iopdir_fdc bit was set by firmware, or b) we find a CPU up to and including a PCX-W+ (PA8600). This fixes the HPMC crashes on a C240 and C36XX machines. For other machines we rely on the firmware to set the bit when needed. In case one finds HPMC issues, people could try to boot their machines with the "no-alternatives" kernel option to turn off any alternative patching. Reported-by: Sven Schnelle <svens@stackframe.org> Reported-by: Carlo Pisani <carlojpisani@gmail.com> Tested-by: Sven Schnelle <svens@stackframe.org> Fixes: 3847dab77421 ("parisc: Add alternative coding infrastructure") Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # 5.0+
2019-06-06	parisc: Use lpa instruction to load physical addresses in driver code	John David Anglin	3	-2/+26
	Most I/O in the kernel is done using the kernel offset mapping. However, there is one API that uses aliased kernel address ranges: > The final category of APIs is for I/O to deliberately aliased address > ranges inside the kernel. Such aliases are set up by use of the > vmap/vmalloc API. Since kernel I/O goes via physical pages, the I/O > subsystem assumes that the user mapping and kernel offset mapping are > the only aliases. This isn't true for vmap aliases, so anything in > the kernel trying to do I/O to vmap areas must manually manage > coherency. It must do this by flushing the vmap range before doing > I/O and invalidating it after the I/O returns. For this reason, we should use the hardware lpa instruction to load the physical address of kernel virtual addresses in the driver code. I believe we only use the vmap/vmalloc API with old PA 1.x processors which don't have a sba, so we don't hit this problem. Tested on c3750, c8000 and rp3440. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2019-06-06	parisc: configs: Remove useless UEVENT_HELPER_PATH	Krzysztof Kozlowski	7	-7/+0
	Remove the CONFIG_UEVENT_HELPER_PATH because: 1. It is disabled since commit 1be01d4a5714 ("driver: base: Disable CONFIG_UEVENT_HELPER by default") as its dependency (UEVENT_HELPER) was made default to 'n', 2. It is not recommended (help message: "This should not be used today [...] creates a high system load") and was kept only for ancient userland, 3. Certain userland specifically requests it to be disabled (systemd README: "Legacy hotplug slows down the system and confuses udev"). Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Helge Deller <deller@gmx.de>
2019-06-06	parisc: Use implicit space register selection for loading the coherence index of I/O pdirs	John David Anglin	2	-5/+2
	We only support I/O to kernel space. Using %sr1 to load the coherence index may be racy unless interrupts are disabled. This patch changes the code used to load the coherence index to use implicit space register selection. This saves one instruction and eliminates the race. Tested on rp3440, c8000 and c3750. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org Signed-off-by: Helge Deller <deller@gmx.de>
2019-06-06	ARM64: trivial: s/TIF_SECOMP/TIF_SECCOMP/ comment typo fix	George G. Davis	1	-1/+1
	Fix a s/TIF_SECOMP/TIF_SECCOMP/ comment typo Cc: Jiri Kosina <trivial@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org Signed-off-by: George G. Davis <george_davis@mentor.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-06-06	drm/komeda: Potential error pointer dereference	Dan Carpenter	1	-1/+1
	We need to check whether drm_atomic_get_crtc_state() returns an error pointer before dereferencing "crtc_st". Fixes: 9e5603094176 ("drm/komeda: Add komeda_plane/plane_helper_funcs") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: "james qian wang (Arm Technology China)" <james.qian.wang@arm.com> Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
2019-06-06	drm/komeda: remove set but not used variable 'kcrtc'	YueHaibing	1	-2/+0
	Fixes gcc '-Wunused-but-set-variable' warning: drivers/gpu/drm/arm/display/komeda/komeda_plane.c: In function komeda_plane_atomic_check: drivers/gpu/drm/arm/display/komeda/komeda_plane.c:49:22: warning: variable kcrtc set but not used [-Wunused-but-set-variable] It is never used since introduction in commit 9e5603094176 ("drm/komeda: Add komeda_plane/plane_helper_funcs") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: James Qian Wang (Arm Technology China) <james.qian.wang@arm.com> Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
2019-06-06	x86/CPU: Add more Icelake model numbers	Kan Liang	1	-0/+3
	Add the CPUID model numbers of Icelake (ICL) desktop and server processors to the Intel family list. [ Qiuxu: Sort the macros by model number. ] Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Rajneesh Bhardwaj <rajneesh.bhardwaj@linux.intel.com> Cc: rui.zhang@intel.com Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190603134122.13853-1-kan.liang@linux.intel.com
2019-06-05	hwmon: (pmbus/core) Treat parameters as paged if on multiple pages	Robert Hancock	1	-4/+30
	Some chips have attributes which exist on more than one page but the attribute is not presently marked as paged. This causes the attributes to be generated with the same label, which makes it impossible for userspace to tell them apart. Marking all such attributes as paged would result in the page suffix being added regardless of whether they were present on more than one page or not, which might break existing setups. Therefore, we add a second check which treats the attribute as paged, even if not marked as such, if it is present on multiple pages. Fixes: b4ce237b7f7d ("hwmon: (pmbus) Introduce infrastructure to detect sensors and limit registers") Signed-off-by: Robert Hancock <hancock@sedsystems.ca> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2019-06-05	hwmon: (pmbus/core) mutex_lock write in pmbus_set_samples	Adamski, Krzysztof (Nokia - PL/Wroclaw)	1	-0/+3
	update_lock is a mutex intended to protect write operations. It was not taken, however, when _pmbus_write_word_data is called from pmbus_set_samples() function which may cause problems especially when some PMBUS_VIRT_* operation is implemented as a read-modify-write cycle. This patch makes sure the lock is held during the operation. Fixes: 49c4455dccf2 ("hwmon: (pmbus) Introduce PMBUS_VIRT_*_SAMPLES registers") Signed-off-by: Krzysztof Adamski <krzysztof.adamski@nokia.com> Reviewed-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> [groeck: Declared and initialized missing 'data' variable] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2019-06-05	hwmon: (core) add thermal sensors only if dev->of_node is present	Eduardo Valentin	1	-1/+1
	Drivers may register to hwmon and request for also registering with the thermal subsystem (HWMON_C_REGISTER_TZ). However, some of these driver, e.g. marvell phy, may be probed from Device Tree or being dynamically allocated, and in the later case, it will not have a dev->of_node entry. Registering with hwmon without the dev->of_node may result in different outcomes depending on the device tree, which may be a bit misleading. If the device tree blob has no 'thermal-zones' node, the hwmon_device_register() family functions are going to gracefully succeed, because of-thermal, thermal_zone_of_sensor_register() return -ENODEV in this case, and the hwmon error path handles this error code as success to cover for the case where CONFIG_THERMAL_OF is not set. However, if the device tree blob has the 'thermal-zones' entry, the hwmon_device_register() will always fail on callers with no dev->of_node, propagating -EINVAL. If dev->of_node is not present, calling of-thermal does not make sense. For this reason, this patch checks first if the device has a of_node before going over the process of registering with the thermal subsystem of-thermal interface. And in this case, when a caller of hwmon_device_register*() with HWMON_C_REGISTER_TZ and no dev->of_node will still register with hwmon, but not with the thermal subsystem. If all the hwmon part bits are in place, the registration will succeed. Fixes: d560168b5d0f ("hwmon: (core) New hwmon registration API") Cc: Jean Delvare <jdelvare@suse.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: linux-hwmon@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eduardo Valentin <eduval@amazon.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2019-06-05	Revert "fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied"	Hangbin Liu	1	-3/+3
	This reverts commit e9919a24d3022f72bcadc407e73a6ef17093a849. Nathan reported the new behaviour breaks Android, as Android just add new rules and delete old ones. If we return 0 without adding dup rules, Android will remove the new added rules and causing system to soft-reboot. Fixes: e9919a24d302 ("fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied") Reported-by: Nathan Chancellor <natechancellor@gmail.com> Reported-by: Yaro Slav <yaro330@gmail.com> Reported-by: Maciej Żenczykowski <zenczykowski@gmail.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>