path: root/crypto/internal.h
2019-11-30  s390/unwind: add stack pointer alignment sanity checks  (Miroslav Benes; 2 files, -0/+8)
ABI requires SP to be aligned 8 bytes, report unwinding error otherwise. Link: https://lkml.kernel.org/r/20191106095601.29986-5-mbenes@suse.cz Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
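A minimal sketch of the kind of check this adds (field names follow the unwinder's usual state tracking and are not the literal patch):

    /* s390 ABI: the stack pointer must be 8-byte aligned */
    if (sp & 0x7) {
        state->error = true;    /* report an unwinding error ...      */
        return false;           /* ... instead of continuing blindly  */
    }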
2019-11-30  s390/unwind: filter out unreliable bogus %r14  (Vasily Gorbik; 1 file, -0/+5)
Currently the unwinder unconditionally returns %r14 from the first frame pointed to by %r15 from pt_regs. A task could be interrupted when a function has already allocated this frame (if it needs it) for its callees or to store local variables. In that case this frame would contain random values from the stack or values stored there by a callee. As we are only interested in %r14 to get a potential return address, skip bogus return addresses which don't belong to kernel text. This helps to avoid duplicating the filtering logic in unwinder users, most of which use unwind_get_return_address() and would otherwise choke on the bogus 0 address returned by it. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
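A sketch of the filtering idea, assuming the usual struct stack_frame layout where gprs[8] holds the saved %r14 (not the literal patch):

    unsigned long ip = READ_ONCE_NOCHECK(sf->gprs[8]);  /* saved %r14 */

    if (!__kernel_text_address(ip))
        ip = 0;  /* value from an unused frame slot: report nothing */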
2019-11-30  s390/unwind: start unwinding from reliable state  (Vasily Gorbik; 2 files, -17/+31)
A comment in arch/s390/include/asm/unwind.h says:

> If 'first_frame' is not zero unwind_start skips unwind frames until it
> reaches the specified stack pointer.
> The end of the unwinding is indicated with unwind_done, this can be true
> right after unwind_start, e.g. with first_frame!=0 that can not be found.
> unwind_next_frame skips to the next frame.
> Once the unwind is completed unwind_error() can be used to check if there
> has been a situation where the unwinder could not correctly understand
> the tasks call chain.

With this change the backchain unwinder now complies with the behaviour described above and matches the orc unwinder implementation. The unwinder now starts from a reliable state, i.e. either __unwind_start's own stack frame or the stack frame generated by __switch_to (ksp) is taken - both known to be valid. In the case of pt_regs, %r15 is a better match for the pt_regs psw than the sometimes random "sp" passed by the caller. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
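For reference, a typical consumer of this API looks roughly like the following sketch:

    struct unwind_state state;
    unsigned long addr;

    unwind_for_each_frame(&state, task, regs, first_frame) {
        addr = unwind_get_return_address(&state);
        if (!addr)
            break;
        printk("%pS\n", (void *)addr);
    }
    if (unwind_error(&state))
        printk("unwinder could not follow the call chain reliably\n");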
2019-11-30  s390/test_unwind: add program check context tests  (Vasily Gorbik; 1 file, -0/+47)
Add unwinding from program check handler tests. Unwinder should be able to unwind through pt_regs stored by program check handler on task stack. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390/test_unwind: add irq context tests  (Vasily Gorbik; 1 file, -0/+45)
Add unwinding from irq context tests. Unwinder should be able to unwind through irq stack to task stack up to task pt_regs. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390/test_unwind: print verbose unwinding results  (Vasily Gorbik; 2 files, -2/+11)
Add stack name, sp and reliable information into test unwinding results. Also consider ip outside of kernel text as failure if the state is reported reliable. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390/test_unwind: add CALL_ON_STACK tests  (Vasily Gorbik; 1 file, -7/+19)
Add CALL_ON_STACK helper testing. Tests make sure that we can unwind from switched stack to original one up to task pt_regs (nodat -> task stack). UWM_SWITCH_STACK could not be used together with UWM_THREAD because get_stack_info explicitly restricts unwinding to task stack if task != current. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390: fix register clobbering in CALL_ON_STACK  (Vasily Gorbik; 1 file, -2/+2)
CALL_ON_STACK defines and initializes register variables. Inline assembly which follows might trigger compiler to generate memory access for "stack" argument (e.g. in case of S390_lowcore.nodat_stack). This memory access produces a function call under kasan with outline instrumentation which clobbers registers. Switch "stack" argument in CALL_ON_STACK helper to use memory reference constraint and perform load instead. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390/test_unwind: require that unwinding ended successfully  (Vasily Gorbik; 1 file, -0/+4)
Currently unwinder test passes if unwinding results contain unwindme_func2 and unwindme_func1 functions. Now that unwinder reports success upon reaching task pt_regs, check that unwinding ended successfully in every test. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390/unwind: add a test for the internal API  (Ilya Leoshkevich; 3 files, -0/+248)
unwind_for_each_frame can take at least 8 different sets of parameters. Add a test to make sure they all are handled in a sane way. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Co-developed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390/unwind: always inline get_stack_pointer  (Vasily Gorbik; 1 file, -2/+2)
Always inline get_stack_pointer() to avoid potential problems due to compiler inlining decisions, i.e. getting stack pointer of get_stack_pointer() itself which is later reused. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
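A sketch of what the forced inlining looks like (the body is abbreviated from the s390 helper and partly assumed):

    static __always_inline unsigned long
    get_stack_pointer(struct task_struct *task, struct pt_regs *regs)
    {
        if (regs)
            return (unsigned long)kernel_stack_pointer(regs);
        if (task == current)
            return current_stack_pointer();
        return (unsigned long)task->thread.ksp;
    }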
2019-11-30  s390/pci: add error message on device number limit  (Niklas Schnelle; 1 file, -0/+2)
The config option CONFIG_PCI_NR_FUNCTIONS sets a limit on the number of PCI functions we can support. Previously on reaching this limit there was no indication why newly attached devices are not recognized by Linux which could be quite confusing. Thus this patch adds a pr_err() for this case. Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390/pci: add error message for UID collision  (Niklas Schnelle; 1 file, -0/+3)
When UID checking was turned off during runtime in the underlying hypervisor, a PCI device may be attached with the same UID. This is already detected but happens silently. Add an error message so it can more easily be understood why a device was not added. Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390/cpum_sf: Check for SDBT and SDB consistency  (Thomas Richter; 1 file, -2/+15)
Each SDBT is located at a 4KB page and contains 512 entries. Each entry of a SDBT points to a SDB, a 4KB page containing sampled data. The last entry is a link to another SDBT page. When an event is created the function sequence executed is:

    __hw_perf_event_init()
    +--> allocate_buffers()
         +--> realloc_sampling_buffers()
              +---> alloc_sample_data_block()

Both functions realloc_sampling_buffers() and alloc_sample_data_block() allocate pages and the allocation can fail. This is handled correctly and all allocated pages are freed and error -ENOMEM is returned to the top calling function. Finally the event is not created. Once the event has been created, the amount of initially allocated SDBTs and SDBs can be too low. This is detected during measurement interrupt handling, where the amount of lost samples is calculated. If the number of lost samples is too high considering sampling frequency and already allocated SDBs, the number of SDBs is enlarged during the next execution of cpumsf_pmu_enable(). If more SDBs need to be allocated, the functions

    realloc_sampling_buffers()
    +---> alloc_sample_data_block()

are called to allocate more pages. Page allocation may fail and the returned error is ignored. A SDBT and SDB setup already exists. However the modified SDBTs and SDBs might end up in a situation where the first entry of an SDBT does not point to an SDB, but to another SDBT - basically an SDBT without payload. This cannot be handled by the interrupt handler, where an SDBT must have at least one entry pointing to an SDB. Add a check to avoid SDBTs without payload (SDBs) when enlarging the buffer setup. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
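To illustrate what is being checked: a 4KB SDBT holds 512 eight-byte entries, the last one being a link entry, and the added consistency check amounts to something like this sketch (helper and variable names assumed):

    /* PAGE_SIZE / sizeof(unsigned long) = 4096 / 8 = 512 entries per SDBT */
    unsigned long *tail = sfb->sdbt;    /* first entry of the enlarged SDBT */

    if (is_link_entry(tail))
        return -EINVAL;    /* SDBT without any SDB payload: refuse it */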
2019-11-30  s390/cpum_sf: Use TEAR_REG macro consistently  (Thomas Richter; 1 file, -8/+1)
The macro TEAR_REG() saves the last used SDBT address in the perf_hw_event structure. This is also done by function hw_reset_registers() which is a one-liner and simply uses macro TEAR_REG(). Remove function hw_reset_registers(), which is only used one time and use macro TEAR_REG() instead. This macro is used throughout the code anyway. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390/cpum_sf: Remove unnecessary check for pending SDBs  (Thomas Richter; 1 file, -2/+1)
In interrupt handling the function extend_sampling_buffer() is called after checking for a possible extension. This check is not necessary as the called function itself performs this check again. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390/cpum_sf: Replace function name in debug statements  (Thomas Richter; 2 files, -52/+57)
Replace hard coded function names in debug statements by the "%s ...", __func__ construct suggested by the checkpatch.pl script. Use a consistent debug print format of the form variable blank value. Also add a leading 0x for all hex values. Print allocated page addresses consistently as hex numbers with leading 0x. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390/kaslr: store KASLR offset for early dumps  (Gerald Schaefer; 2 files, -1/+6)
The KASLR offset is added to vmcoreinfo in arch_crash_save_vmcoreinfo(), so that it can be found by crash when processing kernel dumps. However, arch_crash_save_vmcoreinfo() is called during a subsys_initcall, so if the kernel crashes before that, we have no vmcoreinfo and no KASLR offset. Fix this by storing the KASLR offset in the lowcore, where the vmcore_info pointer will be stored, and where it can be found by crash. In order to make it distinguishable from a real vmcore_info pointer, mark it as uneven (KASLR offset itself is aligned to THREAD_SIZE). When arch_crash_save_vmcoreinfo() stores the real vmcore_info pointer in the lowcore, it overwrites the KASLR offset. At that point, the KASLR offset is not yet added to vmcoreinfo, so we also need to move the mem_assign_absolute() behind the vmcoreinfo_append_str(). Fixes: b2d24b97b2a9 ("s390/kernel: add support for kernel address space layout randomization (KASLR)") Cc: <stable@vger.kernel.org> # v5.2+ Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
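A sketch of the idea (helper names assumed, not the verbatim patch): during early setup, park the KASLR offset in the lowcore slot that will later hold the vmcore_info pointer, with the low bit set so it cannot be mistaken for a real pointer (the offset itself is THREAD_SIZE aligned, hence even):

    if (kaslr_enabled())
        mem_assign_absolute(S390_lowcore.vmcore_info,
                            kaslr_offset() | 0x1UL);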
2019-11-30  s390/unwind: stop gracefully at task pt_regs  (Vasily Gorbik; 1 file, -1/+7)
Consider reaching task pt_regs a graceful unwinder termination. Task pt_regs itself never contains a valid state to which a task might return within the kernel context (user task pt_regs is a special case). Since we already avoid printing user task pt_regs, and in most cases we don't even bother filling task pt_regs psw and r15 with something reasonable, simply skip task pt_regs altogether. With this change unwind_error() now accurately represents whether the unwinder reached task pt_regs successfully or failed along the way. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390/head64: correct init_task stack setup  (Vasily Gorbik; 1 file, -1/+1)
Add missing allocation of pt_regs at the bottom of the stack. This makes it consistent with other stack setup cases and also what stack unwinder expects. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390/unwind: make reuse_sp default when unwinding pt_regs  (Vasily Gorbik; 2 files, -15/+7)
Currently the unwinder yields 2 entries when pt_regs are met:

    sp="address of pt_regs itself"  ip=pt_regs->psw
    sp=pt_regs->gprs[15]            ip="r14 from stack frame pointed by pt_regs->gprs[15]"

Neither of those 2 states (combinations of sp and ip) ever actually happened. reuse_sp has been introduced by commit a1d863ac3e10 ("s390/unwind: fix mixing regs and sp"). reuse_sp=true makes the unwinder produce the following result when pt_regs are given (as an arg to unwind_start):

    sp=pt_regs->gprs[15]  ip=pt_regs->psw
    sp=pt_regs->gprs[15]  ip="r14 from stack frame pointed by pt_regs->gprs[15]"

The first state is the actual state the task was in when pt_regs were collected. The second state is marked unreliable and is for debugging purposes, to cover the case when a task has been interrupted in between stack frame allocation and writing back_chain - in this case r14 might show the actual caller. Make the unwinder behaviour enabled via reuse_sp=true the default and drop the special case handling. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390/unwind: report an error if pt_regs are not on stack  (Vasily Gorbik; 1 file, -1/+1)
If unwinder is looking at pt_regs which is not on stack then something went wrong and an error has to be reported rather than successful unwinding termination. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390: avoid misusing CALL_ON_STACK for task stack setup  (Vasily Gorbik; 3 files, -9/+13)
CALL_ON_STACK is intended to be used for temporary stack switching with a potential return to the caller. When CALL_ON_STACK is misused to switch from the nodat stack to the task stack, the back_chain information would later lead the stack unwinder from the task stack into the (per cpu) nodat stack, which is reused for other purposes. This would yield confusing unwinding results or errors. To avoid that, introduce CALL_ON_STACK_NORETURN to be used instead. It makes sure that back_chain is zeroed and the unwinder finishes gracefully, ending up at task pt_regs. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390: correct CALL_ON_STACK back_chain saving  (Vasily Gorbik; 1 file, -1/+14)
Currently CALL_ON_STACK saves r15 as back_chain in the first stack frame of the stack we are about to switch to. But if a function which uses CALL_ON_STACK calls another function, it allocates a stack frame for the callee. In this case r15 points to the callee's stack frame and not to the stack frame of the function itself. This results in a dummy unwinding entry with random sp and ip values. Introduce and utilize the current_frame_address() macro to get the address of the actual function stack frame. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
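The helper introduced here boils down to something like this sketch (the exact offset handling in the real header may differ):

    /* address of the current function's own stack frame, rather than the
     * frame pre-allocated for callees that %r15 currently points to */
    #define current_frame_address()                                   \
        ((unsigned long)__builtin_frame_address(0) -                  \
         offsetof(struct stack_frame, back_chain))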
2019-11-30  s390/unwind: unify task is current checks  (Vasily Gorbik; 3 files, -6/+3)
Avoid mixture of task == NULL and task == current meaning the same thing and simply always initialize task with current in unwind_start. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390: disable preemption when switching to nodat stack with CALL_ON_STACK  (Vasily Gorbik; 2 files, -3/+11)
Make sure preemption is disabled when temporarily switching to the nodat stack with the CALL_ON_STACK helper, because the nodat stack is per cpu. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390: always inline disabled_wait  (Vasily Gorbik; 1 file, -1/+1)
disabled_wait uses _THIS_IP_ and assumes that compiler would inline it. Make sure this assumption is always correct by utilizing __always_inline. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390/vdso: fix getcpu  (Heiko Carstens; 4 files, -10/+14)
getcpu reads the required values for cpu and node with two instructions. This might lead to an inconsistent result if user space gets preempted and migrated to a different CPU between the two instructions. Fix this by using just a single instruction to read both values at once. This is currently rather a theoretical bug, since there is no real NUMA support available (except for NUMA emulation). Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390/smp,vdso: fix ASCE handling  (Heiko Carstens; 1 file, -0/+5)
When a secondary CPU is brought up it must initialize its control registers. CPU A which triggers that a secondary CPU B is brought up stores its control register contents into the lowcore of new CPU B, which then loads these values on startup. This is problematic in various ways: the control register which contains the home space ASCE will correctly contain the kernel ASCE; however control registers for primary and secondary ASCEs are initialized with whatever values were present in CPU A. Typically:

- the primary ASCE will contain the user process ASCE of the process that triggered onlining of CPU B.
- the secondary ASCE will contain the percpu VDSO ASCE of CPU A.

Due to lazy ASCE handling we may also end up with other combinations. When CPU B then switches to a different process (!= idle) it will fixup the primary ASCE. However the problem is that the (wrong) ASCE from CPU A was loaded into control register 1: as soon as an ASCE is attached (aka loaded) a CPU is free to generate TLB entries using that address space. Even though it is very unlikely that CPU B will actually generate such entries, this could result in TLB entries of the address space of the process that ran on CPU A. These entries shouldn't exist at all and could cause problems later on. Furthermore the secondary ASCE of CPU B will not be updated correctly. This means that processes may see wrong results or even crash if they access VDSO data on CPU B. The correct VDSO ASCE will eventually be loaded on return to user space as soon as the kernel executes a call to strnlen_user or an atomic futex operation on CPU B. Fix both issues by initializing the to-be-loaded control register contents with the correct ASCEs and also enforce (re-)loading of the ASCEs upon first context switch and return to user space. Fixes: 0aaba41b58bc ("s390: remove all code using the access register mode") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390/zcrypt: handle new reply code FILTERED_BY_HYPERVISOR  (Harald Freudenberger; 1 file, -0/+2)
This patch introduces support for a new architectured reply code 0x8B indicating that a hypervisor layer (if any) has rejected an ap message. Linux may run as a guest on top of a hypervisor like zVM or KVM, so the crypto hardware seen by the ap bus may be restricted by the hypervisor; for example, only a subset like clear key crypto requests may be supported. Other requests will be filtered out - rejected by the hypervisor. The new reply code 0x8B will appear in such cases and needs to get recognized by the ap bus and zcrypt device driver zoo. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-30  s390: implement perf_arch_fetch_caller_regs  (Ilya Leoshkevich; 1 file, -0/+7)
On s390 bpf_get_stack_raw_tp() returns 0 entries for both kernel and user stacks. While there is no practical unwinding solution for userspace on s390 at this moment, there certainly is a kernel unwinder. However, it is not properly integrated with BPF. In order to start unwinding, bpf_get_stack_raw_tp() obtains the current kernel register values using perf_fetch_caller_regs(), which is not implemented for s390. The actual unwinding then happens by passing those registers to perf_callchain_kernel(). Implement perf_arch_fetch_caller_regs() for s390, where __builtin_frame_address(0) points to back_chain. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
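On s390 this can be a small macro along the following lines (a sketch modeled on other architectures; the exact frame math may differ):

    #define perf_arch_fetch_caller_regs(regs, __ip) do {                \
        (regs)->psw.addr = (__ip);                                      \
        (regs)->gprs[15] = (unsigned long)__builtin_frame_address(0) -  \
            offsetof(struct stack_frame, back_chain);                   \
    } while (0)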
2019-11-25  vfs: properly and reliably lock f_pos in fdget_pos()  (Linus Torvalds; 3 files, -8/+2)
fdget_pos() is used by file operations that will read and update f_pos: things like "read()", "write()" and "lseek()" (but not, for example, "pread()/pwrite" that get their file positions elsewhere). However, it had two separate escape clauses for this, because not everybody wants or needs serialization of the file position. The first and most obvious case is the "file descriptor doesn't have a position at all", ie a stream-like file. Except we didn't actually use FMODE_STREAM, but instead used FMODE_ATOMIC_POS. The reason for that was that FMODE_STREAM didn't exist back in the days, but also that we didn't want to mark all the special cases, so we only marked the ones that _required_ position atomicity according to POSIX - regular files and directories. The case one was intentionally lazy, but now that we _do_ have FMODE_STREAM we could and should just use it. With the change to use FMODE_STREAM, there are no remaining uses for FMODE_ATOMIC_POS, and all the code to set it is deleted. Any cases where we don't want the serialization because the driver (or subsystem) doesn't use the file position should just be updated to do "stream_open()". We've done that for all the obvious and common situations, we may need a few more. Quoting Kirill Smelkov in the original FMODE_STREAM thread (see link below for full email): "And I appreciate if people could help at least somehow with "getting rid of mixed case entirely" (i.e. always lock f_pos_lock on !FMODE_STREAM), because this transition starts to diverge from my particular use-case too far. To me it makes sense to do that transition as follows: - convert nonseekable_open -> stream_open via stream_open.cocci; - audit other nonseekable_open calls and convert left users that truly don't depend on position to stream_open; - extend stream_open.cocci to analyze alloc_file_pseudo as well (this will cover pipes and sockets), or maybe convert pipes and sockets to FMODE_STREAM manually; - extend stream_open.cocci to analyze file_operations that use no_llseek or noop_llseek, but do not use nonseekable_open or alloc_file_pseudo. This might find files that have stream semantic but are opened differently; - extend stream_open.cocci to analyze file_operations whose .read/.write do not use ppos at all (independently of how file was opened); - ... - after that remove FMODE_ATOMIC_POS and always take f_pos_lock if !FMODE_STREAM; - gather bug reports for deadlocked read/write and convert missed cases to FMODE_STREAM, probably extending stream_open.cocci along the road to catch similar cases i.e. always take f_pos_lock unless a file is explicitly marked as being stream, and try to find and cover all files that are streams" We have not done the "extend stream_open.cocci to analyze alloc_file_pseudo" as well, but the previous commit did manually handle the case of pipes and sockets. The other case where we can avoid locking f_pos is the "this file descriptor only has a single user and it is us, and thus there is no need to lock it". The second test was correct, although a bit subtle and worth just re-iterating here. There are two kinds of other sources of references to the same file descriptor: file descriptors that have been explicitly shared across fork() or with dup(), and file tables having elevated reference counts due to threading (or explicit file sharing with clone()). The first case would have incremented the file count explicitly, and in the second case the previous __fdget() would have incremented it for us and set the FDPUT_FPUT flag. 
But in both cases the file count would be greater than one, so the "file_count(file) > 1" test catches both situations. Also note that if file_count is 1, that also means that no other thread can have access to the file table, so there also cannot be races with concurrent calls to dup()/fork()/clone() that would increment the file count any other way. Link: https://lore.kernel.org/linux-fsdevel/20190413184404.GA13490@deco.navytux.spb.ru Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Marco Elver <elver@google.com> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Paul McKenney <paulmck@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
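The resulting locking rule in __fdget_pos() is roughly this (a sketch, not the verbatim diff):

    struct file *file = (struct file *)(v & ~3);

    /* lock f_pos only for non-stream files that are actually shared */
    if (file && !(file->f_mode & FMODE_STREAM)) {
        if (file_count(file) > 1) {
            v |= FDPUT_POS_UNLOCK;
            mutex_lock(&file->f_pos_lock);
        }
    }
    return v;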
2019-11-25  vfs: mark pipes and sockets as stream-like file descriptors  (Linus Torvalds; 2 files, -2/+5)
In commit 3975b097e577 ("convert stream-like files -> stream_open, even if they use noop_llseek") Kirill used a coccinelle script to change "nonseekable_open()" to "stream_open()", which changed the trivial cases of stream-like file descriptors to the new model with FMODE_STREAM. However, the two big cases - sockets and pipes - don't actually have that trivial pattern at all, and were thus never converted to FMODE_STREAM even though it makes lots of sense to do so. That's particularly true when looking forward to the next change: getting rid of FMODE_ATOMIC_POS entirely, and just using FMODE_STREAM to decide whether f_pos updates are needed or not. And if they are, we'll always do them atomically. This came up because KCSAN (correctly) noted that the non-locked f_pos updates are data races: they are clearly benign for the case where we don't care, but it would be good to just not have that issue exist at all. Note that the reason we used FMODE_ATOMIC_POS originally is that only doing it for the minimal required case is "safer" in that it's possible that the f_pos locking can cause unnecessary serialization across the whole write() call. And in the worst case, that kind of serialization can cause deadlock issues: think writers that need readers to empty the state using the same file descriptor. [ Note that the locking is per-file descriptor - because it protects "f_pos", which is obviously per-file descriptor - so it only affects cases where you literally use the same file descriptor to both read and write. So a regular pipe that has separate reading and writing file descriptors doesn't really have this situation even though it's the obvious case of "reader empties what a writer concurrently fills". But we want to mark pipes as being stream-like anyway, because we don't want the unnecessary overhead of locking, and because a named pipe can be (ab-)used by reading and writing to the same file descriptor. ] There are likely a lot of other cases that might want FMODE_STREAM, and looking for ".llseek = no_llseek" users and other cases that don't have an lseek file operation at all and making them use "stream_open()" might be a good idea. But pipes and sockets are likely to be the two main cases. Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Marco Elver <elver@google.com> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Paul McKenney <paulmck@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
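Marking an already-created pipe or socket file as stream-like amounts to roughly what stream_open() does (a sketch):

    /* no position, no pread/pwrite, and no f_pos locking from now on */
    file->f_mode &= ~(FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE | FMODE_ATOMIC_POS);
    file->f_mode |= FMODE_STREAM;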
2019-11-24  Linux 5.4  (Linus Torvalds; 1 file, -1/+1)
2019-11-23  cramfs: fix usage on non-MTD device  (Maxime Bizon; 1 file, -2/+2)
When both CONFIG_CRAMFS_MTD and CONFIG_CRAMFS_BLOCKDEV are enabled, if we fail to mount on MTD, we don't try on block device. Note: this relies upon cramfs_mtd_fill_super() leaving no side effects on fc state in case of failure; in general, failing get_tree_...() does *not* mean "fine to try again"; e.g. parsed options might've been consumed by fill_super callback and freed on failure. Fixes: 74f78fc5ef43 ("vfs: Convert cramfs to use the new mount API") Signed-off-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
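The fix amounts to trying MTD first and only then falling back to the block device path, roughly (a sketch of cramfs_get_tree(); fill_super callback names assumed):

    static int cramfs_get_tree(struct fs_context *fc)
    {
        int ret = -ENOPROTOOPT;

        if (IS_ENABLED(CONFIG_CRAMFS_MTD)) {
            ret = get_tree_mtd(fc, cramfs_mtd_fill_super);
            if (!ret)
                return 0;       /* mounted from MTD */
        }
        if (IS_ENABLED(CONFIG_CRAMFS_BLOCKDEV))
            ret = get_tree_bdev(fc, cramfs_blkdev_fill_super);
        return ret;
    }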
2019-11-22Revert "Input: synaptics - enable RMI mode for X1 Extreme 2nd Generation"Lyude Paul1-1/+0
This reverts commit 68b9c5066e39af41d3448abfc887c77ce22dd64d. Ugh, I really dropped the ball on this one :\. So as it turns out RMI4 works perfectly fine on the X1 Extreme Gen 2 except for one thing I didn't notice because I usually use the trackpoint: clicking with the touchpad. Somehow this is broken, in fact we don't even seem to indicate BTN_LEFT as a valid event type for the RMI4 touchpad. And, I don't even see any RMI4 events coming from the touchpad when I press down on it. This only seems to work for PS/2 mode. Since that means we have a regression, and PS/2 mode seems to work fine for the time being - revert this for now. We'll have to do a more thorough investigation on this. Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://lore.kernel.org/r/20191119234534.10725-1-lyude@redhat.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2019-11-22  afs: Fix large file support  (Marc Dionne; 1 file, -0/+1)
By default s_maxbytes is set to MAX_NON_LFS, which limits the usable file size to 2GB, enforced by the vfs. Commit b9b1f8d5930a ("AFS: write support fixes") added support for the 64-bit fetch and store server operations, but did not change this value. As a result, attempts to write past the 2G mark result in EFBIG errors: $ dd if=/dev/zero of=foo bs=1M count=1 seek=2048 dd: error writing 'foo': File too large Set s_maxbytes to MAX_LFS_FILESIZE. Fixes: b9b1f8d5930a ("AFS: write support fixes") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
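The fix itself is a one-liner in the superblock setup (function placement assumed to be afs_fill_super() or equivalent):

    /* allow 64-bit file sizes instead of the MAX_NON_LFS (2 GiB) default */
    sb->s_maxbytes = MAX_LFS_FILESIZE;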
2019-11-22  afs: Fix possible assert with callbacks from yfs servers  (Marc Dionne; 1 file, -1/+0)
Servers sending callback breaks to the YFS_CM_SERVICE service may send up to YFSCBMAX (1024) fids in a single RPC. Anything over AFSCBMAX (50) will cause the assert in afs_break_callbacks to trigger. Remove the assert, as the count has already been checked against the appropriate max values in afs_deliver_cb_callback and afs_deliver_yfs_cb_callback. Fixes: 35dbfba3111a ("afs: Implement the YFS cache manager service") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-22  r8152: avoid calling napi_disable twice  (Hayes Wang; 1 file, -8/+20)
Calling napi_disable() twice would cause a deadlock. There are three situations that may result in the issue:

1. rtl8152_pre_reset() and set_carrier() are run at the same time.
2. Call rtl8152_set_tunable() after rtl8152_close().
3. Call rtl8152_set_ringparam() after rtl8152_close().

For #1, use the same solution as commit 84811412464d ("r8152: Re-order napi_disable in rtl8152_close"). For #2 and #3, add a check of the IFF_UP flag and call napi_disable/napi_enable while holding the mutex. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-22  MAINTAINERS: Add myself as maintainer of virtio-vsock  (Stefano Garzarella; 1 file, -0/+1)
Since I'm actively working on vsock and the virtio/vhost transports, Stefan suggested that I help him maintain it. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-22  udp: drop skb extensions before marking skb stateless  (Florian Westphal; 2 files, -5/+28)
Once udp stack has set the UDP_SKB_IS_STATELESS flag, later skb free assumes all skb head state has been dropped already. This will leak the extension memory in case the skb has extensions other than the ipsec secpath, e.g. bridge nf data. To fix this, set the UDP_SKB_IS_STATELESS flag only if we don't have extensions or if the extension space can be free'd. Fixes: 895b5c9f206eb7d25dc1360a ("netfilter: drop bridge nf reset from nf_reset") Cc: Paolo Abeni <pabeni@redhat.com> Reported-by: Byron Stanoszek <gandalf@winds.org> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-22  net: rtnetlink: prevent underflows in do_setvfinfo()  (Dan Carpenter; 1 file, -1/+22)
The "ivm->vf" variable is a u32, but the problem is that a number of drivers cast it to an int and then forget to check for negatives. An example of this is in the cxgb4 driver. drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c 2890 static int cxgb4_mgmt_get_vf_config(struct net_device *dev, 2891 int vf, struct ifla_vf_info *ivi) ^^^^^^ 2892 { 2893 struct port_info *pi = netdev_priv(dev); 2894 struct adapter *adap = pi->adapter; 2895 struct vf_info *vfinfo; 2896 2897 if (vf >= adap->num_vfs) ^^^^^^^^^^^^^^^^^^^ 2898 return -EINVAL; 2899 vfinfo = &adap->vfinfo[vf]; ^^^^^^^^^^^^^^^^^^^^^^^^^^ There are 48 functions affected. drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:8435 hclge_set_vf_vlan_filter() warn: can 'vfid' underflow 's32min-2147483646' drivers/net/ethernet/freescale/enetc/enetc_pf.c:377 enetc_pf_set_vf_mac() warn: can 'vf' underflow 's32min-2147483646' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:2899 cxgb4_mgmt_get_vf_config() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:2960 cxgb4_mgmt_set_vf_rate() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3019 cxgb4_mgmt_set_vf_rate() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3038 cxgb4_mgmt_set_vf_vlan() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3086 cxgb4_mgmt_set_vf_link_state() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb/cxgb2.c:791 get_eeprom() warn: can 'i' underflow 's32min-(-4),0,4-s32max' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:82 bnxt_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:164 bnxt_set_vf_trust() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:186 bnxt_get_vf_config() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:228 bnxt_set_vf_mac() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:264 bnxt_set_vf_vlan() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:293 bnxt_set_vf_bw() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:333 bnxt_set_vf_link_state() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:2595 bnx2x_vf_op_prep() warn: can 'vfidx' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:2595 bnx2x_vf_op_prep() warn: can 'vfidx' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2281 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2285 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2286 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2292 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2297 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1832 qlcnic_sriov_set_vf_mac() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1864 qlcnic_sriov_set_vf_tx_rate() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1937 qlcnic_sriov_set_vf_vlan() warn: can 'vf' underflow 
's32min-254' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:2005 qlcnic_sriov_get_vf_config() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:2036 qlcnic_sriov_set_vf_spoofchk() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/emulex/benet/be_main.c:1914 be_get_vf_config() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:1915 be_get_vf_config() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:1922 be_set_vf_tvt() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:1951 be_clear_vf_tvt() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:2063 be_set_vf_tx_rate() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:2091 be_set_vf_link_state() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:2609 ice_set_vf_port_vlan() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3050 ice_get_vf_cfg() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3103 ice_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3181 ice_set_vf_mac() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3237 ice_set_vf_trust() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3286 ice_set_vf_link_state() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:3919 i40e_validate_vf() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:3957 i40e_ndo_set_vf_mac() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4104 i40e_ndo_set_vf_port_vlan() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4263 i40e_ndo_set_vf_bw() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4309 i40e_ndo_get_vf_config() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4371 i40e_ndo_set_vf_link_state() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4441 i40e_ndo_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4441 i40e_ndo_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4504 i40e_ndo_set_vf_trust() warn: can 'vf_id' underflow 's32min-2147483646' Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
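The guard added in do_setvfinfo() is of this shape (a sketch; the real patch applies it before the ndo_set_vf_* handlers that take the index):

    /* ivm->vf is u32, but many drivers cast it to int; anything >= INT_MAX
     * would turn negative in the driver and bypass its upper-bound check */
    if (ivm->vf >= INT_MAX)
        return -EINVAL;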
2019-11-22  mm/ksm.c: don't WARN if page is still mapped in remove_stable_node()  (Andrey Ryabinin; 1 file, -7/+7)
It's possible to hit the WARN_ON_ONCE(page_mapped(page)) in remove_stable_node() when it races with __mmput() and squeezes in between ksm_exit() and exit_mmap(). WARNING: CPU: 0 PID: 3295 at mm/ksm.c:888 remove_stable_node+0x10c/0x150 Call Trace: remove_all_stable_nodes+0x12b/0x330 run_store+0x4ef/0x7b0 kernfs_fop_write+0x200/0x420 vfs_write+0x154/0x450 ksys_write+0xf9/0x1d0 do_syscall_64+0x99/0x510 entry_SYSCALL_64_after_hwframe+0x49/0xbe Remove the warning as there is nothing scary going on. Link: http://lkml.kernel.org/r/20191119131850.5675-1-aryabinin@virtuozzo.com Fixes: cbf86cfe04a6 ("ksm: remove old stable nodes more thoroughly") Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-22  mm/memory_hotplug: don't access uninitialized memmaps in shrink_zone_span()  (David Hildenbrand; 1 file, -3/+13)
Let's limit shrinking to !ZONE_DEVICE so we can fix the current code. We should never try to touch the memmap of offline sections where we could have uninitialized memmaps and could trigger BUGs when calling page_to_nid() on poisoned pages. There is no reliable way to distinguish an uninitialized memmap from an initialized memmap that belongs to ZONE_DEVICE, as we don't have anything like SECTION_IS_ONLINE we can use similar to pfn_to_online_section() for !ZONE_DEVICE memory. E.g., set_zone_contiguous() similarly relies on pfn_to_online_section() and will therefore never set a ZONE_DEVICE zone consecutive. Stopping to shrink the ZONE_DEVICE therefore results in no observable changes, besides /proc/zoneinfo indicating different boundaries - something we can totally live with. Before commit d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug"), the memmap was initialized with 0 and the node with the right value. So the zone might be wrong but not garbage. After that commit, both the zone and the node will be garbage when touching uninitialized memmaps. Toshiki reported a BUG (race between delayed initialization of ZONE_DEVICE memmaps without holding the memory hotplug lock and concurrent zone shrinking). https://lkml.org/lkml/2019/11/14/1040 "Iteration of create and destroy namespace causes the panic as below: kernel BUG at mm/page_alloc.c:535! CPU: 7 PID: 2766 Comm: ndctl Not tainted 5.4.0-rc4 #6 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:set_pfnblock_flags_mask+0x95/0xf0 Call Trace: memmap_init_zone_device+0x165/0x17c memremap_pages+0x4c1/0x540 devm_memremap_pages+0x1d/0x60 pmem_attach_disk+0x16b/0x600 [nd_pmem] nvdimm_bus_probe+0x69/0x1c0 really_probe+0x1c2/0x3e0 driver_probe_device+0xb4/0x100 device_driver_attach+0x4f/0x60 bind_store+0xc9/0x110 kernfs_fop_write+0x116/0x190 vfs_write+0xa5/0x1a0 ksys_write+0x59/0xd0 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 While creating a namespace and initializing memmap, if you destroy the namespace and shrink the zone, it will initialize the memmap outside the zone and trigger VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page) in set_pfnblock_flags_mask()." This BUG is also mitigated by this commit, where we for now stop to shrink the ZONE_DEVICE zone until we can do it in a safe and clean way. 
Link: http://lkml.kernel.org/r/20191006085646.5768-5-david@redhat.com Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e86b319] Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reported-by: Toshiki Fukasawa <t-fukasawa@vx.jp.nec.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: David Hildenbrand <david@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Damian Tometzki <damian.tometzki@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jun Yao <yaojun8558363@gmail.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Pankaj Gupta <pagupta@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qian Cai <cai@lca.pw> Cc: Rich Felker <dalias@libc.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Steve Capper <steve.capper@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> [4.13+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
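The mitigation boils down to bailing out of the zone-shrinking path for ZONE_DEVICE, e.g. (a sketch):

    #ifdef CONFIG_ZONE_DEVICE
        /* ZONE_DEVICE memmaps may be initialized late and without the memory
         * hotplug lock held; never touch them when shrinking zone spans */
        if (zone_idx(zone) == ZONE_DEVICE)
            return;
    #endif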
2019-11-22Revert "fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()"Joseph Qi1-23/+33
This reverts commit 56e94ea132bb5c2c1d0b60a6aeb34dcb7d71a53d. Commit 56e94ea132bb ("fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()") introduces a regression that fails to create a directory with the mount options user_xattr and acl. Actually the reported NULL pointer dereference case can be correctly handled by loc->xl_ops->xlo_add_entry(), so revert it. Link: http://lkml.kernel.org/r/1573624916-83825-1-git-send-email-joseph.qi@linux.alibaba.com Fixes: 56e94ea132bb ("fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()") Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reported-by: Thomas Voegtle <tv@lio96.de> Acked-by: Changwei Ge <gechangwei@live.cn> Cc: Jia-Ju Bai <baijiaju1990@gmail.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-22  can: m_can_platform: remove unnecessary m_can_class_resume() call  (Pankaj Sharma; 1 file, -2/+0)
The function m_can_runtime_resume() is getting recursively called from m_can_class_resume(). This results in a lock up. We need not call m_can_class_resume() during m_can_runtime_resume(). Fixes: f524f829b75a ("can: m_can: Create a m_can platform framework") Signed-off-by: Pankaj Sharma <pankj.sharma@samsung.com> Signed-off-by: Sriram Dash <sriram.dash@samsung.com> Acked-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-11-22  can: m_can_platform: set net_device structure as driver data  (Pankaj Sharma; 1 file, -1/+1)
The current code is failing during clock prepare enable because of not getting proper clock from platform device. [ 0.852089] Call trace: [ 0.854516] 0xffff0000fa22a668 [ 0.857638] clk_prepare+0x20/0x34 [ 0.861019] m_can_runtime_resume+0x2c/0xe4 [ 0.865180] pm_generic_runtime_resume+0x28/0x38 [ 0.869770] __rpm_callback+0x16c/0x1bc [ 0.873583] rpm_callback+0x24/0x78 [ 0.877050] rpm_resume+0x428/0x560 [ 0.880517] __pm_runtime_resume+0x7c/0xa8 [ 0.884593] m_can_clk_start.isra.9.part.10+0x1c/0xa8 [ 0.889618] m_can_class_register+0x138/0x370 [ 0.893950] m_can_plat_probe+0x120/0x170 [ 0.897939] platform_drv_probe+0x4c/0xa0 [ 0.901924] really_probe+0xd8/0x31c [ 0.905477] driver_probe_device+0x58/0xe8 [ 0.909551] device_driver_attach+0x68/0x70 [ 0.913711] __driver_attach+0x9c/0xf8 [ 0.917437] bus_for_each_dev+0x50/0xa0 [ 0.921251] driver_attach+0x20/0x28 [ 0.924804] bus_add_driver+0x148/0x1fc [ 0.928617] driver_register+0x6c/0x124 [ 0.932431] __platform_driver_register+0x48/0x50 [ 0.937113] m_can_plat_driver_init+0x18/0x20 [ 0.941446] do_one_initcall+0x4c/0x19c [ 0.945259] kernel_init_freeable+0x1d0/0x280 [ 0.949591] kernel_init+0x10/0x100 [ 0.953057] ret_from_fork+0x10/0x18 [ 0.956614] Code: 00000000 00000000 00000000 00000000 (fa22a668) [ 0.962681] ---[ end trace 881f71bd609de763 ]--- [ 0.967301] Kernel panic - not syncing: Attempted to kill init! A device driver for CAN controller hardware registers itself with the Linux network layer as a network device. So, the driver data for m_can should ideally be of type net_device. Fixes: f524f829b75a ("can: m_can: Create a m_can platform framework") Signed-off-by: Pankaj Sharma <pankj.sharma@samsung.com> Signed-off-by: Sriram Dash <sriram.dash@samsung.com> Acked-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-11-21  hv_netvsc: Fix send_table offset in case of a host bug  (Haiyang Zhang; 1 file, -2/+14)
If the negotiated NVSP version is <= NVSP_PROTOCOL_VERSION_6, the offset may be wrong (too small) due to a host bug. This can cause the end of the send indirection table to be missed, and multiple zero entries to be added from the leading zeros before the data region. This bug adds extra burden on channel 0. So fix the offset by computing it from the data structure sizes. This ensures the netvsc driver runs normally on unfixed hosts, and on future fixed hosts. Fixes: 5b54dac856cb ("hyperv: Add support for virtual Receive Side Scaling (vRSS)") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
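A sketch of the workaround (struct and field names here are assumptions based on the description, not the verbatim patch): on old NVSP versions, ignore the host-supplied offset and derive it from the message layout instead:

    if (nvscdev->nvsp_version <= NVSP_PROTOCOL_VERSION_6)
        offset = sizeof(struct nvsp_message_header) +
                 sizeof(union nvsp_6_message_uber);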
2019-11-21  hv_netvsc: Fix offset usage in netvsc_send_table()  (Haiyang Zhang; 2 files, -9/+20)
To reach the data region, the existing code adds offset in struct nvsp_5_send_indirect_table on the beginning of this struct. But the offset should be based on the beginning of its container, struct nvsp_message. This bug causes the first table entry missing, and adds an extra zero from the zero pad after the data region. This can put extra burden on the channel 0. So, correct the offset usage. Also add a boundary check to ensure not reading beyond data region. Fixes: 5b54dac856cb ("hyperv: Add support for virtual Receive Side Scaling (vRSS)") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-21  net-ipv6: IPV6_TRANSPARENT - check NET_RAW prior to NET_ADMIN  (Maciej Żenczykowski; 1 file, -2/+2)
NET_RAW is less dangerous, so more likely to be available to a process, so check it first to prevent some spurious logging. This matches IP_TRANSPARENT which checks NET_RAW first. Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
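The change is essentially a reordering of the capability test, roughly (in the actual setsockopt handler this sits inside a switch statement):

    /* check the more commonly granted capability first so a mere
     * CAP_NET_ADMIN check failure is not logged spuriously */
    if (!ns_capable(net->user_ns, CAP_NET_RAW) &&
        !ns_capable(net->user_ns, CAP_NET_ADMIN))
        return -EPERM;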