aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/export-to-sqlite.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2018-05-25KVM: arm/arm64: Check vcpu redist base before registering an iodevEric Auger2-0/+6
As we are going to register several redist regions, vgic_register_all_redist_iodevs() may be called several times. We need to register a redist_iodev for a given vcpu only once. So let's check if the base address has already been set. Initialize this latter in kvm_vgic_vcpu_init(). Signed-off-by: Eric Auger <eric.auger@redhat.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Remove kvm_vgic_vcpu_early_initEric Auger3-45/+37
kvm_vgic_vcpu_early_init gets called after kvm_vgic_cpu_init which is confusing. The call path is as follows: kvm_vm_ioctl_create_vcpu |_ kvm_arch_cpu_create |_ kvm_vcpu_init |_ kvm_arch_vcpu_init |_ kvm_vgic_vcpu_init |_ kvm_arch_vcpu_postcreate |_ kvm_vgic_vcpu_early_init Static initialization currently done in kvm_vgic_vcpu_early_init() can be moved to kvm_vgic_vcpu_init(). So let's move the code and remove kvm_vgic_vcpu_early_init(). kvm_arch_vcpu_postcreate() does nothing. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Helper to register a new redistributor regionEric Auger2-16/+81
We introduce a new helper that creates and inserts a new redistributor region into the rdist region list. This helper both handles the case where the redistributor region size is known at registration time and the legacy case where it is not (eventually depending on the number of online vcpus). Depending on pfns, we perform all the possible checks that we can do: - end of memory crossing - incorrect alignment of the base address - collision with distributor region if already defined - collision with already registered rdist regions - check of the new index Rdist regions must be inserted by increasing order of indices. Indices must be contiguous. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Adapt vgic_v3_check_base to multiple rdist regionsEric Auger2-17/+42
vgic_v3_check_base() currently only handles the case of a unique legacy redistributor region whose size is not explicitly set but inferred, instead, from the number of online vcpus. We adapt it to handle the case of multiple redistributor regions with explicitly defined size. We rely on two new helpers: - vgic_v3_rdist_overlap() is used to detect overlap with the dist region if defined - vgic_v3_rd_region_size computes the size of the redist region, would it be a legacy unique region or a new explicitly sized region. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Revisit Redistributor TYPER last bit computationEric Auger1-1/+6
The TYPER of an redistributor reflects whether the rdist is the last one of the redistributor region. Let's compare the TYPER GPA against the address of the last occupied slot within the redistributor region. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Helper to locate free rdist indexEric Auger3-2/+35
We introduce vgic_v3_rdist_free_slot to help identifying where we can place a new 2x64KB redistributor. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Replace the single rdist region by a listEric Auger5-24/+77
At the moment KVM supports a single rdist region. We want to support several separate rdist regions so let's introduce a list of them. This patch currently only cares about a single entry in this list as the functionality to register several redist regions is not yet there. So this only translates the existing code into something functionally similar using that new data struct. The redistributor region handle is stored in the vgic_cpu structure to allow later computation of the TYPER last bit. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Document KVM_VGIC_V3_ADDR_TYPE_REDIST_REGIONEric Auger1-2/+28
We introduce a new KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION attribute in KVM_DEV_ARM_VGIC_GRP_ADDR group. It allows userspace to provide the base address and size of a redistributor region Compared to KVM_VGIC_V3_ADDR_TYPE_REDIST, this new attribute allows to declare several separate redistributor regions. So the whole redist space does not need to be contiguous anymore. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Set dist->spis to NULL after kfreeEric Auger1-0/+1
in case kvm_vgic_map_resources() fails, typically if the vgic distributor is not defined, __kvm_vgic_destroy will be called several times. Indeed kvm_vgic_map_resources() is called on first vcpu run. As a result dist->spis is freeed more than once and on the second time it causes a "kernel BUG at mm/slub.c:3912!" Set dist->spis to NULL to avoid the crash. Fixes: ad275b8bb1e6 ("KVM: arm/arm64: vgic-new: vgic_init: implement vgic_init") Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm64: Invoke FPSIMD context switch trap from CDave Martin3-51/+13
The conversion of the FPSIMD context switch trap code to C has added some overhead to calling it, due to the need to save registers that the procedure call standard defines as caller-saved. So, perhaps it is no longer worth invoking this trap handler quite so early. Instead, we can invoke it from fixup_guest_exit(), with little likelihood of increasing the overhead much further. As a convenience, this patch gives __hyp_switch_fpsimd() the same return semantics fixup_guest_exit(). For now there is no possibility of a spurious FPSIMD trap, so the function always returns true, but this allows it to be tail-called with a single return statement. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm64: Fold redundant exit code checks out of fixup_guest_exit()Dave Martin1-4/+8
The entire tail of fixup_guest_exit() is contained in if statements of the form if (x && *exit_code == ARM_EXCEPTION_TRAP). As a result, we can check just once and bail out of the function early, allowing the remaining if conditions to be simplified. The only awkward case is where *exit_code is changed to ARM_EXCEPTION_EL1_SERROR in the case of an illegal GICv2 CPU interface access: in that case, the GICv3 trap handling code is skipped using a goto. This avoids pointlessly evaluating the static branch check for the GICv3 case, even though we can't have vgic_v2_cpuif_trap and vgic_v3_cpuif_trap true simultaneously unless we have a GICv3 and GICv2 on the host: that sounds stupid, but I haven't satisfied myself that it can't happen. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm64: Remove redundant *exit_code changes in fpsimd_guest_exit()Dave Martin1-12/+4
In fixup_guest_exit(), there are a couple of cases where after checking what the exit code was, we assign it explicitly with the value it already had. Assuming this is not indicative of a bug, these assignments are not needed. This patch removes the redundant assignments, and simplifies some if-nesting that becomes trivial as a result. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm64: Remove eager host SVE state savingDave Martin4-37/+0
Now that the host SVE context can be saved on demand from Hyp, there is no longer any need to save this state in advance before entering the guest. This patch removes the relevant call to kvm_fpsimd_flush_cpu_state(). Since the problem that function was intended to solve now no longer exists, the function and its dependencies are also deleted. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm64: Save host SVE context as appropriateDave Martin6-2/+47
This patch adds SVE context saving to the hyp FPSIMD context switch path. This means that it is no longer necessary to save the host SVE state in advance of entering the guest, when in use. In order to avoid adding pointless complexity to the code, VHE is assumed if SVE is in use. VHE is an architectural prerequisite for SVE, so there is no good reason to turn CONFIG_ARM64_VHE off in kernels that support both SVE and KVM. Historically, software models exist that can expose the architecturally invalid configuration of SVE without VHE, so if this situation is detected at kvm_init() time then KVM will be disabled. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64/sve: Move sve_pffr() to fpsimd.h and make inlineDave Martin3-13/+24
In order to make sve_save_state()/sve_load_state() more easily reusable and to get rid of a potential branch on context switch critical paths, this patch makes sve_pffr() inline and moves it to fpsimd.h. <asm/processor.h> must be included in fpsimd.h in order to make this work, and this creates an #include cycle that is tricky to avoid without modifying core code, due to the way the PR_SVE_*() prctl helpers are included in the core prctl implementation. Instead of breaking the cycle, this patch defers inclusion of <asm/fpsimd.h> in <asm/processor.h> until the point where it is actually needed: i.e., immediately before the prctl definitions. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64/sve: Switch sve_pffr() argument from task to threadDave Martin1-5/+5
sve_pffr(), which is used to derive the base address used for low-level SVE save/restore routines, currently takes the relevant task_struct as an argument. The only accessed fields are actually part of thread_struct, so this patch changes the argument type accordingly. This is done in preparation for moving this function to a header, where we do not want to have to include <linux/sched.h> due to the consequent circular #include problems. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64/sve: Move read_zcr_features() out of cpufeature.hDave Martin5-29/+32
Having read_zcr_features() inline in cpufeature.h results in that header requiring #includes which make it hard to include <asm/fpsimd.h> elsewhere without triggering header inclusion cycles. This is not a hot-path function and arguably should not be in cpufeature.h in the first place, so this patch moves it to fpsimd.c, compiled conditionally if CONFIG_ARM64_SVE=y. This allows some SVE-related #includes to be dropped from cpufeature.h, which will ease future maintenance. A couple of missing #includes of <asm/fpsimd.h> are exposed by this change under arch/arm64/. This patch adds the missing #includes as necessary. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashingDave Martin9-31/+192
This patch refactors KVM to align the host and guest FPSIMD save/restore logic with each other for arm64. This reduces the number of redundant save/restore operations that must occur, and reduces the common-case IRQ blackout time during guest exit storms by saving the host state lazily and optimising away the need to restore the host state before returning to the run loop. Four hooks are defined in order to enable this: * kvm_arch_vcpu_run_map_fp(): Called on PID change to map necessary bits of current to Hyp. * kvm_arch_vcpu_load_fp(): Set up FP/SIMD for entering the KVM run loop (parse as "vcpu_load fp"). * kvm_arch_vcpu_ctxsync_fp(): Get FP/SIMD into a safe state for re-enabling interrupts after a guest exit back to the run loop. For arm64 specifically, this involves updating the host kernel's FPSIMD context tracking metadata so that kernel-mode NEON use will cause the vcpu's FPSIMD state to be saved back correctly into the vcpu struct. This must be done before re-enabling interrupts because kernel-mode NEON may be used by softirqs. * kvm_arch_vcpu_put_fp(): Save guest FP/SIMD state back to memory and dissociate from the CPU ("vcpu_put fp"). Also, the arm64 FPSIMD context switch code is updated to enable it to save back FPSIMD state for a vcpu, not just current. A few helpers drive this: * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp): mark this CPU as having context fp (which may belong to a vcpu) currently loaded in its registers. This is the non-task equivalent of the static function fpsimd_bind_to_cpu() in fpsimd.c. * task_fpsimd_save(): exported to allow KVM to save the guest's FPSIMD state back to memory on exit from the run loop. * fpsimd_flush_state(): invalidate any context's FPSIMD state that is currently loaded. Used to disassociate the vcpu from the CPU regs on run loop exit. These changes allow the run loop to enable interrupts (and thus softirqs that may use kernel-mode NEON) without having to save the guest's FPSIMD state eagerly. Some new vcpu_arch fields are added to make all this work. Because host FPSIMD state can now be saved back directly into current's thread_struct as appropriate, host_cpu_context is no longer used for preserving the FPSIMD state. However, it is still needed for preserving other things such as the host's system registers. To avoid ABI churn, the redundant storage space in host_cpu_context is not removed for now. arch/arm is not addressed by this patch and continues to use its current save/restore logic. It could provide implementations of the helpers later if desired. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm64: Repurpose vcpu_arch.debug_flags for general-purpose flagsDave Martin6-19/+18
In struct vcpu_arch, the debug_flags field is used to store debug-related flags about the vcpu state. Since we are about to add some more flags related to FPSIMD and SVE, it makes sense to add them to the existing flags field rather than adding new fields. Since there is only one debug_flags flag defined so far, there is plenty of free space for expansion. In preparation for adding more flags, this patch renames the debug_flags field to simply "flags", and updates comments appropriately. The flag definitions are also moved to <asm/kvm_host.h>, since their presence in <asm/kvm_asm.h> was for purely historical reasons: these definitions are not used from asm any more, and not very likely to be as more Hyp asm is migrated to C. KVM_ARM64_DEBUG_DIRTY_SHIFT has not been used since commit 1ea66d27e7b0 ("arm64: KVM: Move away from the assembly version of the world switch"), so this patch gets rid of that too. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@arm.com> [maz: fixed minor conflict] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64/sve: Refactor user SVE trap maintenance for external useDave Martin1-15/+15
In preparation for optimising the way KVM manages switching the guest and host FPSIMD state, it is necessary to provide a means for code outside arch/arm64/kernel/fpsimd.c to restore the user trap configuration for SVE correctly for the current task. Rather than requiring external code to duplicate the maintenance explicitly, this patch moves the trap maintenenace to fpsimd_bind_to_cpu(), since it is logically part of the work of associating the current task with the cpu. Because fpsimd_bind_to_cpu() is rather a cryptic name to publish alongside fpsimd_bind_state_to_cpu(), the former function is renamed to fpsimd_bind_task_to_cpu() to make its purpose more explicit. This patch makes appropriate changes to ensure that fpsimd_bind_task_to_cpu() is always called alongside task_fpsimd_load(), so that the trap maintenance continues to be done in every situation where it was done prior to this patch. As a side-effect, the metadata updates done by fpsimd_bind_task_to_cpu() now change from conditional to unconditional in the "already bound" case of sigreturn. This is harmless, and a couple of extra stores on this slow path will not impact performance. I consider this a reasonable price to pay for a slightly cleaner interface. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64: fpsimd: Eliminate task->mm checksDave Martin2-25/+19
Currently the FPSIMD handling code uses the condition task->mm == NULL as a hint that task has no FPSIMD register context. The ->mm check is only there to filter out tasks that cannot possibly have FPSIMD context loaded, for optimisation purposes. Also, TIF_FOREIGN_FPSTATE must always be checked anyway before saving FPSIMD context back to memory. For these reasons, the ->mm checks are not useful, providing that TIF_FOREIGN_FPSTATE is maintained in a consistent way for all threads. The context switch logic is already deliberately optimised to defer reloads of the regs until ret_to_user (or sigreturn as a special case), and save them only if they have been previously loaded. These paths are the only places where the wrong_task and wrong_cpu conditions can be made false, by calling fpsimd_bind_task_to_cpu(). Kernel threads by definition never reach these paths. As a result, the wrong_task and wrong_cpu tests in fpsimd_thread_switch() will always yield true for kernel threads. This patch removes the redundant checks and special-case code, ensuring that TIF_FOREIGN_FPSTATE is set whenever a kernel thread is scheduled in, and ensures that this flag is set for the init task. The fpsimd_flush_task_state() call already present in copy_thread() ensures the same for any new task. With TIF_FOREIGN_FPSTATE always set for kernel threads, this patch ensures that no extra context save work is added for kernel threads, and eliminates the redundant context saving that may currently occur for kernel threads that have acquired an mm via use_mm(). Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64: fpsimd: Avoid FPSIMD context leakage for the init taskDave Martin1-6/+7
The init task is started with thread_flags equal to 0, which means that TIF_FOREIGN_FPSTATE is initially clear. It is theoretically possible (if unlikely) that the init task could reach userspace without ever being scheduled out. If this occurs, data left in the FPSIMD registers by the kernel could be exposed. This patch fixes this anomaly by ensuring that the init task's initial TIF_FOREIGN_FPSTATE is set. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Fixes: 005f78cd8849 ("arm64: defer reloading a task's FPSIMD state to userland resume") Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64: fpsimd: Generalise context saving for non-task contextsDave Martin1-12/+14
In preparation for allowing non-task (i.e., KVM vcpu) FPSIMD contexts to be handled by the fpsimd common code, this patch adapts task_fpsimd_save() to save back the currently loaded context, removing the explicit dependency on current. The relevant storage to write back to in memory is now found by examining the fpsimd_last_state percpu struct. fpsimd_save() does nothing unless TIF_FOREIGN_FPSTATE is clear, and fpsimd_last_state is updated under local_bh_disable() or local_irq_disable() everywhere that TIF_FOREIGN_FPSTATE is cleared: thus, fpsimd_save() will write back to the correct storage for the loaded context. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm64: Convert lazy FPSIMD context switch trap to CDave Martin2-35/+46
To make the lazy FPSIMD context switch trap code easier to hack on, this patch converts it to C. This is not amazingly efficient, but the trap should typically only be taken once per host context switch. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Introduce kvm_arch_vcpu_run_pid_changeChristoffer Dall3-1/+18
KVM/ARM differs from other architectures in having to maintain an additional virtual address space from that of the host and the guest, because we split the execution of KVM across both EL1 and EL2. This results in a need to explicitly map data structures into EL2 (hyp) which are accessed from the hyp code. As we are about to be more clever with our FPSIMD handling on arm64, which stores data in the task struct and uses thread_info flags, we will have to map parts of the currently executing task struct into the EL2 virtual address space. However, we don't want to do this on every KVM_RUN, because it is a fairly expensive operation to walk the page tables, and the common execution mode is to map a single thread to a VCPU. By introducing a hook that architectures can select with HAVE_KVM_VCPU_RUN_PID_CHANGE, we do not introduce overhead for other architectures, but have a simple way to only map the data we need when required for arm64. This patch introduces the framework only, and wires it up in the arm/arm64 KVM common code. No functional change. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64: Use update{,_tsk}_thread_flag()Dave Martin1-10/+8
This patch uses the new update_thread_flag() helpers to simplify a couple of if () set; else clear; constructs. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25thread_info: Add update_thread_flag() helpersDave Martin2-0/+17
There are a number of bits of code sprinkled around the kernel to set a thread flag if a certain condition is true, and clear it otherwise. To help make those call sites terser and less cumbersome, this patch adds a new family of thread flag manipulators update*_thread_flag([...,] flag, cond) which do the equivalent of: if (cond) set*_thread_flag([...,] flag); else clear*_thread_flag([...,] flag); Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25arm64: fpsimd: Fix TIF_FOREIGN_FPSTATE after invalidating cpu regsDave Martin1-5/+2
fpsimd_last_state.st is set to NULL as a way of indicating that current's FPSIMD registers are no longer loaded in the cpu. In particular, this is done when the kernel temporarily uses or clobbers the FPSIMD registers for its own purposes, as in CPU PM or kernel-mode NEON, resulting in them being populated with garbage data not belonging to a task. Commit 17eed27b02da ("arm64/sve: KVM: Prevent guests from using SVE") factors this operation out as a new helper fpsimd_flush_cpu_state() to make it clearer what is being done here, and on SVE systems this helper is now used, via kvm_fpsimd_flush_cpu_state(), to invalidate the registers after KVM has run a vcpu. The reason for this is that KVM does not yet understand how to restore the full host SVE registers itself after loading the guest FPSIMD context into them. This exposes a particular problem: if fpsimd_last_state.st is set to NULL without also setting TIF_FOREIGN_FPSTATE, the kernel may continue to think that current's FPSIMD registers are live even though they have actually been clobbered. Prior to the aforementioned commit, the only path where fpsimd_last_state.st is set to NULL without setting TIF_FOREIGN_FPSTATE is when kernel_neon_begin() is called by a kernel thread (where current->mm can be NULL). This does not matter, because the only harm is that at context-switch time fpsimd_thread_switch() may unnecessarily save the FPSIMD registers back to current's thread_struct (even though kernel threads are not considered to have any FPSIMD context of their own and the registers will never be reloaded). Note that although CPU_PM_ENTER lacks the TIF_FOREIGN_FPSTATE setting, every CPU passing through that path must subsequently pass through CPU_PM_EXIT before it can re-enter the kernel proper. CPU_PM_EXIT sets the flag. The sve_flush_cpu_state() function added by commit 17eed27b02da also lacks the proper maintenance of TIF_FOREIGN_FPSTATE. This may cause the bits of a host task's SVE registers that do not alias the FPSIMD register file to spontaneously appear zeroed if a KVM vcpu runs in the same task in the meantime. Although this effect is hidden by the fact that the non-FPSIMD bits of the SVE registers are zeroed by a syscall anyway, it is doubtless a bad idea to rely on these different code paths interacting correctly under future maintenance. This patch makes TIF_FOREIGN_FPSTATE an unconditional side-effect of fpsimd_flush_cpu_state(), and removes the set_thread_flag() calls that become redundant as a result. This ensures that TIF_FOREIGN_FPSTATE cannot remain clear if the FPSIMD state in the FPSIMD registers is invalid. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-20arm64: KVM: Use lm_alias() for kvm_ksym_ref()Mark Rutland1-2/+5
For historical reasons, we open-code lm_alias() in kvm_ksym_ref(). Let's use lm_alias() to avoid duplication and make things clearer. As we have to pull this from <linux/mm.h> (which is not safe for inclusion in assembly), we may as well move the kvm_ksym_ref() definition into the existing !__ASSEMBLY__ block. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-06Linux 4.17-rc4Linus Torvalds1-2/+2
2018-05-05KVM: x86: remove APIC Timer periodic/oneshot spikesAnthoine Bourgeois1-17/+20
Since the commit "8003c9ae204e: add APIC Timer periodic/oneshot mode VMX preemption timer support", a Windows 10 guest has some erratic timer spikes. Here the results on a 150000 times 1ms timer without any load: Before 8003c9ae204e | After 8003c9ae204e Max 1834us | 86000us Mean 1100us | 1021us Deviation 59us | 149us Here the results on a 150000 times 1ms timer with a cpu-z stress test: Before 8003c9ae204e | After 8003c9ae204e Max 32000us | 140000us Mean 1006us | 1997us Deviation 140us | 11095us The root cause of the problem is starting hrtimer with an expiry time already in the past can take more than 20 milliseconds to trigger the timer function. It can be solved by forward such past timers immediately, rather than submitting them to hrtimer_start(). In case the timer is periodic, update the target expiration and call hrtimer_start with it. v2: Check if the tsc deadline is already expired. Thank you Mika. v3: Execute the past timers immediately rather than submitting them to hrtimer_start(). v4: Rearm the periodic timer with advance_periodic_target_expiration() a simpler version of set_target_expiration(). Thank you Paolo. Cc: Mika Penttilä <mika.penttila@nextfour.com> Cc: Wanpeng Li <kernellwp@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Anthoine Bourgeois <anthoine.bourgeois@blade-group.com> 8003c9ae204e ("KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support") Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-05-05genksyms: fix typo in parse.tab.{c,h} generation rulesMauro Rossi1-2/+2
'quet' is replaced by 'quiet' in scripts/genksyms/Makefile Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-05-05kbuild: replace hardcoded bison in cmd_bison_h with $(YACC)Masahiro Yamada1-1/+1
Commit 73a4f6dbe70a ("kbuild: add LEX and YACC variables") missed to update cmd_bison_h somehow. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-05-05gcc-plugins: fix build condition of SANCOV pluginMasahiro Yamada1-1/+1
Since commit d677a4d60193 ("Makefile: support flag -fsanitizer-coverage=trace-cmp"), you miss to build the SANCOV plugin under some circumstances. CONFIG_KCOV=y CONFIG_KCOV_ENABLE_COMPARISONS=y Your compiler does not support -fsanitize-coverage=trace-pc Your compiler does not support -fsanitize-coverage=trace-cmp Under this condition, $(CFLAGS_KCOV) is not empty but contains a space, so the following ifeq-conditional is false. ifeq ($(CFLAGS_KCOV),) Then, scripts/Makefile.gcc-plugins misses to add sancov_plugin.so to gcc-plugin-y while the SANCOV plugin is necessary as an alternative means. Fixes: d677a4d60193 ("Makefile: support flag -fsanitizer-coverage=trace-cmp") Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Kees Cook <keescook@chromium.org>
2018-05-05MAINTAINERS: Update Kbuild entry with a few pathsRasmus Villemoes1-1/+3
I managed to send some modpost patches to old addresses of both Masahiro and Michal, and omitted linux-kbuild from cc, because my tried and trusted scripts/get_maintainer wrapper failed me. Add the modpost directory to the MAINTAINERS entry, and while at it make the Makefile glob match scripts/Makefile itself, and add one matching the Kbuild.include file as well. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-05-04Revert "usb: host: ehci: Use dma_pool_zalloc()"Greg Kroah-Hartman2-3/+6
This reverts commit 22072e83ebd510fb6a090aef9d65ccfda9b1e7e4 as it is broken. Alan writes: What you can't see just from reading the patch is that in both cases (ehci->itd_pool and ehci->sitd_pool) there are two allocation paths -- the two branches of an "if" statement -- and only one of the paths calls dma_pool_[z]alloc. However, the memset is needed for both paths, and so it can't be eliminated. Given that it must be present, there's no advantage to calling dma_pool_zalloc rather than dma_pool_alloc. Reported-by: Erick Cafferata <erick@cafferata.me> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Souptick Joarder <jrdr.linux@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-04platform/x86: Kconfig: Fix dell-laptop dependency chain.Mario Limonciello1-1/+1
As reported by Randy Dunlap: >> WARNING: unmet direct dependencies detected for DELL_SMBIOS >> Depends on [m]: X86 [=y] && X86_PLATFORM_DEVICES [=y] >> && (DCDBAS [=m] || >> DCDBAS [=m]=n) && (ACPI_WMI [=n] || ACPI_WMI [=n]=n) >> Selected by [y]: >> - DELL_LAPTOP [=y] && X86 [=y] && X86_PLATFORM_DEVICES [=y] >> && DMI [=y] >> && BACKLIGHT_CLASS_DEVICE [=y] && (ACPI_VIDEO [=n] || >> ACPI_VIDEO [=n]=n) >> && (RFKILL [=n] || RFKILL [=n]=n) && SERIO_I8042 [=y] >> Right now it's possible to set dell laptop to compile in but this causes dell-smbios to compile in which breaks if dcdbas is a module. Dell laptop shouldn't select dell-smbios anymore, but depend on it. Fixes: 32d7b19bad96 (platform/x86: dell-smbios: Resolve dependency error on DCDBAS) Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> Cc: stable@vger.kernel.org Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2018-05-04platform/x86: asus-wireless: Fix NULL pointer dereferenceJoão Paulo Rechi Vita1-1/+3
When the module is removed the led workqueue is destroyed in the remove callback, before the led device is unregistered from the led subsystem. This leads to a NULL pointer derefence when the led device is unregistered automatically later as part of the module removal cleanup. Bellow is the backtrace showing the problem. BUG: unable to handle kernel NULL pointer dereference at (null) IP: __queue_work+0x8c/0x410 PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI Modules linked in: ccm edac_mce_amd kvm_amd kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 joydev crypto_simd asus_nb_wmi glue_helper uvcvideo snd_hda_codec_conexant snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel asus_wmi snd_hda_codec cryptd snd_hda_core sparse_keymap videobuf2_vmalloc arc4 videobuf2_memops snd_hwdep input_leds videobuf2_v4l2 ath9k psmouse videobuf2_core videodev ath9k_common snd_pcm ath9k_hw media fam15h_power ath k10temp snd_timer mac80211 i2c_piix4 r8169 mii mac_hid cfg80211 asus_wireless(-) snd soundcore wmi shpchp 8250_dw ip_tables x_tables amdkfd amd_iommu_v2 amdgpu radeon chash i2c_algo_bit drm_kms_helper syscopyarea serio_raw sysfillrect sysimgblt fb_sys_fops ahci ttm libahci drm video CPU: 3 PID: 2177 Comm: rmmod Not tainted 4.15.0-5-generic #6+dev94.b4287e5bem1-Endless Hardware name: ASUSTeK COMPUTER INC. X555DG/X555DG, BIOS 5.011 05/05/2015 RIP: 0010:__queue_work+0x8c/0x410 RSP: 0018:ffffbe8cc249fcd8 EFLAGS: 00010086 RAX: ffff992ac6810800 RBX: 0000000000000000 RCX: 0000000000000008 RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff992ac6400e18 RBP: ffffbe8cc249fd18 R08: ffff992ac6400db0 R09: 0000000000000000 R10: 0000000000000040 R11: ffff992ac6400dd8 R12: 0000000000002000 R13: ffff992abd762e00 R14: ffff992abd763e38 R15: 000000000001ebe0 FS: 00007f318203e700(0000) GS:ffff992aced80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000001c720e000 CR4: 00000000001406e0 Call Trace: queue_work_on+0x38/0x40 led_state_set+0x2c/0x40 [asus_wireless] led_set_brightness_nopm+0x14/0x40 led_set_brightness+0x37/0x60 led_trigger_set+0xfc/0x1d0 led_classdev_unregister+0x32/0xd0 devm_led_classdev_release+0x11/0x20 release_nodes+0x109/0x1f0 devres_release_all+0x3c/0x50 device_release_driver_internal+0x16d/0x220 driver_detach+0x3f/0x80 bus_remove_driver+0x55/0xd0 driver_unregister+0x2c/0x40 acpi_bus_unregister_driver+0x15/0x20 asus_wireless_driver_exit+0x10/0xb7c [asus_wireless] SyS_delete_module+0x1da/0x2b0 entry_SYSCALL_64_fastpath+0x24/0x87 RIP: 0033:0x7f3181b65fd7 RSP: 002b:00007ffe74bcbe18 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3181b65fd7 RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000555ea2559258 RBP: 0000555ea25591f0 R08: 00007ffe74bcad91 R09: 000000000000000a R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000003 R13: 00007ffe74bcae00 R14: 0000000000000000 R15: 0000555ea25591f0 Code: 01 00 00 02 0f 85 7d 01 00 00 48 63 45 d4 48 c7 c6 00 f4 fa 87 49 8b 9d 08 01 00 00 48 03 1c c6 4c 89 f7 e8 87 fb ff ff 48 85 c0 <48> 8b 3b 0f 84 c5 01 00 00 48 39 f8 0f 84 bc 01 00 00 48 89 c7 RIP: __queue_work+0x8c/0x410 RSP: ffffbe8cc249fcd8 CR2: 0000000000000000 ---[ end trace 7aa4f4a232e9c39c ]--- Unregistering the led device on the remove callback before destroying the workqueue avoids this problem. https://bugzilla.kernel.org/show_bug.cgi?id=196097 Reported-by: Dun Hum <bitter.taste@gmx.com> Cc: stable@vger.kernel.org Signed-off-by: João Paulo Rechi Vita <jprvita@endlessm.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2018-05-04arm64: vgic-v2: Fix proxying of cpuif accessJames Morse1-5/+19
Proxying the cpuif accesses at EL2 makes use of vcpu_data_guest_to_host and co, which check the endianness, which call into vcpu_read_sys_reg... which isn't mapped at EL2 (it was inlined before, and got moved OoL with the VHE optimizations). The result is of course a nice panic. Let's add some specialized cruft to keep the broken platforms that require this hack alive. But, this code used vcpu_data_guest_to_host(), which expected us to write the value to host memory, instead we have trapped the guest's read or write to an mmio-device, and are about to replay it using the host's readl()/writel() which also perform swabbing based on the host endianness. This goes wrong when both host and guest are big-endian, as readl()/writel() will undo the guest's swabbing, causing the big-endian value to be written to device-memory. What needs doing? A big-endian guest will have pre-swabbed data before storing, undo this. If its necessary for the host, writel() will re-swab it. For a read a big-endian guest expects to swab the data after the load. The hosts's readl() will correct for host endianness, giving us the device-memory's value in the register. For a big-endian guest, swab it as if we'd only done the load. For a little-endian guest, nothing needs doing as readl()/writel() leave the correct device-memory value in registers. Tested on Juno with that rarest of things: a big-endian 64K host. Based on a patch from Marc Zyngier. Reported-by: Suzuki K Poulose <suzuki.poulose@arm.com> Fixes: bf8feb39642b ("arm64: KVM: vgic-v2: Add GICV access from HYP") Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-04KVM: arm/arm64: vgic_init: Cleanup reference to process_maintenanceValentin Schneider1-1/+1
One comment still mentioned process_maintenance operations after commit af0614991ab6 ("KVM: arm/arm64: vgic: Get rid of unnecessary process_maintenance operation") Update the comment to point to vgic_fold_lr_state instead, which is where maintenance interrupts are taken care of. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-04KVM: arm64: Fix order of vcpu_write_sys_reg() argumentsJames Morse1-1/+1
A typo in kvm_vcpu_set_be()'s call: | vcpu_write_sys_reg(vcpu, SCTLR_EL1, sctlr) causes us to use the 32bit register value as an index into the sys_reg[] array, and sail off the end of the linear map when we try to bring up big-endian secondaries. | Unable to handle kernel paging request at virtual address ffff80098b982c00 | Mem abort info: | ESR = 0x96000045 | Exception class = DABT (current EL), IL = 32 bits | SET = 0, FnV = 0 | EA = 0, S1PTW = 0 | Data abort info: | ISV = 0, ISS = 0x00000045 | CM = 0, WnR = 1 | swapper pgtable: 4k pages, 48-bit VAs, pgdp = 000000002ea0571a | [ffff80098b982c00] pgd=00000009ffff8803, pud=0000000000000000 | Internal error: Oops: 96000045 [#1] PREEMPT SMP | Modules linked in: | CPU: 2 PID: 1561 Comm: kvm-vcpu-0 Not tainted 4.17.0-rc3-00001-ga912e2261ca6-dirty #1323 | Hardware name: ARM Juno development board (r1) (DT) | pstate: 60000005 (nZCv daif -PAN -UAO) | pc : vcpu_write_sys_reg+0x50/0x134 | lr : vcpu_write_sys_reg+0x50/0x134 | Process kvm-vcpu-0 (pid: 1561, stack limit = 0x000000006df4728b) | Call trace: | vcpu_write_sys_reg+0x50/0x134 | kvm_psci_vcpu_on+0x14c/0x150 | kvm_psci_0_2_call+0x244/0x2a4 | kvm_hvc_call_handler+0x1cc/0x258 | handle_hvc+0x20/0x3c | handle_exit+0x130/0x1ec | kvm_arch_vcpu_ioctl_run+0x340/0x614 | kvm_vcpu_ioctl+0x4d0/0x840 | do_vfs_ioctl+0xc8/0x8d0 | ksys_ioctl+0x78/0xa8 | sys_ioctl+0xc/0x18 | el0_svc_naked+0x30/0x34 | Code: 73620291 604d00b0 00201891 1ab10194 (957a33f8) |---[ end trace 4b4a4f9628596602 ]--- Fix the order of the arguments. Fixes: 8d404c4c24613 ("KVM: arm64: Rewrite system register accessors to read/write functions") CC: Christoffer Dall <cdall@cs.columbia.edu> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-04MAINTAINERS & files: Canonize the e-mails I use at filesMauro Carvalho Chehab74-109/+92
From now on, I'll start using my @kernel.org as my development e-mail. As such, let's remove the entries that point to the old mchehab@s-opensource.com at MAINTAINERS file. For the files written with a copyright with mchehab@s-opensource, let's keep Samsung on their names, using mchehab+samsung@kernel.org, in order to keep pointing to my employer, with sponsors the work. For the files written before I join Samsung (on July, 4 2013), let's just use mchehab@kernel.org. For bug reports, we can simply point to just kernel.org, as this will reach my mchehab+samsung inbox anyway. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Brian Warner <brian.warner@samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2018-05-04media: imx-media-csi: Fix inconsistent IS_ERR and PTR_ERRFrom: Gustavo A. R. Silva1-1/+1
Fix inconsistent IS_ERR and PTR_ERR in imx_csi_probe. The proper pointer to be passed as argument is pinctrl instead of priv->vdev. This issue was detected with the help of Coccinelle. Fixes: 52e17089d185 ("media: imx: Don't initialize vars that won't be used") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Tested-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2018-05-04tools: power/acpi, revert to LD = gccJiri Slaby1-0/+1
Commit 7ed1c1901fe5 (tools: fix cross-compile var clobbering) removed setting of LD to $(CROSS_COMPILE)gcc. This broke build of acpica (acpidump) in power/acpi: ld: unrecognized option '-D_LINUX' The tools pass CFLAGS to the linker (incl. -D_LINUX), so revert this particular change and let LD be $(CC) again. Note that the old behaviour was a bit different, it used $(CROSS_COMPILE)gcc which was eliminated by the commit 7ed1c1901fe5. We use $(CC) for that reason. Fixes: 7ed1c1901fe5 (tools: fix cross-compile var clobbering) Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: 4.16+ <stable@vger.kernel.org> # 4.16+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-03bdi: Fix oops in wb_workfn()Jan Kara1-1/+1
Syzbot has reported that it can hit a NULL pointer dereference in wb_workfn() due to wb->bdi->dev being NULL. This indicates that wb_workfn() was called for an already unregistered bdi which should not happen as wb_shutdown() called from bdi_unregister() should make sure all pending writeback works are completed before bdi is unregistered. Except that wb_workfn() itself can requeue the work with: mod_delayed_work(bdi_wq, &wb->dwork, 0); and if this happens while wb_shutdown() is waiting in: flush_delayed_work(&wb->dwork); the dwork can get executed after wb_shutdown() has finished and bdi_unregister() has cleared wb->bdi->dev. Make wb_workfn() use wakeup_wb() for requeueing the work which takes all the necessary precautions against racing with bdi unregistration. CC: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> CC: Tejun Heo <tj@kernel.org> Fixes: 839a8e8660b6777e7fe4e80af1a048aebe2b5977 Reported-by: syzbot <syzbot+9873874c735f2892e7e9@syzkaller.appspotmail.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-03RDMA/cma: Do not query GID during QP state transition to RTRParav Pandit1-7/+0
When commit [1] was added, SGID was queried to derive the SMAC address. Then, later on during a refactor [2], SMAC was no longer needed. However, the now useless GID query remained. Then during additional code changes later on, the GID query was being done in such a way that it caused iWARP queries to start breaking. Remove the useless GID query and resolve the iWARP breakage at the same time. This is discussed in [3]. [1] commit dd5f03beb4f7 ("IB/core: Ethernet L2 attributes in verbs/cm structures") [2] commit 5c266b2304fb ("IB/cm: Remove the usage of smac and vid of qp_attr and cm_av") [3] https://www.spinics.net/lists/linux-rdma/msg63951.html Suggested-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-05-03IB/mlx4: Fix integer overflow when calculating optimal MTT sizeJack Morgenstein1-1/+1
When the kernel was compiled using the UBSAN option, we saw the following stack trace: [ 1184.827917] UBSAN: Undefined behaviour in drivers/infiniband/hw/mlx4/mr.c:349:27 [ 1184.828114] signed integer overflow: [ 1184.828247] -2147483648 - 1 cannot be represented in type 'int' The problem was caused by calling round_up in procedure mlx4_ib_umem_calc_optimal_mtt_size (on line 349, as noted in the stack trace) with the second parameter (1 << block_shift) (which is an int). The second parameter should have been (1ULL << block_shift) (which is an unsigned long long). (1 << block_shift) is treated by the compiler as an int (because 1 is an integer). Now, local variable block_shift is initialized to 31. If block_shift is 31, 1 << block_shift is 1 << 31 = 0x80000000=-214748368. This is the most negative int value. Inside the round_up macro, there is a cast applied to ((1 << 31) - 1). However, this cast is applied AFTER ((1 << 31) - 1) is calculated. Since (1 << 31) is treated as an int, we get the negative overflow identified by UBSAN in the process of calculating ((1 << 31) - 1). The fix is to change (1 << block_shift) to (1ULL << block_shift) on line 349. Fixes: 9901abf58368 ("IB/mlx4: Use optimal numbers of MTT entries") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-05-03IB/hfi1: Fix memory leak in exception path in get_irq_affinity()Sebastian Sanchez1-6/+5
When IRQ affinity is set and the interrupt type is unknown, a cpu mask allocated within the function is never freed. Fix this memory leak by allocating memory within the scope where it is used. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-05-03IB/{hfi1, rdmavt}: Fix memory leak in hfi1_alloc_devdata() upon failureSebastian Sanchez3-10/+30
When allocating device data, if there's an allocation failure, the already allocated memory won't be freed such as per-cpu counters. Fix memory leaks in exception path by creating a common reentrant clean up function hfi1_clean_devdata() to be used at driver unload time and device data allocation failure. To accomplish this, free_platform_config() and clean_up_i2c() are changed to be reentrant to remove dependencies when they are called in different order. This helps avoid NULL pointer dereferences introduced by this patch if those two functions weren't reentrant. In addition, set dd->int_counter, dd->rcv_limit, dd->send_schedule and dd->tx_opstats to NULL after they're freed in hfi1_clean_devdata(), so that hfi1_clean_devdata() is fully reentrant. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-05-03IB/hfi1: Fix NULL pointer dereference when invalid num_vls is usedSebastian Sanchez2-3/+2
When an invalid num_vls is used as a module parameter, the code execution follows an exception path where the macro dd_dev_err() expects dd->pcidev->dev not to be NULL in hfi1_init_dd(). This causes a NULL pointer dereference. Fix hfi1_init_dd() by initializing dd->pcidev and dd->pcidev->dev earlier in the code. If a dd exists, then dd->pcidev and dd->pcidev->dev always exists. BUG: unable to handle kernel NULL pointer dereference at 00000000000000f0 IP: __dev_printk+0x15/0x90 Workqueue: events work_for_cpu_fn RIP: 0010:__dev_printk+0x15/0x90 Call Trace: dev_err+0x6c/0x90 ? hfi1_init_pportdata+0x38d/0x3f0 [hfi1] hfi1_init_dd+0xdd/0x2530 [hfi1] ? pci_conf1_read+0xb2/0xf0 ? pci_read_config_word.part.9+0x64/0x80 ? pci_conf1_write+0xb0/0xf0 ? pcie_capability_clear_and_set_word+0x57/0x80 init_one+0x141/0x490 [hfi1] local_pci_probe+0x3f/0xa0 work_for_cpu_fn+0x10/0x20 process_one_work+0x152/0x350 worker_thread+0x1cf/0x3e0 kthread+0xf5/0x130 ? max_active_store+0x80/0x80 ? kthread_bind+0x10/0x10 ? do_syscall_64+0x6e/0x1a0 ? SyS_exit_group+0x10/0x10 ret_from_fork+0x35/0x40 Cc: <stable@vger.kernel.org> # 4.9.x Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>