aboutsummaryrefslogtreecommitdiffstats
path: root/virt (follow)
AgeCommit message (Collapse)AuthorFilesLines
2018-07-21KVM: arm/arm64: vgic: Allow configuration of interrupt groupsChristoffer Dall5-4/+66
Implement the required MMIO accessors for GICv2 and GICv3 for the IGROUPR distributor and redistributor registers. This can allow guests to change behavior compared to running on previous versions of KVM, but only to align with the architecture and hardware implementations. This also allows userspace to configure the interrupts groups for GICv3. We don't allow userspace to write the groups on GICv2 just yet, because that would result in GICv2 guests not receiving interrupts after migrating from an older kernel that exposes GICv2 interrupts as group 1. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-21KVM: arm/arm64: vgic: Return error on incompatible uaccess GICD_IIDR writesChristoffer Dall2-6/+36
If userspace attempts to write a GICD_IIDR that does not match the kernel version, return an error to userspace. The intention is to allow implementation changes inside KVM while avoiding silently breaking migration resulting in guests not running without any clear indication of what went wrong. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-21KVM: arm/arm64: vgic: Permit uaccess writes to return errorsChristoffer Dall3-18/+31
Currently we do not allow any vgic mmio write operations to fail, which makes sense from mmio traps from the guest. However, we should be able to report failures to userspace, if userspace writes incompatible values to read-only registers. Rework the internal interface to allow errors to be returned on the write side for userspace writes. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-21KVM: arm/arm64: vgic: Signal IRQs using their configured groupChristoffer Dall2-5/+4
Now when we have a group configuration on the struct IRQ, use this state when populating the LR and signaling interrupts as either group 0 or group 1 to the VM. Depending on the model of the emulated GIC, and the guest's configuration of the VMCR, interrupts may be signaled as IRQs or FIQs to the VM. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-21KVM: arm/arm64: vgic: Add group field to struct irqChristoffer Dall3-5/+23
In preparation for proper group 0 and group 1 support in the vgic, we add a field in the struct irq to store the group of all interrupts. We initialize the group to group 0 when emulating GICv2 and to group 1 when emulating GICv3, just like we treat them today. LPIs are always group 1. We also continue to ignore writes from the guest, preserving existing functionality, for now. Finally, we also add this field to the vgic debug logic to show the group for all interrupts. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-21KVM: arm/arm64: vgic: GICv2 IGROUPR should read as zeroChristoffer Dall2-2/+8
We currently don't support grouping in the emulated VGIC, which is a known defect on KVM (not hurting any currently used guests as far as we're aware). This is currently handled by treating all interrupts as group 0 interrupts for an emulated GICv2 and always signaling interrupts as group 0 to the virtual CPU interface. However, when reading which group interrupts belongs to in the guest from the emulated VGIC, the VGIC currently reports group 1 instead of group 0, which is misleading. Fix this temporarily before introducing full group support by changing the hander to _raz instead of _rao. Fixes: fb848db39661a "KVM: arm/arm64: vgic-new: Add GICv2 MMIO handling framework" Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-21KVM: arm/arm64: vgic: Keep track of implementation revisionChristoffer Dall3-4/+9
As we are about to tweak implementation aspects of the VGIC emulation, while still preserving some level of backwards compatibility support, add a field to keep track of the implementation revision field which is reported to the VM and to userspace. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-21KVM: arm/arm64: vgic: Define GICD_IIDR fields for GICv2 and GIv3Christoffer Dall2-2/+4
Instead of hardcoding the shifts and masks in the GICD_IIDR register emulation, let's add the definition of these fields to the GIC header files and use them. This will make things more obvious when we're going to bump the revision in the IIDR when we'll make guest-visible changes to the implementation. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-21KVM: arm/arm64: vgic-debug: Show LPI statusMarc Zyngier3-16/+39
The vgic debugfs file only knows about SGI/PPI/SPI interrupts, and completely ignores LPIs. Let's fix that. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-21KVM: arm64: vgic-its: Remove VLA usageKees Cook1-4/+10
In the quest to remove all stack VLA usage from the kernel[1], this switches to using a maximum size and adds sanity checks. Additionally cleans up some of the int-vs-u32 usage and adds additional bounds checking. As it currently stands, this will always be 8 bytes until the ABI changes. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: Eric Auger <eric.auger@redhat.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Kees Cook <keescook@chromium.org> [maz: dropped WARN_ONs] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-21KVM: arm/arm64: Fix vgic init raceChristoffer Dall1-0/+4
The vgic_init function can race with kvm_arch_vcpu_create() which does not hold kvm_lock() and we therefore have no synchronization primitives to ensure we're doing the right thing. As the user is trying to initialize or run the VM while at the same time creating more VCPUs, we just have to refuse to initialize the VGIC in this case rather than silently failing with a broken VCPU. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-18Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-6/+11
Pull kvm fixes from Paolo Bonzini: "Miscellaneous bugfixes, plus a small patchlet related to Spectre v2" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvmclock: fix TSC calibration for nested guests KVM: VMX: Mark VMXArea with revision_id of physical CPU even when eVMCS enabled KVM: irqfd: fix race between EPOLLHUP and irq_bypass_register_consumer KVM/Eventfd: Avoid crash when assign and deassign specific eventfd in parallel. x86/kvmclock: set pvti_cpu0_va after enabling kvmclock x86/kvm/Kconfig: Ensure CRYPTO_DEV_CCP_DD state at minimum matches KVM_AMD kvm: nVMX: Restore exit qual for VM-entry failure due to MSR loading x86/kvm/vmx: don't read current->thread.{fs,gs}base of legacy tasks KVM: VMX: support MSR_IA32_ARCH_CAPABILITIES as a feature MSR
2018-07-18KVM: irqfd: fix race between EPOLLHUP and irq_bypass_register_consumerPaolo Bonzini1-5/+6
A comment warning against this bug is there, but the code is not doing what the comment says. Therefore it is possible that an EPOLLHUP races against irq_bypass_register_consumer. The EPOLLHUP handler schedules irqfd_shutdown, and if that runs soon enough, you get a use-after-free. Reported-by: syzbot <syzkaller@googlegroups.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com>
2018-07-18KVM/Eventfd: Avoid crash when assign and deassign specific eventfd in parallel.Lan Tianyu1-1/+5
Syzbot reports crashes in kvm_irqfd_assign(), caused by use-after-free when kvm_irqfd_assign() and kvm_irqfd_deassign() run in parallel for one specific eventfd. When the assign path hasn't finished but irqfd has been added to kvm->irqfds.items list, another thead may deassign the eventfd and free struct kvm_kernel_irqfd(). The assign path then uses the struct kvm_kernel_irqfd that has been freed by deassign path. To avoid such issue, keep irqfd under kvm->irq_srcu protection after the irqfd has been added to kvm->irqfds.items list, and call synchronize_srcu() in irq_shutdown() to make sure that irqfd has been fully initialized in the assign path. Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Tianyu Lan <tianyu.lan@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-13KVM: s390: a utility function for migrationClaudio Imbrenda1-1/+1
Introduce a utility function that will be used later on for storage attributes migration, and use it in kvm_main.c to replace existing code that does the same thing. Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Message-Id: <1525106005-13931-2-git-send-email-imbrenda@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-07-09KVM: arm/arm64: Enable adaptative WFE trappingMarc Zyngier1-0/+6
Trapping blocking WFE is extremely beneficial in situations where the system is oversubscribed, as it allows another thread to run while being blocked. In a non-oversubscribed environment, this is the complete opposite, and trapping WFE is just unnecessary overhead. Let's only enable WFE trapping if the CPU has more than a single task to run (that is, more than just the vcpu thread). Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-09KVM: arm/arm64: Remove unnecessary CMOs when creating HYP page tablesMarc Zyngier1-4/+0
There is no need to perform cache maintenance operations when creating the HYP page tables if we have the multiprocessing extensions. ARMv7 mandates them with the virtualization support, and ARMv8 just mandates them unconditionally. Let's remove these operations. Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-09KVM: arm/arm64: Stop using the kernel's {pmd,pud,pgd}_populate helpersMarc Zyngier1-4/+21
The {pmd,pud,pgd}_populate accessors usage have always been a bit weird in KVM. We don't have a struct mm to pass (and neither does the kernel most of the time, but still...), and the 32bit code has all kind of cache maintenance that doesn't make sense on ARMv7+ when MP extensions are mandatory (which is the case when the VEs are present). Let's bite the bullet and provide our own implementations. The only bit of architectural code left has to do with building the table entry itself (arm64 having up to 52bit PA, arm lacking PUD level). Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-09KVM: arm/arm64: Consolidate page-table accessorsMarc Zyngier1-0/+12
The arm and arm64 KVM page tables accessors are pointlessly different between the two architectures, and likely both wrong one way or another: arm64 lacks a dsb(), and arm doesn't use WRITE_ONCE. Let's unify them. Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-09arm64: KVM: Add support for Stage-2 control of memory types and cacheabilityMarc Zyngier1-0/+4
Up to ARMv8.3, the combinaison of Stage-1 and Stage-2 attributes results in the strongest attribute of the two stages. This means that the hypervisor has to perform quite a lot of cache maintenance just in case the guest has some non-cacheable mappings around. ARMv8.4 solves this problem by offering a different mode (FWB) where Stage-2 has total control over the memory attribute (this is limited to systems where both I/O and instruction fetches are coherent with the dcache). This is achieved by having a different set of memory attributes in the page tables, and a new bit set in HCR_EL2. On such a system, we can then safely sidestep any form of dcache management. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-07-05kvm/arm: use PSR_AA32 definitionsMark Rutland1-10/+10
Some code cares about the SPSR_ELx format for exceptions taken from AArch32 to inspect or manipulate the SPSR_ELx value, which is already in the SPSR_ELx format, and not in the AArch32 PSR format. To separate these from cases where we care about the AArch32 PSR format, migrate these cases to use the PSR_AA32_* definitions rather than COMPAT_PSR_*. There should be no functional change as a result of this patch. Note that arm64 KVM does not support a compat KVM API, and always uses the SPSR_ELx format, even for AArch32 guests. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-03Merge branch 'sched/urgent' into sched/core, to pick up fixesIngo Molnar4-16/+12
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-06-21KVM: arm64: Prevent KVM_COMPAT from being selectedMarc Zyngier1-1/+1
There is very little point in trying to support the 32bit KVM/arm API on arm64, and this was never an anticipated use case. Let's make it clear by not selecting KVM_COMPAT. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-06-21KVM: Enforce error in ioctl for compat tasks when !KVM_COMPATMarc Zyngier1-10/+9
The current behaviour of the compat ioctls is a bit odd. We provide a compat_ioctl method when KVM_COMPAT is set, and NULL otherwise. But NULL means that the normal, non-compat ioctl should be used directly for compat tasks, and there is no way to actually prevent a compat task from issueing KVM ioctls. This patch changes this behaviour, by always registering a compat_ioctl method, even if KVM_COMPAT is not selected. In that case, the callback will always return -EINVAL. Fixes: de8e5d744051568c8aad ("KVM: Disable compat ioctl for s390") Reported-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-06-21KVM: arm/arm64: add WARN_ON if size is not PAGE_SIZE aligned in unmap_stage2_rangeJia He1-0/+2
There is a panic in armv8a server(QDF2400) under memory pressure tests (start 20 guests and run memhog in the host). ---------------------------------begin-------------------------------- [35380.800950] BUG: Bad page state in process qemu-kvm pfn:dd0b6 [35380.805825] page:ffff7fe003742d80 count:-4871 mapcount:-2126053375 mapping: (null) index:0x0 [35380.815024] flags: 0x1fffc00000000000() [35380.818845] raw: 1fffc00000000000 0000000000000000 0000000000000000 ffffecf981470000 [35380.826569] raw: dead000000000100 dead000000000200 ffff8017c001c000 0000000000000000 [35380.805825] page:ffff7fe003742d80 count:-4871 mapcount:-2126053375 mapping: (null) index:0x0 [35380.815024] flags: 0x1fffc00000000000() [35380.818845] raw: 1fffc00000000000 0000000000000000 0000000000000000 ffffecf981470000 [35380.826569] raw: dead000000000100 dead000000000200 ffff8017c001c000 0000000000000000 [35380.834294] page dumped because: nonzero _refcount [...] --------------------------------end-------------------------------------- The root cause might be what was fixed at [1]. But from the KVM points of view, it would be better if the issue was caught earlier. If the size is not PAGE_SIZE aligned, unmap_stage2_range might unmap the wrong(more or less) page range. Hence it caused the "BUG: Bad page state" Let's WARN in that case, so that the issue is obvious. [1] https://lkml.org/lkml/2018/5/3/1042 Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: jia.he@hxt-semitech.com [maz: tidied up commit message] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-06-21KVM: arm/arm64: Drop resource size check for GICV windowArd Biesheuvel1-5/+0
When booting a 64 KB pages kernel on a ACPI GICv3 system that implements support for v2 emulation, the following warning is produced GICV size 0x2000 not a multiple of page size 0x10000 and support for v2 emulation is disabled, preventing GICv2 VMs from being able to run on such hosts. The reason is that vgic_v3_probe() performs a sanity check on the size of the window (it should be a multiple of the page size), while the ACPI MADT parsing code hardcodes the size of the window to 8 KB. This makes sense, considering that ACPI does not bother to describe the size in the first place, under the assumption that platforms implementing ACPI will follow the architecture and not put anything else in the same 64 KB window. So let's just drop the sanity check altogether, and assume that the window is at least 64 KB in size. Fixes: 909777324588 ("KVM: arm/arm64: vgic-new: vgic_init: implement kvm_vgic_hyp_init") Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-06-20sched/swait: Rename to exclusivePeter Zijlstra4-6/+6
Since swait basically implemented exclusive waits only, make sure the API reflects that. $ git grep -l -e "\<swake_up\>" -e "\<swait_event[^ (]*" -e "\<prepare_to_swait\>" | while read file; do sed -i -e 's/\<swake_up\>/&_one/g' -e 's/\<swait_event[^ (]*/&_exclusive/g' -e 's/\<prepare_to_swait\>/&_exclusive/g' $file; done With a few manual touch-ups. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: bigeasy@linutronix.de Cc: oleg@redhat.com Cc: paulmck@linux.vnet.ibm.com Cc: pbonzini@redhat.com Link: https://lkml.kernel.org/r/20180612083909.261946548@infradead.org
2018-06-12Merge tag 'overflow-v4.18-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linuxLinus Torvalds2-2/+3
Pull more overflow updates from Kees Cook: "The rest of the overflow changes for v4.18-rc1. This includes the explicit overflow fixes from Silvio, further struct_size() conversions from Matthew, and a bug fix from Dan. But the bulk of it is the treewide conversions to use either the 2-factor argument allocators (e.g. kmalloc(a * b, ...) into kmalloc_array(a, b, ...) or the array_size() macros (e.g. vmalloc(a * b) into vmalloc(array_size(a, b)). Coccinelle was fighting me on several fronts, so I've done a bunch of manual whitespace updates in the patches as well. Summary: - Error path bug fix for overflow tests (Dan) - Additional struct_size() conversions (Matthew, Kees) - Explicitly reported overflow fixes (Silvio, Kees) - Add missing kvcalloc() function (Kees) - Treewide conversions of allocators to use either 2-factor argument variant when available, or array_size() and array3_size() as needed (Kees)" * tag 'overflow-v4.18-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (26 commits) treewide: Use array_size in f2fs_kvzalloc() treewide: Use array_size() in f2fs_kzalloc() treewide: Use array_size() in f2fs_kmalloc() treewide: Use array_size() in sock_kmalloc() treewide: Use array_size() in kvzalloc_node() treewide: Use array_size() in vzalloc_node() treewide: Use array_size() in vzalloc() treewide: Use array_size() in vmalloc() treewide: devm_kzalloc() -> devm_kcalloc() treewide: devm_kmalloc() -> devm_kmalloc_array() treewide: kvzalloc() -> kvcalloc() treewide: kvmalloc() -> kvmalloc_array() treewide: kzalloc_node() -> kcalloc_node() treewide: kzalloc() -> kcalloc() treewide: kmalloc() -> kmalloc_array() mm: Introduce kvcalloc() video: uvesafb: Fix integer overflow in allocation UBIFS: Fix potential integer overflow in allocation leds: Use struct_size() in allocation Convert intel uncore to struct_size ...
2018-06-12treewide: Use array_size() in vmalloc()Kees Cook1-1/+2
The vmalloc() function has no 2-factor argument form, so multiplication factors need to be wrapped in array_size(). This patch replaces cases of: vmalloc(a * b) with: vmalloc(array_size(a, b)) as well as handling cases of: vmalloc(a * b * c) with: vmalloc(array3_size(a, b, c)) This does, however, attempt to ignore constant size factors like: vmalloc(4 * 1024) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( vmalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | vmalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( vmalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | vmalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | vmalloc( - sizeof(char) * (COUNT) + COUNT , ...) | vmalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | vmalloc( - sizeof(u8) * COUNT + COUNT , ...) | vmalloc( - sizeof(__u8) * COUNT + COUNT , ...) | vmalloc( - sizeof(char) * COUNT + COUNT , ...) | vmalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( vmalloc( - sizeof(TYPE) * (COUNT_ID) + array_size(COUNT_ID, sizeof(TYPE)) , ...) | vmalloc( - sizeof(TYPE) * COUNT_ID + array_size(COUNT_ID, sizeof(TYPE)) , ...) | vmalloc( - sizeof(TYPE) * (COUNT_CONST) + array_size(COUNT_CONST, sizeof(TYPE)) , ...) | vmalloc( - sizeof(TYPE) * COUNT_CONST + array_size(COUNT_CONST, sizeof(TYPE)) , ...) | vmalloc( - sizeof(THING) * (COUNT_ID) + array_size(COUNT_ID, sizeof(THING)) , ...) | vmalloc( - sizeof(THING) * COUNT_ID + array_size(COUNT_ID, sizeof(THING)) , ...) | vmalloc( - sizeof(THING) * (COUNT_CONST) + array_size(COUNT_CONST, sizeof(THING)) , ...) | vmalloc( - sizeof(THING) * COUNT_CONST + array_size(COUNT_CONST, sizeof(THING)) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ vmalloc( - SIZE * COUNT + array_size(COUNT, SIZE) , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( vmalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | vmalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | vmalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | vmalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | vmalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | vmalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | vmalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | vmalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( vmalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | vmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | vmalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | vmalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | vmalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | vmalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( vmalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | vmalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | vmalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | vmalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | vmalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | vmalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | vmalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | vmalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( vmalloc(C1 * C2 * C3, ...) | vmalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants. @@ expression E1, E2; constant C1, C2; @@ ( vmalloc(C1 * C2, ...) | vmalloc( - E1 * E2 + array_size(E1, E2) , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12treewide: kzalloc() -> kcalloc()Kees Cook1-1/+1
The kzalloc() function has a 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a * b, gfp) as well as handling cases of: kzalloc(a * b * c, gfp) with: kzalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kzalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kzalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | kzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) | kzalloc( - sizeof(u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(__u8) * COUNT + COUNT , ...) | kzalloc( - sizeof(char) * COUNT + COUNT , ...) | kzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kzalloc + kcalloc ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | kzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | kzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kzalloc(C1 * C2 * C3, ...) | kzalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) | kzalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) | kzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kzalloc(sizeof(THING) * C2, ...) | kzalloc(sizeof(TYPE) * C2, ...) | kzalloc(C1 * C2 * C3, ...) | kzalloc(C1 * C2, ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) | - kzalloc + kcalloc ( - (E1) * E2 + E1, E2 , ...) | - kzalloc + kcalloc ( - (E1) * (E2) + E1, E2 , ...) | - kzalloc + kcalloc ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds9-141/+404
Pull KVM updates from Paolo Bonzini: "Small update for KVM: ARM: - lazy context-switching of FPSIMD registers on arm64 - "split" regions for vGIC redistributor s390: - cleanups for nested - clock handling - crypto - storage keys - control register bits x86: - many bugfixes - implement more Hyper-V super powers - implement lapic_timer_advance_ns even when the LAPIC timer is emulated using the processor's VMX preemption timer. - two security-related bugfixes at the top of the branch" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (79 commits) kvm: fix typo in flag name kvm: x86: use correct privilege level for sgdt/sidt/fxsave/fxrstor access KVM: x86: pass kvm_vcpu to kvm_read_guest_virt and kvm_write_guest_virt_system KVM: x86: introduce linear_{read,write}_system kvm: nVMX: Enforce cpl=0 for VMX instructions kvm: nVMX: Add support for "VMWRITE to any supported field" kvm: nVMX: Restrict VMX capability MSR changes KVM: VMX: Optimize tscdeadline timer latency KVM: docs: nVMX: Remove known limitations as they do not exist now KVM: docs: mmu: KVM support exposing SLAT to guests kvm: no need to check return value of debugfs_create functions kvm: Make VM ioctl do valloc for some archs kvm: Change return type to vm_fault_t KVM: docs: mmu: Fix link to NPT presentation from KVM Forum 2008 kvm: x86: Amend the KVM_GET_SUPPORTED_CPUID API documentation KVM: x86: hyperv: declare KVM_CAP_HYPERV_TLBFLUSH capability KVM: x86: hyperv: simplistic HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,SPACE}_EX implementation KVM: x86: hyperv: simplistic HVCALL_FLUSH_VIRTUAL_ADDRESS_{LIST,SPACE} implementation KVM: introduce kvm_make_vcpus_request_mask() API KVM: x86: hyperv: do rep check for each hypercall separately ...
2018-06-08Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds2-2/+20
Pull arm64 updates from Catalin Marinas: "Apart from the core arm64 and perf changes, the Spectre v4 mitigation touches the arm KVM code and the ACPI PPTT support touches drivers/ (acpi and cacheinfo). I should have the maintainers' acks in place. Summary: - Spectre v4 mitigation (Speculative Store Bypass Disable) support for arm64 using SMC firmware call to set a hardware chicken bit - ACPI PPTT (Processor Properties Topology Table) parsing support and enable the feature for arm64 - Report signal frame size to user via auxv (AT_MINSIGSTKSZ). The primary motivation is Scalable Vector Extensions which requires more space on the signal frame than the currently defined MINSIGSTKSZ - ARM perf patches: allow building arm-cci as module, demote dev_warn() to dev_dbg() in arm-ccn event_init(), miscellaneous cleanups - cmpwait() WFE optimisation to avoid some spurious wakeups - L1_CACHE_BYTES reverted back to 64 (for performance reasons that have to do with some network allocations) while keeping ARCH_DMA_MINALIGN to 128. cache_line_size() returns the actual hardware Cache Writeback Granule - Turn LSE atomics on by default in Kconfig - Kernel fault reporting tidying - Some #include and miscellaneous cleanups" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (53 commits) arm64: Fix syscall restarting around signal suppressed by tracer arm64: topology: Avoid checking numa mask for scheduler MC selection ACPI / PPTT: fix build when CONFIG_ACPI_PPTT is not enabled arm64: cpu_errata: include required headers arm64: KVM: Move VCPU_WORKAROUND_2_FLAG macros to the top of the file arm64: signal: Report signal frame size to userspace via auxv arm64/sve: Thin out initialisation sanity-checks for sve_max_vl arm64: KVM: Add ARCH_WORKAROUND_2 discovery through ARCH_FEATURES_FUNC_ID arm64: KVM: Handle guest's ARCH_WORKAROUND_2 requests arm64: KVM: Add ARCH_WORKAROUND_2 support for guests arm64: KVM: Add HYP per-cpu accessors arm64: ssbd: Add prctl interface for per-thread mitigation arm64: ssbd: Introduce thread flag to control userspace mitigation arm64: ssbd: Restore mitigation status on CPU resume arm64: ssbd: Skip apply_ssbd if not using dynamic mitigation arm64: ssbd: Add global mitigation state accessor arm64: Add 'ssbd' command-line option arm64: Add ARCH_WORKAROUND_2 probing arm64: Add per-cpu infrastructure to call ARCH_WORKAROUND_2 arm64: Call ARCH_WORKAROUND_2 on transitions between EL0 and EL1 ...
2018-06-04Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespaceLinus Torvalds1-0/+1
Pull siginfo updates from Eric Biederman: "This set of changes close the known issues with setting si_code to an invalid value, and with not fully initializing struct siginfo. There remains work to do on nds32, arc, unicore32, powerpc, arm, arm64, ia64 and x86 to get the code that generates siginfo into a simpler and more maintainable state. Most of that work involves refactoring the signal handling code and thus careful code review. Also not included is the work to shrink the in kernel version of struct siginfo. That depends on getting the number of places that directly manipulate struct siginfo under control, as it requires the introduction of struct kernel_siginfo for the in kernel things. Overall this set of changes looks like it is making good progress, and with a little luck I will be wrapping up the siginfo work next development cycle" * 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits) signal/sh: Stop gcc warning about an impossible case in do_divide_error signal/mips: Report FPE_FLTUNK for undiagnosed floating point exceptions signal/um: More carefully relay signals in relay_signal. signal: Extend siginfo_layout with SIL_FAULT_{MCEERR|BNDERR|PKUERR} signal: Remove unncessary #ifdef SEGV_PKUERR in 32bit compat code signal/signalfd: Add support for SIGSYS signal/signalfd: Remove __put_user from signalfd_copyinfo signal/xtensa: Use force_sig_fault where appropriate signal/xtensa: Consistenly use SIGBUS in do_unaligned_user signal/um: Use force_sig_fault where appropriate signal/sparc: Use force_sig_fault where appropriate signal/sparc: Use send_sig_fault where appropriate signal/sh: Use force_sig_fault where appropriate signal/s390: Use force_sig_fault where appropriate signal/riscv: Replace do_trap_siginfo with force_sig_fault signal/riscv: Use force_sig_fault where appropriate signal/parisc: Use force_sig_fault where appropriate signal/parisc: Use force_sig_mceerr where appropriate signal/openrisc: Use force_sig_fault where appropriate signal/nios2: Use force_sig_fault where appropriate ...
2018-06-04Merge branch 'work.aio-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-1/+1
Pull aio updates from Al Viro: "Majority of AIO stuff this cycle. aio-fsync and aio-poll, mostly. The only thing I'm holding back for a day or so is Adam's aio ioprio - his last-minute fixup is trivial (missing stub in !CONFIG_BLOCK case), but let it sit in -next for decency sake..." * 'work.aio-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) aio: sanitize the limit checking in io_submit(2) aio: fold do_io_submit() into callers aio: shift copyin of iocb into io_submit_one() aio_read_events_ring(): make a bit more readable aio: all callers of aio_{read,write,fsync,poll} treat 0 and -EIOCBQUEUED the same way aio: take list removal to (some) callers of aio_complete() aio: add missing break for the IOCB_CMD_FDSYNC case random: convert to ->poll_mask timerfd: convert to ->poll_mask eventfd: switch to ->poll_mask pipe: convert to ->poll_mask crypto: af_alg: convert to ->poll_mask net/rxrpc: convert to ->poll_mask net/iucv: convert to ->poll_mask net/phonet: convert to ->poll_mask net/nfc: convert to ->poll_mask net/caif: convert to ->poll_mask net/bluetooth: convert to ->poll_mask net/sctp: convert to ->poll_mask net/tipc: convert to ->poll_mask ...
2018-06-01kvm: no need to check return value of debugfs_create functionsGreg Kroah-Hartman3-47/+14
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. This cleans up the error handling a lot, as this code will never get hit. Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim KrÄmář" <rkrcmar@redhat.com> Cc: Arvind Yadav <arvind.yadav.cs@gmail.com> Cc: Eric Auger <eric.auger@redhat.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: kvm-ppc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Cc: kvm@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-01kvm: Make VM ioctl do valloc for some archsMarc Orr1-0/+15
The kvm struct has been bloating. For example, it's tens of kilo-bytes for x86, which turns out to be a large amount of memory to allocate contiguously via kzalloc. Thus, this patch does the following: 1. Uses architecture-specific routines to allocate the kvm struct via vzalloc for x86. 2. Switches arm to __KVM_HAVE_ARCH_VM_ALLOC so that it can use vzalloc when has_vhe() is true. Other architectures continue to default to kalloc, as they have a dependency on kalloc or have a small-enough struct kvm. Signed-off-by: Marc Orr <marcorr@google.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-01kvm: Change return type to vm_fault_tSouptick Joarder2-2/+2
Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. commit 1c8f422059ae ("mm: change return type to vm_fault_t") Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-01Merge tag 'kvmarm-for-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEADPaolo Bonzini8-84/+347
KVM/ARM updates for 4.18 - Lazy context-switching of FPSIMD registers on arm64 - Allow virtual redistributors to be part of two or more MMIO ranges
2018-05-31arm64: KVM: Add ARCH_WORKAROUND_2 discovery through ARCH_FEATURES_FUNC_IDMarc Zyngier1-2/+16
Now that all our infrastructure is in place, let's expose the availability of ARCH_WORKAROUND_2 to guests. We take this opportunity to tidy up a couple of SMCCC constants. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-05-31arm64: KVM: Add ARCH_WORKAROUND_2 support for guestsMarc Zyngier1-0/+4
In order to offer ARCH_WORKAROUND_2 support to guests, we need a bit of infrastructure. Let's add a flag indicating whether or not the guest uses SSBD mitigation. Depending on the state of this flag, allow KVM to disable ARCH_WORKAROUND_2 before entering the guest, and enable it when exiting it. Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-05-26KVM: introduce kvm_make_vcpus_request_mask() APIVitaly Kuznetsov1-8/+26
Hyper-V style PV TLB flush hypercalls inmplementation will use this API. To avoid memory allocation in CONFIG_CPUMASK_OFFSTACK case add cpumask_var_t argument. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-05-26fs: add new vfs_poll and file_can_poll helpersChristoph Hellwig1-1/+1
These abstract out calls to the poll method in preparation for changes in how we poll. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-05-25KVM: arm/arm64: Implement KVM_VGIC_V3_ADDR_TYPE_REDIST_REGIONEric Auger4-4/+67
Now all the internals are ready to handle multiple redistributor regions, let's allow the userspace to register them. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Check all vcpu redistributors are set on map_resourcesEric Auger1-5/+14
On vcpu first run, we eventually know the actual number of vcpus. This is a synchronization point to check all redistributors were assigned. On kvm_vgic_map_resources() we check both dist and redist were set, eventually check potential base address inconsistencies. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Check vcpu redist base before registering an iodevEric Auger2-0/+6
As we are going to register several redist regions, vgic_register_all_redist_iodevs() may be called several times. We need to register a redist_iodev for a given vcpu only once. So let's check if the base address has already been set. Initialize this latter in kvm_vgic_vcpu_init(). Signed-off-by: Eric Auger <eric.auger@redhat.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Remove kvm_vgic_vcpu_early_initEric Auger2-44/+37
kvm_vgic_vcpu_early_init gets called after kvm_vgic_cpu_init which is confusing. The call path is as follows: kvm_vm_ioctl_create_vcpu |_ kvm_arch_cpu_create |_ kvm_vcpu_init |_ kvm_arch_vcpu_init |_ kvm_vgic_vcpu_init |_ kvm_arch_vcpu_postcreate |_ kvm_vgic_vcpu_early_init Static initialization currently done in kvm_vgic_vcpu_early_init() can be moved to kvm_vgic_vcpu_init(). So let's move the code and remove kvm_vgic_vcpu_early_init(). kvm_arch_vcpu_postcreate() does nothing. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Helper to register a new redistributor regionEric Auger2-16/+81
We introduce a new helper that creates and inserts a new redistributor region into the rdist region list. This helper both handles the case where the redistributor region size is known at registration time and the legacy case where it is not (eventually depending on the number of online vcpus). Depending on pfns, we perform all the possible checks that we can do: - end of memory crossing - incorrect alignment of the base address - collision with distributor region if already defined - collision with already registered rdist regions - check of the new index Rdist regions must be inserted by increasing order of indices. Indices must be contiguous. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Adapt vgic_v3_check_base to multiple rdist regionsEric Auger2-17/+42
vgic_v3_check_base() currently only handles the case of a unique legacy redistributor region whose size is not explicitly set but inferred, instead, from the number of online vcpus. We adapt it to handle the case of multiple redistributor regions with explicitly defined size. We rely on two new helpers: - vgic_v3_rdist_overlap() is used to detect overlap with the dist region if defined - vgic_v3_rd_region_size computes the size of the redist region, would it be a legacy unique region or a new explicitly sized region. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Revisit Redistributor TYPER last bit computationEric Auger1-1/+6
The TYPER of an redistributor reflects whether the rdist is the last one of the redistributor region. Let's compare the TYPER GPA against the address of the last occupied slot within the redistributor region. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-05-25KVM: arm/arm64: Helper to locate free rdist indexEric Auger3-2/+35
We introduce vgic_v3_rdist_free_slot to help identifying where we can place a new 2x64KB redistributor. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>