aboutsummaryrefslogtreecommitdiffstats
path: root/arch/arm64/include (follow)
AgeCommit message (Collapse)AuthorFilesLines
2015-10-01irqchip / ACPI: Add probing infrastructure for ACPI-based irqchipsMarc Zyngier1-11/+0
DT enjoys a rather nice probing infrastructure for irqchips, while ACPI is so far stuck into a very distant past. This patch introduces a declarative API, allowing irqchips to be self-contained and be called when a particular entry is matched in the MADT table. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-09-29irqchip/gicv3: Workaround for Cavium ThunderX erratum 23154Robert Richter2-8/+12
This patch implements Cavium ThunderX erratum 23154. The gicv3 of ThunderX requires a modified version for reading the IAR status to ensure data synchronization. Since this is in the fast-path and called with each interrupt, runtime patching is used using jump label patching for smallest overhead (no-op). This is the same technique as used for tracepoints. Signed-off-by: Robert Richter <rrichter@cavium.com> Reviewed-by: Marc Zygnier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Tirumalesh Chalamarla <tchalamarla@cavium.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Jason Cooper <jason@lakedaemon.net> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/1442869119-1814-3-git-send-email-rric@kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-25Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+1
Pull KVM fixes from Paolo Bonzini: "AMD fixes for bugs introduced in the 4.2 merge window, and a few PPC bug fixes too" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: disable halt_poll_ns as default for s390x KVM: x86: fix off-by-one in reserved bits check KVM: x86: use correct page table format to check nested page table reserved bits KVM: svm: do not call kvm_set_cr0 from init_vmcb KVM: x86: trap AMD MSRs for the TSeg base and mask KVM: PPC: Book3S: Take the kvm->srcu lock in kvmppc_h_logical_ci_load/store() KVM: PPC: Book3S HV: Pass the correct trap argument to kvmhv_commence_exit KVM: PPC: Book3S HV: Fix handling of interrupted VCPUs kvm: svm: reset mmu on VCPU reset
2015-09-25KVM: disable halt_poll_ns as default for s390xDavid Hildenbrand1-0/+1
We observed some performance degradation on s390x with dynamic halt polling. Until we can provide a proper fix, let's enable halt_poll_ns as default only for supported architectures. Architectures are now free to set their own halt_poll_ns default value. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-23atomic, arch: Audit atomic_{read,set}()Peter Zijlstra1-1/+1
This patch makes sure that atomic_{read,set}() are at least {READ,WRITE}_ONCE(). We already had the 'requirement' that atomic_read() should use ACCESS_ONCE(), and most archs had this, but a few were lacking. All are now converted to use READ_ONCE(). And, by a symmetry and general paranoia argument, upgrade atomic_set() to use WRITE_ONCE(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: james.hogan@imgtec.com Cc: linux-kernel@vger.kernel.org Cc: oleg@redhat.com Cc: will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-18Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds3-13/+11
Pull KVM fixes from Paolo Bonzini: "Mostly stable material, a lot of ARM fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits) sched: access local runqueue directly in single_task_running arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS' arm64: KVM: Remove all traces of the ThumbEE registers arm: KVM: Disable virtual timer even if the guest is not using it arm64: KVM: Disable virtual timer even if the guest is not using it arm/arm64: KVM: vgic: Check for !irqchip_in_kernel() when mapping resources KVM: s390: Replace incorrect atomic_or with atomic_andnot arm: KVM: Fix incorrect device to IPA mapping arm64: KVM: Fix user access for debug registers KVM: vmx: fix VPID is 0000H in non-root operation KVM: add halt_attempted_poll to VCPU stats kvm: fix zero length mmio searching kvm: fix double free for fast mmio eventfd kvm: factor out core eventfd assign/deassign logic kvm: don't try to register to KVM_FAST_MMIO_BUS for non mmio eventfd KVM: make the declaration of functions within 80 characters KVM: arm64: add workaround for Cortex-A57 erratum #852523 KVM: fix polling for guest halt continued even if disable it arm/arm64: KVM: Fix PSCI affinity info return value for non valid cores arm64: KVM: set {v,}TCR_EL2 RES1 bits ...
2015-09-18Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-5/+0
Pull irq updates from Thomas Gleixner: "This is a rather large update post rc1 due to the final steps of cleanups and API changes which had to wait for the preparatory patches to hit your tree. - Regression fixes for ARM GIC irqchips - Regression fixes and lockdep anotations for renesas irq chips - The leftovers of the cleanup and preparatory patches which have been ignored by maintainers - Final conversions of the newly merged users of obsolete APIs - Final removal of obsolete APIs - Final removal of ARM artifacts which had been introduced during the conversion of ARM to the generic interrupt code. - Final split of the irq_data into chip specific and common data to reflect the needs of hierarchical irq domains. - Treewide removal of the first argument of interrupt flow handlers, i.e. the irq number, which is not used by the majority of handlers and simple to retrieve from the other argument the irq descriptor. - A few comment updates and build warning fixes" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) arm64: Remove ununsed set_irq_flags ARM: Remove ununsed set_irq_flags sh: Kill off set_irq_flags usage irqchip: Kill off set_irq_flags usage gpu/drm: Kill off set_irq_flags usage genirq: Remove irq argument from irq flow handlers genirq: Move field 'msi_desc' from irq_data into irq_common_data genirq: Move field 'affinity' from irq_data into irq_common_data genirq: Move field 'handler_data' from irq_data into irq_common_data genirq: Move field 'node' from irq_data into irq_common_data irqchip/gic-v3: Use IRQD_FORWARDED_TO_VCPU flag irqchip/gic: Use IRQD_FORWARDED_TO_VCPU flag genirq: Provide IRQD_FORWARDED_TO_VCPU status flag genirq: Simplify irq_data_to_desc() genirq: Remove __irq_set_handler_locked() pinctrl/pistachio: Use irq_set_handler_locked gpio: vf610: Use irq_set_handler_locked powerpc/mpc8xx: Use irq_set_handler_locked() powerpc/ipic: Use irq_set_handler_locked() powerpc/cpm2: Use irq_set_handler_locked() ...
2015-09-17Merge tag 'kvm-arm-for-4.3-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-masterPaolo Bonzini3-10/+3
Second set of KVM/ARM changes for 4.3-rc2 - Workaround for a Cortex-A57 erratum - Bug fix for the debugging infrastructure - Fix for 32bit guests with more than 4GB of address space on a 32bit host - A number of fixes for the (unusual) case when we don't use the in-kernel GIC emulation - Removal of ThumbEE handling on arm64, since these have been dropped from the architecture before anyone actually ever built a CPU - Remove the KVM_ARM_MAX_VCPUS limitation which has become fairly pointless
2015-09-17arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS'Ming Lei1-6/+2
This patch removes config option of KVM_ARM_MAX_VCPUS, and like other ARCHs, just choose the maximum allowed value from hardware, and follows the reasons: 1) from distribution view, the option has to be defined as the max allowed value because it need to meet all kinds of virtulization applications and need to support most of SoCs; 2) using a bigger value doesn't introduce extra memory consumption, and the help text in Kconfig isn't accurate because kvm_vpu structure isn't allocated until request of creating VCPU is sent from QEMU; 3) the main effect is that the field of vcpus[] in 'struct kvm' becomes a bit bigger(sizeof(void *) per vcpu) and need more cache lines to hold the structure, but 'struct kvm' is one generic struct, and it has worked well on other ARCHs already in this way. Also, the world switch frequecy is often low, for example, it is ~2000 when running kernel building load in VM from APM xgene KVM host, so the effect is very small, and the difference can't be observed in my test at all. Cc: Dann Frazier <dann.frazier@canonical.com> Signed-off-by: Ming Lei <ming.lei@canonical.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-09-17arm64: KVM: Remove all traces of the ThumbEE registersWill Deacon2-4/+1
Although the ThumbEE registers and traps were present in earlier versions of the v8 architecture, it was retrospectively removed and so we can do the same. Whilst this breaks migrating a guest started on a previous version of the kernel, it is much better to kill these (non existent) registers as soon as possible. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> [maz: added commend about migration] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-09-16arm64: Remove ununsed set_irq_flagsRob Herring1-5/+0
Now that all users of set_irq_flags and custom flags are converted to genirq functions, the ARM specific set_irq_flags can be removed. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Russell King <linux@arm.linux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-09-16KVM: add halt_attempted_poll to VCPU statsPaolo Bonzini1-0/+1
This new statistic can help diagnosing VCPUs that, for any reason, trigger bad behavior of halt_poll_ns autotuning. For example, say halt_poll_ns = 480000, and wakeups are spaced exactly like 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes 10+20+40+80+160+320+480 = 1110 microseconds out of every 479+481+479+481+479+481+479 = 3359 microseconds. The VCPU then is consuming about 30% more CPU than it would use without polling. This would show as an abnormally high number of attempted polling compared to the successful polls. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com< Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-14Merge tag 'kvm-arm-for-4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-masterPaolo Bonzini1-3/+7
KVM/ARM changes for 4.3-rc2 - Fix timer interrupt injection after the rework that went in during the merge window - Reset the timer to zero on reboot - Make sure the TCR_EL2 RES1 bits are really set to 1 - Fix a PSCI affinity bug for non-existing vcpus
2015-09-14arm64: pgtable: use a single bit for PTE_WRITE regardless of DBMWill Deacon1-5/+1
Depending on CONFIG_ARM64_HW_AFDBM, we use either bit 57 or 51 of the pte to represent PTE_WRITE. Given that bit 51 is reserved prior to ARMv8.1, we can just use that bit regardless of the config option. That also matches what happens if a kernel configured with ARM64_HW_AFDBM=y is run on a CPU without the DBM functionality. Cc: Julien Grall <julien.grall@citrix.com> Tested-by: Julien Grall <julien.grall@citrix.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-09-14arm64: Fix pte_modify() to preserve the hardware dirty informationCatalin Marinas1-1/+1
The pte_modify() function with hardware AF/DBM enabled must transfer the hardware dirty information to the software PTE_DIRTY bit. However, it was setting this bit in newprot and the mask does not cover such bit. This patch sets PTE_DIRTY on the original pte which will be preserved in the returned value. Fixes: 2f4b829c625e ("arm64: Add support for hardware updates of the access and dirty pte bits") Cc: Julien Grall <julien.grall@citrix.com> Tested-by: Julien Grall <julien.grall@citrix.com> Tested-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-09-14arm64: Fix the pte_hw_dirty() check when AF/DBM is enabledCatalin Marinas1-2/+2
Commit 2f4b829c625e ("arm64: Add support for hardware updates of the access and dirty pte bits") introduced support for handling hardware updates of the access flag and dirty status. The PTE is automatically dirtied in hardware (if supported) by clearing the PTE_RDONLY bit when the PTE_DBM/PTE_WRITE bit is set. The pte_hw_dirty() macro was added to detect a hardware dirtied pte. The pte_dirty() macro checks for both software PTE_DIRTY and pte_hw_dirty(). Functions like pte_modify() clear the PTE_RDONLY bit since it is meant to be set in set_pte_at() when written to memory. In such cases, pte_hw_dirty() would return true even though such pte is clean. This patch changes pte_hw_dirty() to test the PTE_DBM/PTE_WRITE bit together with PTE_RDONLY. Fixes: 2f4b829c625e ("arm64: Add support for hardware updates of the access and dirty pte bits") Reported-by: Julien Grall <julien.grall@citrix.com> Tested-by: Julien Grall <julien.grall@citrix.com> Tested-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-09-14arm64, acpi/apei: Implement arch_apei_get_mem_attributes()Jonathan (Zhixiong) Zhang1-0/+5
Table 8 of UEFI 2.5 section 2.3.6.1 defines mappings from EFI memory types to MAIR attribute encodings for arm64. If the physical address has memory attributes defined by EFI memmap as EFI_MEMORY_[UC|WC|WT], return approprate page protection type according to the UEFI spec. Otherwise, return PAGE_KERNEL. Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com> [ Small stylistic tweaks. ] Reviewed-by: Matt Fleming <matt.fleming@intel.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Hanjun Guo <hanjun.guo@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/1441372302-23242-2-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-10Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-69/+0
Merge third patch-bomb from Andrew Morton: - even more of the rest of MM - lib/ updates - checkpatch updates - small changes to a few scruffy filesystems - kmod fixes/cleanups - kexec updates - a dma-mapping cleanup series from hch * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (81 commits) dma-mapping: consolidate dma_set_mask dma-mapping: consolidate dma_supported dma-mapping: cosolidate dma_mapping_error dma-mapping: consolidate dma_{alloc,free}_noncoherent dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent} mm: use vma_is_anonymous() in create_huge_pmd() and wp_huge_pmd() mm: make sure all file VMAs have ->vm_ops set mm, mpx: add "vm_flags_t vm_flags" arg to do_mmap_pgoff() mm: mark most vm_operations_struct const namei: fix warning while make xmldocs caused by namei.c ipc: convert invalid scenarios to use WARN_ON zlib_deflate/deftree: remove bi_reverse() lib/decompress_unlzma: Do a NULL check for pointer lib/decompressors: use real out buf size for gunzip with kernel fs/affs: make root lookup from blkdev logical size sysctl: fix int -> unsigned long assignments in INT_MIN case kexec: export KERNEL_IMAGE_SIZE to vmcoreinfo kexec: align crash_notes allocation to make it be inside one physical page kexec: remove unnecessary test in kimage_alloc_crash_control_pages() kexec: split kexec_load syscall from kexec core code ...
2015-09-10Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds5-25/+99
Pull more kvm updates from Paolo Bonzini: "ARM: - Full debug support for arm64 - Active state switching for timer interrupts - Lazy FP/SIMD save/restore for arm64 - Generic ARMv8 target PPC: - Book3S: A few bug fixes - Book3S: Allow micro-threading on POWER8 x86: - Compiler warnings Generic: - Adaptive polling for guest halt" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (49 commits) kvm: irqchip: fix memory leak kvm: move new trace event outside #ifdef CONFIG_KVM_ASYNC_PF KVM: trace kvm_halt_poll_ns grow/shrink KVM: dynamic halt-polling KVM: make halt_poll_ns per-vCPU Silence compiler warning in arch/x86/kvm/emulate.c kvm: compile process_smi_save_seg_64() only for x86_64 KVM: x86: avoid uninitialized variable warning KVM: PPC: Book3S: Fix typo in top comment about locking KVM: PPC: Book3S: Fix size of the PSPB register KVM: PPC: Book3S HV: Exit on H_DOORBELL if HOST_IPI is set KVM: PPC: Book3S HV: Fix race in starting secondary threads KVM: PPC: Book3S: correct width in XER handling KVM: PPC: Book3S HV: Fix preempted vcore stolen time calculation KVM: PPC: Book3S HV: Fix preempted vcore list locking KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD KVM: PPC: Book3S HV: Fix bug in dirty page tracking KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8 KVM: PPC: Book3S HV: Make use of unused threads when running guests ...
2015-09-10dma-mapping: consolidate dma_set_maskChristoph Hellwig1-9/+0
Almost everyone implements dma_set_mask the same way, although some time that's hidden in ->set_dma_mask methods. This patch consolidates those into a common implementation that either calls ->set_dma_mask if present or otherwise uses the default implementation. Some architectures used to only call ->set_dma_mask after the initial checks, and those instance have been fixed to do the full work. h8300 implemented dma_set_mask bogusly as a no-ops and has been fixed. Unfortunately some architectures overload unrelated semantics like changing the dma_ops into it so we still need to allow for an architecture override for now. [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10dma-mapping: consolidate dma_supportedChristoph Hellwig1-6/+0
Most architectures just call into ->dma_supported, but some also return 1 if the method is not present, or 0 if no dma ops are present (although that should never happeb). Consolidate this more broad version into common code. Also fix h8300 which inorrectly always returned 0, which would have been a problem if it's dma_set_mask implementation wasn't a similarly buggy noop. As a few architectures have much more elaborate implementations, we still allow for arch overrides. [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10dma-mapping: cosolidate dma_mapping_errorChristoph Hellwig1-7/+0
Currently there are three valid implementations of dma_mapping_error: (1) call ->mapping_error (2) check for a hardcoded error code (3) always return 0 This patch provides a common implementation that calls ->mapping_error if present, then checks for DMA_ERROR_CODE if defined or otherwise returns 0. [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10dma-mapping: consolidate dma_{alloc,free}_noncoherentChristoph Hellwig1-14/+0
Most architectures do not support non-coherent allocations and either define dma_{alloc,free}_noncoherent to their coherent versions or stub them out. Openrisc uses dma_{alloc,free}_attrs to implement them, and only Mips implements them directly. This patch moves the Openrisc version to common code, and handles the DMA_ATTR_NON_CONSISTENT case in the mips dma_map_ops instance. Note that actual non-coherent allocations require a dma_cache_sync implementation, so if non-coherent allocations didn't work on an architecture before this patch they still won't work after it. [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-10dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent}Christoph Hellwig1-33/+0
Since 2009 we have a nice asm-generic header implementing lots of DMA API functions for architectures using struct dma_map_ops, but unfortunately it's still missing a lot of APIs that all architectures still have to duplicate. This series consolidates the remaining functions, although we still need arch opt outs for two of them as a few architectures have very non-standard implementations. This patch (of 5): The coherent DMA allocator works the same over all architectures supporting dma_map operations. This patch consolidates them and converges the minor differences: - the debug_dma helpers are now called from all architectures, including those that were previously missing them - dma_alloc_from_coherent and dma_release_from_coherent are now always called from the generic alloc/free routines instead of the ops dma-mapping-common.h always includes dma-coherent.h to get the defintions for them, or the stubs if the architecture doesn't support this feature - checks for ->alloc / ->free presence are removed. There is only one magic instead of dma_map_ops without them (mic_dma_ops) and that one is x86 only anyway. Besides that only x86 needs special treatment to replace a default devices if none is passed and tweak the gfp_flags. An optional arch hook is provided for that. [linux@roeck-us.net: fix build] [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08Merge tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimmLinus Torvalds1-6/+0
Pull libnvdimm updates from Dan Williams: "This update has successfully completed a 0day-kbuild run and has appeared in a linux-next release. The changes outside of the typical drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and the introduction of ZONE_DEVICE + devm_memremap_pages(). Summary: - Introduce ZONE_DEVICE and devm_memremap_pages() as a generic mechanism for adding device-driver-discovered memory regions to the kernel's direct map. This facility is used by the pmem driver to enable pfn_to_page() operations on the page frames returned by DAX ('direct_access' in 'struct block_device_operations'). For now, the 'memmap' allocation for these "device" pages comes from "System RAM". Support for allocating the memmap from device memory will arrive in a later kernel. - Introduce memremap() to replace usages of ioremap_cache() and ioremap_wt(). memremap() drops the __iomem annotation for these mappings to memory that do not have i/o side effects. The replacement of ioremap_cache() with memremap() is limited to the pmem driver to ease merging the api change in v4.3. Completion of the conversion is targeted for v4.4. - Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem driver, update the VFS DAX implementation and PMEM api to provide persistence guarantees for kernel operations on a DAX mapping. - Convert the ACPI NFIT 'BLK' driver to map the block apertures as cacheable to improve performance. - Miscellaneous updates and fixes to libnvdimm including support for issuing "address range scrub" commands, clarifying the optimal 'sector size' of pmem devices, a clarification of the usage of the ACPI '_STA' (status) property for DIMM devices, and other minor fixes" * tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (34 commits) libnvdimm, pmem: direct map legacy pmem by default libnvdimm, pmem: 'struct page' for pmem libnvdimm, pfn: 'struct page' provider infrastructure x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB add devm_memremap_pages mm: ZONE_DEVICE for "device memory" mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h dax: drop size parameter to ->direct_access() nd_blk: change aperture mapping from WC to WB nvdimm: change to use generic kvfree() pmem, dax: have direct_access use __pmem annotation dax: update I/O path to do proper PMEM flushing pmem: add copy_from_iter_pmem() and clear_pmem() pmem, x86: clean up conditional pmem includes pmem: remove layer when calling arch_has_wmb_pmem() pmem, x86: move x86 PMEM API to new pmem.h header libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option pmem: switch to devm_ allocations devres: add devm_memremap libnvdimm, btt: write and validate parent_uuid ...
2015-09-08Merge tag 'for-linus-4.3-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tipLinus Torvalds1-0/+6
Pull xen updates from David Vrabel: "Xen features and fixes for 4.3: - Convert xen-blkfront to the multiqueue API - [arm] Support binding event channels to different VCPUs. - [x86] Support > 512 GiB in a PV guests (off by default as such a guest cannot be migrated with the current toolstack). - [x86] PMU support for PV dom0 (limited support for using perf with Xen and other guests)" * tag 'for-linus-4.3-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (33 commits) xen: switch extra memory accounting to use pfns xen: limit memory to architectural maximum xen: avoid another early crash of memory limited dom0 xen: avoid early crash of memory limited dom0 arm/xen: Remove helpers which are PV specific xen/x86: Don't try to set PCE bit in CR4 xen/PMU: PMU emulation code xen/PMU: Intercept PMU-related MSR and APIC accesses xen/PMU: Describe vendor-specific PMU registers xen/PMU: Initialization code for Xen PMU xen/PMU: Sysfs interface for setting Xen PMU mode xen: xensyms support xen: remove no longer needed p2m.h xen: allow more than 512 GB of RAM for 64 bit pv-domains xen: move p2m list if conflicting with e820 map xen: add explicit memblock_reserve() calls for special pages mm: provide early_memremap_ro to establish read-only mapping xen: check for initrd conflicting with e820 map xen: check pre-allocated page tables for conflict with memory map xen: check for kernel memory conflicting with memory layout ...
2015-09-04arm64: KVM: set {v,}TCR_EL2 RES1 bitsMark Rutland1-3/+7
Currently we don't set the RES1 bits of TCR_EL2 and VTCR_EL2 when configuring them, which could lead to unexpected behaviour when an architectural meaning is defined for those bits. Set the RES1 bits to avoid issues. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will.deacon@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-09-04Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds39-560/+1411
Pull arm64 updates from Will Deacon: - Support for new architectural features introduced in ARMv8.1: * Privileged Access Never (PAN) to catch user pointer dereferences in the kernel * Large System Extension (LSE) for building scalable atomics and locks (depends on locking/arch-atomic from tip, which is included here) * Hardware Dirty Bit Management (DBM) for updating clean PTEs automatically - Move our PSCI implementation out into drivers/firmware/, where it can be shared with arch/arm/. RMK has also pulled this component branch and has additional patches moving arch/arm/ over. MAINTAINERS is updated accordingly. - Better BUG implementation based on the BRK instruction for trapping - Leaf TLB invalidation for unmapping user pages - Support for PROBE_ONLY PCI configurations - Various cleanups and non-critical fixes, including: * Always flush FP/SIMD state over exec() * Restrict memblock additions based on range of linear mapping * Ensure *(LIST_POISON) generates a fatal fault * Context-tracking syscall return no longer corrupts return value when not forced on. * Alternatives patching synchronisation/stability improvements * Signed sub-word cmpxchg compare fix (tickled by HAVE_CMPXCHG_LOCAL) * Force SMP=y * Hide direct DCC access from userspace * Fix EFI stub memory allocation when DRAM starts at 0x0 * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (92 commits) arm64: flush FP/SIMD state correctly after execve() arm64: makefile: fix perf_callchain.o kconfig dependency arm64: set MAX_MEMBLOCK_ADDR according to linear region size of/fdt: make memblock maximum physical address arch configurable arm64: Fix source code file path in comments arm64: entry: always restore x0 from the stack on syscall return arm64: mdscr_el1: avoid exposing DCC to userspace arm64: kconfig: Move LIST_POISON to a safe value arm64: Add __exception_irq_entry definition for function graph arm64: mm: ensure patched kernel text is fetched from PoU arm64: alternatives: ensure secondary CPUs execute ISB after patching arm64: make ll/sc __cmpxchg_case_##name asm consistent arm64: dma-mapping: Simplify pgprot handling arm64: restore cpu suspend/resume functionality ARM64: PCI: do not enable resources on PROBE_ONLY systems arm64: cmpxchg: truncate sub-word signed types before comparison arm64: alternative: put secondary CPUs into polling loop during patch arm64/Documentation: clarify wording regarding memory below the Image arm64: lse: fix lse cmpxchg code indentation arm64: remove redundant object file list ...
2015-09-03Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-armLinus Torvalds2-30/+2
Pull ARM development updates from Russell King: "Included in this update: - moving PSCI code from ARM64/ARM to drivers/ - removal of some architecture internals from global kernel view - addition of software based "privileged no access" support using the old domains register to turn off the ability for kernel loads/stores to access userspace. Only the proper accessors will be usable. - addition of early fixup support for early console - re-addition (and reimplementation) of OMAP special interconnect barrier - removal of finish_arch_switch() - only expose cpuX/online in sysfs if hotpluggable - a number of code cleanups" * 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (41 commits) ARM: software-based priviledged-no-access support ARM: entry: provide uaccess assembly macro hooks ARM: entry: get rid of multiple macro definitions ARM: 8421/1: smp: Collapse arch_cpu_idle_dead() into cpu_die() ARM: uaccess: provide uaccess_save_and_enable() and uaccess_restore() ARM: mm: improve do_ldrd_abort macro ARM: entry: ensure that IRQs are enabled when calling syscall_trace_exit() ARM: entry: efficiency cleanups ARM: entry: get rid of asm_trace_hardirqs_on_cond ARM: uaccess: simplify user access assembly ARM: domains: remove DOMAIN_TABLE ARM: domains: keep vectors in separate domain ARM: domains: get rid of manager mode for user domain ARM: domains: move initial domain setting value to asm/domains.h ARM: domains: provide domain_mask() ARM: domains: switch to keeping domain value in register ARM: 8419/1: dma-mapping: harmonize definition of DMA_ERROR_CODE ARM: 8417/1: refactor bitops functions with BIT_MASK() and BIT_WORD() ARM: 8416/1: Feroceon: use of_iomap() to map register base ARM: 8415/1: early fixmap support for earlycon ...
2015-08-27mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.hChristoph Hellwig1-6/+0
Three architectures already define these, and we'll need them genericly soon. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-24arm64: set MAX_MEMBLOCK_ADDR according to linear region sizeArd Biesheuvel1-0/+8
The linear region size of a 39-bit VA kernel is only 256 GB, which may be insufficient to cover all of system RAM, even on platforms that have much less than 256 GB of memory but which is laid out very sparsely. So make sure we clip the memory we will not be able to map before installing it into the memblock memory table, by setting MAX_MEMBLOCK_ADDR accordingly. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-08-24arm64: Fix source code file path in commentsAlexander Kuleshov1-1/+1
Architecture specific code for i386 and x86_64 was unified and merged to the arch/x86. This patch fix old path of x86 architecture in a comment from the arch/arm64/include/asm/fixmap.h. Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-08-22Merge tag 'signed-kvm-ppc-next' of git://github.com/agraf/linux-2.6 into kvm-queuePaolo Bonzini2-15/+1
Patch queue for ppc - 2015-08-22 Highlights for KVM PPC this time around: - Book3S: A few bug fixes - Book3S: Allow micro-threading on POWER8
2015-08-20xen/events: Support event channel rebind on ARMJulien Grall1-0/+6
Currently, the event channel rebind code is gated with the presence of the vector callback. The virtual interrupt controller on ARM has the concept of per-CPU interrupt (PPI) which allow us to support per-VCPU event channel. Therefore there is no need of vector callback for ARM. Xen is already using a free PPI to notify the guest VCPU of an event. Furthermore, the xen code initialization in Linux (see arch/arm/xen/enlighten.c) is requesting correctly a per-CPU IRQ. Introduce new helper xen_support_evtchn_rebind to allow architecture decide whether rebind an event is support or not. It will always return true on ARM and keep the same behavior on x86. This is also allow us to drop the usage of xen_have_vector_callback entirely in the ARM code. Signed-off-by: Julien Grall <julien.grall@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2015-08-19arm64: KVM: Optimize arm64 skip 30-50% vfp/simd save/restore on exitsMario Smarduch1-1/+4
This patch only saves and restores FP/SIMD registers on Guest access. To do this cptr_el2 FP/SIMD trap is set on Guest entry and later checked on exit. lmbench, hackbench show significant improvements, for 30-50% exits FP/SIMD context is not saved/restored [chazy/maz: fixed save/restore logic for 32bit guests] Signed-off-by: Mario Smarduch <m.smarduch@samsung.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-08-12arm64: Add __exception_irq_entry definition for function graphJungseok Lee2-2/+27
The gic_handle_irq() is defined with __exception_irq_entry attribute. A single remaining work is to add its definition as ARM did. Below shows how function graph data is changed with these hunks. A prologue of an interrupt handler is drawn as follows. - current status 0) 0.208 us | cpuidle_not_available(); 0) | default_idle_call() { 0) | arch_cpu_idle() { 0) | __handle_domain_irq() { 0) | irq_enter() { 0) 0.313 us | rcu_irq_enter(); 0) 0.261 us | __local_bh_disable_ip(); - with this change 0) 0.625 us | cpuidle_not_available(); 0) | default_idle_call() { 0) | arch_cpu_idle() { 0) ==========> | 0) | gic_handle_irq() { 0) | __handle_domain_irq() { 0) | irq_enter() { 0) 0.885 us | rcu_irq_enter(); 0) 0.781 us | __local_bh_disable_ip(); An epilogue of an interrupt handler is recorded as follows. - current status 0) 0.261 us | idle_cpu(); 0) | rcu_irq_exit() { 0) 0.521 us | rcu_eqs_enter_common.isra.46(); 0) 2.552 us | } 0) ! 322.448 us | } 0) ! 583.437 us | } 0) # 1656.041 us | } 0) # 1658.073 us | } - with this change 0) 0.677 us | idle_cpu(); 0) | rcu_irq_exit() { 0) 1.770 us | rcu_eqs_enter_common.isra.46(); 0) 7.968 us | } 0) # 1803.541 us | } 0) # 2626.667 us | } 0) # 2632.969 us | } 0) <========== | 0) # 14425.00 us | } 0) # 14430.98 us | } Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Rabin Vincent <rabin@rab.in> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Jungseok Lee <jungseoklee85@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-08-12arm64: KVM: remove remaining reference to vgic_sr_vectorsVladimir Murzin1-5/+0
Since commit 8a14849 (arm64: KVM: Switch vgic save/restore to alternative_insn) vgic_sr_vectors is not used anymore, so remove remaining leftovers and kill the structure. Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-08-12arm64/kvm: Add generic v8 KVM targetSuzuki K. Poulose1-2/+8
This patch adds a generic ARM v8 KVM target cpu type for use by the new CPUs which eventualy ends up using the common sys_reg table. For backward compatibility the existing targets have been preserved. Any new target CPU that can be covered by generic v8 sys_reg tables should make use of the new generic target. Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com> Acked-by: Marc Zyngier <Marc.Zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-08-12Merge branch 'locking/arch-atomic' into locking/core, because it's ready for upstreamIngo Molnar1-0/+14
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-08arm64/mm: Add PROT_DEVICE_nGnRnE and PROT_NORMAL_WTJonathan (Zhixiong) Zhang2-0/+3
UEFI spec 2.5 section 2.3.6.1 defines that EFI_MEMORY_[UC|WC|WT|WB] are possible EFI memory types for AArch64. Each of those EFI memory types is mapped to a corresponding AArch64 memory type. So we need to define PROT_DEVICE_nGnRnE and PROT_NORMWL_WT additionaly. MT_NORMAL_WT is defined, and its encoding is added to MAIR_EL1 when initializing the CPU. Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1438936621-5215-6-git-send-email-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-05Merge branch 'aarch64/psci/drivers' into aarch64/for-next/coreWill Deacon2-30/+2
Move our PSCI implementation out into drivers/firmware/ where it can be shared with arch/arm/. Conflicts: arch/arm64/kernel/psci.c
2015-08-04arm64: make ll/sc __cmpxchg_case_##name asm consistentMark Rutland1-1/+1
The ll/sc __cmpxchg_case_##name assembly mostly uses symbolic names for operands, but in a single case uses %2 to refer to what is otherwise known as %[v]. This makes the code more painful to read than is necessary. Use %[v] instead. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-08-03arm64: psci: factor invocation code to driversMark Rutland2-30/+2
To enable sharing with arm, move the core PSCI framework code to drivers/firmware. This results in a minor gain in lines of code, but this will quickly be amortised by the removal of code currently duplicated in arch/arm. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-08-03locking/static_keys: Add a new static_key interfacePeter Zijlstra1-2/+16
There are various problems and short-comings with the current static_key interface: - static_key_{true,false}() read like a branch depending on the key value, instead of the actual likely/unlikely branch depending on init value. - static_key_{true,false}() are, as stated above, tied to the static_key init values STATIC_KEY_INIT_{TRUE,FALSE}. - we're limited to the 2 (out of 4) possible options that compile to a default NOP because that's what our arch_static_branch() assembly emits. So provide a new static_key interface: DEFINE_STATIC_KEY_TRUE(name); DEFINE_STATIC_KEY_FALSE(name); Which define a key of different types with an initial true/false value. Then allow: static_branch_likely() static_branch_unlikely() to take a key of either type and emit the right instruction for the case. This means adding a second arch_static_branch_jump() assembly helper which emits a JMP per default. In order to determine the right instruction for the right state, encode the branch type in the LSB of jump_entry::key. This is the final step in removing the naming confusion that has led to a stream of avoidable bugs such as: a833581e372a ("x86, perf: Fix static_key bug in load_mm_cr4()") ... but it also allows new static key combinations that will give us performance enhancements in the subsequent patches. Tested-by: Rabin Vincent <rabin@rab.in> # arm Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> # ppc Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-03locking, arch: use WRITE_ONCE()/READ_ONCE() in smp_store_release()/smp_load_acquire()Andrey Konovalov1-2/+2
Replace ACCESS_ONCE() macro in smp_store_release() and smp_load_acquire() with WRITE_ONCE() and READ_ONCE() on x86, arm, arm64, ia64, metag, mips, powerpc, s390, sparc and asm-generic since ACCESS_ONCE() does not work reliably on non-scalar types. WRITE_ONCE() and READ_ONCE() were introduced in the following commits: 230fa253df63 ("kernel: Provide READ_ONCE and ASSIGN_ONCE") 43239cbe79fc ("kernel: Change ASSIGN_ONCE(val, x) to WRITE_ONCE(x, val)") Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Davidlohr Bueso <dbueso@suse.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Alexander Duyck <alexander.h.duyck@redhat.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@suse.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arch@vger.kernel.org Link: http://lkml.kernel.org/r/1438528264-714-1-git-send-email-andreyknvl@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-30arm64: cmpxchg: truncate sub-word signed types before comparisonWill Deacon1-4/+4
When performing a cmpxchg operation on a signed sub-word type (e.g. s8), we need to ensure that the upper register bits of the "old" value used for comparison are zeroed, otherwise we may erroneously fail the cmpxchg which may even be interpreted as success by the caller (if the compiler performs the truncation as part of its check). This has been observed in mod_state, where negative values where causing problems with this_cpu_cmpxchg. This patch fixes the issue by explicitly casting 8-bit and 16-bit "old" values using unsigned types in our cmpxchg wrappers. 32-bit types can be left alone, since the underlying asm makes use of W registers in this case. Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-30arm64: alternative: put secondary CPUs into polling loop during patchWill Deacon1-1/+2
When patching the kernel text with alternatives, we may end up patching parts of the stop_machine state machine (e.g. atomic_dec_and_test in ack_state) and consequently corrupt the instruction stream of any secondary CPUs. This patch passes the cpu_online_mask to stop_machine, forcing all of the CPUs into our own callback which can place the secondary cores into a dumb (but safe!) polling loop whilst the patching is carried out. Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-29arm64: lse: fix lse cmpxchg code indentationWill Deacon1-3/+3
For some reason, the ll/sc cmpxchg asm is all off to the left and awkward to read in conjunction with the following (correctly indented) LSE version. This patch shifts the ll/sc code back to where it should be. Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-29arm64: remove dead-code depending on CONFIG_UP_LATE_INITJonas Rabenstein1-2/+0
Commit 4b3dc9679cf7 ("arm64: force CONFIG_SMP=y and remove redundant and therfore can not be selected anymore. Remove dead #ifdef-block depending on UP_LATE_INIT in arch/arm64/kernel/setup.c Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> [will: kill do_post_cpus_up_work altogether] Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-07-28arm64: pgtable: fix definition of pte_validWill Deacon1-1/+1
pte_valid should check if the PTE_VALID bit (1 << 0) is set in the pte, so fix the macro definition to use bitwise & instead of logical &&. Signed-off-by: Will Deacon <will.deacon@arm.com>