aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel/apic/apic.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-07-16x86/apic: Silence -Wtype-limits compiler warningsQian Cai1-1/+1
There are many compiler warnings like this, In file included from ./arch/x86/include/asm/smp.h:13, from ./arch/x86/include/asm/mmzone_64.h:11, from ./arch/x86/include/asm/mmzone.h:5, from ./include/linux/mmzone.h:969, from ./include/linux/gfp.h:6, from ./include/linux/mm.h:10, from arch/x86/kernel/apic/io_apic.c:34: arch/x86/kernel/apic/io_apic.c: In function 'check_timer': ./arch/x86/include/asm/apic.h:37:11: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] if ((v) <= apic_verbosity) \ ^~ arch/x86/kernel/apic/io_apic.c:2160:2: note: in expansion of macro 'apic_printk' apic_printk(APIC_QUIET, KERN_INFO "..TIMER: vector=0x%02X " ^~~~~~~~~~~ ./arch/x86/include/asm/apic.h:37:11: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits] if ((v) <= apic_verbosity) \ ^~ arch/x86/kernel/apic/io_apic.c:2207:4: note: in expansion of macro 'apic_printk' apic_printk(APIC_QUIET, KERN_ERR "..MP-BIOS bug: " ^~~~~~~~~~~ APIC_QUIET is 0, so silence them by making apic_verbosity type int. Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1562621805-24789-1-git-send-email-cai@lca.pw
2019-07-08Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-25/+62
Pull x96 apic updates from Thomas Gleixner: "Updates for the x86 APIC interrupt handling and APIC timer: - Fix a long standing issue with spurious interrupts which was caused by the big vector management rework a few years ago. Robert Hodaszi provided finally enough debug data and an excellent initial failure analysis which allowed to understand the underlying issues. This contains a change to the core interrupt management code which is required to handle this correctly for the APIC/IO_APIC. The core changes are NOOPs for most architectures except ARM64. ARM64 is not impacted by the change as confirmed by Marc Zyngier. - Newer systems allow to disable the PIT clock for power saving causing panic in the timer interrupt delivery check of the IO/APIC when the HPET timer is not enabled either. While the clock could be turned on this would cause an endless whack a mole game to chase the proper register in each affected chipset. These systems provide the relevant frequencies for TSC, CPU and the local APIC timer via CPUID and/or MSRs, which allows to avoid the PIT/HPET based calibration. As the calibration code is the only usage of the legacy timers on modern systems and is skipped anyway when the frequencies are known already, there is no point in setting up the PIT and actually checking for the interrupt delivery via IO/APIC. To achieve this on a wide variety of platforms, the CPUID/MSR based frequency readout has been made more robust, which also allowed to remove quite some workarounds which turned out to be not longer required. Thanks to Daniel Drake for analysis, patches and verification" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/irq: Seperate unused system vectors from spurious entry again x86/irq: Handle spurious interrupt after shutdown gracefully x86/ioapic: Implement irq_get_irqchip_state() callback genirq: Add optional hardware synchronization for shutdown genirq: Fix misleading synchronize_irq() documentation genirq: Delay deactivation in free_irq() x86/timer: Skip PIT initialization on modern chipsets x86/apic: Use non-atomic operations when possible x86/apic: Make apic_bsp_setup() static x86/tsc: Set LAPIC timer period to crystal clock frequency x86/apic: Rename 'lapic_timer_frequency' to 'lapic_timer_period' x86/tsc: Use CPUID.0x16 to calculate missing crystal frequency
2019-07-03x86/irq: Seperate unused system vectors from spurious entry againThomas Gleixner1-11/+22
Quite some time ago the interrupt entry stubs for unused vectors in the system vector range got removed and directly mapped to the spurious interrupt vector entry point. Sounds reasonable, but it's subtly broken. The spurious interrupt vector entry point pushes vector number 0xFF on the stack which makes the whole logic in __smp_spurious_interrupt() pointless. As a consequence any spurious interrupt which comes from a vector != 0xFF is treated as a real spurious interrupt (vector 0xFF) and not acknowledged. That subsequently stalls all interrupt vectors of equal and lower priority, which brings the system to a grinding halt. This can happen because even on 64-bit the system vector space is not guaranteed to be fully populated. A full compile time handling of the unused vectors is not possible because quite some of them are conditonally populated at runtime. Bring the entry stubs back, which wastes 160 bytes if all stubs are unused, but gains the proper handling back. There is no point to selectively spare some of the stubs which are known at compile time as the required code in the IDT management would be way larger and convoluted. Do not route the spurious entries through common_interrupt and do_IRQ() as the original code did. Route it to smp_spurious_interrupt() which evaluates the vector number and acts accordingly now that the real vector numbers are handed in. Fixup the pr_warn so the actual spurious vector (0xff) is clearly distiguished from the other vectors and also note for the vectored case whether it was pending in the ISR or not. "Spurious APIC interrupt (vector 0xFF) on CPU#0, should never happen." "Spurious interrupt vector 0xed on CPU#1. Acked." "Spurious interrupt vector 0xee on CPU#1. Not pending!." Fixes: 2414e021ac8d ("x86: Avoid building unused IRQ entry stubs") Reported-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Jan Beulich <jbeulich@suse.com> Link: https://lkml.kernel.org/r/20190628111440.550568228@linutronix.de
2019-06-29x86/timer: Skip PIT initialization on modern chipsetsThomas Gleixner1-0/+27
Recent Intel chipsets including Skylake and ApolloLake have a special ITSSPRC register which allows the 8254 PIT to be gated. When gated, the 8254 registers can still be programmed as normal, but there are no IRQ0 timer interrupts. Some products such as the Connex L1430 and exone go Rugged E11 use this register to ship with the PIT gated by default. This causes Linux to fail to boot: Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with apic=debug and send a report. The panic happens before the framebuffer is initialized, so to the user, it appears as an early boot hang on a black screen. Affected products typically have a BIOS option that can be used to enable the 8254 and make Linux work (Chipset -> South Cluster Configuration -> Miscellaneous Configuration -> 8254 Clock Gating), however it would be best to make Linux support the no-8254 case. Modern sytems allow to discover the TSC and local APIC timer frequencies, so the calibration against the PIT is not required. These systems have always running timers and the local APIC timer works also in deep power states. So the setup of the PIT including the IO-APIC timer interrupt delivery checks are a pointless exercise. Skip the PIT setup and the IO-APIC timer interrupt checks on these systems, which avoids the panic caused by non ticking PITs and also speeds up the boot process. Thanks to Daniel for providing the changelog, initial analysis of the problem and testing against a variety of machines. Reported-by: Daniel Drake <drake@endlessm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Daniel Drake <drake@endlessm.com> Cc: bp@alien8.de Cc: hpa@zytor.com Cc: linux@endlessm.com Cc: rafael.j.wysocki@intel.com Cc: hdegoede@redhat.com Link: https://lkml.kernel.org/r/20190628072307.24678-1-drake@endlessm.com
2019-06-22x86/apic: Fix integer overflow on 10 bit left shift of cpu_khzColin Ian King1-1/+2
The left shift of unsigned int cpu_khz will overflow for large values of cpu_khz, so cast it to a long long before shifting it to avoid overvlow. For example, this can happen when cpu_khz is 4194305, i.e. ~4.2 GHz. Addresses-Coverity: ("Unintentional integer overflow") Fixes: 8c3ba8d04924 ("x86, apic: ack all pending irqs when crashed/on kexec") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: kernel-janitors@vger.kernel.org Link: https://lkml.kernel.org/r/20190619181446.13635-1-colin.king@canonical.com
2019-06-16x86/apic: Make apic_bsp_setup() staticThomas Gleixner1-4/+3
No user outside of apic.c. Remove the stale and bogus function comment while at it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2019-05-21treewide: Add SPDX license identifier for missed filesThomas Gleixner1-0/+1
Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-09x86/apic: Rename 'lapic_timer_frequency' to 'lapic_timer_period'Daniel Drake1-10/+10
This variable is a period unit (number of clock cycles per jiffy), not a frequency (which is number of cycles per second). Give it a more appropriate name. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Daniel Drake <drake@endlessm.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: len.brown@intel.com Cc: linux@endlessm.com Cc: rafael.j.wysocki@intel.com Link: http://lkml.kernel.org/r/20190509055417.13152-2-drake@endlessm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-25x86/apic: Unify duplicated local apic timer clockevent initializationJacob Pan1-26/+31
Local APIC timer clockevent parameters can be calculated based on platform specific methods. However the code is mostly duplicated with the interrupt based calibration. The commit which increased the max_delta parameter updated only one place and made the implementations diverge. Unify it to prevent further damage. [ tglx: Rename function to lapic_init_clockevent() and adjust changelog a bit ] Fixes: 4aed89d6b515 ("x86, lapic-timer: Increase the max_delta to 31 bits") Reported-by: Daniel Drake <drake@endlessm.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Len Brown <lenb@kernel.org> Link: https://lkml.kernel.org/r/1556213272-63568-1-git-send-email-jacob.jun.pan@linux.intel.com
2018-12-08x86/kernel: Fix more -Wmissing-prototypes warningsBorislav Petkov1-0/+1
... with the goal of eventually enabling -Wmissing-prototypes by default. At least on x86. Make functions static where possible, otherwise add prototypes or make them visible through includes. asm/trace/ changes courtesy of Steven Rostedt <rostedt@goodmis.org>. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Ingo Molnar <mingo@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> # ACPI + cpufreq bits Cc: Andrew Banman <andrew.banman@hpe.com> Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Travis <mike.travis@hpe.com> Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yi Wang <wang.yi59@zte.com.cn> Cc: linux-acpi@vger.kernel.org
2018-10-31mm: remove include/linux/bootmem.hMike Rapoport1-1/+1
Move remaining definitions and declarations from include/linux/bootmem.h into include/linux/memblock.h and remove the redundant header. The includes were replaced with the semantic patch below and then semi-automated removal of duplicated '#include <linux/memblock.h> @@ @@ - #include <linux/bootmem.h> + #include <linux/memblock.h> [sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h] Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au [sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h] Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au [sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal] Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Serge Semin <fancer.lancer@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-09-27x86/apic: Add Hygon Dhyana supportPu Wen1-0/+7
Add Hygon Dhyana support to the APIC subsystem. When running in 32 bit mode, bigsmp should be enabled if there are more than 8 cores online. Signed-off-by: Pu Wen <puwen@hygon.cn> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: tglx@linutronix.de Cc: mingo@redhat.com Cc: hpa@zytor.com Cc: x86@kernel.org Cc: thomas.lendacky@amd.com Link: https://lkml.kernel.org/r/7a557265a8c7c9e842fe60f9d8e064458801aef3.1537533369.git.puwen@hygon.cn
2018-08-14x86/smp: fix non-SMP broken build due to redefinition of apic_id_is_primary_threadVlastimil Babka1-0/+2
The function has an inline "return false;" definition with CONFIG_SMP=n but the "real" definition is also visible leading to "redefinition of ‘apic_id_is_primary_thread’" compiler error. Guard it with #ifdef CONFIG_SMP Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Fixes: 6a4d2657e048 ("x86/smp: Provide topology_is_primary_thread()") Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-14Merge branch 'l1tf-final' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-0/+16
Merge L1 Terminal Fault fixes from Thomas Gleixner: "L1TF, aka L1 Terminal Fault, is yet another speculative hardware engineering trainwreck. It's a hardware vulnerability which allows unprivileged speculative access to data which is available in the Level 1 Data Cache when the page table entry controlling the virtual address, which is used for the access, has the Present bit cleared or other reserved bits set. If an instruction accesses a virtual address for which the relevant page table entry (PTE) has the Present bit cleared or other reserved bits set, then speculative execution ignores the invalid PTE and loads the referenced data if it is present in the Level 1 Data Cache, as if the page referenced by the address bits in the PTE was still present and accessible. While this is a purely speculative mechanism and the instruction will raise a page fault when it is retired eventually, the pure act of loading the data and making it available to other speculative instructions opens up the opportunity for side channel attacks to unprivileged malicious code, similar to the Meltdown attack. While Meltdown breaks the user space to kernel space protection, L1TF allows to attack any physical memory address in the system and the attack works across all protection domains. It allows an attack of SGX and also works from inside virtual machines because the speculation bypasses the extended page table (EPT) protection mechanism. The assoicated CVEs are: CVE-2018-3615, CVE-2018-3620, CVE-2018-3646 The mitigations provided by this pull request include: - Host side protection by inverting the upper address bits of a non present page table entry so the entry points to uncacheable memory. - Hypervisor protection by flushing L1 Data Cache on VMENTER. - SMT (HyperThreading) control knobs, which allow to 'turn off' SMT by offlining the sibling CPU threads. The knobs are available on the kernel command line and at runtime via sysfs - Control knobs for the hypervisor mitigation, related to L1D flush and SMT control. The knobs are available on the kernel command line and at runtime via sysfs - Extensive documentation about L1TF including various degrees of mitigations. Thanks to all people who have contributed to this in various ways - patches, review, testing, backporting - and the fruitful, sometimes heated, but at the end constructive discussions. There is work in progress to provide other forms of mitigations, which might be less horrible performance wise for a particular kind of workloads, but this is not yet ready for consumption due to their complexity and limitations" * 'l1tf-final' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits) x86/microcode: Allow late microcode loading with SMT disabled tools headers: Synchronise x86 cpufeatures.h for L1TF additions x86/mm/kmmio: Make the tracer robust against L1TF x86/mm/pat: Make set_memory_np() L1TF safe x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert x86/speculation/l1tf: Invert all not present mappings cpu/hotplug: Fix SMT supported evaluation KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry x86/speculation: Use ARCH_CAPABILITIES to skip L1D flush on vmentry x86/speculation: Simplify sysfs report of VMX L1TF vulnerability Documentation/l1tf: Remove Yonah processors from not vulnerable list x86/KVM/VMX: Don't set l1tf_flush_l1d from vmx_handle_external_intr() x86/irq: Let interrupt handlers set kvm_cpu_l1tf_flush_l1d x86: Don't include linux/irq.h from asm/hardirq.h x86/KVM/VMX: Introduce per-host-cpu analogue of l1tf_flush_l1d x86/irq: Demote irq_cpustat_t::__softirq_pending to u16 x86/KVM/VMX: Move the l1tf_flush_l1d test to vmx_l1d_flush() x86/KVM/VMX: Replace 'vmx_l1d_flush_always' with 'vmx_l1d_flush_cond' x86/KVM/VMX: Don't set l1tf_flush_l1d to true from vmx_l1d_flush() cpu/hotplug: detect SMT disabled by BIOS ...
2018-08-13Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+1
Pull x86 apic update from Thomas Gleixner: "Trivial cleanups of the APIC related code" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Trivial coding style fixes x86/vector: Merge allocate_vector() into assign_vector_locked()
2018-08-05x86: Don't include linux/irq.h from asm/hardirq.hNicolai Stange1-0/+1
The next patch in this series will have to make the definition of irq_cpustat_t available to entering_irq(). Inclusion of asm/hardirq.h into asm/apic.h would cause circular header dependencies like asm/smp.h asm/apic.h asm/hardirq.h linux/irq.h linux/topology.h linux/smp.h asm/smp.h or linux/gfp.h linux/mmzone.h asm/mmzone.h asm/mmzone_64.h asm/smp.h asm/apic.h asm/hardirq.h linux/irq.h linux/irqdesc.h linux/kobject.h linux/sysfs.h linux/kernfs.h linux/idr.h linux/gfp.h and others. This causes compilation errors because of the header guards becoming effective in the second inclusion: symbols/macros that had been defined before wouldn't be available to intermediate headers in the #include chain anymore. A possible workaround would be to move the definition of irq_cpustat_t into its own header and include that from both, asm/hardirq.h and asm/apic.h. However, this wouldn't solve the real problem, namely asm/harirq.h unnecessarily pulling in all the linux/irq.h cruft: nothing in asm/hardirq.h itself requires it. Also, note that there are some other archs, like e.g. arm64, which don't have that #include in their asm/hardirq.h. Remove the linux/irq.h #include from x86' asm/hardirq.h. Fix resulting compilation errors by adding appropriate #includes to *.c files as needed. Note that some of these *.c files could be cleaned up a bit wrt. to their set of #includes, but that should better be done from separate patches, if at all. Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-07-30x86/apic: Trivial coding style fixesYi Wang1-1/+1
There is inconsistent indenting in calibrate_APIC_clock() and activate_managed(). Remove the surplus TAB. Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: hpa@zytor.com Cc: douly.fnst@cn.fujitsu.com Cc: jgross@suse.com Cc: ville.syrjala@linux.intel.com Cc: len.brown@intel.com Cc: gregkh@linuxfoundation.org Cc: zhong.weidong@zte.com.cn Link: https://lkml.kernel.org/r/1532672103-32250-1-git-send-email-wang.yi59@zte.com.cn
2018-07-24x86/apic: Future-proof the TSC_DEADLINE quirk for SKXLen Brown1-0/+3
All SKX with stepping higher than 4 support the TSC_DEADLINE, no matter the microcode version. Without this patch, upcoming SKX steppings will not be able to use their TSC_DEADLINE timer. Signed-off-by: Len Brown <len.brown@intel.com> Cc: <stable@kernel.org> # v4.14+ Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 616dd5872e ("x86/apic: Update TSC_DEADLINE quirk with additional SKX stepping") Link: http://lkml.kernel.org/r/d0c7129e509660be9ec6b233284b8d42d90659e8.1532207856.git.len.brown@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-02Revert "x86/apic: Ignore secondary threads if nosmt=force"Thomas Gleixner1-19/+0
Dave Hansen reported, that it's outright dangerous to keep SMT siblings disabled completely so they are stuck in the BIOS and wait for SIPI. The reason is that Machine Check Exceptions are broadcasted to siblings and the soft disabled sibling has CR4.MCE = 0. If a MCE is delivered to a logical core with CR4.MCE = 0, it asserts IERR#, which shuts down or reboots the machine. The MCE chapter in the SDM contains the following blurb: Because the logical processors within a physical package are tightly coupled with respect to shared hardware resources, both logical processors are notified of machine check errors that occur within a given physical processor. If machine-check exceptions are enabled when a fatal error is reported, all the logical processors within a physical package are dispatched to the machine-check exception handler. If machine-check exceptions are disabled, the logical processors enter the shutdown state and assert the IERR# signal. When enabling machine-check exceptions, the MCE flag in control register CR4 should be set for each logical processor. Reverting the commit which ignores siblings at enumeration time solves only half of the problem. The core cpuhotplug logic needs to be adjusted as well. This thoughtful engineered mechanism also turns the boot process on all Intel HT enabled systems into a MCE lottery. MCE is enabled on the boot CPU before the secondary CPUs are brought up. Depending on the number of physical cores the window in which this situation can happen is smaller or larger. On a HSW-EX it's about 750ms: MCE is enabled on the boot CPU: [ 0.244017] mce: CPU supports 22 MCE banks The corresponding sibling #72 boots: [ 1.008005] .... node #0, CPUs: #72 That means if an MCE hits on physical core 0 (logical CPUs 0 and 72) between these two points the machine is going to shutdown. At least it's a known safe state. It's obvious that the early boot can be hit by an MCE as well and then runs into the same situation because MCEs are not yet enabled on the boot CPU. But after enabling them on the boot CPU, it does not make any sense to prevent the kernel from recovering. Adjust the nosmt kernel parameter documentation as well. Reverts: 2207def700f9 ("x86/apic: Ignore secondary threads if nosmt=force") Reported-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tony Luck <tony.luck@intel.com>
2018-06-21x86/apic: Ignore secondary threads if nosmt=forceThomas Gleixner1-0/+19
nosmt on the kernel command line merely prevents the onlining of the secondary SMT siblings. nosmt=force makes the APIC detection code ignore the secondary SMT siblings completely, so they even do not show up as possible CPUs. That reduces the amount of memory allocations for per cpu variables and saves other resources from being allocated too large. This is not fully equivalent to disabling SMT in the BIOS because the low level SMT enabling in the BIOS can result in partitioning of resources between the siblings, which is not undone by just ignoring them. Some CPUs can use the full resources when their sibling is not onlined, but this is depending on the CPU family and model and it's not well documented whether this applies to all partitioned resources. That means depending on the workload disabling SMT in the BIOS might result in better performance. Linus analysis of the Intel manual: The intel optimization manual is not very clear on what the partitioning rules are. I find: "In general, the buffers for staging instructions between major pipe stages are partitioned. These buffers include µop queues after the execution trace cache, the queues after the register rename stage, the reorder buffer which stages instructions for retirement, and the load and store buffers. In the case of load and store buffers, partitioning also provided an easier implementation to maintain memory ordering for each logical processor and detect memory ordering violations" but some of that partitioning may be relaxed if the HT thread is "not active": "In Intel microarchitecture code name Sandy Bridge, the micro-op queue is statically partitioned to provide 28 entries for each logical processor, irrespective of software executing in single thread or multiple threads. If one logical processor is not active in Intel microarchitecture code name Ivy Bridge, then a single thread executing on that processor core can use the 56 entries in the micro-op queue" but I do not know what "not active" means, and how dynamic it is. Some of that partitioning may be entirely static and depend on the early BIOS disabling of HT, and even if we park the cores, the resources will just be wasted. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org>
2018-06-21x86/smp: Provide topology_is_primary_thread()Thomas Gleixner1-0/+15
If the CPU is supporting SMT then the primary thread can be found by checking the lower APIC ID bits for zero. smp_num_siblings is used to build the mask for the APIC ID bits which need to be taken into account. This uses the MPTABLE or ACPI/MADT supplied APIC ID, which can be different than the initial APIC ID in CPUID. But according to AMD the lower bits have to be consistent. Intel gave a tentative confirmation as well. Preparatory patch to support disabling SMT at boot/runtime. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org>
2018-04-02Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-51/+60
Pull x86 apic updates from Ingo Molnar: "The main x86 APIC/IOAPIC changes in this cycle were: - Robustify kexec support to more carefully restore IRQ hardware state before calling into kexec/kdump kernels. (Baoquan He) - Clean up the local APIC code a bit (Dou Liyang) - Remove unused callbacks (David Rientjes)" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Finish removing unused callbacks x86/apic: Drop logical_smp_processor_id() inline x86/apic: Modernize the pending interrupt code x86/apic: Move pending interrupt check code into it's own function x86/apic: Set up through-local-APIC mode on the boot CPU if 'noapic' specified x86/apic: Rename variables and functions related to x86_io_apic_ops x86/apic: Remove the (now) unused disable_IO_APIC() function x86/apic: Fix restoring boot IRQ mode in reboot and kexec/kdump x86/apic: Split disable_IO_APIC() into two functions to fix CONFIG_KEXEC_JUMP=y x86/apic: Split out restore_boot_irq_mode() from disable_IO_APIC() x86/apic: Make setup_local_APIC() static x86/apic: Simplify init_bsp_APIC() usage x86/x2apic: Mark set_x2apic_phys_mode() as __init
2018-03-01x86/apic: Drop logical_smp_processor_id() inlineDou Liyang1-5/+5
The logical_smp_processor_id() inline which is only called in setup_local_APIC() on x86_32 systems has no real value. Drop it and directly use GET_APIC_LOGICAL_ID() at the call site and use a more suitable variable name for readability Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: andy.shevchenko@gmail.com Cc: bhe@redhat.com Cc: ebiederm@xmission.com Link: https://lkml.kernel.org/r/20180301055930.2396-4-douly.fnst@cn.fujitsu.com
2018-03-01x86/apic: Modernize the pending interrupt codeDou Liyang1-9/+8
The pending interrupt check code is old, update the following: - Use for_each_set_bit() instead of open coding it - Replace printk() with pr_err() - Get rid of printk line breaks - Make curly braces balanced Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: bhe@redhat.com Cc: ebiederm@xmission.com Link: https://lkml.kernel.org/r/20180301055930.2396-3-douly.fnst@cn.fujitsu.com
2018-03-01x86/apic: Move pending interrupt check code into it's own functionDou Liyang1-45/+55
The pending interrupt check code is mixed with the local APIC setup code, that looks messy. Extract the related code, move it into a new function named apic_pending_intr_clear(). Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: bhe@redhat.com Cc: ebiederm@xmission.com Link: https://lkml.kernel.org/r/20180301055930.2396-2-douly.fnst@cn.fujitsu.com
2018-02-17x86/apic: Set up through-local-APIC mode on the boot CPU if 'noapic' specifiedBaoquan He1-1/+1
Currently the kdump kernel becomes very slow if 'noapic' is specified. Normal kernel doesn't have this bug. Kernel parameter 'noapic' is used to disable IO-APIC in system for testing or special purpose. Here the root cause is that in kdump kernel LAPIC is disabled since commit: 522e664644 ("x86/apic: Disable I/O APIC before shutdown of the local APIC") In this case we need set up through-local-APIC on boot CPU in setup_local_APIC(). In normal kernel the legacy irq mode is enabled by the BIOS. If it is virtual wire mode, the local-APIC has been enabled and set as through-local-APIC. Though we fixed the regression introduced by commit 522e664644, to further improve robustness set up the through-local-APIC mode explicitly, do not rely on the default boot IRQ mode. Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: douly.fnst@cn.fujitsu.com Cc: joro@8bytes.org Cc: prarit@redhat.com Cc: uobergfe@redhat.com Link: http://lkml.kernel.org/r/20180214054656.3780-7-bhe@redhat.com [ Rewrote the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-16x86/apic: Make setup_local_APIC() staticDou Liyang1-1/+1
This function isn't used outside of apic.c, so let's mark it static. Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bhe@redhat.com Cc: ebiederm@xmission.com Link: http://lkml.kernel.org/r/20180214062554.21020-1-douly.fnst@cn.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_steppingJia Zhang1-3/+3
x86_mask is a confusing name which is hard to associate with the processor's stepping. Additionally, correct an indent issue in lib/cpu.c. Signed-off-by: Jia Zhang <qianyue.zj@alibaba-inc.com> [ Updated it to more recent kernels. ] Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bp@alien8.de Cc: tony.luck@intel.com Link: http://lkml.kernel.org/r/1514771530-70829-1-git-send-email-qianyue.zj@alibaba-inc.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-14Revert "x86/apic: Remove init_bsp_APIC()"Ville Syrjälä1-0/+49
This reverts commit b371ae0d4a194b178817b0edfb6a7395c7aec37a. It causes boot hangs on old P3/P4 systems when the local APIC is enforced in UP mode. Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Cc: yinghai@kernel.org Cc: bhe@redhat.com Link: https://lkml.kernel.org/r/20171128145350.21560-1-ville.syrjala@linux.intel.com
2017-12-28x86/apic: Avoid wrong warning when parsing 'apic=' in X86-32 caseDou Liyang1-0/+2
There are two consumers of apic=: apic_set_verbosity() for setting the APIC debug level; parse_apic() for registering APIC driver by hand. X86-32 supports both of them, but sometimes, kernel issues a weird warning. eg: when kernel was booted up with 'apic=bigsmp' in command line, early_param would warn like that: ... [ 0.000000] APIC Verbosity level bigsmp not recognised use apic=verbose or apic=debug [ 0.000000] Malformed early option 'apic' ... Wrap the warning code in CONFIG_X86_64 case to avoid this. Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: peterz@infradead.org Cc: rdunlap@infradead.org Cc: corbet@lwn.net Link: https://lkml.kernel.org/r/20171204040313.24824-1-douly.fnst@cn.fujitsu.com
2017-11-13Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-140/+99
Pull x86 APIC updates from Thomas Gleixner: "This update provides a major overhaul of the APIC initialization and vector allocation code: - Unification of the APIC and interrupt mode setup which was scattered all over the place and was hard to follow. This also distangles the timer setup from the APIC initialization which brings a clear separation of functionality. Great detective work from Dou Lyiang! - Refactoring of the x86 vector allocation mechanism. The existing code was based on nested loops and rather convoluted APIC callbacks which had a horrible worst case behaviour and tried to serve all different use cases in one go. This led to quite odd hacks when supporting the new managed interupt facility for multiqueue devices and made it more or less impossible to deal with the vector space exhaustion which was a major roadblock for server hibernation. Aside of that the code dealing with cpu hotplug and the system vectors was disconnected from the actual vector management and allocation code, which made it hard to follow and maintain. Utilizing the new bitmap matrix allocator core mechanism, the new allocator and management code consolidates the handling of system vectors, legacy vectors, cpu hotplug mechanisms and the actual allocation which needs to be aware of system and legacy vectors and hotplug constraints into a single consistent entity. This has one visible change: The support for multi CPU targets of interrupts, which is only available on a certain subset of CPUs/APIC variants has been removed in favour of single interrupt targets. A proper analysis of the multi CPU target feature revealed that there is no real advantage as the vast majority of interrupts end up on the CPU with the lowest APIC id in the set of target CPUs anyway. That change was agreed on by the relevant folks and allowed to simplify the implementation significantly and to replace rather fragile constructs like the vector cleanup IPI with straight forward and solid code. Furthermore this allowed to cleanly separate the allocation details for legacy, normal and managed interrupts: * Legacy interrupts are not longer wasting 16 vectors unconditionally * Managed interrupts have now a guaranteed vector reservation, but the actual vector assignment happens when the interrupt is requested. It's guaranteed not to fail. * Normal interrupts no longer allocate vectors unconditionally when the interrupt is set up (IO/APIC init or MSI(X) enable). The mechanism has been switched to a best effort reservation mode. The actual allocation happens when the interrupt is requested. Contrary to managed interrupts the request can fail due to vector space exhaustion, but drivers must handle a fail of request_irq() anyway. When the interrupt is freed, the vector is handed back as well. This solves a long standing problem with large unconditional vector allocations for a certain class of enterprise devices which prevented server hibernation due to vector space exhaustion when the unused allocated vectors had to be migrated to CPU0 while unplugging all non boot CPUs. The code has been equipped with trace points and detailed debugfs information to aid analysis of the vector space" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) x86/vector/msi: Select CONFIG_GENERIC_IRQ_RESERVATION_MODE PCI/MSI: Set MSI_FLAG_MUST_REACTIVATE in core code genirq: Add config option for reservation mode x86/vector: Use correct per cpu variable in free_moved_vector() x86/apic/vector: Ignore set_affinity call for inactive interrupts x86/apic: Fix spelling mistake: "symmectic" -> "symmetric" x86/apic: Use dead_cpu instead of current CPU when cleaning up ACPI/init: Invoke early ACPI initialization earlier x86/vector: Respect affinity mask in irq descriptor x86/irq: Simplify hotplug vector accounting x86/vector: Switch IOAPIC to global reservation mode x86/vector/msi: Switch to global reservation mode x86/vector: Handle managed interrupts proper x86/io_apic: Reevaluate vector configuration on activate() iommu/amd: Reevaluate vector configuration on activate() iommu/vt-d: Reevaluate vector configuration on activate() x86/apic/msi: Force reactivation of interrupts at startup time x86/vector: Untangle internal state from irq_cfg x86/vector: Compile SMP only code conditionally x86/apic: Remove unused callbacks ...
2017-11-13Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+1
Pull x86 platform updates from Ingo Molnar: "The main changes in this cycle were: - a refactoring of the early virt init code by merging 'struct x86_hyper' into 'struct x86_platform' and 'struct x86_init', which allows simplifications and also the addition of a new ->guest_late_init() callback. (Juergen Gross) - timer_setup() conversion of the UV code (Kees Cook)" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/virt/xen: Use guest_late_init to detect Xen PVH guest x86/virt, x86/platform: Add ->guest_late_init() callback to hypervisor_x86 structure x86/virt, x86/acpi: Add test for ACPI_FADT_NO_VGA x86/virt: Add enum for hypervisors to replace x86_hyper x86/virt, x86/platform: Merge 'struct x86_hyper' into 'struct x86_platform' and 'struct x86_init' x86/platform/UV: Convert timers to use timer_setup()
2017-11-13Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+1
Pull x86 boot updates from Ingo Molnar: "Three smaller changes: - clang fix - boot message beautification - unnecessary header inclusion removal" * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Disable Clang warnings about GNU extensions x86/boot: Remove unnecessary #include <generated/utsrelease.h> x86/boot: Spell out "boot CPU" for BP
2017-11-10x86/virt, x86/platform: Merge 'struct x86_hyper' into 'struct x86_platform' and 'struct x86_init'Juergen Gross1-1/+1
Instead of x86_hyper being either NULL on bare metal or a pointer to a struct hypervisor_x86 in case of the kernel running as a guest merge the struct into x86_platform and x86_init. This will remove the need for wrappers making it hard to find out what is being called. With dummy functions added for all callbacks testing for a NULL function pointer can be removed, too. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: akataria@vmware.com Cc: boris.ostrovsky@oracle.com Cc: devel@linuxdriverproject.org Cc: haiyangz@microsoft.com Cc: kvm@vger.kernel.org Cc: kys@microsoft.com Cc: pbonzini@redhat.com Cc: rkrcmar@redhat.com Cc: rusty@rustcorp.com.au Cc: sthemmin@microsoft.com Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20171109132739.23465-2-jgross@suse.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07Merge branch 'linus' into x86/apic, to resolve conflictsIngo Molnar1-2/+13
Conflicts: arch/x86/include/asm/x2apic.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-12x86/apic: Update TSC_DEADLINE quirk with additional SKX steppingLen Brown1-1/+11
SKX stepping-3 fixed the TSC_DEADLINE issue in a different ucode version number than stepping-4. Linux needs to know this stepping-3 specific version number to also enable the TSC_DEADLINE on stepping-3. The steppings and ucode versions are documented in the SKX BIOS update: https://downloadmirror.intel.com/26978/eng/ReleaseNotes_R00.01.0004.txt Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: peterz@infradead.org Link: https://lkml.kernel.org/r/60f2bbf7cf617e212b522e663f84225bfebc50e5.1507756305.git.len.brown@intel.com
2017-10-12x86/apic: Silence "FW_BUG TSC_DEADLINE disabled due to Errata" on hypervisorsPaolo Bonzini1-1/+2
Commit 594a30fb1242 ("x86/apic: Silence "FW_BUG TSC_DEADLINE disabled due to Errata" on CPUs without the feature", 2017-08-30) was also about silencing the warning on VirtualBox; however, KVM does expose the TSC deadline timer, and it's virtualized so that it is immune from CPU errata. Therefore, booting 4.13 with "-cpu Haswell" shows this in the logs: [ 0.000000] [Firmware Bug]: TSC_DEADLINE disabled due to Errata; please update microcode to version: 0xb2 (or later) Even if you had a hypervisor that does _not_ virtualize the TSC deadline and rather exposes the hardware one, it should be the hypervisors task to update microcode and possibly hide the flag from CPUID. So just hide the message when running on _any_ hypervisor, not just those that do not support the TSC deadline timer. The older check still makes sense, so keep it. Fixes: bd9240a18e ("x86/apic: Add TSC_DEADLINE quirk due to errata") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Hans de Goede <hdegoede@redhat.com> Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1507630377-54471-1-git-send-email-pbonzini@redhat.com
2017-10-03x86/boot: Spell out "boot CPU" for BPJean Delvare1-1/+1
It's not obvious to everybody that BP stands for boot processor. At least it was not for me. And BP is also a CPU register on x86, so it is ambiguous. Spell out "boot CPU" everywhere instead. Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Alok Kataria <akataria@vmware.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-09-28x86/apic: Fix spelling mistake: "symmectic" -> "symmetric"Colin Ian King1-2/+2
Trivial fix to spelling mistakes in pr_info messages Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dou Liyang <douly.fnst@cn.fujitsu.com> Link: https://lkml.kernel.org/r/20170927102223.31920-1-colin.king@canonical.com
2017-09-25x86/apic: Move common APIC callbacksThomas Gleixner1-28/+0
Move more apic struct specific functions out of the header and the apic management code into the common source file. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Yu Chen <yu.c.chen@intel.com> Acked-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Rui Zhang <rui.zhang@intel.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Len Brown <lenb@kernel.org> Link: https://lkml.kernel.org/r/20170913213153.834421893@linutronix.de
2017-09-25x86/apic: Move probe32 specific APIC functionsThomas Gleixner1-10/+0
The apic functions which are used in probe_32.c are implemented as inlines or in apic.c. There is no reason to have them at random places. Move them to the actual usage site and make them static. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Juergen Gross <jgross@suse.com> Tested-by: Yu Chen <yu.c.chen@intel.com> Acked-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Alok Kataria <akataria@vmware.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Rui Zhang <rui.zhang@intel.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Len Brown <lenb@kernel.org> Link: https://lkml.kernel.org/r/20170913213153.596768194@linutronix.de
2017-09-25x86/apic: Use lapic_is_integrated() consistentlyDou Liyang1-5/+4
lapic_is_integrated() is a wrapper around APIC_INTEGRATED(), but not used consistently. Replace the direct usage of APIC_INTEGRATED() and fixup a hard to read tail comment. No functional change. [ tglx: Made it compile and work .... ] Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: bhe@redhat.com Link: https://lkml.kernel.org/r/1504774161-7137-2-git-send-email-douly.fnst@cn.fujitsu.com
2017-09-25x86/apic: Remove duplicate X86_64 conditional in lapic_is_integrated()Dou Liyang1-4/+0
The macro APIC_INTEGRATED(x) is already wrapped by CONFIG_X86_32. So it can be invoked unconditionally. Remove the extra "#ifdef CONFIG_X86_64...". No functional change. Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: bhe@redhat.com Link: https://lkml.kernel.org/r/1504774161-7137-1-git-send-email-douly.fnst@cn.fujitsu.com
2017-09-25x86/apic: Remove init_bsp_APIC()Dou Liyang1-49/+0
init_bsp_APIC() which works for the virtual wire mode is used in ISA irq initialization at boot time. With the new APIC interrupt delivery mode scheme, which initializes the APIC before the first interrupt is expected, init_bsp_APIC() is not longer required and can be removed. Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: yinghai@kernel.org Cc: bhe@redhat.com Link: https://lkml.kernel.org/r/1505293975-26005-13-git-send-email-douly.fnst@cn.fujitsu.com
2017-09-25x86/apic: Initialize interrupt mode after timer initDou Liyang1-2/+0
A cold or warm boot through BIOS sets the APIC in default interrupt delivery mode. A dump-capture kernel will not go through a BIOS reset and leave the interrupt delivery mode in the state which was active on the crashed kernel, but the dump kernel startup code assumes default delivery mode which can result in interrupt delivery/handling to fail. To solve this problem, it's required to set up the final interrupt delivery mode as soon as possible. As IOAPIC setup needs the timer initialized for verifying the timer interrupt delivery mode, the earliest point is right after timer setup in late_time_init(). That results in the following init order: 1) Set up the legacy timer, if applicable on the platform 2) Set up APIC/IOAPIC which includes the verification of the legacy timer interrupt delivery. 3) TSC calibration 4) Local APIC timer setup Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: yinghai@kernel.org Cc: bhe@redhat.com Link: https://lkml.kernel.org/r/1505293975-26005-12-git-send-email-douly.fnst@cn.fujitsu.com
2017-09-25x86/init: Add intr_mode_init to x86_init_opsDou Liyang1-1/+1
X86 and XEN initialize interrupt delivery mode in different way. To avoid conditionals, add a new x86_init_ops function which defaults to the standard function and can be overridden by the early XEN platform code. [ tglx: Folded the XEN part which was a separate patch to preserve bisectability ] Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: yinghai@kernel.org Cc: bhe@redhat.com Link: https://lkml.kernel.org/r/1505293975-26005-10-git-send-email-douly.fnst@cn.fujitsu.com
2017-09-25x86/apic: Unify interrupt mode setup for UP systemDou Liyang1-41/+6
In UniProcessor kernel with UP_LATE_INIT=y, the interrupt delivery mode is initialized in up_late_init(). Use the new unified apic_intr_mode_init() function and remove APIC_init_uniprocessor(). Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: yinghai@kernel.org Cc: bhe@redhat.com Link: https://lkml.kernel.org/r/1505293975-26005-8-git-send-email-douly.fnst@cn.fujitsu.com
2017-09-25x86/apic: Mark the apic_intr_mode extern for sanity check cleanupDou Liyang1-10/+6
Calling native_smp_prepare_cpus() to prepare for SMP bootup, does some sanity checking, enables APIC mode and disables SMP feature. Now, APIC mode setup has been unified to apic_intr_mode_init(), some sanity checks are redundant and need to be cleanup. Mark the apic_intr_mode extern to refine the switch and remove the redundant sanity check. Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: yinghai@kernel.org Cc: bhe@redhat.com Link: https://lkml.kernel.org/r/1505293975-26005-7-git-send-email-douly.fnst@cn.fujitsu.com
2017-09-25x86/apic: Unify interrupt mode setup for SMP-capable systemDou Liyang1-3/+35
On a SMP-capable system, the kernel enables and sets up the APIC interrupt delivery mode in native_smp_prepare_cpus(). The decision how to setup the APIC is intermingled with the decision of setting up SMP or not. Split the initialization of the APIC interrupt mode independent from other decisions and have a separate apic_intr_mode_init() function for it. The invocation time stays the same for now. Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: yinghai@kernel.org Cc: bhe@redhat.com Link: https://lkml.kernel.org/r/1505293975-26005-6-git-send-email-douly.fnst@cn.fujitsu.com
2017-09-25x86/apic: Move logical APIC ID away from apic_bsp_setup()Dou Liyang1-9/+1
apic_bsp_setup() sets and returns logical APIC ID for initializing cpu0_logical_apicid in a SMP-capable system. The id has nothing to do with the initialization of local APIC and I/O APIC. And apic_bsp_setup() should be called for interrupt mode setup only. Move the id setup into a separate helper function for cleanup and mark apic_bsp_setup() void. Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: yinghai@kernel.org Cc: bhe@redhat.com Link: https://lkml.kernel.org/r/1505293975-26005-5-git-send-email-douly.fnst@cn.fujitsu.com