aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/kernel/vmlinux.lds.S (follow)
AgeCommit message (Collapse)AuthorFilesLines
2011-08-10x86-64: Rework vsyscall emulation and add vsyscall= parameterAndy Lutomirski1-33/+0
There are three choices: vsyscall=native: Vsyscalls are native code that issues the corresponding syscalls. vsyscall=emulate (default): Vsyscalls are emulated by instruction fault traps, tested in the bad_area path. The actual contents of the vsyscall page is the same as the vsyscall=native case except that it's marked NX. This way programs that make assumptions about what the code in the page does will not be confused when they read that code. vsyscall=none: Trying to execute a vsyscall will segfault. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/8449fb3abf89851fd6b2260972666a6f82542284.1312988155.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-08-04x86-64: Work around gold bug 13023Andy Lutomirski1-6/+10
Gold has trouble assigning numbers to the location counter inside of an output section description. The bug was triggered by 9fd67b4ed0714ab718f1f9bd14c344af336a6df7, which consolidated all of the vsyscall sections into a single section. The workaround is IMO still nicer than the old way of doing it. This produces an apparently valid kernel image and passes my vdso tests on both GNU ld version 2.21.51.0.6-2.fc15 20110118 and GNU gold (version 2.21.51.0.6-2.fc15 20110118) 1.10 as distributed by Fedora 15. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/0b260cb806f1f9a25c00ce8377a5f035d57f557a.1312378163.git.luto@mit.edu Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-08-04x86-64: Move the "user" vsyscall segment out of the data segment.Andy Lutomirski1-18/+18
The kernel's loader doesn't seem to care, but gold complains. Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/f0716870c297242a841b949953d80c0d87bf3d3f.1312378163.git.luto@mit.edu Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-07-14x86-64: Move vread_tsc and vread_hpet into the vDSOAndy Lutomirski1-3/+0
The vsyscall page now consists entirely of trap instructions. Cc: John Stultz <johnstul@us.ibm.com> Signed-off-by: Andy Lutomirski <luto@mit.edu> Link: http://lkml.kernel.org/r/637648f303f2ef93af93bae25186e9a1bea093f5.1310639973.git.luto@mit.edu Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-06-06x86-64: Fill unused parts of the vsyscall page with 0xccAndy Lutomirski1-9/+7
Jumping to 0x00 might do something depending on the following bytes. Jumping to 0xcc is a trap. So fill the unused parts of the vsyscall page with 0xcc to make it useless for exploits to jump there. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Jesper Juhl <jj@chaosbits.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Jan Beulich <JBeulich@novell.com> Cc: richard -rw- weinberger <richard.weinberger@gmail.com> Cc: Mikael Pettersson <mikpe@it.uu.se> Cc: Andi Kleen <andi@firstfloor.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Louis Rilling <Louis.Rilling@kerlabs.com> Cc: Valdis.Kletnieks@vt.edu Cc: pageexec@freemail.hu Link: http://lkml.kernel.org/r/ed54bfcfbe50a9070d20ec1edbe0d149e22a4568.1307292171.git.luto@mit.edu Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-06x86-64: Remove vsyscall number 3 (venosys)Andy Lutomirski1-4/+0
It just segfaults since April 2008 (a4928cff), so I'm pretty sure that nothing uses it. And having an empty section makes the linker script a bit fragile. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Jesper Juhl <jj@chaosbits.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Jan Beulich <JBeulich@novell.com> Cc: richard -rw- weinberger <richard.weinberger@gmail.com> Cc: Mikael Pettersson <mikpe@it.uu.se> Cc: Andi Kleen <andi@firstfloor.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Louis Rilling <Louis.Rilling@kerlabs.com> Cc: Valdis.Kletnieks@vt.edu Cc: pageexec@freemail.hu Link: http://lkml.kernel.org/r/4a4abcf47ecadc269f2391a313576fe6d06acef7.1307292171.git.luto@mit.edu Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-05x86-64: Give vvars their own pageAndy Lutomirski1-11/+17
Move vvars out of the vsyscall page into their own page and mark it NX. Without this patch, an attacker who can force a daemon to call some fixed address could wait until the time contains, say, 0xCD80, and then execute the current time. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Jesper Juhl <jj@chaosbits.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Jan Beulich <JBeulich@novell.com> Cc: richard -rw- weinberger <richard.weinberger@gmail.com> Cc: Mikael Pettersson <mikpe@it.uu.se> Cc: Andi Kleen <andi@firstfloor.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Louis Rilling <Louis.Rilling@kerlabs.com> Cc: Valdis.Kletnieks@vt.edu Cc: pageexec@freemail.hu Link: http://lkml.kernel.org/r/b1460f81dc4463d66ea3f2b5ce240f58d48effec.1307292171.git.luto@mit.edu Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-26Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tipLinus Torvalds1-23/+11
* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: vdso: Remove unused variable x86-64: Optimize vDSO time() x86-64: Add time to vDSO x86-64: Turn off -pg and turn on -foptimize-sibling-calls for vDSO x86-64: Move vread_tsc into a new file with sensible options x86-64: Vclock_gettime(CLOCK_MONOTONIC) can't ever see nsec < 0 x86-64: Don't generate cmov in vread_tsc x86-64: Remove unnecessary barrier in vread_tsc x86-64: Clean up vdso/kernel shared variables
2011-05-24Merge branch 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpuLinus Torvalds1-1/+1
* 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: Unify input section names percpu: Avoid extra NOP in percpu_cmpxchg16b_double percpu: Cast away printk format warning percpu: Always align percpu output section to PAGE_SIZE Fix up fairly trivial conflict in arch/x86/include/asm/percpu.h as per Tejun
2011-05-24x86-64: Clean up vdso/kernel shared variablesAndy Lutomirski1-23/+11
Variables that are shared between the vdso and the kernel are currently a bit of a mess. They are each defined with their own magic, they are accessed differently in the kernel, the vsyscall page, and the vdso, and one of them (vsyscall_clock) doesn't even really exist. This changes them all to use a common mechanism. All of them are delcared in vvar.h with a fixed address (validated by the linker script). In the kernel (as before), they look like ordinary read-write variables. In the vsyscall page and the vdso, they are accessed through a new macro VVAR, which gives read-only access. The vdso is now loaded verbatim into memory without any fixups. As a side bonus, access from the vdso is faster because a level of indirection is removed. While we're at it, pack jiffies and vgetcpu_mode into the same cacheline. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Andi Kleen <andi@firstfloor.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Borislav Petkov <bp@amd64.org> Link: http://lkml.kernel.org/r/%3C7357882fbb51fa30491636a7b6528747301b7ee9.1306156808.git.luto%40mit.edu%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-05-22x86, apic: Introduce .apicdrivers section to find the list of apic driversSuresh Siddha1-0/+7
This will pave the way for each apic driver to be self-contained and eliminate the need for apic_probe[]. Order in which apic drivers are listed in the .apicdrivers section is important, as this determines the apic probe order. And this is enforced by the ordering of apic driver files in the Makefile and the macros apic_driver()/apic_drivers(). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Tested-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: steiner@sgi.com Cc: gorcunov@openvz.org Cc: yinghai@kernel.org Link: http://lkml.kernel.org/r/20110521005526.068775085@sbsiddha-MOBL3.sc.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-24percpu: Always align percpu output section to PAGE_SIZETejun Heo1-1/+1
Percpu allocator honors alignment request upto PAGE_SIZE and both the percpu addresses in the percpu address space and the translated kernel addresses should be aligned accordingly. The calculation of the former depends on the alignment of percpu output section in the kernel image. The linker script macros PERCPU_VADDR() and PERCPU() are used to define this output section and the latter takes @align parameter. Several architectures are using @align smaller than PAGE_SIZE breaking percpu memory alignment. This patch removes @align parameter from PERCPU(), renames it to PERCPU_SECTION() and makes it always align to PAGE_SIZE. While at it, add PCPU_SETUP_BUG_ON() checks such that alignment problems are reliably detected and remove percpu alignment comment recently added in workqueue.c as the condition would trigger BUG way before reaching there. For um, this patch raises the alignment of percpu area. As the area is in .init, there shouldn't be any noticeable difference. This problem was discovered by David Howells while debugging boot failure on mn10300. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Mike Frysinger <vapier@gentoo.org> Cc: uclinux-dist-devel@blackfin.uclinux.org Cc: David Howells <dhowells@redhat.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: user-mode-linux-devel@lists.sourceforge.net
2011-03-16Merge branch 'x86-trampoline-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tipLinus Torvalds1-0/+13
* 'x86-trampoline-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix binutils-2.21 symbol related build failures x86-64, trampoline: Remove unused variable x86, reboot: Fix the use of passed arguments in 32-bit BIOS reboot x86, reboot: Move the real-mode reboot code to an assembly file x86: Make the GDT_ENTRY() macro in <asm/segment.h> safe for assembly x86, trampoline: Use the unified trampoline setup for ACPI wakeup x86, trampoline: Common infrastructure for low memory trampolines Fix up trivial conflicts in arch/x86/kernel/Makefile
2011-03-16Merge branch 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpuLinus Torvalds1-2/+2
* 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu, x86: Add arch-specific this_cpu_cmpxchg_double() support percpu: Generic support for this_cpu_cmpxchg_double() alpha: use L1_CACHE_BYTES for cacheline size in the linker script percpu: align percpu readmostly subsection to cacheline Fix up trivial conflict in arch/x86/kernel/vmlinux.lds.S due to the percpu alignment having changed ("x86: Reduce back the alignment of the per-CPU data section")
2011-03-15Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tipLinus Torvalds1-1/+1
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, binutils, xen: Fix another wrong size directive x86: Remove dead config option X86_CPU x86: Really print supported CPUs if PROCESSOR_SELECT=y x86: Fix a bogus unwind annotation in lib/semaphore_32.S um, x86-64: Fix UML build after adding CFI annotations to lib/rwsem_64.S x86: Remove unused bits from lib/thunk_*.S x86: Use {push,pop}_cfi in more places x86-64: Add CFI annotations to lib/rwsem_64.S x86, asm: Cleanup unnecssary macros in asm-offsets.c x86, system.h: Drop unused __SAVE/__RESTORE macros x86: Use bitmap library functions x86: Partly unify asm-offsets_{32,64}.c x86: Reduce back the alignment of the per-CPU data section
2011-03-08x86: Separate out entry text sectionJiri Olsa1-0/+1
Put x86 entry code into a separate link section: .entry.text. Separating the entry text section seems to have performance benefits - caused by more efficient instruction cache usage. Running hackbench with perf stat --repeat showed that the change compresses the icache footprint. The icache load miss rate went down by about 15%: before patch: 19417627 L1-icache-load-misses ( +- 0.147% ) after patch: 16490788 L1-icache-load-misses ( +- 0.180% ) The motivation of the patch was to fix a particular kprobes bug that relates to the entry text section, the performance advantage was discovered accidentally. Whole perf output follows: - results for current tip tree: Performance counter stats for './hackbench/hackbench 10' (500 runs): 19417627 L1-icache-load-misses ( +- 0.147% ) 2676914223 instructions # 0.497 IPC ( +- 0.079% ) 5389516026 cycles ( +- 0.144% ) 0.206267711 seconds time elapsed ( +- 0.138% ) - results for current tip tree with the patch applied: Performance counter stats for './hackbench/hackbench 10' (500 runs): 16490788 L1-icache-load-misses ( +- 0.180% ) 2717734941 instructions # 0.502 IPC ( +- 0.079% ) 5414756975 cycles ( +- 0.148% ) 0.206747566 seconds time elapsed ( +- 0.137% ) Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: masami.hiramatsu.pt@hitachi.com Cc: ananth@in.ibm.com Cc: davem@davemloft.net Cc: 2nddept-manager@sdl.hitachi.co.jp LKML-Reference: <20110307181039.GB15197@jolsa.redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-17x86, trampoline: Common infrastructure for low memory trampolinesH. Peter Anvin1-0/+13
Common infrastructure for low memory trampolines. This code installs the trampolines permanently in low memory very early. It also permits multiple pieces of code to be used for this purpose. This code also introduces a standard infrastructure for computing symbol addresses in the trampoline code. The only change to the actual SMP trampolines themselves is that the 64-bit trampoline has been made reusable -- the previous version would overwrite the code with a status variable; this moves the status variable to a separate location. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> LKML-Reference: <4D5DFBE4.7090104@intel.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Matthieu Castet <castet.matthieu@free.fr> Cc: Stephen Rothwell <sfr@canb.auug.org.au>
2011-02-10x86: Reduce back the alignment of the per-CPU data sectionJan Beulich1-1/+1
This complements commit: 47f19a0814e8: percpu: Remove the multi-page alignment facility reverting one leftover of: fe8e0c25cad2: x86, 32-bit: Align percpu area and irq stacks to THREAD_SIZE Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Alexander van Heukelum <heukelum@fastmail.fm> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4D525CE60200007800030EE5@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Alexander van Heukelum <heukelum@fastmail.fm>
2011-01-25percpu: align percpu readmostly subsection to cachelineTejun Heo1-2/+2
Currently percpu readmostly subsection may share cachelines with other percpu subsections which may result in unnecessary cacheline bounce and performance degradation. This patch adds @cacheline parameter to PERCPU() and PERCPU_VADDR() linker macros, makes each arch linker scripts specify its cacheline size and use it to align percpu subsections. This is based on Shaohua's x86 only patch. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Shaohua Li <shaohua.li@intel.com>
2011-01-19Revert "x86: Make relocatable kernel work with new binutils"Ingo Molnar1-9/+2
This reverts commit 86b1e8dd83cb ("x86: Make relocatable kernel work with new binutils"). Markus Trippelsdorf reported a boot failure caused by this patch. The real solution to the original patch will likely involve an arch-generic solution to define an overlaid jiffies_64 and jiffies variables. Until that's done and tested on all architectures revert this commit to solve the regression. Reported-and-bisected-by: Markus Trippelsdorf <markus@trippelsdorf.de> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Cc: Shaohua Li <shaohua.li@intel.com> Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>, Cc: Sam Ravnborg <sam@ravnborg.org> LKML-Reference: <4D36A759.60704@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-18x86: Make relocatable kernel work with new binutilsShaohua Li1-2/+9
The CONFIG_RELOCATABLE=y option is broken with new binutils, which will make boot panic. According to Lu Hongjiu, the affected binutils are from 2.20.51.0.12 to 2.21.51.0.3, which are release since Oct 22 this year. At least ubuntu 10.10 is using such binutils. See: http://sourceware.org/bugzilla/show_bug.cgi?id=12327 The reason of the boot panic is that we have 'jiffies = jiffies_64;' in vmlinux.lds.S. The jiffies isn't in any section. In kernel build, there is warning saying jiffies is an absolute address and can't be relocatable. At runtime, jiffies will have virtual address 0. Signed-off-by: Shaohua Li<shaohua.li@intel.com> Cc: Lu Hongjiu<hongjiu.lu@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Sam Ravnborg <sam@ravnborg.org> LKML-Reference: <1295312269.1949.725.camel@sli10-conroe> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-18x86: Add NX protection for kernel dataMatthieu Castet1-2/+6
This patch expands functionality of CONFIG_DEBUG_RODATA to set main (static) kernel data area as NX. The following steps are taken to achieve this: 1. Linker script is adjusted so .text always starts and ends on a page bound 2. Linker script is adjusted so .rodata always start and end on a page boundary 3. NX is set for all pages from _etext through _end in mark_rodata_ro. 4. free_init_pages() sets released memory NX in arch/x86/mm/init.c 5. bios rom is set to x when pcibios is used. The results of patch application may be observed in the diff of kernel page table dumps: pcibios: -- data_nx_pt_before.txt 2009-10-13 07:48:59.000000000 -0400 ++ data_nx_pt_after.txt 2009-10-13 07:26:46.000000000 -0400 0x00000000-0xc0000000 3G pmd ---[ Kernel Mapping ]--- -0xc0000000-0xc0100000 1M RW GLB x pte +0xc0000000-0xc00a0000 640K RW GLB NX pte +0xc00a0000-0xc0100000 384K RW GLB x pte -0xc0100000-0xc03d7000 2908K ro GLB x pte +0xc0100000-0xc0318000 2144K ro GLB x pte +0xc0318000-0xc03d7000 764K ro GLB NX pte -0xc03d7000-0xc0600000 2212K RW GLB x pte +0xc03d7000-0xc0600000 2212K RW GLB NX pte 0xc0600000-0xf7a00000 884M RW PSE GLB NX pmd 0xf7a00000-0xf7bfe000 2040K RW GLB NX pte 0xf7bfe000-0xf7c00000 8K pte No pcibios: -- data_nx_pt_before.txt 2009-10-13 07:48:59.000000000 -0400 ++ data_nx_pt_after.txt 2009-10-13 07:26:46.000000000 -0400 0x00000000-0xc0000000 3G pmd ---[ Kernel Mapping ]--- -0xc0000000-0xc0100000 1M RW GLB x pte +0xc0000000-0xc0100000 1M RW GLB NX pte -0xc0100000-0xc03d7000 2908K ro GLB x pte +0xc0100000-0xc0318000 2144K ro GLB x pte +0xc0318000-0xc03d7000 764K ro GLB NX pte -0xc03d7000-0xc0600000 2212K RW GLB x pte +0xc03d7000-0xc0600000 2212K RW GLB NX pte 0xc0600000-0xf7a00000 884M RW PSE GLB NX pmd 0xf7a00000-0xf7bfe000 2040K RW GLB NX pte 0xf7bfe000-0xf7c00000 8K pte The patch has been originally developed for Linux 2.6.34-rc2 x86 by Siarhei Liakh <sliakh.lkml@gmail.com> and Xuxian Jiang <jiang@cs.ncsu.edu>. -v1: initial patch for 2.6.30 -v2: patch for 2.6.31-rc7 -v3: moved all code into arch/x86, adjusted credits -v4: fixed ifdef, removed credits from CREDITS -v5: fixed an address calculation bug in mark_nxdata_nx() -v6: added acked-by and PT dump diff to commit log -v7: minor adjustments for -tip -v8: rework with the merge of "Set first MB as RW+NX" Signed-off-by: Siarhei Liakh <sliakh.lkml@gmail.com> Signed-off-by: Xuxian Jiang <jiang@cs.ncsu.edu> Signed-off-by: Matthieu CASTET <castet.matthieu@free.fr> Cc: Arjan van de Ven <arjan@infradead.org> Cc: James Morris <jmorris@namei.org> Cc: Andi Kleen <ak@muc.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Dave Jones <davej@redhat.com> Cc: Kees Cook <kees.cook@canonical.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4CE2F82E.60601@free.fr> [ minor cleanliness edits ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-22Merge branch 'x86-irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tipLinus Torvalds1-1/+1
* 'x86-irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, 32-bit: Align percpu area and irq stacks to THREAD_SIZE x86: Move alloc_desk_mask variables inside ifdef x86-32: Align IRQ stacks properly x86: Remove CONFIG_4KSTACKS x86: Always use irq stacks Fixed up trivial conflicts in include/linux/{irq.h, percpu-defs.h}
2010-09-07x86, 32-bit: Align percpu area and irq stacks to THREAD_SIZEAlexander van Heukelum1-1/+1
The irq stacks, located in the percpu-area, need to be THREAD_SIZE aligned. Add the infrastucture to align percpu variables to larger-than-pagesize amounts within the percpu area, and use it to specify the alignment for the irq stacks. Also align the percpu area itself to THREAD_SIZE. This should make irq stacks work with 8K THREAD_SIZE. Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm> Cc: Tejun Heo <tj@kernel.org> Cc: hch@lst.de LKML-Reference: <1283799222.15941.1393621887@webmail.messagingengine.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-31x86, iommu: Fix IOMMU_INIT alignment rulesKonrad Rzeszutek Wilk1-2/+1
This boot crash was observed: DMA-API: preallocated 32768 debug entries DMA-API: debugging enabled by kernel config BUG: unable to handle kernel paging request at 19da8955 IP: [<f4ffffff>] 0xf4ffffff *pde = 00000000 The crux of the failure was that even if we did not use any of the .iommu_table section, the linker would still insert it in the vmlinux file. This patch fixes that and also fixes the runtime crash where we would try to access the array. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> LKML-Reference: <1283191802-25086-1-git-send-email-konrad.wilk@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-27x86, doc: Adding comments about .iommu_table and its neighbors.Konrad Rzeszutek Wilk1-0/+22
Updating the linker section with comments about .iommu_table and some other ones that I know of. CC: Sam Ravnborg <sam@ravnborg.org> CC: H. Peter Anvin <hpa@zytor.com> CC: Fujita Tomonori <fujita.tomonori@lab.ntt.co.jp> CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> LKML-Reference: <1282933173-19960-1-git-send-email-konrad.wilk@oracle.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-08-26x86, iommu: Add IOMMU_INIT macros, .iommu_table section, and iommu_table_entry structureKonrad Rzeszutek Wilk1-0/+7
This patch set adds a mechanism to "modularize" the IOMMUs we have on X86. Currently the count of IOMMUs is up to six and they have a complex relationship that requires careful execution order. 'pci_iommu_alloc' does that today, but most folks are unhappy with how it does it. This patch set addresses this and also paves a mechanism to jettison unused IOMMUs during run-time. For details that sparked this, please refer to: http://lkml.org/lkml/2010/8/2/282 The first solution that comes to mind is to convert wholesale the IOMMU detection routines to be called during initcall time frame. Unfortunately that misses the dependency relationship that some of the IOMMUs have (for example: for AMD-Vi IOMMU to work, GART detection MUST run first, and before all of that SWIOTLB MUST run). The second solution would be to introduce a registration call wherein the IOMMU would provide its detection/init routines and as well on what MUST run before it. That would work, except that the 'pci_iommu_alloc' which would run through this list, is called during mem_init. This means we don't have any memory allocator, and it is so early that we haven't yet started running through the initcall_t list. This solution borrows concepts from the 2nd idea and from how MODULE_INIT works. A macro is provided that each IOMMU uses to define it's detect function and early_init (before the memory allocate is active), and as well what other IOMMU MUST run before us. Since most IOMMUs depend on having SWIOTLB run first ("pci_swiotlb_detect") a convenience macro to depends on that is also provided. This macro is similar in design to MODULE_PARAM macro wherein we setup a .iommu_table section in which we populate it with the values that match a struct iommu_table_entry. During bootup we will sort through the array so that the IOMMUs that MUST run before us are first elements in the array. And then we just iterate through them calling the detection routine and if appropiate, the init routines. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> LKML-Reference: <1282845485-8991-2-git-send-email-konrad.wilk@oracle.com> CC: H. Peter Anvin <hpa@zytor.com> CC: Fujita Tomonori <fujita.tomonori@lab.ntt.co.jp> CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-06-01Merge branch 'for-35' of git://repo.or.cz/linux-kbuildLinus Torvalds1-2/+2
* 'for-35' of git://repo.or.cz/linux-kbuild: (81 commits) kbuild: Revert part of e8d400a to resolve a conflict kbuild: Fix checking of scm-identifier variable gconfig: add support to show hidden options that have prompts menuconfig: add support to show hidden options which have prompts gconfig: remove show_debug option gconfig: remove dbg_print_ptype() and dbg_print_stype() kconfig: fix zconfdump() kconfig: some small fixes add random binaries to .gitignore kbuild: Include gen_initramfs_list.sh and the file list in the .d file kconfig: recalc symbol value before showing search results .gitignore: ignore *.lzo files headerdep: perlcritic warning scripts/Makefile.lib: Align the output of LZO kbuild: Generate modules.builtin in make modules_install Revert "kbuild: specify absolute paths for cscope" kbuild: Do not unnecessarily regenerate modules.builtin headers_install: use local file handles headers_check: fix perl warnings export_report: fix perl warnings ...
2010-03-29x86: Make smp_locks end with page alignmentYinghai Lu1-1/+1
Fix: ------------[ cut here ]------------ WARNING: at arch/x86/mm/init.c:342 free_init_pages+0x4c/0xfa() free_init_pages: range [0x40daf000, 0x40db5c24] is not aligned Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.34-rc2-tip-03946-g4f16b23-dirty #50 Call Trace: [<40232e9f>] warn_slowpath_common+0x65/0x7c [<4021c9f0>] ? free_init_pages+0x4c/0xfa [<40881434>] ? _etext+0x0/0x24 [<40232eea>] warn_slowpath_fmt+0x24/0x27 [<4021c9f0>] free_init_pages+0x4c/0xfa [<40881434>] ? _etext+0x0/0x24 [<40d3f4bd>] alternative_instructions+0xf6/0x100 [<40d3fe4f>] check_bugs+0xbd/0xbf [<40d398a7>] start_kernel+0x2d5/0x2e4 [<40d390ce>] i386_start_kernel+0xce/0xd5 ---[ end trace 4eaa2a86a8e2da22 ]--- Comments in vmlinux.lds.S already said: | /* | * smp_locks might be freed after init | * start/end must be page aligned | */ Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <1269830604-26214-2-git-send-email-yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-03Rename .text.page_aligned to .text..page_aligned.Denys Vlasenko1-1/+1
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
2010-03-03Rename .bss.page_aligned to .bss..page_aligned.Tim Abbott1-1/+1
Signed-off-by: Tim Abbott <tabbott@ksplice.com> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: Michal Marek <mmarek@suse.cz>
2010-01-05Merge branch 'master' into percpuTejun Heo1-8/+37
Conflicts: arch/powerpc/platforms/pseries/hvCall.S include/linux/percpu.h
2009-12-14x86: Regex support and known-movable symbols for relocs, fix _endH. Peter Anvin1-3/+1
This adds a new category of symbols to the relocs program: symbols which are known to be relative, even though the linker emits them as absolute; this is the case for symbols that live in the linker script, which currently applies to _end. Unfortunately the previous workaround of putting _end in its own empty section was defeated by newer binutils, which remove empty sections completely. This patch also changes the symbol matching to use regular expressions instead of hardcoded C for specific patterns. This is a decidedly non-minimal patch: a modified version of the relocs program is used as part of the Syslinux build, and this is basically a backport to Linux of some of those changes; they have thus been well tested. Signed-off-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <4AF86211.3070103@zytor.com> Acked-by: Michal Marek <mmarek@suse.cz> Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
2009-12-08Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tipLinus Torvalds1-5/+33
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (36 commits) x86, mm: Correct the implementation of is_untracked_pat_range() x86/pat: Trivial: don't create debugfs for memtype if pat is disabled x86, mtrr: Fix sorting of mtrr after subtracting x86: Move find_smp_config() earlier and avoid bootmem usage x86, platform: Change is_untracked_pat_range() to bool; cleanup init x86: Change is_ISA_range() into an inline function x86, mm: is_untracked_pat_range() takes a normal semiclosed range x86, mm: Call is_untracked_pat_range() rather than is_ISA_range() x86: UV SGI: Don't track GRU space in PAT x86: SGI UV: Fix BAU initialization x86, numa: Use near(er) online node instead of roundrobin for NUMA x86, numa, bootmem: Only free bootmem on NUMA failure path x86: Change crash kernel to reserve via reserve_early() x86: Eliminate redundant/contradicting cache line size config options x86: When cleaning MTRRs, do not fold WP into UC x86: remove "extern" from function prototypes in <asm/proto.h> x86, mm: Report state of NX protections during boot x86, mm: Clean up and simplify NX enablement x86, pageattr: Make set_memory_(x|nx) aware of NX support x86, sleep: Always save the value of EFER ... Fix up conflicts (added both iommu_shutdown and is_untracked_pat_range) to 'struct x86_platform_ops') in arch/x86/include/asm/x86_init.h arch/x86/kernel/x86_init.c
2009-11-19x86: Eliminate redundant/contradicting cache line size config optionsJan Beulich1-5/+5
Rather than having X86_L1_CACHE_BYTES and X86_L1_CACHE_SHIFT (with inconsistent defaults), just having the latter suffices as the former can be easily calculated from it. To be consistent, also change X86_INTERNODE_CACHE_BYTES to X86_INTERNODE_CACHE_SHIFT, and set it to 7 (128 bytes) for NUMA to account for last level cache line size (which here matters more than L1 cache line size). Finally, make sure the default value for X86_L1_CACHE_SHIFT, when X86_GENERIC is selected, is being seen before that for the individual CPU model options (other than on x86-64, where GENERIC_CPU is part of the choice construct, X86_GENERIC is a separate option on ix86). Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Ravikiran Thirumalai <kiran@scalex86.org> Acked-by: Nick Piggin <npiggin@suse.de> LKML-Reference: <4AFD5710020000780001F8F0@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-29percpu: remove per_cpu__ prefix.Rusty Russell1-2/+2
Now that the return from alloc_percpu is compatible with the address of per-cpu vars, it makes sense to hand around the address of per-cpu variables. To make this sane, we remove the per_cpu__ prefix we used created to stop people accidentally using these vars directly. Now we have sparse, we can use that (next patch). tj: * Updated to convert stuff which were missed by or added after the original patch. * Kill per_cpu_var() macro. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
2009-10-20x86-64: add comment for RODATA large page retainmentSuresh Siddha1-1/+12
Add a comment explaining why RODATA is aligned to 2 MB. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-10-20x86-64: align RODATA kernel section to 2MB with CONFIG_DEBUG_RODATASuresh Siddha1-0/+17
CONFIG_DEBUG_RODATA chops the large pages spanning boundaries of kernel text/rodata/data to small 4KB pages as they are mapped with different attributes (text as RO, RODATA as RO and NX etc). On x86_64, preserve the large page mappings for kernel text/rodata/data boundaries when CONFIG_DEBUG_RODATA is enabled. This is done by allowing the RODATA section to be hugepage aligned and having same RWX attributes for the 2MB page boundaries Extra Memory pages padding the sections will be freed during the end of the boot and the kernel identity mappings will have different RWX permissions compared to the kernel text mappings. Kernel identity mappings to these physical pages will be mapped with smaller pages but large page mappings are still retained for kernel text,rodata,data mappings. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20091014220254.190119924@sbs-t61.sc.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-10-16x86: Document linker script ASSERT() quirkIngo Molnar1-0/+3
Older binutils breaks if ASSERT() is used without a sink for the output. For example 2.14.90.0.6 is known to be broken, the link fails with: LD .tmp_vmlinux1 ld:arch/x86/kernel/vmlinux.lds:678: parse error Document this quirk in all three files that use it. See: http://marc.info/?l=linux-kbuild&m=124930110427870&w=2 See[2]: d2ba8b2 ("x86: Fix assert syntax in vmlinux.lds.S") Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Roland McGrath <roland@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Sam Ravnborg <sam@ravnborg.org> LKML-Reference: <4AD6523D.5030909@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15Revert "x86: linker script syntax nits"Ingo Molnar1-8/+9
This reverts commit e9a63a4e559fbdc522072281d05e6b13c1022f4b. This breaks older binutils, where sink-less asserts are broken. See this commit for further details: d2ba8b2: x86: Fix assert syntax in vmlinux.lds.S Acked-by: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4AD6523D.5030909@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-14x86: linker script syntax nitsRoland McGrath1-9/+8
The linker scripts grew some use of weirdly wrong linker script syntax. It happens to work, but it's not what the syntax is documented to be. Clean it up to use the official syntax. Signed-off-by: Roland McGrath <roland@redhat.com> CC: Ian Lance Taylor <iant@google.com>
2009-09-25Merge branch 'x86/asm' into x86/urgentIngo Molnar1-66/+13
Merge reason: The linker script cleanups are ready for upstream. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-20x86: Correct segment permission flags in 64-bit linker scriptJan Beulich1-2/+2
While these don't get actively used (afaict), it still doesn't hurt for them to properly reflect what how respective segments will get mapped/ accessed. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4AA0E95F0200007800013707@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-18x86: Cleanup linker script using new linker script macros.Tim Abbott1-44/+3
Signed-off-by: Tim Abbott <tabbott@ksplice.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-09-18x86: Use section .data.page_aligned for the idt_table.Tim Abbott1-1/+0
The .data.idt section is just squashed into the .data.page_aligned output section by the linker script anyway, so it might as well be in the .data.page_aligned section. This eliminates all references to .data.idt on x86. Signed-off-by: Tim Abbott <tabbott@ksplice.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-09-18x86: convert to use __HEAD and HEAD_TEXT macros.Tim Abbott1-9/+3
This has the consequence of changing the section name use for head code from ".text.head" to ".head.text". It also eliminates the ".text.head" output section (instead placing head code at the start of the .text output section), which should be harmless. This patch only changes the sections in the actual kernel, not those in the compressed boot loader. Signed-off-by: Tim Abbott <tabbott@ksplice.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-09-18x86: fix fragile computation of vsyscall addressAnders Kaseorg1-12/+7
Previously, the address of the vsyscall page (VSYSCALL_PHYS_ADDR, VSYSCALL_VIRT_ADDR) was computed by arithmetic on the address of the last section. This leads to bugs when new sections are inserted, such as the one fixed by commit d312ceda567ab91acd756cde95ac5fbc6b40ed40. Let's compute it from the current address instead. Signed-off-by: Anders Kaseorg <andersk@ksplice.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-09-15Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpuLinus Torvalds1-7/+4
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (46 commits) powerpc64: convert to dynamic percpu allocator sparc64: use embedding percpu first chunk allocator percpu: kill lpage first chunk allocator x86,percpu: use embedding for 64bit NUMA and page for 32bit NUMA percpu: update embedding first chunk allocator to handle sparse units percpu: use group information to allocate vmap areas sparsely vmalloc: implement pcpu_get_vm_areas() vmalloc: separate out insert_vmalloc_vm() percpu: add chunk->base_addr percpu: add pcpu_unit_offsets[] percpu: introduce pcpu_alloc_info and pcpu_group_info percpu: move pcpu_lpage_build_unit_map() and pcpul_lpage_dump_cfg() upward percpu: add @align to pcpu_fc_alloc_fn_t percpu: make @dyn_size mandatory for pcpu_setup_first_chunk() percpu: drop @static_size from first chunk allocators percpu: generalize first chunk allocator selection percpu: build first chunk allocators selectively percpu: rename 4k first chunk allocator to page percpu: improve boot messages percpu: fix pcpu_reclaim() locking ... Fix trivial conflict as by Tejun Heo in kernel/sched.c
2009-08-25x86: Fix build with older binutils and consolidate linker scriptJan Beulich1-79/+47
binutils prior to 2.17 can't deal with the currently possible situation of a new segment following the per-CPU segment, but that new segment being empty - objcopy misplaces the .bss (and perhaps also the .brk) sections outside of any segment. However, the current ordering of sections really just appears to be the effect of cumulative unrelated changes; re-ordering things allows to easily guarantee that the segment following the per-CPU one is non-empty, and at once eliminates the need for the bogus data.init2 segment. Once touching this code, also use the various data section helper macros from include/asm-generic/vmlinux.lds.h. -v2: fix !SMP builds. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: <sam@ravnborg.org> LKML-Reference: <4A94085D02000078000119A5@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-14Merge branch 'percpu-for-linus' into percpu-for-nextTejun Heo1-15/+8
Conflicts: arch/sparc/kernel/smp_64.c arch/x86/kernel/cpu/perf_counter.c arch/x86/kernel/setup_percpu.c drivers/cpufreq/cpufreq_ondemand.c mm/percpu.c Conflicts in core and arch percpu codes are mostly from commit ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many num_possible_cpus() with nr_cpu_ids. As for-next branch has moved all the first chunk allocators into mm/percpu.c, the changes are moved from arch code to mm/percpu.c. Signed-off-by: Tejun Heo <tj@kernel.org>