aboutsummaryrefslogtreecommitdiffstats
path: root/arch/alpha (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2010-01-21sh: Rework P2 to only include kernel text.Paul Mundt1-77/+92
This effectively neutralizes P2 by getting rid of P1 identity mapping for all available memory and instead only establishes a single unbuffered PMB entry (16MB -- the smallest available) that covers the kernel. As using segmentation for abusing caching attributes in drivers is no longer supported (and there are no drivers that can be enabled in 32-bit mode that do this), this provides us with all of the uncached access needs by the kernel itself. Drivers and their ilk need to specify their caching attributes when remapping through page tables, as usual. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-21sh: initial PMB mapping iteration by helper macro.Paul Mundt1-147/+60
All of the cached/uncached mapping setup is duplicated for each size, and also misses out on the 16MB case. Rather than duplicating the same iter code for that we just consolidate it in to a helper macro that builds an iter for each size. The 16MB case is then trivially bolted on at the end. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-20sh: pretty print virtual memory map on boot.Paul Mundt1-2/+36
This cribs the pretty printing from arch/x86/mm/init_32.c to dump the virtual memory layout on boot. This is primarily intended as a debugging aid, given that the newer CPUs have full control over their address space and as such have little to nothing in common with the legacy layout. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-20sh: mach-sdk7786: Probe system FPGA area mapping.Paul Mundt3-4/+50
This implements dynamic probing for the system FPGA. The system reset controller contains a fixed magic read word in order to identify the FPGA. This just utilizes a simple loop that scans across all of the fixed physical areas (area 0 through area 6) to locate the FPGA. The FPGA also contains register information detailing the area mappings and chip select settings for all of the other blocks, so this needs to be done before we can set up anything else. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-20sh: Correct iounmap fixmap teardown.Paul Mundt1-7/+2
iounmap_fixed() had a couple of bugs in it that caused it to effectively fail at life. The total number of pages to unmap factored in the mapping offset and aligned up to the next page boundary, which doesn't match the ioremap_fixed() behaviour. When ioremap_fixed() pegs a slot, the address in the mapping data already contains the offset displacement, and the size is recorded verbatim given that we're only interested in total number of pages required. As such, we need to calculate the total number from the original size in the unmap path as well. At the same time, there was also an off-by-1 problem in the fixmap index calculation which has also been corrected. Previously subsequent remaps of an identical fixmap index would trigger the pte_ERROR() in set_pte_phys(): arch/sh/mm/init.c:77: bad pte 8053ffb0(0000781003fff506). arch/sh/mm/init.c:77: bad pte 8053ffb0(0000781003fff506). arch/sh/mm/init.c:77: bad pte 8053ffb0(0000781003fff506). arch/sh/mm/init.c:77: bad pte 8053ffb0(0000781003fff506). arch/sh/mm/init.c:77: bad pte 8053ffb0(0000781003fff506). arch/sh/mm/init.c:77: bad pte 8053ffb0(0000781003fff506). With this patch in place, the iounmap-driven fixmap teardown actually does what it's supposed to do. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-20sh: mach-sdk7786: reset controller reboot support.Paul Mundt1-0/+8
This wires up the machine_ops reboot call to use the system reset controller. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-20sh: machine_ops based reboot support.Paul Mundt11-90/+156
This provides a machine_ops-based reboot interface loosely cloned from x86, and converts the native sh32 and sh64 cases over to it. Necessary both for tying in SMP support and also enabling platforms like SDK7786 to add support for their microcontroller-based power managers. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-20sh: Make 29/32-bit mode check helper generally available.Paul Mundt4-14/+14
Presently __in_29bit_mode() is only defined for the PMB case, but it's also easily derived from the CONFIG_29BIT and CONFIG_32BIT && CONFIG_PMB=n cases. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-20sh: mach-sdk7786: Split out FPGA IRQ controller setup.Paul Mundt4-38/+59
This moves out the FPGA IRQ controller setup code to its own file, in preparation for switching off of IRL mode and having it provide its own irq_chip. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-20sh: mach-sdk7786: FPGA updates.Paul Mundt4-70/+159
This does a bit of refactoring of the FPGA management code. The primary FPGA initialization is moved out to its own file in preparation for implementing some of the more complex capabilities, a complete set of register definitions is provided, and all of the existing users in the board code are moved over to use the new interface instead of setting up overlapping mappings. This also corrects the FPGA size, which previously was chomped off at the SDIF control register. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-20sh: Handle SH-4 FPU variants with broken CVR values.Paul Mundt1-2/+3
Usually we can look to the CVR to work out whether we have an FPU or not. Unfortunately not all parts comply with this, so just set the flag manually for all SH-4 parts and clear it on the only SH-4 that doesn't have one (SH4-501). Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-20sh: update PFC to allow any enum in MARK listsMagnus Damm1-5/+32
This patch updates the PFC code with some clarifying comments together with a functional change. The change allows function type of GPIO to select any type of enum in their MARK lists. Without this patch only function type of enums are allowed in MARK lists. Signed-off-by: Magnus Damm <damm@opensource.se> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-20sh: Shut up noisy IOREMAP_FIXED=n build.Paul Mundt1-1/+2
The ioremap_fixed() stub neglected to provide a return value, resulting in a fairly noisy build. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-19sh: support SIU sourcing from external clock on sh7722Guennadi Liakhovetski3-7/+116
Implement .set_rate() for all SH "div4 clocks," .enable(), .disable(), and .set_parent() for those, that support them. This allows, among other uses, reparenting of SIU clocks to the external source, and enabling and disabling of the IrDA clock on sh7722. Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-19sh: Fix up sdk7780 and urquell builds.Paul Mundt2-2/+2
These two got broken in the heartbeat private data conversion, fix them up. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-19sh: urquell: Handle EXTAL configuration here, too.Paul Mundt1-1/+24
urquell happens to use the same mode pins and EXTAL configuration as SDK7786, so just copy it over. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-19sh: mach-sdk7786: Detect/configure/propagate EXTAL.Paul Mundt1-0/+23
This uses the mode pins exposed through the FPGA to work out whether we're driven from EXTAL or not and does the appropriate setup and propagation through the clock framework. This will also -EINVAL out for anyone adding in their own oscillators, forcing proper configuration with the clock framework instead of proceeding on with bogus clock values. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-19sh: SH7786 clock framework rewrite.Paul Mundt2-99/+91
This rewrites the SH7786 clock framework support completely. It's reworked to provide all of the DIV4 and MSTP function clocks. This brings it in line with the current clock framework code and lets us drop SH7786 from the list of CPUs that require legacy CPG handling. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-19sh64: Fixup build breakage from breakpoint handler rename.Paul Mundt2-3/+4
The breakpoint handler was renamed on sh32, but sh64 was overlooked in the conversion. Fix it up now. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-19sh64: Use the shared FPU state restorer.Paul Mundt3-62/+8
This kills off the sh64-specific state restorer and switches over to the generic one. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-19sh64: Fix up PC casting in unaligned fixup notifier with 32bit ABI.Paul Mundt1-2/+2
Presently the build bails with the following: CC arch/sh/mm/alignment.o cc1: warnings being treated as errors arch/sh/mm/alignment.c: In function 'unaligned_fixups_notify': arch/sh/mm/alignment.c:69: warning: cast to pointer from integer of different size arch/sh/mm/alignment.c:74: warning: cast to pointer from integer of different size make[2]: *** [arch/sh/mm/alignment.o] Error 1 This is due to the fact that regs->pc is always 64-bit, while the pointer size depends on the ABI. Wrapping through instruction_pointer() takes care of the appropriate casting for both configurations. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-19sh64: Fix up the build for the thread_xstate changes.Paul Mundt8-41/+43
This updates the sh64 processor info with the sh32 changes in order to tie in to the generic task_xstate management code. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-19sh: Kill off now bogus fixmap/page wiring documentation.Paul Mundt1-15/+0
The plans for _PAGE_WIRED were detailed in a comment with the fixmap code, but as it's now all taken care of, we no longer have any reason for keeping it around, particularly since it's no longer accurate. Kill it off. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-19sh: Split out MMUCR.URB based entry wiring in to shared helper.Paul Mundt6-176/+124
Presently this is duplicated between tlb-sh4 and tlb-pteaex. Split the helpers out in to a generic tlb-urb that can be used by any parts equipped with MMUCR.URB. At the same time, move the SH-5 code out-of-line, as we require single global state for DTLB entry wiring. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-19sh: Provide a dummy _PAGE_WIRED flag for non-X2TLB parts.Paul Mundt1-2/+2
This provides a dummy value for legacy parts which permits the entry wiring to be open-coded. The compiler takes care of optimizing the entry wiring away in these cases. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-19sh: Limit ioremap_prot() to 32bit pgprot parts.Paul Mundt2-1/+3
Presently ioremap_prot() uses an unsigned long to pass the pgprot value around. This results in the upper half of the pgprot being chomped when using 64-bit pgprots on a 32-bit ABI (X2TLB and SH-5). As the only users of ioremap_prot() are presently legacy parts, this doesn't cause too much of an issue. In the future when the interface is converted to use pgprot_t directly this can be re-enabled for the other parts, too. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-19sh: Convert p3_ioremap() users to ioremap_prot().Paul Mundt4-5/+4
This kills off the ancient p3_ioremap(), converting over to the more generic ioremap_prot() instead. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-19sh: Kill off duplicate address alignment in ioremap_fixed().Paul Mundt3-22/+8
This is already taken care of in the top-level ioremap, and now that no one should be calling ioremap_fixed() directly we can simply throw the mapping displacement in as an additional argument. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-19sh: Prevent 64-bit pgprot clobbering across ioremap implementations.Paul Mundt6-33/+41
Presently 'flags' gets passed around a lot between the various ioremap helpers and implementations, which is only 32-bits. In the X2TLB case we use 64-bit pgprots which presently results in the upper 32bits being chopped off (which handily include our read/write/exec permissions). As such, we convert everything internally to using pgprot_t directly and simply convert over with pgprot_val() where needed. With this in place, transparent fixmap utilization for early ioremap works as expected. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-18sh: Flag __ioremap_caller() __init_refok.Paul Mundt1-2/+3
The mem_init_done test makes sure that this path is only entered in __init cases, so leaving ioremap_fixed() as __init and flagging the caller __init_refok is sufficient. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-18sh: Handle unmapping of fixed slots transparently in iounmap().Paul Mundt1-0/+6
iounmap() should balance whatever is done by ioremap(). Presently ioremap() can do any of fixed mappings, PMB mappings, or page table mappings. Presently only the latter two are handled through the standard unmap path, so tie in the fixed unmapping, too. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-18sh: Make iounmap_fixed() return success/failure for iounmap() path.Paul Mundt2-4/+10
This converts iounmap_fixed() to return success/error if it handled the unmap request or not. At the same time, drop the __init label, as this can be called in to later. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-18sh: Merge _32/_64 ioremap implementations.Paul Mundt3-48/+1
There is nothing of interest in the _64 version anymore, so the _32 one can be renamed and used unconditionally. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-18sh: Fixup the IOREMAP_FIXED=n build.Paul Mundt1-0/+9
Presently the fixed ioremap API is only defined when CONFIG_IOREMAP_FIXED is set. As we want to call in to it unconditionally, provide a stubbed out interface. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-18sh: Handle early ioremaps through fixed mappings.Paul Mundt3-3/+16
This adds in a mem_init_done to work out when a standard ioremap() is possible, falling back to the fixmap based ioremap otherwise. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-18sh: Need IRQs enabled for init_fpu().Paul Mundt1-0/+2
This tosses in a local_irq_enable()/disable() pair around the init_fpu() callsite in the FPU state restore exception handler. Fixes up a slab BUG triggered by making a slab cache allocation that can sleep whilst irqs_disabled(). This follows the behaviour undertaken by the x86 implementation. Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-18sh: Setup early PMB mappings.Matt Fleming2-51/+346
More and more boards are going to start shipping that boot with the MMU in 32BIT mode by default. Previously we relied on the bootloader to setup PMB mappings for use by the kernel but we also need to cater for boards whose bootloaders don't set them up. If CONFIG_PMB_LEGACY is not enabled we have full control over our PMB mappings and can compress our address space. Usually, the distance between the the cached and uncached mappings of RAM is always 512MB, however we can compress the distance to be the amount of RAM on the board. pmb_init() now becomes much simpler. It no longer has to calculate any mappings, it just has to synchronise the software PMB table with the hardware. Tested on SDK7786 and SH7785LCR. Signed-off-by: Matt Fleming <matt@console-pimps.org> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2010-01-17modpost: fix segfault in sym_is() with prefixed archesMike Frysinger1-1/+1
The sym_is() compares a symbol in an attempt to automatically skip symbol prefixes. It does this first by searching the real symbol with the normal unprefixed symbol. But then it uses the length of the original symbol to check the end of the substring instead of the length of the symbol it is looking for. On non-prefixed arches, this is effectively the same thing, so there is no problem. On prefixed-arches, since this is exceeds by just one byte, a crash is rare and it is usually a NUL byte anyways. But every once in a blue moon, you get the right page alignment and it segfaults. For example, on the Blackfin arch, sym_is() will be called with the real symbol "___mod_usb_device_table" as "symbol" when looking for the normal symbol "__mod_usb_device_table" as "name". The substring will thus return one byte into "symbol" and store it into "match". But then "match" will be indexed with the length of "symbol" instead of "name" and so we will exceed the storage. i.e. the code ends up doing: char foo[] = "abc"; return foo[strlen(foo)+1] == '\0'; Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-16page allocator: update NR_FREE_PAGES only when necessaryKOSAKI Motohiro1-1/+1
commit f2260e6b (page allocator: update NR_FREE_PAGES only as necessary) made one minor regression. if __rmqueue() was failed, NR_FREE_PAGES stat go wrong. this patch fixes it. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Reported-by: Huang Shijie <shijie8@gmail.com> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-16revert "drivers/video/s3c-fb.c: fix clock setting for Samsung SoC Framebuffer"Mark Brown1-6/+8
Fix divide by zero and broken output. Commit 600ce1a0fa ("fix clock setting for Samsung SoC Framebuffer") introduced a mandatory refresh parameter to the platform data for the S3C framebuffer but did not introduce any validation code, causing existing platforms (none of which have refresh set) to divide by zero whenever the framebuffer is configured, generating warnings and unusable output. Ben Dooks noted several problems with the patch: - The platform data supplies the pixclk directly and should already have taken care of the refresh rate. - The addition of a window ID parameter doesn't help since only the root framebuffer can control the pixclk. - pixclk is specified in picoseconds (rather than Hz) as the patch assumed. and suggests reverting the commit so do that. Without fixing this no mainline user of the driver will produce output. [akpm@linux-foundation.org: don't revert the correct bit] Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: InKi Dae <inki.dae@samsung.com> Cc: Kyungmin Park <kmpark@infradead.org> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Ben Dooks <ben-linux@fluff.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-16nommu: fix shared mmap after truncate shrinkage problemsDavid Howells3-30/+64
Fix a problem in NOMMU mmap with ramfs whereby a shared mmap can happen over the end of a truncation. The problem is that ramfs_nommu_check_mappings() checks that the reduced file size against the VMA tree, but not the vm_region tree. The following sequence of events can cause the problem: fd = open("/tmp/x", O_RDWR|O_TRUNC|O_CREAT, 0600); ftruncate(fd, 32 * 1024); a = mmap(NULL, 32 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); b = mmap(NULL, 16 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); munmap(a, 32 * 1024); ftruncate(fd, 16 * 1024); c = mmap(NULL, 32 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); Mapping 'a' creates a vm_region covering 32KB of the file. Mapping 'b' sees that the vm_region from 'a' is covering the region it wants and so shares it, pinning it in memory. Mapping 'a' then goes away and the file is truncated to the end of VMA 'b'. However, the region allocated by 'a' is still in effect, and has _not_ been reduced. Mapping 'c' is then created, and because there's a vm_region covering the desired region, get_unmapped_area() is _not_ called to repeat the check, and the mapping is granted, even though the pages from the latter half of the mapping have been discarded. However: d = mmap(NULL, 16 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); Mapping 'd' should work, and should end up sharing the region allocated by 'a'. To deal with this, we shrink the vm_region struct during the truncation, lest do_mmap_pgoff() take it as licence to share the full region automatically without calling the get_unmapped_area() file op again. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Greg Ungerer <gerg@snapgear.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-16nommu: fix race between ramfs truncation and shared mmapDavid Howells1-1/+6
Fix the race between the truncation of a ramfs file and an attempt to make a shared mmap of region of that file. The problem is that do_mmap_pgoff() calls f_op->get_unmapped_area() to verify that the file region is made of contiguous pages and to find its base address - but there isn't any locking to guarantee this region until vma_prio_tree_insert() is called by add_vma_to_mm(). Note that moving the functionality into f_op->mmap() doesn't help as that is also called before vma_prio_tree_insert(). Instead make ramfs_nommu_check_mappings() grab nommu_region_sem whilst it does its checks. This means that this function will wait whilst mmaps take place. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Greg Ungerer <gerg@snapgear.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-16nommu: don't need get_unmapped_area() for NOMMUDavid Howells4-24/+8
get_unmapped_area() is unnecessary for NOMMU as no-one calls it. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Greg Ungerer <gerg@snapgear.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-16nommu: remove a superfluous check of vm_region::vm_usageDavid Howells1-4/+3
In split_vma(), there's no need to check if the VMA being split has a region that's in use by more than one VMA because: (1) The preceding test prohibits splitting of non-anonymous VMAs and regions (eg: file or chardev backed VMAs). (2) Anonymous regions can't be mapped multiple times because there's no handle by which to refer to the already existing region. (3) If a VMA has previously been split, then the region backing it has also been split into two regions, each of usage 1. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Greg Ungerer <gerg@snapgear.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-16nommu: struct vm_region's vm_usage count need not be atomicDavid Howells2-8/+8
The vm_usage count field in struct vm_region does not need to be atomic as it's only even modified whilst nommu_region_sem is write locked. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Greg Ungerer <gerg@snapgear.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-16nommu: fix SYSV SHM for NOMMUDavid Howells1-0/+3
Commit c4caa778157dbbf04116f0ac2111e389b5cd7a29 ("file ->get_unmapped_area() shouldn't duplicate work of get_unmapped_area()") broke SYSV SHM for NOMMU by taking away the pointer to shm_get_unmapped_area() from shm_file_operations. Put it back conditionally on CONFIG_MMU=n. file->f_ops->get_unmapped_area() is used to find out the base address for a mapping of a mappable chardev device or mappable memory-based file (such as a ramfs file). It needs to be called prior to file->f_ops->mmap() being called. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Greg Ungerer <gerg@snapgear.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-16sysdev: fix prototype for memory_sysdev_class show/store functionsWu Fengguang1-12/+20
The function prototype mismatches in call stack: [<ffffffff81494268>] print_block_size+0x58/0x60 [<ffffffff81487e3f>] sysdev_class_show+0x1f/0x30 [<ffffffff811d629b>] sysfs_read_file+0xcb/0x1f0 [<ffffffff81176328>] vfs_read+0xc8/0x180 Due to prototype mismatch, print_block_size() will sprintf() into *attribute instead of *buf, hence user space will read the initial zeros from *buf: $ hexdump /sys/devices/system/memory/block_size_bytes 0000000 0000 0000 0000 0000 0000008 After patch: cat /sys/devices/system/memory/block_size_bytes 0x8000000 This complements commits c29af9636 and 4a0b2b4dbe. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: "Zheng, Shaohui" <shaohui.zheng@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-16memory-hotplug: add 0x prefix to HEX block_size_bytesWu Fengguang1-1/+1
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-16memcg: ensure list is empty at rmdirDaisuke Nishimura1-7/+4
Current mem_cgroup_force_empty() only ensures mem->res.usage == 0 on success. But this doesn't guarantee memcg's LRU is really empty, because there are some cases in which !PageCgrupUsed pages exist on memcg's LRU. For example: - Pages can be uncharged by its owner process while they are on LRU. - race between mem_cgroup_add_lru_list() and __mem_cgroup_uncharge_common(). So there can be a case in which the usage is zero but some of the LRUs are not empty. OTOH, mem_cgroup_del_lru_list(), which can be called asynchronously with rmdir, accesses the mem_cgroup, so this access can cause a problem if it races with rmdir because the mem_cgroup might have been freed by rmdir. Actually, I saw a bug which seems to be caused by this race. [1530745.949906] BUG: unable to handle kernel NULL pointer dereference at 0000000000000230 [1530745.950651] IP: [<ffffffff810fbc11>] mem_cgroup_del_lru_list+0x30/0x80 [1530745.950651] PGD 3863de067 PUD 3862c7067 PMD 0 [1530745.950651] Oops: 0002 [#1] SMP [1530745.950651] last sysfs file: /sys/devices/system/cpu/cpu7/cache/index1/shared_cpu_map [1530745.950651] CPU 3 [1530745.950651] Modules linked in: configs ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp nfsd nfs_acl auth_rpcgss exportfs autofs4 hidp rfcomm l2cap crc16 bluetooth lockd sunrpc ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi dm_mirror dm_multipath scsi_dh video output sbs sbshc battery ac lp kvm_intel kvm sg ide_cd_mod cdrom serio_raw tpm_tis tpm tpm_bios acpi_memhotplug button parport_pc parport rtc_cmos rtc_core rtc_lib e1000 i2c_i801 i2c_core pcspkr dm_region_hash dm_log dm_mod ata_piix libata shpchp megaraid_mbox sd_mod scsi_mod megaraid_mm ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: freq_table] [1530745.950651] Pid: 19653, comm: shmem_test_02 Tainted: G M 2.6.32-mm1-00701-g2b04386 #3 Express5800/140Rd-4 [N8100-1065] [1530745.950651] RIP: 0010:[<ffffffff810fbc11>] [<ffffffff810fbc11>] mem_cgroup_del_lru_list+0x30/0x80 [1530745.950651] RSP: 0018:ffff8803863ddcb8 EFLAGS: 00010002 [1530745.950651] RAX: 00000000000001e0 RBX: ffff8803abc02238 RCX: 00000000000001e0 [1530745.950651] RDX: 0000000000000000 RSI: ffff88038611a000 RDI: ffff8803abc02238 [1530745.950651] RBP: ffff8803863ddcc8 R08: 0000000000000002 R09: ffff8803a04c8643 [1530745.950651] R10: 0000000000000000 R11: ffffffff810c7333 R12: 0000000000000000 [1530745.950651] R13: ffff880000017f00 R14: 0000000000000092 R15: ffff8800179d0310 [1530745.950651] FS: 0000000000000000(0000) GS:ffff880017800000(0000) knlGS:0000000000000000 [1530745.950651] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [1530745.950651] CR2: 0000000000000230 CR3: 0000000379d87000 CR4: 00000000000006e0 [1530745.950651] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [1530745.950651] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [1530745.950651] Process shmem_test_02 (pid: 19653, threadinfo ffff8803863dc000, task ffff88038612a8a0) [1530745.950651] Stack: [1530745.950651] ffffea00040c2fe8 0000000000000000 ffff8803863ddd98 ffffffff810c739a [1530745.950651] <0> 00000000863ddd18 000000000000000c 0000000000000000 0000000000000000 [1530745.950651] <0> 0000000000000002 0000000000000000 ffff8803863ddd68 0000000000000046 [1530745.950651] Call Trace: [1530745.950651] [<ffffffff810c739a>] release_pages+0x142/0x1e7 [1530745.950651] [<ffffffff810c778f>] ? pagevec_move_tail+0x6e/0x112 [1530745.950651] [<ffffffff810c781e>] pagevec_move_tail+0xfd/0x112 [1530745.950651] [<ffffffff810c78a9>] lru_add_drain+0x76/0x94 [1530745.950651] [<ffffffff810dba0c>] exit_mmap+0x6e/0x145 [1530745.950651] [<ffffffff8103f52d>] mmput+0x5e/0xcf [1530745.950651] [<ffffffff81043ea8>] exit_mm+0x11c/0x129 [1530745.950651] [<ffffffff8108fb29>] ? audit_free+0x196/0x1c9 [1530745.950651] [<ffffffff81045353>] do_exit+0x1f5/0x6b7 [1530745.950651] [<ffffffff8106133f>] ? up_read+0x2b/0x2f [1530745.950651] [<ffffffff8137d187>] ? lockdep_sys_exit_thunk+0x35/0x67 [1530745.950651] [<ffffffff81045898>] do_group_exit+0x83/0xb0 [1530745.950651] [<ffffffff810458dc>] sys_exit_group+0x17/0x1b [1530745.950651] [<ffffffff81002c1b>] system_call_fastpath+0x16/0x1b [1530745.950651] Code: 54 53 0f 1f 44 00 00 83 3d cc 29 7c 00 00 41 89 f4 75 63 eb 4e 48 83 7b 08 00 75 04 0f 0b eb fe 48 89 df e8 18 f3 ff ff 44 89 e2 <48> ff 4c d0 50 48 8b 05 2b 2d 7c 00 48 39 43 08 74 39 48 8b 4b [1530745.950651] RIP [<ffffffff810fbc11>] mem_cgroup_del_lru_list+0x30/0x80 [1530745.950651] RSP <ffff8803863ddcb8> [1530745.950651] CR2: 0000000000000230 [1530745.950651] ---[ end trace c3419c1bb8acc34f ]--- [1530745.950651] Fixing recursive fault but reboot is needed! The problem here is pages on LRU may contain pointer to stale memcg. To make res->usage to be 0, all pages on memcg must be uncharged or moved to another(parent) memcg. Moved page_cgroup have already removed from original LRU, but uncharged page_cgroup contains pointer to memcg withou PCG_USED bit. (This asynchronous LRU work is for improving performance.) If PCG_USED bit is not set, page_cgroup will never be added to memcg's LRU. So, about pages not on LRU, they never access stale pointer. Then, what we have to take care of is page_cgroup _on_ LRU list. This patch fixes this problem by making mem_cgroup_force_empty() visit all LRUs before exiting its loop and guarantee there are no pages on its LRU. Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-16virtio: fix section mismatch warningsJeff Mahoney2-6/+6
Fix fixes the following warnings by renaming the driver structures to be suffixed with _driver. WARNING: drivers/virtio/virtio_balloon.o(.data+0x88): Section mismatch in reference from the variable virtio_balloon to the function .devexit.text:virtballoon_remove() WARNING: drivers/char/hw_random/virtio-rng.o(.data+0x88): Section mismatch in reference from the variable virtio_rng to the function .devexit.text:virtrng_remove() Signed-off-by: Jeff Mahoney <jeffm@suse.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>