linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2008-03-28	audit: silence two kerneldoc warnings in kernel/audit.c	Dave Jones	1	-3/+3
	Silence two kerneldoc warnings. Warning(kernel/audit.c:1276): No description found for parameter 'string' Warning(kernel/audit.c:1276): No description found for parameter 'len' [also fix a typo for bonus points] Signed-off-by: Dave Jones <davej@codemonkey.org.uk> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28	afs: prevent double cell registration	Sven Schnelle	1	-2/+13
	kafs doesn't check if the cell already exists - so if you do an echo "add newcell.org 1.2.3.4" >/proc/fs/afs/cells it will try to create this cell again. kobject will also complain about a double registration. To prevent such problems, return -EEXIST in that case. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28	afs: add a MAINTAINERS record for AFS	David Howells	1	-0/+6
	Add a MAINTAINERS record for AFS. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28	vfs: fix data leak in nobh_write_end()	Dmitri Monakhov	1	-7/+6
	Current nobh_write_end() implementation ignore partial writes(copied < len) case if page was fully mapped and simply mark page as Uptodate, which is totally wrong because area [pos+copied, pos+len) wasn't updated explicitly in previous write_begin call. It simply contains garbage from pagecache and result in data leakage. #TEST_CASE_BEGIN: ~~~~~~~~~~~~~~~~ In fact issue triggered by classical testcase open("/mnt/test", O_RDWR\|O_CREAT\|O_TRUNC, 0666) = 3 ftruncate(3, 409600) = 0 writev(3, [{"a", 1}, {NULL, 4095}], 2) = 1 ##TESTCASE_SOURCE: ~~~~~~~~~~~~~~~~~ #include <stdio.h> #include <stdlib.h> #include <fcntl.h> #include <sys/uio.h> #include <sys/mman.h> #include <errno.h> int main(int argc, char *argv) { int fd, ret; void p; struct iovec iov[2]; fd = open(argv[1], O_RDWR\|O_CREAT\|O_TRUNC, 0666); ftruncate(fd, 409600); iov[0].iov_base="a"; iov[0].iov_len=1; iov[1].iov_base=NULL; iov[1].iov_len=4096; ret = writev(fd, iov, sizeof(iov)/sizeof(struct iovec)); printf("writev = %d, err = %d\n", ret, errno); return 0; } ##TESTCASE RESULT: ~~~~~~~~~~~~~~~~~~ [root@ts63 ~]# mount \| grep mnt2 /dev/mapper/test on /mnt2 type ext2 (rw,nobh) [root@ts63 ~]# /tmp/writev /mnt2/test writev = 1, err = 0 [root@ts63 ~]# hexdump -C /mnt2/test 00000000 61 65 62 6f 6f 74 00 00 f0 b9 b4 59 3a 00 00 00 \|aeboot.....Y:...\| 00000010 20 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 \| .......!.......\| 00000020 df df df df df df df df df df df df df df df df \|................\| 00000030 3a 00 00 00 2a 00 00 00 21 00 00 00 00 00 00 00 \|:......!.......\| 00000040 60 c0 8c 00 00 00 00 00 40 4a 8d 00 00 00 00 00 \|`.......@J......\| 00000050 00 00 00 00 00 00 00 00 41 00 00 00 00 00 00 00 \|........A.......\| 00000060 74 69 6d 65 20 64 64 20 69 66 3d 2f 64 65 76 2f \|time dd if=/dev/\| 00000070 6c 6f 6f 70 30 20 20 6f 66 3d 2f 64 65 76 2f 6e \|loop0 of=/dev/n\| skip.. 00000f50 00 00 00 00 00 00 00 00 31 00 00 00 00 00 00 00 \|........1.......\| 00000f60 6d 6b 66 73 2e 65 78 74 33 20 2f 64 65 76 2f 76 \|mkfs.ext3 /dev/v\| 00000f70 7a 76 67 2f 74 65 73 74 20 2d 62 34 30 39 36 00 \|zvg/test -b4096.\| 00000f80 a0 fe 8c 00 00 00 00 00 21 00 00 00 00 00 00 00 \|........!.......\| 00000f90 23 31 32 30 35 39 35 30 34 30 34 00 3a 00 00 00 \|#1205950404.:...\| 00000fa0 20 00 8d 00 00 00 00 00 21 00 00 00 00 00 00 00 \| .......!.......\| 00000fb0 d0 cf 8c 00 00 00 00 00 10 d0 8c 00 00 00 00 00 \|................\| 00000fc0 00 00 00 00 00 00 00 00 41 00 00 00 00 00 00 00 \|........A.......\| 00000fd0 6d 6f 75 6e 74 20 2f 64 65 76 2f 76 7a 76 67 2f \|mount /dev/vzvg/\| 00000fe0 74 65 73 74 20 20 2f 76 7a 20 2d 6f 20 64 61 74 \|test /vz -o dat\| 00000ff0 61 3d 77 72 69 74 65 62 61 63 6b 00 00 00 00 00 \|a=writeback.....\| 00001000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 \|................\| As you can see file's page contains garbage from pagecache instead of zeros. #TEST_CASE_END Attached patch: - Add sanity check BUG_ON in order to prevent incorrect usage by caller, This is function invariant because page can has buffers and in no zero fadata pointer at the same time. - Always attach buffers to page is it is partial write case. - Always switch back to generic_write_end if page has buffers. This is reasonable because if page already has buffer then generic_write_begin was called previously. Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org> Reviewed-by: Nick Piggin <npiggin@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28	memcgroup: fix spurious EBUSY on memory cgroup removal	YAMAMOTO Takashi	1	-1/+1
	Call mm_free_cgroup earlier. Otherwise a reference due to lazy mm switching can prevent cgroup removal. Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28	in_atomic(): document why it is unsuitable for general use	Jonathan Corbet	1	-0/+7
	Discourage people from inappropriately using in_atomic() Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28	drivers/char/drm/ati_pcigart.c: fix printk warning	Andrew Morton	1	-2/+3
	drivers/char/drm/ati_pcigart.c: In function 'drm_ati_pcigart_init': drivers/char/drm/ati_pcigart.c:125: warning: format '%08X' expects type 'unsigned int', but argument 3 has type 'dma_addr_t' Cc: Dave Airlie <airlied@linux.ie> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28	mtd: nand: add out label in rfc_from4	Sebastian Siewior	1	-1/+1
	This has been forgotten in commit f5bbdacc419 ("[MTD] NAND Modularize read function") and nobody compiled the driver. Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joern Engel <joern@wh.fh-wedel.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28	Avoid false positive warnings in kmap_atomic_prot() with DEBUG_HIGHMEM	Andrew Morton	1	-3/+3
	I believe http://bugzilla.kernel.org/show_bug.cgi?id=10318 is a false positive. There's no way in which networking will be using highmem pages here, so it won't be taking the KM_USER0 kmap slot, so there's no point in performing these checks. Cc: Pawel Staszewski <pstaszewski@artcom.pl> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Christoph Lameter <clameter@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Really sad. We lose almost all real-life coverage of the debug tests with this patch. Now it will only report problems for the cases where people actually end up using a HIGHMEM page, not when they just _might_ use one. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28	RDMA/cxgb3: Program hardware IRD with correct value	Roland Dreier	1	-1/+1
	Because of a typo in iwch_accept_cr(), the cxgb3 connection handling code programs the hardware IRD (incoming RDMA read queue depth) with the value that is passed in for the ORD (outgoing RDMA read queue depth). In particular this means that if an application passes in IRD > 0 and ORD = 0 (which is a completely sane and valid thing to do for an app that expects only incoming RDMA read requests), then the hardware will end up programmed with IRD = 0 and the app will fail in a mysterious way. Fix this by using "ep->ird" instead of "ep->ord" in the intended place. Signed-off-by: Roland Dreier <rolandd@cisco.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28	revert "ACPI: drivers/acpi: elide a non-zero test on a result that is never 0"	Ingo Molnar	3	-43/+49
	Revert commit 1192aeb957402b45f311895f124e4ca41206843c ("ACPI: drivers/acpi: elide a non-zero test on a result that is never 0") because it turns out that thermal_cooling_device_register() does actually return NULL if CONFIG_THERMAL is turned off (then the routine turns into a dummy inline routine in the header files that returns NULL unconditionally). This was found with randconfig testing, causing a crash during bootup: initcall 0x78878534 ran for 13 msecs: acpi_button_init+0x0/0x51() Calling initcall 0x78878585: acpi_fan_init+0x0/0x2c() BUG: unable to handle kernel NULL pointer dereference at 00000000 IP: [<782b8ad0>] acpi_fan_add+0x7d/0xfd *pde = 00000000 Oops: 0000 [#1] Modules linked in: Pid: 1, comm: swapper Not tainted (2.6.25-rc7-sched-devel.git-x86-latest.git #14) EIP: 0060:[<782b8ad0>] EFLAGS: 00010246 CPU: 0 EIP is at acpi_fan_add+0x7d/0xfd EAX: b787c718 EBX: b787c400 ECX: b782ceb4 EDX: 00000007 ESI: 00000000 EDI: b787c6f4 EBP: b782cee0 ESP: b782cecc DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 Process swapper (pid: 1, ti=b782c000 task=b7846000 task.ti=b782c000) Stack: b787c459 00000000 b787c400 78790888 b787c60c b782cef8 782b6fb8 ffffffda b787c60c 00000000 78790958 b782cf0c 783005d7 b787c60c 78790958 78790584 b782cf1c 783007f6 b782cf28 00000000 b782cf40 782ffc4a 78790958 b794d558 Call Trace: [<782b6fb8>] ? acpi_device_probe+0x3e/0xdb [<783005d7>] ? driver_probe_device+0x82/0xfc [<783007f6>] ? __driver_attach+0x3a/0x70 [<782ffc4a>] ? bus_for_each_dev+0x3e/0x60 [<7830048c>] ? driver_attach+0x14/0x16 [<783007bc>] ? __driver_attach+0x0/0x70 [<7830006a>] ? bus_add_driver+0x9d/0x1b0 [<783008c3>] ? driver_register+0x47/0xa3 [<7813db00>] ? timespec_to_ktime+0x9/0xc [<782b7331>] ? acpi_bus_register_driver+0x3a/0x3c [<78878592>] ? acpi_fan_init+0xd/0x2c [<78863656>] ? kernel_init+0xac/0x1f9 [<788635aa>] ? kernel_init+0x0/0x1f9 [<78114563>] ? kernel_thread_helper+0x7/0x10 ======================= Code: 6e 78 e8 57 44 e7 ff 58 e9 93 00 00 00 8b 55 f0 8d bb f4 02 00 00 80 4b 2d 10 8b 03 e8 87 cb ff ff 8d 83 18 03 00 00 80 63 2d ef <ff> 35 00 00 00 00 50 68 e8 9c 6e 78 e8 22 44 e7 ff b9 b6 9c 6e EIP: [<782b8ad0>] acpi_fan_add+0x7d/0xfd SS:ESP 0068:b782cecc ---[ end trace 778e504de7e3b1e3 ]--- Kernel panic - not syncing: Attempted to kill init! Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Julia Lawall <julia@diku.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-28	[POWERPC] Fix missed hardware breakpoints across multiple threads	Michael Ellerman	1	-5/+5
	There is a bug in the powerpc DABR (data access breakpoint) handling, which can result in us missing breakpoints if several threads are trying to break on the same address. The circumstances are that do_page_fault() calls do_dabr(), this clears the DABR (sets it to 0) and sets up the signal which will report to userspace that the DABR was hit. The do_signal() code will restore the DABR value on the way out to userspace. If we reschedule before calling do_signal(), __switch_to() will check the cached DABR value and compare it to the new thread's value, if they match we don't set the DABR in hardware. So if two threads have the same DABR value, and we schedule from one to the other after taking the interrupt for the first thread hitting the DABR, the second thread will run without the DABR set in hardware. The cleanest fix is to move the cache update into set_dabr(), that way we can't forget to do it. Reported-by: Jan Kratochvil <jan.kratochvil@redhat.com> Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-03-27	Revert "SLUB: remove useless masking of GFP_ZERO"	Linus Torvalds	1	-0/+3
	This reverts commit 3811dbf67162bd08412f1b0e02e554f353e93bdb. The masking was not at all useless, and it was sensible. We handle GFP_ZERO in the caller, and passing it down to any page allocator logic is buggy and wrong. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-27	[PATCH] mnt_expire is protected by namespace_sem, no need for vfsmount_lock	Al Viro	1	-9/+2
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-27	[PATCH] do shrink_submounts() for all fs types	Al Viro	7	-27/+10
	... and take it out of ->umount_begin() instances. Call with all locks already taken (by do_umount()) and leave calling release_mounts() to caller (it will do release_mounts() anyway, so we can just put into the same list). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-27	[PATCH] sanitize locking in mark_mounts_for_expiry() and shrink_submounts()	Al Viro	1	-81/+24
	... and fix a race on access of ->mnt_share et.al. without namespace_sem in the latter. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-27	[PATCH] count ghost references to vfsmounts	Al Viro	3	-2/+6
	make propagate_mount_busy() exclude references from the vfsmounts that had been isolated by umount_tree() and are just waiting for release_mounts() to dispose of their ->mnt_parent/->mnt_mountpoint. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-27	[PATCH] reduce stack footprint in namespace.c	Al Viro	1	-35/+37
	A lot of places misuse struct nameidata when they need struct path. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-03-28	lguest: comment documentation update.	Rusty Russell	13	-142/+208
	Took some cycles to re-read the Lguest Journey end-to-end, fix some rot and tighten some phrases. Only comments change. No new jokes, but a couple of recycled old jokes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-03-28	lguest: Don't need comment terminator before disk section.	Rusty Russell	1	-1/+0
	Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-03-28	lguest: lguest.txt documentation fix	Paul Bolle	1	-4/+8
	Mention the config options for the Virtio drivers and move the Virtualization menu to the toplevel. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-03-28	lguest: Add puppies which where previously missing.	Tim Ansell	2	-3/+12
	lguest doesn't have features, it has puppies! Signed-off-by: Timothy R Ansell <mithro@mithis.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-03-28	virtio_pci: unregister virtio device at device remove	Anthony Liguori	1	-0/+1
	Make sure to call unregister_virtio_device() when a virtio device is removed. Otherwise, virtio_pci.ko cannot be rmmod'd. This was spotted by Marcelo Tosatti. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-03-27	x86: prefetch fix #2	Ingo Molnar	1	-7/+4
	Linus noticed a second bug and an uncleanliness: - we'd return on any instruction fetch fault - we'd use both the value of 16 and the PF_INSTR symbol which are the same and make no sense the cleanup nicely unifies this piece of logic. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-27	xen: fix UP setup of shared_info	Jeremy Fitzhardinge	1	-20/+25
	We need to set up the shared_info pointer once we've mapped the real shared_info into its fixmap slot. That needs to happen once the general pagetable setup has been done. Previously, the UP shared_info was set up one in xen_start_kernel, but that was left pointing to the dummy shared info. Unfortunately there's no really good place to do a later setup of the shared_info in UP, so just do it once the pagetable setup has been done. [ Stable: needed in 2.6.24.x ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-27	xen: fix RMW when unmasking events	Jeremy Fitzhardinge	2	-3/+8
	xen_irq_enable_direct and xen_sysexit were using "andw $0x00ff, XEN_vcpu_info_pending(vcpu)" to unmask events and test for pending ones in one instuction. Unfortunately, the pending flag must be modified with a locked operation since it can be set by another CPU, and the unlocked form of this operation was causing the pending flag to get lost, allowing the processor to return to usermode with pending events and ultimately deadlock. The simple fix would be to make it a locked operation, but that's rather costly and unnecessary. The fix here is to split the mask-clearing and pending-testing into two instructions; the interrupt window between them is of no concern because either way pending or new events will be processed. This should fix lingering bugs in using direct vcpu structure access too. [ Stable: needed in 2.6.24.x ] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-27	x86, documentation: nmi_watchdog=2 works on x86_64	Marcin Slusarz	1	-2/+1
	Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-27	x86: stricter check in follow_huge_addr()	Christoph Lameter	1	-1/+1
	The first page of the compound page is determined in follow_huge_addr() but then PageCompound() only checks if the page is part of a compound page. PageHead() allows checking if this is indeed the first page of the compound. Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-27	rdc321x: GPIO routines bugfixes	Florian Fainelli	4	-53/+165
	This patch fixes the use of GPIO routines which are in the PCI configuration space of the RDC321x, therefore reading/writing to this space without spinlock protection can be problematic. We also now request and free GPIOs and support the MGB100 board, previous code was very AR525W-centric. Signed-off-by: Volker Weiss <volker@tintuc.de> Signed-off-by: Florian Fainelli <florian.fainelli@telecomint.eu> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-27	x86: ptrace.c: fix defined-but-unused warnings	Andrew Morton	1	-84/+85
	arch/x86/kernel/ptrace.c:548: warning: 'ptrace_bts_get_size' defined but not used arch/x86/kernel/ptrace.c:558: warning: 'ptrace_bts_read_record' defined but not used arch/x86/kernel/ptrace.c:607: warning: 'ptrace_bts_clear' defined but not used arch/x86/kernel/ptrace.c:617: warning: 'ptrace_bts_drain' defined but not used arch/x86/kernel/ptrace.c:720: warning: 'ptrace_bts_config' defined but not used arch/x86/kernel/ptrace.c:788: warning: 'ptrace_bts_status' defined but not used Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-03-27	x86: fix prefetch workaround	Ingo Molnar	1	-1/+2
	some early Athlon XP's and Opterons generate bogus faults on prefetch instructions. The workaround for this regressed over .24 - reinstate it. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-27	Give futex init a proper name	Benjamin Herrenschmidt	1	-2/+2
	The futex init function is called init(). This is a pain in the neck when debugging when you code dies in ... init :-) This renames it to futex_init(). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-27	avr32: Fix bug in early resource allocation code	Haavard Skinnemoen	1	-0/+1
	add_reserved_region() tries to keep the resource list sorted, so when looking for a place to insert the new resource, it may break out before the last entry. When this happens, the list is broken in two because the sibling field of the new entry doesn't point to the next resource. Fix it by updating the new resource's sibling field appropriately. Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
2008-03-27	ACPI: drivers/acpi: elide a non-zero test on a result that is never 0	Julia Lawall	3	-49/+43
	The function thermal_cooling_device_register always returns either a valid pointer or a value made with ERR_PTR, so a test for non-zero on the result will always succeed. The problem was found using the following semantic match. (http://www.emn.fr/x-info/coccinelle/) //<smpl> @a@ expression E, E1; statement S,S1; position p; @@ E = thermal_cooling_device_register(...) ... when != E = E1 if@p (E) S else S1 @n@ position a.p; expression E,E1; statement S,S1; @@ E = NULL ... when != E = E1 if@p (E) S else S1 @depends on !n@ expression E; statement S,S1; position a.p; @@ * if@p (E) S else S1 //</smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com>
2008-03-26	[SPARC64]: Define TASK_SIZE_OF()	David S. Miller	1	-0/+3
	This make "cat /proc/${PID}/pagemap" more efficient for 32-bit tasks. Based upon a report by Mariusz Kozlowski. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26	[IPSEC]: Fix BEET output	Herbert Xu	5	-6/+16
	The IPv6 BEET output function is incorrectly including the inner header in the payload to be protected. This causes a crash as the packet doesn't actually have that many bytes for a second header. The IPv4 BEET output on the other hand is broken when it comes to handling an inner IPv6 header since it always assumes an inner IPv4 header. This patch fixes both by making sure that neither BEET output function touches the inner header at all. All access is now done through the protocol-independent cb structure. Two new attributes are added to make this work, the IP header length and the IPv4 option length. They're filled in by the inner mode's output function. Thanks to Joakim Koskela for finding this problem. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-26	hugetlb: fix potential livelock in return_unused_surplus_hugepages()	Nishanth Aravamudan	1	-1/+10
	Running the counters testcase from libhugetlbfs results in on 2.6.25-rc5 and 2.6.25-rc5-mm1: BUG: soft lockup - CPU#3 stuck for 61s! [counters:10531] NIP: c0000000000d1f3c LR: c0000000000d1f2c CTR: c0000000001b5088 REGS: c000005db12cb360 TRAP: 0901 Not tainted (2.6.25-rc5-autokern1) MSR: 8000000000009032 <EE,ME,IR,DR> CR: 48008448 XER: 20000000 TASK = c000005dbf3d6000[10531] 'counters' THREAD: c000005db12c8000 CPU: 3 GPR00: 0000000000000004 c000005db12cb5e0 c000000000879228 0000000000000004 GPR04: 0000000000000010 0000000000000000 0000000000200200 0000000000100100 GPR08: c0000000008aba10 000000000000ffff 0000000000000004 0000000000000000 GPR12: 0000000028000442 c000000000770080 NIP [c0000000000d1f3c] .return_unused_surplus_pages+0x84/0x18c LR [c0000000000d1f2c] .return_unused_surplus_pages+0x74/0x18c Call Trace: [c000005db12cb5e0] [c000005db12cb670] 0xc000005db12cb670 (unreliable) [c000005db12cb670] [c0000000000d24c4] .hugetlb_acct_memory+0x2e0/0x354 [c000005db12cb740] [c0000000001b5048] .truncate_hugepages+0x1d4/0x214 [c000005db12cb890] [c0000000001b50a4] .hugetlbfs_delete_inode+0x1c/0x3c [c000005db12cb920] [c000000000103fd8] .generic_delete_inode+0xf8/0x1c0 [c000005db12cb9b0] [c0000000001b5100] .hugetlbfs_drop_inode+0x3c/0x24c [c000005db12cba50] [c00000000010287c] .iput+0xdc/0xf8 [c000005db12cbad0] [c0000000000fee54] .dentry_iput+0x12c/0x194 [c000005db12cbb60] [c0000000000ff050] .d_kill+0x6c/0xa4 [c000005db12cbbf0] [c0000000000ffb74] .dput+0x18c/0x1b0 [c000005db12cbc70] [c0000000000e9e98] .__fput+0x1a4/0x1e8 [c000005db12cbd10] [c0000000000e61ec] .filp_close+0xb8/0xe0 [c000005db12cbda0] [c0000000000e62d0] .sys_close+0xbc/0x134 [c000005db12cbe30] [c00000000000872c] syscall_exit+0x0/0x40 Instruction dump: ebbe8038 38800010 e8bf0002 3bbd0008 7fa3eb78 38a50001 7ca507b4 4818df25 60000000 38800010 38a00000 7c601b78 <7fa3eb78> 2f800010 409d0008 38000010 This was tracked down to a potential livelock in return_unused_surplus_hugepages(). In the case where we have surplus pages on some node, but no free pages on the same node, we may never break out of the loop. To avoid this livelock, terminate the search if we iterate a number of times equal to the number of online nodes without freeing a page. Thanks to Andy Whitcroft and Adam Litke for helping with debugging and the patch. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-26	hugetlb: indicate surplus huge page counts in per-node meminfo	Nishanth Aravamudan	1	-2/+4
	Currently we show the surplus hugetlb pool state in /proc/meminfo, but not in the per-node meminfo files, even though we track the information on a per-node basis. Printing it there can help track down dynamic pool bugs including the one in the follow-on patch. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-26	x86: fix performance drop for glx	Suresh Siddha	3	-2/+8
	fix the 3D performance drop reported at: http://bugzilla.kernel.org/show_bug.cgi?id=10328 fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add with WC attribute. Recent changes in page attribute code made both ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). This breaks the graphics performance, as the effective memory type is UC instead of expected WC. The correct way to fix this is to add ioremap_wc() (which uses UC- in the absence of PAT kernel support and WC with PAT) and change all the fb drivers to use this new ioremap_wc() API. We can take this correct and longer route for post 2.6.25. For now, revert back to the UC- behavior for ioremap/ioremap_nocache. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-26	x86: fix trim mtrr not to setup_memory two times	Yinghai Lu	2	-6/+4
	we could call find_max_pfn() directly instead of setup_memory() to get max_pfn needed for mtrr trimming. otherwise setup_memory() is called two times... that is duplicated... [ mingo@elte.hu: both Thomas and me simulated a double call to setup_bootmem_allocator() and can confirm that it is a real bug which can hang in certain configs. It's not been reported yet but that is probably due to the relatively scarce nature of MTRR-trimming systems. ] Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-26	x86: GEODE: add missing module.h include	Andres Salomon	1	-0/+1
	On Wed, 26 Mar 2008 11:56:22 -0600 Jordan Crouse <jordan.crouse@amd.com> wrote: > On 26/03/08 14:31 +0100, Stefan Pfetzing wrote: > > Hello Jordan, > > > > I just tried to build your geodwdt driver for the geode watchdog. Therefore > > I pulled your repository from http://git.infradead.org/geode.git (or more, > > the git url). > > > > I tried to build the geodewdt driver as a module - which didn't work, and > > it failed with the same problem as earlier mentioned on lkmk [1]. I also > > checked the fix [2], but that seems to be already in your (or linus) tree - > > and so I'm unsure what the problem is. > > > > [1] http://kerneltrap.org/mailarchive/linux-kernel/2008/2/17/884074 > > [2] http://kerneltrap.org/mailarchive/linux-kernel/2008/2/17/884174 > > > > Building directly into the kernel seems to work. > > > > Maybe you have some idea? > > Hmm - that is strange. Exporting the symbols should work. I recommend > starting over with a clean tree. > > CCing Andres - any thoughts? > > Jordan > Er, yeah. The patch below should fix it. This should probably go into 2.6.25. Oops, EXPORT_SYMBOL_GPL wasn't being declared due to this header being missing. Signed-off-by: Andres Salomon <dilinger@debian.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-26	x86, cpufreq: fix Speedfreq-SMI call that clobbers ECX	Stephan Diestelhorst	1	-15/+24
	I have found that using SMI to change the cpu's frequency on my DELL Latitude L400 clobbers the ECX register in speedstep_set_state, causing unneccessary retries because the "state" variable has changed silently (GCC assumes it is still present in ECX). play safe and avoid gcc caching any register across IO port accesses that trigger SMIs. Signed-off by: <Stephan.Diestelhorst@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-26	x86: fix memoryless node oops during boot	Yinghai Lu	1	-1/+1
	fix oops during boot reported in this thread: http://lkml.org/lkml/2008/2/6/65 enable booting on memoryless nodes. Reported-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-03-26	x86: add dmi quirk for io_delay	Ingo Molnar	1	-0/+8
	reported by mereandor@gmail.com, in: http://bugzilla.kernel.org/show_bug.cgi?id=6307 Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-03-26	x86: convert mtrr/generic.c to kernel-doc	Randy Dunlap	1	-19/+23
	Convert function comment blocks to kernel-doc notation. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-03-26	x86: Documentation/i386/IO-APIC.txt: fix description	Nick Andrew	1	-1/+1
	The description of the interrupt routing doesn't match the (nice) diagram. Signed-off-by: Nick Andrew <nick@nick-andrew.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-03-26	kprobes: MAINTAINERS update	Masami Hiramatsu	1	-0/+2
	Add Masami Hiramatsu to kprobes maintainers Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-26	SVCRDMA: Check num_sge when setting LAST_CTXT bit	Tom Tucker	1	-10/+11
	The RDMACTXT_F_LAST_CTXT bit was getting set incorrectly when the last chunk in the read-list spanned multiple pages. This resulted in a kernel panic when the wrong context was used to build the RPC iovec page list. RDMA_READ is used to fetch RPC data from the client for NFS_WRITE requests. A scatter-gather is used to map the advertised client side buffer to the server-side iovec and associated page list. WR contexts are used to convey which scatter-gather entries are handled by each WR. When the write data is large, a single RPC may require multiple RDMA_READ requests so the contexts for a single RPC are chained together in a linked list. The last context in this list is marked with a bit RDMACTXT_F_LAST_CTXT so that when this WR completes, the CQ handler code can enqueue the RPC for processing. The code in rdma_read_xdr was setting this bit on the last two contexts on this list when the last read-list chunk spanned multiple pages. This caused the svc_rdma_recvfrom logic to incorrectly build the RPC and caused the kernel to crash because the second-to-last context doesn't contain the iovec page list. Modified the condition that sets this bit so that it correctly detects the last context for the RPC. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Tested-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-26	Revert "PCI: remove transparent bridge sizing"	Linus Torvalds	1	-5/+0
	This reverts commit 8fa5913d54f3b1e09948e6a0db34da887e05ff1f, which caused various interesting problems for people, including wrong resource allocations. See for example bugzilla entry "2.6.25-rc2: ohci1394 problem (MMIO broken)" at http://bugzilla.kernel.org/show_bug.cgi?id=10080 And Gary Hade says: "The same change had also exposed an issue reported by Paul Martin that has been causing an Oops while hotplugging ThinkPads to a ThinkPad Dock II. See http://lkml.org/lkml/2008/2/19/405 http://bugzilla.kernel.org/show_bug.cgi?id=9961 I have a fix for the ThinkPad docking Oops but if the issue being discussed here is caused by the transparent bridge sizing removal change I totally agree that it should be reverted." The transparent bridge sizing removal change was motivated by insufficient PCI memory resource for a transparent bridge window that was being created as a result of expansion ROM(s) being included in the transparent bridge sizing calculations. A later "PCI: Remove default PCI expansion ROM memory allocation" change ( re: http://lkml.org/lkml/2007/12/11/361 ) removes the expansion ROM(s) from the transparent bridge sizing calculations which actually resolves the original issue in a different manner. So, even if the "PCI: remove transparent bridge sizing" is not problematic it is no longer needed anyway." Identified-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Tested-by: Thomas Meyer <thomas@m3y3r.de> Acked-by: Gary Hade <garyhade@us.ibm.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-03-26	pnpacpi: reduce printk severity for "pnpacpi: exceeded the max number of ..."	Len Brown	1	-4/+4
	We have been printing these messages at KERN_ERR since 2.6.24, per http://bugzilla.kernel.org/show_bug.cgi?id=9535 But KERN_ERR pops up on a console booted with "quiet" and causes users to get alarmed and file bugs about the message itself: https://bugzilla.redhat.com/show_bug.cgi?id=436589 So reduce the severity of these messages to KERN_WARNING, which is not printed by "quiet". This message will still be seen without "quiet", but a lot of messages are printed in that mode and it will be less likely to cause undue alarm. We could go all the way to KERN_DEBUG, but this is a real warning after all, so it seems prudent not to require "debug" to see it. Signed-off-by: Len Brown <len.brown@intel.com>