linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2009-09-24	memory controller: soft limit organize cgroups	Balbir Singh	3	-47/+277
	Organize cgroups over soft limit in a RB-Tree Introduce an RB-Tree for storing memory cgroups that are over their soft limit. The overall goal is to 1. Add a memory cgroup to the RB-Tree when the soft limit is exceeded. We are careful about updates, updates take place only after a particular time interval has passed 2. We remove the node from the RB-Tree when the usage goes below the soft limit The next set of patches will exploit the RB-Tree to get the group that is over its soft limit by the largest amount and reclaim from it, when we face memory contention. [hugh.dickins@tiscali.co.uk: CONFIG_CGROUP_MEM_RES_CTLR=y CONFIG_PREEMPT=y fails to boot] Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	memory controller: soft limit interface	Balbir Singh	3	-0/+81
	Add an interface to allow get/set of soft limits. Soft limits for memory plus swap controller (memsw) is currently not supported. Resource counters have been enhanced to support soft limits and new type RES_SOFT_LIMIT has been added. Unlike hard limits, soft limits can be directly set and do not need any reclaim or checks before setting them to a newer value. Kamezawa-San raised a question as to whether soft limit should belong to res_counter. Since all resources understand the basic concepts of hard and soft limits, it is justified to add soft limits here. Soft limits are a generic resource usage feature, even file system quotas support soft limits. Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	memory controller: soft limit documentation	Balbir Singh	1	-1/+36
	Soft limits is a new feature for the memory resource controller, something similar has existed in the group scheduler in the form of shares. The CPU controllers interpretation of shares is very different though. Soft limits are the most useful feature to have for environments where the administrator wants to overcommit the system, such that only on memory contention do the limits become active. The current soft limits implementation provides a soft_limit_in_bytes interface for the memory controller and not for memory+swap controller. The implementation maintains an RB-Tree of groups that exceed their soft limit and starts reclaiming from the group that exceeds this limit by the maximum amount. This patch: Add documentation for soft limits Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	memcg: add comments explaining memory barriers	KAMEZAWA Hiroyuki	1	-0/+7
	Add comments for the reason of smp_wmb() in mem_cgroup_commit_charge(). [akpm@linux-foundation.org: coding-style fixes] Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	memcg: remove the overhead associated with the root cgroup	Balbir Singh	3	-14/+57
	Change the memory cgroup to remove the overhead associated with accounting all pages in the root cgroup. As a side-effect, we can no longer set a memory hard limit in the root cgroup. A new flag to track whether the page has been accounted or not has been added as well. Flags are now set atomically for page_cgroup, pcg_default_flags is now obsolete and removed. [akpm@linux-foundation.org: fix a few documentation glitches] Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	cgroups: let ss->can_attach and ss->attach do whole threadgroups at a time	Ben Blum	9	-30/+131
	Alter the ss->can_attach and ss->attach functions to be able to deal with a whole threadgroup at a time, for use in cgroup_attach_proc. (This is a pre-patch to cgroup-procs-writable.patch.) Currently, new mode of the attach function can only tell the subsystem about the old cgroup of the threadgroup leader. No subsystem currently needs that information for each thread that's being moved, but if one were to be added (for example, one that counts tasks within a group) this bit would need to be reworked a bit to tell the subsystem the right information. [hidave.darkstar@gmail.com: fix build] Signed-off-by: Ben Blum <bblum@google.com> Signed-off-by: Paul Menage <menage@google.com> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Reviewed-by: Matt Helsley <matthltc@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	cgroups: change css_set freeing mechanism to be under RCU	Ben Blum	2	-1/+10
	Changes css_set freeing mechanism to be under RCU This is a prepatch for making the procs file writable. In order to free the old css_sets for each task to be moved as they're being moved, the freeing mechanism must be RCU-protected, or else we would have to have a call to synchronize_rcu() for each task before freeing its old css_set. Signed-off-by: Ben Blum <bblum@google.com> Signed-off-by: Paul Menage <menage@google.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	cgroups: use vmalloc for large cgroups pidlist allocations	Ben Blum	1	-5/+42
	Separates all pidlist allocation requests to a separate function that judges based on the requested size whether or not the array needs to be vmalloced or can be gotten via kmalloc, and similar for kfree/vfree. Signed-off-by: Ben Blum <bblum@google.com> Signed-off-by: Paul Menage <menage@google.com> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	cgroups: ensure correct concurrent opening/reading of pidlists across pid namespaces	Ben Blum	2	-22/+119
	Previously there was the problem in which two processes from different pid namespaces reading the tasks or procs file could result in one process seeing results from the other's namespace. Rather than one pidlist for each file in a cgroup, we now keep a list of pidlists keyed by namespace and file type (tasks versus procs) in which entries are placed on demand. Each pidlist has its own lock, and that the pidlists themselves are passed around in the seq_file's private pointer means we don't have to touch the cgroup or its master list except when creating and destroying entries. Signed-off-by: Ben Blum <bblum@google.com> Signed-off-by: Paul Menage <menage@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	cgroups: add a read-only "procs" file similar to "tasks" that shows only unique tgids	Ben Blum	2	-114/+186
	struct cgroup used to have a bunch of fields for keeping track of the pidlist for the tasks file. Those are now separated into a new struct cgroup_pidlist, of which two are had, one for procs and one for tasks. The way the seq_file operations are set up is changed so that just the pidlist struct gets passed around as the private data. Interface example: Suppose a multithreaded process has pid 1000 and other threads with ids 1001, 1002, 1003: $ cat tasks 1000 1001 1002 1003 $ cat cgroup.procs 1000 $ Signed-off-by: Ben Blum <bblum@google.com> Signed-off-by: Paul Menage <menage@google.com> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	cgroups: revert "cgroups: fix pid namespace bug"	Paul Menage	2	-75/+31
	The following series adds a "cgroup.procs" file to each cgroup that reports unique tgids rather than pids, and allows all threads in a threadgroup to be atomically moved to a new cgroup. The subsystem "attach" interface is modified to support attaching whole threadgroups at a time, which could introduce potential problems if any subsystem were to need to access the old cgroup of every thread being moved. The attach interface may need to be revised if this becomes the case. Also added is functionality for read/write locking all CLONE_THREAD fork()ing within a threadgroup, by means of an rwsem that lives in the sighand_struct, for per-threadgroup-ness and also for sharing a cacheline with the sighand's atomic count. This scheme should introduce no extra overhead in the fork path when there's no contention. The final patch reveals potential for a race when forking before a subsystem's attach function is called - one potential solution in case any subsystem has this problem is to hang on to the group's fork mutex through the attach() calls, though no subsystem yet demonstrates need for an extended critical section. This patch: Revert commit 096b7fe012d66ed55e98bc8022405ede0cc80e96 Author: Li Zefan <lizf@cn.fujitsu.com> AuthorDate: Wed Jul 29 15:04:04 2009 -0700 Commit: Linus Torvalds <torvalds@linux-foundation.org> CommitDate: Wed Jul 29 19:10:35 2009 -0700 cgroups: fix pid namespace bug This is in preparation for some clashing cgroups changes that subsume the original commit's functionaliy. The original commit fixed a pid namespace bug which Ben Blum fixed independently (in the same way, but with different code) as part of a series of patches. I played around with trying to reconcile Ben's patch series with Li's patch, but concluded that it was simpler to just revert Li's, given that Ben's patch series contained essentially the same fix. Signed-off-by: Paul Menage <menage@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	cgroups: allow cgroup hierarchies to be created with no bound subsystems	Paul Menage	1	-59/+99
	This patch removes the restriction that a cgroup hierarchy must have at least one bound subsystem. The mount option "none" is treated as an explicit request for no bound subsystems. A hierarchy with no subsystems can be useful for plain task tracking, and is also a step towards the support for multiply-bindable subsystems. As part of this change, the hierarchy id is no longer calculated from the bitmask of subsystems in the hierarchy (since this is not guaranteed to be unique) but is allocated via an ida. Reference counts on cgroups from css_set objects are now taken explicitly one per hierarchy, rather than one per subsystem. Example usage: mount -t cgroup -o none,name=foo cgroup /mnt/cgroup Based on the "no-op"/"none" subsystem concept proposed by kamezawa.hiroyu@jp.fujitsu.com Signed-off-by: Paul Menage <menage@google.com> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	cgroups: add a back-pointer from struct cg_cgroup_link to struct cgroup	Paul Menage	1	-49/+199
	Currently the cgroups code makes the assumption that the subsystem pointers in a struct css_set uniquely identify the hierarchy->cgroup mappings associated with the css_set; and there's no way to directly identify the associated set of cgroups other than by indirecting through the appropriate subsystem state pointers. This patch removes the need for that assumption by adding a back-pointer from struct cg_cgroup_link object to its associated cgroup; this allows the set of cgroups to be determined by traversing the cg_links list in the struct css_set. Signed-off-by: Paul Menage <menage@google.com> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	cgroups: move the cgroup debug subsys into cgroup.c to access internal state	Paul Menage	3	-106/+88
	While it's architecturally clean to have the cgroup debug subsystem be completely independent of the cgroups framework, it limits its usefulness for debugging the contents of internal data structures. Move the debug subsystem code into the scope of all the cgroups data structures to make more detailed debugging possible. Signed-off-by: Paul Menage <menage@google.com> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	cgroups: support named cgroups hierarchies	Paul Menage	2	-48/+156
	To simplify referring to cgroup hierarchies in mount statements, and to allow disambiguation in the presence of empty hierarchies and multiply-bindable subsystems this patch adds support for naming a new cgroup hierarchy via the "name=" mount option A pre-existing hierarchy may be specified by either name or by subsystems; a hierarchy's name cannot be changed by a remount operation. Example usage: # To create a hierarchy called "foo" containing the "cpu" subsystem mount -t cgroup -oname=foo,cpu cgroup /mnt/cgroup1 # To mount the "foo" hierarchy on a second location mount -t cgroup -oname=foo cgroup /mnt/cgroup2 Signed-off-by: Paul Menage <menage@google.com> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	cgroups: make unlock sequence in cgroup_get_sb consistent	Xiaotian Feng	1	-1/+1
	Make the last unlock sequence consistent with previous unlock sequeue. Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Paul Menage <menage@google.com> Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	docs: fix various Documentation/ paths in header files	Randy Dunlap	4	-5/+5
	Fix various Documentation/ paths in include/linux/. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Reviewed-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	page-types: add feature for walking process address space	Wu Fengguang	1	-20/+180
	Introduce "-p\|--pid <pid>" for walking the process address space. The default action is to walk raw memory PFNs. Both the virtual address and physical address of each present pages will be listed: # ./tools/vm/page-types -lp $$ \| head -3 voffset offset len flags 400 11bebe 1 __RU_lA____M______________________ 402 11bebc 1 __RU_lA____M______________________ Note that voffset/offset/len are now showed as hex numbers. [akpm@linux-foundation.org: coding-style fixes] Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	Documentation/vm/.gitignore: add page-types	Josh Triplett	1	-0/+1
	Signed-off-by: Josh Triplett <josh@joshtriplett.org> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	includecheck fix: Documentation, cfag12864b-example.c	Jaswinder Singh Rajput	1	-1/+0
	fix the following 'make includecheck' warning: Documentation/auxdisplay/cfag12864b-example.c: string.h is included more than once. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	Documentation: update stale definition of file-nr in fs.txt	Xiaotian Feng	1	-7/+10
	In "documentation: update Documentation/filesystem/proc.txt and Documentation/sysctls" (commit 760df93ec) we merged /proc/sys/fs documentation in Documentation/sysctl/fs.txt and Documentation/filesystem/proc.txt, but stale file-nr definition remained. This patch adds back the right fs-nr definition for 2.6 kernel. Signed-off-by: Xiaotian Feng<dfeng@redhat.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	doc/filesystems: more mount cleanups	Peng Tao	1	-6/+11
	Documentation/filesystems/sharedsubtree.txt needs updating because the mount command in util-linux package is well aware of shared subtree features now. The patch also fixes two typos in sharedsubtree.txt. Signed-off-by: Peng Tao <bergwolf@gmail.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	doc/filesystems: remove smount program	Randy Dunlap	1	-175/+34
	mount(8) handles shared subtrees just fine, so remove the smount program from Documentation/filesystems/sharedsubtree.txt. Fix annoying "Lets" -> "Let's". Insert space between '#' prompt and "mount" command. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	time: add function to convert between calendar time and broken-down time for universal use	Zhaolei	3	-1/+156
	There are many similar code in kernel for one object: convert time between calendar time and broken-down time. Here is some source I found: fs/ncpfs/dir.c fs/smbfs/proc.c fs/fat/misc.c fs/udf/udftime.c fs/cifs/netmisc.c net/netfilter/xt_time.c drivers/scsi/ips.c drivers/input/misc/hp_sdc_rtc.c drivers/rtc/rtc-lib.c arch/ia64/hp/sim/boot/fw-emu.c arch/m68k/mac/misc.c arch/powerpc/kernel/time.c arch/parisc/include/asm/rtc.h ... We can make a common function for this type of conversion, At least we can get following benefit: 1: Make kernel simple and unify 2: Easy to fix bug in converting code 3: Reduce clone of code in future For example, I'm trying to make ftrace display walltime, this patch will make me easy. This code is based on code from glibc-2.6 Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Pavel Machek <pavel@ucw.cz> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	hugetlbfs: do not call user_shm_lock() for MAP_HUGETLB fix	From: Mel Gorman	1	-9/+3
	Commit 6bfde05bf5c ("hugetlbfs: allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount") altered can_do_hugetlb_shm() to check if a file is being created for shared memory or mmap(). If this returns false, we then unconditionally call user_shm_lock() triggering a warning. This block should never be entered for MAP_HUGETLB. This patch partially reverts the problem and fixes the check. Signed-off-by: Eric B Munson <ebmunson@us.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Adam Litke <agl@us.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	ksm: change default values to better fit into mainline kernel	Izik Eidus	1	-3/+11
	Now that ksm is in mainline it is better to change the default values to better fit to most of the users. This patch change the ksm default values to be: ksm_thread_pages_to_scan = 100 (instead of 200) ksm_thread_sleep_millisecs = 20 (like before) ksm_run = KSM_RUN_STOP (instead of KSM_RUN_MERGE - meaning ksm is disabled by default) ksm_max_kernel_pages = nr_free_buffer_pages / 4 (instead of 2046) The important aspect of this patch is: it disables ksm by default, and sets the number of the kernel_pages that can be allocated to be a reasonable number. Signed-off-by: Izik Eidus <ieidus@redhat.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	input: fix build failures caused by Kconfig Winbond WPCD376I Consumer IR hardware driver Kconfig entry	Ingo Molnar	1	-0/+1
	Fix these warnings: drivers/built-in.o: In function `apanel_remove': apanel.c:(.text+0x56e852): undefined reference to `led_classdev_unregister' drivers/built-in.o: In function `apanel_probe': apanel.c:(.text+0x56eae3): undefined reference to `led_classdev_register' drivers/built-in.o: In function `acpi_fujitsu_hotkey_add': fujitsu-laptop.c:(.text+0x5d7647): undefined reference to `led_classdev_register' fujitsu-laptop.c:(.text+0x5d76b5): undefined reference to `led_classdev_register' drivers/built-in.o: In function `wbcir_probe': winbond-cir.c:(.devinit.text+0x5f375): undefined reference to `led_classdev_register' winbond-cir.c:(.devinit.text+0x5f663): undefined reference to `led_classdev_unregister' drivers/built-in.o: In function `wbcir_remove': winbond-cir.c:(.devexit.text+0x7f23): undefined reference to `led_classdev_unregister' drivers/built-in.o: In function `fujitsu_cleanup': fujitsu-laptop.c:(.exit.text+0xbe37): undefined reference to `led_classdev_unregister' fujitsu-laptop.c:(.exit.text+0xbe53): undefined reference to `led_classdev_unregister' It happens because the new INPUT_WINBOND_CIR driver relies on new-leds infrastructure - but does not select it in drivers/input/misc/Kconfig. But it selects LEDS_CLASS, which confuses a number of other drivers into thinking that all the leds infrastructure is in place. Fix this by selecting NEW_LEDS as well, like similar drivers do. Eventually, this whole leds infrastructure complexity should be cleaned up, it's been going on for years. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Dmitry Torokhov <dtor@mail.ru> Cc: David Härdeman <david@hardeman.nu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-23	headers: utsname.h redux	Alexey Dobriyan	56	-56/+2
	* remove asm/atomic.h inclusion from linux/utsname.h -- not needed after kref conversion * remove linux/utsname.h inclusion from files which do not need it NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however due to some personality stuff it _is_ needed -- cowardly leave ELF-related headers and files alone. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-23	Revert "kmod: fix race in usermodehelper code"	Sebastian Andrzej Siewior	1	-8/+5
	This reverts commit c02e3f361c7 ("kmod: fix race in usermodehelper code") The patch is wrong. UMH_WAIT_EXEC is called with VFORK what ensures that the child finishes prior returing back to the parent. No race. In fact, the patch makes it even worse because it does the thing it claims not do: - It calls ->complete() on UMH_WAIT_EXEC - the complete() callback may de-allocated subinfo as seen in the following call chain: [<c009f904>] (__link_path_walk+0x20/0xeb4) from [<c00a094c>] (path_walk+0x48/0x94) [<c00a094c>] (path_walk+0x48/0x94) from [<c00a0a34>] (do_path_lookup+0x24/0x4c) [<c00a0a34>] (do_path_lookup+0x24/0x4c) from [<c00a158c>] (do_filp_open+0xa4/0x83c) [<c00a158c>] (do_filp_open+0xa4/0x83c) from [<c009ba90>] (open_exec+0x24/0xe0) [<c009ba90>] (open_exec+0x24/0xe0) from [<c009bfa8>] (do_execve+0x7c/0x2e4) [<c009bfa8>] (do_execve+0x7c/0x2e4) from [<c0026a80>] (kernel_execve+0x34/0x80) [<c0026a80>] (kernel_execve+0x34/0x80) from [<c004b514>] (____call_usermodehelper+0x130/0x148) [<c004b514>] (____call_usermodehelper+0x130/0x148) from [<c0024858>] (kernel_thread_exit+0x0/0x8) and the path pointer was NULL. Good that ARM's kernel_execve() doesn't check the pointer for NULL or else I wouldn't notice it. The only race there might be is with UMH_NO_WAIT but it is too late for me to investigate it now. UMH_WAIT_PROC could probably also use VFORK and we could save one exec. So the only race I see is with UMH_NO_WAIT and recent scheduler changes where the child does not always run first might have trigger here something but as I said, it is late.... Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-24	cpumask: Move deprecated functions to end of header.	Rusty Russell	1	-341/+252
	The new ones have pretty kerneldoc. Move the old ones to the end to avoid confusing people. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: benh@kernel.crashing.org
2009-09-24	cpumask: remove unused deprecated functions, avoid accusations of insanity	Rusty Russell	1	-111/+1
	We're not forcing removal of the old cpu_ functions, but we might as well delete the now-unused ones. Especially CPUMASK_ALLOC and friends. I actually got a phone call (!) from a hacker who thought I had introduced them as the new cpumask API. He seemed bewildered that I had lost all taste. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: benh@kernel.crashing.org
2009-09-24	cpumask: use new-style cpumask ops in mm/quicklist.	Rusty Russell	1	-2/+1
	This slipped past the previous sweeps. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Christoph Lameter <cl@linux-foundation.org>
2009-09-24	cpumask: use mm_cpumask() wrapper: x86	Rusty Russell	4	-14/+15
	Makes code futureproof against the impending change to mm->cpu_vm_mask (to be a pointer). It's also a chance to use the new cpumask_ ops which take a pointer (the older ones are deprecated, but there's no hurry for arch code). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-24	cpumask: use mm_cpumask() wrapper: um	Rusty Russell	1	-2/+2
	Makes code futureproof against the impending change to mm->cpu_vm_mask. It's also a chance to use the new cpumask_ ops which take a pointer (the older ones are deprecated, but there's no hurry for arch code). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-24	cpumask: use mm_cpumask() wrapper: mips	Rusty Russell	2	-6/+6
	Makes code futureproof against the impending change to mm->cpu_vm_mask. It's also a chance to use the new cpumask_ ops which take a pointer (the older ones are deprecated, but there's no hurry for arch code). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-24	cpumask: use mm_cpumask() wrapper: mn10300	Rusty Russell	1	-6/+6
	Makes code futureproof against the impending change to mm->cpu_vm_mask (to be a pointer). It's also a chance to use the new cpumask_ ops which take a pointer (the older ones are deprecated, but there's no hurry for arch code). Also change the actual arg name here to "mm" (which it is), not "task". Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-24	cpumask: use mm_cpumask() wrapper: m32r	Rusty Russell	2	-6/+6
	Makes code futureproof against the impending change to mm->cpu_vm_mask. It's also a chance to use the new cpumask_ ops which take a pointer (the older ones are deprecated, but there's no hurry for arch code). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Hirokazu Takata <takata@linux-m32r.org> (fixes)
2009-09-24	cpumask: use mm_cpumask() wrapper: arm	Rusty Russell	6	-20/+21
	Makes code futureproof against the impending change to mm->cpu_vm_mask. It's also a chance to use the new cpumask_ ops which take a pointer (the older ones are deprecated, but there's no hurry for arch code). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-24	cpumask: Use accessors for cpu_*_mask: um	Rusty Russell	1	-1/+1
	Use the accessors rather than frobbing bits directly (the new versions are const). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com>
2009-09-24	cpumask: Use accessors for cpu_*_mask: powerpc	Rusty Russell	4	-13/+13
	Use the accessors rather than frobbing bits directly (the new versions are const). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com>
2009-09-24	cpumask: Use accessors for cpu_*_mask: mips	Rusty Russell	4	-8/+8
	Use the accessors rather than frobbing bits directly (the new versions are const). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com>
2009-09-24	cpumask: Use accessors for cpu_*_mask: m32r	Rusty Russell	1	-1/+1
	Use the accessors rather than frobbing bits directly (the new versions are const). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com>
2009-09-24	cpumask: remove arch_send_call_function_ipi	Rusty Russell	12	-18/+0
	Now everyone is converted to arch_send_call_function_ipi_mask, remove the shim and the #defines. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-24	cpumask: arch_send_call_function_ipi_mask: s390	Rusty Russell	2	-3/+4
	We're weaning the core code off handing cpumask's around on-stack. This introduces arch_send_call_function_ipi_mask(). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-24	cpumask: arch_send_call_function_ipi_mask: powerpc	Rusty Russell	2	-3/+4
	We're weaning the core code off handing cpumask's around on-stack. This introduces arch_send_call_function_ipi_mask(), and by defining it, the old arch_send_call_function_ipi is defined by the core code. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-24	cpumask: arch_send_call_function_ipi_mask: mips	Rusty Russell	12	-20/+25
	We're weaning the core code off handing cpumask's around on-stack. This introduces arch_send_call_function_ipi_mask(), and by defining it, the old arch_send_call_function_ipi is defined by the core code. We also take the chance to wean the implementations off the obsolescent for_each_cpu_mask(): making send_ipi_mask take the pointer seemed the most natural way to ensure all implementations used for_each_cpu. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-24	cpumask: arch_send_call_function_ipi_mask: m32r	Rusty Russell	2	-12/+13
	We're weaning the core code off handing cpumask's around on-stack. This introduces arch_send_call_function_ipi_mask(), and by defining it, the old arch_send_call_function_ipi is defined by the core code. We also take the chance to wean the implementations off the obsolescent for_each_cpu_mask(): making send_ipi_mask take the pointer seemed the most natural way to ensure all implementations used for_each_cpu. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-24	cpumask: arch_send_call_function_ipi_mask: alpha	Rusty Russell	2	-8/+9
	We're weaning the core code off handing cpumask's around on-stack. This introduces arch_send_call_function_ipi_mask(). We also take the chance to wean the send_ipi_message off the obsolescent for_each_cpu_mask(): making it take a pointer seemed the most natural way to do this. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-24	cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: ia64	Rusty Russell	1	-2/+0
	There were replaced by topology_core_cpumask and topology_thread_cpumask. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-24	cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: powerpc	Rusty Russell	1	-2/+0
	There were replaced by topology_core_cpumask and topology_thread_cpumask. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>