aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2010-05-27kmod: add init function to usermodehelperNeil Horman3-35/+73
About 6 months ago, I made a set of changes to how the core-dump-to-a-pipe feature in the kernel works. We had reports of several races, including some reports of apps bypassing our recursion check so that a process that was forked as part of a core_pattern setup could infinitely crash and refork until the system crashed. We fixed those by improving our recursion checks. The new check basically refuses to fork a process if its core limit is zero, which works well. Unfortunately, I've been getting grief from maintainer of user space programs that are inserted as the forked process of core_pattern. They contend that in order for their programs (such as abrt and apport) to work, all the running processes in a system must have their core limits set to a non-zero value, to which I say 'yes'. I did this by design, and think thats the right way to do things. But I've been asked to ease this burden on user space enough times that I thought I would take a look at it. The first suggestion was to make the recursion check fail on a non-zero 'special' number, like one. That way the core collector process could set its core size ulimit to 1, and enable the kernel's recursion detection. This isn't a bad idea on the surface, but I don't like it since its opt-in, in that if a program like abrt or apport has a bug and fails to set such a core limit, we're left with a recursively crashing system again. So I've come up with this. What I've done is modify the call_usermodehelper api such that an extra parameter is added, a function pointer which will be called by the user helper task, after it forks, but before it exec's the required process. This will give the caller the opportunity to get a call back in the processes context, allowing it to do whatever it needs to to the process in the kernel prior to exec-ing the user space code. In the case of do_coredump, this callback is ues to set the core ulimit of the helper process to 1. This elimnates the opt-in problem that I had above, as it allows the ulimit for core sizes to be set to the value of 1, which is what the recursion check looks for in do_coredump. This patch: Create new function call_usermodehelper_fns() and allow it to assign both an init and cleanup function, as we'll as arbitrary data. The init function is called from the context of the forked process and allows for customization of the helper process prior to calling exec. Its return code gates the continuation of the process, or causes its exit. Also add an arbitrary data pointer to the subprocess_info struct allowing for data to be passed from the caller to the new process, and the subsequent cleanup process Also, use this patch to cleanup the cleanup function. It currently takes an argp and envp pointer for freeing, which is ugly. Lets instead just make the subprocess_info structure public, and pass that to the cleanup and init routines Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27signals: check_kill_permission(): don't check creds if same_thread_group()Oleg Nesterov1-2/+4
Andrew Tridgell reports that aio_read(SIGEV_SIGNAL) can fail if the notification from the helper thread races with setresuid(), see http://samba.org/~tridge/junkcode/aio_uid.c This happens because check_kill_permission() doesn't permit sending a signal to the task with the different cred->xids. But there is not any security reason to check ->cred's when the task sends a signal (private or group-wide) to its sub-thread. Whatever we do, any thread can bypass all security checks and send SIGKILL to all threads, or it can block a signal SIG and do kill(gettid(), SIG) to deliver this signal to another sub-thread. Not to mention that CLONE_THREAD implies CLONE_VM. Change check_kill_permission() to avoid the credentials check when the sender and the target are from the same thread group. Also, move "cred = current_cred()" down to avoid calling get_current() twice. Note: David Howells pointed out we could relax this even more, the CLONE_SIGHAND (without CLONE_THREAD) case probably does not need these checks too. Roland said: : The glibc (libpthread) that does set*id across threads has : been in use for a while (2.3.4?), probably in distro's using kernels as old : or older than any active -stable streams. In the race in question, this : kernel bug is breaking valid POSIX application expectations. Reported-by: Andrew Tridgell <tridge@samba.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Eric Paris <eparis@parisplace.org> Cc: Jakub Jelinek <jakub@redhat.com> Cc: James Morris <jmorris@namei.org> Cc: Roland McGrath <roland@redhat.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: <stable@kernel.org> [all kernel versions] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27ptrace: PTRACE_GETFDPIC: fix the unsafe usage of child->mmOleg Nesterov1-2/+8
Now that Mike Frysinger unified the FDPIC ptrace code, we can fix the unsafe usage of child->mm in ptrace_request(PTRACE_GETFDPIC). We have the reference to task_struct, and ptrace_check_attach() verified the tracee is stopped. But nothing can protect from SIGKILL after that, we must not assume child->mm != NULL. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Mike Frysinger <vapier.adi@gmail.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Greg Ungerer <gerg@snapgear.com> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27ptrace: unify FDPIC implementationsMike Frysinger4-67/+29
The Blackfin/FRV/SuperH guys all have the same exact FDPIC ptrace code in their arch handlers (since they were probably copied & pasted). Since these ptrace interfaces are an arch independent aspect of the FDPIC code, unify them in the common ptrace code so new FDPIC ports don't need to copy and paste this fundamental stuff yet again. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Roland McGrath <roland@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27cpusets: randomize node rotor used in cpuset_mem_spread_node()Jack Steiner5-1/+31
Some workloads that create a large number of small files tend to assign too many pages to node 0 (multi-node systems). Part of the reason is that the rotor (in cpuset_mem_spread_node()) used to assign nodes starts at node 0 for newly created tasks. This patch changes the rotor to be initialized to a random node number of the cpuset. [akpm@linux-foundation.org: fix layout] [Lee.Schermerhorn@hp.com: Define stub numa_random() for !NUMA configuration] Signed-off-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Paul Menage <menage@google.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27cpusets: new round-robin rotor for SLAB allocationsJack Steiner4-5/+24
We have observed several workloads running on multi-node systems where memory is assigned unevenly across the nodes in the system. There are numerous reasons for this but one is the round-robin rotor in cpuset_mem_spread_node(). For example, a simple test that writes a multi-page file will allocate pages on nodes 0 2 4 6 ... Odd nodes are skipped. (Sometimes it allocates on odd nodes & skips even nodes). An example is shown below. The program "lfile" writes a file consisting of 10 pages. The program then mmaps the file & uses get_mempolicy(..., MPOL_F_NODE) to determine the nodes where the file pages were allocated. The output is shown below: # ./lfile allocated on nodes: 2 4 6 0 1 2 6 0 2 There is a single rotor that is used for allocating both file pages & slab pages. Writing the file allocates both a data page & a slab page (buffer_head). This advances the RR rotor 2 nodes for each page allocated. A quick confirmation seems to confirm this is the cause of the uneven allocation: # echo 0 >/dev/cpuset/memory_spread_slab # ./lfile allocated on nodes: 6 7 8 9 0 1 2 3 4 5 This patch introduces a second rotor that is used for slab allocations. Signed-off-by: Jack Steiner <steiner@sgi.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Paul Menage <menage@google.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27memcg: clean up memory thresholdsKirill A. Shutemov1-85/+66
Introduce struct mem_cgroup_thresholds. It helps to reduce number of checks of thresholds type (memory or mem+swap). [akpm@linux-foundation.org: repair comment] Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: Phil Carmody <ext-phil.2.carmody@nokia.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27cgroups: make cftype.unregister_event() void-returningKirill A. Shutemov3-26/+42
Since we are unable to handle an error returned by cftype.unregister_event() properly, let's make the callback void-returning. mem_cgroup_unregister_event() has been rewritten to be a "never fail" function. On mem_cgroup_usage_register_event() we save old buffer for thresholds array and reuse it in mem_cgroup_usage_unregister_event() to avoid allocation. Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Phil Carmody <ext-phil.2.carmody@nokia.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Paul Menage <menage@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27memcg: fix mis-accounting of file mapped racy with migrationakpm@linux-foundation.org4-41/+107
FILE_MAPPED per memcg of migrated file cache is not properly updated, because our hook in page_add_file_rmap() can't know to which memcg FILE_MAPPED should be counted. Basically, this patch is for fixing the bug but includes some big changes to fix up other messes. Now, at migrating mapped file, events happen in following sequence. 1. allocate a new page. 2. get memcg of an old page. 3. charge ageinst a new page before migration. But at this point, no changes to new page's page_cgroup, no commit for the charge. (IOW, PCG_USED bit is not set.) 4. page migration replaces radix-tree, old-page and new-page. 5. page migration remaps the new page if the old page was mapped. 6. Here, the new page is unlocked. 7. memcg commits the charge for newpage, Mark the new page's page_cgroup as PCG_USED. Because "commit" happens after page-remap, we can count FILE_MAPPED at "5", because we should avoid to trust page_cgroup->mem_cgroup. if PCG_USED bit is unset. (Note: memcg's LRU removal code does that but LRU-isolation logic is used for helping it. When we overwrite page_cgroup->mem_cgroup, page_cgroup is not on LRU or page_cgroup->mem_cgroup is NULL.) We can lose file_mapped accounting information at 5 because FILE_MAPPED is updated only when mapcount changes 0->1. So we should catch it. BTW, historically, above implemntation comes from migration-failure of anonymous page. Because we charge both of old page and new page with mapcount=0, we can't catch - the page is really freed before remap. - migration fails but it's freed before remap or .....corner cases. New migration sequence with memcg is: 1. allocate a new page. 2. mark PageCgroupMigration to the old page. 3. charge against a new page onto the old page's memcg. (here, new page's pc is marked as PageCgroupUsed.) 4. page migration replaces radix-tree, page table, etc... 5. At remapping, new page's page_cgroup is now makrked as "USED" We can catch 0->1 event and FILE_MAPPED will be properly updated. And we can catch SWAPOUT event after unlock this and freeing this page by unmap() can be caught. 7. Clear PageCgroupMigration of the old page. So, FILE_MAPPED will be correctly updated. Then, for what MIGRATION flag is ? Without it, at migration failure, we may have to charge old page again because it may be fully unmapped. "charge" means that we have to dive into memory reclaim or something complated. So, it's better to avoid charge it again. Before this patch, __commit_charge() was working for both of the old/new page and fixed up all. But this technique has some racy condtion around FILE_MAPPED and SWAPOUT etc... Now, the kernel use MIGRATION flag and don't uncharge old page until the end of migration. I hope this change will make memcg's page migration much simpler. This page migration has caused several troubles. Worth to add a flag for simplification. Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Reported-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27mm: memcontrol - uninitialised return valuePhil Carmody1-1/+1
Only an out of memory error will cause ret to be set. Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Acked-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27mm: remove unnecessary use of atomicPhil Carmody1-7/+7
The bottom 4 hunks are atomically changing memory to which there are no aliases as it's freshly allocated, so there's no need to use atomic operations. The other hunks are just atomic_read and atomic_set, and do not involve any read-modify-write. The use of atomic_{read,set} doesn't prevent a read/write or write/write race, so if a race were possible (I'm not saying one is), then it would still be there even with atomic_set. See: http://digitalvampire.org/blog/index.php/2007/05/13/atomic-cargo-cults/ Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Acked-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27memcg: make oom killer a no-op when no killable task can be foundDavid Rientjes1-4/+1
It's pointless to try to kill current if select_bad_process() did not find an eligible task to kill in mem_cgroup_out_of_memory() since it's guaranteed that current is a member of the memcg that is oom and it is, by definition, unkillable. Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27memcg: update documentationKAMEZAWA Hiroyuki1-93/+198
Some information are old, and I think current document doesn't work as "a guide for users". We need summary of all of our controls, at least. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27memcg: move charge of file pagesDaisuke Nishimura4-18/+125
This patch adds support for moving charge of file pages, which include normal file, tmpfs file and swaps of tmpfs file. It's enabled by setting bit 1 of <target cgroup>/memory.move_charge_at_immigrate. Unlike the case of anonymous pages, file pages(and swaps) in the range mmapped by the task will be moved even if the task hasn't done page fault, i.e. they might not be the task's "RSS", but other task's "RSS" that maps the same file. And mapcount of the page is ignored(the page can be moved even if page_mapcount(page) > 1). So, conditions that the page/swap should be met to be moved is that it must be in the range mmapped by the target task and it must be charged to the old cgroup. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix warning] Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27memcg: clean up move chargeDaisuke Nishimura1-37/+59
This patch cleans up move charge code by: - define functions to handle pte for each types, and make is_target_pte_for_mc() cleaner. - instead of checking the MOVE_CHARGE_TYPE_ANON bit, define a function that checks the bit. Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27memcg: oom kill disable and oom statusKAMEZAWA Hiroyuki2-19/+117
This adds a feature to disable oom-killer for memcg, if disabled, of course, tasks under memcg will stop. But now, we have oom-notifier for memcg. And the world around memcg is not under out-of-memory. memcg's out-of-memory just shows memcg hits limit. Then, administrator or management daemon can recover the situation by - kill some process - enlarge limit, add more swap. - migrate some tasks - remove file cache on tmps (difficult ?) Unlike oom-killer, you can take enough information before killing tasks. (by gcore, or, ps etc.) [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27memcg: oom notifierKAMEZAWA Hiroyuki2-9/+111
Considering containers or other resource management softwares in userland, event notification of OOM in memcg should be implemented. Now, memcg has "threshold" notifier which uses eventfd, we can make use of it for oom notification. This patch adds oom notification eventfd callback for memcg. The usage is very similar to threshold notifier, but control file is memory.oom_control and no arguments other than eventfd is required. % cgroup_event_notifier /cgroup/A/memory.oom_control dummy (About cgroup_event_notifier, see Documentation/cgroup/) Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: David Rientjes <rientjes@google.com> Cc: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27memcg: oom wakeup filterKAMEZAWA Hiroyuki1-17/+46
memcg's oom waitqueue is a system-wide wait_queue (for handling hierarchy.) So, it's better to add custom wake function and do filtering in wake up path. This patch adds a filtering feature for waking up oom-waiters. Hierarchy is properly handled. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27Documentation/cgroups/cgroups.txt: fix reference to "numtasks"Trevor Woerner1-1/+1
Signed-off-by: Trevor Woerner <twoerner@gmail.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27Documentation: SubmittingDrivers: ResourcesArce, Abraham1-0/+5
- Add additional location (Git) for the kernel master tree - Add reference to Git Project Signed-off-by: Abraham Arce <x0066660@ti.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27ufs: permit mounting of BorderWare filesystemsThomas Stewart2-0/+3
I recently had to recover some files from an old broken machine that was running BorderWare Document Gateway. It's basically a drop in web server for sharing files. From the look of the init process and using strings on of a few files it seems to be based on FreeBSD 3.3. The process turned out to be more difficult than I imagined, but to cut a long story short BorderWare in their wisdom use a nonstandard magic number in their UFS (ufstype=44bsd) file systems. Thus Linux refuses to mount the file systems in order to recover the data. After a bit of hunting I was able to make a quick fix to fs/ufs/super.c in order to detect the new magic number. I assume that this number is the same for all installations. It's quite easy to find out from ufs_fs.h. The superblock sits 8k into the block device and the magic number its 1372 bytes into the superblock struct. # dd if=/dev/sda5 skip=$(( 8192 + 1372 )) bs=1 count=4 2> /dev/null | hd 00000000 97 26 24 0f |.&$.| # Signed-off-by: Thomas Stewart <thomas@stewarts.org.uk> Cc: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27drivers/telephony/ixj.c: use memdup_userJulia Lawall1-11/+4
Use memdup_user when user data is immediately copied into the allocated region. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression from,to,size,flag; position p; identifier l1,l2; @@ - to = \(kmalloc@p\|kzalloc@p\)(size,flag); + to = memdup_user(from,size); if ( - to==NULL + IS_ERR(to) || ...) { <+... when != goto l1; - -ENOMEM + PTR_ERR(to) ...+> } - if (copy_from_user(to, from, size) != 0) { - <+... when != goto l2; - -EFAULT - ...+> - } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27fbdev: bf54x-lq043fb: fix unused warnings with backlight codeMike Frysinger1-3/+4
The current backlight code is stubbed out, so the new props changes added some warnings: drivers/video/bf54x-lq043fb.c: In function 'bfin_bf54x_probe': drivers/video/bf54x-lq043fb.c:666: warning: label 'out9' defined but not used drivers/video/bf54x-lq043fb.c:504: warning: unused variable 'props' Fix em ! Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27fbdev: bfin-t350mcqb-fb: avoid unused warnings in backlight codeMike Frysinger1-3/+4
The current backlight code is stubbed out, so the new props changes added some warnings about unused label/prop. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27drivers/video/via: use memdup_userJulia Lawall1-8/+3
Use memdup_user when user data is immediately copied into the allocated region. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression from,to,size,flag; position p; identifier l1,l2; @@ - to = \(kmalloc@p\|kzalloc@p\)(size,flag); + to = memdup_user(from,size); if ( - to==NULL + IS_ERR(to) || ...) { <+... when != goto l1; - -ENOMEM + PTR_ERR(to) ...+> } - if (copy_from_user(to, from, size) != 0) { - <+... when != goto l2; - -EFAULT - ...+> - } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Cc: Joseph Chan <JosephChan@via.com.tw> Cc: Scott Fang <ScottFang@viatech.com.cn> Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27add support for S3 Trio3D/1X/2XOndrej Zary1-20/+81
Add support for S3 Trio3D/1X (86C360) and S3 Trio3D/2X (86C362 and 86C368) cards to s3fb driver. Tested with 86C362 AGP and 86C368 PCI&AGP. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Acked-by: Ondrej Zajicek <santiago@crfreenet.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27drivers/gpio/it8761e_gpio: check return value of gpiochip_remove()Daniel Mack1-1/+4
This eliminates the following build warning: drivers/gpio/it8761e_gpio.c: In function `it8761e_gpio_exit': drivers/gpio/it8761e_gpio.c:220: warning: ignoring return value of `gpiochip_remove', declared with attribute warn_unused_result Signed-off-by: Daniel Mack <daniel@caiaq.de> Cc: Denis Turischev <denis@compulab.co.il> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27gpio: add Penwell gpio supportAlek Du2-34/+53
Intel Penwell chip has two 96 pins GPIO blocks, which are very similiar as Intel Langwell chip GPIO block, except for pin number difference. This patch expends the original Langwell GPIO driver to support Penwell's. Signed-off-by: Alek Du <alek.du@intel.com> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27arm: omap: remove the unused omap_gpio_set_debounce methodsFelipe Balbi1-74/+0
Nobody uses that anymore, so remove and expect drivers to use the gpiolib implementation. Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com> Cc: Tony Lindgren <tony@atomide.com> Cc: David Brownell <david-b@pacbell.net> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27arm: omap: switch over to gpio_set_debounceFelipe Balbi5-12/+6
Stop using the omap-specific implementations for gpio debouncing now that gpiolib provides its own support. Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com> Cc: Tony Lindgren <tony@atomide.com> Cc: David Brownell <david-b@pacbell.net> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27arm: omap: gpio: implement set_debounce methodFelipe Balbi1-0/+68
OMAP supports debouncing of gpio lines, implement the method using gpiolib. Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com> Cc: Tony Lindgren <tony@atomide.com> Cc: David Brownell <david-b@pacbell.net> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27gpiolib: introduce set_debounce methodFelipe Balbi3-0/+53
A few architectures, like OMAP, allow you to set a debouncing time for the gpio before generating the IRQ. Teach gpiolib about that. Mark said: : This would be generally useful for embedded systems, especially where : the interrupt concerned is a wake source. It allows drivers to avoid : spurious interrupts from noisy sources so if the hardware supports it : the driver can avoid having to explicitly wait for the signal to become : stable and software has to cope with fewer events. We've lived without : it for quite some time, though. David said: : I looked at adding debounce support to the generic GPIO calls (and thus : gpiolib) some time back, but decided against it. I forget why at this : time (check list archives) but it wasn't because of lack of utility in : certain contexts. : : One thing to watch out for is just how variable the hardware capabilities : are. Atmel GPIOs have something like a fixed number of 32K clock cycles : for debounce, twl4030 had something odd, OMAPs were more like the Atmel : chips but with a different clock. In some cases debouncing had to be : ganged, not per-GPIO. And so forth. Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com> Cc: Tony Lindgren <tony@atomide.com> Cc: David Brownell <david-b@pacbell.net> Reviewed-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27gpiolib: make gpiochip_add() show a better error messageBen Dooks1-1/+1
The current message, 'not registered' is confusing as it implies it was not registered with something, whereas printing 'failed to register' implies it was the gpiochip_add() call that did not work correctly. Signed-off-by: Ben Dooks <ben-linux@fluff.org> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27gpio: max732x: fix input configuration for open-drain pinsMarc Zyngier1-0/+7
Fix a bug I noticed while hacking on the max732x driver for interrupt support. According to the datasheets, open-drain pins have to be configured as output-high (which in that case is actually high impedance) to be used as input. Signed-off-by: Marc Zyngier <maz@misterjones.org> Acked-by: Eric Miao <eric.y.miao@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27max732x: correct nr_port checking off by one errorAxel Lin1-3/+3
Setup both client_group_a and client_group_b if nr_port > 8 (not including nr_port==8). Signed-off-by: Axel Lin <axel.lin@gmail.com> Cc: Eric Miao <eric.miao@marvell.com> Cc: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27pl061: fix offset value range checkingAxel Lin1-1/+1
The valid offset value is 0..PL061_GPIO_NR-1, this patch corrects the offset value range checking. Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27gpiolib: document that names can contain printk format specifiersUwe Kleine-König1-1/+3
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27gpiolib: a gpio is unsigned, so use %u to print itUwe Kleine-König1-1/+1
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27gpiolib: make names array and its values constUwe Kleine-König5-5/+5
gpiolib doesn't need to modify the names and I assume most initializers use string constants that shouldn't be modified anyhow. [akpm@linux-foundation.org: fix drivers/gpio/cs5535-gpio.c] Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: Kevin Wells <kevin.wells@nxp.com> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27gpio: add interrupt handling capability to max732xMarc Zyngier3-19/+346
Most of the GPIO expanders supported by the max732x driver have interrupt generation capability by reporting changes on input pins through an INT# pin. This patch implements the irq_chip functionnality (edge detection only). Signed-off-by: Marc Zyngier <maz@misterjones.org> Cc: Eric Miao <eric.y.miao@gmail.com> Cc: Jebediah Huang <jebediah.huang@gmail.com> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27rtc: AB8500 RTC driverVirupax Sadashivpetimath3-0/+371
Add a driver for the RTC on the AB8500 power management chip. This is a client of the AB8500 MFD driver. Signed-off-by: Virupax Sadashivpetimath <virupax.sadashivpetimath@stericsson.com> Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com> Acked-by: Linus Walleij <linus.walleij@stericsson.com> Acked-by: Srinidhi Kasagar <srinidhi.kasagar@stericsson.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27fs/autofs4: use memdup_userJulia Lawall1-11/+2
Use memdup_user when user data is immediately copied into the allocated region. Elimination of the variable ads, which is no longer useful. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression from,to,size,flag; position p; identifier l1,l2; @@ - to = \(kmalloc@p\|kzalloc@p\)(size,flag); + to = memdup_user(from,size); if ( - to==NULL + IS_ERR(to) || ...) { <+... when != goto l1; - -ENOMEM + PTR_ERR(to) ...+> } - if (copy_from_user(to, from, size) != 0) { - <+... when != goto l2; - -EFAULT - ...+> - } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Cc: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27drivers/message/i2o/i2o_config.c: use memdup_userJulia Lawall1-8/+3
Use memdup_user when user data is immediately copied into the allocated region. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression from,to,size,flag; position p; identifier l1,l2; @@ - to = \(kmalloc@p\|kzalloc@p\)(size,flag); + to = memdup_user(from,size); if ( - to==NULL + IS_ERR(to) || ...) { <+... when != goto l1; - -ENOMEM + PTR_ERR(to) ...+> } - if (copy_from_user(to, from, size) != 0) { - <+... when != goto l2; - -EFAULT - ...+> - } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27drivers/char/vt.c: use memdup_userJulia Lawall1-7/+3
Use memdup_user when user data is immediately copied into the allocated region. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression from,to,size,flag; position p; identifier l1,l2; @@ - to = \(kmalloc@p\|kzalloc@p\)(size,flag); + to = memdup_user(from,size); if ( - to==NULL + IS_ERR(to) || ...) { <+... when != goto l1; - -ENOMEM + PTR_ERR(to) ...+> } - if (copy_from_user(to, from, size) != 0) { - <+... when != goto l2; - -EFAULT - ...+> - } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27drivers/mmc/host: use ERR_CASTJulia Lawall1-1/+1
Use ERR_CAST(x) rather than ERR_PTR(PTR_ERR(x)). The former makes more clear what is the purpose of the operation, which otherwise looks like a no-op. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ type T; T x; identifier f; @@ T f (...) { <+... - ERR_PTR(PTR_ERR(x)) + x ...+> } @@ expression x; @@ - ERR_PTR(PTR_ERR(x)) + ERR_CAST(x) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27sdhci-spear: ST SPEAr based SDHCI controller glueViresh KUMAR5-0/+359
Add a glue layer to support the sdhci driver on the ST SPEAr platform. Signed-off-by: Viresh Kumar <viresh.kumar@st.com> Cc: <shiraz.hashim@st.com> Cc: Linus Walleij <linus.ml.walleij@gmail.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27sdio: add new function for RAW (Read after Write) operationGrazvydas Ignotas2-0/+33
SDIO specification allows RAW (Read after Write) operation using IO_RW_DIRECT command (CMD52) by setting the RAW bit. This operation is similar to ordinary read/write commands, except that both write and read are performed using single command/response pair. The Linux SDIO layer already supports this internaly, only external function is missing for drivers to make use, which is added by this patch. This type of command is required to implement proper power save mode support in wl1251 wifi driver. Android has similar patch for G1 in it's tree for the same reason: http://android.git.kernel.org/?p=kernel/common.git;a=commitdiff;h=74a47786f6ecbe6c1cf9fb15efe6a968451deb52 Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Acked-by: Kalle Valo <kalle.valo@iki.fi> Cc: Dmitry Shmidt <dimitrysh@google.com> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27mmc: remove the "state" argument to mmc_suspend_host()Matt Fleming23-26/+23
Even though many mmc host drivers pass a pm_message_t argument to mmc_suspend_host() that argument isn't used the by MMC core. As host drivers are converted to dev_pm_ops they'll have to construct pm_message_t's (as they won't be passed by the PM subsystem any more) just to appease the mmc suspend interface. We might as well just delete the unused paramter. Signed-off-by: Matt Fleming <matt@console-pimps.org> Acked-by: Anton Vorontsov <cbouatmailru@gmail.com> Acked-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl>ZZ Acked-by: Sascha Sommer <saschasommer@freenet.de> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27mmc: OMAP HS-MMC: convert to dev_pm_opsKevin Hilman1-4/+11
Convert PM operations to use dev_pm_ops. This will facilitate the runtime PM coversion which will add to dev_pm_ops hooks. Note that dev_pm_ops version of the suspend hook no longer takes a 'state' argument. However, the MMC core function mmc_suspend_host() still takes a 'state' argument, but it is unused, so a dummy state variable was created to pass to the MMC core. In the future, the MMC core should be converted to drop this state argument and the rest of the MMC drivers could be easily converted to dev_pm_ops as well. Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com> Cc: Madhusudhan Chikkature <madhu.cr@ti.com> Cc: Adrian Hunter <adrian.hunter@nokia.com> Cc: Matt Fleming <matt@console-pimps.org> Cc: Tony Lindgren <tony@atomide.com> Cc: Denis Karpov <ext-denis.2.karpov@nokia.com> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27omap_hsmmc: improve interrupt synchronisationAdrian Hunter1-128/+134
The following changes were needed: - do not use in_interrupt() because it will not work with threaded interrupts In addition, the following improvements were made: - ensure DMA is unmapped only after the final DMA interrupt - ensure a request is completed only after the final DMA interrupt - disable controller interrupts when a request is not in progress - remove the spin-lock protecting the start of a new request from an unexpected interrupt because the locking was complicated and a 'req_in_progress' flag suffices (since the spin-lock only defers the unexpected interrupts anyway) - instead use the spin-lock to protect the MMC interrupt handler from the DMA interrupt handler - remove the semaphore preventing DMA from being started while the previous DMA is still in progress - the other changes make that impossible, so it is now a BUG_ON condition - ensure the controller interrupt status is clear before exiting the interrrupt handler In general, these changes make the code safer but do not fix any specific bugs so backporting is not necessary. Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com> Tested-by: Venkatraman S <svenkatr@ti.com> Acked-by: Madhusudhan Chikkature <madhu.cr@ti.com> Acked-by: Tony Lindgren <tony@atomide.com> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>