aboutsummaryrefslogtreecommitdiffstats
path: root/fs/select.c (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2011-01-20ACPI: Introduce acpi_os_ioremap()Rafael J. Wysocki4-11/+27
Commit ca9b600be38c ("ACPI / PM: Make suspend_nvs_save() use acpi_os_map_memory()") attempted to prevent the code in osl.c and nvs.c from using different ioremap() variants by making the latter use acpi_os_map_memory() for mapping the NVS pages. However, that also requires acpi_os_unmap_memory() to be used for unmapping them, which causes synchronize_rcu() to be executed many times in a row unnecessarily and introduces substantial delays during resume on some systems. Instead of using acpi_os_map_memory() for mapping the NVS pages in nvs.c introduce acpi_os_ioremap() calling ioremap_cache() and make the code in both osl.c and nvs.c use it. Reported-by: Jeff Chua <jeff.chua.linux@gmail.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20kernel/smp.c: consolidate writes in smp_call_function_interrupt()Milton Miller1-10/+19
We have to test the cpu mask in the interrupt handler before checking the refs, otherwise we can start to follow an entry before its deleted and find it partially initailzed for the next trip. Presently we also clear the cpumask bit before executing the called function, which implies getting write access to the line. After the function is called we then decrement refs, and if they go to zero we then unlock the structure. However, this implies getting write access to the call function data before and after another the function is called. If we can assert that no smp_call_function execution function is allowed to enable interrupts, then we can move both writes to after the function is called, hopfully allowing both writes with one cache line bounce. On a 256 thread system with a kernel compiled for 1024 threads, the time to execute testcase in the "smp_call_function_many race" changelog was reduced by about 30-40ms out of about 545 ms. I decided to keep this as WARN because its now a buggy function, even though the stack trace is of no value -- a simple printk would give us the information needed. Raw data: Without patch: ipi_test startup took 1219366ns complete 539819014ns total 541038380ns ipi_test startup took 1695754ns complete 543439872ns total 545135626ns ipi_test startup took 7513568ns complete 539606362ns total 547119930ns ipi_test startup took 13304064ns complete 533898562ns total 547202626ns ipi_test startup took 8668192ns complete 544264074ns total 552932266ns ipi_test startup took 4977626ns complete 548862684ns total 553840310ns ipi_test startup took 2144486ns complete 541292318ns total 543436804ns ipi_test startup took 21245824ns complete 530280180ns total 551526004ns With patch: ipi_test startup took 5961748ns complete 500859628ns total 506821376ns ipi_test startup took 8975996ns complete 495098924ns total 504074920ns ipi_test startup took 19797750ns complete 492204740ns total 512002490ns ipi_test startup took 14824796ns complete 487495878ns total 502320674ns ipi_test startup took 11514882ns complete 494439372ns total 505954254ns ipi_test startup took 8288084ns complete 502570774ns total 510858858ns ipi_test startup took 6789954ns complete 493388112ns total 500178066ns #include <linux/module.h> #include <linux/init.h> #include <linux/sched.h> /* sched clock */ #define ITERATIONS 100 static void do_nothing_ipi(void *dummy) { } static void do_ipis(struct work_struct *dummy) { int i; for (i = 0; i < ITERATIONS; i++) smp_call_function(do_nothing_ipi, NULL, 1); printk(KERN_DEBUG "cpu %d finished\n", smp_processor_id()); } static struct work_struct work[NR_CPUS]; static int __init testcase_init(void) { int cpu; u64 start, started, done; start = local_clock(); for_each_online_cpu(cpu) { INIT_WORK(&work[cpu], do_ipis); schedule_work_on(cpu, &work[cpu]); } started = local_clock(); for_each_online_cpu(cpu) flush_work(&work[cpu]); done = local_clock(); pr_info("ipi_test startup took %lldns complete %lldns total %lldns\n", started-start, done-started, done-start); return 0; } static void __exit testcase_exit(void) { } module_init(testcase_init) module_exit(testcase_exit) MODULE_LICENSE("GPL"); MODULE_AUTHOR("Anton Blanchard"); Signed-off-by: Milton Miller <miltonm@bga.com> Cc: Anton Blanchard <anton@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20kernel/smp.c: fix smp_call_function_many() SMP raceAnton Blanchard1-0/+30
I noticed a failure where we hit the following WARN_ON in generic_smp_call_function_interrupt: if (!cpumask_test_and_clear_cpu(cpu, data->cpumask)) continue; data->csd.func(data->csd.info); refs = atomic_dec_return(&data->refs); WARN_ON(refs < 0); <------------------------- We atomically tested and cleared our bit in the cpumask, and yet the number of cpus left (ie refs) was 0. How can this be? It turns out commit 54fdade1c3332391948ec43530c02c4794a38172 ("generic-ipi: make struct call_function_data lockless") is at fault. It removes locking from smp_call_function_many and in doing so creates a rather complicated race. The problem comes about because: - The smp_call_function_many interrupt handler walks call_function.queue without any locking. - We reuse a percpu data structure in smp_call_function_many. - We do not wait for any RCU grace period before starting the next smp_call_function_many. Imagine a scenario where CPU A does two smp_call_functions back to back, and CPU B does an smp_call_function in between. We concentrate on how CPU C handles the calls: CPU A CPU B CPU C CPU D smp_call_function smp_call_function_interrupt walks call_function.queue sees data from CPU A on list smp_call_function smp_call_function_interrupt walks call_function.queue sees (stale) CPU A on list smp_call_function int clears last ref on A list_del_rcu, unlock smp_call_function reuses percpu *data A data->cpumask sees and clears bit in cpumask might be using old or new fn! decrements refs below 0 set data->refs (too late!) The important thing to note is since the interrupt handler walks a potentially stale call_function.queue without any locking, then another cpu can view the percpu *data structure at any time, even when the owner is in the process of initialising it. The following test case hits the WARN_ON 100% of the time on my PowerPC box (having 128 threads does help :) #include <linux/module.h> #include <linux/init.h> #define ITERATIONS 100 static void do_nothing_ipi(void *dummy) { } static void do_ipis(struct work_struct *dummy) { int i; for (i = 0; i < ITERATIONS; i++) smp_call_function(do_nothing_ipi, NULL, 1); printk(KERN_DEBUG "cpu %d finished\n", smp_processor_id()); } static struct work_struct work[NR_CPUS]; static int __init testcase_init(void) { int cpu; for_each_online_cpu(cpu) { INIT_WORK(&work[cpu], do_ipis); schedule_work_on(cpu, &work[cpu]); } return 0; } static void __exit testcase_exit(void) { } module_init(testcase_init) module_exit(testcase_exit) MODULE_LICENSE("GPL"); MODULE_AUTHOR("Anton Blanchard"); I tried to fix it by ordering the read and the write of ->cpumask and ->refs. In doing so I missed a critical case but Paul McKenney was able to spot my bug thankfully :) To ensure we arent viewing previous iterations the interrupt handler needs to read ->refs then ->cpumask then ->refs _again_. Thanks to Milton Miller and Paul McKenney for helping to debug this issue. [miltonm@bga.com: add WARN_ON and BUG_ON, remove extra read of refs before initial read of mask that doesn't help (also noted by Peter Zijlstra), adjust comments, hopefully clarify scenario ] [miltonm@bga.com: remove excess tests] Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Milton Miller <miltonm@bga.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: <stable@kernel.org> [2.6.32+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20memcg: correctly order reading PCG_USED and pc->mem_cgroupJohannes Weiner1-18/+9
The placement of the read-side barrier is confused: the writer first sets pc->mem_cgroup, then PCG_USED. The read-side barrier has to be between testing PCG_USED and reading pc->mem_cgroup. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20backlight: fix 88pm860x_bl macro collisionRandy Dunlap1-2/+2
Fix collision with kernel-supplied #define: drivers/video/backlight/88pm860x_bl.c:24:1: warning: "CURRENT_MASK" redefined arch/x86/include/asm/page_64_types.h:6:1: warning: this is the location of the previous definition Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Haojian Zhuang <haojian.zhuang@marvell.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20drivers/leds/ledtrig-gpio.c: make output match input, tighten input checkingJanusz Krzysztofik1-7/+8
Replicate changes made to drivers/leds/ledtrig-backlight.c. Cc: Paul Mundt <lethal@linux-sh.org> Cc: Richard Purdie <richard.purdie@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20MAINTAINERS: update Atmel AT91 entryNicolas Ferre1-2/+6
Add two co-maintainers and update the entry with new information. Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by: Andrew Victor <linux@maxim.org.za> Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20mm: fix truncate_setsize() commentJan Kara1-6/+5
Contrary to what the comment says, truncate_setsize() should be called *before* filesystem truncated blocks. Signed-off-by: Jan Kara <jack@suse.cz> Cc: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20memcg: fix rmdir, force_empty with THPKAMEZAWA Hiroyuki1-11/+26
Now, when THP is enabled, memcg's rmdir() function is broken because move_account() for THP page is not supported. This will cause account leak or -EBUSY issue at rmdir(). This patch fixes the issue by supporting move_account() THP pages. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20memcg: fix LRU accounting with THPKAMEZAWA Hiroyuki1-4/+18
memory cgroup's LRU stat should take care of size of pages because Transparent Hugepage inserts hugepage into LRU. If this value is the number wrong, memory reclaim will not work well. Note: only head page of THP's huge page is linked into LRU. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20memcg: fix USED bit handling at uncharge in THPKAMEZAWA Hiroyuki3-40/+62
Now, under THP: at charge: - PageCgroupUsed bit is set to all page_cgroup on a hugepage. ....set to 512 pages. at uncharge - PageCgroupUsed bit is unset on the head page. So, some pages will remain with "Used" bit. This patch fixes that Used bit is set only to the head page. Used bits for tail pages will be set at splitting if necessary. This patch adds this lock order: compound_lock() -> page_cgroup_move_lock(). [akpm@linux-foundation.org: fix warning] Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20memcg: modify accounting function for supporting THP betterKAMEZAWA Hiroyuki1-13/+12
mem_cgroup_charge_statisics() was designed for charging a page but now, we have transparent hugepage. To fix problems (in following patch) it's required to change the function to get the number of pages as its arguments. The new function gets following as argument. - type of page rather than 'pc' - size of page which is accounted. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20fs/direct-io.c: don't try to allocate more than BIO_MAX_PAGES in a bioDavid Dillow1-3/+7
When using devices that support max_segments > BIO_MAX_PAGES (256), direct IO tries to allocate a bio with more pages than allowed, which leads to an oops in dio_bio_alloc(). Clamp the request to the supported maximum, and change dio_bio_alloc() to reflect that bio_alloc() will always return a bio when called with __GFP_WAIT and a valid number of vectors. [akpm@linux-foundation.org: remove redundant BUG_ON()] Signed-off-by: David Dillow <dillowda@ornl.gov> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20mm: compaction: prevent division-by-zero during user-requested compactionJohannes Weiner1-0/+11
Up until 3e7d344 ("mm: vmscan: reclaim order-0 and use compaction instead of lumpy reclaim"), compaction skipped calculating the fragmentation index of a zone when compaction was explicitely requested through the procfs knob. However, when compaction_suitable was introduced, it did not come with an extra check for order == -1, set on explicit compaction requests, and passed this order on to the fragmentation index calculation, where it overshifts the number of requested pages, leading to a division by zero. This patch makes sure that order == -1 is recognized as the flag it is rather than passing it along as valid order parameter. [akpm@linux-foundation.org: add comment, per Mel] Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20mm/vmscan.c: remove duplicate include of compaction.hJesper Juhl1-1/+0
Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20memblock: fix memblock_is_region_memory()Tomi Valkeinen1-4/+4
memblock_is_region_memory() uses reserved memblocks to search for the given region, while it should use the memory memblocks. I encountered the problem with OMAP's framebuffer ram allocation. Normally the ram is allocated dynamically, and this function is not called. However, if we want to pass the framebuffer from the bootloader to the kernel (to retain the boot image), this function is used to check the validity of the kernel parameters for the framebuffer ram area. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@nokia.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20thp: keep highpte mapped until it is no longer neededJohannes Weiner1-1/+2
Two users reported THP-related crashes on 32-bit x86 machines. Their oops reports indicated an invalid pte, and subsequent code inspection showed that the highpte is actually used after unmap. The fix is to unmap the pte only after all operations against it are finished. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Ilya Dryomov <idryomov@gmail.com> Reported-by: werner <w.landgraf@ru.ru> Cc: Andrea Arcangeli <aarcange@redhat.com> Tested-by: Ilya Dryomov <idryomov@gmail.com> Tested-by: Steven Rostedt <rostedt@goodmis.org Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERTDavid Rientjes298-423/+431
The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option is used to configure any non-standard kernel with a much larger scope than only small devices. This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes references to the option throughout the kernel. A new CONFIG_EMBEDDED option is added that automatically selects CONFIG_EXPERT when enabled and can be used in the future to isolate options that should only be considered for embedded systems (RISC architectures, SLOB, etc). Calling the option "EXPERT" more accurately represents its intention: only expert users who understand the impact of the configuration changes they are making should enable it. Reviewed-by: Ingo Molnar <mingo@elte.hu> Acked-by: David Woodhouse <david.woodhouse@intel.com> Signed-off-by: David Rientjes <rientjes@google.com> Cc: Greg KH <gregkh@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jens Axboe <axboe@kernel.dk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Robin Holt <holt@sgi.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20Fix broken "pipe: use event aware wakeups" optimizationLinus Torvalds1-5/+5
Commit e462c448fdc8 ("pipe: use event aware wakeups") optimized the pipe event wakeup calls to avoid wakeups if the events do not match the requested set. However, the optimization was buggy, in that it didn't actually use the correct sets for the events: when we make room for more data to be written, the pipe poll() routine will return both the POLLOUT _and_ POLLWRNORM bits. Similarly for read. And most critically, when a pipe is released, that will potentially result in POLLHUP|POLLERR (depending on whether it was the last reader or writer), not just the regular POLLIN|POLLOUT. This bug showed itself as a hung gnome-screensaver-dialog process, stuck forever (or at least until it was poked by a signal or by being traced) in a poll() system call. Cc: Davide Libenzi <davidel@xmailserver.org> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20i915: Fix i915 suspend delayLinus Torvalds2-3/+3
During system suspend, the "wait for ring buffer to empty" loop would always time out after three seconds, because the faster cached ring buffer head read would always return zero. Force the slow-and-careful PIO read on all but the first iterations of the loop to fix it. This also removes the unused (and useless) 'actual_head' variable that tried to approximate doing this, but did it incorrectly. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Dave Airlie <airlied@linux.ie> Cc: DRI mailing list <dri-devel@lists.freedesktop.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20ACPI / Battery: remove battery refresh on resumeLinus Torvalds1-1/+0
This partially reverts commit da8aeb92d4853f37e281f11fddf61f9c7d84c3cd ("ACPI / Battery: Update information on info notification and resume"), which causes a hang on resume on at least some machines. This bug was bisected on an ASUS EeePC 901, which hangs at resume time if we do that "acpi_battery_refresh(battery)" in the battery resume function. Rafael suspects we'll still need to refresh the sysfs files upon resume, but that that can be done from a PM notifier (that will run after thawing user space). Bisected-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Garrett <mjg@redhat.com> Cc: Len Brown <len.brown@intel.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-01-20cifs: mangle existing header for SMB_COM_NT_CANCELJeff Layton1-25/+38
The NT_CANCEL command looks just like the original command, except for a few small differences. The send_nt_cancel function however currently takes a tcon, which we don't have in SendReceive and SendReceive2. Instead of "respinning" the entire header for an NT_CANCEL, just mangle the existing header by replacing just the fields we need. This means we don't need a tcon and allows us to call it from other places. Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-20cifs: remove code for setting timeouts on requestsJeff Layton6-50/+17
Since we don't time out individual requests anymore, remove the code that we used to use for setting timeouts on different requests. Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-20[CIFS] cifs: reconnect unresponsive serversSteve French3-5/+25
If the server isn't responding to echoes, we don't want to leave tasks hung waiting for it to reply. At that point, we'll want to reconnect so that soft mounts can return an error to userspace quickly. If the client hasn't received a reply after a specified number of echo intervals, assume that the transport is down and attempt to reconnect the socket. The number of echo_intervals to wait before attempting to reconnect is tunable via a module parameter. Setting it to 0, means that the client will never attempt to reconnect. The default is 5. Signed-off-by: Jeff Layton <jlayton@redhat.com>
2011-01-20cifs: set up recurring workqueue job to do SMB echo requestsJeff Layton2-0/+30
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-20cifs: add ability to send an echo requestJeff Layton4-1/+65
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-20cifs: add cifs_call_asyncJeff Layton2-1/+62
Add a function that will send a request, and set up the mid for an async reply. Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-20cifs: allow for different handling of received responseJeff Layton4-35/+60
In order to incorporate async requests, we need to allow for a more general way to do things on receive, rather than just waking up a process. Turn the task pointer in the mid_q_entry into a callback function and a generic data pointer. When a response comes in, or the socket is reconnected, cifsd can call the callback function in order to wake up the process. The default is to just wake up the current process which should mean no change in behavior for existing code. Also, clean up the locking in cifs_reconnect. There doesn't seem to be any need to hold both the srv_mutex and GlobalMid_Lock when walking the list of mids. Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-20cifs: clean up sync_mid_resultJeff Layton1-17/+18
Make it use a switch statement based on the value of the midStatus. If the resp_buf is set, then MID_RESPONSE_RECEIVED is too. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-20cifs: don't reconnect server when we don't get a responseJeff Layton1-3/+1
We only want to force a reconnect to the server under very limited and specific circumstances. Now that we have processes waiting indefinitely for responses, we shouldn't reach this point unless a reconnect is already in process. Thus, there's no reason to re-mark the server for reconnect here. Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-20cifs: wait indefinitely for responsesJeff Layton1-93/+17
The client should not be timing out on individual SMB requests. Too much of the state between client and server is tied to the state of the socket. If we time out requests and issue spurious disconnects then that comprimises data integrity. Instead of doing this complicated dance where we try to decide how long to wait for a response for particular requests, have the client instead wait indefinitely for a response. Also, use a TASK_KILLABLE sleep here so that fatal signals will break out of this waiting. Later patches will add support for detecting dead peers and forcing reconnects based on that. Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-20smp: Allow on_each_cpu() to be called while early_boot_irqs_disabled status to init/main.cTejun Heo1-4/+7
percpu may end up calling vfree() during early boot which in turn may call on_each_cpu() for TLB flushes. The function of on_each_cpu() can be done safely while IRQ is disabled during early boot but it assumed that the function is always called with local IRQ enabled which ended up enabling local IRQ prematurely during boot and triggering a couple of warnings. This patch updates on_each_cpu() and smp_call_function_many() such on_each_cpu() can be used safely while early_boot_irqs_disabled is set. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <20110120110713.GC6036@htj.dyndns.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Reported-by: Ingo Molnar <mingo@elte.hu>
2011-01-20lockdep: Move early boot local IRQ enable/disable status to init/main.cTejun Heo6-36/+15
During early boot, local IRQ is disabled until IRQ subsystem is properly initialized. During this time, no one should enable local IRQ and some operations which usually are not allowed with IRQ disabled, e.g. operations which might sleep or require communications with other processors, are allowed. lockdep tracked this with early_boot_irqs_off/on() callbacks. As other subsystems need this information too, move it to init/main.c and make it generally available. While at it, toggle the boolean to early_boot_irqs_disabled instead of enabled so that it can be initialized with %false and %true indicates the exceptional condition. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <20110120110635.GB6036@htj.dyndns.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-20virtio: remove virtio-pci root deviceMilton Miller1-18/+2
We sometimes need to map between the virtio device and the given pci device. One such use is OS installer that gets the boot pci device from BIOS and needs to find the relevant block device. Since it can't, installation fails. Instead of creating a top-level devices/virtio-pci directory, create each device under the corresponding pci device node. Symlinks to all virtio-pci devices can be found under the pci driver link in bus/pci/drivers/virtio-pci/devices, and all virtio devices under drivers/bus/virtio/devices. Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Gleb Natapov <gleb@redhat.com> Tested-by: "Daniel P. Berrange" <berrange@redhat.com> Cc: stable@kernel.org
2011-01-20LGUEST_GUEST: fix unmet direct dependencies (VIRTUALIZATION && VIRTIO)Randy Dunlap1-0/+1
Honor the kconfig menu hierarchy to remove kconfig dependency warnings: VIRTIO and VIRTIO_RING are subordinate to VIRTUALIZATION. warning: (LGUEST_GUEST) selects VIRTIO which has unmet direct dependencies (VIRTUALIZATION) warning: (LGUEST_GUEST && VIRTIO_PCI && VIRTIO_BALLOON) selects VIRTIO_RING which has unmet direct dependencies (VIRTUALIZATION && VIRTIO) Reported-by: Toralf F_rster <toralf.foerster@gmx.de> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-01-20lguest: compile fixesRusty Russell2-2/+2
arch/x86/lguest/boot.c: In function ‘lguest_init_IRQ’: arch/x86/lguest/boot.c:824: error: macro "__this_cpu_write" requires 2 arguments, but only 1 given arch/x86/lguest/boot.c:824: error: ‘__this_cpu_write’ undeclared (first use in this function) arch/x86/lguest/boot.c:824: error: (Each undeclared identifier is reported only once arch/x86/lguest/boot.c:824: error: for each function it appears in.) drivers/lguest/x86/core.c: In function ‘copy_in_guest_info’: drivers/lguest/x86/core.c:94: error: lvalue required as left operand of assignment Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-01-20lguest: Use this_cpu_opsChristoph Lameter3-4/+4
Use this_cpu_ops in a couple of places in lguest. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-01-20lguest: document --rng in example LauncherPhilip Sanderson1-0/+5
Rusty Russell wrote: > Ah, it will appear as /dev/hwrng. It's a weirdness of Linux that our actual > hardware number generators are not wired up to /dev/random... Reflected this in the documentation, thanks :-) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-01-20lguest: example launcher to use guard pages, drop PROT_EXEC, fix limit logicPhilip Sanderson1-8/+15
PROT_EXEC seems to be completely unnecessary (as the lguest binary never executes there), and will allow it to work with SELinux (and more importantly, PaX :-) as they can/do forbid writable and executable mappings. Also, map PROT_NONE guard pages at start and end of guest memory for extra paranoia. I changed the length check to addr + size > guest_limit because >= is wrong (addr of 0, size of getpagesize() with a guest_limit of getpagesize() would false positive). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-01-20lguest: --username and --chroot optionsPhilip Sanderson1-0/+50
I've attached a patch which implements dropping to privileges and chrooting to a directory. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-01-19sctp: user perfect name for Delayed SACK Timer optionShan Wei2-2/+3
The option name of Delayed SACK Timer should be SCTP_DELAYED_SACK, not SCTP_DELAYED_ACK. Left SCTP_DELAYED_ACK be concomitant with SCTP_DELAYED_SACK, for making compatibility with existing applications. Reference: 8.1.19. Get or Set Delayed SACK Timer (SCTP_DELAYED_SACK) (http://tools.ietf.org/html/draft-ietf-tsvwg-sctpsocket-25) Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Acked-by: Wei Yongjun <yjwei@cn.fujitsu.com> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19net: fix can_checksum_protocol() arguments swapEric Dumazet1-1/+1
commit 0363466866d901fbc (net offloading: Convert checksums to use centrally computed features.) mistakenly swapped can_checksum_protocol() arguments. This broke IPv6 on bnx2 for instance, on NIC without TCPv6 checksum offloads. Reported-by: Hans de Bruin <jmdebruin@xmsnet.nl> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Jesse Gross <jesse@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19Revert "netlink: test for all flags of the NLM_F_DUMP composite"David S. Miller5-6/+6
This reverts commit 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf. It breaks several things including the avahi daemon. Signed-off-by: David S. Miller <davem@davemloft.net>
2011-01-19cifs: Use mask of ACEs for SID Everyone to calculate all three permissions user, group, and otherShirish Pargaonkar1-2/+11
If a DACL has entries for ACEs for SID Everyone and Authenticated Users, factor in mask in respective entries during calculation of permissions for all three, user, group, and other. http://technet.microsoft.com/en-us/library/bb463216.aspx Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-19cifs: Fix regression during share-level security mounts (Repost)Shirish Pargaonkar1-2/+2
NTLM response length was changed to 16 bytes instead of 24 bytes that are sent in Tree Connection Request during share-level security share mounts. Revert it back to 24 bytes. Reported-and-Tested-by: Grzegorz Ozanski <grzegorz.ozanski@intel.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Acked-by: Suresh Jayaraman <sjayaraman@suse.de> Cc: stable@kernel.org Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-19[CIFS] Update cifs version numberSteve French1-1/+1
Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-19cifs: move mid result processing into common functionJeff Layton1-78/+43
Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-19cifs: move locked sections out of DeleteMidQEntry and AllocMidQEntryJeff Layton1-17/+24
In later patches, we're going to need to have finer-grained control over the addition and removal of these structs from the pending_mid_q and we'll need to be able to call the destructor while holding the spinlock. Move the locked sections out of both routines and into the callers. Fix up current callers of DeleteMidQEntry to call a new routine that dequeues the entry and then destroys it. Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-19cifs: clean up accesses to midCountJeff Layton3-5/+5
It's an atomic_t and the code accesses the "counter" field in it directly instead of using atomic_read(). It also is sometimes accessed under a spinlock and sometimes not. Move it out of the spinlock since we don't need belt-and-suspenders for something that's just informational. Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-01-19cifs: make wait_for_free_request take a TCP_Server_Info pointerJeff Layton1-13/+13
The cifsSesInfo pointer is only used to get at the server. Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>