aboutsummaryrefslogtreecommitdiffstats
path: root/block/blk-softirq.c (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2012-03-28selftests/Makefile: make `run_tests' depend on `all'Andrew Morton1-1/+1
So a "make run_tests" will build the tests before trying to run them. Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28selftests: launch individual selftests from the main MakefileFrederic Weisbecker3-10/+10
Remove the run_tests script and launch the selftests by calling "make run_tests" from the selftests top directory instead. This delegates to the Makefile in each selftest directory, where it is decided how to launch the local test. This removes the need to add each selftest directory to the now removed "run_tests" top script. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Dave Young <dyoung@redhat.com> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28radix-tree: use iterators in find_get_pages* functionsKonstantin Khlebnikov1-48/+38
Replace radix_tree_gang_lookup_slot() and radix_tree_gang_lookup_tag_slot() in page-cache lookup functions with brand-new radix-tree direct iterating. This avoids the double-scanning and pointer copying. Iterator don't stop after nr_pages page-get fails in a row, it continue lookup till the radix-tree end. Thus we can safely remove these restart conditions. Unfortunately, old implementation didn't forbid nr_pages == 0, this corner case does not fit into new code, so the patch adds an extra check at the beginning. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Tested-by: Hugh Dickins <hughd@google.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28radix-tree: rewrite gang lookup using iteratorKonstantin Khlebnikov1-258/+33
Rewrite radix_tree_gang_lookup_* functions using the new radix-tree iterator. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Tested-by: Hugh Dickins <hughd@google.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28radix-tree: introduce bit-optimized iteratorKonstantin Khlebnikov2-0/+347
A series of radix tree cleanups, and usage of them in the core pagecache code. Micro-benchmark: lookup 14 slots (typical page-vector size) in radix-tree there earch <step> slot filled and tagged before/after - nsec per full scan through tree * Intel Sandy Bridge i7-2620M 4Mb L3 New code always faster * AMD Athlon 6000+ 2x1Mb L2, without L3 New code generally faster, Minor degradation (marked with "*") for huge sparse trees * i386 on Sandy Bridge New code faster for common cases: tagged and dense trees. Some degradations for non-tagged lookup on sparse trees. Ideally, there might help __ffs() analog for searching first non-zero long element in array, gcc sometimes cannot optimize this loop corretly. Numbers: CPU: Intel Sandy Bridge i7-2620M 4Mb L3 radix-tree with 1024 slots: tagged lookup step 1 before 7156 after 3613 step 2 before 5399 after 2696 step 3 before 4779 after 1928 step 4 before 4456 after 1429 step 5 before 4292 after 1213 step 6 before 4183 after 1052 step 7 before 4157 after 951 step 8 before 4016 after 812 step 9 before 3952 after 851 step 10 before 3937 after 732 step 11 before 4023 after 709 step 12 before 3872 after 657 step 13 before 3892 after 633 step 14 before 3720 after 591 step 15 before 3879 after 578 step 16 before 3561 after 513 normal lookup step 1 before 4266 after 3301 step 2 before 2695 after 2129 step 3 before 2083 after 1712 step 4 before 1801 after 1534 step 5 before 1628 after 1313 step 6 before 1551 after 1263 step 7 before 1475 after 1185 step 8 before 1432 after 1167 step 9 before 1373 after 1092 step 10 before 1339 after 1134 step 11 before 1292 after 1056 step 12 before 1319 after 1030 step 13 before 1276 after 1004 step 14 before 1256 after 987 step 15 before 1228 after 992 step 16 before 1247 after 999 radix-tree with 1024*1024*128 slots: tagged lookup step 1 before 1086102841 after 674196409 step 2 before 816839155 after 498138306 step 7 before 599728907 after 240676762 step 15 before 555729253 after 185219677 step 63 before 606637748 after 128585664 step 64 before 608384432 after 102945089 step 65 before 596987114 after 123996019 step 128 before 304459225 after 56783056 step 256 before 158846855 after 31232481 step 512 before 86085652 after 18950595 step 12345 before 6517189 after 1674057 normal lookup step 1 before 626064869 after 544418266 step 2 before 418809975 after 336321473 step 7 before 242303598 after 207755560 step 15 before 208380563 after 176496355 step 63 before 186854206 after 167283638 step 64 before 176188060 after 170143976 step 65 before 185139608 after 167487116 step 128 before 88181865 after 86913490 step 256 before 45733628 after 45143534 step 512 before 24506038 after 23859036 step 12345 before 2177425 after 2018662 * AMD Athlon 6000+ 2x1Mb L2, without L3 radix-tree with 1024 slots: tag-lookup step 1 before 8164 after 5379 step 2 before 5818 after 5581 step 3 before 4959 after 4213 step 4 before 4371 after 3386 step 5 before 4204 after 2997 step 6 before 4950 after 2744 step 7 before 4598 after 2480 step 8 before 4251 after 2288 step 9 before 4262 after 2243 step 10 before 4175 after 2131 step 11 before 3999 after 2024 step 12 before 3979 after 1994 step 13 before 3842 after 1929 step 14 before 3750 after 1810 step 15 before 3735 after 1810 step 16 before 3532 after 1660 normal-lookup step 1 before 7875 after 5847 step 2 before 4808 after 4071 step 3 before 4073 after 3462 step 4 before 3677 after 3074 step 5 before 4308 after 2978 step 6 before 3911 after 3807 step 7 before 3635 after 3522 step 8 before 3313 after 3202 step 9 before 3280 after 3257 step 10 before 3166 after 3083 step 11 before 3066 after 3026 step 12 before 2985 after 2982 step 13 before 2925 after 2924 step 14 before 2834 after 2808 step 15 before 2805 after 2803 step 16 before 2647 after 2622 radix-tree with 1024*1024*128 slots: tag-lookup step 1 before 1288059720 after 951736580 step 2 before 961292300 after 884212140 step 7 before 768905140 after 547267580 step 15 before 771319480 after 456550640 step 63 before 504847640 after 242704304 step 64 before 392484800 after 177920786 step 65 before 491162160 after 246895264 step 128 before 208084064 after 97348392 step 256 before 112401035 after 51408126 step 512 before 75825834 after 29145070 step 12345 before 5603166 after 2847330 normal-lookup step 1 before 1025677120 after 861375100 step 2 before 647220080 after 572258540 step 7 before 505518960 after 484041813 step 15 before 430483053 after 444815320 * step 63 before 388113453 after 404250546 * step 64 before 374154666 after 396027440 * step 65 before 381423973 after 396704853 * step 128 before 190078700 after 202619384 * step 256 before 100886756 after 102829108 * step 512 before 64074505 after 56158720 step 12345 before 4237289 after 4422299 * * i686 on Sandy bridge radix-tree with 1024 slots: tagged lookup step 1 before 7990 after 4019 step 2 before 5698 after 2897 step 3 before 5013 after 2475 step 4 before 4630 after 1721 step 5 before 4346 after 1759 step 6 before 4299 after 1556 step 7 before 4098 after 1513 step 8 before 4115 after 1222 step 9 before 3983 after 1390 step 10 before 4077 after 1207 step 11 before 3921 after 1231 step 12 before 3894 after 1116 step 13 before 3840 after 1147 step 14 before 3799 after 1090 step 15 before 3797 after 1059 step 16 before 3783 after 745 normal lookup step 1 before 5103 after 3499 step 2 before 3299 after 2550 step 3 before 2489 after 2370 step 4 before 2034 after 2302 * step 5 before 1846 after 2268 * step 6 before 1752 after 2249 * step 7 before 1679 after 2164 * step 8 before 1627 after 2153 * step 9 before 1542 after 2095 * step 10 before 1479 after 2109 * step 11 before 1469 after 2009 * step 12 before 1445 after 2039 * step 13 before 1411 after 2013 * step 14 before 1374 after 2046 * step 15 before 1340 after 1975 * step 16 before 1331 after 2000 * radix-tree with 1024*1024*128 slots: tagged lookup step 1 before 1225865377 after 667153553 step 2 before 842427423 after 471533007 step 7 before 609296153 after 276260116 step 15 before 544232060 after 226859105 step 63 before 519209199 after 141343043 step 64 before 588980279 after 141951339 step 65 before 521099710 after 138282060 step 128 before 298476778 after 83390628 step 256 before 149358342 after 43602609 step 512 before 76994713 after 22911077 step 12345 before 5328666 after 1472111 normal lookup step 1 before 819284564 after 533635310 step 2 before 512421605 after 364956155 step 7 before 271443305 after 305721345 * step 15 before 223591630 after 273960216 * step 63 before 190320247 after 217770207 * step 64 before 178538168 after 267411372 * step 65 before 186400423 after 215347937 * step 128 before 88106045 after 140540612 * step 256 before 44812420 after 70660377 * step 512 before 24435438 after 36328275 * step 12345 before 2123924 after 2148062 * bloat-o-meter delta for this patchset + patchset with related shmem cleanups bloat-o-meter: x86_64 add/remove: 4/3 grow/shrink: 5/6 up/down: 928/-939 (-11) function old new delta radix_tree_next_chunk - 499 +499 shmem_unuse 428 554 +126 shmem_radix_tree_replace 131 227 +96 find_get_pages_tag 354 419 +65 find_get_pages_contig 345 407 +62 find_get_pages 362 396 +34 __kstrtab_radix_tree_next_chunk - 22 +22 __ksymtab_radix_tree_next_chunk - 16 +16 __kcrctab_radix_tree_next_chunk - 8 +8 radix_tree_gang_lookup_slot 204 203 -1 static.shmem_xattr_set 384 381 -3 radix_tree_gang_lookup_tag_slot 208 191 -17 radix_tree_gang_lookup 231 187 -44 radix_tree_gang_lookup_tag 247 199 -48 shmem_unlock_mapping 278 190 -88 __lookup 217 - -217 __lookup_tag 242 - -242 radix_tree_locate_item 279 - -279 bloat-o-meter: i386 add/remove: 3/3 grow/shrink: 8/9 up/down: 1075/-1275 (-200) function old new delta radix_tree_next_chunk - 757 +757 shmem_unuse 352 449 +97 find_get_pages_contig 269 322 +53 shmem_radix_tree_replace 113 154 +41 find_get_pages_tag 277 318 +41 dcache_dir_lseek 426 458 +32 __kstrtab_radix_tree_next_chunk - 22 +22 vc_do_resize 968 977 +9 snd_pcm_lib_read1 725 733 +8 __ksymtab_radix_tree_next_chunk - 8 +8 netlbl_cipsov4_list 1120 1127 +7 find_get_pages 293 291 -2 new_slab 467 459 -8 bitfill_unaligned_rev 425 417 -8 radix_tree_gang_lookup_tag_slot 177 146 -31 blk_dump_cmd 267 229 -38 radix_tree_gang_lookup_slot 212 134 -78 shmem_unlock_mapping 221 128 -93 radix_tree_gang_lookup_tag 275 162 -113 radix_tree_gang_lookup 255 126 -129 __lookup 227 - -227 __lookup_tag 271 - -271 radix_tree_locate_item 277 - -277 This patch: Implement a clean, simple and effective radix-tree iteration routine. Iterating divided into two phases: * lookup next chunk in radix-tree leaf node * iterating through slots in this chunk Main iterator function radix_tree_next_chunk() returns pointer to first slot, and stores in the struct radix_tree_iter index of next-to-last slot. For tagged-iterating it also constuct bitmask of tags for retunted chunk. All additional logic implemented as static-inline functions and macroses. Also adds radix_tree_find_next_bit() static-inline variant of find_next_bit() optimized for small constant size arrays, because find_next_bit() too heavy for searching in an array with one/two long elements. [akpm@linux-foundation.org: rework comments a bit] Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Tested-by: Hugh Dickins <hughd@google.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28fs/proc/namespaces.c: prevent crash when ns_entries[] is emptyAndrew Morton1-3/+3
If CONFIG_NET_NS, CONFIG_UTS_NS and CONFIG_IPC_NS are disabled, ns_entries[] becomes empty and things like ns_entries[ARRAY_SIZE(ns_entries) - 1] will explode. Reported-by: Richard Weinberger <richard@nod.at> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28nbd: rename the nbd_device variable from lo to nbdWanlong Gao1-147/+148
rename the nbd_device variable from "lo" to "nbd", since "lo" is just a name copied from loop.c. Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Cc: Paul Clements <paul.clements@steeleye.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28pidns: add reboot_pid_ns() to handle the reboot syscallDaniel Lezcano3-1/+49
In the case of a child pid namespace, rebooting the system does not really makes sense. When the pid namespace is used in conjunction with the other namespaces in order to create a linux container, the reboot syscall leads to some problems. A container can reboot the host. That can be fixed by dropping the sys_reboot capability but we are unable to correctly to poweroff/ halt/reboot a container and the container stays stuck at the shutdown time with the container's init process waiting indefinitively. After several attempts, no solution from userspace was found to reliabily handle the shutdown from a container. This patch propose to make the init process of the child pid namespace to exit with a signal status set to : SIGINT if the child pid namespace called "halt/poweroff" and SIGHUP if the child pid namespace called "reboot". When the reboot syscall is called and we are not in the initial pid namespace, we kill the pid namespace for "HALT", "POWEROFF", "RESTART", and "RESTART2". Otherwise we return EINVAL. Returning EINVAL is also an easy way to check if this feature is supported by the kernel when invoking another 'reboot' option like CAD. By this way the parent process of the child pid namespace knows if it rebooted or not and can take the right decision. Test case: ========== #include <alloca.h> #include <stdio.h> #include <sched.h> #include <unistd.h> #include <signal.h> #include <sys/reboot.h> #include <sys/types.h> #include <sys/wait.h> #include <linux/reboot.h> static int do_reboot(void *arg) { int *cmd = arg; if (reboot(*cmd)) printf("failed to reboot(%d): %m\n", *cmd); } int test_reboot(int cmd, int sig) { long stack_size = 4096; void *stack = alloca(stack_size) + stack_size; int status; pid_t ret; ret = clone(do_reboot, stack, CLONE_NEWPID | SIGCHLD, &cmd); if (ret < 0) { printf("failed to clone: %m\n"); return -1; } if (wait(&status) < 0) { printf("unexpected wait error: %m\n"); return -1; } if (!WIFSIGNALED(status)) { printf("child process exited but was not signaled\n"); return -1; } if (WTERMSIG(status) != sig) { printf("signal termination is not the one expected\n"); return -1; } return 0; } int main(int argc, char *argv[]) { int status; status = test_reboot(LINUX_REBOOT_CMD_RESTART, SIGHUP); if (status < 0) return 1; printf("reboot(LINUX_REBOOT_CMD_RESTART) succeed\n"); status = test_reboot(LINUX_REBOOT_CMD_RESTART2, SIGHUP); if (status < 0) return 1; printf("reboot(LINUX_REBOOT_CMD_RESTART2) succeed\n"); status = test_reboot(LINUX_REBOOT_CMD_HALT, SIGINT); if (status < 0) return 1; printf("reboot(LINUX_REBOOT_CMD_HALT) succeed\n"); status = test_reboot(LINUX_REBOOT_CMD_POWER_OFF, SIGINT); if (status < 0) return 1; printf("reboot(LINUX_REBOOT_CMD_POWERR_OFF) succeed\n"); status = test_reboot(LINUX_REBOOT_CMD_CAD_ON, -1); if (status >= 0) { printf("reboot(LINUX_REBOOT_CMD_CAD_ON) should have failed\n"); return 1; } printf("reboot(LINUX_REBOOT_CMD_CAD_ON) has failed as expected\n"); return 0; } [akpm@linux-foundation.org: tweak and add comments] [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Tested-by: Serge Hallyn <serge.hallyn@canonical.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28sysctl: use bitmap library functionsAkinobu Mita1-5/+3
Use bitmap_set() instead of using set_bit() for each bit. This conversion is valid because the bitmap is private in the function call and atomic bitops were unnecessary. This also includes minor change. - Use bitmap_copy() for shorter typing Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28ipmi: use locks on watchdog timeout set on rebootCorey Minyard1-2/+2
The IPMI watchdog timer clears or extends the timer on reboot/shutdown. It was using the non-locking routine for setting the watchdog timer, but this was causing race conditions. Instead, use the locking version to avoid the races. It seems to work fine. Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28ipmi: simplify lockingCorey Minyard1-33/+21
Now that the the IPMI driver is using a tasklet, we can simplify the locking in the driver and get rid of the message lock. Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28ipmi: fix message handling during panicsCorey Minyard2-64/+56
The part of the IPMI driver that delivered panic information to the event log and extended the watchdog timeout during a panic was not properly handling the messages. It used static messages to avoid allocation, but wasn't properly waiting for these, or wasn't properly handling the refcounts. Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28ipmi: use a tasklet for handling received messagesCorey Minyard2-67/+88
The IPMI driver would release a lock, deliver a message, then relock. This is obviously ugly, and this patch converts the message handler interface to use a tasklet to schedule work. This lets the receive handler be called from an interrupt handler with interrupts enabled. Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28ipmi: increase KCS timeoutsMatthew Garrett1-2/+2
We currently time out and retry KCS transactions after 1 second of waiting for IBF or OBF. This appears to be too short for some hardware. The IPMI spec says "All system software wait loops should include error timeouts. For simplicity, such timeouts are not shown explicitly in the flow diagrams. A five-second timeout or greater is recommended". Change the timeout to five seconds to satisfy the slow hardware. Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28ipmi: decrease the IPMI message transaction time in interrupt modeSrinivas_Gowda1-1/+3
Call the event handler immediately after starting the next message. This change considerably decreases the IPMI transaction time (cuts off ~9ms for a single ipmitool transaction). Signed-off-by: Srinivas_Gowda <srinivas_g_gowda@dell.com> Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28kdump x86: fix total mem size calculation for reservationDave Young1-10/+1
crashkernel reservation need know the total memory size. Current get_total_mem simply use max_pfn - min_low_pfn. It is wrong because it will including memory holes in the middle. Especially for kvm guest with memory > 0xe0000000, there's below in qemu code: qemu split memory as below: if (ram_size >= 0xe0000000 ) { above_4g_mem_size = ram_size - 0xe0000000; below_4g_mem_size = 0xe0000000; } else { below_4g_mem_size = ram_size; } So for 4G mem guest, seabios will insert a 512M usable region beyond of 4G. Thus in above case max_pfn - min_low_pfn will be more than original memsize. Fixing this issue by using memblock_phys_mem_size() to get the total memsize. Signed-off-by: Dave Young <dyoung@redhat.com> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28kexec: add further check to crashkernelZhenzhong Duan1-0/+4
When using crashkernel=2M-256M, the kernel doesn't give any warning. This is misleading sometimes. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28kexec: crash: don't save swapper_pg_dir for !CONFIG_MMU configurationsWill Deacon1-0/+2
nommu platforms don't have very interesting swapper_pg_dir pointers and usually just #define them to NULL, meaning that we can't include them in the vmcoreinfo on the kexec crash path. This patch only saves the swapper_pg_dir if we have an MMU. Signed-off-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Simon Horman <horms@verge.net.au> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28arch/ia64: remove references to cpu_*_mapSrivatsa S. Bhat8-26/+24
This was marked as obsolete for quite a while now.. Now it is time to remove it altogether. And while doing this, get rid of first_cpu() as well. Also, remove the redundant setting of cpu_online_mask in smp_prepare_cpus() because the generic code would have already set cpu 0 in cpu_online_mask. Reported-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28lib/cpumask.c: remove __any_online_cpu()Srivatsa S. Bhat2-14/+1
__any_online_cpu() is not optimal and also unnecessary. So, replace its use by faster cpumask_* operations. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Venkatesh Pallipadi <venki@google.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28mm: only IPI CPUs to drain local pages if they existGilad Ben-Yossef1-2/+38
Calculate a cpumask of CPUs with per-cpu pages in any zone and only send an IPI requesting CPUs to drain these pages to the buddy allocator if they actually have pages when asked to flush. This patch saves 85%+ of IPIs asking to drain per-cpu pages in case of severe memory pressure that leads to OOM since in these cases multiple, possibly concurrent, allocation requests end up in the direct reclaim code path so when the per-cpu pages end up reclaimed on first allocation failure for most of the proceeding allocation attempts until the memory pressure is off (possibly via the OOM killer) there are no per-cpu pages on most CPUs (and there can easily be hundreds of them). This also has the side effect of shortening the average latency of direct reclaim by 1 or more order of magnitude since waiting for all the CPUs to ACK the IPI takes a long time. Tested by running "hackbench 400" on a 8 CPU x86 VM and observing the difference between the number of direct reclaim attempts that end up in drain_all_pages() and those were more then 1/2 of the online CPU had any per-cpu page in them, using the vmstat counters introduced in the next patch in the series and using proc/interrupts. In the test sceanrio, this was seen to save around 3600 global IPIs after trigerring an OOM on a concurrent workload: $ cat /proc/vmstat | tail -n 2 pcp_global_drain 0 pcp_global_ipi_saved 0 $ cat /proc/interrupts | grep CAL CAL: 1 2 1 2 2 2 2 2 Function call interrupts $ hackbench 400 [OOM messages snipped] $ cat /proc/vmstat | tail -n 2 pcp_global_drain 3647 pcp_global_ipi_saved 3642 $ cat /proc/interrupts | grep CAL CAL: 6 13 6 3 3 3 1 2 7 Function call interrupts Please note that if the global drain is removed from the direct reclaim path as a patch from Mel Gorman currently suggests this should be replaced with an on_each_cpu_cond invocation. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Pekka Enberg <penberg@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Acked-by: Michal Nazarewicz <mina86@mina86.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28fs: only send IPI to invalidate LRU BH when neededGilad Ben-Yossef1-1/+14
In several code paths, such as when unmounting a file system (but not only) we send an IPI to ask each cpu to invalidate its local LRU BHs. For multi-cores systems that have many cpus that may not have any LRU BH because they are idle or because they have not performed any file system accesses since last invalidation (e.g. CPU crunching on high perfomance computing nodes that write results to shared memory or only using filesystems that do not use the bh layer.) This can lead to loss of performance each time someone switches the KVM (the virtual keyboard and screen type, not the hypervisor) if it has a USB storage stuck in. This patch attempts to only send an IPI to cpus that have LRU BH. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28slub: only IPI CPUs that have per cpu obj to flushGilad Ben-Yossef1-1/+9
flush_all() is called for each kmem_cache_destroy(). So every cache being destroyed dynamically ends up sending an IPI to each CPU in the system, regardless if the cache has ever been used there. For example, if you close the Infinband ipath driver char device file, the close file ops calls kmem_cache_destroy(). So running some infiniband config tool on one a single CPU dedicated to system tasks might interrupt the rest of the 127 CPUs dedicated to some CPU intensive or latency sensitive task. I suspect there is a good chance that every line in the output of "git grep kmem_cache_destroy linux/ | grep '\->'" has a similar scenario. This patch attempts to rectify this issue by sending an IPI to flush the per cpu objects back to the free lists only to CPUs that seem to have such objects. The check which CPU to IPI is racy but we don't care since asking a CPU without per cpu objects to flush does no damage and as far as I can tell the flush_all by itself is racy against allocs on remote CPUs anyway, so if you required the flush_all to be determinstic, you had to arrange for locking regardless. Without this patch the following artificial test case: $ cd /sys/kernel/slab $ for DIR in *; do cat $DIR/alloc_calls > /dev/null; done produces 166 IPIs on an cpuset isolated CPU. With it it produces none. The code path of memory allocation failure for CPUMASK_OFFSTACK=y config was tested using fault injection framework. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Pekka Enberg <penberg@kernel.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Sasha Levin <levinsasha928@gmail.com> Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Avi Kivity <avi@redhat.com> Cc: Michal Nazarewicz <mina86@mina86.org> Cc: Kosaki Motohiro <kosaki.motohiro@gmail.com> Cc: Milton Miller <miltonm@bga.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28smp: add func to IPI cpus based on parameter funcGilad Ben-Yossef2-0/+85
Add the on_each_cpu_cond() function that wraps on_each_cpu_mask() and calculates the cpumask of cpus to IPI by calling a function supplied as a parameter in order to determine whether to IPI each specific cpu. The function works around allocation failure of cpumask variable in CONFIG_CPUMASK_OFFSTACK=y by itereating over cpus sending an IPI a time via smp_call_function_single(). The function is useful since it allows to seperate the specific code that decided in each case whether to IPI a specific cpu for a specific request from the common boilerplate code of handling creating the mask, handling failures etc. [akpm@linux-foundation.org: s/gfpflags/gfp_flags/] [akpm@linux-foundation.org: avoid double-evaluation of `info' (per Michal), parenthesise evaluation of `cond_func'] [akpm@linux-foundation.org: s/CPU/CPUs, use all 80 cols in comment] Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux-foundation.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Pekka Enberg <penberg@kernel.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Sasha Levin <levinsasha928@gmail.com> Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Avi Kivity <avi@redhat.com> Acked-by: Michal Nazarewicz <mina86@mina86.org> Cc: Kosaki Motohiro <kosaki.motohiro@gmail.com> Cc: Milton Miller <miltonm@bga.com> Reviewed-by: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28smp: introduce a generic on_each_cpu_mask() functionGilad Ben-Yossef5-41/+56
We have lots of infrastructure in place to partition multi-core systems such that we have a group of CPUs that are dedicated to specific task: cgroups, scheduler and interrupt affinity, and cpuisol= boot parameter. Still, kernel code will at times interrupt all CPUs in the system via IPIs for various needs. These IPIs are useful and cannot be avoided altogether, but in certain cases it is possible to interrupt only specific CPUs that have useful work to do and not the entire system. This patch set, inspired by discussions with Peter Zijlstra and Frederic Weisbecker when testing the nohz task patch set, is a first stab at trying to explore doing this by locating the places where such global IPI calls are being made and turning the global IPI into an IPI for a specific group of CPUs. The purpose of the patch set is to get feedback if this is the right way to go for dealing with this issue and indeed, if the issue is even worth dealing with at all. Based on the feedback from this patch set I plan to offer further patches that address similar issue in other code paths. This patch creates an on_each_cpu_mask() and on_each_cpu_cond() infrastructure API (the former derived from existing arch specific versions in Tile and Arm) and uses them to turn several global IPI invocation to per CPU group invocations. Core kernel: on_each_cpu_mask() calls a function on processors specified by cpumask, which may or may not include the local processor. You must not call this function with disabled interrupts or from a hardware interrupt handler or from a bottom half handler. arch/arm: Note that the generic version is a little different then the Arm one: 1. It has the mask as first parameter 2. It calls the function on the calling CPU with interrupts disabled, but this should be OK since the function is called on the other CPUs with interrupts disabled anyway. arch/tile: The API is the same as the tile private one, but the generic version also calls the function on the with interrupts disabled in UP case This is OK since the function is called on the other CPUs with interrupts disabled. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Reviewed-by: Christoph Lameter <cl@linux.com> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Pekka Enberg <penberg@kernel.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Sasha Levin <levinsasha928@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Avi Kivity <avi@redhat.com> Acked-by: Michal Nazarewicz <mina86@mina86.org> Cc: Kosaki Motohiro <kosaki.motohiro@gmail.com> Cc: Milton Miller <miltonm@bga.com> Cc: Russell King <linux@arm.linux.org.uk> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28swapon: check validity of swap_flagsHugh Dickins2-0/+6
Most system calls taking flags first check that the flags passed in are valid, and that helps userspace to detect when new flags are supported. But swapon never did so: start checking now, to help if we ever want to support more swap_flags in future. It's difficult to get stray bits set in an int, and swapon is not widely used, so this is most unlikely to break any userspace; but we can just revert if it turns out to do so. Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28mm, coredump: fail allocations when coredumping instead of oom killingDavid Rientjes1-0/+4
The size of coredump files is limited by RLIMIT_CORE, however, allocating large amounts of memory results in three negative consequences: - the coredumping process may be chosen for oom kill and quickly deplete all memory reserves in oom conditions preventing further progress from being made or tasks from exiting, - the coredumping process may cause other processes to be oom killed without fault of their own as the result of a SIGSEGV, for example, in the coredumping process, or - the coredumping process may result in a livelock while writing to the dump file if it needs memory to allocate while other threads are in the exit path waiting on the coredumper to complete. This is fixed by implying __GFP_NORETRY in the page allocator for coredumping processes when reclaim has failed so the allocations fail and the process continues to exit. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28mm: thp: fix up pmd_trans_unstable() locationsAndrea Arcangeli2-3/+6
pmd_trans_unstable() should be called before pmd_offset_map() in the locations where the mmap_sem is held for reading. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Ulrich Obergfell <uobergfe@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mark Salter <msalter@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28mm for fs: add truncate_pagecache_range()Hugh Dickins2-1/+41
Holepunching filesystems ext4 and xfs are using truncate_inode_pages_range but forgetting to unmap pages first (ocfs2 remembers). This is not really a bug, since races already require truncate_inode_page() to handle that case once the page is locked; but it can be very inefficient if the file being punched happens to be mapped into many vmas. Provide a drop-in replacement truncate_pagecache_range() which does the unmapping pass first, handling the awkward mismatch between arguments to truncate_inode_pages_range() and arguments to unmap_mapping_range(). Note that holepunching does not unmap privately COWed pages in the range: POSIX requires that we do so when truncating, but it's hard to justify, difficult to implement without an i_size cutoff, and no filesystem is attempting to implement it. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Ben Myers <bpm@sgi.com> Cc: Alex Elder <elder@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Chinner <david@fromorbit.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28procfs: fix /proc/statmKAMEZAWA Hiroyuki1-1/+1
bda7bad62bc4 ("procfs: speed up /proc/pid/stat, statm") broke /proc/statm - 'text' is printed twice by mistake. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reported-by: Ulrich Drepper <drepper@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28vfs: fix d_ancestor() case in d_materialize_uniqueMichel Lespinasse1-1/+2
In d_materialise_unique() there are 3 subcases to the 'aliased dentry' case; in two subcases the inode i_lock is properly released but this does not occur in the -ELOOP subcase. This seems to have been introduced by commit 1836750115f2 ("fix loop checks in d_materialise_unique()"). Signed-off-by: Michel Lespinasse <walken@google.com> Cc: stable@vger.kernel.org # v3.0+ [ Added a comment, and moved the unlock to where we generate the -ELOOP, which seems to be more natural. You probably can't actually trigger this without a buggy network file server - d_materialize_unique() is for finding aliases on non-local filesystems, and the d_ancestor() case is for a hardlinked directory loop. But we should be robust in the case of such buggy servers anyway. ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-27net: fix a potential rcu_read_lock() imbalance in rt6_fill_node()Eric Dumazet1-2/+6
Commit f2c31e32b378 (net: fix NULL dereferences in check_peer_redir() ) added a regression in rt6_fill_node(), leading to rcu_read_lock() imbalance. Thats because NLA_PUT() can make a jump to nla_put_failure label. Fix this by using nla_put() Many thanks to Ben Greear for his help Reported-by: Ben Greear <greearb@candelatech.com> Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-25ARM: 7343/1: sa11x0: convert to sparse IRQRussell King5-12/+13
Now that Neponset, UCB1x00 and SA1111 are all converted to use the IRQ allocation interfaces, we can enable sparse IRQ support for SA11x0 platforms.
2012-03-25ARM: 7342/2: sa1100: prepare for sparse irq conversionRob Herring21-11/+40
In preparation to convert SA1100 to sparse irq, set .nr_irqs for each machine and explicitly include mach/irqs.h as needed. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-25ARM: 7341/1: input: prepare jornada720 keyboard and ts for sa11x0 sparse irqRob Herring2-0/+2
In preparation for sa11x0 sparse irq conversion, explicitly include mach/irqs.h as it will not be included for sparse irq. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-25ARM: 7340/1: rtc: sa1100: include mach/irqs.h instead of asm/irq.hRob Herring1-1/+1
Since asm/irq.h may not include mach/irqs.h, include mach/irqs.h directly. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-25ARM: sa11x0: remove unused DMA controller definitionsRussell King2-218/+3
Remove the new unused DMA controller definitions from mach/SA-1100.h. These are now private to the SA-11x0 DMA engine driver and contained within the driver. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-25ARM: sa11x0: remove old SoC private DMA driverRussell King3-466/+1
Now that all users are converted over to using the DMA engine API, we can get rid of the old platform dependent DMA driver. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-03-25net: add a truesize parameter to skb_add_rx_frag()Eric Dumazet7-9/+13
skb_add_rx_frag() API is misleading. Network skbs built with this helper can use uncharged kernel memory and eventually stress/crash machine in OOM. Add a 'truesize' parameter and then fix drivers in followup patches. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-25gianfar: Fix possible overrun and simplify interrupt name field creationJoe Perches2-33/+8
Space allocated for int_name_<foo> is insufficient for maximal device name, expand it. Code to create int_name_<foo> is obscure, simplify it by using sprintf. Found by looking for unnecessary \ line continuations. Signed-off-by: Joe Perches <joe@perches.com> Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-25USB: qmi_wwan: Add ZTE (Vodafone) K3570-Z and K3571-Z net interfacesAndrew Bird (Sphere Systems)1-0/+18
Now that we have the beginnings of an OSS method to use the network interfaces on these USB broadband modems, add the ZTE manufactured Vodafone items to the whitelist Signed-off-by: Andrew Bird <ajb@spheresystems.co.uk> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-25USB: option: Ignore ZTE (Vodafone) K3570/71 net interfacesAndrew Bird (Sphere Systems)1-2/+4
These interfaces need to be handled by QMI/WWAN driver Signed-off-by: Andrew Bird <ajb@spheresystems.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-25USB: qmi_wwan: Add ZTE (Vodafone) K3565-Z and K4505-Z net interfacesAndrew Bird (Sphere Systems)1-0/+18
Now that we have the beginnings of an OSS method to use the network interfaces on these USB broadband modems, add the ZTE manufactured Vodafone items to the whitelist Signed-off-by: Andrew Bird <ajb@spheresystems.co.uk> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-25um: Update defconfigRichard Weinberger1-184/+471
Enable ext4, cgroups and devtmpfs. Signed-off-by: Richard Weinberger <richard@nod.at>
2012-03-25um: Switch to large mcmodel on x86_64Richard Weinberger1-0/+1
x86_64 UML is unable to load modules if more than 504MiB of memory are used. This happens because on x86_64 the UML process has a quite high start address (typically around 0x6000000). If UML's memory is larger than 504MiB VMALLOC_START happens to be after 0x8000000. This is no problem unless one loads a module which was built with R_X86_64_32S relocations. Symbols with a location > 0x8000000 cannot be used with R_X86_64_32S To deal with this x86_64 UML has to be compiled with -mcmodel=large such that no R_X86_64_32S relocations are used. Signed-off-by: Richard Weinberger <richard@nod.at> Reported-by: 전하늘 <allskyee@gmail.com>
2012-03-25MTD: Relax dependenciesRichard Weinberger5-1/+6
CONFIG_GENERIC_IO is just enough for the basic MTD stuff. Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
2012-03-25um: Wire CONFIG_GENERIC_IO upRichard Weinberger1-0/+1
UML has no io memory but implements everything defined in generic-asm/io.h. Signed-off-by: Richard Weinberger <richard@nod.at>
2012-03-25um: Serve io_remap_pfn_range()Richard Weinberger1-0/+2
At some places io_remap_pfn_range() is needed. UML has to serve it like all other archs do. Signed-off-by: Richard Weinberger <richard@nod.at>
2012-03-25Introduce CONFIG_GENERIC_IORichard Weinberger1-0/+5
There are situations where CONFIG_HAS_IOMEM is too restrictive. For example CONFIG_MTD_NAND_NANDSIM depends on CONFIG_HAS_IOMEM but it works perfectly fine if an architecture without io memory just includes asm-generic/io.h or implements everything defined in it. UML is such a corner case. Signed-off-by: Richard Weinberger <richard@nod.at>
2012-03-25um: allow SUBARCH=x86Al Viro1-2/+2
nicked from patch by dwmw2 back in July Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Richard Weinberger <richard@nod.at>