aboutsummaryrefslogtreecommitdiffstats
path: root/scripts/bootgraph.pl (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2016-01-16lib/vsprintf: factor out %pN[F] handler as netdev_bits()Andy Shevchenko1-9/+16
Move switch case to the netdev_features_string() and rename it to netdev_bits(). In the future we can extend it as needed. Here we replace the fallback of %pN from '%p' with possible flags to sticter '0x%p' without any flags variation. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16lib/vsprintf: refactor duplicate code to special_hex_number()Andy Shevchenko1-26/+27
special_hex_number() is a helper to print a fixed size type in a hex format with '0x' prefix, zero padding, and small letters. In the module we have already several copies of such code. Consolidate them under special_hex_number() helper. There are couple of differences though. It seems nobody cared about the output in case of CONFIG_KALLSYMS=n, when printing symbol address, because the asked field width is not enough to care last 2 characters in the string represantation of the pointer. Fixed here. The %pNF specifier used to be allowed with a specific field width, though there is neither any user of it nor mention the possibility in the documentation. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16printk-formats.txt: remove unimplemented %pTRasmus Villemoes1-9/+0
%pT for task->comm has been proposed (several times, I think), but is not actually implemented. Remove it from printk-formats.txt and add it back if/when it gets implemented. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16printk: help pr_debug and pr_devel to optimize out argumentsAaron Conole1-6/+6
Currently, pr_debug and pr_devel will not elide function call arguments appearing in calls to the no_printk for these macros. This is because all side effects must be honored before proceeding to the 0-value assignment in no_printk. The behavior is contrary to documentation found in the CodingStyle and the header file where these functions are declared. This patch corrects that behavior by shunting out the call to no_printk completely. The format string is still checked by gcc for correctness, but no code seems to be emitted in common cases. [akpm@linux-foundation.org: remove braces, per Joe] Fixes: 5264f2f75d86 ("include/linux/printk.h: use and neaten no_printk") Signed-off-by: Aaron Conole <aconole@redhat.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Joe Perches <joe@perches.com> Cc: Jason Baron <jbaron@akamai.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16lib/test_printf.c: test dentry printingRasmus Villemoes1-0/+27
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Kees Cook <keescook@chromium.org> Cc: Maurizio Lombardi <mlombard@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16lib/test_printf.c: add test for large bitmapsRasmus Villemoes1-0/+17
Following "lib/vsprintf.c: expand field_width to 24 bits", let's add a test to see that we now actually support bitmaps with 65536 bits. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Maurizio Lombardi <mlombard@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16lib/test_printf.c: account for kvasprintf testsRasmus Villemoes1-0/+1
These should also count as performed tests. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Maurizio Lombardi <mlombard@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16lib/test_printf.c: add a few number() testsRasmus Villemoes1-0/+24
This adds a few tests to test_number, one of which serves to document another deviation from POSIX/C99 (printing 0 with an explicit precision of 0). Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Kees Cook <keescook@chromium.org> Cc: Maurizio Lombardi <mlombard@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16lib/test_printf.c: test precision quirksRasmus Villemoes1-6/+15
The kernel's printf doesn't follow the standards in a few corner cases (which are probably mostly irrelevant). Add tests that document the current behaviour. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Kees Cook <keescook@chromium.org> Cc: Maurizio Lombardi <mlombard@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16lib/test_printf.c: check for out-of-bound writesRasmus Villemoes1-5/+19
Add a few padding bytes on either side of the test buffer, and check that these (and the part of the buffer not used) are untouched by vsnprintf. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Maurizio Lombardi <mlombard@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16lib/test_printf.c: don't BUGRasmus Villemoes1-1/+6
BUG is a completely unnecessarily big hammer, and we're more likely to get the internal bug reported if we just pr_err() and ensure the test suite fails. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Maurizio Lombardi <mlombard@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16lib/kasprintf.c: add sanity check to kvasprintfRasmus Villemoes1-4/+6
kasprintf relies on being able to replay the formatting and getting the same result (in particular, the same length). This will almost always work, but it is possible that the object pointed to by a %s or %p argument changed under us (so we might get truncated output). Add a somewhat paranoid sanity check and let's see if it ever triggers. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Kees Cook <keescook@chromium.org> Cc: Maurizio Lombardi <mlombard@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16lib/vsprintf.c: warn about too large precisions and field widthsRasmus Villemoes1-4/+24
The field width is overloaded to pass some extra information for some %p extensions (e.g. #bits for %pb). But we might silently truncate the passed value when we stash it in struct printf_spec (see e.g. "lib/vsprintf.c: expand field_width to 24 bits"). Hopefully 23 value bits should now be enough for everybody, but if not, let's make some noise. Do the same for the precision. In both cases, clamping seems more sensible than truncating. While, according to POSIX, "A negative precision is taken as if the precision were omitted.", the kernel's printf has always treated that case as if the precision was 0, so we use that as lower bound. For the field width, the smallest representable value is actually -(1<<23), but a negative field width means 'set the LEFT flag and use the absolute value', so we want the absolute value to fit. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Kees Cook <keescook@chromium.org> Cc: Maurizio Lombardi <mlombard@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16lib/vsprintf.c: help gcc make number() smallerRasmus Villemoes1-12/+14
One consequence of the reorganization of struct printf_spec to make field_width 24 bits was that number() gained about 180 bytes. Since spec is never passed to other functions, we can help gcc make number() lose most of that extra weight by using local variables for the field width and precision. Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Kees Cook <keescook@chromium.org> Cc: Maurizio Lombardi <mlombard@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16lib/vsprintf.c: expand field_width to 24 bitsRasmus Villemoes1-20/+22
Maurizio Lombardi reported a problem [1] with the %pb extension: It doesn't work for sufficiently large bitmaps, since the size is stashed in the field_width field of the struct printf_spec, which is currently an s16. Concretely, this manifested itself in /sys/bus/pseudo/drivers/scsi_debug/map being empty, since the bitmap printer got a size of 0, which is the 16 bit truncation of the actual bitmap size. We do want to keep struct printf_spec at 8 bytes so that it can cheaply be passed by value. The qualifier field is only used for internal bookkeeping in format_decode, so we might as well use a local variable for that. This gives us an additional 8 bits, which we can then use for the field width. To stay in 8 bytes, we need to do a little rearranging and make the type member a bitfield as well. For consistency, change all the members to bit fields. gcc doesn't generate much worse code with these changes (in fact, bloat-o-meter says we save 300 bytes - which I think is a little surprising). I didn't find a BUILD_BUG/compiletime_assertion/... which would work outside function context, so for now I just open-coded it. [1] http://thread.gmane.org/gmane.linux.kernel/2034835 [akpm@linux-foundation.org: avoid open-coded BUILD_BUG_ON] Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reported-by: Maurizio Lombardi <mlombard@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16lib/vsprintf.c: eliminate potential race in string()Rasmus Villemoes1-19/+9
If the string corresponding to a %s specifier can change under us, we might end up copying a \0 byte to the output buffer. There might be callers who expect the output buffer to contain a genuine C string whose length is exactly the snprintf return value (assuming truncation hasn't happened or has been checked for). We can avoid this by only passing over the source string once, stopping the first time we meet a nul byte (or when we reach the given precision), and then letting widen_string() handle left/right space padding. As a small bonus, this code reuse also makes the generated code slightly smaller. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Kees Cook <keescook@chromium.org> Cc: Maurizio Lombardi <mlombard@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16lib/vsprintf.c: move string() below widen_string()Rasmus Villemoes1-31/+31
This is pure code movement, making sure the widen_string() helper is defined before the string() function. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Kees Cook <keescook@chromium.org> Cc: Maurizio Lombardi <mlombard@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16lib/vsprintf.c: pull out padding code from dentry_name()Rasmus Villemoes1-15/+31
Pull out the logic in dentry_name() which handles field width space padding, in preparation for reusing it from string(). Rename the widen() helper to move_right(), since it is used for handling the !(flags & LEFT) case. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Kees Cook <keescook@chromium.org> Cc: Maurizio Lombardi <mlombard@redhat.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16printk: do cond_resched() between lines while outputting to consolesTejun Heo3-3/+36
@console_may_schedule tracks whether console_sem was acquired through lock or trylock. If the former, we're inside a sleepable context and console_conditional_schedule() performs cond_resched(). This allows console drivers which use console_lock for synchronization to yield while performing time-consuming operations such as scrolling. However, the actual console outputting is performed while holding irq-safe logbuf_lock, so console_unlock() clears @console_may_schedule before starting outputting lines. Also, only a few drivers call console_conditional_schedule() to begin with. This means that when a lot of lines need to be output by console_unlock(), for example on a console registration, the task doing console_unlock() may not yield for a long time on a non-preemptible kernel. If this happens with a slow console devices, for example a serial console, the outputting task may occupy the cpu for a very long time. Long enough to trigger softlockup and/or RCU stall warnings, which in turn pile more messages, sometimes enough to trigger the next cycle of warnings incapacitating the system. Fix it by making console_unlock() insert cond_resched() between lines if @console_may_schedule. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Calvin Owens <calvinowens@fb.com> Acked-by: Jan Kara <jack@suse.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Kyle McMartin <kyle@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16printk: only unregister boot consoles when necessaryThierry Reding1-1/+25
Boot consoles are typically replaced by proper consoles during the boot process. This can be problematic if the boot console data is part of the init section that is reclaimed late during boot. If the proper console does not register before this point in time, the boot console will need to be removed (so that the freed memory is not accessed), leaving the system without output for some time. There are various reasons why the proper console may not register early enough, such as deferred probe or the driver being a loadable module. If that happens, there is some amount of time where no console messages are visible to the user, which in turn can mean that they won't see crashes or other potentially useful information. To avoid this situation, only remove the boot console when it resides in the init section. Code exists to replace the boot console by the proper console when it is registered, keeping a seamless transition between the boot and proper consoles. Signed-off-by: Thierry Reding <treding@nvidia.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16asm/sections: add helpers to check for section dataThierry Reding1-0/+65
Add a helper to check if an object (given an address and a size) is part of a section (given beginning and end addresses). For convenience, also provide a helper that performs this check for __init data using the __init_begin and __init_end limits. Signed-off-by: Thierry Reding <treding@nvidia.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16kernel/stop_machine.c: remove CONFIG_SMP dependenciesAndrew Morton1-4/+0
stop_machine.o is only built if CONFIG_SMP=y, so this ifdef always evaluates to true. [akpm@linux-foundation.org: remove now-unneeded ifdef] Reported-by: Valentin Rothberg <valentinrothberg@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16uselib: default depending if libc5 was usedRiku Voipio1-1/+1
uselib hasn't been used since libc5; glibc does not use it. Deprecate uselib a bit more, by making the default y only if libc5 was widely used on the plaform. This makes arm64 kernel built with defconfig slightly smaller bloat-o-meter: add/remove: 0/3 grow/shrink: 0/2 up/down: 0/-1390 (-1390) function old new delta kernel_config_data 18164 18162 -2 uselib_flags 20 - -20 padzero 216 192 -24 sys_uselib 380 - -380 load_elf_library 964 - -964 Signed-off-by: Riku Voipio <riku.voipio@linaro.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16err.h: add (missing) unlikely() to IS_ERR_OR_NULL()Viresh Kumar1-1/+1
IS_ERR_VALUE() already contains it and so we need to add this only to the !ptr check. That will allow users of IS_ERR_OR_NULL(), to not add this compiler flag. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16Kconfig: remove HAVE_LATENCYTOP_SUPPORTWill Deacon12-37/+0
As illustrated by commit a3afe70b83fd ("[S390] latencytop s390 support."), HAVE_LATENCYTOP_SUPPORT is defined by an architecture to advertise an implementation of save_stack_trace_tsk. However, as of 9212ddb5eada ("stacktrace: provide save_stack_trace_tsk() weak alias") a dummy implementation is provided if STACKTRACE=y. Given that LATENCYTOP already depends on STACKTRACE_SUPPORT and selects STACKTRACE, we can remove HAVE_LATENCYTOP_SUPPORT altogether. Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: James Hogan <james.hogan@imgtec.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Helge Deller <deller@gmx.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16include/linux/kdev_t.h: remove new_valid_dev()Yaowei Bai1-5/+0
As all new_valid_dev() checks have been removed it's time to drop new_valid_dev() itself. No functional change. Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16fs/stat.c: drop the last new_valid_dev checkYaowei Bai1-1/+1
New_valid_dev() always returns true, so that's unnecessary to perform new_valid_dev() checks in some filesystems. Most checks of new_valid_dev() have been removed so let's drop this last one and then we can remove new_valid_dev() from the source code. No functional change. Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16include/linux/kernel.h: change abs() macro so it uses consistent return typeMichal Nazarewicz3-24/+23
Rewrite abs() so that its return type does not depend on the architecture and no unexpected type conversion happen inside of it. The only conversion is from unsigned to signed type. char is left as a return type but treated as a signed type regradless of it's actual signedness. With the old version, int arguments were promoted to long and depending on architecture a long argument might result in s64 or long return type (which may or may not be the same). This came after some back and forth with Nicolas. The current macro has different return type (for the same input type) depending on architecture which might be midly iritating. An alternative version would promote to int like so: #define abs(x) __abs_choose_expr(x, long long, \ __abs_choose_expr(x, long, \ __builtin_choose_expr( \ sizeof(x) <= sizeof(int), \ ({ int __x = (x); __x<0?-__x:__x; }), \ ((void)0)))) I have no preference but imagine Linus might. :] Nicolas argument against is that promoting to int causes iconsistent behaviour: int main(void) { unsigned short a = 0, b = 1, c = a - b; unsigned short d = abs(a - b); unsigned short e = abs(c); printf("%u %u\n", d, e); // prints: 1 65535 } Then again, no sane person expects consistent behaviour from C integer arithmetic. ;) Note: __builtin_types_compatible_p(unsigned char, char) is always false, and __builtin_types_compatible_p(signed char, char) is also always false. Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Reviewed-by: Nicolas Pitre <nico@linaro.org> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-16include/linux/poison.h: use POISON_POINTER_DELTA for poison pointersVasily Kulikov1-2/+2
TIMER_ENTRY_STATIC and TAIL_MAPPING are defined as poison pointers which should point to nowhere. Redefine them using POISON_POINTER_DELTA arithmetics to make sure they really point to non-mappable area declared by the target architecture. Signed-off-by: Vasily Kulikov <segoon@openwall.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Solar Designer <solar@openwall.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15memcg: only free spare array when readers are doneMartijn Coenen1-5/+6
A spare array holding mem cgroup threshold events is kept around to make sure we can always safely deregister an event and have an array to store the new set of events in. In the scenario where we're going from 1 to 0 registered events, the pointer to the primary array containing 1 event is copied to the spare slot, and then the spare slot is freed because no events are left. However, it is freed before calling synchronize_rcu(), which means readers may still be accessing threshold->primary after it is freed. Fixed by only freeing after synchronize_rcu(). Signed-off-by: Martijn Coenen <maco@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Vladimir Davydov <vdavydov@virtuozzo.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm: soft-offline: exit with failure for non anonymous thpNaoya Horiguchi1-8/+8
Currently memory_failure() doesn't handle non anonymous thp case, because we can hardly expect the error handling to be successful, and it can just hit some corner case which results in BUG_ON or something severe like that. This is also the case for soft offline code, so let's make it in the same way. Orignal code has a MF_COUNT_INCREASED check before put_hwpoison_page(), but it's unnecessary because get_any_page() is already called when running on this code, which takes a refcount of the target page regardress of the flag. So this patch also removes it. [akpm@linux-foundation.org: fix build] Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm: soft-offline: clean up soft_offline_page()Naoya Horiguchi1-31/+47
soft_offline_page() has some deeply indented code, that's the sign of demand for cleanup. So let's do this. No functionality change. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm/hugetlbfs: unmap pages if page fault raced with hole punchMike Kravetz1-69/+75
Page faults can race with fallocate hole punch. If a page fault happens between the unmap and remove operations, the page is not removed and remains within the hole. This is not the desired behavior. The race is difficult to detect in user level code as even in the non-race case, a page within the hole could be faulted back in before fallocate returns. If userfaultfd is expanded to support hugetlbfs in the future, this race will be easier to observe. If this race is detected and a page is mapped, the remove operation (remove_inode_hugepages) will unmap the page before removing. The unmap within remove_inode_hugepages occurs with the hugetlb_fault_mutex held so that no other faults will be processed until the page is removed. The (unmodified) routine hugetlb_vmdelete_list was moved ahead of remove_inode_hugepages to satisfy the new reference. [akpm@linux-foundation.org: move hugetlb_vmdelete_list()] Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15fs/hugetlbfs/inode.c: fix bugs in hugetlb_vmtruncate_list()Mike Kravetz1-8/+11
Hillf Danton noticed bugs in the hugetlb_vmtruncate_list routine. The argument end is of type pgoff_t. It was being converted to a vaddr offset and passed to unmap_hugepage_range. However, end was also being used as an argument to the vma_interval_tree_foreach controlling loop. In addition, the conversion of end to vaddr offset was incorrect. hugetlb_vmtruncate_list is called as part of a file truncate or fallocate hole punch operation. When truncating a hugetlbfs file, this bug could prevent some pages from being unmapped. This is possible if there are multiple vmas mapping the file, and there is a sufficiently sized hole between the mappings. The size of the hole between two vmas (A,B) must be such that the starting virtual address of B is greater than (ending virtual address of A << PAGE_SHIFT). In this case, the pages in B would not be unmapped. If pages are not properly unmapped during truncate, the following BUG is hit: kernel BUG at fs/hugetlbfs/inode.c:428! In the fallocate hole punch case, this bug could prevent pages from being unmapped as in the truncate case. However, for hole punch the result is that unmapped pages will not be removed during the operation. For hole punch, it is also possible that more pages than desired will be unmapped. This unnecessary unmapping will cause page faults to reestablish the mappings on subsequent page access. Fixes: 1bfad99ab (" hugetlbfs: hugetlb_vmtruncate_list() needs to take a range")Reported-by: Hillf Danton <hillf.zj@alibaba-inc.com> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: <stable@vger.kernel.org> [4.3] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm: make swapoff more robust against soft dirtyHugh Dickins1-14/+4
Both s390 and powerpc have hit the issue of swapoff hanging, when CONFIG_HAVE_ARCH_SOFT_DIRTY and CONFIG_MEM_SOFT_DIRTY ifdefs were not quite as x86_64 had them. I think it would be much clearer if HAVE_ARCH_SOFT_DIRTY was just a Kconfig option set by architectures to determine whether the MEM_SOFT_DIRTY option should be offered, and the actual code depend upon CONFIG_MEM_SOFT_DIRTY alone. But won't embark on that change myself: instead make swapoff more robust, by using pte_swp_clear_soft_dirty() on each pte it encounters, without an explicit #ifdef CONFIG_MEM_SOFT_DIRTY. That being a no-op, whether the bit in question is defined as 0 or the asm-generic fallback is used, unless soft dirty is fully turned on. Why "maybe" in maybe_same_pte()? Rename it pte_same_as_swp(). Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm: fix locking order in mm_take_all_locks()Kirill A. Shutemov3-21/+37
Dmitry Vyukov has reported[1] possible deadlock (triggered by his syzkaller fuzzer): Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&hugetlbfs_i_mmap_rwsem_key); lock(&mapping->i_mmap_rwsem); lock(&hugetlbfs_i_mmap_rwsem_key); lock(&mapping->i_mmap_rwsem); Both traces points to mm_take_all_locks() as a source of the problem. It doesn't take care about ordering or hugetlbfs_i_mmap_rwsem_key (aka mapping->i_mmap_rwsem for hugetlb mapping) vs. i_mmap_rwsem. huge_pmd_share() does memory allocation under hugetlbfs_i_mmap_rwsem_key and allocator can take i_mmap_rwsem if it hit reclaim. So we need to take i_mmap_rwsem from all hugetlb VMAs before taking i_mmap_rwsem from rest of VMAs. The patch also documents locking order for hugetlbfs_i_mmap_rwsem_key. [1] http://lkml.kernel.org/r/CACT4Y+Zu95tBs-0EvdiAKzUOsb4tczRRfCRTpLr4bg_OP9HuVg@mail.gmail.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Michal Hocko <mhocko@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm: mempolicy: skip non-migratable VMAs when setting MPOL_MF_LAZYLiang Chen1-1/+2
MPOL_MF_LAZY is not visible from userspace since a720094ded8c ("mm: mempolicy: Hide MPOL_NOOP and MPOL_MF_LAZY from userspace for now"), but it should still skip non-migratable VMAs such as VM_IO, VM_PFNMAP, and VM_HUGETLB VMAs, and avoid useless overhead of minor faults. Signed-off-by: Liang Chen <liangchen.linux@gmail.com> Signed-off-by: Gavin Guo <gavin.guo@canonical.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Andi Kleen <andi@firstfloor.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm/page_alloc.c: remove unused struct zone *z variableAlexander Kuleshov1-2/+0
Remove unused struct zone *z variable which appeared in 86051ca5eaf5 ("mm: fix usemap initialization"). Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm/mlock.c: change can_do_mlock return value type to booleanWang Xiaoqiang2-5/+5
Since can_do_mlock only return 1 or 0, so make it boolean. No functional change. [akpm@linux-foundation.org: update declaration in mm.h] Signed-off-by: Wang Xiaoqiang <wangxq10@lzu.edu.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm/vmalloc.c: use macro IS_ALIGNED to judge the aligmentWang Xiaoqiang1-2/+2
Just cleanup, no functional change. Signed-off-by: Wang Xiaoqiang <wangxq10@lzu.edu.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15cgroup, memcg, writeback: drop spurious rcu locking around mem_cgroup_css_from_page()Tejun Heo2-5/+0
In earlier versions, mem_cgroup_css_from_page() could return non-root css on a legacy hierarchy which can go away and required rcu locking; however, the eventual version simply returns the root cgroup if memcg is on a legacy hierarchy and thus doesn't need rcu locking around or in it. Remove spurious rcu lockings. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vladimir Davydov <vdavydov@virtuozzo.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm/page_isolation: do some cleanup in "undo_isolate_page_range"Wang Xiaoqiang1-2/+4
Use "IS_ALIGNED" to judge the alignment, rather than directly judging. Signed-off-by: Wang Xiaoqiang <wang_xiaoq@126.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15memblock: fix section mismatchKirill A. Shutemov1-9/+9
allmodconfig produces following warning for me: WARNING: vmlinux.o(.text.unlikely+0x10314): Section mismatch in reference from the function movable_node_is_enabled() to the variable .meminit.data:movable_node_enabled The function movable_node_is_enabled() references the variable __meminitdata movable_node_enabled. This is often because movable_node_is_enabled lacks a __meminitdata annotation or the annotation of movable_node_enabled is wrong. Let's mark the function with __meminit. It fixes the warning. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15s390/mm: enable fixup_user_fault retryingDominik Dingel1-3/+26
By passing a non-null flag we allow fixup_user_fault to retry, which enables userfaultfd. As during these retries we might drop the mmap_sem we need to check if that happened and redo the complete chain of actions. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "Jason J. Herne" <jjherne@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: Eric B Munson <emunson@akamai.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm: bring in additional flag for fixup_user_fault to signal unlockDominik Dingel4-11/+34
During Jason's work with postcopy migration support for s390 a problem regarding gmap faults was discovered. The gmap code will call fixup_user_fault which will end up always in handle_mm_fault. Till now we never cared about retries, but as the userfaultfd code kind of relies on it. this needs some fix. This patchset does not take care of the futex code. I will now look closer at this. This patch (of 2): With the introduction of userfaultfd, kvm on s390 needs fixup_user_fault to pass in FAULT_FLAG_ALLOW_RETRY and give feedback if during the faulting we ever unlocked mmap_sem. This patch brings in the logic to handle retries as well as it cleans up the current documentation. fixup_user_fault was not having the same semantics as filemap_fault. It never indicated if a retry happened and so a caller wasn't able to handle that case. So we now changed the behaviour to always retry a locked mmap_sem. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: "Jason J. Herne" <jjherne@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: Eric B Munson <emunson@akamai.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15dax: re-enable dax pmd mappingsDan Williams2-7/+4
Now that the get_user_pages() path knows how to handle dax-pmd mappings, remove the protections that disabled dax-pmd support. Tests available from github.com/pmem/ndctl: make TESTS="lib/test-dax.sh lib/test-mmap.sh" check Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15dax: provide diagnostics for pmd mapping failuresDan Williams1-8/+57
There is a wide gamut of conditions that can trigger the dax pmd path to fallback to pte mappings. Ideally we'd have a syscall interface to determine mapping characteristics after the fact. In the meantime provide debug messages. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Suggested-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm, x86: get_user_pages() for dax mappingsDan Williams8-39/+212
A dax mapping establishes a pte with _PAGE_DEVMAP set when the driver has established a devm_memremap_pages() mapping, i.e. when the pfn_t return from ->direct_access() has PFN_DEV and PFN_MAP set. Later, when encountering _PAGE_DEVMAP during a page table walk we lookup and pin a struct dev_pagemap instance to keep the result of pfn_to_page() valid until put_page(). Signed-off-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Logan Gunthorpe <logang@deltatee.com> Cc: Dave Hansen <dave@sr71.net> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm, dax: dax-pmd vs thp-pmd vs hugetlbfs-pmdDan Williams7-27/+47
A dax-huge-page mapping while it uses some thp helpers is ultimately not a transparent huge page. The distinction is especially important in the get_user_pages() path. pmd_devmap() is used to distinguish dax-pmds from pmd_huge() and pmd_trans_huge() which have slightly different semantics. Explicitly mark the pmd_trans_huge() helpers that dax needs by adding pmd_devmap() checks. [kirill.shutemov@linux.intel.com: fix regression in handling mlocked pages in __split_huge_pmd()] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave@sr71.net> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15mm, dax, pmem: introduce {get|put}_dev_pagemap() for dax-gupDan Williams6-8/+125
get_dev_page() enables paths like get_user_pages() to pin a dynamically mapped pfn-range (devm_memremap_pages()) while the resulting struct page objects are in use. Unlike get_page() it may fail if the device is, or is in the process of being, disabled. While the initial lookup of the range may be an expensive list walk, the result is cached to speed up subsequent lookups which are likely to be in the same mapped range. devm_memremap_pages() now requires a reference counter to be specified at init time. For pmem this means moving request_queue allocation into pmem_alloc() so the existing queue usage counter can track "device pages". ZONE_DEVICE pages always have an elevated count and will never be on an lru reclaim list. That space in 'struct page' can be redirected for other uses, but for safety introduce a poison value that will always trip __list_add() to assert. This allows half of the struct list_head storage to be reclaimed with some assurance to back up the assumption that the page count never goes to zero and a list_add() is never attempted. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Logan Gunthorpe <logang@deltatee.com> Cc: Dave Hansen <dave@sr71.net> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>