wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2017-06-27	tools/testing/selftests/sysctl: Add pre-check to the value of writes_strict	Orson Zhai	3	-2/+24
	The sysctl test fails on some items if the value of /proc/sys/kernel/sysctl_writes_strict is 0, which is the default in kernels older than v4.5. Make this test more robust and compatible with older kernels by checking and updating the writes_strict value, and restoring it when the test is done. Signed-off-by: Orson Zhai <orson.zhai@linaro.org> Reviewed-by: Sumit Semwal <sumit.semwal@linaro.org> Tested-by: Sumit Semwal <sumit.semwal@linaro.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-26	kselftest.rst: do some adjustments after ReST conversion	Mauro Carvalho Chehab	1	-16/+16
	Do some minor adjustments after ReST conversion:
	- On most documents, we prepend a "$ " before command line arguments;
	- Prefer to use :: on the preceding line;
	- Split a multi-paragraph description as such.
	Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Acked-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-23	selftest/net/Makefile: Specify output with $(OUTPUT)	SeongJae Park	1	-2/+1
	After commit a8ba798bc8ec ("selftests: enable O and KBUILD_OUTPUT"), the net selftest build fails because it still specifies its output file without $(OUTPUT). This commit fixes the error. Signed-off-by: SeongJae Park <sj38.park@gmail.com> Fixes: a8ba798bc8ec ("selftests: enable O and KBUILD_OUTPUT") Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-23	selftest/intel_pstate/aperf: Use LDLIBS instead of LDFLAGS	SeongJae Park	1	-1/+1
	Build of aperf fails as below:
	```
	gcc -Wall -D_GNU_SOURCE -lm aperf.c -o /tools/testing/selftests/intel_pstate/aperf
	/tmp/ccKf3GF6.o: In function `main':
	aperf.c:(.text+0x278): undefined reference to `sqrt'
	collect2: error: ld returned 1 exit status
	```
	The failure occurs because -lm was defined in LDFLAGS and the implicit rule of make places LDFLAGS before the source file. This commit fixes the problem by using LDLIBS instead of LDFLAGS. Signed-off-by: SeongJae Park <sj38.park@gmail.com> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
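The ordering problem can be illustrated with a small sketch. It assumes the usual shape of GNU make's built-in rule for linking a single C source, in which LDFLAGS come before the source file and LDLIBS after it. The helper names `append_word` and `compose_link_cmd` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append one word to buf, separated by a single space; skip empty words. */
static void append_word(char *buf, size_t len, const char *word)
{
	if (word[0] == '\0')
		return;
	if (buf[0] != '\0')
		strncat(buf, " ", len - strlen(buf) - 1);
	strncat(buf, word, len - strlen(buf) - 1);
}

/*
 * Hypothetical helper: compose a link command in the order the implicit
 * rule is assumed to use: cc LDFLAGS source LDLIBS -o output.
 */
static void compose_link_cmd(char *buf, size_t len, const char *ldflags,
			     const char *src, const char *ldlibs,
			     const char *out)
{
	assert(buf && len > 0);
	buf[0] = '\0';
	append_word(buf, len, "cc");
	append_word(buf, len, ldflags);	/* placed before the source file */
	append_word(buf, len, src);
	append_word(buf, len, ldlibs);	/* placed after the source file */
	append_word(buf, len, "-o");
	append_word(buf, len, out);
}
```

With `-lm` passed as LDFLAGS the library lands ahead of `aperf.c`, matching the failing command above. Passed as LDLIBS it follows the source, where a linker that resolves symbols left to right can still satisfy the reference to `sqrt`.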
2017-06-23	selftest/memfd/Makefile: Fix build error	SeongJae Park	1	-1/+1
	Selftest for memfd shows build error as below:
	```
	gcc -D_FILE_OFFSET_BITS=64 -I../../../../include/uapi/ -I../../../../include/ -I../../../../usr/include/ fuse_mnt.c -o /home/sjpark/linux/tools/testing/selftests/memfd/fuse_mnt
	/tmp/cc6NHdwJ.o: In function `main':
	fuse_mnt.c:(.text+0x249): undefined reference to `fuse_main_real'
	collect2: error: ld returned 1 exit status
	```
	The build fails because the output file is specified without $(OUTPUT), and because LDFLAGS is used even though the build relies on make's implicit rule. This commit fixes the error by specifying the output file path with $(OUTPUT) and using LDLIBS instead of LDFLAGS. Signed-off-by: SeongJae Park <sj38.park@gmail.com> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-23	selftests: lib: Skip tests on missing test modules	Sumit Semwal	2	-0/+8
	With older kernels, printf.sh and bitmap.sh fail because they can't find the respective test modules they are looking for. Use modprobe dry run to check for missing test_XXX module. Error out with the same error code as prime_numbers.sh. Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-15	kselftest: membarrier: make test names more informative	Alice Ferrazzi	1	-2/+2
	Make membarrier test names more informative. Signed-off-by: Alice Ferrazzi <alice.ferrazzi@gmail.com> Signed-off-by: Paul Elder <paul.elder@pitt.edu> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-15	kselftest: make callers of ksft_exit_skip() output the reason for skipping	Paul Elder	3	-8/+6
	Make the three tests that used the old ksft_exit_skip() (breakpoints/breakpoint_test_arm64, breakpoints/step_after_suspend_test, and membarrier_test) use the new one, and output the reason for skipping all the tests. Signed-off-by: Paul Elder <paul.elder@pitt.edu> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-15	kselftest: make ksft_exit_skip() output a reason for skipping	Paul Elder	1	-2/+5
	Make ksft_exit_skip() accept an optional message string as the reason for skipping all the tests, and output it prior to exiting. Signed-off-by: Paul Elder <paul.elder@pitt.edu> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
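A minimal sketch of the idea, assuming the TAP convention of announcing a skipped run with a `1..0 # Skipped: reason` plan line. `format_skip_line` and `SKETCH_KSFT_SKIP` are hypothetical names, and the exact text the kernel helper prints may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_KSFT_SKIP 4	/* kselftest's exit code for a skipped test */

/*
 * Hypothetical stand-in for the skip path: format the plan line that says
 * no tests were run, appending the reason when one is given. Returns the
 * exit code the caller would pass to exit().
 */
static int format_skip_line(char *buf, size_t len, const char *reason)
{
	assert(buf && len > 0);
	if (reason)
		snprintf(buf, len, "1..0 # Skipped: %s", reason);
	else
		snprintf(buf, len, "1..0 # Skipped");
	return SKETCH_KSFT_SKIP;
}
```

Keeping the reason optional lets existing callers pass NULL and still produce a valid plan line.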
2017-06-14	kselftest: convert get_size to use stricter TAP13 format	Tim Bird	1	-11/+16
	1. Add the TAP13 header
	2. Remove variable data from the test description line
	3. Move the plan count to the end of the file, for consistency with other kselftests
	4. Convert memory data from diagnostic (comment) format to a YAML block
	Signed-off-by: Tim Bird <tim.bird@sony.com> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-13	kselftest: breakpoints: convert step_after_suspend_test to TAP13 output	Paul Elder	1	-26/+17
	Make the step_after_suspend test output in the TAP13 format by using the TAP13 output functions defined in kselftest.h Signed-off-by: Paul Elder <paul.elder@pitt.edu> Signed-off-by: Alice Ferrazzi <alice.ferrazzi@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-13	kselftest: breakpoints: convert breakpoint_test to TAP13 output	Paul Elder	1	-14/+15
	Make the breakpoints test output in the TAP13 format by using the TAP13 output functions defined in kselftest.h Signed-off-by: Paul Elder <paul.elder@pitt.edu> Signed-off-by: Alice Ferrazzi <alice.ferrazzi@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-13	kselftest: membarrier: convert to TAP13 output	Paul Elder	1	-16/+19
	Make the membarrier test output in the TAP13 format by using the TAP13 output functions defined in kselftest.h Signed-off-by: Paul Elder <paul.elder@pitt.edu> Signed-off-by: Alice Ferrazzi <alice.ferrazzi@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-13	kselftest: add TAP13 conformant versions of ksft_* functions	Paul Elder	1	-4/+48
	Add TAP13 conformant output functions to kselftest.h. Also add exit functions that output TAP13 exiting text, as well as functions to keep track of testing progress. Signed-off-by: Paul Elder <paul.elder@pitt.edu> Signed-off-by: Alice Ferrazzi <alice.ferrazzi@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
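As a rough illustration of what result tracking plus TAP13 output involves, the sketch below keeps pass/fail counters and formats `ok N description` and `not ok N description` lines, with the `1..N` plan line produced at the end. The type and function names are hypothetical and are not the ones added to kselftest.h.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical progress counters for one test program. */
struct tap_counts {
	unsigned int pass;
	unsigned int fail;
};

/* Record one result and format its TAP13 line into buf. */
static void tap_result(struct tap_counts *c, int ok, const char *desc,
		       char *buf, size_t len)
{
	if (ok)
		c->pass++;
	else
		c->fail++;
	/* The test number is the count of results recorded so far. */
	snprintf(buf, len, "%s %u %s", ok ? "ok" : "not ok",
		 c->pass + c->fail, desc);
}

/* Format the plan line, which states how many tests were run. */
static void tap_plan(const struct tap_counts *c, char *buf, size_t len)
{
	snprintf(buf, len, "1..%u", c->pass + c->fail);
}
```

A real TAP13 stream would also start with a `TAP version 13` header line, which this sketch leaves to the caller.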
2017-06-12	selftests: kselftest_harness: Fix compile warning	Mickaël Salaün	1	-1/+1
	Do not confuse the compiler with a semicolon preceding a block. Replace the semicolon with an empty block to avoid a warning:
	gcc -Wl,-no-as-needed -Wall -lpthread seccomp_bpf.c -o /.../linux/tools/testing/selftests/seccomp/seccomp_bpf
	In file included from seccomp_bpf.c:40:0:
	seccomp_bpf.c: In function ‘change_syscall’:
	../kselftest_harness.h:558:2: warning: this ‘for’ clause does not guard... [-Wmisleading-indentation]
	  for (; _metadata->trigger; _metadata->trigger = __bail(_assert))
	  ^
	../kselftest_harness.h:574:14: note: in expansion of macro ‘OPTIONAL_HANDLER’
	  } while (0); OPTIONAL_HANDLER(_assert)
	               ^~~~~~~~~~~~~~~~
	../kselftest_harness.h:440:2: note: in expansion of macro ‘__EXPECT’
	  __EXPECT(expected, seen, ==, 0)
	  ^~~~~~~~
	seccomp_bpf.c:1313:2: note: in expansion of macro ‘EXPECT_EQ’
	  EXPECT_EQ(0, ret);
	  ^~~~~~~~~
	seccomp_bpf.c:1317:2: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the ‘for’
	  {
	  ^
	Signed-off-by: Mickaël Salaün <mic@digikod.net> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Kees Cook <keescook@chromium.org> Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Will Drewry <wad@chromium.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
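The pattern behind this warning can be sketched with a much smaller macro. `CHECK_EQ` below is hypothetical and only mimics the idea of a check that ends in a `for` header so that an optional handler block may follow it; it is not the harness's own macro.

```c
/*
 * Hypothetical check macro: expands to a 'for' header whose body runs once
 * when the values differ, so the caller can attach an optional handler.
 */
#define CHECK_EQ(a, b) \
	for (int fail_ = ((a) != (b)); fail_; fail_ = 0)

/* Count how many checks with a handler failed for the given value. */
static int count_failures(int x)
{
	int failures = 0;

	/* No handler wanted: use an empty block rather than a bare ';'. */
	CHECK_EQ(x, 1) {}

	/* Handler block: runs only when x != 2. */
	CHECK_EQ(x, 2) {
		failures++;
	}
	return failures;
}
```

Written with a trailing `;` instead of `{}`, the empty statement becomes the loop body, and a brace block on the following lines runs unconditionally even though it looks guarded. That is the situation -Wmisleading-indentation reports.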
2017-06-12	kselftest: MAINTAINERS git tree entry update files and dirs	Shuah Khan	1	-1/+2
	Add missing trailing slash to tools/testing/selftests to cover all files and directories below. Add kselftest documentation files. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-12	ksefltest: MAINTAINERS git tree entry is incorrect	Greg Kroah-Hartman	1	-1/+1
	There are a few more subdirectories needed in the git tree path for the linux-kselftest url in order to be able to properly clone it. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-07	selftests/ftrace: Return unsupported if it detects older kernel	Masami Hiramatsu	2	-0/+18
	Return unsupported if the kernel is too old to support instance independent ftrace filter for some testcases. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-07	selftests/ftrace: Use top-level available_filter_function	Masami Hiramatsu	1	-0/+4
	Use top-level available_filter_function if the test case is running under an instance. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-07	selftests/ftrace: Add instance indication in test log	Masami Hiramatsu	1	-1/+1
	Add instance test indication in test log too. Currently ftracetest shows the instance test indication in the list of tests, but not in the log for each test. This adds the instance test indication at the top of each log, like below:
	execute (instance) : /ftrace/test.d/ftrace/func_set_ftrace_file.tc
	Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-07	selftests/ftrace: Reset ftrace filter on older kernel	Masami Hiramatsu	1	-1/+4
	Since older kernels didn't support a separate set_ftrace_filter per instance, a filter set by a test case in an instance propagates to the top-level instance. This means that the filter setting remains even if we remove the instance, and will cause other tests to fail. To avoid this issue, reset the ftrace filter if we detect the propagation. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-07	ftrace/kprobes: selftests: Check kretprobe maxactive is supported	Masami Hiramatsu	2	-1/+3
	Check the kretprobe maxactive is supported by kprobe_events interface. To ensure the kernel feature, this changes ftrace README to describe it. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-07	selftests/ftrace: Reduce trace buffer checking overhead	Masami Hiramatsu	1	-2/+6
	Currently event/toplevel-enable.tc checks the trace buffer by dumping all events while recording events. However, this makes the system very busy. To reduce the overhead that comes from reading and recording the trace buffer at the same time, use head instead of cat and stop tracing while reading. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-07	selftests/ftrace: Skip full-glob-matching filter test on older kernel	Masami Hiramatsu	1	-11/+17
	Skip a part of ftrace filter test related to full-glob matching if we are sure that the testing kernel is so old that it does not support full-glob-matching yet. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-07	selftests/seccomp: Force rebuild according to dependencies	Mickaël Salaün	1	-0/+2
	Rebuild the seccomp tests when kselftest_harness.h is updated. Signed-off-by: Mickaël Salaün <mic@digikod.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Shuah Khan <shuah@kernel.org> Cc: Will Drewry <wad@chromium.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-07	Documentation/dev-tools: Add kselftest_harness documentation	Mickaël Salaün	2	-85/+363
	Add ReST metadata to kselftest_harness.h to be able to include the comments in the Sphinx documentation. Signed-off-by: Mickaël Salaün <mic@digikod.net> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Will Drewry <wad@chromium.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-07	selftests: Remove the TEST_API() wrapper from kselftest_harness.h	Mickaël Salaün	1	-205/+150
	Remove the TEST_API() wrapper to expose the underlying macro arguments to the documentation tools. Signed-off-by: Mickaël Salaün <mic@digikod.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Will Drewry <wad@chromium.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-07	Documentation/dev-tools: Use reStructuredText markups for kselftest	Mickaël Salaün	2	-27/+41
	Include and convert kselftest to the Sphinx format. Signed-off-by: Mickaël Salaün <mic@digikod.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-07	Documentation/dev-tools: Add kselftest	Mickaël Salaün	2	-2/+0
	Move kselftest.txt to dev-tools/kselftest.rst . Signed-off-by: Mickaël Salaün <mic@digikod.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-07	selftests: Cosmetic renames in kselftest_harness.h	Mickaël Salaün	1	-5/+6
	Keep the content consistent with the new name. Signed-off-by: Mickaël Salaün <mic@digikod.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Shuah Khan <shuah@kernel.org> Cc: Will Drewry <wad@chromium.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-07	selftests: Make test_harness.h more generally available	Mickaël Salaün	3	-1/+2
	The seccomp/test_harness.h file contains useful helpers to build tests. Moving it to the selftest directory should benefit other test components. Keep seccomp maintainers for this file. Signed-off-by: Mickaël Salaün <mic@digikod.net> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Will Drewry <wad@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Shuah Khan <shuah@kernel.org> Link: https://lkml.kernel.org/r/CAGXu5j+8CVz8vL51DRYXqOY=xc3zuKFf=PTENe88XYHzFYidUQ@mail.gmail.com Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-07	selftests: sync: Skip the test if kernel support is not found	Michael Ellerman	1	-0/+13
	The "Sync framework" test doesn't work if the kernel has no support, obviously. Rather than reporting a failure, check for the kernel support by looking for /sys/kernel/debug/sync/sw_sync, and if not found skip the test. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.com> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
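The skip-instead-of-fail check described here might look roughly like the sketch below. It assumes a plain existence test with `access()`; the function name `check_support` and the `SKETCH_` constants are hypothetical, and the real test may probe for support differently.

```c
#include <unistd.h>

#define SKETCH_KSFT_PASS 0	/* kselftest's exit code for success */
#define SKETCH_KSFT_SKIP 4	/* kselftest's exit code for a skipped test */

/*
 * Return the skip code when the file that signals kernel support is
 * missing, so the caller can exit early instead of reporting a failure.
 */
static int check_support(const char *path)
{
	return access(path, F_OK) == 0 ? SKETCH_KSFT_PASS : SKETCH_KSFT_SKIP;
}
```

A test's main() would call `check_support("/sys/kernel/debug/sync/sw_sync")` first and exit with the returned code when it is not zero.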
2017-06-07	selftests/vm: Fix test for virtual address range mapping for arm64	Michal Suchanek	1	-9/+26
	Arm64 has 256TB address space so fix the test to pass on Arm as well. Also remove unneeded numaif header. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-07	selftests: futex: print testcase-name and PASS/FAIL/ERROR status	Naresh Kamboju	8	-10/+19
	Most of the tests under selftests follow a pattern for their results, which can then be parsed easily by external tools. Though the futex tests do print their results very well, they don't really follow the general selftests pattern. This patch makes the necessary changes to fix that.
	Output before this patch:
	futex_requeue_pi: Test requeue functionality
	Arguments: broadcast=0 locked=0 owner=0 timeout=0ns
	Result: PASS
	Output after this patch:
	futex_requeue_pi: Test requeue functionality
	Arguments: broadcast=0 locked=0 owner=0 timeout=0ns
	selftests: futex-requeue-pi [PASS]
	Signed-off-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2017-06-04	Linux 4.12-rc4	Linus Torvalds	1	-1/+1

2017-06-04	fs/ufs: Set UFS default maximum bytes per file	Richard Narron	1	-3/+2
	This fixes a problem with reading files larger than 2GB from a UFS-2 file system: https://bugzilla.kernel.org/show_bug.cgi?id=195721 The incorrect UFS s_maxsize limit became a problem as of commit c2a9737f45e2 ("vfs,mm: fix a dead loop in truncate_inode_pages_range()") which started using s_maxbytes to avoid a page index overflow in do_generic_file_read(). That caused files to be truncated on UFS-2 file systems because the default maximum file size is 2GB (MAX_NON_LFS) and UFS didn't update it. Here I simply increase the default to a common value used by other file systems. Signed-off-by: Richard Narron <comet.berkeley@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Will B <will.brokenbourgh2877@gmail.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: <stable@vger.kernel.org> # v4.9 and backports of c2a9737f45e2 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
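The effect of a too-small limit can be shown with a toy model. It assumes the read path simply stops at the superblock's maximum size; `readable_bytes` is a hypothetical helper and does not reproduce the logic of do_generic_file_read().

```c
#include <stdint.h>

/* MAX_NON_LFS is 2^31 - 1, the largest offset without large-file support. */
#define SKETCH_MAX_NON_LFS ((((int64_t)1) << 31) - 1)

/*
 * Hypothetical model: the number of bytes a reader can obtain from a file
 * when reads are cut off at the filesystem's maximum file size.
 */
static int64_t readable_bytes(int64_t file_size, int64_t s_maxbytes)
{
	return file_size < s_maxbytes ? file_size : s_maxbytes;
}
```

Under this model a 3 GiB file on a filesystem left at the 2 GB default appears truncated at 2147483647 bytes, while a larger limit returns the whole file.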
2017-06-04	Revert "tty: fix port buffer locking"	Greg Kroah-Hartman	1	-2/+0
	This reverts commit 925bb1ce47f429f69aad35876df7ecd8c53deb7e. It causes lots of warnings and problems so for now, let's just revert it. Reported-by: <valdis.kletnieks@vt.edu> Reported-by: Russell King <linux@armlinux.org.uk> Reported-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Reported-by: Jiri Slaby <jslaby@suse.cz> Reported-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-03	nfs: Mark unnecessarily extern functions as static	Jan Kara	2	-4/+3
	nfs_initialise_sb() and nfs_clone_super() are declared as extern even though they are used only in fs/nfs/super.c. Mark them as static. Also remove the explicit 'inline' directive from nfs_initialise_sb() and leave it up to the compiler to decide whether inlining is worth it. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-06-03	hwmon: (aspeed-pwm-tacho) make fan/pwm names start with index 1	Stefan Schaeckeler	1	-26/+26
	Make fan and pwm names in sysfs start with index 1 in accordance with Documentation/hwmon/sysfs-interface conventions. The current implementation starts with index 0, making tools such as sensors(1) skip the first fan. Signed-off-by: Stefan Schaeckeler <sschaeck@cisco.com> Fixes: 2d7a548a3eff ("drivers: hwmon: Support for ASPEED PWM/Fan tach") Signed-off-by: Guenter Roeck <linux@roeck-us.net>
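The naming convention amounts to an off-by-one adjustment between the driver's 0-based channel number and the 1-based sysfs name. A minimal sketch, with the hypothetical helper `fan_input_name`:

```c
#include <stddef.h>
#include <stdio.h>

/* Build a 1-based sysfs attribute name from a 0-based channel index. */
static void fan_input_name(char *buf, size_t len, unsigned int channel)
{
	snprintf(buf, len, "fan%u_input", channel + 1);
}
```

Channel 0 then shows up as `fan1_input`, which is the first name that tools following the hwmon convention look for.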
2017-06-03	hwmon: (aspeed-pwm-tacho) Call of_node_put() on a node not claimed	Stefan Schaeckeler	1	-1/+0
	Call of_node_put() on a node claimed with of_node_get() or by any other means such as for_each_child_of_node(). Signed-off-by: Stefan Schaeckeler <sschaeck@cisco.com> Fixes: 2d7a548a3eff ("drivers: hwmon: Support for ASPEED PWM/Fan tach") Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-06-02	Input: axp20x-pek - switch to acpi_dev_present and check for ACPI0011 too	Hans de Goede	1	-2/+3
	acpi_dev_found checks that there is a matching ACPI node, but it may be disabled (_STA method returns 0), in which case the soc_button_array driver will not bind to it and axp20x-pek should handle the power-button. This commit switches from acpi_dev_found to acpi_dev_present to avoid not registering an input-dev for the powerbutton when there is a disabled PNP0C40 device. The ACPI-6.0 standard defines a standard gpio button device using the ACPI0011 HID, replacing the custom PNP0C40 gpio device. Many newer devices define both PNP0C40 and ACPI0011 devices, enabling one or the other depending on whether the BIOS thinks it is going to boot Android or Windows. This commit adds a check for the ACPI0011 device, so that if either device is present and enabled we don't register an input-dev for the powerbutton. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-06-02	Input: axp20x-pek - only check for "INTCFD9" ACPI device on Cherry Trail	Hans de Goede	1	-7/+36
	Commit 9b13a4ca8d2c ("Input: axp20x-pek - do not register input device on some systems") added a check for the INTCFD9 ACPI device, which also handles the powerbutton, because on some systems the powerbutton is connected to both the PMIC, handled by axp20x-pek, and to a gpio on the SoC, handled by soc_button_array, which attaches itself to the INTCFD9 ACPI device. Testing and comparing DSDTs has shown that this only happens on Cherry Trail devices with an AXP288 PMIC. The AXP288 PMIC is also used on Bay Trail devices, but there the power button is only connected to the PMIC and not handled by soc_button_array. This means that the INTCFD9 check has caused a regression on Bay Trail devices, causing power-button presses to no longer be seen. This commit fixes this by limiting the check to devices where the ACPI node for the AXP288 contains a _HRV (hardware revision) attribute with a value of 3, which indicates we are dealing with a Cherry Trail platform. Fixes: 9b13a4ca8d2c ("Input: axp20x-pek - do not register input ...") Reported-by: Сергей Трусов <t.rus76@ya.ru> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-06-02	scripts/gdb: make lx-dmesg command work (reliably)	André Draszik	1	-4/+5
	lx-dmesg needs access to the log_buf symbol from printk.c. Unfortunately, the symbol log_buf also exists in BPF's verifier.c and hence gdb can pick one or the other. If it happens to pick BPF's log_buf, lx-dmesg doesn't work:
	(gdb) lx-dmesg
	Python Exception <class 'gdb.MemoryError'> Cannot access memory at address 0x0:
	Error occurred in Python command: Cannot access memory at address 0x0
	(gdb) p log_buf
	$15 = 0x0
	Luckily, GDB has a way to deal with this, see https://sourceware.org/gdb/onlinedocs/gdb/Symbols.html
	(gdb) info variables ^log_buf$
	All variables matching regular expression "^log_buf$":
	File <linux.git>/kernel/bpf/verifier.c:
	static char log_buf;
	File <linux.git>/kernel/printk/printk.c:
	static char log_buf;
	(gdb) p 'verifier.c'::log_buf
	$1 = 0x0
	(gdb) p 'printk.c'::log_buf
	$2 = 0x811a6aa0 <__log_buf> ""
	(gdb) p &log_buf
	$3 = (char ) 0x8120fe40 <log_buf>
	(gdb) p &'verifier.c'::log_buf
	$4 = (char ) 0x8120fe40 <log_buf>
	(gdb) p &'printk.c'::log_buf
	$5 = (char **) 0x8048b7d0 <log_buf>
	By being explicit about the location of the symbol, we can make lx-dmesg work again. While at it, do the same for the other symbols we need from printk.c.
	Link: http://lkml.kernel.org/r/20170526112222.3414-1-git@andred.net Signed-off-by: André Draszik <git@andred.net> Tested-by: Kieran Bingham <kieran@bingham.xyz> Acked-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-02	mm: consider memblock reservations for deferred memory initialization sizing	Michal Hocko	4	-11/+54
	We have seen an early OOM killer invocation on ppc64 systems with crashkernel=4096M:
	kthreadd invoked oom-killer: gfp_mask=0x16040c0(GFP_KERNEL|__GFP_COMP|__GFP_NOTRACK), nodemask=7, order=0, oom_score_adj=0
	kthreadd cpuset=/ mems_allowed=7
	CPU: 0 PID: 2 Comm: kthreadd Not tainted 4.4.68-1.gd7fe927-default #1
	Call Trace:
	  dump_stack+0xb0/0xf0 (unreliable)
	  dump_header+0xb0/0x258
	  out_of_memory+0x5f0/0x640
	  __alloc_pages_nodemask+0xa8c/0xc80
	  kmem_getpages+0x84/0x1a0
	  fallback_alloc+0x2a4/0x320
	  kmem_cache_alloc_node+0xc0/0x2e0
	  copy_process.isra.25+0x260/0x1b30
	  _do_fork+0x94/0x470
	  kernel_thread+0x48/0x60
	  kthreadd+0x264/0x330
	  ret_from_kernel_thread+0x5c/0xa4
	Mem-Info:
	active_anon:0 inactive_anon:0 isolated_anon:0
	 active_file:0 inactive_file:0 isolated_file:0
	 unevictable:0 dirty:0 writeback:0 unstable:0
	 slab_reclaimable:5 slab_unreclaimable:73
	 mapped:0 shmem:0 pagetables:0 bounce:0
	 free:0 free_pcp:0 free_cma:0
	Node 7 DMA free:0kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:52428800kB managed:110016kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:320kB slab_unreclaimable:4672kB kernel_stack:1152kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
	lowmem_reserve[]: 0 0 0 0
	Node 7 DMA: 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 0kB
	0 total pagecache pages
	0 pages in swap cache
	Swap cache stats: add 0, delete 0, find 0/0
	Free swap = 0kB
	Total swap = 0kB
	819200 pages RAM
	0 pages HighMem/MovableOnly
	817481 pages reserved
	0 pages cma reserved
	0 pages hwpoisoned
	The reason is that the managed memory is too low (only 110MB) while the rest of the 50GB is still waiting for the deferred initialization to be done. update_defer_init estimates the initial memory to initialize to be at least 2GB, but it doesn't consider any memory allocated in that range. In this particular case we've had
	Reserving 4096MB of memory at 128MB for crashkernel (System RAM: 51200MB)
	so the low 2GB is mostly depleted. Fix this by considering memblock allocations in the initial static initialization estimation. Move the max_initialise to reset_deferred_meminit and implement a simple memblock_reserved_memory helper which iterates all reserved blocks and sums the size of all that start below the given address. The cumulative size is then added on top of the initial estimation. This is still not ideal because reset_deferred_meminit doesn't consider holes, and so a reservation might be above the initial estimation, which we ignore, but let's keep the logic simple until we really need to handle more complicated cases.
	Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set") Link: http://lkml.kernel.org/r/20170531104010.GI27783@dhcp22.suse.cz Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Mel Gorman <mgorman@suse.de> Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> [4.2+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
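The helper described above, which sums the sizes of all reserved blocks that start below a given address, can be sketched in plain C. The `struct region` type and the name `reserved_below` are hypothetical stand-ins; the kernel's memblock types and iteration macros are not used here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one reserved memory region. */
struct region {
	uint64_t base;	/* start address */
	uint64_t size;	/* length in the same unit as base */
};

/* Sum the sizes of all reserved regions that start below limit. */
static uint64_t reserved_below(const struct region *r, size_t n,
			       uint64_t limit)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++) {
		if (r[i].base < limit)
			total += r[i].size;
	}
	return total;
}
```

Using megabytes as the unit, a 4096MB crashkernel reservation starting at 128MB begins below the 2048MB estimate, so the sum is 4096 and the adjusted estimate becomes 2048 + 4096 = 6144MB. As the commit message notes, a region is counted in full even when part of it lies above the limit.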
2017-06-02	mm/hugetlb: report -EHWPOISON not -EFAULT when FOLL_HWPOISON is specified	James Morse	3	-12/+24
	KVM uses get_user_pages() to resolve its stage2 faults. KVM sets the FOLL_HWPOISON flag causing faultin_page() to return -EHWPOISON when it finds a VM_FAULT_HWPOISON. KVM handles these hwpoison pages as a special case. (check_user_page_hwpoison()) When huge pages are involved, this doesn't work so well. get_user_pages() calls follow_hugetlb_page(), which stops early if it receives VM_FAULT_HWPOISON from hugetlb_fault(), eventually returning -EFAULT to the caller. The step to map this to -EHWPOISON based on the FOLL_ flags is missing. The hwpoison special case is skipped, and -EFAULT is returned to user-space, causing Qemu or kvmtool to exit. Instead, move this VM_FAULT_ to errno mapping code into a header file and use it from faultin_page() and follow_hugetlb_page(). With this, KVM works as expected. This isn't a problem for arm64 today as we haven't enabled MEMORY_FAILURE, but I can't see any reason this doesn't happen on x86 too, so I think this should be a fix. This doesn't apply earlier than stable's v4.11.1 due to all sorts of cleanup. [james.morse@arm.com: add vm_fault_to_errno() call to faultin_page()] suggested. Link: http://lkml.kernel.org/r/20170525171035.16359-1-james.morse@arm.com [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/20170524160900.28786-1-james.morse@arm.com Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> [4.11.1+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
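A userspace sketch of a fault-to-errno mapping of the kind described above. The flag values are placeholders chosen for this illustration and are not the kernel's; only the hwpoison branch is taken directly from the commit message, and the other branches are assumptions about what such a helper would also cover.

```c
#include <errno.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* generic Linux value; fallback for other hosts */
#endif

/* Placeholder flag values for illustration only. */
#define SK_VM_FAULT_OOM            0x01
#define SK_VM_FAULT_SIGBUS         0x02
#define SK_VM_FAULT_HWPOISON       0x04
#define SK_VM_FAULT_HWPOISON_LARGE 0x08
#define SK_VM_FAULT_SIGSEGV        0x10
#define SK_FOLL_HWPOISON           0x100

/* Map a fault result to a negative errno, honouring the caller's flags. */
static int sketch_vm_fault_to_errno(int vm_fault, int foll_flags)
{
	if (vm_fault & SK_VM_FAULT_OOM)
		return -ENOMEM;
	if (vm_fault & (SK_VM_FAULT_HWPOISON | SK_VM_FAULT_HWPOISON_LARGE))
		return (foll_flags & SK_FOLL_HWPOISON) ? -EHWPOISON : -EFAULT;
	if (vm_fault & (SK_VM_FAULT_SIGBUS | SK_VM_FAULT_SIGSEGV))
		return -EFAULT;
	return 0;
}
```

Sharing one such helper between the normal and the huge page paths is what keeps the two from disagreeing about hwpoison faults.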
2017-06-02	mlock: fix mlock count can not decrease in race condition	Yisheng Xie	1	-2/+3
	Kefeng reported that when running the following test, the mlock count in meminfo will increase permanently:
	[1] testcase
	linux:~ # cat test_mlockal
	grep Mlocked /proc/meminfo
	for j in `seq 0 10`
	do
		for i in `seq 4 15`
		do
			./p_mlockall >> log &
		done
		sleep 0.2
	done
	# wait some time to let mlock counter decrease and 5s may not enough
	sleep 5
	grep Mlocked /proc/meminfo
	linux:~ # cat p_mlockall.c
	#include <sys/mman.h>
	#include <stdlib.h>
	#include <stdio.h>
	#define SPACE_LEN 4096
	int main(int argc, char ** argv)
	{
		int ret;
		void *adr = malloc(SPACE_LEN);
		if (!adr)
			return -1;
		ret = mlockall(MCL_CURRENT | MCL_FUTURE);
		printf("mlcokall ret = %d\n", ret);
		ret = munlockall();
		printf("munlcokall ret = %d\n", ret);
		free(adr);
		return 0;
	}
	In __munlock_pagevec() we should decrement NR_MLOCK for each page where we clear the PageMlocked flag. Commit 1ebb7cc6a583 ("mm: munlock: batch NR_MLOCK zone state updates") has introduced a bug where we don't decrement NR_MLOCK for pages where we clear the flag, but fail to isolate them from the lru list (e.g. when the pages are on some other cpu's percpu pagevec). Since PageMlocked stays cleared, the NR_MLOCK accounting gets permanently disrupted by this. Fix it by counting the number of pages whose PageMlocked flag is cleared.
Fixes: 1ebb7cc6a583 (" mm: munlock: batch NR_MLOCK zone state updates") Link: http://lkml.kernel.org/r/1495678405-54569-1-git-send-email-xieyisheng1@huawei.com Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com> Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Joern Engel <joern@logfs.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michel Lespinasse <walken@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: zhongjiang <zhongjiang@huawei.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
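The accounting rule in this fix can be modelled in a few lines. The sketch uses a hypothetical `struct page_model` with two booleans in place of real page flags and LRU state, so it only illustrates which quantity the counter delta should follow.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page model with only the two bits this sketch needs. */
struct page_model {
	bool mlocked;		/* stands in for the PageMlocked flag */
	bool isolatable;	/* false models a page on another CPU's pagevec */
};

/*
 * Clear the mlocked flag on every page in the batch and return the delta
 * for the NR_MLOCK counter. The delta follows the number of flags cleared,
 * not the number of pages successfully isolated.
 */
static long munlock_batch(struct page_model *pages, size_t n,
			  size_t *isolated)
{
	long delta = 0;

	*isolated = 0;
	for (size_t i = 0; i < n; i++) {
		if (!pages[i].mlocked)
			continue;
		pages[i].mlocked = false;
		delta--;
		if (pages[i].isolatable)
			(*isolated)++;
	}
	return delta;
}
```

Deriving the delta from the isolated count instead would leave the counter one too high for every page that was cleared but could not be isolated, which matches the permanent drift reported above.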
2017-06-02	mm/migrate: fix refcount handling when !hugepage_migration_supported()	Punit Agrawal	1	-6/+2
	On failing to migrate a page, soft_offline_huge_page() performs the necessary update to the hugepage ref-count. But when !hugepage_migration_supported(), unmap_and_move_hugepage() also decrements the page ref-count for the hugepage. The combined behaviour leaves the ref-count in an inconsistent state. This leads to soft lockups when running the overcommitted hugepage test from the mce-tests suite.
	Soft offlining pfn 0x83ed600 at process virtual address 0x400000000000
	soft offline: 0x83ed600: migration failed 1, type 1fffc00000008008 (uptodate|head)
	INFO: rcu_preempt detected stalls on CPUs/tasks:
	 Tasks blocked on level-0 rcu_node (CPUs 0-7): P2715
	  (detected by 7, t=5254 jiffies, g=963, c=962, q=321)
	  thugetlb_overco R running task 0 2715 2685 0x00000008
	  Call trace:
	    dump_backtrace+0x0/0x268
	    show_stack+0x24/0x30
	    sched_show_task+0x134/0x180
	    rcu_print_detail_task_stall_rnp+0x54/0x7c
	    rcu_check_callbacks+0xa74/0xb08
	    update_process_times+0x34/0x60
	    tick_sched_handle.isra.7+0x38/0x70
	    tick_sched_timer+0x4c/0x98
	    __hrtimer_run_queues+0xc0/0x300
	    hrtimer_interrupt+0xac/0x228
	    arch_timer_handler_phys+0x3c/0x50
	    handle_percpu_devid_irq+0x8c/0x290
	    generic_handle_irq+0x34/0x50
	    __handle_domain_irq+0x68/0xc0
	    gic_handle_irq+0x5c/0xb0
	Address this by changing the putback_active_hugepage() in soft_offline_huge_page() to putback_movable_pages(). This only triggers on systems that enable memory failure handling (ARCH_SUPPORTS_MEMORY_FAILURE) but not hugepage migration (!ARCH_ENABLE_HUGEPAGE_MIGRATION). I imagine this wasn't triggered as there aren't many systems running this configuration.
[akpm@linux-foundation.org: remove dead comment, per Naoya] Link: http://lkml.kernel.org/r/20170525135146.32011-1-punit.agrawal@arm.com Reported-by: Manoj Iyer <manoj.iyer@canonical.com> Tested-by: Manoj Iyer <manoj.iyer@canonical.com> Suggested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: <stable@vger.kernel.org> [3.14+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-02	dax: fix race between colliding PMD & PTE entries	Ross Zwisler	1	-0/+23
	We currently have two related PMD vs PTE races in the DAX code. These can both be easily triggered by having two threads reading and writing simultaneously to the same private mapping, with the key being that private mapping reads can be handled with PMDs but private mapping writes are always handled with PTEs so that we can COW. Here is the first race: CPU 0 CPU 1 (private mapping write) __handle_mm_fault() create_huge_pmd() - FALLBACK handle_pte_fault() passes check for pmd_devmap() (private mapping read) __handle_mm_fault() create_huge_pmd() dax_iomap_pmd_fault() inserts PMD dax_iomap_pte_fault() does a PTE fault, but we already have a DAX PMD installed in our page tables at this spot. Here's the second race: CPU 0 CPU 1 (private mapping read) __handle_mm_fault() passes check for pmd_none() create_huge_pmd() dax_iomap_pmd_fault() inserts PMD (private mapping write) __handle_mm_fault() create_huge_pmd() - FALLBACK (private mapping read) __handle_mm_fault() passes check for pmd_none() create_huge_pmd() handle_pte_fault() dax_iomap_pte_fault() inserts PTE dax_iomap_pmd_fault() inserts PMD, but we already have a PTE at this spot. The core of the issue is that while there is isolation between faults to the same range in the DAX fault handlers via our DAX entry locking, there is no isolation between faults in the code in mm/memory.c. This means for instance that this code in __handle_mm_fault() can run: if (pmd_none(vmf.pmd) && transparent_hugepage_enabled(vma)) { ret = create_huge_pmd(&vmf); But by the time we actually get to run the fault handler called by create_huge_pmd(), the PMD is no longer pmd_none() because a racing PTE fault has installed a normal PMD here as a parent. This is the cause of the 2nd race. 
The first race is similar - there is the following check in handle_pte_fault(): } else { /* See comment in pte_alloc_one_map() */ if (pmd_devmap(vmf->pmd) || pmd_trans_unstable(vmf->pmd)) return 0; So if a pmd_devmap() PMD (a DAX PMD) has been installed at vmf->pmd, we will bail and retry the fault. This is correct, but there is nothing preventing the PMD from being installed after this check but before we actually get to the DAX PTE fault handlers. In my testing these races result in the following types of errors: BUG: Bad rss-counter state mm:ffff8800a817d280 idx:1 val:1 BUG: non-zero nr_ptes on freeing mm: 15 Fix this issue by having the DAX fault handlers verify that it is safe to continue their fault after they have taken an entry lock to block other racing faults. [ross.zwisler@linux.intel.com: improve fix for colliding PMD & PTE entries] Link: http://lkml.kernel.org/r/20170526195932.32178-1-ross.zwisler@linux.intel.com Link: http://lkml.kernel.org/r/20170522215749.23516-2-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reported-by: Pawel Lebioda <pawel.lebioda@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Pawel Lebioda <pawel.lebioda@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Xiong Zhou <xzhou@redhat.com> Cc: Eryu Guan <eguan@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
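The fix described in the dax commit above follows a "take the lock, then re-validate" pattern. The following is a minimal single-threaded sketch of that idea, not the kernel code: `struct model_slot`, `model_pmd_fault()`, `model_pte_fault()` and the boolean "lock" are hypothetical stand-ins for the page-table slot, the DAX fault handlers and the DAX entry lock.

```c
#include <stdbool.h>

/* Hypothetical model of what is installed at one PMD-sized slot. */
enum slot_state { SLOT_EMPTY, SLOT_HAS_PTE_TABLE, SLOT_HAS_HUGE_PMD };

struct model_slot {
    enum slot_state state;
    bool locked;              /* stand-in for the DAX entry lock */
};

enum fault_result { FAULT_INSTALLED, FAULT_RETRY };

static void model_lock_entry(struct model_slot *slot)   { slot->locked = true; }
static void model_unlock_entry(struct model_slot *slot) { slot->locked = false; }

/* PMD fault: after locking, bail out if a racing PTE fault already
 * installed a normal page table at this slot. */
static enum fault_result model_pmd_fault(struct model_slot *slot)
{
    enum fault_result res = FAULT_RETRY;

    model_lock_entry(slot);
    if (slot->state != SLOT_HAS_PTE_TABLE) {
        slot->state = SLOT_HAS_HUGE_PMD;
        res = FAULT_INSTALLED;
    }
    model_unlock_entry(slot);
    return res;
}

/* PTE fault: after locking, bail out if a racing PMD fault already
 * installed a huge PMD at this slot. */
static enum fault_result model_pte_fault(struct model_slot *slot)
{
    enum fault_result res = FAULT_RETRY;

    model_lock_entry(slot);
    if (slot->state != SLOT_HAS_HUGE_PMD) {
        slot->state = SLOT_HAS_PTE_TABLE;
        res = FAULT_INSTALLED;
    }
    model_unlock_entry(slot);
    return res;
}
```

The point of the sketch is only the ordering: the state check happens while the lock is held, so whichever fault arrives second sees the first one's result and retries instead of installing a conflicting entry.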
2017-06-02	mm: avoid spurious 'bad pmd' warning messages	Ross Zwisler	1	-10/+30
	When the pmd_devmap() checks were added by 5c7fb56e5e3f ("mm, dax: dax-pmd vs thp-pmd vs hugetlbfs-pmd") to add better support for DAX huge pages, they were all added to the end of if() statements after existing pmd_trans_huge() checks. So, things like: - if (pmd_trans_huge(pmd)) + if (pmd_trans_huge(pmd) || pmd_devmap(pmd)) When further checks were added after pmd_trans_unstable() checks by commit 7267ec008b5c ("mm: postpone page table allocation until we have page to map") they were also added at the end of the conditional: + if (pmd_trans_unstable(fe->pmd) || pmd_devmap(fe->pmd)) This ordering is fine for pmd_trans_huge(), but doesn't work for pmd_trans_unstable(). This is because DAX huge pages trip the bad_pmd() check inside of pmd_none_or_trans_huge_or_clear_bad() (called by pmd_trans_unstable()), which prints out a warning and returns 1. So, we do end up doing the right thing, but only after spamming dmesg with suspicious looking messages: mm/pgtable-generic.c:39: bad pmd ffff8808daa49b88(84000001006000a5) Reorder these checks in a helper so that pmd_devmap() is checked first, avoiding the error messages, and add a comment explaining why the ordering is important. Fixes: commit 7267ec008b5c ("mm: postpone page table allocation until we have page to map") Link: http://lkml.kernel.org/r/20170522215749.23516-1-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Pawel Lebioda <pawel.lebioda@intel.com> Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: "Kirill A . 
Shutemov" <kirill.shutemov@linux.intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Xiong Zhou <xzhou@redhat.com> Cc: Eryu Guan <eguan@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
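The ordering problem in the 'bad pmd' commit above comes down to short-circuit evaluation of `||` when one operand has a side effect. The sketch below is a userspace model under stated assumptions, not the kernel helper: `struct model_pmd` and the `model_*` functions are hypothetical, and the "warning" is reduced to a counter plus a message on stderr.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a PMD entry; not the kernel's pmd_t. */
struct model_pmd {
    bool none;        /* slot is empty */
    bool trans_huge;  /* models a THP PMD */
    bool devmap;      /* models a DAX huge-page PMD */
};

static int warnings_printed;

/* Models pmd_devmap(): a pure predicate with no side effects. */
static bool model_pmd_devmap(const struct model_pmd *pmd)
{
    return pmd->devmap;
}

/* Models pmd_trans_unstable(): a DAX PMD looks "bad" to it, so it
 * warns and still returns true. */
static bool model_pmd_trans_unstable(const struct model_pmd *pmd)
{
    if (pmd->none || pmd->trans_huge)
        return true;
    if (pmd->devmap) {
        warnings_printed++;
        fprintf(stderr, "bad pmd (model)\n");
        return true;
    }
    return false;
}

/* Old ordering: the unstable check runs first and warns on DAX PMDs. */
static bool model_check_old_order(const struct model_pmd *pmd)
{
    return model_pmd_trans_unstable(pmd) || model_pmd_devmap(pmd);
}

/* New ordering: devmap is tested first, so for a DAX PMD the right-hand
 * side is never evaluated and no warning is printed. */
static bool model_check_devmap_first(const struct model_pmd *pmd)
{
    return model_pmd_devmap(pmd) || model_pmd_trans_unstable(pmd);
}
```

Both orderings return the same result for a DAX PMD; only the new ordering avoids the spurious message, which matches the commit's statement that the old code did the right thing but only after warning.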
2017-06-02	mm/page_alloc.c: make sure OOM victim can try allocations with no watermarks once	Tetsuo Handa	1	-1/+3
	Roman Gushchin has reported that the OOM killer can trivially select the next OOM victim when a thread doing memory allocation from page fault path was selected as first OOM victim. allocate invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0 allocate cpuset=/ mems_allowed=0 CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 Call Trace: oom_kill_process+0x219/0x3e0 out_of_memory+0x11d/0x480 __alloc_pages_slowpath+0xc84/0xd40 __alloc_pages_nodemask+0x245/0x260 alloc_pages_vma+0xa2/0x270 __handle_mm_fault+0xca9/0x10c0 handle_mm_fault+0xf3/0x210 __do_page_fault+0x240/0x4e0 trace_do_page_fault+0x37/0xe0 do_async_page_fault+0x19/0x70 async_page_fault+0x28/0x30 ... Out of memory: Kill process 492 (allocate) score 899 or sacrifice child Killed process 492 (allocate) total-vm:2052368kB, anon-rss:1894576kB, file-rss:4kB, shmem-rss:0kB allocate: page allocation failure: order:0, mode:0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null) allocate cpuset=/ mems_allowed=0 CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 Call Trace: __alloc_pages_slowpath+0xd32/0xd40 __alloc_pages_nodemask+0x245/0x260 alloc_pages_vma+0xa2/0x270 __handle_mm_fault+0xca9/0x10c0 handle_mm_fault+0xf3/0x210 __do_page_fault+0x240/0x4e0 trace_do_page_fault+0x37/0xe0 do_async_page_fault+0x19/0x70 async_page_fault+0x28/0x30 ... oom_reaper: reaped process 492 (allocate), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB ... 
allocate invoked oom-killer: gfp_mask=0x0(), nodemask=(null), order=0, oom_score_adj=0 allocate cpuset=/ mems_allowed=0 CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 Call Trace: oom_kill_process+0x219/0x3e0 out_of_memory+0x11d/0x480 pagefault_out_of_memory+0x68/0x80 mm_fault_error+0x8f/0x190 ? handle_mm_fault+0xf3/0x210 __do_page_fault+0x4b2/0x4e0 trace_do_page_fault+0x37/0xe0 do_async_page_fault+0x19/0x70 async_page_fault+0x28/0x30 ... Out of memory: Kill process 233 (firewalld) score 10 or sacrifice child Killed process 233 (firewalld) total-vm:246076kB, anon-rss:20956kB, file-rss:0kB, shmem-rss:0kB There is a race window that the OOM reaper completes reclaiming the first victim's memory while nothing but mutex_trylock() prevents the first victim from calling out_of_memory() from pagefault_out_of_memory() after memory allocation for page fault path failed due to being selected as an OOM victim. This is a side effect of commit 9a67f6488eca926f ("mm: consolidate GFP_NOFAIL checks in the allocator slowpath") because that commit silently changed the behavior from /* Avoid allocations with no watermarks from looping endlessly */ to /* Give up allocations without trying memory reserves if selected as an OOM victim */ in __alloc_pages_slowpath() by moving the location to check TIF_MEMDIE flag. I have noticed this change but I didn't post a patch because I thought it is an acceptable change other than noise by warn_alloc() because !__GFP_NOFAIL allocations are allowed to fail. But we overlooked that failing memory allocation from page fault path makes difference due to the race window explained above. 
While it might be possible to add a check to pagefault_out_of_memory() that prevents the first victim from calling out_of_memory() or remove out_of_memory() from pagefault_out_of_memory(), changing pagefault_out_of_memory() does not suppress noise by warn_alloc() when allocating thread was selected as an OOM victim. There is little point with printing similar backtraces and memory information from both out_of_memory() and warn_alloc(). Instead, if we guarantee that current thread can try allocations with no watermarks once when current thread looping inside __alloc_pages_slowpath() was selected as an OOM victim, we can follow "who can use memory reserves" rules and suppress noise by warn_alloc() and prevent memory allocations from page fault path from calling pagefault_out_of_memory(). If we take the comment literally, this patch would do - if (test_thread_flag(TIF_MEMDIE)) - goto nopage; + if (alloc_flags == ALLOC_NO_WATERMARKS || (gfp_mask & __GFP_NOMEMALLOC)) + goto nopage; because gfp_pfmemalloc_allowed() returns false if __GFP_NOMEMALLOC is given. But if I recall correctly (I couldn't find the message), the condition is meant to apply to only OOM victims despite the comment. Therefore, this patch preserves TIF_MEMDIE check. Fixes: 9a67f6488eca926f ("mm: consolidate GFP_NOFAIL checks in the allocator slowpath") Link: http://lkml.kernel.org/r/201705192112.IAF69238.OQOHSJLFOFFMtV@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: Roman Gushchin <guro@fb.com> Tested-by: Roman Gushchin <guro@fb.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: <stable@vger.kernel.org> [4.11] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
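The OOM-victim commit above describes two conditions: the literal one quoted in its diff, and the TIF_MEMDIE check it says is preserved. The sketch below models one plausible way to combine them as a pure predicate. It is an assumption about the shape of the check, not the patch itself: the `MODEL_*` constants have made-up values, and `model_old_give_up()` / `model_new_give_up()` are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration; not the kernel's definitions. */
#define MODEL_ALLOC_NO_WATERMARKS 0x04u
#define MODEL_GFP_NOMEMALLOC      0x01u

/* Behaviour described as the problem: an OOM victim gives up immediately,
 * without ever trying an allocation from memory reserves. */
static bool model_old_give_up(bool is_oom_victim)
{
    return is_oom_victim;
}

/* Assumed combined behaviour: an OOM victim gives up only once it has
 * already been allowed to ignore watermarks, or when the caller forbids
 * the use of reserves. Threads that are not OOM victims are unaffected
 * by this check. */
static bool model_new_give_up(bool is_oom_victim, unsigned int alloc_flags,
                              unsigned int gfp_mask)
{
    if (!is_oom_victim)
        return false;
    return alloc_flags == MODEL_ALLOC_NO_WATERMARKS ||
           (gfp_mask & MODEL_GFP_NOMEMALLOC) != 0;
}
```

Under this model, a freshly selected victim whose `alloc_flags` have not yet been raised to the no-watermarks value gets one more pass through the slow path, which is the "try allocations with no watermarks once" guarantee the commit message asks for.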