wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2018-03-26	kconfig: rename silentoldconfig to syncconfig	Masahiro Yamada	5	-16/+23
	As commit cedd55d49dee ("kconfig: Remove silentoldconfig from help and docs; fix kconfig/conf's help") mentioned, 'silentoldconfig' is a historical misnomer. That commit removed it from help and docs since it is an internal interface. If so, it should be allowed to rename it to something more intuitive. 'syncconfig' is the one I came up with because it updates the .config if necessary, then synchronize include/generated/autoconf.h and include/config/* with it. You should not manually invoke 'silentoldcofig'. Display warning if used in case existing scripts are doing wrong. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-03-26	kconfig: invoke oldconfig instead of silentoldconfig from local*config	Masahiro Yamada	1	-2/+2
	The purpose of local{yes,mod}config is to arrange the .config file based on actually loaded modules. It is unnecessary to update include/generated/autoconf.h and include/config/* stuff here. They will be updated as needed during the build. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-03-26	kconfig: hide irrelevant sub-menus for oldconfig	Masahiro Yamada	1	-3/+6
	Historically, "make oldconfig" has changed its behavior several times, quieter or louder. (I attached the history below.) Currently, it is not as quiet as it should be. This commit addresses it. Test Case --------- ---------------------------(Kconfig)---------------------------- menu "menu" config FOO bool "foo" menu "sub menu" config BAR bool "bar" endmenu endmenu menu "sibling menu" config BAZ bool "baz" endmenu ---------------------------------------------------------------- ---------------------------(.config)---------------------------- CONFIG_BAR=y CONFIG_BAZ=y ---------------------------------------------------------------- With the Kconfig and .config above, "make silentoldconfig" and "make oldconfig" work differently, like follows: $ make silentoldconfig scripts/kconfig/conf --silentoldconfig Kconfig * * Restart config... * * * menu * foo (FOO) [N/y/?] (NEW) y # # configuration written to .config # $ make oldconfig scripts/kconfig/conf --oldconfig Kconfig * * Restart config... * * * menu * foo (FOO) [N/y/?] (NEW) y * * sub menu * bar (BAR) [Y/n/?] y # # configuration written to .config # Both hide "sibling node" since it is irrelevant. The difference is that silentoldconfig hides "sub menu" whereas oldconfig does not. The behavior of silentoldconfig is preferred since the "sub menu" does not contain any new symbol. The root cause is in conf(). There are three input modes that can call conf(); oldaskconfig, oldconfig, and silentoldconfig. Everytime conf() encounters a menu entry, it calls check_conf() to check if it contains new symbols. If no new symbol is found, the menu is just skipped. Currently, this happens only when input_mode == silentoldconfig. The oldaskconfig enters into the check_conf() loop as silentoldconfig, so oldaskconfig works likewise for the second loop or later, but it never happens for oldconfig. So, irrelevant sub-menus are shown for oldconfig. Change the test condition to "input_mode != oldaskconfig". This is false only for the first loop of oldaskconfig; it must ask the user all symbols, so no need to call check_conf(). History of oldconfig -------------------- [0] Originally, "make oldconfig" was as loud as "make config" (It showed the entire .config file) [1] Commit cd9140e1e73a ("kconfig: make oldconfig is now less chatty") made oldconfig quieter, but it was still less quieter than silentoldconfig. (oldconfig did not hide sub-menus) [2] Commit 204c96f60904 ("kconfig: fix silentoldconfig") changed the input_mode of oldconfig to "ask_silent" from "ask_new". So, oldconfig really became as quiet as silentoldconfig. (oldconfig hided irrelevant sub-menus) [3] Commit 4062f1a4c030 ("kconfig: use long options in conf") made oldconfig as loud as [0] due to misconversion. [4] Commit 14828349719a ("kconfig: fix make oldconfig") addressed the misconversion of [3], but it made oldconfig quieter only to the same level as [1], not [2]. This commit is restoring the behavior of [2]. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-03-26	kconfig: remove redundant input_mode test for check_conf() loop	Masahiro Yamada	1	-1/+1
	check_conf() never increments conf_cnt for listnewconfig, so conf_cnt is always zero. In other words, conf_cnt is not zero, "input_mode != listnewconfig" is met. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-03-26	kconfig: remove unneeded input_mode test in conf()	Masahiro Yamada	1	-3/+1
	conf() is never called for listnewconfig / olddefconfig. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-03-26	kconfig: do not call check_conf() for olddefconfig	Masahiro Yamada	1	-5/+5
	check_conf() traverses the menu tree, but it is completely no-op for olddefconfig because the following if-else block does nothing. if (input_mode == listnewconfig) { ... } else if (input_mode != olddefconfig) { ... } As the help message says, olddefconfig automatically sets new symbols to their default value. There is no room for manual intervention. So, calling check_conf() for olddefconfig is odd in the first place. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-03-26	kconfig: only write '# CONFIG_FOO is not set' for visible symbols	Ulf Magnusson	1	-1/+2
	=== Background === - Visible n-valued bool/tristate symbols generate a '# CONFIG_FOO is not set' line in the .config file. The idea is to remember the user selection without having to set a Makefile variable. Having n correspond to the variable being undefined in the Makefiles makes for easy CONFIG_* tests. - Invisible n-valued bool/tristate symbols normally do not generate a '# CONFIG_FOO is not set' line, because user values from .config files have no effect on invisible symbols anyway. Currently, there is one exception to this rule: Any bool/tristate symbol that gets the value n through a 'default' property generates a '# CONFIG_FOO is not set' line, even if the symbol is invisible. Note that this only applies to explicitly given defaults, and not when the symbol implicitly defaults to n (like bool/tristate symbols without 'default' properties do). This is inconsistent, and seems redundant: - As mentioned, the '# CONFIG_FOO is not set' won't affect the symbol once the .config is read back in. - Even if the symbol is invisible at first but becomes visible later, there shouldn't be any harm in recalculating the default value rather than viewing the '# CONFIG_FOO is not set' as a previous user value of n. === Changes === Change sym_calc_value() to only set SYMBOL_WRITE (write to .config) for non-n-valued 'default' properties. Note that SYMBOL_WRITE is always set for visible symbols regardless of whether they have 'default' properties or not, so this change only affects invisible symbols. This reduces the size of the x86 .config on my system by about 1% (due to removed '# CONFIG_FOO is not set' entries). One side effect of (and the main motivation for) this change is making the following two definitions behave exactly the same: config FOO bool config FOO bool default n With this change, neither of these will generate a '# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied). That might make it clearer to people that a bare 'default n' is redundant. This change only affects generated .config files and not autoconf.h: autoconf.h only includes #defines for non-n bool/tristate symbols. === Testing === The following testing was done with the x86 Kconfigs: - .config files generated before and after the change were compared to verify that the only difference is some '# CONFIG_FOO is not set' entries disappearing. A couple of these were inspected manually, and most turned out to be from redundant 'default n/def_bool n' properties. - The generated include/generated/autoconf.h was compared before and after the change and verified to be identical. - As a sanity check, the same modification was done to Kconfiglib. The Kconfiglib test suite was then run to check for any mismatches against the output of the C implementation. Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26	kconfig: Print reverse dependencies in groups	Eugeniu Rosca	3	-13/+20
	Surprisingly or not, disabling a CONFIG option (which is assumed to be unneeded) may be not so trivial. Especially it is not trivial, when this CONFIG option is selected by a dozen of other configs. Before the moment commit 1ccb27143360 ("kconfig: make "Selected by:" and "Implied by:" readable") popped up in v4.16-rc1, it was an absolute pain to break down the "Selected by" reverse dependency expression in order to identify all those configs which select (IOW do not allow disabling) a certain feature (assumed to be not needed). This patch tries to make one step further by putting at users' fingertips the revdep top level OR sub-expressions grouped/clustered by the tristate value they evaluate to. This should allow the users to directly concentrate on and tackle the _active_ reverse dependencies. To give some numbers and quantify the complexity of certain reverse dependencies, assuming commit 617aebe6a97e ("Merge tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux"), ARCH=arm64 and vanilla arm64 defconfig, here is the top 10 CONFIG options with the highest amount of top level "\|\|" sub-expressions/tokens that make up the final "Selected by" reverse dependency expression. \| Config \| All revdep \| Active revdep \| \|-------------------\|------------\|---------------\| \| REGMAP_I2C \| 212 \| 9 \| \| CRC32 \| 167 \| 25 \| \| FW_LOADER \| 128 \| 5 \| \| MFD_CORE \| 124 \| 9 \| \| FB_CFB_IMAGEBLIT \| 114 \| 2 \| \| FB_CFB_COPYAREA \| 111 \| 2 \| \| FB_CFB_FILLRECT \| 110 \| 2 \| \| SND_PCM \| 103 \| 2 \| \| CRYPTO_HASH \| 87 \| 19 \| \| WATCHDOG_CORE \| 86 \| 6 \| The story behind the above is that users need to visually review/evaluate 212 expressions which potentially select REGMAP_I2C in order to identify the expressions which actually select REGMAP_I2C, for a particular ARCH and for a particular defconfig used. To make this experience smoother, change the way reverse dependencies are displayed to the user from [1] to [2]. [1] Old representation of DMA_ENGINE_RAID: Selected by: - AMCC_PPC440SPE_ADMA [=n] && DMADEVICES [=y] && (440SPe \|\| 440SP) - BCM_SBA_RAID [=m] && DMADEVICES [=y] && (ARM64 [=y] \|\| ... - FSL_RAID [=n] && DMADEVICES [=y] && FSL_SOC && ... - INTEL_IOATDMA [=n] && DMADEVICES [=y] && PCI [=y] && X86_64 - MV_XOR [=n] && DMADEVICES [=y] && (PLAT_ORION \|\| ARCH_MVEBU [=y] ... - MV_XOR_V2 [=y] && DMADEVICES [=y] && ARM64 [=y] - XGENE_DMA [=n] && DMADEVICES [=y] && (ARCH_XGENE [=y] \|\| ... - DMATEST [=n] && DMADEVICES [=y] && DMA_ENGINE [=y] [2] New representation of DMA_ENGINE_RAID: Selected by [y]: - MV_XOR_V2 [=y] && DMADEVICES [=y] && ARM64 [=y] Selected by [m]: - BCM_SBA_RAID [=m] && DMADEVICES [=y] && (ARM64 [=y] \|\| ... Selected by [n]: - AMCC_PPC440SPE_ADMA [=n] && DMADEVICES [=y] && (440SPe \|\| ... - FSL_RAID [=n] && DMADEVICES [=y] && FSL_SOC && ... - INTEL_IOATDMA [=n] && DMADEVICES [=y] && PCI [=y] && X86_64 - MV_XOR [=n] && DMADEVICES [=y] && (PLAT_ORION \|\| ARCH_MVEBU [=y] ... - XGENE_DMA [=n] && DMADEVICES [=y] && (ARCH_XGENE [=y] \|\| ... - DMATEST [=n] && DMADEVICES [=y] && DMA_ENGINE [=y] Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com> Reviewed-by: Petr Vorel <pvorel@suse.cz> Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26	kconfig: clean-up reverse dependency help implementation	Masahiro Yamada	2	-17/+23
	This commit splits out the special E_OR handling ('-' instead of '\|\|') into a dedicated helper expr_print_revdev(). Restore the original expr_print() prior to commit 1ccb27143360 ("kconfig: make "Selected by:" and "Implied by:" readable"). This makes sense because: - We need to chop those expressions only when printing the reverse dependency, and only when E_OR is encountered - Otherwise, it should be printed as before, so fall back to expr_print() This also improves the behavior; for a single line, it was previously displayed in the same line as "Selected by", like this: Selected by: A [=n] && B [=n] This will be displayed in a new line, consistently: Selected by: - A [=n] && B [=n] Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Petr Vorel <pvorel@suse.cz>
2018-03-26	checkpatch: kconfig: prefer 'help' over '---help---'	Ulf Magnusson	1	-1/+5
	IMO, we should discourage '---help---' for new help texts, even in cases where it would be consistent with other help texts in the file. This will help if we ever want to get rid of '---help---' in the future. Also simplify the code to only check for exactly '---help---'. Since commit c2264564df3d ("kconfig: warn of unhandled characters in Kconfig commands"), '---help---' is a proper keyword and can only appear in that form. Prior to that commit, '---help---' working was more of a syntactic quirk. Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26	checkpatch: kconfig: check help texts for menuconfig and choice	Ulf Magnusson	1	-2/+11
	Currently, only Kconfig symbols are checked for a missing or short help text, and are only checked if they are defined with the 'config' keyword. To make the check more general, extend it to also check help texts for choices and for symbols defined with the 'menuconfig' keyword. This increases the accuracy of the check for symbols that would already have been checked as well, since e.g. a 'menuconfig' symbol after a help text will be recognized as ending the preceding symbol/choice definition. To increase the accuracy of the check further, also recognize 'if', 'endif', 'menu', 'endmenu', 'endchoice', and 'source' as ending a symbol/choice definition. Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-26	checkpatch: kconfig: recognize more prompts when checking help texts	Ulf Magnusson	1	-1/+1
	The check for a missing or short help text only considers symbols with a prompt, but doesn't recognize any of the following as a prompt: bool 'foo' tristate 'foo' prompt "foo" prompt 'foo' Make the check recognize those too. Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-11	Linux 4.16-rc5	Linus Torvalds	1	-1/+1

2018-03-11	dmaengine: mv_xor_v2: Fix clock resource by adding a register clock	Gregory CLEMENT	2	-6/+25
	On the CP110 components which are present on the Armada 7K/8K SoC we need to explicitly enable the clock for the registers. However it is not needed for the AP8xx component, that's why this clock is optional. With this patch both clock have now a name, but in order to be backward compatible, the name of the first clock is not used. It allows to still use this clock with a device tree using the old binding. Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2018-03-09	lib/test_kmod.c: fix limit check on number of test devices created	Luis R. Rodriguez	1	-1/+1
	As reported by Dan the parentheses is in the wrong place, and since unlikely() call returns either 0 or 1 it's never less than zero. The second issue is that signed integer overflows like "INT_MAX + 1" are undefined behavior. Since num_test_devs represents the number of devices, we want to stop prior to hitting the max, and not rely on the wrap arround at all. So just cap at num_test_devs + 1, prior to assigning a new device. Link: http://lkml.kernel.org/r/20180224030046.24238-1-mcgrof@kernel.org Fixes: d9c6a72d6fa2 ("kmod: add test driver to stress test the module loader") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09	selftests/vm/run_vmtests: adjust hugetlb size according to nr_cpus	Li Zhijian	1	-8/+17
	Fix userfaultfd_hugetlb on hosts which have more than 64 cpus. --------------------------- running userfaultfd_hugetlb --------------------------- invalid MiB Usage: <MiB> <bounces> [FAIL] Via userfaultfd.c we can know, hugetlb_size needs to meet hugetlb_size >= nr_cpus * hugepage_size. hugepage_size is often 2M, so when host cpus > 64, it requires more than 128M. [zhijianx.li@intel.com: update changelog/comments and variable name] Link: http://lkml.kernel.org/r/20180302024356.83359-1-zhijianx.li@intel.com Link: http://lkml.kernel.org/r/20180303125027.81638-1-zhijianx.li@intel.com Link: http://lkml.kernel.org/r/20180302024356.83359-1-zhijianx.li@intel.com Signed-off-by: Li Zhijian <zhijianx.li@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: SeongJae Park <sj38.park@gmail.com> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09	mm/page_alloc: fix memmap_init_zone pageblock alignment	Daniel Vacek	1	-2/+7
	Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible") introduced a bug where move_freepages() triggers a VM_BUG_ON() on uninitialized page structure due to pageblock alignment. To fix this, simply align the skipped pfns in memmap_init_zone() the same way as in move_freepages_block(). Seen in one of the RHEL reports: crash> log \| grep -e BUG -e RIP -e Call.Trace -e move_freepages_block -e rmqueue -e freelist -A1 kernel BUG at mm/page_alloc.c:1389! invalid opcode: 0000 [#1] SMP -- RIP: 0010:[<ffffffff8118833e>] [<ffffffff8118833e>] move_freepages+0x15e/0x160 RSP: 0018:ffff88054d727688 EFLAGS: 00010087 -- Call Trace: [<ffffffff811883b3>] move_freepages_block+0x73/0x80 [<ffffffff81189e63>] __rmqueue+0x263/0x460 [<ffffffff8118c781>] get_page_from_freelist+0x7e1/0x9e0 [<ffffffff8118caf6>] __alloc_pages_nodemask+0x176/0x420 -- RIP [<ffffffff8118833e>] move_freepages+0x15e/0x160 RSP <ffff88054d727688> crash> page_init_bug -v \| grep RAM <struct resource 0xffff88067fffd2f8> 1000 - 9bfff System RAM (620.00 KiB) <struct resource 0xffff88067fffd3a0> 100000 - 430bffff System RAM ( 1.05 GiB = 1071.75 MiB = 1097472.00 KiB) <struct resource 0xffff88067fffd410> 4b0c8000 - 4bf9cfff System RAM ( 14.83 MiB = 15188.00 KiB) <struct resource 0xffff88067fffd480> 4bfac000 - 646b1fff System RAM (391.02 MiB = 400408.00 KiB) <struct resource 0xffff88067fffd560> 7b788000 - 7b7fffff System RAM (480.00 KiB) <struct resource 0xffff88067fffd640> 100000000 - 67fffffff System RAM ( 22.00 GiB) crash> page_init_bug \| head -6 <struct resource 0xffff88067fffd560> 7b788000 - 7b7fffff System RAM (480.00 KiB) <struct page 0xffffea0001ede200> 1fffff00000000 0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32 4096 1048575 <struct page 0xffffea0001ede200> 505736 505344 <struct page 0xffffea0001ed8000> 505855 <struct page 0xffffea0001edffc0> <struct page 0xffffea0001ed8000> 0 0 <struct pglist_data 0xffff88047ffd9000> 0 <struct zone 0xffff88047ffd9000> DMA 1 4095 <struct page 0xffffea0001edffc0> 1fffff00000400 0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32 4096 1048575 BUG, zones differ! Note that this range follows two not populated sections 68000000-77ffffff in this zone. 7b788000-7b7fffff is the first one after a gap. This makes memmap_init_zone() skip all the pfns up to the beginning of this range. But this range is not pageblock (2M) aligned. In fact no range has to be. crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b787000 7b788000 PAGE PHYSICAL MAPPING INDEX CNT FLAGS ffffea0001e00000 78000000 0 0 0 0 ffffea0001ed7fc0 7b5ff000 0 0 0 0 ffffea0001ed8000 7b600000 0 0 0 0 <<<< ffffea0001ede1c0 7b787000 0 0 0 0 ffffea0001ede200 7b788000 0 0 1 1fffff00000000 Top part of page flags should contain nodeid and zonenr, which is not the case for page ffffea0001ed8000 here (<<<<). crash> log \| grep -o fffea0001ed[^\ ]* \| sort -u fffea0001ed8000 fffea0001eded20 fffea0001edffc0 crash> bt -r \| grep -o fffea0001ed[^\ ]* \| sort -u fffea0001ed8000 fffea0001eded00 fffea0001eded20 fffea0001edffc0 Initialization of the whole beginning of the section is skipped up to the start of the range due to the commit b92df1de5d28. Now any code calling move_freepages_block() (like reusing the page from a freelist as in this example) with a page from the beginning of the range will get the page rounded down to start_page ffffea0001ed8000 and passed to move_freepages() which crashes on assertion getting wrong zonenr. > VM_BUG_ON(page_zone(start_page) != page_zone(end_page)); Note, page_zone() derives the zone from page flags here. From similar machine before commit b92df1de5d28: crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000 PAGE PHYSICAL MAPPING INDEX CNT FLAGS fffff73941e00000 78000000 0 0 1 1fffff00000000 fffff73941ed7fc0 7b5ff000 0 0 1 1fffff00000000 fffff73941ed8000 7b600000 0 0 1 1fffff00000000 fffff73941edff80 7b7fe000 0 0 1 1fffff00000000 fffff73941edffc0 7b7ff000 ffff8e67e04d3ae0 ad84 1 1fffff00020068 uptodate,lru,active,mappedtodisk All the pages since the beginning of the section are initialized. move_freepages()' not gonna blow up. The same machine with this fix applied: crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000 PAGE PHYSICAL MAPPING INDEX CNT FLAGS ffffea0001e00000 78000000 0 0 0 0 ffffea0001e00000 7b5ff000 0 0 0 0 ffffea0001ed8000 7b600000 0 0 1 1fffff00000000 ffffea0001edff80 7b7fe000 0 0 1 1fffff00000000 ffffea0001edffc0 7b7ff000 ffff88017fb13720 8 2 1fffff00020068 uptodate,lru,active,mappedtodisk At least the bare minimum of pages is initialized preventing the crash as well. Customers started to report this as soon as 7.4 (where b92df1de5d28 was merged in RHEL) was released. I remember reports from September/October-ish times. It's not easily reproduced and happens on a handful of machines only. I guess that's why. But that does not make it less serious, I think. Though there actually is a report here: https://bugzilla.kernel.org/show_bug.cgi?id=196443 And there are reports for Fedora from July: https://bugzilla.redhat.com/show_bug.cgi?id=1473242 and CentOS: https://bugs.centos.org/view.php?id=13964 and we internally track several dozens reports for RHEL bug https://bugzilla.redhat.com/show_bug.cgi?id=1525121 Link: http://lkml.kernel.org/r/0485727b2e82da7efbce5f6ba42524b429d0391a.1520011945.git.neelx@redhat.com Fixes: b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible") Signed-off-by: Daniel Vacek <neelx@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09	mm/memblock.c: hardcode the end_pfn being -1	Daniel Vacek	1	-5/+5
	This is just a cleanup. It aids handling the special end case in the next commit. [akpm@linux-foundation.org: make it work against current -linus, not against -mm] [akpm@linux-foundation.org: make it work against current -linus, not against -mm some more] Link: http://lkml.kernel.org/r/1ca478d4269125a99bcfb1ca04d7b88ac1aee924.1520011944.git.neelx@redhat.com Signed-off-by: Daniel Vacek <neelx@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09	mm/gup.c: teach get_user_pages_unlocked to handle FOLL_NOWAIT	Andrea Arcangeli	1	-2/+5
	KVM is hanging during postcopy live migration with userfaultfd because get_user_pages_unlocked is not capable to handle FOLL_NOWAIT. Earlier FOLL_NOWAIT was only ever passed to get_user_pages. Specifically faultin_page (the callee of get_user_pages_unlocked caller) doesn't know that if FAULT_FLAG_RETRY_NOWAIT was set in the page fault flags, when VM_FAULT_RETRY is returned, the mmap_sem wasn't actually released (even if nonblocking is not NULL). So it sets *nonblocking to zero and the caller won't release the mmap_sem thinking it was already released, but it wasn't because of FOLL_NOWAIT. Link: http://lkml.kernel.org/r/20180302174343.5421-2-aarcange@redhat.com Fixes: ce53053ce378c ("kvm: switch get_user_page_nowait() to get_user_pages_unlocked()") Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09	lib/bug.c: exclude non-BUG/WARN exceptions from report_bug()	Kees Cook	1	-0/+2
	Commit b8347c219649 ("x86/debug: Handle warnings before the notifier chain, to fix KGDB crash") changed the ordering of fixups, and did not take into account the case of x86 processing non-WARN() and non-BUG() exceptions. This would lead to output of a false BUG line with no other information. In the case of a refcount exception, it would be immediately followed by the refcount WARN(), producing very strange double-"cut here": lkdtm: attempting bad refcount_inc() overflow ------------[ cut here ]------------ Kernel BUG at 0000000065f29de5 [verbose debug info unavailable] ------------[ cut here ]------------ refcount_t overflow at lkdtm_REFCOUNT_INC_OVERFLOW+0x6b/0x90 in cat[3065], uid/euid: 0/0 WARNING: CPU: 0 PID: 3065 at kernel/panic.c:657 refcount_error_report+0x9a/0xa4 ... In the prior ordering, exceptions were searched first: do_trap_no_signal(struct task_struct tsk, int trapnr, char str, ... if (fixup_exception(regs, trapnr)) return 0; - if (fixup_bug(regs, trapnr)) - return 0; - As a result, fixup_bugs()'s is_valid_bugaddr() didn't take into account needing to search the exception list first, since that had already happened. So, instead of searching the exception list twice (once in is_valid_bugaddr() and then again in fixup_exception()), just add a simple sanity check to report_bug() that will immediately bail out if a BUG() (or WARN()) entry is not found. Link: http://lkml.kernel.org/r/20180301225934.GA34350@beast Fixes: b8347c219649 ("x86/debug: Handle warnings before the notifier chain, to fix KGDB crash") Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Richard Weinberger <richard.weinberger@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09	bug: use %pB in BUG and stack protector failure	Kees Cook	2	-2/+2
	The BUG and stack protector reports were still using a raw %p. This changes it to %pB for more meaningful output. Link: http://lkml.kernel.org/r/20180301225704.GA34198@beast Fixes: ad67b74d2469 ("printk: hash addresses printed with %p") Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Richard Weinberger <richard.weinberger@gmail.com>, Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09	hugetlb: fix surplus pages accounting	Michal Hocko	1	-1/+1
	Dan Rue has noticed that libhugetlbfs test suite fails counter test: # mount_point="/mnt/hugetlb/" # echo 200 > /proc/sys/vm/nr_hugepages # mkdir -p "${mount_point}" # mount -t hugetlbfs hugetlbfs "${mount_point}" # export LD_LIBRARY_PATH=/root/libhugetlbfs/libhugetlbfs-2.20/obj64 # /root/libhugetlbfs/libhugetlbfs-2.20/tests/obj64/counters Starting testcase "/root/libhugetlbfs/libhugetlbfs-2.20/tests/obj64/counters", pid 3319 Base pool size: 0 Clean... FAIL Line 326: Bad HugePages_Total: expected 0, actual 1 The bug was bisected to 0c397daea1d4 ("mm, hugetlb: further simplify hugetlb allocation API"). The reason is that alloc_surplus_huge_page() misaccounts per node surplus pages. We should increase surplus_huge_pages_node rather than nr_huge_pages_node which is already handled by alloc_fresh_huge_page. Link: http://lkml.kernel.org/r/20180221191439.GM2231@dhcp22.suse.cz Fixes: 0c397daea1d4 ("mm, hugetlb: further simplify hugetlb allocation API") Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Dan Rue <dan.rue@linaro.org> Tested-by: Dan Rue <dan.rue@linaro.org> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09	RDMA/mlx5: Fix integer overflow while resizing CQ	Leon Romanovsky	1	-1/+6
	The user can provide very large cqe_size which will cause to integer overflow as it can be seen in the following UBSAN warning: ======================================================================= UBSAN: Undefined behaviour in drivers/infiniband/hw/mlx5/cq.c:1192:53 signed integer overflow: 64870 * 65536 cannot be represented in type 'int' CPU: 0 PID: 267 Comm: syzkaller605279 Not tainted 4.15.0+ #90 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 Call Trace: dump_stack+0xde/0x164 ? dma_virt_map_sg+0x22c/0x22c ubsan_epilogue+0xe/0x81 handle_overflow+0x1f3/0x251 ? __ubsan_handle_negate_overflow+0x19b/0x19b ? lock_acquire+0x440/0x440 mlx5_ib_resize_cq+0x17e7/0x1e40 ? cyc2ns_read_end+0x10/0x10 ? native_read_msr_safe+0x6c/0x9b ? cyc2ns_read_end+0x10/0x10 ? mlx5_ib_modify_cq+0x220/0x220 ? sched_clock_cpu+0x18/0x200 ? lookup_get_idr_uobject+0x200/0x200 ? rdma_lookup_get_uobject+0x145/0x2f0 ib_uverbs_resize_cq+0x207/0x3e0 ? ib_uverbs_ex_create_cq+0x250/0x250 ib_uverbs_write+0x7f9/0xef0 ? cyc2ns_read_end+0x10/0x10 ? print_irqtrace_events+0x280/0x280 ? ib_uverbs_ex_create_cq+0x250/0x250 ? uverbs_devnode+0x110/0x110 ? sched_clock_cpu+0x18/0x200 ? do_raw_spin_trylock+0x100/0x100 ? __lru_cache_add+0x16e/0x290 __vfs_write+0x10d/0x700 ? uverbs_devnode+0x110/0x110 ? kernel_read+0x170/0x170 ? sched_clock_cpu+0x18/0x200 ? security_file_permission+0x93/0x260 vfs_write+0x1b0/0x550 SyS_write+0xc7/0x1a0 ? SyS_read+0x1a0/0x1a0 ? trace_hardirqs_on_thunk+0x1a/0x1c entry_SYSCALL_64_fastpath+0x1e/0x8b RIP: 0033:0x433549 RSP: 002b:00007ffe63bd1ea8 EFLAGS: 00000217 ======================================================================= Cc: syzkaller <syzkaller@googlegroups.com> Cc: <stable@vger.kernel.org> # 3.13 Fixes: bde51583f49b ("IB/mlx5: Add support for resize CQ") Reported-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-09	Revert "RDMA/mlx5: Fix integer overflow while resizing CQ"	Doug Ledford	1	-6/+1
	The original commit of this patch has a munged log message that is missing several of the tags the original author intended to be on the patch. This was due to patchworks misinterpreting a cut-n-paste separator line as an end of message line and munging the mbox that was used to import the patch: https://patchwork.kernel.org/patch/10264089/ The original patch will be reapplied with a fixed commit message so the proper tags are applied. This reverts commit aa0de36a40f446f5a21a7c1e677b98206e242edb. Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-09	arm64: Relax ARM_SMCCC_ARCH_WORKAROUND_1 discovery	Marc Zyngier	1	-2/+2
	A recent update to the ARM SMCCC ARCH_WORKAROUND_1 specification allows firmware to return a non zero, positive value to describe that although the mitigation is implemented at the higher exception level, the CPU on which the call is made is not affected. Let's relax the check on the return value from ARCH_WORKAROUND_1 so that we only error out if the returned value is negative. Fixes: b092201e0020 ("arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-03-09	Documentation/sphinx: Fix Directive import error	Matthew Wilcox	1	-2/+1
	Sphinx 1.7 removed sphinx.util.compat.Directive so people who have upgraded cannot build the documentation. Switch to docutils.parsers.rst.Directive which has been available since docutils 0.5 released in 2009. Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=1083694 Co-developed-by: Takashi Iwai <tiwai@suse.de> Acked-by: Jani Nikula <jani.nikula@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2018-03-09	platform/x86: dell-smbios: Resolve dependency error on DCDBAS	Darren Hart (VMware)	1	-0/+6
	When the DELL_SMBIOS_SMM backend is enabled, the DELL_SMBIOS symbol depends on DELL_DCDBAS, and we must avoid the situation where DELL_SMBIOS=y and DCDBAS=m. Adding the conditional dependency to DELL_SMBIOS such as: depends !DELL_SMBIOS_SMM \|\| (DCDBAS \|\| DCDBAS=n) results in the Kconfig tooling complaining about a circular dependency, although it appears to work in practice. Avoid the errors by simplifying the dependency and forcing DELL_SMBIOS to be <= DCDBAS if DCDBAS is enabled (thanks to Greg KH for the suggestion). Cc: Mario.Limonciello@dell.com Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2018-03-09	platform/x86: Allow for SMBIOS backend defaults	Darren Hart (VMware)	1	-2/+4
	Avoid accidental configurations by setting default y for DELL_SMBIOS backends. Avoid this impacting the default build size, by making them dependent on DELL_SMBIOS, so they only appear when DELL_SMBIOS is manually selected, or by DELL_LAPTOP or DELL_WMI. While DELL_SMBIOS does have a prompt, it does not have any dependencies. Keeping DELL_SMBIOS visible, despite being "select"ed by DELL_LAPTOP and DELL_WMI, is a deliberate choice to provide context for the WMI and SMM backends, which would otherwise appear to float without context within the menu. Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2018-03-09	platform/x86: dell-smbios: Link all dell-smbios-* modules together	Mario Limonciello	6	-33/+66
	Some race conditions were raised due to dell-smbios and its backends not being ready by the time that a consumer would call one of the exported methods. To avoid this problem, guarantee that all initialization has been done by linking them all together and running init for them all. As part of this change the Kconfig needs to be adjusted so that CONFIG_DELL_SMBIOS_SMM and CONFIG_DELL_SMBIOS_WMI are boolean rather than modules. CONFIG_DELL_SMBIOS is a visually selectable option again and both CONFIG_DELL_SMBIOS_WMI and CONFIG_DELL_SMBIOS_SMM are optional. Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> [dvhart: Update prompt and help text for DELL_SMBIOS_* backends] Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2018-03-09	platform/x86: dell-smbios: Rename dell-smbios source to dell-smbios-base	Mario Limonciello	2	-0/+1
	This is being done to faciliate a later change to link all the dell-smbios drivers together. Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2018-03-09	platform/x86: dell-smbios: Correct some style warnings	Mario Limonciello	1	-3/+5
	WARNING: function definition argument 'struct calling_interface_buffer ' should also have an identifier name + int (call_fn)(struct calling_interface_buffer ); WARNING: Block comments use on subsequent lines + /* 4 bytes of table header, plus 7 bytes of Dell header, plus at least + 6 bytes of entry / WARNING: Block comments use a trailing / on a separate line + 6 bytes of entry */ Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2018-03-09	loop: Fix lost writes caused by missing flag	Ross Zwisler	1	-1/+1
	The following commit: commit aa4d86163e4e ("block: loop: switch to VFS ITER_BVEC") replaced __do_lo_send_write(), which used ITER_KVEC iterators, with lo_write_bvec() which uses ITER_BVEC iterators. In this change, though, the WRITE flag was lost: - iov_iter_kvec(&from, ITER_KVEC \| WRITE, &kvec, 1, len); + iov_iter_bvec(&i, ITER_BVEC, bvec, 1, bvec->bv_len); This flag is necessary for the DAX case because we make decisions based on whether or not the iterator is a READ or a WRITE in dax_iomap_actor() and in dax_iomap_rw(). We end up going through this path in configurations where we combine a PMEM device with 4k sectors, a loopback device and DAX. The consequence of this missed flag is that what we intend as a write actually turns into a read in the DAX code, so no data is ever written. The very simplest test case is to create a loopback device and try and write a small string to it, then hexdump a few bytes of the device to see if the write took. Without this patch you read back all zeros, with this you read back the string you wrote. For XFS this causes us to fail or panic during the following xfstests: xfs/074 xfs/078 xfs/216 xfs/217 xfs/250 For ext4 we have a similar issue where writes never happen, but we don't currently have any xfstests that use loopback and show this issue. Fix this by restoring the WRITE flag argument to iov_iter_bvec(). This causes the xfstests to all pass. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org Fixes: commit aa4d86163e4e ("block: loop: switch to VFS ITER_BVEC") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-09	clocksource/atmel-st: Add 'depends on HAS_IOMEM' to fix unmet dependency	Masahiro Yamada	1	-0/+1
	The ATMEL_ST config selects MFD_SYSCON, but does not depend on HAS_IOMEM. Compile testing on architecture without HAS_IOMEM causes "unmet direct dependencies" in Kconfig phase. Detected by "make ARCH=score allyesconfig". Add the proper dependency to the ATMEL_ST config. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Link: https://lkml.kernel.org/r/1520335233-11277-1-git-send-email-yamada.masahiro@socionext.com
2018-03-09	rtmutex: Make rt_mutex_futex_unlock() safe for irq-off callsites	Boqun Feng	1	-2/+3
	When running rcutorture with TREE03 config, CONFIG_PROVE_LOCKING=y, and kernel cmdline argument "rcutorture.gp_exp=1", lockdep reports a HARDIRQ-safe->HARDIRQ-unsafe deadlock: ================================ WARNING: inconsistent lock state 4.16.0-rc4+ #1 Not tainted -------------------------------- inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. takes: __schedule+0xbe/0xaf0 {IN-HARDIRQ-W} state was registered at: _raw_spin_lock+0x2a/0x40 scheduler_tick+0x47/0xf0 ... other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&rq->lock); <Interrupt> lock(&rq->lock); * DEADLOCK * 1 lock held by rcu_torture_rea/724: rcu_torture_read_lock+0x0/0x70 stack backtrace: CPU: 2 PID: 724 Comm: rcu_torture_rea Not tainted 4.16.0-rc4+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014 Call Trace: lock_acquire+0x90/0x200 ? __schedule+0xbe/0xaf0 _raw_spin_lock+0x2a/0x40 ? __schedule+0xbe/0xaf0 __schedule+0xbe/0xaf0 preempt_schedule_irq+0x2f/0x60 retint_kernel+0x1b/0x2d RIP: 0010:rcu_read_unlock_special+0x0/0x680 ? rcu_torture_read_unlock+0x60/0x60 __rcu_read_unlock+0x64/0x70 rcu_torture_read_unlock+0x17/0x60 rcu_torture_reader+0x275/0x450 ? rcutorture_booster_init+0x110/0x110 ? rcu_torture_stall+0x230/0x230 ? kthread+0x10e/0x130 kthread+0x10e/0x130 ? kthread_create_worker_on_cpu+0x70/0x70 ? call_usermodehelper_exec_async+0x11a/0x150 ret_from_fork+0x3a/0x50 This happens with the following even sequence: preempt_schedule_irq(); local_irq_enable(); __schedule(): local_irq_disable(); // irq off ... rcu_note_context_switch(): rcu_note_preempt_context_switch(): rcu_read_unlock_special(): local_irq_save(flags); ... raw_spin_unlock_irqrestore(...,flags); // irq remains off rt_mutex_futex_unlock(): raw_spin_lock_irq(); ... raw_spin_unlock_irq(); // accidentally set irq on <return to __schedule()> rq_lock(): raw_spin_lock(); // acquiring rq->lock with irq on which means rq->lock becomes a HARDIRQ-unsafe lock, which can cause deadlocks in scheduler code. This problem was introduced by commit 02a7c234e540 ("rcu: Suppress lockdep false-positive ->boost_mtx complaints"). That brought the user of rt_mutex_futex_unlock() with irq off. To fix this, replace the lock_irq() in rt_mutex_futex_unlock() with lock_irq{save,restore}() to make it safe to call rt_mutex_futex_unlock() with irq off. Fixes: 02a7c234e540 ("rcu: Suppress lockdep false-positive ->boost_mtx complaints") Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Link: https://lkml.kernel.org/r/20180309065630.8283-1-boqun.feng@gmail.com
2018-03-09	x86/kprobes: Fix kernel crash when probing .entry_trampoline code	Francis Deslauriers	3	-1/+12
	Disable the kprobe probing of the entry trampoline: .entry_trampoline is a code area that is used to ensure page table isolation between userspace and kernelspace. At the beginning of the execution of the trampoline, we load the kernel's CR3 register. This has the effect of enabling the translation of the kernel virtual addresses to physical addresses. Before this happens most kernel addresses can not be translated because the running process' CR3 is still used. If a kprobe is placed on the trampoline code before that change of the CR3 register happens the kernel crashes because int3 handling pages are not accessible. To fix this, add the .entry_trampoline section to the kprobe blacklist to prohibit the probing of code before all the kernel pages are accessible. Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: mathieu.desnoyers@efficios.com Cc: mhiramat@kernel.org Link: http://lkml.kernel.org/r/1520565492-4637-2-git-send-email-francis.deslauriers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-09	perf/core: Fix ctx_event_type in ctx_resched()	Song Liu	1	-1/+3
	In ctx_resched(), EVENT_FLEXIBLE should be sched_out when EVENT_PINNED is added. However, ctx_resched() calculates ctx_event_type before checking this condition. As a result, pinned events will NOT get higher priority than flexible events. The following shows this issue on an Intel CPU (where ref-cycles can only use one hardware counter). 1. First start: perf stat -C 0 -e ref-cycles -I 1000 2. Then, in the second console, run: perf stat -C 0 -e ref-cycles:D -I 1000 The second perf uses pinned events, which is expected to have higher priority. However, because it failed in ctx_resched(). It is never run. This patch fixes this by calculating ctx_event_type after re-evaluating event_type. Reported-by: Ephraim Park <ephiepark@fb.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <jolsa@redhat.com> Cc: <kernel-team@fb.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 487f05e18aa4 ("perf/core: Optimize event rescheduling on active contexts") Link: http://lkml.kernel.org/r/20180306055504.3283731-1-songliubraving@fb.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-08	nvme_fc: rework sqsize handling	James Smart	1	-10/+17
	Corrected four outstanding issues in the transport around sqsize. 1: Create Connection LS is sending the 1's-based sqsize, should be sending the 0's-based value. 2: allocation of hw queue is using the 0's-base size. It should be using the 1's-based value. 3: normalization of ctrl.sqsize by MQES is using MQES+1 (1's-based value). It should be MQES (0's-based value). 4: Missing clause to ensure queue_count not larger than ctrl->sqsize. Corrected by: Clean up routines that pass queue size around. The queue size value is the actual count (1's-based) value and determined from ctrl->sqsize + 1. Routines that send 0's-based value adapt from queue size. Sset ctrl->sqsize properly for MQES. Added clause to nsure queue_count not larger than ctrl->sqsize + 1. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <keith.busch@intel.com>
2018-03-08	ALSA: hda: add dock and led support for HP ProBook 640 G2	Dennis Wassenberg	1	-0/+1
	This patch adds missing initialisation for HP 2013 UltraSlim Dock Line-In/Out PINs and activates keyboard mute/micmute leds for HP ProBook 640 G2 Signed-off-by: Dennis Wassenberg <dennis.wassenberg@secunet.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-03-08	ALSA: hda: add dock and led support for HP EliteBook 820 G3	Dennis Wassenberg	1	-0/+1
	This patch adds missing initialisation for HP 2013 UltraSlim Dock Line-In/Out PINs and activates keyboard mute/micmute leds for HP EliteBook 820 G3 Signed-off-by: Dennis Wassenberg <dennis.wassenberg@secunet.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-03-08	nvme-fabrics: Ignore nr_io_queues option for discovery controllers	Roland Dreier	1	-0/+5
	This removes a dependency on the order options are passed when creating a fabrics controller. With the old code, if "nr_io_queues" appears before an "nqn" option specifying the discovery controller, then nr_io_queues is overridden with zero. If "nr_io_queues" appears after specifying the discovery controller, then the nr_io_queues option is used to set the number of queues, and the driver attempts to establish IO connections to the discovery controller (which doesn't work). It seems better to ignore (and warn about) the "nr_io_queues" option if userspace has already asked to connect to the discovery controller. Signed-off-by: Roland Dreier <roland@purestorage.com> Reviewed-by: James Smart <james.smart@broadcom.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <keith.busch@intel.com>
2018-03-09	kbuild: Handle builtin dtb file names containing hyphens	James Hogan	1	-4/+4
	cmd_dt_S_dtb constructs the assembly source to incorporate a devicetree FDT (that is, the .dtb file) as binary data in the kernel image. This assembly source contains labels before and after the binary data. The label names incorporate the file name of the corresponding .dtb file. Hyphens are not legal characters in labels, so .dtb files built into the kernel with hyphens in the file name result in errors like the following: bcm3368-netgear-cvg834g.dtb.S: Assembler messages: bcm3368-netgear-cvg834g.dtb.S:5: Error: : no such section bcm3368-netgear-cvg834g.dtb.S:5: Error: junk at end of line, first unrecognized character is `-' bcm3368-netgear-cvg834g.dtb.S:6: Error: unrecognized opcode `__dtb_bcm3368-netgear-cvg834g_begin:' bcm3368-netgear-cvg834g.dtb.S:8: Error: unrecognized opcode `__dtb_bcm3368-netgear-cvg834g_end:' bcm3368-netgear-cvg834g.dtb.S:9: Error: : no such section bcm3368-netgear-cvg834g.dtb.S:9: Error: junk at end of line, first unrecognized character is `-' Fix this by updating cmd_dt_S_dtb to transform all hyphens from the file name to underscores when constructing the labels. As of v4.16-rc2, 1139 .dts files across ARM64, ARM, MIPS and PowerPC contain hyphens in their names, but the issue only currently manifests on Broadcom MIPS platforms, as that is the only place where such files are built into the kernel. For example when CONFIG_DT_NETGEAR_CVG834G=y, or on BMIPS kernels when the dtbs target is used (in the latter case it admittedly shouldn't really build all the dtb.o files, but thats a separate issue). Fixes: 695835511f96 ("MIPS: BMIPS: rename bcm96358nb4ser to bcm6358-neufbox4-sercom") Signed-off-by: James Hogan <jhogan@kernel.org> Reviewed-by: Frank Rowand <frowand.list@gmail.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Michal Marek <michal.lkml@markovi.net> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Kevin Cernekee <cernekee@gmail.com> Cc: <stable@vger.kernel.org> # 4.9+ Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-09	scripts/bloat-o-meter: fix typos in help	Matteo Croce	1	-1/+1
	The bloat-o-meter script has two typos in the help, fix both. Fixes: 192efb7a1f9b ("bloat-o-meter: provide 3 different arguments for data, function and All") Signed-off-by: Matteo Croce <mcroce@redhat.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-08	x86/MCE: Serialize sysfs changes	Seunghun Han	1	-1/+21
	The check_interval file in /sys/devices/system/machinecheck/machinecheck<cpu number> directory is a global timer value for MCE polling. If it is changed by one CPU, mce_restart() broadcasts the event to other CPUs to delete and restart the MCE polling timer and __mcheck_cpu_init_timer() reinitializes the mce_timer variable. If more than one CPU writes a specific value to the check_interval file concurrently, mce_timer is not protected from such concurrent accesses and all kinds of explosions happen. Since only root can write to those sysfs variables, the issue is not a big deal security-wise. However, concurrent writes to these configuration variables is void of reason so the proper thing to do is to serialize the access with a mutex. Boris: - Make store_int_with_restart() use device_store_ulong() to filter out negative intervals - Limit min interval to 1 second - Correct locking - Massage commit message Signed-off-by: Seunghun Han <kkamagui@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180302202706.9434-1-kkamagui@gmail.com
2018-03-08	x86/MCE: Save microcode revision in machine check records	Tony Luck	2	-1/+4
	Updating microcode used to be relatively rare. Now that it has become more common we should save the microcode version in a machine check record to make sure that those people looking at the error have this important information bundled with the rest of the logged information. [ Borislav: Simplify a bit. ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Yazen Ghannam <yazen.ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180301233449.24311-1-tony.luck@intel.com
2018-03-08	xen: xenbus: use put_device() instead of kfree()	Arvind Yadav	1	-1/+4
	Never directly free @dev after calling device_register(), even if it returned an error! Always use put_device() to give up the reference initialized. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2018-03-08	ALSA: hda/realtek - Make dock sound work on ThinkPad L570	Dennis Wassenberg	1	-0/+1
	One version of Lenovo Thinkpad T570 did not use ALC298 (like other Kaby Lake devices). Instead it uses ALC292. In order to make the Lenovo dock working with that codec the dock quirk for ALC292 will be used. Signed-off-by: Dennis Wassenberg <dennis.wassenberg@secunet.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-03-08	x86/pti: Fix a comment typo	Seunghun Han	1	-1/+1
	s/visinble/visible/ Signed-off-by: Seunghun Han <kkamagui@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1520397135-132809-1-git-send-email-kkamagui@gmail.com
2018-03-08	ALSA: seq: Remove superfluous snd_seq_queue_client_leave_cells() call	Takashi Iwai	1	-1/+0
	With the previous two fixes for the write / ioctl races: ALSA: seq: Don't allow resizing pool in use ALSA: seq: More protection for concurrent write and ioctl races the cells aren't any longer in queues at the point calling snd_seq_pool_done() in snd_seq_ioctl_set_client_pool(). Hence the function call snd_seq_queue_client_leave_cells() can be dropped safely from there. Suggested-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-03-08	ALSA: seq: More protection for concurrent write and ioctl races	Takashi Iwai	4	-13/+24
	This patch is an attempt for further hardening against races between the concurrent write and ioctls. The previous fix d15d662e89fc ("ALSA: seq: Fix racy pool initializations") covered the race of the pool initialization at writer and the pool resize ioctl by the client->ioctl_mutex (CVE-2018-1000004). However, basically this mutex should be applied more widely to the whole write operation for avoiding the unexpected pool operations by another thread. The only change outside snd_seq_write() is the additional mutex argument to helper functions, so that we can unlock / relock the given mutex temporarily during schedule() call for blocking write. Fixes: d15d662e89fc ("ALSA: seq: Fix racy pool initializations") Reported-by: 范龙飞 <long7573@126.com> Reported-by: Nicolai Stange <nstange@suse.de> Reviewed-and-tested-by: Nicolai Stange <nstange@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-03-08	x86/microcode: Synchronize late microcode loading	Ashok Raj	1	-26/+92
	Original idea by Ashok, completely rewritten by Borislav. Before you read any further: the early loading method is still the preferred one and you should always do that. The following patch is improving the late loading mechanism for long running jobs and cloud use cases. Gather all cores and serialize the microcode update on them by doing it one-by-one to make the late update process as reliable as possible and avoid potential issues caused by the microcode update. [ Borislav: Rewrite completely. ] Co-developed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Link: https://lkml.kernel.org/r/20180228102846.13447-8-bp@alien8.de