linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2018-03-13	auxdisplay: img-ascii-lcd: Silence 2 uninitialized warnings	Miguel Ojeda	1	-2/+2
	The warnings are: drivers/auxdisplay/img-ascii-lcd.c: warning: 'err' may be used uninitialized in this function [-Wuninitialized] At lines 109 and 207. Reported by Geert using the build service several times, e.g.: https://lkml.org/lkml/2018/2/19/303 They are two false positives, since num_chars > 0 in the three present configurations (boston, malta, sead3). Initialize to 0 in order to silence the warning. Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Paul Burton <paul.burton@mips.com> Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
2018-03-13	auxdisplay: img-ascii-lcd: Fix doc comment to silence warnings	Miguel Ojeda	1	-1/+1
	Compiling with W=1 with gcc 7.2.0 gives 2 warnings: drivers/auxdisplay/img-ascii-lcd.c:233: warning: Function parameter or member 't' not described in 'img_ascii_lcd_scroll' drivers/auxdisplay/img-ascii-lcd.c:233: warning: Excess function parameter 'arg' description in 'img_ascii_lcd_scroll' Cc: Paul Burton <paul.burton@mips.com> Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
2018-03-13	auxdisplay: panel: Change comments to silence fallthrough warnings	Miguel Ojeda	1	-3/+3
	Compiling with W=1 with gcc 7.2.0 gives 3 warnings like: drivers/auxdisplay/panel.c: In function ‘panel_process_inputs’: drivers/auxdisplay/panel.c:1374:17: warning: this statement may fall through [-Wimplicit-fallthrough=] Cc: Willy Tarreau <w@1wt.eu> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
2018-03-13	usb: musb: Fix external abort in musb_remove on omap2430	Merlijn Wajer	1	-1/+1
	This fixes an oops on unbind / module unload (on the musb omap2430 platform). musb_remove function now calls musb_platform_exit before disabling runtime pm. Signed-off-by: Merlijn Wajer <merlijn@wizzup.org> Signed-off-by: Bin Liu <b-liu@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-13	efi/libstub/tpm: Initialize pointer variables to zero for mixed mode	Ard Biesheuvel	1	-2/+2
	As reported by Jeremy Cline, running the new TPM libstub code in mixed mode (i.e., 64-bit kernel on 32-bit UEFI) results in hangs when invoking the TCG2 protocol, or when accessing the log_tbl pool allocation. The reason turns out to be that in both cases, the 64-bit pointer variables are not fully initialized by the 32-bit EFI code, and so we should take care to zero initialize these variables beforehand, or we'll end up dereferencing bogus pointers. Reported-by: Jeremy Cline <jeremy@jcline.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: hdegoede@redhat.com Cc: jarkko.sakkinen@linux.intel.com Cc: javierm@redhat.com Cc: linux-efi@vger.kernel.org Cc: tweek@google.com Link: http://lkml.kernel.org/r/20180313140922.17266-2-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-12	drm/i915: Kick the rps worker when changing the boost frequency	Chris Wilson	1	-2/+8
	The boost frequency is only applied from the RPS worker while someone is waiting on a request and requested a boost. As such, when the user wishes to change the frequency, we have to kick the worker in order to re-evaluate whether to apply the boost frequency. v2: Check num_waiters to decide if we should kick the worker to handle boosting. Fixes: 29ecd78d3b79 ("drm/i915: Define a separate variable and control for RPS waitboost frequency") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180308142648.4016-1-chris@chris-wilson.co.uk (cherry picked from commit 59cd31f177b34deb834a5c97478502741be1cf2e) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2018-03-12	drm/i915: Only prune fences after wait-for-all	Chris Wilson	1	-4/+12
	Currently, we only allow ourselves to prune the fences so long as all the waits completed (i.e. all the fences we checked were signaled), and that the reservation snapshot did not change across the wait. However, if we only waited for a subset of the reservation object, i.e. just waiting for the last writer to complete as opposed to all readers as well, then we would erroneously conclude we could prune the fences as indeed although all of our waits were successful, they did not represent the totality of the reservation object. v2: We only need to check the shared fences due to construction (i.e. all of the shared fences will be later than the exclusive fence, if any). Fixes: e54ca9774777 ("drm/i915: Remove completed fences after a wait") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180307171303.29466-1-chris@chris-wilson.co.uk (cherry picked from commit fa73055b8442c97b3ba7cd0aa57cd2ad32124201) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2018-03-12	drm/i915: Enable VBT based BL control for DP	Mustamin B Mustaffa	1	-7/+3
	Currently, BXT_PP is hardcoded with value '0'. It practically disabled eDP backlight on MRB (BXT) platform. This patch will tell which BXT_PP registers (there are two set of PP_CONTROL in the spec) to be used as defined in VBT (Video Bios Timing table) and this will enabled eDP backlight controller on MRB (BXT) platform. v2: - Remove unnecessary information in commit message. - Assign vbt.backlight.controller to a backlight_controller variable and return the variable value. v3: - Rebased to latest code base. - updated commit title. Signed-off-by: Mustamin B Mustaffa <mustamin.b.mustaffa@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180227030734.37901-1-mustamin.b.mustaffa@intel.com (cherry picked from commit 73c0fcac97bf7f4a6a61b825b205d1cf127cfca7) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2018-03-12	ALSA: hda - Revert power_save option default value	Takashi Iwai	1	-3/+6
	With the commit 1ba8f9d30817 ("ALSA: hda: Add a power_save blacklist"), we changed the default value of power_save option to -1 for processing the power-save blacklist. Unfortunately, this seems breaking user-space applications that actually read the power_save parameter value via sysfs and judge / adjust the power-saving status. They see the value -1 as if the power-save is turned off, although the actual value is taken from CONFIG_SND_HDA_POWER_SAVE_DEFAULT and it can be a positive. So, overall, passing -1 there was no good idea. Let's partially revert it -- at least for power_save option default value is restored again to CONFIG_SND_HDA_POWER_SAVE_DEFAULT. Meanwhile, in this patch, we keep the blacklist behavior and make is adjustable via the new option, pm_blacklist. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199073 Fixes: 1ba8f9d30817 ("ALSA: hda: Add a power_save blacklist") Acked-by: Hans de Goede <hdegoede@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-03-12	x86/cpufeatures: Add Intel PCONFIG cpufeature	Kirill A. Shutemov	1	-0/+1
	CPUID.0x7.0x0:EDX[18] indicates whether Intel CPU support PCONFIG instruction. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kai Huang <kai.huang@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20180305162610.37510-4-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-12	x86/cpufeatures: Add Intel Total Memory Encryption cpufeature	Kirill A. Shutemov	1	-0/+1
	CPUID.0x7.0x0:ECX[13] indicates whether CPU supports Intel Total Memory Encryption. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kai Huang <kai.huang@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20180305162610.37510-2-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-12	phy: qcom-ufs: add MODULE_LICENSE tag	Arnd Bergmann	1	-0/+5
	While the specific UFS PHY drivers (14nm and 20nm) have a module license, the common base module does not, leading to a Kbuild failure: WARNING: modpost: missing MODULE_LICENSE() in drivers/phy/qualcomm/phy-qcom-ufs.o FATAL: modpost: GPL-incompatible module phy-qcom-ufs.ko uses GPL-only symbol 'clk_enable' This adds a module description and license tag to fix the build. I added both Yaniv and Vivek as authors here, as Yaniv sent the initial submission, while Vivek did most of the work since. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
2018-03-11	Linux 4.16-rc5	Linus Torvalds	1	-1/+1

2018-03-11	dmaengine: mv_xor_v2: Fix clock resource by adding a register clock	Gregory CLEMENT	2	-6/+25
	On the CP110 components which are present on the Armada 7K/8K SoC we need to explicitly enable the clock for the registers. However it is not needed for the AP8xx component, that's why this clock is optional. With this patch both clock have now a name, but in order to be backward compatible, the name of the first clock is not used. It allows to still use this clock with a device tree using the old binding. Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2018-03-11	irqchip/irq-imx-gpcv2: Remove unused function	Fabio Estevam	1	-14/+0
	imx_gpcv2_get_wakeup_source() is not used anywhere, so remove it. This fixes the following sparse warning: drivers/irqchip/irq-imx-gpcv2.c:34:5: warning: symbol 'imx_gpcv2_get_wakeup_source' was not declared. Should it be static? Fixes: e324c4dc4a59 ("irqchip/imx-gpcv2: IMX GPCv2 driver for wakeup sources") Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-11	irqchip/gic-v3-its: Ensure nr_ites >= nr_lpis	Ard Biesheuvel	1	-5/+4
	When struct its_device instances are created, the nr_ites member will be set to a power of 2 that equals or exceeds the requested number of MSIs passed to the msi_prepare() callback. At the same time, the LPI map is allocated to be some multiple of 32 in size, where the allocated size may be less than the requested size depending on whether a contiguous range of sufficient size is available in the global LPI bitmap. This may result in the situation where the nr_ites < nr_lpis, and since nr_ites is what we program into the hardware when we map the device, the additional LPIs will be non-functional. For bog standard hardware, this does not really matter. However, in cases where ITS device IDs are shared between different PCIe devices, we may end up allocating these additional LPIs without taking into account that they don't actually work. So let's make nr_ites at least 32. This ensures that all allocated LPIs are 'live', and that its_alloc_device_irq() will fail when attempts are made to allocate MSIs beyond what was allocated in the first place. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> [maz: updated comment] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2018-03-11	ALSA: pcm: Fix UAF in snd_pcm_oss_get_formats()	Takashi Iwai	1	-4/+6
	snd_pcm_oss_get_formats() has an obvious use-after-free around snd_mask_test() calls, as spotted by syzbot. The passed format_mask argument is a pointer to the hw_params object that is freed before the loop. What a surprise that it has been present since the original code of decades ago... Reported-by: syzbot+4090700a4f13fccaf648@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-03-10	ALSA: seq: Clear client entry before deleting else at closing	Takashi Iwai	1	-2/+2
	When releasing a client, we need to clear the clienttab[] entry at first, then call snd_seq_queue_client_leave(). Otherwise, the in-flight cell in the queue might be picked up by the timer interrupt via snd_seq_check_queue() before calling snd_seq_queue_client_leave(), and it's delivered to another queue while the client is clearing queues. This may eventually result in an uncleared cell remaining in a queue, and the later snd_seq_pool_delete() may need to wait for a long time until the event gets really processed. By moving the clienttab[] clearance at the beginning of release, any event delivery of a cell belonging to this client will fail at a later point, since snd_seq_client_ptr() returns NULL. Thus the cell that was picked up by the timer interrupt will be returned immediately without further delivery, and the long stall of snd_seq_delete_pool() can be avoided, too. Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-03-10	ALSA: seq: Fix possible UAF in snd_seq_check_queue()	Takashi Iwai	3	-37/+25
	Although we've covered the races between concurrent write() and ioctl() in the previous patch series, there is still a possible UAF in the following scenario: A: user client closed B: timer irq -> snd_seq_release() -> snd_seq_timer_interrupt() -> snd_seq_free_client() -> snd_seq_check_queue() -> cell = snd_seq_prioq_cell_peek() -> snd_seq_prioq_leave() .... removing all cells -> snd_seq_pool_done() .... vfree() -> snd_seq_compare_tick_time(cell) ... Oops So the problem is that a cell is peeked and accessed without any protection until it's retrieved from the queue again via snd_seq_prioq_cell_out(). This patch tries to address it, also cleans up the code by a slight refactoring. snd_seq_prioq_cell_out() now receives an extra pointer argument. When it's non-NULL, the function checks the event timestamp with the given pointer. The caller needs to pass the right reference either to snd_seq_tick or snd_seq_realtime depending on the event timestamp type. A good news is that the above change allows us to remove the snd_seq_prioq_cell_peek(), too, thus the patch actually reduces the code size. Reviewed-by: Nicolai Stange <nstange@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-03-09	lib/test_kmod.c: fix limit check on number of test devices created	Luis R. Rodriguez	1	-1/+1
	As reported by Dan the parentheses is in the wrong place, and since unlikely() call returns either 0 or 1 it's never less than zero. The second issue is that signed integer overflows like "INT_MAX + 1" are undefined behavior. Since num_test_devs represents the number of devices, we want to stop prior to hitting the max, and not rely on the wrap arround at all. So just cap at num_test_devs + 1, prior to assigning a new device. Link: http://lkml.kernel.org/r/20180224030046.24238-1-mcgrof@kernel.org Fixes: d9c6a72d6fa2 ("kmod: add test driver to stress test the module loader") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09	selftests/vm/run_vmtests: adjust hugetlb size according to nr_cpus	Li Zhijian	1	-8/+17
	Fix userfaultfd_hugetlb on hosts which have more than 64 cpus. --------------------------- running userfaultfd_hugetlb --------------------------- invalid MiB Usage: <MiB> <bounces> [FAIL] Via userfaultfd.c we can know, hugetlb_size needs to meet hugetlb_size >= nr_cpus * hugepage_size. hugepage_size is often 2M, so when host cpus > 64, it requires more than 128M. [zhijianx.li@intel.com: update changelog/comments and variable name] Link: http://lkml.kernel.org/r/20180302024356.83359-1-zhijianx.li@intel.com Link: http://lkml.kernel.org/r/20180303125027.81638-1-zhijianx.li@intel.com Link: http://lkml.kernel.org/r/20180302024356.83359-1-zhijianx.li@intel.com Signed-off-by: Li Zhijian <zhijianx.li@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: SeongJae Park <sj38.park@gmail.com> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09	mm/page_alloc: fix memmap_init_zone pageblock alignment	Daniel Vacek	1	-2/+7
	Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible") introduced a bug where move_freepages() triggers a VM_BUG_ON() on uninitialized page structure due to pageblock alignment. To fix this, simply align the skipped pfns in memmap_init_zone() the same way as in move_freepages_block(). Seen in one of the RHEL reports: crash> log \| grep -e BUG -e RIP -e Call.Trace -e move_freepages_block -e rmqueue -e freelist -A1 kernel BUG at mm/page_alloc.c:1389! invalid opcode: 0000 [#1] SMP -- RIP: 0010:[<ffffffff8118833e>] [<ffffffff8118833e>] move_freepages+0x15e/0x160 RSP: 0018:ffff88054d727688 EFLAGS: 00010087 -- Call Trace: [<ffffffff811883b3>] move_freepages_block+0x73/0x80 [<ffffffff81189e63>] __rmqueue+0x263/0x460 [<ffffffff8118c781>] get_page_from_freelist+0x7e1/0x9e0 [<ffffffff8118caf6>] __alloc_pages_nodemask+0x176/0x420 -- RIP [<ffffffff8118833e>] move_freepages+0x15e/0x160 RSP <ffff88054d727688> crash> page_init_bug -v \| grep RAM <struct resource 0xffff88067fffd2f8> 1000 - 9bfff System RAM (620.00 KiB) <struct resource 0xffff88067fffd3a0> 100000 - 430bffff System RAM ( 1.05 GiB = 1071.75 MiB = 1097472.00 KiB) <struct resource 0xffff88067fffd410> 4b0c8000 - 4bf9cfff System RAM ( 14.83 MiB = 15188.00 KiB) <struct resource 0xffff88067fffd480> 4bfac000 - 646b1fff System RAM (391.02 MiB = 400408.00 KiB) <struct resource 0xffff88067fffd560> 7b788000 - 7b7fffff System RAM (480.00 KiB) <struct resource 0xffff88067fffd640> 100000000 - 67fffffff System RAM ( 22.00 GiB) crash> page_init_bug \| head -6 <struct resource 0xffff88067fffd560> 7b788000 - 7b7fffff System RAM (480.00 KiB) <struct page 0xffffea0001ede200> 1fffff00000000 0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32 4096 1048575 <struct page 0xffffea0001ede200> 505736 505344 <struct page 0xffffea0001ed8000> 505855 <struct page 0xffffea0001edffc0> <struct page 0xffffea0001ed8000> 0 0 <struct pglist_data 0xffff88047ffd9000> 0 <struct zone 0xffff88047ffd9000> DMA 1 4095 <struct page 0xffffea0001edffc0> 1fffff00000400 0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32 4096 1048575 BUG, zones differ! Note that this range follows two not populated sections 68000000-77ffffff in this zone. 7b788000-7b7fffff is the first one after a gap. This makes memmap_init_zone() skip all the pfns up to the beginning of this range. But this range is not pageblock (2M) aligned. In fact no range has to be. crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b787000 7b788000 PAGE PHYSICAL MAPPING INDEX CNT FLAGS ffffea0001e00000 78000000 0 0 0 0 ffffea0001ed7fc0 7b5ff000 0 0 0 0 ffffea0001ed8000 7b600000 0 0 0 0 <<<< ffffea0001ede1c0 7b787000 0 0 0 0 ffffea0001ede200 7b788000 0 0 1 1fffff00000000 Top part of page flags should contain nodeid and zonenr, which is not the case for page ffffea0001ed8000 here (<<<<). crash> log \| grep -o fffea0001ed[^\ ]* \| sort -u fffea0001ed8000 fffea0001eded20 fffea0001edffc0 crash> bt -r \| grep -o fffea0001ed[^\ ]* \| sort -u fffea0001ed8000 fffea0001eded00 fffea0001eded20 fffea0001edffc0 Initialization of the whole beginning of the section is skipped up to the start of the range due to the commit b92df1de5d28. Now any code calling move_freepages_block() (like reusing the page from a freelist as in this example) with a page from the beginning of the range will get the page rounded down to start_page ffffea0001ed8000 and passed to move_freepages() which crashes on assertion getting wrong zonenr. > VM_BUG_ON(page_zone(start_page) != page_zone(end_page)); Note, page_zone() derives the zone from page flags here. From similar machine before commit b92df1de5d28: crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000 PAGE PHYSICAL MAPPING INDEX CNT FLAGS fffff73941e00000 78000000 0 0 1 1fffff00000000 fffff73941ed7fc0 7b5ff000 0 0 1 1fffff00000000 fffff73941ed8000 7b600000 0 0 1 1fffff00000000 fffff73941edff80 7b7fe000 0 0 1 1fffff00000000 fffff73941edffc0 7b7ff000 ffff8e67e04d3ae0 ad84 1 1fffff00020068 uptodate,lru,active,mappedtodisk All the pages since the beginning of the section are initialized. move_freepages()' not gonna blow up. The same machine with this fix applied: crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000 PAGE PHYSICAL MAPPING INDEX CNT FLAGS ffffea0001e00000 78000000 0 0 0 0 ffffea0001e00000 7b5ff000 0 0 0 0 ffffea0001ed8000 7b600000 0 0 1 1fffff00000000 ffffea0001edff80 7b7fe000 0 0 1 1fffff00000000 ffffea0001edffc0 7b7ff000 ffff88017fb13720 8 2 1fffff00020068 uptodate,lru,active,mappedtodisk At least the bare minimum of pages is initialized preventing the crash as well. Customers started to report this as soon as 7.4 (where b92df1de5d28 was merged in RHEL) was released. I remember reports from September/October-ish times. It's not easily reproduced and happens on a handful of machines only. I guess that's why. But that does not make it less serious, I think. Though there actually is a report here: https://bugzilla.kernel.org/show_bug.cgi?id=196443 And there are reports for Fedora from July: https://bugzilla.redhat.com/show_bug.cgi?id=1473242 and CentOS: https://bugs.centos.org/view.php?id=13964 and we internally track several dozens reports for RHEL bug https://bugzilla.redhat.com/show_bug.cgi?id=1525121 Link: http://lkml.kernel.org/r/0485727b2e82da7efbce5f6ba42524b429d0391a.1520011945.git.neelx@redhat.com Fixes: b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible") Signed-off-by: Daniel Vacek <neelx@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09	mm/memblock.c: hardcode the end_pfn being -1	Daniel Vacek	1	-5/+5
	This is just a cleanup. It aids handling the special end case in the next commit. [akpm@linux-foundation.org: make it work against current -linus, not against -mm] [akpm@linux-foundation.org: make it work against current -linus, not against -mm some more] Link: http://lkml.kernel.org/r/1ca478d4269125a99bcfb1ca04d7b88ac1aee924.1520011944.git.neelx@redhat.com Signed-off-by: Daniel Vacek <neelx@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09	mm/gup.c: teach get_user_pages_unlocked to handle FOLL_NOWAIT	Andrea Arcangeli	1	-2/+5
	KVM is hanging during postcopy live migration with userfaultfd because get_user_pages_unlocked is not capable to handle FOLL_NOWAIT. Earlier FOLL_NOWAIT was only ever passed to get_user_pages. Specifically faultin_page (the callee of get_user_pages_unlocked caller) doesn't know that if FAULT_FLAG_RETRY_NOWAIT was set in the page fault flags, when VM_FAULT_RETRY is returned, the mmap_sem wasn't actually released (even if nonblocking is not NULL). So it sets *nonblocking to zero and the caller won't release the mmap_sem thinking it was already released, but it wasn't because of FOLL_NOWAIT. Link: http://lkml.kernel.org/r/20180302174343.5421-2-aarcange@redhat.com Fixes: ce53053ce378c ("kvm: switch get_user_page_nowait() to get_user_pages_unlocked()") Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09	lib/bug.c: exclude non-BUG/WARN exceptions from report_bug()	Kees Cook	1	-0/+2
	Commit b8347c219649 ("x86/debug: Handle warnings before the notifier chain, to fix KGDB crash") changed the ordering of fixups, and did not take into account the case of x86 processing non-WARN() and non-BUG() exceptions. This would lead to output of a false BUG line with no other information. In the case of a refcount exception, it would be immediately followed by the refcount WARN(), producing very strange double-"cut here": lkdtm: attempting bad refcount_inc() overflow ------------[ cut here ]------------ Kernel BUG at 0000000065f29de5 [verbose debug info unavailable] ------------[ cut here ]------------ refcount_t overflow at lkdtm_REFCOUNT_INC_OVERFLOW+0x6b/0x90 in cat[3065], uid/euid: 0/0 WARNING: CPU: 0 PID: 3065 at kernel/panic.c:657 refcount_error_report+0x9a/0xa4 ... In the prior ordering, exceptions were searched first: do_trap_no_signal(struct task_struct tsk, int trapnr, char str, ... if (fixup_exception(regs, trapnr)) return 0; - if (fixup_bug(regs, trapnr)) - return 0; - As a result, fixup_bugs()'s is_valid_bugaddr() didn't take into account needing to search the exception list first, since that had already happened. So, instead of searching the exception list twice (once in is_valid_bugaddr() and then again in fixup_exception()), just add a simple sanity check to report_bug() that will immediately bail out if a BUG() (or WARN()) entry is not found. Link: http://lkml.kernel.org/r/20180301225934.GA34350@beast Fixes: b8347c219649 ("x86/debug: Handle warnings before the notifier chain, to fix KGDB crash") Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Richard Weinberger <richard.weinberger@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09	bug: use %pB in BUG and stack protector failure	Kees Cook	2	-2/+2
	The BUG and stack protector reports were still using a raw %p. This changes it to %pB for more meaningful output. Link: http://lkml.kernel.org/r/20180301225704.GA34198@beast Fixes: ad67b74d2469 ("printk: hash addresses printed with %p") Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Richard Weinberger <richard.weinberger@gmail.com>, Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09	hugetlb: fix surplus pages accounting	Michal Hocko	1	-1/+1
	Dan Rue has noticed that libhugetlbfs test suite fails counter test: # mount_point="/mnt/hugetlb/" # echo 200 > /proc/sys/vm/nr_hugepages # mkdir -p "${mount_point}" # mount -t hugetlbfs hugetlbfs "${mount_point}" # export LD_LIBRARY_PATH=/root/libhugetlbfs/libhugetlbfs-2.20/obj64 # /root/libhugetlbfs/libhugetlbfs-2.20/tests/obj64/counters Starting testcase "/root/libhugetlbfs/libhugetlbfs-2.20/tests/obj64/counters", pid 3319 Base pool size: 0 Clean... FAIL Line 326: Bad HugePages_Total: expected 0, actual 1 The bug was bisected to 0c397daea1d4 ("mm, hugetlb: further simplify hugetlb allocation API"). The reason is that alloc_surplus_huge_page() misaccounts per node surplus pages. We should increase surplus_huge_pages_node rather than nr_huge_pages_node which is already handled by alloc_fresh_huge_page. Link: http://lkml.kernel.org/r/20180221191439.GM2231@dhcp22.suse.cz Fixes: 0c397daea1d4 ("mm, hugetlb: further simplify hugetlb allocation API") Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Dan Rue <dan.rue@linaro.org> Tested-by: Dan Rue <dan.rue@linaro.org> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-09	RDMA/mlx5: Fix integer overflow while resizing CQ	Leon Romanovsky	1	-1/+6
	The user can provide very large cqe_size which will cause to integer overflow as it can be seen in the following UBSAN warning: ======================================================================= UBSAN: Undefined behaviour in drivers/infiniband/hw/mlx5/cq.c:1192:53 signed integer overflow: 64870 * 65536 cannot be represented in type 'int' CPU: 0 PID: 267 Comm: syzkaller605279 Not tainted 4.15.0+ #90 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 Call Trace: dump_stack+0xde/0x164 ? dma_virt_map_sg+0x22c/0x22c ubsan_epilogue+0xe/0x81 handle_overflow+0x1f3/0x251 ? __ubsan_handle_negate_overflow+0x19b/0x19b ? lock_acquire+0x440/0x440 mlx5_ib_resize_cq+0x17e7/0x1e40 ? cyc2ns_read_end+0x10/0x10 ? native_read_msr_safe+0x6c/0x9b ? cyc2ns_read_end+0x10/0x10 ? mlx5_ib_modify_cq+0x220/0x220 ? sched_clock_cpu+0x18/0x200 ? lookup_get_idr_uobject+0x200/0x200 ? rdma_lookup_get_uobject+0x145/0x2f0 ib_uverbs_resize_cq+0x207/0x3e0 ? ib_uverbs_ex_create_cq+0x250/0x250 ib_uverbs_write+0x7f9/0xef0 ? cyc2ns_read_end+0x10/0x10 ? print_irqtrace_events+0x280/0x280 ? ib_uverbs_ex_create_cq+0x250/0x250 ? uverbs_devnode+0x110/0x110 ? sched_clock_cpu+0x18/0x200 ? do_raw_spin_trylock+0x100/0x100 ? __lru_cache_add+0x16e/0x290 __vfs_write+0x10d/0x700 ? uverbs_devnode+0x110/0x110 ? kernel_read+0x170/0x170 ? sched_clock_cpu+0x18/0x200 ? security_file_permission+0x93/0x260 vfs_write+0x1b0/0x550 SyS_write+0xc7/0x1a0 ? SyS_read+0x1a0/0x1a0 ? trace_hardirqs_on_thunk+0x1a/0x1c entry_SYSCALL_64_fastpath+0x1e/0x8b RIP: 0033:0x433549 RSP: 002b:00007ffe63bd1ea8 EFLAGS: 00000217 ======================================================================= Cc: syzkaller <syzkaller@googlegroups.com> Cc: <stable@vger.kernel.org> # 3.13 Fixes: bde51583f49b ("IB/mlx5: Add support for resize CQ") Reported-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-09	Revert "RDMA/mlx5: Fix integer overflow while resizing CQ"	Doug Ledford	1	-6/+1
	The original commit of this patch has a munged log message that is missing several of the tags the original author intended to be on the patch. This was due to patchworks misinterpreting a cut-n-paste separator line as an end of message line and munging the mbox that was used to import the patch: https://patchwork.kernel.org/patch/10264089/ The original patch will be reapplied with a fixed commit message so the proper tags are applied. This reverts commit aa0de36a40f446f5a21a7c1e677b98206e242edb. Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-09	usb: typec: tcpm: fusb302: Do not log an error on -EPROBE_DEFER	Hans de Goede	1	-1/+2
	Do not log an error if tcpm_register_port() fails with -EPROBE_DEFER. Fixes: cf140a356971 ("typec: fusb302: Use dev_err during probe") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	USB: OHCI: Fix NULL dereference in HCDs using HCD_LOCAL_MEM	Fredrik Noring	1	-1/+2
	Scatter-gather needs to be disabled when using dma_declare_coherent_memory and HCD_LOCAL_MEM. Andrea Righi made the equivalent fix for EHCI drivers in commit 4307a28eb01284 "USB: EHCI: fix NULL pointer dererence in HCDs that use HCD_LOCAL_MEM". The following NULL pointer WARN_ON_ONCE triggered with OHCI drivers: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 49 at drivers/usb/core/hcd.c:1379 hcd_alloc_coherent+0x4c/0xc8 Modules linked in: CPU: 0 PID: 49 Comm: usb-storage Not tainted 4.15.0+ #1014 Stack : 00000000 00000000 805a78d2 0000003a 81f5c2cc 8053d367 804d77fc 00000031 805a3a08 00000563 81ee9400 805a0000 00000000 10058c00 81f61b10 805c0000 00000000 00000000 805a0000 00d9038e 00000004 803ee818 00000006 312e3420 805c0000 00000000 00000073 81f61958 00000000 00000000 802eb380 804fd538 00000009 00000563 81ee9400 805a0000 00000002 80056148 00000000 805a0000 ... Call Trace: [<578af360>] show_stack+0x74/0x104 [<2f3702c6>] __warn+0x118/0x120 [<ae93fc9e>] warn_slowpath_null+0x44/0x58 [<a891a517>] hcd_alloc_coherent+0x4c/0xc8 [<3578fa36>] usb_hcd_map_urb_for_dma+0x4d8/0x534 [<110bc94c>] usb_hcd_submit_urb+0x82c/0x834 [<02eb5baf>] usb_sg_wait+0x14c/0x1a0 [<ccd09e85>] usb_stor_bulk_transfer_sglist.part.1+0xac/0x124 [<87a5c34c>] usb_stor_bulk_srb+0x40/0x60 [<ff1792ac>] usb_stor_Bulk_transport+0x160/0x37c [<b9e2709c>] usb_stor_invoke_transport+0x3c/0x500 [<004754f4>] usb_stor_control_thread+0x258/0x28c [<22edf42e>] kthread+0x134/0x13c [<a419ffd0>] ret_from_kernel_thread+0x14/0x1c ---[ end trace bcdb825805eefdcc ]--- Signed-off-by: Fredrik Noring <noring@nocrew.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	usbip: vudc: fix null pointer dereference on udc->lock	Colin Ian King	1	-2/+6
	Currently the driver attempts to spin lock on udc->lock before a NULL pointer check is performed on udc, hence there is a potential null pointer dereference on udc->lock. Fix this by moving the null check on udc before the lock occurs. Fixes: ea6873a45a22 ("usbip: vudc: Add SysFS infrastructure for VUDC") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Shuah Khan <shuahkh@osg.samsung.com> Reviewed-by: Krzysztof Opasiak <k.opasiak@samsung.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	arm64: Relax ARM_SMCCC_ARCH_WORKAROUND_1 discovery	Marc Zyngier	1	-2/+2
	A recent update to the ARM SMCCC ARCH_WORKAROUND_1 specification allows firmware to return a non zero, positive value to describe that although the mitigation is implemented at the higher exception level, the CPU on which the call is made is not affected. Let's relax the check on the return value from ARCH_WORKAROUND_1 so that we only error out if the returned value is negative. Fixes: b092201e0020 ("arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-03-09	Documentation/sphinx: Fix Directive import error	Matthew Wilcox	1	-2/+1
	Sphinx 1.7 removed sphinx.util.compat.Directive so people who have upgraded cannot build the documentation. Switch to docutils.parsers.rst.Directive which has been available since docutils 0.5 released in 2009. Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=1083694 Co-developed-by: Takashi Iwai <tiwai@suse.de> Acked-by: Jani Nikula <jani.nikula@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2018-03-09	platform/x86: dell-smbios: Resolve dependency error on DCDBAS	Darren Hart (VMware)	1	-0/+6
	When the DELL_SMBIOS_SMM backend is enabled, the DELL_SMBIOS symbol depends on DELL_DCDBAS, and we must avoid the situation where DELL_SMBIOS=y and DCDBAS=m. Adding the conditional dependency to DELL_SMBIOS such as: depends !DELL_SMBIOS_SMM \|\| (DCDBAS \|\| DCDBAS=n) results in the Kconfig tooling complaining about a circular dependency, although it appears to work in practice. Avoid the errors by simplifying the dependency and forcing DELL_SMBIOS to be <= DCDBAS if DCDBAS is enabled (thanks to Greg KH for the suggestion). Cc: Mario.Limonciello@dell.com Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2018-03-09	platform/x86: Allow for SMBIOS backend defaults	Darren Hart (VMware)	1	-2/+4
	Avoid accidental configurations by setting default y for DELL_SMBIOS backends. Avoid this impacting the default build size, by making them dependent on DELL_SMBIOS, so they only appear when DELL_SMBIOS is manually selected, or by DELL_LAPTOP or DELL_WMI. While DELL_SMBIOS does have a prompt, it does not have any dependencies. Keeping DELL_SMBIOS visible, despite being "select"ed by DELL_LAPTOP and DELL_WMI, is a deliberate choice to provide context for the WMI and SMM backends, which would otherwise appear to float without context within the menu. Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2018-03-09	platform/x86: dell-smbios: Link all dell-smbios-* modules together	Mario Limonciello	6	-33/+66
	Some race conditions were raised due to dell-smbios and its backends not being ready by the time that a consumer would call one of the exported methods. To avoid this problem, guarantee that all initialization has been done by linking them all together and running init for them all. As part of this change the Kconfig needs to be adjusted so that CONFIG_DELL_SMBIOS_SMM and CONFIG_DELL_SMBIOS_WMI are boolean rather than modules. CONFIG_DELL_SMBIOS is a visually selectable option again and both CONFIG_DELL_SMBIOS_WMI and CONFIG_DELL_SMBIOS_SMM are optional. Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> [dvhart: Update prompt and help text for DELL_SMBIOS_* backends] Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2018-03-09	platform/x86: dell-smbios: Rename dell-smbios source to dell-smbios-base	Mario Limonciello	2	-0/+1
	This is being done to faciliate a later change to link all the dell-smbios drivers together. Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2018-03-09	platform/x86: dell-smbios: Correct some style warnings	Mario Limonciello	1	-3/+5
	WARNING: function definition argument 'struct calling_interface_buffer ' should also have an identifier name + int (call_fn)(struct calling_interface_buffer ); WARNING: Block comments use on subsequent lines + /* 4 bytes of table header, plus 7 bytes of Dell header, plus at least + 6 bytes of entry / WARNING: Block comments use a trailing / on a separate line + 6 bytes of entry */ Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2018-03-09	xhci: Fix front USB ports on ASUS PRIME B350M-A	Kai-Heng Feng	3	-0/+7
	When a USB device gets plugged on ASUS PRIME B350M-A's front ports, the xHC stops working: [ 549.114587] xhci_hcd 0000:02:00.0: WARN: xHC CMD_RUN timeout [ 549.114608] suspend_common(): xhci_pci_suspend+0x0/0xc0 returns -110 [ 549.114638] xhci_hcd 0000:02:00.0: can't suspend (hcd_pci_runtime_suspend returned -110) Delay before running xHC command CMD_RUN can workaround the issue. Use a new quirk to make the delay only targets to the affected xHC. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	usb: host: xhci-plat: revert "usb: host: xhci-plat: enable clk in resume timing"	Yoshihiro Shimoda	1	-10/+1
	This patch reverts the commit 835e4241e714 ("usb: host: xhci-plat: enable clk in resume timing") because this driver also has runtime PM and the commit 560869100b99 ("clk: renesas: cpg-mssr: Restore module clocks during resume") will restore the clock on R-Car H3 environment. If the xhci_plat_suspend() disables the clk, the system cannot enable the clk in resume like the following behavior: < In resume > - genpd_resume_noirq() runs and enable the clk (enable_count = 1) - cpg_mssr_resume_noirq() restores the clk register. -- Since the clk was disabled in suspend, cpg_mssr_resume_noirq() will disable the clk and keep the enable_count. - Even if xhci_plat_resume() calls clk_prepare_enable(), since the enable_count is 1, the clk will be not enabled. After this patch is applied, the cpg-mssr driver will save the clk as enable, so the clk will be enabled in resume. Fixes: 835e4241e714 ("usb: host: xhci-plat: enable clk in resume timing") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	ASoC: amd: 16bit resolution support for i2s sp instance	Vijendar Mukunda	2	-7/+11
	Moved 16bit resolution condition check for stoney platform to acp_hw_params.Depending upon substream required register value need to be programmed rather than enabling 16bit resolution support all time in acp init. Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-03-09	usb: usbmon: Read text within supplied buffer size	Pete Zaitcev	1	-48/+78
	This change fixes buffer overflows and silent data corruption with the usbmon device driver text file read operations. Signed-off-by: Fredrik Noring <noring@nocrew.org> Signed-off-by: Pete Zaitcev <zaitcev@redhat.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	loop: Fix lost writes caused by missing flag	Ross Zwisler	1	-1/+1
	The following commit: commit aa4d86163e4e ("block: loop: switch to VFS ITER_BVEC") replaced __do_lo_send_write(), which used ITER_KVEC iterators, with lo_write_bvec() which uses ITER_BVEC iterators. In this change, though, the WRITE flag was lost: - iov_iter_kvec(&from, ITER_KVEC \| WRITE, &kvec, 1, len); + iov_iter_bvec(&i, ITER_BVEC, bvec, 1, bvec->bv_len); This flag is necessary for the DAX case because we make decisions based on whether or not the iterator is a READ or a WRITE in dax_iomap_actor() and in dax_iomap_rw(). We end up going through this path in configurations where we combine a PMEM device with 4k sectors, a loopback device and DAX. The consequence of this missed flag is that what we intend as a write actually turns into a read in the DAX code, so no data is ever written. The very simplest test case is to create a loopback device and try and write a small string to it, then hexdump a few bytes of the device to see if the write took. Without this patch you read back all zeros, with this you read back the string you wrote. For XFS this causes us to fail or panic during the following xfstests: xfs/074 xfs/078 xfs/216 xfs/217 xfs/250 For ext4 we have a similar issue where writes never happen, but we don't currently have any xfstests that use loopback and show this issue. Fix this by restoring the WRITE flag argument to iov_iter_bvec(). This causes the xfstests to all pass. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org Fixes: commit aa4d86163e4e ("block: loop: switch to VFS ITER_BVEC") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-09	drm/i915/gvt: keep oa config in shadow ctx	Min He	2	-0/+54
	When populating shadow ctx from guest, we should handle oa related registers in hw ctx, so that they will not be overlapped by guest oa configs. This patch made it possible to capture oa data from host for both host and guests. Signed-off-by: Min He <min.he@intel.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
2018-03-09	drm/i915/gvt: Add runtime_pm_get/put into gvt_switch_mmio	Xiong Zhang	1	-0/+2
	If user continuously create vgpu, boot guest, shoutdown guest and destroy vgpu from remote, the following calltrace exists in dmesg sometimes: [ 6412.954721] RPM wakelock ref not held during HW access [ 6412.954795] WARNING: CPU: 7 PID: 11941 at linux/drivers/gpu/drm/i915/intel_drv.h:1800 intel_uncore_forcewake_get.part.7+0x96/0xa0 [i915] [ 6412.954915] Call Trace: [ 6412.954951] intel_uncore_forcewake_get+0x18/0x20 [i915] [ 6412.954989] intel_gvt_switch_mmio+0x8e/0x770 [i915] [ 6412.954996] ? __slab_free+0x14d/0x2c0 [ 6412.955001] ? __slab_free+0x14d/0x2c0 [ 6412.955006] ? __slab_free+0x14d/0x2c0 [ 6412.955041] intel_vgpu_stop_schedule+0x92/0xd0 [i915] [ 6412.955073] intel_gvt_deactivate_vgpu+0x48/0x60 [i915] [ 6412.955078] __intel_vgpu_release+0x55/0x260 [kvmgt] when this happens, gvt_switch_mmio is called at vgpu destroy, host i915 is idle and doesn't hold RPM wakelock, igd is in powersave mode, but gvt_switch_mmio require igd power on to access register, so intel_runtime_pm_get should be added to make sure igd power on before gvt_switch_mmio. v2: Move runtime_pm_get/put into gvt_switch_mmio.(Zhenyu) Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
2018-03-09	clocksource/atmel-st: Add 'depends on HAS_IOMEM' to fix unmet dependency	Masahiro Yamada	1	-0/+1
	The ATMEL_ST config selects MFD_SYSCON, but does not depend on HAS_IOMEM. Compile testing on architecture without HAS_IOMEM causes "unmet direct dependencies" in Kconfig phase. Detected by "make ARCH=score allyesconfig". Add the proper dependency to the ATMEL_ST config. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Link: https://lkml.kernel.org/r/1520335233-11277-1-git-send-email-yamada.masahiro@socionext.com
2018-03-09	rtmutex: Make rt_mutex_futex_unlock() safe for irq-off callsites	Boqun Feng	1	-2/+3
	When running rcutorture with TREE03 config, CONFIG_PROVE_LOCKING=y, and kernel cmdline argument "rcutorture.gp_exp=1", lockdep reports a HARDIRQ-safe->HARDIRQ-unsafe deadlock: ================================ WARNING: inconsistent lock state 4.16.0-rc4+ #1 Not tainted -------------------------------- inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. takes: __schedule+0xbe/0xaf0 {IN-HARDIRQ-W} state was registered at: _raw_spin_lock+0x2a/0x40 scheduler_tick+0x47/0xf0 ... other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&rq->lock); <Interrupt> lock(&rq->lock); * DEADLOCK * 1 lock held by rcu_torture_rea/724: rcu_torture_read_lock+0x0/0x70 stack backtrace: CPU: 2 PID: 724 Comm: rcu_torture_rea Not tainted 4.16.0-rc4+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014 Call Trace: lock_acquire+0x90/0x200 ? __schedule+0xbe/0xaf0 _raw_spin_lock+0x2a/0x40 ? __schedule+0xbe/0xaf0 __schedule+0xbe/0xaf0 preempt_schedule_irq+0x2f/0x60 retint_kernel+0x1b/0x2d RIP: 0010:rcu_read_unlock_special+0x0/0x680 ? rcu_torture_read_unlock+0x60/0x60 __rcu_read_unlock+0x64/0x70 rcu_torture_read_unlock+0x17/0x60 rcu_torture_reader+0x275/0x450 ? rcutorture_booster_init+0x110/0x110 ? rcu_torture_stall+0x230/0x230 ? kthread+0x10e/0x130 kthread+0x10e/0x130 ? kthread_create_worker_on_cpu+0x70/0x70 ? call_usermodehelper_exec_async+0x11a/0x150 ret_from_fork+0x3a/0x50 This happens with the following even sequence: preempt_schedule_irq(); local_irq_enable(); __schedule(): local_irq_disable(); // irq off ... rcu_note_context_switch(): rcu_note_preempt_context_switch(): rcu_read_unlock_special(): local_irq_save(flags); ... raw_spin_unlock_irqrestore(...,flags); // irq remains off rt_mutex_futex_unlock(): raw_spin_lock_irq(); ... raw_spin_unlock_irq(); // accidentally set irq on <return to __schedule()> rq_lock(): raw_spin_lock(); // acquiring rq->lock with irq on which means rq->lock becomes a HARDIRQ-unsafe lock, which can cause deadlocks in scheduler code. This problem was introduced by commit 02a7c234e540 ("rcu: Suppress lockdep false-positive ->boost_mtx complaints"). That brought the user of rt_mutex_futex_unlock() with irq off. To fix this, replace the lock_irq() in rt_mutex_futex_unlock() with lock_irq{save,restore}() to make it safe to call rt_mutex_futex_unlock() with irq off. Fixes: 02a7c234e540 ("rcu: Suppress lockdep false-positive ->boost_mtx complaints") Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Link: https://lkml.kernel.org/r/20180309065630.8283-1-boqun.feng@gmail.com
2018-03-09	x86/kprobes: Fix kernel crash when probing .entry_trampoline code	Francis Deslauriers	3	-1/+12
	Disable the kprobe probing of the entry trampoline: .entry_trampoline is a code area that is used to ensure page table isolation between userspace and kernelspace. At the beginning of the execution of the trampoline, we load the kernel's CR3 register. This has the effect of enabling the translation of the kernel virtual addresses to physical addresses. Before this happens most kernel addresses can not be translated because the running process' CR3 is still used. If a kprobe is placed on the trampoline code before that change of the CR3 register happens the kernel crashes because int3 handling pages are not accessible. To fix this, add the .entry_trampoline section to the kprobe blacklist to prohibit the probing of code before all the kernel pages are accessible. Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: mathieu.desnoyers@efficios.com Cc: mhiramat@kernel.org Link: http://lkml.kernel.org/r/1520565492-4637-2-git-send-email-francis.deslauriers@efficios.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-09	perf/core: Fix ctx_event_type in ctx_resched()	Song Liu	1	-1/+3
	In ctx_resched(), EVENT_FLEXIBLE should be sched_out when EVENT_PINNED is added. However, ctx_resched() calculates ctx_event_type before checking this condition. As a result, pinned events will NOT get higher priority than flexible events. The following shows this issue on an Intel CPU (where ref-cycles can only use one hardware counter). 1. First start: perf stat -C 0 -e ref-cycles -I 1000 2. Then, in the second console, run: perf stat -C 0 -e ref-cycles:D -I 1000 The second perf uses pinned events, which is expected to have higher priority. However, because it failed in ctx_resched(). It is never run. This patch fixes this by calculating ctx_event_type after re-evaluating event_type. Reported-by: Ephraim Park <ephiepark@fb.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <jolsa@redhat.com> Cc: <kernel-team@fb.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 487f05e18aa4 ("perf/core: Optimize event rescheduling on active contexts") Link: http://lkml.kernel.org/r/20180306055504.3283731-1-songliubraving@fb.com Signed-off-by: Ingo Molnar <mingo@kernel.org>