linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2015-12-29	ocfs2/dlm: clear migration_pending when migration target goes down	xuejiufei	1	-0/+2
	We have found a BUG on res->migration_pending when migrating lock resources. The situation is as follows. dlm_mark_lockres_migration res->migration_pending = 1; __dlm_lockres_reserve_ast dlm_lockres_release_ast returns with res->migration_pending remains because other threads reserve asts wait dlm_migration_can_proceed returns 1 >>>>>>> o2hb found that target goes down and remove target from domain_map dlm_migration_can_proceed returns 1 dlm_mark_lockres_migrating returns -ESHOTDOWN with res->migration_pending still remains. When reentering dlm_mark_lockres_migrating(), it will trigger the BUG_ON with res->migration_pending. So clear migration_pending when target is down. Signed-off-by: Jiufei Xue <xuejiufei@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29	mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone()	Andrew Banman	1	-12/+19
	test_pages_in_a_zone() does not account for the possibility of missing sections in the given pfn range. pfn_valid_within always returns 1 when CONFIG_HOLES_IN_ZONE is not set, allowing invalid pfns from missing sections to pass the test, leading to a kernel oops. Wrap an additional pfn loop with PAGES_PER_SECTION granularity to check for missing sections before proceeding into the zone-check code. This also prevents a crash from offlining memory devices with missing sections. Despite this, it may be a good idea to keep the related patch '[PATCH 3/3] drivers: memory: prohibit offlining of memory blocks with missing sections' because missing sections in a memory block may lead to other problems not covered by the scope of this fix. Signed-off-by: Andrew Banman <abanman@sgi.com> Acked-by: Alex Thorlton <athorlton@sgi.com> Cc: Russ Anderson <rja@sgi.com> Cc: Alex Thorlton <athorlton@sgi.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Greg KH <greg@kroah.com> Cc: Seth Jennings <sjennings@variantweb.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29	ocfs2: fix flock panic issue	Junxiao Bi	1	-1/+4
	Commit 4f6563677ae8 ("Move locks API users to locks_lock_inode_wait()") move flock/posix lock indentify code to locks_lock_inode_wait(), but missed to set fl_flags to FL_FLOCK which caused the following kernel panic on 4.4.0_rc5. kernel BUG at fs/locks.c:1895! invalid opcode: 0000 [#1] SMP Modules linked in: ocfs2(O) ocfs2_dlmfs(O) ocfs2_stack_o2cb(O) ocfs2_dlm(O) ocfs2_nodemanager(O) ocfs2_stackglue(O) iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi xen_kbdfront xen_netfront xen_fbfront xen_blkfront CPU: 0 PID: 20268 Comm: flock_unit_test Tainted: G O 4.4.0-rc5-next-20151217 #1 Hardware name: Xen HVM domU, BIOS 4.3.1OVM 05/14/2014 task: ffff88007b3672c0 ti: ffff880028b58000 task.ti: ffff880028b58000 RIP: locks_lock_inode_wait+0x2e/0x160 Call Trace: ocfs2_do_flock+0x91/0x160 [ocfs2] ocfs2_flock+0x76/0xd0 [ocfs2] SyS_flock+0x10f/0x1a0 entry_SYSCALL_64_fastpath+0x12/0x71 Code: e5 41 57 41 56 49 89 fe 41 55 41 54 53 48 89 f3 48 81 ec 88 00 00 00 8b 46 40 83 e0 03 83 f8 01 0f 84 ad 00 00 00 83 f8 02 74 04 <0f> 0b eb fe 4c 8d ad 60 ff ff ff 4c 8d 7b 58 e8 0e 8e 73 00 4d RIP locks_lock_inode_wait+0x2e/0x160 RSP <ffff880028b5bce8> ---[ end trace dfca74ec9b5b274c ]--- Fixes: 4f6563677ae8 ("Move locks API users to locks_lock_inode_wait()") Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29	m32r: add io*_rep helpers	Sudip Mukherjee	1	-1/+9
	m32r allmodconfig was failing with the error: error: implicit declaration of function 'read' On checking io.h it turned out that 'read' is not defined but 'readb' is defined and 'ioread8' will then obviously mean 'readb'. At the same time some of the helper functions ioreadN_rep() and iowriteN_rep() were missing which also led to the build failure. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29	m32r: fix build failure	Sudip Mukherjee	1	-0/+1
	m32r allmodconfig is failing with: In file included from ../include/linux/kvm_para.h:4:0, from ../kernel/watchdog.c:26: ../include/uapi/linux/kvm_para.h:30:26: fatal error: asm/kvm_para.h: No such file or directory kvm_para.h was not included in the build. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29	arch/x86/xen/suspend.c: include xen/xen.h	Andrew Morton	1	-0/+1
	Fix the build warning: arch/x86/xen/suspend.c: In function 'xen_arch_pre_suspend': arch/x86/xen/suspend.c:70:9: error: implicit declaration of function 'xen_pv_domain' [-Werror=implicit-function-declaration] if (xen_pv_domain()) ^ Reported-by: kbuild test robot <fengguang.wu@intel.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29	mm: memcontrol: fix possible memcg leak due to interrupted reclaim	Vladimir Davydov	1	-14/+46
	Memory cgroup reclaim can be interrupted with mem_cgroup_iter_break() once enough pages have been reclaimed, in which case, in contrast to a full round-trip over a cgroup sub-tree, the current position stored in mem_cgroup_reclaim_iter of the target cgroup does not get invalidated and so is left holding the reference to the last scanned cgroup. If the target cgroup does not get scanned again (we might have just reclaimed the last page or all processes might exit and free their memory voluntary), we will leak it, because there is nobody to put the reference held by the iterator. The problem is easy to reproduce by running the following command sequence in a loop: mkdir /sys/fs/cgroup/memory/test echo 100M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes echo $$ > /sys/fs/cgroup/memory/test/cgroup.procs memhog 150M echo $$ > /sys/fs/cgroup/memory/cgroup.procs rmdir test The cgroups generated by it will never get freed. This patch fixes this issue by making mem_cgroup_iter avoid taking reference to the current position. In order not to hit use-after-free bug while running reclaim in parallel with cgroup deletion, we make use of ->css_released cgroup callback to clear references to the dying cgroup in all reclaim iterators that might refer to it. This callback is called right before scheduling rcu work which will free css, so if we access iter->position from rcu read section, we might be sure it won't go away under us. [hannes@cmpxchg.org: clean up css ref handling] Fixes: 5ac8fb31ad2e ("mm: memcontrol: convert reclaim iterator to simple css refcounting") Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> [3.19+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29	ocfs2: fix BUG when calculate new backup super	Joseph Qi	1	-3/+12
	When resizing, it firstly extends the last gd. Once it should backup super in the gd, it calculates new backup super and update the corresponding value. But it currently doesn't consider the situation that the backup super is already done. And in this case, it still sets the bit in gd bitmap and then decrease from bg_free_bits_count, which leads to a corrupted gd and trigger the BUG in ocfs2_block_group_set_bits: BUG_ON(le16_to_cpu(bg->bg_free_bits_count) < num_bits); So check whether the backup super is done and then do the updates. Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Reviewed-by: Jiufei Xue <xuejiufei@huawei.com> Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-29	[PATCH] arm: fix handling of F_OFD_... in oabi_fcntl64()	Al Viro	1	-36/+37
	Cc: stable@vger.kernel.org # 3.15+ Reviewed-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-28	RDMA/be2net: Remove open and close entry points	Devesh Sharma	4	-45/+1
	Recently Dough Ledford reported a deadlock happening between ocrdma-load sequence and NetworkManager service issueing "open" on be2net interface. The deadlock happens when any be2net hook (e.g. open/close) is called in parallel to insmod ocrdma.ko. A. be2net is sending administrative open/close event to ocrdma holding device_list_mutex. It does this from ndo_open/ndo_stop hooks of be2net. So sequence of locks is rtnl_lock---> device_list lock B. When new ocrdma roce device gets registered, infiniband stack now takes rtnl_lock in ib_register_device() in GID initialization routines. So sequence of locks in this path is device_list lock ---> rtnl_lock. This improper locking sequence causes deadlock. In order to resolve the above deadlock condition, ocrdma intorduced a patch to stop listening to administrative open/close events generated from be2net driver. It now depends on link-state-change async-event generated from CNA. This change leaves behind dead code which used to generate administrative open/close events. This patch cleans-up all that dead code from be2net. Reported-by: Doug Ledford <dledford@redhat.com> CC: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com> Signed-off-by: Selvin Xavier <selvin.xavier@avagotech.com> Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-28	RDMA/ocrdma: Depend on async link events from CNA	Devesh Sharma	6	-22/+119
	Recently Dough Ledford reported a deadlock happening between ocrdma-load sequence and NetworkManager service issuing "open" on be2net interface. The deadlock happens when any be2net hook (e.g. open/close) is called in parallel to insmod ocrdma.ko. A. be2net is sending administrative open/close event to ocrdma holding device_list_mutex. It does this from ndo_open/ndo_stop hooks of be2net. So sequence of locks is rtnl_lock---> device_list lock B. When new ocrdma roce device gets registered, infiniband stack now takes rtnl_lock in ib_register_device() in GID initialization routines. So sequence of locks in this path is device_list lock ---> rtnl_lock. This improper locking sequence causes deadlock. With this patch we stop using administrative open and close events injected by be2net driver. These events were used to dispatch PORT_ACTIVE and PORT_ERROR events to the IB-stack. This patch implements a logic to receive async-link-events generated from CNA whenever link-state-change is detected. Now on, these async-events will be used to dispatch PORT_ACTIVE and PORT_ERROR events to IB-stack. Depending on async-events from CNA removes the need to hold device-list-mutex and thus breaks the busy-wait scenario. Reported-by: Doug Ledford <dledford@redhat.com> CC: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com> Signed-off-by: Selvin Xavier <selvin.xavier@avagotech.com> Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-28	RDMA/ocrdma: Dispatch only port event when port state changes	Devesh Sharma	1	-23/+0
	Dispatch only port event to IB stack when port state changes. Don't explicitly modify qps to error. Let application listen to port events on async event queue or let QP fail with retry-exceeded completion error. Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com> Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-28	RDMA/ocrdma: Fix vlan-id assignment in qp parameters	Devesh Sharma	1	-3/+4
	vlan-id is wrongly getting as 0 when PFC is enabled. Set vlan-id configured by user in QP parameters. In case vlan interface is not used, flash a warning to user to configure vlan and assign vlan-id as 0 in qp params. Fixes: dbf727de7440 ('IB/core: Use GID table in AH creation and dmac resolution') Cc: Matan Barak <matanb@mellanox.com> Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-27	Linux 4.4-rc7	Linus Torvalds	1	-1/+1

2015-12-27	MIPS: Fix bitrot in __get_user_unaligned()	Al Viro	1	-3/+3
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-12-24	tty/serial: Skip 'NULL' char after console break when sysrq enabled	Vijay Kumar	1	-2/+4
	When sysrq is triggered from console, serial driver for SUN hypervisor console receives a console break and enables the sysrq. It expects a valid sysrq char following with break. Meanwhile if driver receives 'NULL' ASCII char then it disables sysrq and sysrq handler will never be invoked. This fix skips calling uart sysrq handler when 'NULL' is received while sysrq is enabled. Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com> Acked-by: Karl Volz <karl.volz@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24	sparc64: fix FP corruption in user copy functions	Rob Gardner	13	-134/+235
	Short story: Exception handlers used by some copy_to_user() and copy_from_user() functions do not diligently clean up floating point register usage, and this can result in a user process seeing invalid values in floating point registers. This sometimes makes the process fail. Long story: Several cpu-specific (NG4, NG2, U1, U3) memcpy functions use floating point registers and VIS alignaddr/faligndata to accelerate data copying when source and dest addresses don't align well. Linux uses a lazy scheme for saving floating point registers; It is not done upon entering the kernel since it's a very expensive operation. Rather, it is done only when needed. If the kernel ends up not using FP regs during the course of some trap or system call, then it can return to user space without saving or restoring them. The various memcpy functions begin their FP code with VISEntry (or a variation thereof), which saves the FP regs. They conclude their FP code with VISExit (or a variation) which essentially marks the FP regs "clean", ie, they contain no unsaved values. fprs.FPRS_FEF is turned off so that a lazy restore will be triggered when/if the user process accesses floating point regs again. The bug is that the user copy variants of memcpy, copy_from_user() and copy_to_user(), employ an exception handling mechanism to detect faults when accessing user space addresses, and when this handler is invoked, an immediate return from the function is forced, and VISExit is not executed, thus leaving the fprs register in an indeterminate state, but often with fprs.FPRS_FEF set and one or more dirty bits. This results in a return to user space with invalid values in the FP regs, and since fprs.FPRS_FEF is on, no lazy restore occurs. This bug affects copy_to_user() and copy_from_user() for NG4, NG2, U3, and U1. All are fixed by using a new exception handler for those loads and stores that are done during the time between VISEnter and VISExit. n.b. In NG4memcpy, the problematic code can be triggered by a copy size greater than 128 bytes and an unaligned source address. This bug is known to be the cause of random user process memory corruptions while perf is running with the callgraph option (ie, perf record -g). This occurs because perf uses copy_from_user() to read user stacks, and may fault when it follows a stack frame pointer off to an invalid page. Validation checks on the stack address just obscure the underlying problem. Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24	sparc64: Perf should save/restore fault info	Rob Gardner	1	-0/+4
	There have been several reports of random processes being killed with a bus error or segfault during userspace stack walking in perf. One of the root causes of this problem is an asynchronous modification to thread_info fault_address and fault_code, which stems from a perf counter interrupt arriving during kernel processing of a "benign" fault, such as a TSB miss. Since perf_callchain_user() invokes copy_from_user() to read user stacks, a fault is not only possible, but probable. Validity checks on the stack address merely cover up the problem and reduce its frequency. The solution here is to save and restore fault_address and fault_code in perf_callchain_user() so that the benign fault handler is not disturbed by a perf interrupt. Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24	sparc64: Ensure perf can access user stacks	Rob Gardner	1	-0/+7
	When an interrupt (such as a perf counter interrupt) is delivered while executing in user space, the trap entry code puts ASI_AIUS in %asi so that copy_from_user() and copy_to_user() will access the correct memory. But if a perf counter interrupt is delivered while the cpu is already executing in kernel space, then the trap entry code will put ASI_P in %asi, and this will prevent copy_from_user() from reading any useful stack data in either of the perf_callchain_user_X functions, and thus no user callgraph data will be collected for this sample period. An additional problem is that a fault is guaranteed to occur, and though it will be silently covered up, it wastes time and could perturb state. In perf_callchain_user(), we ensure that %asi contains ASI_AIUS because we know for a fact that the subsequent calls to copy_from_user() are intended to read the user's stack. [ Use get_fs()/set_fs() -DaveM ] Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24	sparc64: Don't set %pil in rtrap_nmi too early	Rob Gardner	1	-1/+7
	Commit 28a1f53 delays setting %pil to avoid potential hardirq stack overflow in the common rtrap_irq path. Setting %pil also needs to be delayed in the rtrap_nmi path for the same reason. Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24	sparc64: Add ADI capability to cpu capabilities	Khalid Aziz	2	-4/+6
	Add ADI (Application Data Integrity) capability to cpu capabilities list. ADI capability allows virtual addresses to be encoded with a tag in bits 63-60. This tag serves as an access control key for the regions of virtual address with ADI enabled and a key set on them. Hypervisor encodes this capability as "adp" in "hwcap-list" property in machine description. Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24	tty: serial: constify sunhv_ops structs	Aya Mahfouz	1	-3/+3
	Constifies sunhv_ops structures in tty's serial driver since they are not modified after their initialization. Detected and found using Coccinelle. Suggested-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Aya Mahfouz <mahfouz.saif.elyazal@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-24	cpufreq: scpi-cpufreq: signedness bug in scpi_get_dvfs_info()	Dan Carpenter	1	-1/+1
	The "domain" variable needs to be signed for the error handling to work. Fixes: 8def31034d03 (cpufreq: arm_big_little: add SCPI interface driver) Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-23	sparc: Hook up userfaultfd system call	Mike Kravetz	3	-4/+5
	After hooking up system call, userfaultfd selftest was successful for both 32 and 64 bit version of test. Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-22	IB/mlx4: Replace kfree with kvfree in mlx4_ib_destroy_srq	Wengang Wang	1	-1/+1
	Commit 0ef2f05c7e02ff99c0b5b583d7dee2cd12b053f2 uses vmalloc for WR buffers when needed and uses kvfree to free the buffers. It missed changing kfree to kvfree in mlx4_ib_destroy_srq(). Reported-by: Matthew Finaly <matt@Mellanox.com> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-22	IB/cma: cma_match_net_dev needs to take into account port_num	Matan Barak	1	-7/+9
	Previously, cma_match_net_dev called cma_protocol_roce which tried to verify that the IB device uses RoCE protocol. However, if rdma_id wasn't bound to a port, then the check would occur against the first port of the device without regard to whether that port was even of the same type as the type of port the incoming packet was received on. Fix this by passing the port of the request and only checking against the same port of the device. Reported-by: Or Gerlitz <gerlitz.or@gmail.com> Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev on RoCE') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-22	ARM: tegra: Fix suspend hang on Tegra124 Chromebooks	Jon Hunter	1	-1/+1
	Enabling CPUFreq support for Tegra124 Chromebooks is causing the Tegra124 to hang when resuming from suspend. When CPUFreq is enabled, the CPU clock is changed from the PLLX clock to the DFLL clock during kernel boot. When resuming from suspend the CPU clock is temporarily changed back to the PLLX clock before switching back to the DFLL. If the DFLL is operating at a much lower frequency than the PLLX when we enter suspend, and so the CPU voltage rail is at a voltage too low for the CPUs to operate at the PLLX frequency, then the device will hang. Please note that the PLLX is used in the resume sequence to switch the CPU clock from the very slow 32K clock to a faster clock during early resume to speed up the resume sequence before the DFLL is resumed. Ideally, we should fix this by setting the suspend frequency so that it matches the PLLX frequency, however, that would be a bigger change. For now simply disable CPUFreq support for Tegra124 Chromebooks to avoid the hang when resuming from suspend. Fixes: 9a0baee960a7 ("ARM: tegra: Enable CPUFreq support for Tegra124 Chromebooks") Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Olof Johansson <olof@lixom.net>
2015-12-22	um: Fix pointer cast	Mickaël Salaün	1	-1/+1
	Fix a pointer cast typo introduced in v4.4-rc5 especially visible for the i386 subarchitecture where it results in a kernel crash. [ Also removed pointless cast as per Al Viro - Linus ] Fixes: 8090bfd2bb9a ("um: Fix fpstate handling") Signed-off-by: Mickaël Salaün <mic@digikod.net> Cc: Jeff Dike <jdike@addtoit.com> Acked-by: Richard Weinberger <richard@nod.at> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-12-22	bus: sunxi-rsb: Fix peripheral IC mapping runtime address	Chen-Yu Tsai	1	-1/+1
	0x4e is the runtime address normally associated with perihperal ICs. 0x45 is not a valid runtime address. Signed-off-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Olof Johansson <olof@lixom.net>
2015-12-22	bus: sunxi-rsb: Fix primary PMIC mapping hardware address	Chen-Yu Tsai	1	-1/+1
	The primary PMICs use 0x3a3 as their hardware address, not 0x3e3. Signed-off-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Olof Johansson <olof@lixom.net>
2015-12-22	null_blk: fix use-after-free error	Mike Krinkin	1	-3/+3
	blk_end_request_all may free request, so we need to save request_queue pointer before blk_end_request_all call. The problem was introduced in commit cf8ecc5a8455266f8d51 ("null_blk: guarantee device restart in all irq modes") and causes general protection fault with slab poisoning enabled. Fixes: cf8ecc5a8455266f8d51 ("null_blk: guarantee device restart in all irq modes") Signed-off-by: Mike Krinkin <krinkin.m.u@gmail.com> Reviewed-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-22	block: ensure to split after potentially bouncing a bio	Junichi Nomura	1	-2/+2
	blk_queue_bio() does split then bounce, which makes the segment counting based on pages before bouncing and could go wrong. Move the split to after bouncing, like we do for blk-mq, and the we fix the issue of having the bio count for segments be wrong. Fixes: 54efd50bfd87 ("block: make generic_make_request handle arbitrarily sized bios") Cc: stable@vger.kernel.org Tested-by: Artem S. Tashkinov <t.artem@lycos.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-22	NVMe: IO ending fixes on surprise removal	Keith Busch	1	-1/+19
	This patch fixes a lost request discovered during IO + hot removal. The driver's pci removal deletes gendisks prior to shutting down the controller to allow dirty data to sync. Dirty data can not be synced on a surprise removal, though, and would potentially block indefinitely. The driver previously had marked the queue as dying in this scenario to prevent new requests from attempting, however it will still block for requests that already entered the queue. This patch fixes this by quiescing IO first, then aborting the requeued requests before deleting disks. Reported-by: Sujith Pandel <sujith_pandel@dell.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Tested-by: Sujith Pandel <sujith_pandel@dell.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-22	KVM: x86: Reload pit counters for all channels when restoring state	Andrew Honig	1	-2/+6
	Currently if userspace restores the pit counters with a count of 0 on channels 1 or 2 and the guest attempts to read the count on those channels, then KVM will perform a mod of 0 and crash. This will ensure that 0 values are converted to 65536 as per the spec. This is CVE-2015-7513. Signed-off-by: Andy Honig <ahonig@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-22	KVM: MTRR: treat memory as writeback if MTRR is disabled in guest CPUID	Paolo Bonzini	2	-3/+19
	Virtual machines can be run with CPUID such that there are no MTRRs. In that case, the firmware will never enable MTRRs and it is obviously undesirable to run the guest entirely with UC memory. Check out guest CPUID, and use WB memory if MTRR do not exist. Cc: qemu-stable@nongnu.org Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107561 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-22	KVM: MTRR: observe maxphyaddr from guest CPUID, not host	Paolo Bonzini	1	-2/+7
	Conversion of MTRRs to ranges used the maxphyaddr from the boot CPU. This is wrong, because var_mtrr_range's mask variable then is discontiguous (like FF00FFFF000, where the first run of 0s corresponds to the bits between host and guest maxphyaddr). Instead always set up the masks to be full 64-bit values---we know that the reserved bits at the top are zero, and we can restore them when reading the MSR. This way var_mtrr_range gets a mask that just works. Fixes: a13842dc668b40daef4327294a6d3bdc8bd30276 Cc: qemu-stable@nongnu.org Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107561 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-22	KVM: MTRR: fix fixed MTRR segment look up	Alexis Dambricourt	1	-1/+1
	This fixes the slow-down of VM running with pci-passthrough, since some MTRR range changed from MTRR_TYPE_WRBACK to MTRR_TYPE_UNCACHABLE. Memory in the 0K-640K range was incorrectly treated as uncacheable. Fixes: f7bfb57b3e89ff89c0da9f93dedab89f68d6ca27 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107561 Cc: qemu-stable@nongnu.org Signed-off-by: Alexis Dambricourt <alexis.dambricourt@gmail.com> [Use correct BZ for "Fixes" annotation. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-22	MIPS: Fix build error due to unused variables.	Ralf Baechle	3	-3/+1
	c861519fcf95b2d46cb4275903423b43ae150a40 ("MIPS: Fix delay loops which may be removed by GCC.") which made it upstream was an outdated version of the patch and is lacking some the removal of two variables that became unused thus resulting in further warnings and build breakage. The commit from ae878615d7cee5d7346946cf1ae1b60e427013c2 was correct however. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-12-22	crypto: algif_skcipher - Use new skcipher interface	Herbert Xu	1	-31/+30
	This patch replaces uses of ablkcipher with the new skcipher interface. Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: <smueller@chronox.de>
2015-12-22	MIPS: VDSO: Fix build error	Qais Yousef	1	-2/+2
	Commit ebb5e78cc634 ("MIPS: Initial implementation of a VDSO") introduced a build error. For MIPS VDSO to be compiled it requires binutils version 2.25 or above but the check in the Makefile had inverted logic causing it to be compiled in if binutils is below 2.25. This fixes the following compilation error: CC arch/mips/vdso/gettimeofday.o /tmp/ccsExcUd.s: Assembler messages: /tmp/ccsExcUd.s:62: Error: can't resolve `_start' {UND section} - `L0' {.text section} /tmp/ccsExcUd.s:467: Error: can't resolve `_start' {UND section} - `L0' {.text section} make[2]: * [arch/mips/vdso/gettimeofday.o] Error 1 make[1]: * [arch/mips/vdso] Error 2 make: *** [arch/mips] Error 2 [ralf@linux-mips: Fixed Sergei's complaint on the formatting of the cited commit and generally reformatted the log message.] Signed-off-by: Qais Yousef <qais.yousef@imgtec.com> Cc: alex@alex-smith.me.uk Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/11745/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-12-22	MIPS: CPS: drop .set mips64r2 directives	Paul Burton	1	-2/+0
	Commit 977e043d5ea1 ("MIPS: kernel: cps-vec: Replace mips32r2 ISA level with mips64r2") leads to .set mips64r2 directives being present in 32 bit (ie. CONFIG_32BIT=y) kernels. This is incorrect & leads to MIPS64 instructions being emitted by the assembler when expanding pseudo-instructions. For example the "move" instruction can legitimately be expanded to a "daddu". This causes problems when the kernel is run on a MIPS32 CPU, as CONFIG_32BIT kernels of course often are... Fix this by dropping the .set <ISA> directives entirely now that Kconfig should be ensuring that kernels including this code are built with a suitable -march= compiler flag. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: <stable@vger.kernel.org> # 3.16+ Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/10869/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-12-22	drm/i915: Correct max delay for HDMI hotplug live status checking	Gary Wang	1	-3/+4
	The total delay of HDMI hotplug detecting with 30ms have already been split into a resolution of 3 retries of 10ms each, for the worst cases. But it still suffered from only waiting 10ms at most in intel_hdmi_detect(). This patch corrects it by reading hotplug status with 4 times at most for 30ms delay. v2: - straight up to loop execution for more clear in code readability - mdelay will replace with msleep by Daniel's new patch drm/i915: mdelay(10) considered harmful - suggest to re-evaluate try times for being compatible to old HDMI monitor Reviewed-by: Cooper Chiou <cooper.chiou@intel.com> Tested-by: Gary Wang <gary.c.wang@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Gavin Hindman <gavin.hindman@intel.com> Cc: Sonika Jindal <sonika.jindal@intel.com> Cc: Shashank Sharma <shashank.sharma@intel.com> Signed-off-by: Gary Wang <gary.c.wang@intel.com> [danvet: fixup conflict with s/mdelay/msleep/ patch.] Cc: drm-intel-fixes@lists.freedesktop.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> (cherry picked from commit 61fb3980dd396880ffba48523b1e27579868b82b) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-12-22	drm/i915: mdelay(10) considered harmful	Daniel Vetter	1	-1/+1
	I missed this myself when reviewing commit 237ed86c693d8a8e4db476976aeb30df4deac74b Author: Sonika Jindal <sonika.jindal@intel.com> Date: Tue Sep 15 09:44:20 2015 +0530 drm/i915: Check live status before reading edid Long sleeps like this really shouldn't waste cpu cycles spinning. Cc: Sonika Jindal <sonika.jindal@intel.com> Cc: "Wang, Gary C" <gary.c.wang@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1449859455-32609-1-git-send-email-daniel.vetter@ffwll.ch Reviewed-by: Sonika Jindal <sonika.jindal@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> (cherry picked from commit 71a199bacb398ee54eeac001699257dda083a455) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-12-22	drm/i915: Kill intel_crtc->cursor_bo	Ville Syrjälä	2	-6/+0
	The vma may have been rebound between the last time the cursor was enabled and now, so skipping the cursor gtt offset deduction is not safe unless we would also reset cursor_bo to NULL when disabling the cursor. Just thow cursor_bo to the bin instead since it's lost all other uses thanks to universal plane support. Chris pointed out that cursor updates are currently too slow via universal planes that micro optimizations like these wouldn't even help. v2: Add a note about futility of micro optimizations (Chris) Cc: drm-intel-fixes@lists.freedesktop.org References: http://lists.freedesktop.org/archives/intel-gfx/2015-December/082976.html Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Takashi Iwai <tiwai@suse.de> Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1450107302-17171-1-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (cherry picked from commit 1264859d648c4bdc9f0a098efbff90cbf462a075) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-12-22	MIPS: uaccess: Take EVA into account in [__]clear_user	James Hogan	3	-10/+26
	__clear_user() (and clear_user() which uses it), always access the user mode address space, which results in EVA store instructions when EVA is enabled even if the current user address limit is KERNEL_DS. Fix this by adding a new symbol __bzero_kernel for the normal kernel address space bzero in EVA mode, and call that from __clear_user() if eva_kernel_access(). Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/10844/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-12-22	drm/i915: Workaround CHV pipe C cursor fail	Ville Syrjälä	1	-0/+17
	Turns out CHV pipe C was glued on somewhat poorly, and there's something wrong with the cursor. If the cursor straddles the left screen edge, and is then moved away from the edge or disabled, the pipe will often underrun. If enough underruns are triggered quickly enough the pipe will fall over and die (it just scans out a solid color and reports a constant underrun). We need to turn the disp2d power well off and on again to recover the pipe. None of that is very nice for the user, so let's just refuse to place the cursor in the compromised position. The ddx appears to fall back to swcursor when the ioctl returns an error, so theoretically there's no loss of functionality for the user (discounting swcursor bugs). I suppose most cursors images actually have the hotspot not exactly at 0,0 so under typical conditions the fallback will in fact kick in as soon as the cursor touches the left edge of the screen. Any atomic compositor should anyway be prepared to fall back to GPU composition when things don't work out, so there should be no problem with those. Other things that I tried to solve this include flipping all display related clock gating knobs I could find, increasing the minimum gtt alignment all the way up to 512k. I also tried to see if there are more specific screen coordinates that hit the bug, but the findings were somewhat inconclusive. Sometimes the failures happen almost across the whole left edge, sometimes more at the very top and around the bottom half. I wasn't able to find any real pattern to these variations, so it seems our only choice is to just refuse to straddle the left screen edge at all. Cc: stable@vger.kernel.org Cc: Jason Plum <max@warheads.net> Testcase: igt/kms_chv_cursor_fail Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92826 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1450459479-16286-1-git-send-email-ville.syrjala@linux.intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> (cherry picked from commit b29ec92c4f5e6d45d8bae8194e664427a01c6687) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-12-22	drm/i915: Only spin whilst waiting on the current request	Chris Wilson	2	-9/+26
	Limit busywaiting only to the request currently being processed by the GPU. If the request is not currently being processed by the GPU, there is a very low likelihood of it being completed within the 2 microsecond spin timeout and so we will just be wasting CPU cycles. v2: Check for logical inversion when rebasing - we were incorrectly checking for this request being active, and instead busywaiting for when the GPU was not yet processing the request of interest. v3: Try another colour for the seqno names. v4: Another colour for the function names. v5: Remove the forced coherency when checking for the active request. On reflection and plenty of recent experimentation, the issue is not a cache coherency problem - but an irq/seqno ordering problem (timing issue). Here, we do not need the w/a to force ordering of the read with an interrupt. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Eero Tamminen <eero.t.tamminen@intel.com> Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1449833608-22125-4-git-send-email-chris@chris-wilson.co.uk (cherry picked from commit 821485dc2ad665f136c57ee589bf7a8210160fe2) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-12-22	drm/i915: Limit the busy wait on requests to 5us not 10ms!	Chris Wilson	1	-2/+45
	When waiting for high frequency requests, the finite amount of time required to set up the irq and wait upon it limits the response rate. By busywaiting on the request completion for a short while we can service the high frequency waits as quick as possible. However, if it is a slow request, we want to sleep as quickly as possible. The tradeoff between waiting and sleeping is roughly the time it takes to sleep on a request, on the order of a microsecond. Based on measurements of synchronous workloads from across big core and little atom, I have set the limit for busywaiting as 10 microseconds. In most of the synchronous cases, we can reduce the limit down to as little as 2 miscroseconds, but that leaves quite a few test cases regressing by factors of 3 and more. The code currently uses the jiffie clock, but that is far too coarse (on the order of 10 milliseconds) and results in poor interactivity as the CPU ends up being hogged by slow requests. To get microsecond resolution we need to use a high resolution timer. The cheapest of which is polling local_clock(), but that is only valid on the same CPU. If we switch CPUs because the task was preempted, we can also use that as an indicator that the system is too busy to waste cycles on spinning and we should sleep instead. __i915_spin_request was introduced in commit 2def4ad99befa25775dd2f714fdd4d92faec6e34 [v4.2] Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Apr 7 16:20:41 2015 +0100 drm/i915: Optimistically spin for the request completion v2: Drop full u64 for unsigned long - the timer is 32bit wraparound safe, so we can use native register sizes on smaller architectures. Mention the approximate microseconds units for elapsed time and add some extra comments describing the reason for busywaiting. v3: Raise the limit to 10us v4: Now 5us. Reported-by: Jens Axboe <axboe@kernel.dk> Link: https://lkml.org/lkml/2015/11/12/621 Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Eero Tamminen <eero.t.tamminen@intel.com> Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1449833608-22125-3-git-send-email-chris@chris-wilson.co.uk (cherry picked from commit ca5b721e238226af1d767103ac852aeb8e4c0764) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-12-22	MIPS: uaccess: Take EVA into account in __copy_from_user()	James Hogan	1	-3/+9
	When EVA is in use, __copy_from_user() was unconditionally using the EVA instructions to read the user address space, however this can also be used for kernel access. If the address isn't a valid user address it will cause an address error or TLB exception, and if it is then user memory may be read instead of kernel memory. For example in the following stack trace from Linux v3.10 (changes since then will prevent this particular one still happening) kernel_sendmsg() set the user address limit to KERNEL_DS, and tcp_sendmsg() goes on to use __copy_from_user() with a kernel address in KSeg0. [<8002d434>] __copy_fromuser_common+0x10c/0x254 [<805710e0>] tcp_sendmsg+0x5f4/0xf00 [<804e8e3c>] sock_sendmsg+0x78/0xa0 [<804e8f28>] kernel_sendmsg+0x24/0x38 [<804ee0f8>] sock_no_sendpage+0x70/0x7c [<8017c820>] pipe_to_sendpage+0x80/0x98 [<8017c6b0>] splice_from_pipe_feed+0xa8/0x198 [<8017cc54>] __splice_from_pipe+0x4c/0x8c [<8017e844>] splice_from_pipe+0x58/0x78 [<8017e884>] generic_splice_sendpage+0x20/0x2c [<8017d690>] do_splice_from+0xb4/0x110 [<8017d710>] direct_splice_actor+0x24/0x30 [<8017d394>] splice_direct_to_actor+0xd8/0x208 [<8017d51c>] do_splice_direct+0x58/0x7c [<8014eaf4>] do_sendfile+0x1dc/0x39c [<8014f82c>] SyS_sendfile+0x90/0xf8 Add the eva_kernel_access() check in __copy_from_user() like the one in copy_from_user(). Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Markos Chandras <markos.chandras@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/10843/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-12-22	drm/i915: Break busywaiting for requests on pending signals	Chris Wilson	1	-5/+8
	The busywait in __i915_spin_request() does not respect pending signals and so may consume the entire timeslice for the task instead of returning to userspace to handle the signal. In the worst case this could cause a delay in signal processing of 20ms, which would be a noticeable jitter in cursor tracking. If a higher resolution signal was being used, for example to provide fairness of a server timeslices between clients, we could expect to detect some unfairness between clients (i.e. some windows not updating as fast as others). This issue was noticed when inspecting a report of poor interactivity resulting from excessively high __i915_spin_request usage. Fixes regression from commit 2def4ad99befa25775dd2f714fdd4d92faec6e34 [v4.2] Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Apr 7 16:20:41 2015 +0100 drm/i915: Optimistically spin for the request completion v2: Try to assess the impact of the bug Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Jens Axboe <axboe@kernel.dk> Cc; "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Eero Tamminen <eero.t.tamminen@intel.com> Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1449833608-22125-2-git-send-email-chris@chris-wilson.co.uk (cherry picked from commit 91b0c352ace9afec1fb51590c7b8bd60e0eb9fbd) Signed-off-by: Jani Nikula <jani.nikula@intel.com>