aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2020-01-15nfs_clone_sb_security(): simplify the check for server bogosityAl Viro1-1/+1
We used to check ->i_op for being nfs_dir_inode_operations. With separate inode_operations for v3 and v4 that became bogus, but rather than going for protocol-dependent comparison we could've just checked ->i_fop instead; _that_ is the same for all protocol versions. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15nfs: get rid of mount_info ->fill_super()Al Viro4-64/+18
The only possible values are nfs_fill_super and nfs_clone_super. The latter is used only when crossing into a submount and it is almost identical to the former; the only differences are * ->s_time_gran unconditionally set to 1 (even for v2 mounts). Regression dating back to 2012, actually. * ->s_blocksize/->s_blocksize_bits set to that of parent. Rather than messing with the method, stash ->s_blocksize_bits in mount_info in submount case and after the (now unconditional) call of nfs_fill_super() override ->s_blocksize/->s_blocksize_bits if that has been set. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15nfs: don't pass nfs_subversion to ->create_server()Al Viro8-22/+17
pick it from mount_info Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15nfs: unexport nfs_fs_mount_common()Al Viro2-5/+3
Make it static, even. And remove a stale extern of (long-gone) nfs_xdev_mount_common() from internal.h, while we are at it. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15nfs: merge xdev and remote file_system_typeAl Viro4-29/+11
they are identical now... Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15nfs: don't bother passing nfs_subversion to ->try_mount() and nfs_fs_mount_common()Al Viro5-21/+14
Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15nfs: stash nfs_subversion reference into nfs_mount_infoAl Viro4-3/+6
That will allow to get rid of passing those references around in quite a few places. Moreover, that will allow to merge xdev and remote file_system_type. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15nfs: lift setting mount_info from nfs_xdev_mount()Al Viro3-37/+26
Do it in nfs_do_submount() instead. As a side benefit, nfs_clone_data doesn't need ->fh and ->fattr anymore. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15nfs4: fold nfs_do_root_mount/nfs_follow_remote_pathAl Viro1-51/+37
Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15nfs: don't bother setting/restoring export_path around do_nfs_root_mount()Al Viro1-4/+0
nothing in it will be looking at that thing anyway Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15nfs: fold nfs4_remote_fs_type and nfs4_remote_referral_fs_typeAl Viro1-22/+4
They are identical now. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15nfs: lift setting mount_info from nfs4_remote{,_referral}_mountAl Viro1-32/+35
Do that (fhandle allocation, setting struct server up) in nfs4_referral_mount() and nfs4_try_mount() resp. and pass the server and pointer to mount_info into nfs_do_root_mount() so that nfs4_remote_referral_mount()/nfs_remote_mount() could be merged. Since we are moving stuff from ->mount() instances to the points prior to vfs_kern_mount() that would trigger those, we need to make sure that do_nfs_root_mount() will do the corresponding cleanup itself if it doesn't trigger those ->mount() instances. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15nfs: stash server into struct nfs_mount_infoAl Viro3-18/+14
Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-15saner calling conventions for nfs_fs_mount_common()Al Viro2-22/+5
Allow it to take ERR_PTR() for server and return ERR_CAST() of it in such case. All callers used to open-code that... Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-14Merge tag 'nfs-for-5.5-2' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds2-7/+24
Pull NFS client bugfixes from Anna Schumaker: "Three NFS over RDMA fixes for bugs Chuck found that can be hit during device removal: - Fix create_qp crash on device unload - Fix completion wait during device removal - Fix oops in receive handler after device removal" * tag 'nfs-for-5.5-2' of git://git.linux-nfs.org/projects/anna/linux-nfs: xprtrdma: Fix oops in Receive handler after device removal xprtrdma: Fix completion wait during device removal xprtrdma: Fix create_qp crash on device unload
2020-01-14xprtrdma: Fix oops in Receive handler after device removalChuck Lever2-6/+21
Since v5.4, a device removal occasionally triggered this oops: Dec 2 17:13:53 manet kernel: BUG: unable to handle page fault for address: 0000000c00000219 Dec 2 17:13:53 manet kernel: #PF: supervisor read access in kernel mode Dec 2 17:13:53 manet kernel: #PF: error_code(0x0000) - not-present page Dec 2 17:13:53 manet kernel: PGD 0 P4D 0 Dec 2 17:13:53 manet kernel: Oops: 0000 [#1] SMP Dec 2 17:13:53 manet kernel: CPU: 2 PID: 468 Comm: kworker/2:1H Tainted: G W 5.4.0-00050-g53717e43af61 #883 Dec 2 17:13:53 manet kernel: Hardware name: Supermicro SYS-6028R-T/X10DRi, BIOS 1.1a 10/16/2015 Dec 2 17:13:53 manet kernel: Workqueue: ib-comp-wq ib_cq_poll_work [ib_core] Dec 2 17:13:53 manet kernel: RIP: 0010:rpcrdma_wc_receive+0x7c/0xf6 [rpcrdma] Dec 2 17:13:53 manet kernel: Code: 6d 8b 43 14 89 c1 89 45 78 48 89 4d 40 8b 43 2c 89 45 14 8b 43 20 89 45 18 48 8b 45 20 8b 53 14 48 8b 30 48 8b 40 10 48 8b 38 <48> 8b 87 18 02 00 00 48 85 c0 75 18 48 8b 05 1e 24 c4 e1 48 85 c0 Dec 2 17:13:53 manet kernel: RSP: 0018:ffffc900035dfe00 EFLAGS: 00010246 Dec 2 17:13:53 manet kernel: RAX: ffff888467290000 RBX: ffff88846c638400 RCX: 0000000000000048 Dec 2 17:13:53 manet kernel: RDX: 0000000000000048 RSI: 00000000f942e000 RDI: 0000000c00000001 Dec 2 17:13:53 manet kernel: RBP: ffff888467611b00 R08: ffff888464e4a3c4 R09: 0000000000000000 Dec 2 17:13:53 manet kernel: R10: ffffc900035dfc88 R11: fefefefefefefeff R12: ffff888865af4428 Dec 2 17:13:53 manet kernel: R13: ffff888466023000 R14: ffff88846c63f000 R15: 0000000000000010 Dec 2 17:13:53 manet kernel: FS: 0000000000000000(0000) GS:ffff88846fa80000(0000) knlGS:0000000000000000 Dec 2 17:13:53 manet kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Dec 2 17:13:53 manet kernel: CR2: 0000000c00000219 CR3: 0000000002009002 CR4: 00000000001606e0 Dec 2 17:13:53 manet kernel: Call Trace: Dec 2 17:13:53 manet kernel: __ib_process_cq+0x5c/0x14e [ib_core] Dec 2 17:13:53 manet kernel: ib_cq_poll_work+0x26/0x70 [ib_core] Dec 2 17:13:53 manet kernel: process_one_work+0x19d/0x2cd Dec 2 17:13:53 manet kernel: ? cancel_delayed_work_sync+0xf/0xf Dec 2 17:13:53 manet kernel: worker_thread+0x1a6/0x25a Dec 2 17:13:53 manet kernel: ? cancel_delayed_work_sync+0xf/0xf Dec 2 17:13:53 manet kernel: kthread+0xf4/0xf9 Dec 2 17:13:53 manet kernel: ? kthread_queue_delayed_work+0x74/0x74 Dec 2 17:13:53 manet kernel: ret_from_fork+0x24/0x30 The proximal cause is that this rpcrdma_rep has a rr_rdmabuf that is still pointing to the old ib_device, which has been freed. The only way that is possible is if this rpcrdma_rep was not destroyed by rpcrdma_ia_remove. Debugging showed that was indeed the case: this rpcrdma_rep was still in use by a completing RPC at the time of the device removal, and thus wasn't on the rep free list. So, it was not found by rpcrdma_reps_destroy(). The fix is to introduce a list of all rpcrdma_reps so that they all can be found when a device is removed. That list is used to perform only regbuf DMA unmapping, replacing that call to rpcrdma_reps_destroy(). Meanwhile, to prevent corruption of this list, I've moved the destruction of temp rpcrdma_rep objects to rpcrdma_post_recvs(). rpcrdma_xprt_drain() ensures that post_recvs (and thus rep_destroy) is not invoked while rpcrdma_reps_unmap is walking rb_all_reps, thus protecting the rb_all_reps list. Fixes: b0b227f071a0 ("xprtrdma: Use an llist to manage free rpcrdma_reps") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-14xprtrdma: Fix completion wait during device removalChuck Lever1-1/+1
I've found that on occasion, "rmmod <dev>" will hang while if an NFS is under load. Ensure that ri_remove_done is initialized only just before the transport is woken up to force a close. This avoids the completion possibly getting initialized again while the CM event handler is waiting for a wake-up. Fixes: bebd031866ca ("xprtrdma: Support unplugging an HCA from under an NFS mount") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-14xprtrdma: Fix create_qp crash on device unloadChuck Lever1-0/+2
On device re-insertion, the RDMA device driver crashes trying to set up a new QP: Nov 27 16:32:06 manet kernel: BUG: kernel NULL pointer dereference, address: 00000000000001c0 Nov 27 16:32:06 manet kernel: #PF: supervisor write access in kernel mode Nov 27 16:32:06 manet kernel: #PF: error_code(0x0002) - not-present page Nov 27 16:32:06 manet kernel: PGD 0 P4D 0 Nov 27 16:32:06 manet kernel: Oops: 0002 [#1] SMP Nov 27 16:32:06 manet kernel: CPU: 1 PID: 345 Comm: kworker/u28:0 Tainted: G W 5.4.0 #852 Nov 27 16:32:06 manet kernel: Hardware name: Supermicro SYS-6028R-T/X10DRi, BIOS 1.1a 10/16/2015 Nov 27 16:32:06 manet kernel: Workqueue: xprtiod xprt_rdma_connect_worker [rpcrdma] Nov 27 16:32:06 manet kernel: RIP: 0010:atomic_try_cmpxchg+0x2/0x12 Nov 27 16:32:06 manet kernel: Code: ff ff 48 8b 04 24 5a c3 c6 07 00 0f 1f 40 00 c3 31 c0 48 81 ff 08 09 68 81 72 0c 31 c0 48 81 ff 83 0c 68 81 0f 92 c0 c3 8b 06 <f0> 0f b1 17 0f 94 c2 84 d2 75 02 89 06 88 d0 c3 53 ba 01 00 00 00 Nov 27 16:32:06 manet kernel: RSP: 0018:ffffc900035abbf0 EFLAGS: 00010046 Nov 27 16:32:06 manet kernel: RAX: 0000000000000000 RBX: 00000000000001c0 RCX: 0000000000000000 Nov 27 16:32:06 manet kernel: RDX: 0000000000000001 RSI: ffffc900035abbfc RDI: 00000000000001c0 Nov 27 16:32:06 manet kernel: RBP: ffffc900035abde0 R08: 000000000000000e R09: ffffffffffffc000 Nov 27 16:32:06 manet kernel: R10: 0000000000000000 R11: 000000000002e800 R12: ffff88886169d9f8 Nov 27 16:32:06 manet kernel: R13: ffff88886169d9f4 R14: 0000000000000246 R15: 0000000000000000 Nov 27 16:32:06 manet kernel: FS: 0000000000000000(0000) GS:ffff88846fa40000(0000) knlGS:0000000000000000 Nov 27 16:32:06 manet kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Nov 27 16:32:06 manet kernel: CR2: 00000000000001c0 CR3: 0000000002009006 CR4: 00000000001606e0 Nov 27 16:32:06 manet kernel: Call Trace: Nov 27 16:32:06 manet kernel: do_raw_spin_lock+0x2f/0x5a Nov 27 16:32:06 manet kernel: create_qp_common.isra.47+0x856/0xadf [mlx4_ib] Nov 27 16:32:06 manet kernel: ? slab_post_alloc_hook.isra.60+0xa/0x1a Nov 27 16:32:06 manet kernel: ? __kmalloc+0x125/0x139 Nov 27 16:32:06 manet kernel: mlx4_ib_create_qp+0x57f/0x972 [mlx4_ib] The fix is to copy the qp_init_attr struct that was just created by rpcrdma_ep_create() instead of using the one from the previous connection instance. Fixes: 98ef77d1aaa7 ("xprtrdma: Send Queue size grows after a reconnect") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-01-14Merge branch 'parisc-5.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linuxLinus Torvalds2-3/+3
Pull parisc fixes from Helge Deller: "A boot crash fix by Mike Rapoport and a printk fix by Krzysztof Kozlowski" * 'parisc-5.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: fix map_pages() to actually populate upper directory parisc: Use proper printk format for resource_size_t
2020-01-14Merge tag 'asm-generic-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playgroundLinus Torvalds3-6/+40
Pull asm-generic fixes from Arnd Bergmann: "Here are two bugfixes from Mike Rapoport, both fixing compile-time errors for the nds32 architecture that were recently introduced" * tag 'asm-generic-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground: nds32: fix build failure caused by page table folding updates asm-generic/nds32: don't redefine cacheflush primitives
2020-01-14Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds2-3/+4
Pull SCSI fixes from James Bottomley: "Two simple fixes in the upper drivers (so both fairly core), one in enclosures, which fixes replugging a device into an enclosure slot and one in the disk driver which fixes revalidating a drive with protection information (PI) to make it a non-PI drive ... previously we were still remembering the old PI state. Both fixed issues are quite rare in the field" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: enclosure: Fix stale device oops with hot replug scsi: sd: Clear sdkp->protection_type if disk is reformatted without PI
2020-01-14Merge branch 'dhowells' (patches from DavidH)Linus Torvalds3-21/+13
Merge misc fixes from David Howells. Two afs fixes and a key refcounting fix. * dhowells: afs: Fix afs_lookup() to not clobber the version on a new dentry afs: Fix use-after-loss-of-ref keys: Fix request_key() cache
2020-01-14afs: Fix afs_lookup() to not clobber the version on a new dentryDavid Howells1-5/+1
Fix afs_lookup() to not clobber the version set on a new dentry by afs_do_lookup() - especially as it's using the wrong version of the version (we need to use the one given to us by whatever op the dir contents correspond to rather than what's in the afs_vnode). Fixes: 9dd0b82ef530 ("afs: Fix missing dentry data version updating") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-14afs: Fix use-after-loss-of-refDavid Howells2-14/+10
afs_lookup() has a tracepoint to indicate the outcome of d_splice_alias(), passing it the inode to retrieve the fid from. However, the function gave up its ref on that inode when it called d_splice_alias(), which may have failed and dropped the inode. Fix this by caching the fid. Fixes: 80548b03991f ("afs: Add more tracepoints") Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-14keys: Fix request_key() cacheDavid Howells1-2/+2
When the key cached by request_key() and co. is cleaned up on exit(), the code looks in the wrong task_struct, and so clears the wrong cache. This leads to anomalies in key refcounting when doing, say, a kernel build on an afs volume, that then trigger kasan to report a use-after-free when the key is viewed in /proc/keys. Fix this by making exit_creds() look in the passed-in task_struct rather than in current (the task_struct cleanup code is deferred by RCU and potentially run in another task). Fixes: 7743c48e54ee ("keys: Cache result of request_key*() temporarily in task_struct") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-14Merge branch 'akpm' (patches from Andrew)Linus Torvalds15-110/+102
Merge misc fixes from Andrew Morton: "11 mm fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: khugepaged: add trace status description for SCAN_PAGE_HAS_PRIVATE mm: memcg/slab: call flush_memcg_workqueue() only if memcg workqueue is valid mm/page-writeback.c: improve arithmetic divisions mm/page-writeback.c: use div64_ul() for u64-by-unsigned-long divide mm/page-writeback.c: avoid potential division by zero in wb_min_max_ratio() mm, debug_pagealloc: don't rely on static keys too early mm: memcg/slab: fix percpu slab vmstats flushing mm/shmem.c: thp, shmem: fix conflict of above-47bit hint address and PMD alignment mm/huge_memory.c: thp: fix conflict of above-47bit hint address and PMD alignment mm/memory_hotplug: don't free usage map when removing a re-added early section mm, thp: tweak reclaim/compaction effort of local-only and all-node allocations
2020-01-14parisc: fix map_pages() to actually populate upper directoryMike Rapoport1-1/+1
The commit d96885e277b5 ("parisc: use pgtable-nopXd instead of 4level-fixup") converted PA-RISC to use folded page tables, but it missed the conversion of pgd_populate() to pud_populate() in maps_pages() function. This caused the upper page table directory to remain empty and the system would crash as a result. Using pud_populate() that actually populates the page table instead of dummy pgd_populate() fixes the issue. Fixes: d96885e277b5 ("parisc: use pgtable-nopXd instead of 4level-fixup") Reported-by: Meelis Roos <mroos@linux.ee> Reported-by: Jeroen Roovers <jer@gentoo.org> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Tested-by: Jeroen Roovers <jer@gentoo.org> Tested-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Helge Deller <deller@gmx.de>
2020-01-14parisc: Use proper printk format for resource_size_tKrzysztof Kozlowski1-2/+2
resource_size_t should be printed with its own size-independent format to fix warnings when compiling on 64-bit platform (e.g. with COMPILE_TEST): arch/parisc/kernel/drivers.c: In function 'print_parisc_device': arch/parisc/kernel/drivers.c:892:9: warning: format '%p' expects argument of type 'void *', but argument 4 has type 'resource_size_t {aka unsigned int}' [-Wformat=] Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Helge Deller <deller@gmx.de>
2020-01-13Merge tag 'Intel-CVE-2019-14615' from bundle by Akeem Abodunrin.Linus Torvalds1-0/+8
Merge Intel Gen9 graphics fix from Akeem Abodunrin: "Insufficient control flow in certain data structures for some Intel Processors with Intel Processor Graphics may allow an unauthenticated user to potentially enable information disclosure via local access This provides mitigation for Gen9 hardware. Note that Gen8 is not impacted due to a previously implemented workaround. The mitigation involves using an existing hardware feature to forcibly clear down all EU state at each context switch" * tag 'Intel-CVE-2019-14615' of emailed bundle from Akeem G Abodunrin <akeem.g.abodunrin@intel.com>: drm/i915/gen9: Clear residual context state on context switch
2020-01-13mm: khugepaged: add trace status description for SCAN_PAGE_HAS_PRIVATEYang Shi1-1/+2
Commit 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS") introduced a new khugepaged scan result: SCAN_PAGE_HAS_PRIVATE, but the corresponding description for trace events were not added. Link: http://lkml.kernel.org/r/1574793844-2914-1-git-send-email-yang.shi@linux.alibaba.com Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS") Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> Cc: Song Liu <songliubraving@fb.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-13mm: memcg/slab: call flush_memcg_workqueue() only if memcg workqueue is validAdrian Huang1-1/+2
When booting with amd_iommu=off, the following WARNING message appears: AMD-Vi: AMD IOMMU disabled on kernel command-line ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at kernel/workqueue.c:2772 flush_workqueue+0x42e/0x450 Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.5.0-rc3-amd-iommu #6 Hardware name: Lenovo ThinkSystem SR655-2S/7D2WRCZ000, BIOS D8E101L-1.00 12/05/2019 RIP: 0010:flush_workqueue+0x42e/0x450 Code: ff 0f 0b e9 7a fd ff ff 4d 89 ef e9 33 fe ff ff 0f 0b e9 7f fd ff ff 0f 0b e9 bc fd ff ff 0f 0b e9 a8 fd ff ff e8 52 2c fe ff <0f> 0b 31 d2 48 c7 c6 e0 88 c5 95 48 c7 c7 d8 ad f0 95 e8 19 f5 04 Call Trace: kmem_cache_destroy+0x69/0x260 iommu_go_to_state+0x40c/0x5ab amd_iommu_prepare+0x16/0x2a irq_remapping_prepare+0x36/0x5f enable_IR_x2apic+0x21/0x172 default_setup_apic_routing+0x12/0x6f apic_intr_mode_init+0x1a1/0x1f1 x86_late_time_init+0x17/0x1c start_kernel+0x480/0x53f secondary_startup_64+0xb6/0xc0 ---[ end trace 30894107c3749449 ]--- x2apic: IRQ remapping doesn't support X2APIC mode x2apic disabled The warning is caused by the calling of 'kmem_cache_destroy()' in free_iommu_resources(). Here is the call path: free_iommu_resources kmem_cache_destroy flush_memcg_workqueue flush_workqueue The root cause is that the IOMMU subsystem runs before the workqueue subsystem, which the variable 'wq_online' is still 'false'. This leads to the statement 'if (WARN_ON(!wq_online))' in flush_workqueue() is 'true'. Since the variable 'memcg_kmem_cache_wq' is not allocated during the time, it is unnecessary to call flush_memcg_workqueue(). This prevents the WARNING message triggered by flush_workqueue(). Link: http://lkml.kernel.org/r/20200103085503.1665-1-ahuang12@lenovo.com Fixes: 92ee383f6daab ("mm: fix race between kmem_cache destroy, create and deactivate") Signed-off-by: Adrian Huang <ahuang12@lenovo.com> Reported-by: Xiaochun Lee <lixc17@lenovo.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-13mm/page-writeback.c: improve arithmetic divisionsWen Yang1-1/+1
Use div64_ul() instead of do_div() if the divisor is unsigned long, to avoid truncation to 32-bit on 64-bit platforms. Link: http://lkml.kernel.org/r/20200102081442.8273-4-wenyang@linux.alibaba.com Signed-off-by: Wen Yang <wenyang@linux.alibaba.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Qian Cai <cai@lca.pw> Cc: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-13mm/page-writeback.c: use div64_ul() for u64-by-unsigned-long divideWen Yang1-2/+2
The two variables 'numerator' and 'denominator', though they are declared as long, they should actually be unsigned long (according to the implementation of the fprop_fraction_percpu() function) And do_div() does a 64-by-32 division, while the divisor 'denominator' is unsigned long, thus 64-bit on 64-bit platforms. Hence the proper function to call is div64_ul(). Link: http://lkml.kernel.org/r/20200102081442.8273-3-wenyang@linux.alibaba.com Signed-off-by: Wen Yang <wenyang@linux.alibaba.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Qian Cai <cai@lca.pw> Cc: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-13mm/page-writeback.c: avoid potential division by zero in wb_min_max_ratio()Wen Yang1-2/+2
Patch series "use div64_ul() instead of div_u64() if the divisor is unsigned long". We were first inspired by commit b0ab99e7736a ("sched: Fix possible divide by zero in avg_atom () calculation"), then refer to the recently analyzed mm code, we found this suspicious place. 201 if (min) { 202 min *= this_bw; 203 do_div(min, tot_bw); 204 } And we also disassembled and confirmed it: /usr/src/debug/kernel-4.9.168-016.ali3000/linux-4.9.168-016.ali3000.alios7.x86_64/mm/page-writeback.c: 201 0xffffffff811c37da <__wb_calc_thresh+234>: xor %r10d,%r10d 0xffffffff811c37dd <__wb_calc_thresh+237>: test %rax,%rax 0xffffffff811c37e0 <__wb_calc_thresh+240>: je 0xffffffff811c3800 <__wb_calc_thresh+272> /usr/src/debug/kernel-4.9.168-016.ali3000/linux-4.9.168-016.ali3000.alios7.x86_64/mm/page-writeback.c: 202 0xffffffff811c37e2 <__wb_calc_thresh+242>: imul %r8,%rax /usr/src/debug/kernel-4.9.168-016.ali3000/linux-4.9.168-016.ali3000.alios7.x86_64/mm/page-writeback.c: 203 0xffffffff811c37e6 <__wb_calc_thresh+246>: mov %r9d,%r10d ---> truncates it to 32 bits here 0xffffffff811c37e9 <__wb_calc_thresh+249>: xor %edx,%edx 0xffffffff811c37eb <__wb_calc_thresh+251>: div %r10 0xffffffff811c37ee <__wb_calc_thresh+254>: imul %rbx,%rax 0xffffffff811c37f2 <__wb_calc_thresh+258>: shr $0x2,%rax 0xffffffff811c37f6 <__wb_calc_thresh+262>: mul %rcx 0xffffffff811c37f9 <__wb_calc_thresh+265>: shr $0x2,%rdx 0xffffffff811c37fd <__wb_calc_thresh+269>: mov %rdx,%r10 This series uses div64_ul() instead of div_u64() if the divisor is unsigned long, to avoid truncation to 32-bit on 64-bit platforms. This patch (of 3): The variables 'min' and 'max' are unsigned long and do_div truncates them to 32 bits, which means it can test non-zero and be truncated to zero for division. Fix this issue by using div64_ul() instead. Link: http://lkml.kernel.org/r/20200102081442.8273-2-wenyang@linux.alibaba.com Fixes: 693108a8a667 ("writeback: make bdi->min/max_ratio handling cgroup writeback aware") Signed-off-by: Wen Yang <wenyang@linux.alibaba.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Qian Cai <cai@lca.pw> Cc: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-13mm, debug_pagealloc: don't rely on static keys too earlyVlastimil Babka6-32/+34
Commit 96a2b03f281d ("mm, debug_pagelloc: use static keys to enable debugging") has introduced a static key to reduce overhead when debug_pagealloc is compiled in but not enabled. It relied on the assumption that jump_label_init() is called before parse_early_param() as in start_kernel(), so when the "debug_pagealloc=on" option is parsed, it is safe to enable the static key. However, it turns out multiple architectures call parse_early_param() earlier from their setup_arch(). x86 also calls jump_label_init() even earlier, so no issue was found while testing the commit, but same is not true for e.g. ppc64 and s390 where the kernel would not boot with debug_pagealloc=on as found by our QA. To fix this without tricky changes to init code of multiple architectures, this patch partially reverts the static key conversion from 96a2b03f281d. Init-time and non-fastpath calls (such as in arch code) of debug_pagealloc_enabled() will again test a simple bool variable. Fastpath mm code is converted to a new debug_pagealloc_enabled_static() variant that relies on the static key, which is enabled in a well-defined point in mm_init() where it's guaranteed that jump_label_init() has been called, regardless of architecture. [sfr@canb.auug.org.au: export _debug_pagealloc_enabled_early] Link: http://lkml.kernel.org/r/20200106164944.063ac07b@canb.auug.org.au Link: http://lkml.kernel.org/r/20191219130612.23171-1-vbabka@suse.cz Fixes: 96a2b03f281d ("mm, debug_pagelloc: use static keys to enable debugging") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Qian Cai <cai@lca.pw> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-13mm: memcg/slab: fix percpu slab vmstats flushingRoman Gushchin2-31/+11
Currently slab percpu vmstats are flushed twice: during the memcg offlining and just before freeing the memcg structure. Each time percpu counters are summed, added to the atomic counterparts and propagated up by the cgroup tree. The second flushing is required due to how recursive vmstats are implemented: counters are batched in percpu variables on a local level, and once a percpu value is crossing some predefined threshold, it spills over to atomic values on the local and each ascendant levels. It means that without flushing some numbers cached in percpu variables will be dropped on floor each time a cgroup is destroyed. And with uptime the error on upper levels might become noticeable. The first flushing aims to make counters on ancestor levels more precise. Dying cgroups may resume in the dying state for a long time. After kmem_cache reparenting which is performed during the offlining slab counters of the dying cgroup don't have any chances to be updated, because any slab operations will be performed on the parent level. It means that the inaccuracy caused by percpu batching will not decrease up to the final destruction of the cgroup. By the original idea flushing slab counters during the offlining should minimize the visible inaccuracy of slab counters on the parent level. The problem is that percpu counters are not zeroed after the first flushing. So every cached percpu value is summed twice. It creates a small error (up to 32 pages per cpu, but usually less) which accumulates on parent cgroup level. After creating and destroying of thousands of child cgroups, slab counter on parent level can be way off the real value. For now, let's just stop flushing slab counters on memcg offlining. It can't be done correctly without scheduling a work on each cpu: reading and zeroing it during css offlining can race with an asynchronous update, which doesn't expect values to be changed underneath. With this change, slab counters on parent level will become eventually consistent. Once all dying children are gone, values are correct. And if not, the error is capped by 32 * NR_CPUS pages per dying cgroup. It's not perfect, as slab are reparented, so any updates after the reparenting will happen on the parent level. It means that if a slab page was allocated, a counter on child level was bumped, then the page was reparented and freed, the annihilation of positive and negative counter values will not happen until the child cgroup is released. It makes slab counters different from others, and it might want us to implement flushing in a correct form again. But it's also a question of performance: scheduling a work on each cpu isn't free, and it's an open question if the benefit of having more accurate counters is worth it. We might also consider flushing all counters on offlining, not only slab counters. So let's fix the main problem now: make the slab counters eventually consistent, so at least the error won't grow with uptime (or more precisely the number of created and destroyed cgroups). And think about the accuracy of counters separately. Link: http://lkml.kernel.org/r/20191220042728.1045881-1-guro@fb.com Fixes: bee07b33db78 ("mm: memcontrol: flush percpu slab vmstats on kmem offlining") Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-13mm/shmem.c: thp, shmem: fix conflict of above-47bit hint address and PMD alignmentKirill A. Shutemov1-3/+4
Shmem/tmpfs tries to provide THP-friendly mappings if huge pages are enabled. But it doesn't work well with above-47bit hint address. Normally, the kernel doesn't create userspace mappings above 47-bit, even if the machine allows this (such as with 5-level paging on x86-64). Not all user space is ready to handle wide addresses. It's known that at least some JIT compilers use higher bits in pointers to encode their information. Userspace can ask for allocation from full address space by specifying hint address (with or without MAP_FIXED) above 47-bits. If the application doesn't need a particular address, but wants to allocate from whole address space it can specify -1 as a hint address. Unfortunately, this trick breaks THP alignment in shmem/tmp: shmem_get_unmapped_area() would not try to allocate PMD-aligned area if *any* hint address specified. This can be fixed by requesting the aligned area if the we failed to allocated at user-specified hint address. The request with inflated length will also take the user-specified hint address. This way we will not lose an allocation request from the full address space. [kirill@shutemov.name: fold in a fixup] Link: http://lkml.kernel.org/r/20191223231309.t6bh5hkbmokihpfu@box Link: http://lkml.kernel.org/r/20191220142548.7118-3-kirill.shutemov@linux.intel.com Fixes: b569bab78d8d ("x86/mm: Prepare to expose larger address space to userspace") Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "Willhalm, Thomas" <thomas.willhalm@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: "Bruggeman, Otto G" <otto.g.bruggeman@intel.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-13mm/huge_memory.c: thp: fix conflict of above-47bit hint address and PMD alignmentKirill A. Shutemov1-14/+24
Patch series "Fix two above-47bit hint address vs. THP bugs". The two get_unmapped_area() implementations have to be fixed to provide THP-friendly mappings if above-47bit hint address is specified. This patch (of 2): Filesystems use thp_get_unmapped_area() to provide THP-friendly mappings. For DAX in particular. Normally, the kernel doesn't create userspace mappings above 47-bit, even if the machine allows this (such as with 5-level paging on x86-64). Not all user space is ready to handle wide addresses. It's known that at least some JIT compilers use higher bits in pointers to encode their information. Userspace can ask for allocation from full address space by specifying hint address (with or without MAP_FIXED) above 47-bits. If the application doesn't need a particular address, but wants to allocate from whole address space it can specify -1 as a hint address. Unfortunately, this trick breaks thp_get_unmapped_area(): the function would not try to allocate PMD-aligned area if *any* hint address specified. Modify the routine to handle it correctly: - Try to allocate the space at the specified hint address with length padding required for PMD alignment. - If failed, retry without length padding (but with the same hint address); - If the returned address matches the hint address return it. - Otherwise, align the address as required for THP and return. The user specified hint address is passed down to get_unmapped_area() so above-47bit hint address will be taken into account without breaking alignment requirements. Link: http://lkml.kernel.org/r/20191220142548.7118-2-kirill.shutemov@linux.intel.com Fixes: b569bab78d8d ("x86/mm: Prepare to expose larger address space to userspace") Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Thomas Willhalm <thomas.willhalm@intel.com> Tested-by: Dan Williams <dan.j.williams@intel.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com> Cc: "Bruggeman, Otto G" <otto.g.bruggeman@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-13mm/memory_hotplug: don't free usage map when removing a re-added early sectionDavid Hildenbrand1-1/+8
When we remove an early section, we don't free the usage map, as the usage maps of other sections are placed into the same page. Once the section is removed, it is no longer an early section (especially, the memmap is freed). When we re-add that section, the usage map is reused, however, it is no longer an early section. When removing that section again, we try to kfree() a usage map that was allocated during early boot - bad. Let's check against PageReserved() to see if we are dealing with an usage map that was allocated during boot. We could also check against !(PageSlab(usage_page) || PageCompound(usage_page)), but PageReserved() is cleaner. Can be triggered using memtrace under ppc64/powernv: $ mount -t debugfs none /sys/kernel/debug/ $ echo 0x20000000 > /sys/kernel/debug/powerpc/memtrace/enable $ echo 0x20000000 > /sys/kernel/debug/powerpc/memtrace/enable ------------[ cut here ]------------ kernel BUG at mm/slub.c:3969! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=3D64K MMU=3DHash SMP NR_CPUS=3D2048 NUMA PowerNV Modules linked in: CPU: 0 PID: 154 Comm: sh Not tainted 5.5.0-rc2-next-20191216-00005-g0be1dba7b7c0 #61 NIP kfree+0x338/0x3b0 LR section_deactivate+0x138/0x200 Call Trace: section_deactivate+0x138/0x200 __remove_pages+0x114/0x150 arch_remove_memory+0x3c/0x160 try_remove_memory+0x114/0x1a0 __remove_memory+0x20/0x40 memtrace_enable_set+0x254/0x850 simple_attr_write+0x138/0x160 full_proxy_write+0x8c/0x110 __vfs_write+0x38/0x70 vfs_write+0x11c/0x2a0 ksys_write+0x84/0x140 system_call+0x5c/0x68 ---[ end trace 4b053cbd84e0db62 ]--- The first invocation will offline+remove memory blocks. The second invocation will first add+online them again, in order to offline+remove them again (usually we are lucky and the exact same memory blocks will get "reallocated"). Tested on powernv with boot memory: The usage map will not get freed. Tested on x86-64 with DIMMs: The usage map will get freed. Using Dynamic Memory under a Power DLAPR can trigger it easily. Triggering removal (I assume after previously removed+re-added) of memory from the HMC GUI can crash the kernel with the same call trace and is fixed by this patch. Link: http://lkml.kernel.org/r/20191217104637.5509-1-david@redhat.com Fixes: 326e1b8f83a4 ("mm/sparsemem: introduce a SECTION_IS_EARLY flag") Signed-off-by: David Hildenbrand <david@redhat.com> Tested-by: Pingfan Liu <piliu@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-13mm, thp: tweak reclaim/compaction effort of local-only and all-node allocationsVlastimil Babka2-22/+12
THP page faults now attempt a __GFP_THISNODE allocation first, which should only compact existing free memory, followed by another attempt that can allocate from any node using reclaim/compaction effort specified by global defrag setting and madvise. This patch makes the following changes to the scheme: - Before the patch, the first allocation relies on a check for pageblock order and __GFP_IO to prevent excessive reclaim. This however affects also the second attempt, which is not limited to single node. Instead of that, reuse the existing check for costly order __GFP_NORETRY allocations, and make sure the first THP attempt uses __GFP_NORETRY. As a side-effect, all costly order __GFP_NORETRY allocations will bail out if compaction needs reclaim, while previously they only bailed out when compaction was deferred due to previous failures. This should be still acceptable within the __GFP_NORETRY semantics. - Before the patch, the second allocation attempt (on all nodes) was passing __GFP_NORETRY. This is redundant as the check for pageblock order (discussed above) was stronger. It's also contrary to madvise(MADV_HUGEPAGE) which means some effort to allocate THP is requested. After this patch, the second attempt doesn't pass __GFP_THISNODE nor __GFP_NORETRY. To sum up, THP page faults now try the following attempts: 1. local node only THP allocation with no reclaim, just compaction. 2. for madvised VMA's or when synchronous compaction is enabled always - THP allocation from any node with effort determined by global defrag setting and VMA madvise 3. fallback to base pages on any node Link: http://lkml.kernel.org/r/08a3f4dd-c3ce-0009-86c5-9ee51aba8557@suse.cz Fixes: b39d0ee2632d ("mm, page_alloc: avoid expensive reclaim when compaction may not succeed") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-12Linux 5.5-rc6Linus Torvalds1-1/+1
2020-01-12Merge tag 'riscv/for-v5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linuxLinus Torvalds4-6/+6
Pull RISC-V fixes from Paul Walmsley: "Two fixes for RISC-V: - Clear FP registers during boot when FP support is present, rather than when they aren't present - Move the header files associated with the SiFive L2 cache controller to drivers/soc (where the code was recently moved)" * tag 'riscv/for-v5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Fixup obvious bug for fp-regs reset riscv: move sifive_l2_cache.h to include/soc
2020-01-12riscv: Fixup obvious bug for fp-regs resetGuo Ren1-1/+1
CSR_MISA is defined in Privileged Architectures' spec: 3.1.1 Machine ISA Register misa. Every bit:1 indicate a feature, so we should beqz reset_done when there is no F/D bit in csr_misa register. Signed-off-by: Guo Ren <ren_guo@c-sky.com> [paul.walmsley@sifive.com: fix typo in commit message] Fixes: 9e80635619b51 ("riscv: clear the instruction cache and all registers when booting") Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2020-01-12riscv: move sifive_l2_cache.h to include/socYash Shah3-5/+5
The commit 9209fb51896f ("riscv: move sifive_l2_cache.c to drivers/soc") moves the sifive L2 cache driver to driver/soc. It did not move the header file along with the driver. Therefore this patch moves the header file to driver/soc Signed-off-by: Yash Shah <yash.shah@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> [paul.walmsley@sifive.com: updated to fix the include guard] Fixes: 9209fb51896f ("riscv: move sifive_l2_cache.c to drivers/soc") Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2020-01-12Merge tag 'iommu-fixes-v5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommuLinus Torvalds3-7/+19
Pull iommu fixes from Joerg Roedel: - Two fixes for VT-d and generic IOMMU code to fix teardown on error handling code paths. - Patch for the Intel VT-d driver to fix handling of non-PCI devices - Fix W=1 compile warning in dma-iommu code * tag 'iommu-fixes-v5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/dma: fix variable 'cookie' set but not used iommu/vt-d: Unlink device if failed to add to group iommu: Remove device link to group on failure iommu/vt-d: Fix adding non-PCI devices to Intel IOMMU
2020-01-11Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linuxLinus Torvalds4-15/+23
Pull i2c fixes from Wolfram Sang: "Two driver bugfixes, a documentation fix, and a removal of a spec violation for the bus recovery algorithm in the core" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: fix bus recovery stop mode timing i2c: bcm2835: Store pointer to bus clock dt-bindings: i2c: at91: fix i2c-sda-hold-time-ns documentation for sam9x60 i2c: at91: fix clk_offset for sam9x60
2020-01-11Merge tag 'clone3-tls-v5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linuxLinus Torvalds18-32/+45
Pull thread fixes from Christian Brauner: "This contains a series of patches to fix CLONE_SETTLS when used with clone3(). The clone3() syscall passes the tls argument through struct clone_args instead of a register. This means, all architectures that do not implement copy_thread_tls() but still support CLONE_SETTLS via copy_thread() expecting the tls to be located in a register argument based on clone() are currently unfortunately broken. Their tls value will be garbage. The patch series fixes this on all architectures that currently define __ARCH_WANT_SYS_CLONE3. It also adds a compile-time check to ensure that any architecture that enables clone3() in the future is forced to also implement copy_thread_tls(). My ultimate goal is to get rid of the copy_thread()/copy_thread_tls() split and just have copy_thread_tls() at some point in the not too distant future (Maybe even renaming copy_thread_tls() back to simply copy_thread() once the old function is ripped from all arches). This is dependent now on all arches supporting clone3(). While all relevant arches do that now there are still four missing: ia64, m68k, sh and sparc. They have the system call reserved, but not implemented. Once they all implement clone3() we can get rid of ARCH_WANT_SYS_CLONE3 and HAVE_COPY_THREAD_TLS. This series also includes a minor fix for the arm64 uapi headers which caused __NR_clone3 to be missing from the exported user headers. Unfortunately the series came in a little late especially given that it touches a range of architectures. Due to the holidays not all arch maintainers responded in time probably due to their backlog. Will and Arnd have thankfully acked the arm specific changes. Given that the changes are straightforward and rather minimal combined with the fact the that clone3() with CLONE_SETTLS is broken I decided to send them post rc3 nonetheless" * tag 'clone3-tls-v5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: um: Implement copy_thread_tls clone3: ensure copy_thread_tls is implemented xtensa: Implement copy_thread_tls riscv: Implement copy_thread_tls parisc: Implement copy_thread_tls arm: Implement copy_thread_tls arm64: Implement copy_thread_tls arm64: Move __ARCH_WANT_SYS_CLONE3 definition to uapi headers
2020-01-10Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hidLinus Torvalds2-5/+7
Pull HID fix from Jiri Kosina: "A regression fix for EPOLLOUT handling in hidraw and uhid" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: hidraw, uhid: Always report EPOLLOUT
2020-01-10Merge tag 'usb-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usbLinus Torvalds19-93/+243
Pull USB/PHY fixes from Greg KH: "Here are a number of USB and PHY driver fixes for 5.5-rc6 Nothing all that unusual, just the a bunch of small fixes for a lot of different reported issues. The PHY driver fixes are in here as they interacted with the usb drivers. Full details of the patches are in the shortlog, and all of these have been in linux-next with no reported issues" * tag 'usb-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (24 commits) usb: missing parentheses in USE_NEW_SCHEME usb: ohci-da8xx: ensure error return on variable error is set usb: musb: Disable pullup at init usb: musb: fix idling for suspend after disconnect interrupt usb: typec: ucsi: Fix the notification bit offsets USB: Fix: Don't skip endpoint descriptors with maxpacket=0 USB-PD tcpm: bad warning+size, PPS adapters phy/rockchip: inno-hdmi: round clock rate down to closest 1000 Hz usb: chipidea: host: Disable port power only if previously enabled usb: cdns3: should not use the same dev_id for shared interrupt handler usb: dwc3: gadget: Fix request complete check usb: musb: dma: Correct parameter passed to IRQ handler usb: musb: jz4740: Silence error if code is -EPROBE_DEFER usb: udc: tegra: select USB_ROLE_SWITCH USB: core: fix check for duplicate endpoints phy: cpcap-usb: Drop extra write to usb2 register phy: cpcap-usb: Improve host vs docked mode detection phy: cpcap-usb: Prevent USB line glitches from waking up modem phy: mapphone-mdm6600: Fix uninitialized status value regression phy: cpcap-usb: Fix flakey host idling and enumerating of devices ...
2020-01-10Merge tag 'char-misc-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-miscLinus Torvalds1-1/+1
Pull char/misc fix from Greg KH: "Here is a single fix, for the chrdev core, for 5.5-rc6 There's been a long-standing race condition triggered by syzbot, and occasionally real people, in the chrdev open() path. Will finally took the time to track it down and fix it for real before the holidays. Here's that one patch, it's been in linux-next for a while with no reported issues and it does fix the reported problem" * tag 'char-misc-5.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: chardev: Avoid potential use-after-free in 'chrdev_open()'