aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2019-11-25cifs: Always update signing key of first channelPaulo Alcantara (SUSE)1-0/+4
Update signing key of first channel whenever generating the master sigining/encryption/decryption keys rather than only in cifs_mount(). This also fixes reconnect when re-establishing smb sessions to other servers. Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: Fix retrieval of DFS referrals in cifs_mount()Paulo Alcantara (SUSE)1-10/+22
Make sure that DFS referrals are sent to newly resolved root targets as in a multi tier DFS setup. Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Link: https://lkml.kernel.org/r/05aa2995-e85e-0ff4-d003-5bb08bd17a22@canonical.com Cc: stable@vger.kernel.org Tested-by: Matthew Ruffell <matthew.ruffell@canonical.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: Fix potential softlockups while refreshing DFS cachePaulo Alcantara (SUSE)1-12/+29
We used to skip reconnects on all SMB2_IOCTL commands due to SMB3+ FSCTL_VALIDATE_NEGOTIATE_INFO - which made sense since we're still establishing a SMB session. However, when refresh_cache_worker() calls smb2_get_dfs_refer() and we're under reconnect, SMB2_ioctl() will not be able to get a proper status error (e.g. -EHOSTDOWN in case we failed to reconnect) but an -EAGAIN from cifs_send_recv() thus looping forever in refresh_cache_worker(). Fixes: e99c63e4d86d ("SMB3: Fix deadlock in validate negotiate hits reconnect") Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Suggested-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: Fix lookup of root ses in DFS referral cachePaulo Alcantara (SUSE)1-2/+1
We don't care about module aliasing validation in cifs_compose_mount_options(..., is_smb3) when finding the root SMB session of an DFS namespace in order to refresh DFS referral cache. The following issue has been observed when mounting with '-t smb3' and then specifying 'vers=2.0': ... Nov 08 15:27:08 tw kernel: address conversion returned 0 for FS0.WIN.LOCAL Nov 08 15:27:08 tw kernel: [kworke] ==> dns_query((null),FS0.WIN.LOCAL,13,(null)) Nov 08 15:27:08 tw kernel: [kworke] call request_key(,FS0.WIN.LOCAL,) Nov 08 15:27:08 tw kernel: [kworke] ==> dns_resolver_cmp(FS0.WIN.LOCAL,FS0.WIN.LOCAL) Nov 08 15:27:08 tw kernel: [kworke] <== dns_resolver_cmp() = 1 Nov 08 15:27:08 tw kernel: [kworke] <== dns_query() = 13 Nov 08 15:27:08 tw kernel: fs/cifs/dns_resolve.c: dns_resolve_server_name_to_ip: resolved: FS0.WIN.LOCAL to 192.168.30.26 ===> Nov 08 15:27:08 tw kernel: CIFS VFS: vers=2.0 not permitted when mounting with smb3 Nov 08 15:27:08 tw kernel: fs/cifs/dfs_cache.c: CIFS VFS: leaving refresh_tcon (xid = 26) rc = -22 ... Fixes: 5072010ccf05 ("cifs: Fix DFS cache refresher for DFS links") Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: Fix use-after-free bug in cifs_reconnect()Paulo Alcantara (SUSE)1-11/+35
Ensure we grab an active reference in cifs superblock while doing failover to prevent automounts (DFS links) of expiring and then destroying the superblock pointer. This patch fixes the following KASAN report: [ 464.301462] BUG: KASAN: use-after-free in cifs_reconnect+0x6ab/0x1350 [ 464.303052] Read of size 8 at addr ffff888155e580d0 by task cifsd/1107 [ 464.304682] CPU: 3 PID: 1107 Comm: cifsd Not tainted 5.4.0-rc4+ #13 [ 464.305552] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58-rebuilt.opensuse.org 04/01/2014 [ 464.307146] Call Trace: [ 464.307875] dump_stack+0x5b/0x90 [ 464.308631] print_address_description.constprop.0+0x16/0x200 [ 464.309478] ? cifs_reconnect+0x6ab/0x1350 [ 464.310253] ? cifs_reconnect+0x6ab/0x1350 [ 464.311040] __kasan_report.cold+0x1a/0x41 [ 464.311811] ? cifs_reconnect+0x6ab/0x1350 [ 464.312563] kasan_report+0xe/0x20 [ 464.313300] cifs_reconnect+0x6ab/0x1350 [ 464.314062] ? extract_hostname.part.0+0x90/0x90 [ 464.314829] ? printk+0xad/0xde [ 464.315525] ? _raw_spin_lock+0x7c/0xd0 [ 464.316252] ? _raw_read_lock_irq+0x40/0x40 [ 464.316961] ? ___ratelimit+0xed/0x182 [ 464.317655] cifs_readv_from_socket+0x289/0x3b0 [ 464.318386] cifs_read_from_socket+0x98/0xd0 [ 464.319078] ? cifs_readv_from_socket+0x3b0/0x3b0 [ 464.319782] ? try_to_wake_up+0x43c/0xa90 [ 464.320463] ? cifs_small_buf_get+0x4b/0x60 [ 464.321173] ? allocate_buffers+0x98/0x1a0 [ 464.321856] cifs_demultiplex_thread+0x218/0x14a0 [ 464.322558] ? cifs_handle_standard+0x270/0x270 [ 464.323237] ? __switch_to_asm+0x40/0x70 [ 464.323893] ? __switch_to_asm+0x34/0x70 [ 464.324554] ? __switch_to_asm+0x40/0x70 [ 464.325226] ? __switch_to_asm+0x40/0x70 [ 464.325863] ? __switch_to_asm+0x34/0x70 [ 464.326505] ? __switch_to_asm+0x40/0x70 [ 464.327161] ? __switch_to_asm+0x34/0x70 [ 464.327784] ? finish_task_switch+0xa1/0x330 [ 464.328414] ? __switch_to+0x363/0x640 [ 464.329044] ? __schedule+0x575/0xaf0 [ 464.329655] ? _raw_spin_lock_irqsave+0x82/0xe0 [ 464.330301] kthread+0x1a3/0x1f0 [ 464.330884] ? cifs_handle_standard+0x270/0x270 [ 464.331624] ? kthread_create_on_node+0xd0/0xd0 [ 464.332347] ret_from_fork+0x35/0x40 [ 464.333577] Allocated by task 1110: [ 464.334381] save_stack+0x1b/0x80 [ 464.335123] __kasan_kmalloc.constprop.0+0xc2/0xd0 [ 464.335848] cifs_smb3_do_mount+0xd4/0xb00 [ 464.336619] legacy_get_tree+0x6b/0xa0 [ 464.337235] vfs_get_tree+0x41/0x110 [ 464.337975] fc_mount+0xa/0x40 [ 464.338557] vfs_kern_mount.part.0+0x6c/0x80 [ 464.339227] cifs_dfs_d_automount+0x336/0xd29 [ 464.339846] follow_managed+0x1b1/0x450 [ 464.340449] lookup_fast+0x231/0x4a0 [ 464.341039] path_openat+0x240/0x1fd0 [ 464.341634] do_filp_open+0x126/0x1c0 [ 464.342277] do_sys_open+0x1eb/0x2c0 [ 464.342957] do_syscall_64+0x5e/0x190 [ 464.343555] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 464.344772] Freed by task 0: [ 464.345347] save_stack+0x1b/0x80 [ 464.345966] __kasan_slab_free+0x12c/0x170 [ 464.346576] kfree+0xa6/0x270 [ 464.347211] rcu_core+0x39c/0xc80 [ 464.347800] __do_softirq+0x10d/0x3da [ 464.348919] The buggy address belongs to the object at ffff888155e58000 which belongs to the cache kmalloc-256 of size 256 [ 464.350222] The buggy address is located 208 bytes inside of 256-byte region [ffff888155e58000, ffff888155e58100) [ 464.351575] The buggy address belongs to the page: [ 464.352333] page:ffffea0005579600 refcount:1 mapcount:0 mapping:ffff88815a803400 index:0x0 compound_mapcount: 0 [ 464.353583] flags: 0x200000000010200(slab|head) [ 464.354209] raw: 0200000000010200 ffffea0005576200 0000000400000004 ffff88815a803400 [ 464.355353] raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000 [ 464.356458] page dumped because: kasan: bad access detected [ 464.367005] Memory state around the buggy address: [ 464.367787] ffff888155e57f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 464.368877] ffff888155e58000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 464.369967] >ffff888155e58080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 464.371111] ^ [ 464.371775] ffff888155e58100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 464.372893] ffff888155e58180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 464.373983] ================================================================== Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: dump channel info in DebugDataAurelien Aptel1-1/+34
* show server&TCP states for extra channels * mention if an interface has a channel connected to it In this version three of the patch, fixed minor printk format issue pointed out by the kbuild robot. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25smb3: dump in_send and num_waiters stats counters by defaultSteve French2-21/+4
Number of requests in_send and the number of waiters on sendRecv are useful counters in various cases, move them from CONFIG_CIFS_STATS2 to be on by default especially with multichannel Signed-off-by: Steve French <stfrench@microsoft.com> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-11-25cifs: try harder to open new channelsAurelien Aptel1-10/+22
Previously we would only loop over the iface list once. This patch tries to loop over multiple times until all channels are opened. It will also try to reuse RSS ifaces. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25CIFS: Properly process SMB3 lease breaksPavel Shilovsky7-65/+57
Currenly we doesn't assume that a server may break a lease from RWH to RW which causes us setting a wrong lease state on a file and thus mistakenly flushing data and byte-range locks and purging cached data on the client. This leads to performance degradation because subsequent IOs go directly to the server. Fix this by propagating new lease state and epoch values to the oplock break handler through cifsFileInfo structure and removing the use of cifsInodeInfo flags for that. It allows to avoid some races of several lease/oplock breaks using those flags in parallel. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: move cifsFileInfo_put logic into a work-queueRonnie Sahlberg3-27/+65
This patch moves the final part of the cifsFileInfo_put() logic where we need a write lock on lock_sem to be processed in a separate thread that holds no other locks. This is to prevent deadlocks like the one below: > there are 6 processes looping to while trying to down_write > cinode->lock_sem, 5 of them from _cifsFileInfo_put, and one from > cifs_new_fileinfo > > and there are 5 other processes which are blocked, several of them > waiting on either PG_writeback or PG_locked (which are both set), all > for the same page of the file > > 2 inode_lock() (inode->i_rwsem) for the file > 1 wait_on_page_writeback() for the page > 1 down_read(inode->i_rwsem) for the inode of the directory > 1 inode_lock()(inode->i_rwsem) for the inode of the directory > 1 __lock_page > > > so processes are blocked waiting on: > page flags PG_locked and PG_writeback for one specific page > inode->i_rwsem for the directory > inode->i_rwsem for the file > cifsInodeInflock_sem > > > > here are the more gory details (let me know if I need to provide > anything more/better): > > [0 00:48:22.765] [UN] PID: 8863 TASK: ffff8c691547c5c0 CPU: 3 > COMMAND: "reopen_file" > #0 [ffff9965007e3ba8] __schedule at ffffffff9b6e6095 > #1 [ffff9965007e3c38] schedule at ffffffff9b6e64df > #2 [ffff9965007e3c48] rwsem_down_write_slowpath at ffffffff9af283d7 > #3 [ffff9965007e3cb8] legitimize_path at ffffffff9b0f975d > #4 [ffff9965007e3d08] path_openat at ffffffff9b0fe55d > #5 [ffff9965007e3dd8] do_filp_open at ffffffff9b100a33 > #6 [ffff9965007e3ee0] do_sys_open at ffffffff9b0eb2d6 > #7 [ffff9965007e3f38] do_syscall_64 at ffffffff9ae04315 > * (I think legitimize_path is bogus) > > in path_openat > } else { > const char *s = path_init(nd, flags); > while (!(error = link_path_walk(s, nd)) && > (error = do_last(nd, file, op)) > 0) { <<<< > > do_last: > if (open_flag & O_CREAT) > inode_lock(dir->d_inode); <<<< > else > so it's trying to take inode->i_rwsem for the directory > > DENTRY INODE SUPERBLK TYPE PATH > ffff8c68bb8e79c0 ffff8c691158ef20 ffff8c6915bf9000 DIR /mnt/vm1_smb/ > inode.i_rwsem is ffff8c691158efc0 > > <struct rw_semaphore 0xffff8c691158efc0>: > owner: <struct task_struct 0xffff8c6914275d00> (UN - 8856 - > reopen_file), counter: 0x0000000000000003 > waitlist: 2 > 0xffff9965007e3c90 8863 reopen_file UN 0 1:29:22.926 > RWSEM_WAITING_FOR_WRITE > 0xffff996500393e00 9802 ls UN 0 1:17:26.700 > RWSEM_WAITING_FOR_READ > > > the owner of the inode.i_rwsem of the directory is: > > [0 00:00:00.109] [UN] PID: 8856 TASK: ffff8c6914275d00 CPU: 3 > COMMAND: "reopen_file" > #0 [ffff99650065b828] __schedule at ffffffff9b6e6095 > #1 [ffff99650065b8b8] schedule at ffffffff9b6e64df > #2 [ffff99650065b8c8] schedule_timeout at ffffffff9b6e9f89 > #3 [ffff99650065b940] msleep at ffffffff9af573a9 > #4 [ffff99650065b948] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs] > #5 [ffff99650065ba38] cifs_writepage_locked at ffffffffc0a0b8f3 [cifs] > #6 [ffff99650065bab0] cifs_launder_page at ffffffffc0a0bb72 [cifs] > #7 [ffff99650065bb30] invalidate_inode_pages2_range at ffffffff9b04d4bd > #8 [ffff99650065bcb8] cifs_invalidate_mapping at ffffffffc0a11339 [cifs] > #9 [ffff99650065bcd0] cifs_revalidate_mapping at ffffffffc0a1139a [cifs] > #10 [ffff99650065bcf0] cifs_d_revalidate at ffffffffc0a014f6 [cifs] > #11 [ffff99650065bd08] path_openat at ffffffff9b0fe7f7 > #12 [ffff99650065bdd8] do_filp_open at ffffffff9b100a33 > #13 [ffff99650065bee0] do_sys_open at ffffffff9b0eb2d6 > #14 [ffff99650065bf38] do_syscall_64 at ffffffff9ae04315 > > cifs_launder_page is for page 0xffffd1e2c07d2480 > > crash> page.index,mapping,flags 0xffffd1e2c07d2480 > index = 0x8 > mapping = 0xffff8c68f3cd0db0 > flags = 0xfffffc0008095 > > PAGE-FLAG BIT VALUE > PG_locked 0 0000001 > PG_uptodate 2 0000004 > PG_lru 4 0000010 > PG_waiters 7 0000080 > PG_writeback 15 0008000 > > > inode is ffff8c68f3cd0c40 > inode.i_rwsem is ffff8c68f3cd0ce0 > DENTRY INODE SUPERBLK TYPE PATH > ffff8c68a1f1b480 ffff8c68f3cd0c40 ffff8c6915bf9000 REG > /mnt/vm1_smb/testfile.8853 > > > this process holds the inode->i_rwsem for the parent directory, is > laundering a page attached to the inode of the file it's opening, and in > _cifsFileInfo_put is trying to down_write the cifsInodeInflock_sem > for the file itself. > > > <struct rw_semaphore 0xffff8c68f3cd0ce0>: > owner: <struct task_struct 0xffff8c6914272e80> (UN - 8854 - > reopen_file), counter: 0x0000000000000003 > waitlist: 1 > 0xffff9965005dfd80 8855 reopen_file UN 0 1:29:22.912 > RWSEM_WAITING_FOR_WRITE > > this is the inode.i_rwsem for the file > > the owner: > > [0 00:48:22.739] [UN] PID: 8854 TASK: ffff8c6914272e80 CPU: 2 > COMMAND: "reopen_file" > #0 [ffff99650054fb38] __schedule at ffffffff9b6e6095 > #1 [ffff99650054fbc8] schedule at ffffffff9b6e64df > #2 [ffff99650054fbd8] io_schedule at ffffffff9b6e68e2 > #3 [ffff99650054fbe8] __lock_page at ffffffff9b03c56f > #4 [ffff99650054fc80] pagecache_get_page at ffffffff9b03dcdf > #5 [ffff99650054fcc0] grab_cache_page_write_begin at ffffffff9b03ef4c > #6 [ffff99650054fcd0] cifs_write_begin at ffffffffc0a064ec [cifs] > #7 [ffff99650054fd30] generic_perform_write at ffffffff9b03bba4 > #8 [ffff99650054fda8] __generic_file_write_iter at ffffffff9b04060a > #9 [ffff99650054fdf0] cifs_strict_writev.cold.70 at ffffffffc0a4469b [cifs] > #10 [ffff99650054fe48] new_sync_write at ffffffff9b0ec1dd > #11 [ffff99650054fed0] vfs_write at ffffffff9b0eed35 > #12 [ffff99650054ff00] ksys_write at ffffffff9b0eefd9 > #13 [ffff99650054ff38] do_syscall_64 at ffffffff9ae04315 > > the process holds the inode->i_rwsem for the file to which it's writing, > and is trying to __lock_page for the same page as in the other processes > > > the other tasks: > [0 00:00:00.028] [UN] PID: 8859 TASK: ffff8c6915479740 CPU: 2 > COMMAND: "reopen_file" > #0 [ffff9965007b39d8] __schedule at ffffffff9b6e6095 > #1 [ffff9965007b3a68] schedule at ffffffff9b6e64df > #2 [ffff9965007b3a78] schedule_timeout at ffffffff9b6e9f89 > #3 [ffff9965007b3af0] msleep at ffffffff9af573a9 > #4 [ffff9965007b3af8] cifs_new_fileinfo.cold.61 at ffffffffc0a42a07 [cifs] > #5 [ffff9965007b3b78] cifs_open at ffffffffc0a0709d [cifs] > #6 [ffff9965007b3cd8] do_dentry_open at ffffffff9b0e9b7a > #7 [ffff9965007b3d08] path_openat at ffffffff9b0fe34f > #8 [ffff9965007b3dd8] do_filp_open at ffffffff9b100a33 > #9 [ffff9965007b3ee0] do_sys_open at ffffffff9b0eb2d6 > #10 [ffff9965007b3f38] do_syscall_64 at ffffffff9ae04315 > > this is opening the file, and is trying to down_write cinode->lock_sem > > > [0 00:00:00.041] [UN] PID: 8860 TASK: ffff8c691547ae80 CPU: 2 > COMMAND: "reopen_file" > [0 00:00:00.057] [UN] PID: 8861 TASK: ffff8c6915478000 CPU: 3 > COMMAND: "reopen_file" > [0 00:00:00.059] [UN] PID: 8858 TASK: ffff8c6914271740 CPU: 2 > COMMAND: "reopen_file" > [0 00:00:00.109] [UN] PID: 8862 TASK: ffff8c691547dd00 CPU: 6 > COMMAND: "reopen_file" > #0 [ffff9965007c3c78] __schedule at ffffffff9b6e6095 > #1 [ffff9965007c3d08] schedule at ffffffff9b6e64df > #2 [ffff9965007c3d18] schedule_timeout at ffffffff9b6e9f89 > #3 [ffff9965007c3d90] msleep at ffffffff9af573a9 > #4 [ffff9965007c3d98] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs] > #5 [ffff9965007c3e88] cifs_close at ffffffffc0a07aaf [cifs] > #6 [ffff9965007c3ea0] __fput at ffffffff9b0efa6e > #7 [ffff9965007c3ee8] task_work_run at ffffffff9aef1614 > #8 [ffff9965007c3f20] exit_to_usermode_loop at ffffffff9ae03d6f > #9 [ffff9965007c3f38] do_syscall_64 at ffffffff9ae0444c > > closing the file, and trying to down_write cifsi->lock_sem > > > [0 00:48:22.839] [UN] PID: 8857 TASK: ffff8c6914270000 CPU: 7 > COMMAND: "reopen_file" > #0 [ffff9965006a7cc8] __schedule at ffffffff9b6e6095 > #1 [ffff9965006a7d58] schedule at ffffffff9b6e64df > #2 [ffff9965006a7d68] io_schedule at ffffffff9b6e68e2 > #3 [ffff9965006a7d78] wait_on_page_bit at ffffffff9b03cac6 > #4 [ffff9965006a7e10] __filemap_fdatawait_range at ffffffff9b03b028 > #5 [ffff9965006a7ed8] filemap_write_and_wait at ffffffff9b040165 > #6 [ffff9965006a7ef0] cifs_flush at ffffffffc0a0c2fa [cifs] > #7 [ffff9965006a7f10] filp_close at ffffffff9b0e93f1 > #8 [ffff9965006a7f30] __x64_sys_close at ffffffff9b0e9a0e > #9 [ffff9965006a7f38] do_syscall_64 at ffffffff9ae04315 > > in __filemap_fdatawait_range > wait_on_page_writeback(page); > for the same page of the file > > > > [0 00:48:22.718] [UN] PID: 8855 TASK: ffff8c69142745c0 CPU: 7 > COMMAND: "reopen_file" > #0 [ffff9965005dfc98] __schedule at ffffffff9b6e6095 > #1 [ffff9965005dfd28] schedule at ffffffff9b6e64df > #2 [ffff9965005dfd38] rwsem_down_write_slowpath at ffffffff9af283d7 > #3 [ffff9965005dfdf0] cifs_strict_writev at ffffffffc0a0c40a [cifs] > #4 [ffff9965005dfe48] new_sync_write at ffffffff9b0ec1dd > #5 [ffff9965005dfed0] vfs_write at ffffffff9b0eed35 > #6 [ffff9965005dff00] ksys_write at ffffffff9b0eefd9 > #7 [ffff9965005dff38] do_syscall_64 at ffffffff9ae04315 > > inode_lock(inode); > > > and one 'ls' later on, to see whether the rest of the mount is available > (the test file is in the root, so we get blocked up on the directory > ->i_rwsem), so the entire mount is unavailable > > [0 00:36:26.473] [UN] PID: 9802 TASK: ffff8c691436ae80 CPU: 4 > COMMAND: "ls" > #0 [ffff996500393d28] __schedule at ffffffff9b6e6095 > #1 [ffff996500393db8] schedule at ffffffff9b6e64df > #2 [ffff996500393dc8] rwsem_down_read_slowpath at ffffffff9b6e9421 > #3 [ffff996500393e78] down_read_killable at ffffffff9b6e95e2 > #4 [ffff996500393e88] iterate_dir at ffffffff9b103c56 > #5 [ffff996500393ec8] ksys_getdents64 at ffffffff9b104b0c > #6 [ffff996500393f30] __x64_sys_getdents64 at ffffffff9b104bb6 > #7 [ffff996500393f38] do_syscall_64 at ffffffff9ae04315 > > in iterate_dir: > if (shared) > res = down_read_killable(&inode->i_rwsem); <<<< > else > res = down_write_killable(&inode->i_rwsem); > Reported-by: Frank Sorenson <sorenson@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: try opening channels after mountingAurelien Aptel9-88/+478
After doing mount() successfully we call cifs_try_adding_channels() which will open as many channels as it can. Channels are closed when the master session is closed. The master connection becomes the first channel. ,-------------> global cifs_tcp_ses_list <-------------------------. | | '- TCP_Server_Info <--> TCP_Server_Info <--> TCP_Server_Info <-' (master con) (chan#1 con) (chan#2 con) | ^ ^ ^ v '--------------------|--------------------' cifs_ses | - chan_count = 3 | - chans[] ---------------------' - smb3signingkey[] (master signing key) Note how channel connections don't have sessions. That's because cifs_ses can only be part of one linked list (list_head are internal to the elements). For signing keys, each channel has its own signing key which must be used only after the channel has been bound. While it's binding it must use the master session signing key. For encryption keys, since channel connections do not have sessions attached we must now find matching session by looping over all sessions in smb2_get_enc_key(). Each channel is opened like a regular server connection but at the session setup request step it must set the SMB2_SESSION_REQ_FLAG_BINDING flag and use the session id to bind to. Finally, while sending in compound_send_recv() for requests that aren't negprot, ses-setup or binding related, use a channel by cycling through the available ones (round-robin). Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25CIFS: refactor cifs_get_inode_info()Aurelien Aptel2-137/+205
Make logic of cifs_get_inode() much clearer by moving code to sub functions and adding comments. Document the steps this function does. cifs_get_inode_info() gets and updates a file inode metadata from its file path. * If caller already has raw info data from server they can pass it. * If inode already exists (just need to update) caller can pass it. Step 1: get raw data from server if none was passed Step 2: parse raw data into intermediate internal cifs_fattr struct Step 3: set fattr uniqueid which is later used for inode number. This can sometime be done from raw data Step 4: tweak fattr according to mount options (file_mode, acl to mode bits, uid, gid, etc) Step 5: update or create inode from final fattr struct * add is_smb1_server() helper * add is_inode_cache_good() helper * move SMB1-backupcreds-getinfo-retry to separate func cifs_backup_query_path_info(). * move set-uniqueid code to separate func cifs_set_fattr_ino() * don't clobber uniqueid from backup cred retry * fix some probable corner cases memleaks Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: switch servers depending on binding stateAurelien Aptel6-21/+33
Currently a lot of the code to initialize a connection & session uses the cifs_ses as input. But depending on if we are opening a new session or a new channel we need to use different server pointers. Add a "binding" flag in cifs_ses and a helper function that returns the server ptr a session should use (only in the sess establishment code path). Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: add server paramAurelien Aptel5-17/+22
As we get down to the transport layer, plenty of functions are passed the session pointer and assume the transport to use is ses->server. Instead we modify those functions to pass (ses, server) so that we can decouple the session from the server. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: add multichannel mount options and data structsAurelien Aptel3-2/+56
adds: - [no]multichannel to enable/disable multichannel - max_channels=N to control how many channels to create these options are then stored in the volume struct. - store channels and max_channels in cifs_ses Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: sort interface list by speedAurelien Aptel1-0/+11
New channels are going to be opened by walking the list sequentially, so by sorting it we will connect to the fastest interfaces first. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25CIFS: Fix SMB2 oplock break processingPavel Shilovsky1-4/+3
Even when mounting modern protocol version the server may be configured without supporting SMB2.1 leases and the client uses SMB2 oplock to optimize IO performance through local caching. However there is a problem in oplock break handling that leads to missing a break notification on the client who has a file opened. It latter causes big latencies to other clients that are trying to open the same file. The problem reproduces when there are multiple shares from the same server mounted on the client. The processing code tries to match persistent and volatile file ids from the break notification with an open file but it skips all share besides the first one. Fix this by looking up in all shares belonging to the server that issued the oplock break. Cc: Stable <stable@vger.kernel.org> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: don't use 'pre:' for MODULE_SOFTDEPRonnie Sahlberg1-12/+12
It can cause to fail with modprobe: FATAL: Module <module> is builtin. RHBZ: 1767094 Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: smbd: Return -EAGAIN when transport is reconnectingLong Li1-2/+5
During reconnecting, the transport may have already been destroyed and is in the process being reconnected. In this case, return -EAGAIN to not fail and to retry this I/O. Signed-off-by: Long Li <longli@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: smbd: Only queue work for error recovery on memory registrationLong Li1-11/+15
It's not necessary to queue invalidated memory registration to work queue, as all we need to do is to unmap the SG and make it usable again. This can save CPU cycles in normal data paths as memory registration errors are rare and normally only happens during reconnection. Signed-off-by: Long Li <longli@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25smb3: add debug messages for closing unmatched openRonnie Sahlberg2-10/+34
Helps distinguish between an interrupted close and a truly unmatched open. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25CIFS: Do not miss cancelled OPEN responsesPavel Shilovsky2-8/+8
When an OPEN command is cancelled we mark a mid as cancelled and let the demultiplex thread process it by closing an open handle. The problem is there is a race between a system call thread and the demultiplex thread and there may be a situation when the mid has been already processed before it is set as cancelled. Fix this by processing cancelled requests when mids are being destroyed which means that there is only one thread referencing a particular mid. Also set mids as cancelled unconditionally on their state. Cc: Stable <stable@vger.kernel.org> Tested-by: Frank Sorenson <sorenson@redhat.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25CIFS: Fix NULL pointer dereference in mid callbackPavel Shilovsky3-7/+17
There is a race between a system call processing thread and the demultiplex thread when mid->resp_buf becomes NULL and later is being accessed to get credits. It happens when the 1st thread wakes up before a mid callback is called in the 2nd one but the mid state has already been set to MID_RESPONSE_RECEIVED. This causes NULL pointer dereference in mid callback. Fix this by saving credits from the response before we update the mid state and then use this value in the mid callback rather then accessing a response buffer. Cc: Stable <stable@vger.kernel.org> Fixes: ee258d79159afed5 ("CIFS: Move credit processing to mid callbacks for SMB3") Tested-by: Frank Sorenson <sorenson@redhat.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25CIFS: Close open handle after interrupted closePavel Shilovsky3-15/+63
If Close command is interrupted before sending a request to the server the client ends up leaking an open file handle. This wastes server resources and can potentially block applications that try to remove the file or any directory containing this file. Fix this by putting the close command into a worker queue, so another thread retries it later. Cc: Stable <stable@vger.kernel.org> Tested-by: Frank Sorenson <sorenson@redhat.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25CIFS: Respect O_SYNC and O_DIRECT flags during reconnectPavel Shilovsky1-0/+7
Currently the client translates O_SYNC and O_DIRECT flags into corresponding SMB create options when openning a file. The problem is that on reconnect when the file is being re-opened the client doesn't set those flags and it causes a server to reject re-open requests because create options don't match. The latter means that any subsequent system call against that open file fail until a share is re-mounted. Fix this by properly setting SMB create options when re-openning files after reconnects. Fixes: 1013e760d10e6: ("SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags") Cc: Stable <stable@vger.kernel.org> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25smb3: remove confusing dmesg when mounting with encryption ("seal")Steve French1-8/+2
The smb2/smb3 message checking code was logging to dmesg when mounting with encryption ("seal") for compounded SMB3 requests. When encrypted the whole frame (including potentially multiple compounds) is read so the length field is longer than in the case of non-encrypted case (where length field will match the the calculated length for the particular SMB3 request in the compound being validated). Avoids the warning on mount (with "seal"): "srv rsp padded more than expected. Length 384 not ..." Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: close the shared root handle on tree disconnectRonnie Sahlberg1-0/+2
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25CIFS: Return directly after a failed build_path_from_dentry() in cifs_do_create()Markus Elfring1-4/+2
Return directly after a call of the function "build_path_from_dentry" failed at the beginning. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25CIFS: Use common error handling code in smb2_ioctl_query_info()Markus Elfring1-22/+23
Move the same error code assignments so that such exception handling can be better reused at the end of this function. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25CIFS: Use memdup_user() rather than duplicating its implementationMarkus Elfring1-9/+4
Reuse existing functionality from memdup_user() instead of keeping duplicate source code. Generated by: scripts/coccinelle/api/memdup_user.cocci Fixes: f5b05d622a3e99e6a97a189fe500414be802a05c ("cifs: add IOCTL for QUERY_INFO passthrough to userspace") Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: smbd: Return -ECONNABORTED when trasnport is not in connected stateLong Li1-1/+1
The transport should return this error so the upper layer will reconnect. Signed-off-by: Long Li <longli@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: smbd: Add messages on RDMA session destroy and reconnectionLong Li1-2/+4
Log these activities to help production support. Signed-off-by: Long Li <longli@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: smbd: Return -EINVAL when the number of iovs exceeds SMBDIRECT_MAX_SGELong Li1-1/+1
While it's not friendly to fail user processes that issue more iovs than we support, at least we should return the correct error code so the user process gets a chance to retry with smaller number of iovs. Signed-off-by: Long Li <longli@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: smbd: Invalidate and deregister memory registration on re-send for direct I/OLong Li1-2/+18
On re-send, there might be a reconnect and all prevoius memory registrations need to be invalidated and deregistered. Signed-off-by: Long Li <longli@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: Don't display RDMA transport on reconnectLong Li1-0/+5
On reconnect, the transport data structure is NULL and its information is not available. Signed-off-by: Long Li <longli@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25CIFS: remove set but not used variables 'cinode' and 'netfid'YueHaibing1-4/+0
Fixes gcc '-Wunused-but-set-variable' warning: fs/cifs/file.c: In function 'cifs_flock': fs/cifs/file.c:1704:8: warning: variable 'netfid' set but not used [-Wunused-but-set-variable] fs/cifs/file.c:1702:24: warning: variable 'cinode' set but not used [-Wunused-but-set-variable] Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: add support for flockSteve French3-1/+55
The flock system call locks the whole file rather than a byte range and so is currently emulated by various other file systems by simply sending a byte range lock for the whole file. Add flock handling for cifs.ko in similar way. xfstest generic/504 passes with this as well Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-11-25cifs: remove unused variable 'sid_user'YueHaibing1-2/+0
fs/cifs/cifsacl.c:43:30: warning: sid_user defined but not used [-Wunused-const-variable=] It is never used, so remove it. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: rename a variable in SendReceive()Dan Carpenter1-1/+1
Smatch gets confused because we sometimes refer to "server->srv_mutex" and sometimes to "sess->server->srv_mutex". They refer to the same lock so let's just make this consistent. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-24Linux 5.4Linus Torvalds1-1/+1
2019-11-23cramfs: fix usage on non-MTD deviceMaxime Bizon1-2/+2
When both CONFIG_CRAMFS_MTD and CONFIG_CRAMFS_BLOCKDEV are enabled, if we fail to mount on MTD, we don't try on block device. Note: this relies upon cramfs_mtd_fill_super() leaving no side effects on fc state in case of failure; in general, failing get_tree_...() does *not* mean "fine to try again"; e.g. parsed options might've been consumed by fill_super callback and freed on failure. Fixes: 74f78fc5ef43 ("vfs: Convert cramfs to use the new mount API") Signed-off-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-11-22Revert "Input: synaptics - enable RMI mode for X1 Extreme 2nd Generation"Lyude Paul1-1/+0
This reverts commit 68b9c5066e39af41d3448abfc887c77ce22dd64d. Ugh, I really dropped the ball on this one :\. So as it turns out RMI4 works perfectly fine on the X1 Extreme Gen 2 except for one thing I didn't notice because I usually use the trackpoint: clicking with the touchpad. Somehow this is broken, in fact we don't even seem to indicate BTN_LEFT as a valid event type for the RMI4 touchpad. And, I don't even see any RMI4 events coming from the touchpad when I press down on it. This only seems to work for PS/2 mode. Since that means we have a regression, and PS/2 mode seems to work fine for the time being - revert this for now. We'll have to do a more thorough investigation on this. Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://lore.kernel.org/r/20191119234534.10725-1-lyude@redhat.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2019-11-22afs: Fix large file supportMarc Dionne1-0/+1
By default s_maxbytes is set to MAX_NON_LFS, which limits the usable file size to 2GB, enforced by the vfs. Commit b9b1f8d5930a ("AFS: write support fixes") added support for the 64-bit fetch and store server operations, but did not change this value. As a result, attempts to write past the 2G mark result in EFBIG errors: $ dd if=/dev/zero of=foo bs=1M count=1 seek=2048 dd: error writing 'foo': File too large Set s_maxbytes to MAX_LFS_FILESIZE. Fixes: b9b1f8d5930a ("AFS: write support fixes") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-22afs: Fix possible assert with callbacks from yfs serversMarc Dionne1-1/+0
Servers sending callback breaks to the YFS_CM_SERVICE service may send up to YFSCBMAX (1024) fids in a single RPC. Anything over AFSCBMAX (50) will cause the assert in afs_break_callbacks to trigger. Remove the assert, as the count has already been checked against the appropriate max values in afs_deliver_cb_callback and afs_deliver_yfs_cb_callback. Fixes: 35dbfba3111a ("afs: Implement the YFS cache manager service") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-22r8152: avoid to call napi_disable twiceHayes Wang1-8/+20
Call napi_disable() twice would cause dead lock. There are three situations may result in the issue. 1. rtl8152_pre_reset() and set_carrier() are run at the same time. 2. Call rtl8152_set_tunable() after rtl8152_close(). 3. Call rtl8152_set_ringparam() after rtl8152_close(). For #1, use the same solution as commit 84811412464d ("r8152: Re-order napi_disable in rtl8152_close"). For #2 and #3, add checking the flag of IFF_UP and using napi_disable/napi_enable during mutex. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-22MAINTAINERS: Add myself as maintainer of virtio-vsockStefano Garzarella1-0/+1
Since I'm actively working on vsock and virtio/vhost transports, Stefan suggested to help him to maintain it. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-22udp: drop skb extensions before marking skb statelessFlorian Westphal2-5/+28
Once udp stack has set the UDP_SKB_IS_STATELESS flag, later skb free assumes all skb head state has been dropped already. This will leak the extension memory in case the skb has extensions other than the ipsec secpath, e.g. bridge nf data. To fix this, set the UDP_SKB_IS_STATELESS flag only if we don't have extensions or if the extension space can be free'd. Fixes: 895b5c9f206eb7d25dc1360a ("netfilter: drop bridge nf reset from nf_reset") Cc: Paolo Abeni <pabeni@redhat.com> Reported-by: Byron Stanoszek <gandalf@winds.org> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-22net: rtnetlink: prevent underflows in do_setvfinfo()Dan Carpenter1-1/+22
The "ivm->vf" variable is a u32, but the problem is that a number of drivers cast it to an int and then forget to check for negatives. An example of this is in the cxgb4 driver. drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c 2890 static int cxgb4_mgmt_get_vf_config(struct net_device *dev, 2891 int vf, struct ifla_vf_info *ivi) ^^^^^^ 2892 { 2893 struct port_info *pi = netdev_priv(dev); 2894 struct adapter *adap = pi->adapter; 2895 struct vf_info *vfinfo; 2896 2897 if (vf >= adap->num_vfs) ^^^^^^^^^^^^^^^^^^^ 2898 return -EINVAL; 2899 vfinfo = &adap->vfinfo[vf]; ^^^^^^^^^^^^^^^^^^^^^^^^^^ There are 48 functions affected. drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:8435 hclge_set_vf_vlan_filter() warn: can 'vfid' underflow 's32min-2147483646' drivers/net/ethernet/freescale/enetc/enetc_pf.c:377 enetc_pf_set_vf_mac() warn: can 'vf' underflow 's32min-2147483646' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:2899 cxgb4_mgmt_get_vf_config() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:2960 cxgb4_mgmt_set_vf_rate() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3019 cxgb4_mgmt_set_vf_rate() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3038 cxgb4_mgmt_set_vf_vlan() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3086 cxgb4_mgmt_set_vf_link_state() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb/cxgb2.c:791 get_eeprom() warn: can 'i' underflow 's32min-(-4),0,4-s32max' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:82 bnxt_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:164 bnxt_set_vf_trust() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:186 bnxt_get_vf_config() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:228 bnxt_set_vf_mac() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:264 bnxt_set_vf_vlan() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:293 bnxt_set_vf_bw() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:333 bnxt_set_vf_link_state() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:2595 bnx2x_vf_op_prep() warn: can 'vfidx' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:2595 bnx2x_vf_op_prep() warn: can 'vfidx' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2281 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2285 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2286 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2292 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2297 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1832 qlcnic_sriov_set_vf_mac() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1864 qlcnic_sriov_set_vf_tx_rate() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1937 qlcnic_sriov_set_vf_vlan() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:2005 qlcnic_sriov_get_vf_config() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:2036 qlcnic_sriov_set_vf_spoofchk() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/emulex/benet/be_main.c:1914 be_get_vf_config() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:1915 be_get_vf_config() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:1922 be_set_vf_tvt() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:1951 be_clear_vf_tvt() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:2063 be_set_vf_tx_rate() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:2091 be_set_vf_link_state() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:2609 ice_set_vf_port_vlan() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3050 ice_get_vf_cfg() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3103 ice_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3181 ice_set_vf_mac() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3237 ice_set_vf_trust() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3286 ice_set_vf_link_state() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:3919 i40e_validate_vf() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:3957 i40e_ndo_set_vf_mac() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4104 i40e_ndo_set_vf_port_vlan() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4263 i40e_ndo_set_vf_bw() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4309 i40e_ndo_get_vf_config() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4371 i40e_ndo_set_vf_link_state() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4441 i40e_ndo_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4441 i40e_ndo_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4504 i40e_ndo_set_vf_trust() warn: can 'vf_id' underflow 's32min-2147483646' Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-22mm/ksm.c: don't WARN if page is still mapped in remove_stable_node()Andrey Ryabinin1-7/+7
It's possible to hit the WARN_ON_ONCE(page_mapped(page)) in remove_stable_node() when it races with __mmput() and squeezes in between ksm_exit() and exit_mmap(). WARNING: CPU: 0 PID: 3295 at mm/ksm.c:888 remove_stable_node+0x10c/0x150 Call Trace: remove_all_stable_nodes+0x12b/0x330 run_store+0x4ef/0x7b0 kernfs_fop_write+0x200/0x420 vfs_write+0x154/0x450 ksys_write+0xf9/0x1d0 do_syscall_64+0x99/0x510 entry_SYSCALL_64_after_hwframe+0x49/0xbe Remove the warning as there is nothing scary going on. Link: http://lkml.kernel.org/r/20191119131850.5675-1-aryabinin@virtuozzo.com Fixes: cbf86cfe04a6 ("ksm: remove old stable nodes more thoroughly") Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-22mm/memory_hotplug: don't access uninitialized memmaps in shrink_zone_span()David Hildenbrand1-3/+13
Let's limit shrinking to !ZONE_DEVICE so we can fix the current code. We should never try to touch the memmap of offline sections where we could have uninitialized memmaps and could trigger BUGs when calling page_to_nid() on poisoned pages. There is no reliable way to distinguish an uninitialized memmap from an initialized memmap that belongs to ZONE_DEVICE, as we don't have anything like SECTION_IS_ONLINE we can use similar to pfn_to_online_section() for !ZONE_DEVICE memory. E.g., set_zone_contiguous() similarly relies on pfn_to_online_section() and will therefore never set a ZONE_DEVICE zone consecutive. Stopping to shrink the ZONE_DEVICE therefore results in no observable changes, besides /proc/zoneinfo indicating different boundaries - something we can totally live with. Before commit d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug"), the memmap was initialized with 0 and the node with the right value. So the zone might be wrong but not garbage. After that commit, both the zone and the node will be garbage when touching uninitialized memmaps. Toshiki reported a BUG (race between delayed initialization of ZONE_DEVICE memmaps without holding the memory hotplug lock and concurrent zone shrinking). https://lkml.org/lkml/2019/11/14/1040 "Iteration of create and destroy namespace causes the panic as below: kernel BUG at mm/page_alloc.c:535! CPU: 7 PID: 2766 Comm: ndctl Not tainted 5.4.0-rc4 #6 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:set_pfnblock_flags_mask+0x95/0xf0 Call Trace: memmap_init_zone_device+0x165/0x17c memremap_pages+0x4c1/0x540 devm_memremap_pages+0x1d/0x60 pmem_attach_disk+0x16b/0x600 [nd_pmem] nvdimm_bus_probe+0x69/0x1c0 really_probe+0x1c2/0x3e0 driver_probe_device+0xb4/0x100 device_driver_attach+0x4f/0x60 bind_store+0xc9/0x110 kernfs_fop_write+0x116/0x190 vfs_write+0xa5/0x1a0 ksys_write+0x59/0xd0 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 While creating a namespace and initializing memmap, if you destroy the namespace and shrink the zone, it will initialize the memmap outside the zone and trigger VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page) in set_pfnblock_flags_mask()." This BUG is also mitigated by this commit, where we for now stop to shrink the ZONE_DEVICE zone until we can do it in a safe and clean way. Link: http://lkml.kernel.org/r/20191006085646.5768-5-david@redhat.com Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e86b319] Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reported-by: Toshiki Fukasawa <t-fukasawa@vx.jp.nec.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: David Hildenbrand <david@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Damian Tometzki <damian.tometzki@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Halil Pasic <pasic@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jun Yao <yaojun8558363@gmail.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Pankaj Gupta <pagupta@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qian Cai <cai@lca.pw> Cc: Rich Felker <dalias@libc.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Steve Capper <steve.capper@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> [4.13+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>