aboutsummaryrefslogtreecommitdiffstats
path: root/fs/cifs/cifsglob.h (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-12-23cifs: Optimize readdir on reparse pointsPaulo Alcantara (SUSE)1-0/+1
When listing a directory with thounsands of files and most of them are reparse points, we simply marked all those dentries for revalidation and then sending additional (compounded) create/getinfo/close requests for each of them. Instead, upon receiving a response from an SMB2_QUERY_DIRECTORY (FileIdFullDirectoryInformation) command, the directory entries that have a file attribute of FILE_ATTRIBUTE_REPARSE_POINT will contain an EaSize field with a reparse tag in it, so we parse it and mark the dentry for revalidation only if it is a DFS or a symlink. Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-12-13CIFS: Close cached root handle only if it has a leasePavel Shilovsky1-1/+1
SMB2_tdis() checks if a root handle is valid in order to decide whether it needs to close the handle or not. However if another thread has reference for the handle, it may end up with putting the reference twice. The extra reference that we want to put during the tree disconnect is the reference that has a directory lease. So, track the fact that we have a directory lease and close the handle only in that case. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-12-04cifs: Fix lookup of SMB connections on multichannelPaulo Alcantara (SUSE)1-0/+1
With the addition of SMB session channels, we introduced new TCP server pointers that have no sessions or tcons associated with them. In this case, when we started looking for TCP connections, we might end up picking session channel rather than the master connection, hence failing to get either a session or a tcon. In order to fix that, this patch introduces a new "is_channel" field to TCP_Server_Info structure so we can skip session channels during lookup of connections. Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-12-03smb3: query attributes on file closeSteve French1-0/+3
Since timestamps on files on most servers can be updated at close, and since timestamps on our dentries default to one second we can have stale timestamps in some common cases (e.g. open, write, close, stat, wait one second, stat - will show different mtime for the first and second stat). The SMB2/SMB3 protocol allows querying timestamps at close so add the code to request timestamp and attr information (which is cheap for the server to provide) to be returned when a file is closed (it is not needed for the many paths that call SMB2_close that are from compounded query infos and close nor is it needed for some of the cases where a directory close immediately follows a directory open. Signed-off-by: Steve French <stfrench@microsoft.com> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-11-25smb3: dump in_send and num_waiters stats counters by defaultSteve French1-19/+3
Number of requests in_send and the number of waiters on sendRecv are useful counters in various cases, move them from CONFIG_CIFS_STATS2 to be on by default especially with multichannel Signed-off-by: Steve French <stfrench@microsoft.com> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-11-25CIFS: Properly process SMB3 lease breaksPavel Shilovsky1-3/+6
Currenly we doesn't assume that a server may break a lease from RWH to RW which causes us setting a wrong lease state on a file and thus mistakenly flushing data and byte-range locks and purging cached data on the client. This leads to performance degradation because subsequent IOs go directly to the server. Fix this by propagating new lease state and epoch values to the oplock break handler through cifsFileInfo structure and removing the use of cifsInodeInfo flags for that. It allows to avoid some races of several lease/oplock breaks using those flags in parallel. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: move cifsFileInfo_put logic into a work-queueRonnie Sahlberg1-1/+4
This patch moves the final part of the cifsFileInfo_put() logic where we need a write lock on lock_sem to be processed in a separate thread that holds no other locks. This is to prevent deadlocks like the one below: > there are 6 processes looping to while trying to down_write > cinode->lock_sem, 5 of them from _cifsFileInfo_put, and one from > cifs_new_fileinfo > > and there are 5 other processes which are blocked, several of them > waiting on either PG_writeback or PG_locked (which are both set), all > for the same page of the file > > 2 inode_lock() (inode->i_rwsem) for the file > 1 wait_on_page_writeback() for the page > 1 down_read(inode->i_rwsem) for the inode of the directory > 1 inode_lock()(inode->i_rwsem) for the inode of the directory > 1 __lock_page > > > so processes are blocked waiting on: > page flags PG_locked and PG_writeback for one specific page > inode->i_rwsem for the directory > inode->i_rwsem for the file > cifsInodeInflock_sem > > > > here are the more gory details (let me know if I need to provide > anything more/better): > > [0 00:48:22.765] [UN] PID: 8863 TASK: ffff8c691547c5c0 CPU: 3 > COMMAND: "reopen_file" > #0 [ffff9965007e3ba8] __schedule at ffffffff9b6e6095 > #1 [ffff9965007e3c38] schedule at ffffffff9b6e64df > #2 [ffff9965007e3c48] rwsem_down_write_slowpath at ffffffff9af283d7 > #3 [ffff9965007e3cb8] legitimize_path at ffffffff9b0f975d > #4 [ffff9965007e3d08] path_openat at ffffffff9b0fe55d > #5 [ffff9965007e3dd8] do_filp_open at ffffffff9b100a33 > #6 [ffff9965007e3ee0] do_sys_open at ffffffff9b0eb2d6 > #7 [ffff9965007e3f38] do_syscall_64 at ffffffff9ae04315 > * (I think legitimize_path is bogus) > > in path_openat > } else { > const char *s = path_init(nd, flags); > while (!(error = link_path_walk(s, nd)) && > (error = do_last(nd, file, op)) > 0) { <<<< > > do_last: > if (open_flag & O_CREAT) > inode_lock(dir->d_inode); <<<< > else > so it's trying to take inode->i_rwsem for the directory > > DENTRY INODE SUPERBLK TYPE PATH > ffff8c68bb8e79c0 ffff8c691158ef20 ffff8c6915bf9000 DIR /mnt/vm1_smb/ > inode.i_rwsem is ffff8c691158efc0 > > <struct rw_semaphore 0xffff8c691158efc0>: > owner: <struct task_struct 0xffff8c6914275d00> (UN - 8856 - > reopen_file), counter: 0x0000000000000003 > waitlist: 2 > 0xffff9965007e3c90 8863 reopen_file UN 0 1:29:22.926 > RWSEM_WAITING_FOR_WRITE > 0xffff996500393e00 9802 ls UN 0 1:17:26.700 > RWSEM_WAITING_FOR_READ > > > the owner of the inode.i_rwsem of the directory is: > > [0 00:00:00.109] [UN] PID: 8856 TASK: ffff8c6914275d00 CPU: 3 > COMMAND: "reopen_file" > #0 [ffff99650065b828] __schedule at ffffffff9b6e6095 > #1 [ffff99650065b8b8] schedule at ffffffff9b6e64df > #2 [ffff99650065b8c8] schedule_timeout at ffffffff9b6e9f89 > #3 [ffff99650065b940] msleep at ffffffff9af573a9 > #4 [ffff99650065b948] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs] > #5 [ffff99650065ba38] cifs_writepage_locked at ffffffffc0a0b8f3 [cifs] > #6 [ffff99650065bab0] cifs_launder_page at ffffffffc0a0bb72 [cifs] > #7 [ffff99650065bb30] invalidate_inode_pages2_range at ffffffff9b04d4bd > #8 [ffff99650065bcb8] cifs_invalidate_mapping at ffffffffc0a11339 [cifs] > #9 [ffff99650065bcd0] cifs_revalidate_mapping at ffffffffc0a1139a [cifs] > #10 [ffff99650065bcf0] cifs_d_revalidate at ffffffffc0a014f6 [cifs] > #11 [ffff99650065bd08] path_openat at ffffffff9b0fe7f7 > #12 [ffff99650065bdd8] do_filp_open at ffffffff9b100a33 > #13 [ffff99650065bee0] do_sys_open at ffffffff9b0eb2d6 > #14 [ffff99650065bf38] do_syscall_64 at ffffffff9ae04315 > > cifs_launder_page is for page 0xffffd1e2c07d2480 > > crash> page.index,mapping,flags 0xffffd1e2c07d2480 > index = 0x8 > mapping = 0xffff8c68f3cd0db0 > flags = 0xfffffc0008095 > > PAGE-FLAG BIT VALUE > PG_locked 0 0000001 > PG_uptodate 2 0000004 > PG_lru 4 0000010 > PG_waiters 7 0000080 > PG_writeback 15 0008000 > > > inode is ffff8c68f3cd0c40 > inode.i_rwsem is ffff8c68f3cd0ce0 > DENTRY INODE SUPERBLK TYPE PATH > ffff8c68a1f1b480 ffff8c68f3cd0c40 ffff8c6915bf9000 REG > /mnt/vm1_smb/testfile.8853 > > > this process holds the inode->i_rwsem for the parent directory, is > laundering a page attached to the inode of the file it's opening, and in > _cifsFileInfo_put is trying to down_write the cifsInodeInflock_sem > for the file itself. > > > <struct rw_semaphore 0xffff8c68f3cd0ce0>: > owner: <struct task_struct 0xffff8c6914272e80> (UN - 8854 - > reopen_file), counter: 0x0000000000000003 > waitlist: 1 > 0xffff9965005dfd80 8855 reopen_file UN 0 1:29:22.912 > RWSEM_WAITING_FOR_WRITE > > this is the inode.i_rwsem for the file > > the owner: > > [0 00:48:22.739] [UN] PID: 8854 TASK: ffff8c6914272e80 CPU: 2 > COMMAND: "reopen_file" > #0 [ffff99650054fb38] __schedule at ffffffff9b6e6095 > #1 [ffff99650054fbc8] schedule at ffffffff9b6e64df > #2 [ffff99650054fbd8] io_schedule at ffffffff9b6e68e2 > #3 [ffff99650054fbe8] __lock_page at ffffffff9b03c56f > #4 [ffff99650054fc80] pagecache_get_page at ffffffff9b03dcdf > #5 [ffff99650054fcc0] grab_cache_page_write_begin at ffffffff9b03ef4c > #6 [ffff99650054fcd0] cifs_write_begin at ffffffffc0a064ec [cifs] > #7 [ffff99650054fd30] generic_perform_write at ffffffff9b03bba4 > #8 [ffff99650054fda8] __generic_file_write_iter at ffffffff9b04060a > #9 [ffff99650054fdf0] cifs_strict_writev.cold.70 at ffffffffc0a4469b [cifs] > #10 [ffff99650054fe48] new_sync_write at ffffffff9b0ec1dd > #11 [ffff99650054fed0] vfs_write at ffffffff9b0eed35 > #12 [ffff99650054ff00] ksys_write at ffffffff9b0eefd9 > #13 [ffff99650054ff38] do_syscall_64 at ffffffff9ae04315 > > the process holds the inode->i_rwsem for the file to which it's writing, > and is trying to __lock_page for the same page as in the other processes > > > the other tasks: > [0 00:00:00.028] [UN] PID: 8859 TASK: ffff8c6915479740 CPU: 2 > COMMAND: "reopen_file" > #0 [ffff9965007b39d8] __schedule at ffffffff9b6e6095 > #1 [ffff9965007b3a68] schedule at ffffffff9b6e64df > #2 [ffff9965007b3a78] schedule_timeout at ffffffff9b6e9f89 > #3 [ffff9965007b3af0] msleep at ffffffff9af573a9 > #4 [ffff9965007b3af8] cifs_new_fileinfo.cold.61 at ffffffffc0a42a07 [cifs] > #5 [ffff9965007b3b78] cifs_open at ffffffffc0a0709d [cifs] > #6 [ffff9965007b3cd8] do_dentry_open at ffffffff9b0e9b7a > #7 [ffff9965007b3d08] path_openat at ffffffff9b0fe34f > #8 [ffff9965007b3dd8] do_filp_open at ffffffff9b100a33 > #9 [ffff9965007b3ee0] do_sys_open at ffffffff9b0eb2d6 > #10 [ffff9965007b3f38] do_syscall_64 at ffffffff9ae04315 > > this is opening the file, and is trying to down_write cinode->lock_sem > > > [0 00:00:00.041] [UN] PID: 8860 TASK: ffff8c691547ae80 CPU: 2 > COMMAND: "reopen_file" > [0 00:00:00.057] [UN] PID: 8861 TASK: ffff8c6915478000 CPU: 3 > COMMAND: "reopen_file" > [0 00:00:00.059] [UN] PID: 8858 TASK: ffff8c6914271740 CPU: 2 > COMMAND: "reopen_file" > [0 00:00:00.109] [UN] PID: 8862 TASK: ffff8c691547dd00 CPU: 6 > COMMAND: "reopen_file" > #0 [ffff9965007c3c78] __schedule at ffffffff9b6e6095 > #1 [ffff9965007c3d08] schedule at ffffffff9b6e64df > #2 [ffff9965007c3d18] schedule_timeout at ffffffff9b6e9f89 > #3 [ffff9965007c3d90] msleep at ffffffff9af573a9 > #4 [ffff9965007c3d98] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs] > #5 [ffff9965007c3e88] cifs_close at ffffffffc0a07aaf [cifs] > #6 [ffff9965007c3ea0] __fput at ffffffff9b0efa6e > #7 [ffff9965007c3ee8] task_work_run at ffffffff9aef1614 > #8 [ffff9965007c3f20] exit_to_usermode_loop at ffffffff9ae03d6f > #9 [ffff9965007c3f38] do_syscall_64 at ffffffff9ae0444c > > closing the file, and trying to down_write cifsi->lock_sem > > > [0 00:48:22.839] [UN] PID: 8857 TASK: ffff8c6914270000 CPU: 7 > COMMAND: "reopen_file" > #0 [ffff9965006a7cc8] __schedule at ffffffff9b6e6095 > #1 [ffff9965006a7d58] schedule at ffffffff9b6e64df > #2 [ffff9965006a7d68] io_schedule at ffffffff9b6e68e2 > #3 [ffff9965006a7d78] wait_on_page_bit at ffffffff9b03cac6 > #4 [ffff9965006a7e10] __filemap_fdatawait_range at ffffffff9b03b028 > #5 [ffff9965006a7ed8] filemap_write_and_wait at ffffffff9b040165 > #6 [ffff9965006a7ef0] cifs_flush at ffffffffc0a0c2fa [cifs] > #7 [ffff9965006a7f10] filp_close at ffffffff9b0e93f1 > #8 [ffff9965006a7f30] __x64_sys_close at ffffffff9b0e9a0e > #9 [ffff9965006a7f38] do_syscall_64 at ffffffff9ae04315 > > in __filemap_fdatawait_range > wait_on_page_writeback(page); > for the same page of the file > > > > [0 00:48:22.718] [UN] PID: 8855 TASK: ffff8c69142745c0 CPU: 7 > COMMAND: "reopen_file" > #0 [ffff9965005dfc98] __schedule at ffffffff9b6e6095 > #1 [ffff9965005dfd28] schedule at ffffffff9b6e64df > #2 [ffff9965005dfd38] rwsem_down_write_slowpath at ffffffff9af283d7 > #3 [ffff9965005dfdf0] cifs_strict_writev at ffffffffc0a0c40a [cifs] > #4 [ffff9965005dfe48] new_sync_write at ffffffff9b0ec1dd > #5 [ffff9965005dfed0] vfs_write at ffffffff9b0eed35 > #6 [ffff9965005dff00] ksys_write at ffffffff9b0eefd9 > #7 [ffff9965005dff38] do_syscall_64 at ffffffff9ae04315 > > inode_lock(inode); > > > and one 'ls' later on, to see whether the rest of the mount is available > (the test file is in the root, so we get blocked up on the directory > ->i_rwsem), so the entire mount is unavailable > > [0 00:36:26.473] [UN] PID: 9802 TASK: ffff8c691436ae80 CPU: 4 > COMMAND: "ls" > #0 [ffff996500393d28] __schedule at ffffffff9b6e6095 > #1 [ffff996500393db8] schedule at ffffffff9b6e64df > #2 [ffff996500393dc8] rwsem_down_read_slowpath at ffffffff9b6e9421 > #3 [ffff996500393e78] down_read_killable at ffffffff9b6e95e2 > #4 [ffff996500393e88] iterate_dir at ffffffff9b103c56 > #5 [ffff996500393ec8] ksys_getdents64 at ffffffff9b104b0c > #6 [ffff996500393f30] __x64_sys_getdents64 at ffffffff9b104bb6 > #7 [ffff996500393f38] do_syscall_64 at ffffffff9ae04315 > > in iterate_dir: > if (shared) > res = down_read_killable(&inode->i_rwsem); <<<< > else > res = down_write_killable(&inode->i_rwsem); > Reported-by: Frank Sorenson <sorenson@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: try opening channels after mountingAurelien Aptel1-0/+16
After doing mount() successfully we call cifs_try_adding_channels() which will open as many channels as it can. Channels are closed when the master session is closed. The master connection becomes the first channel. ,-------------> global cifs_tcp_ses_list <-------------------------. | | '- TCP_Server_Info <--> TCP_Server_Info <--> TCP_Server_Info <-' (master con) (chan#1 con) (chan#2 con) | ^ ^ ^ v '--------------------|--------------------' cifs_ses | - chan_count = 3 | - chans[] ---------------------' - smb3signingkey[] (master signing key) Note how channel connections don't have sessions. That's because cifs_ses can only be part of one linked list (list_head are internal to the elements). For signing keys, each channel has its own signing key which must be used only after the channel has been bound. While it's binding it must use the master session signing key. For encryption keys, since channel connections do not have sessions attached we must now find matching session by looping over all sessions in smb2_get_enc_key(). Each channel is opened like a regular server connection but at the session setup request step it must set the SMB2_SESSION_REQ_FLAG_BINDING flag and use the session id to bind to. Finally, while sending in compound_send_recv() for requests that aren't negprot, ses-setup or binding related, use a channel by cycling through the available ones (round-robin). Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25CIFS: refactor cifs_get_inode_info()Aurelien Aptel1-0/+6
Make logic of cifs_get_inode() much clearer by moving code to sub functions and adding comments. Document the steps this function does. cifs_get_inode_info() gets and updates a file inode metadata from its file path. * If caller already has raw info data from server they can pass it. * If inode already exists (just need to update) caller can pass it. Step 1: get raw data from server if none was passed Step 2: parse raw data into intermediate internal cifs_fattr struct Step 3: set fattr uniqueid which is later used for inode number. This can sometime be done from raw data Step 4: tweak fattr according to mount options (file_mode, acl to mode bits, uid, gid, etc) Step 5: update or create inode from final fattr struct * add is_smb1_server() helper * add is_inode_cache_good() helper * move SMB1-backupcreds-getinfo-retry to separate func cifs_backup_query_path_info(). * move set-uniqueid code to separate func cifs_set_fattr_ino() * don't clobber uniqueid from backup cred retry * fix some probable corner cases memleaks Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: switch servers depending on binding stateAurelien Aptel1-0/+10
Currently a lot of the code to initialize a connection & session uses the cifs_ses as input. But depending on if we are opening a new session or a new channel we need to use different server pointers. Add a "binding" flag in cifs_ses and a helper function that returns the server ptr a session should use (only in the sess establishment code path). Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: add server paramAurelien Aptel1-1/+2
As we get down to the transport layer, plenty of functions are passed the session pointer and assume the transport to use is ses->server. Instead we modify those functions to pass (ses, server) so that we can decouple the session from the server. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25cifs: add multichannel mount options and data structsAurelien Aptel1-0/+16
adds: - [no]multichannel to enable/disable multichannel - max_channels=N to control how many channels to create these options are then stored in the volume struct. - store channels and max_channels in cifs_ses Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25smb3: add debug messages for closing unmatched openRonnie Sahlberg1-0/+2
Helps distinguish between an interrupted close and a truly unmatched open. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25CIFS: Fix NULL pointer dereference in mid callbackPavel Shilovsky1-0/+1
There is a race between a system call processing thread and the demultiplex thread when mid->resp_buf becomes NULL and later is being accessed to get credits. It happens when the 1st thread wakes up before a mid callback is called in the 2nd one but the mid state has already been set to MID_RESPONSE_RECEIVED. This causes NULL pointer dereference in mid callback. Fix this by saving credits from the response before we update the mid state and then use this value in the mid callback rather then accessing a response buffer. Cc: Stable <stable@vger.kernel.org> Fixes: ee258d79159afed5 ("CIFS: Move credit processing to mid callbacks for SMB3") Tested-by: Frank Sorenson <sorenson@redhat.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-10-24cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occursDave Wysochanski1-0/+5
There's a deadlock that is possible and can easily be seen with a test where multiple readers open/read/close of the same file and a disruption occurs causing reconnect. The deadlock is due a reader thread inside cifs_strict_readv calling down_read and obtaining lock_sem, and then after reconnect inside cifs_reopen_file calling down_read a second time. If in between the two down_read calls, a down_write comes from another process, deadlock occurs. CPU0 CPU1 ---- ---- cifs_strict_readv() down_read(&cifsi->lock_sem); _cifsFileInfo_put OR cifs_new_fileinfo down_write(&cifsi->lock_sem); cifs_reopen_file() down_read(&cifsi->lock_sem); Fix the above by changing all down_write(lock_sem) calls to down_write_trylock(lock_sem)/msleep() loop, which in turn makes the second down_read call benign since it will never block behind the writer while holding lock_sem. Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Suggested-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed--by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-10-08smb3: remove noisy debug message and minor cleanupSteve French1-1/+1
Message was intended only for developer temporary build In addition cleanup two minor warnings noticed by Coverity and a trivial change to workaround a sparse warning Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-09-26smb3: pass mode bits into create callsSteve French1-2/+4
We need to populate an ACL (security descriptor open context) on file and directory correct. This patch passes in the mode. Followon patch will build the open context and the security descriptor (from the mode) that goes in the open context. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com>
2019-09-16cifs: Add support for root file systemsPaulo Alcantara (SUSE)1-0/+2
Introduce a new CONFIG_CIFS_ROOT option to handle root file systems over a SMB share. In order to mount the root file system during the init process, make cifs.ko perform non-blocking socket operations while mounting and accessing it. Cc: Steve French <smfrench@gmail.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Paulo Alcantara (SUSE) <paulo@paulo.ac> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-09-16smb3: allow disabling requesting leasesSteve French1-0/+2
In some cases to work around server bugs or performance problems it can be helpful to be able to disable requesting SMB2.1/SMB3 leases on a particular mount (not to all servers and all shares we are mounted to). Add new mount parm "nolease" which turns off requesting leases on directory or file opens. Currently the only way to disable leases is globally through a module load parameter. This is more granular. Suggested-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> CC: Stable <stable@vger.kernel.org>
2019-09-16smb3: display max smb3 requests in flight at any one timeSteve French1-0/+1
Displayed in /proc/fs/cifs/Stats once for each socket we are connected to. This allows us to find out what the maximum number of requests that had been in flight (at any one time). Note that /proc/fs/cifs/Stats can be reset if you want to look for maximum over a small period of time. Sample output (immediately after mount): Resources in use CIFS Session: 1 Share (unique mount targets): 2 SMB Request/Response Buffer: 1 Pool size: 5 SMB Small Req/Resp Buffer: 1 Pool size: 30 Operations (MIDs): 0 0 session 0 share reconnects Total vfs operations: 5 maximum at one time: 2 Max requests in flight: 2 1) \\localhost\scratch SMBs: 18 Bytes read: 0 Bytes written: 0 ... Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-09-16smb3: enable offload of decryption of large reads via mount optionSteve French1-0/+2
Disable offload of the decryption of encrypted read responses by default (equivalent to setting this new mount option "esize=0"). Allow setting the minimum encrypted read response size that we will choose to offload to a worker thread - it is now configurable via on a new mount option "esize=" Depending on which encryption mechanism (GCM vs. CCM) and the number of reads that will be issued in parallel and the performance of the network and CPU on the client, it may make sense to enable this since it can provide substantial benefit when multiple large reads are in flight at the same time. Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-09-16smb3: allow parallelizing decryption of readsSteve French1-0/+1
decrypting large reads on encrypted shares can be slow (e.g. adding multiple milliseconds per-read on non-GCM capable servers or when mounting with dialects prior to SMB3.1.1) - allow parallelizing of read decryption by launching worker threads. Testing to Samba on localhost showed 25% improvement. Testing to remote server showed very large improvement when doing more than one 'cp' command was called at one time. Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-09-16smb3: allow skipping signature verification for perf sensitive configurationsSteve French1-0/+2
Add new mount option "signloosely" which enables signing but skips the sometimes expensive signing checks in the responses (signatures are calculated and sent correctly in the SMB2/SMB3 requests even with this mount option but skipped in the responses). Although weaker for security (and also data integrity in case a packet were corrupted), this can provide enough of a performance benefit (calculating the signature to verify a packet can be expensive especially for large packets) to be useful in some cases. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-09-16smb3: add mount option to allow RW caching of share accessed by only 1 clientSteve French1-2/+3
If a share is known to be only to be accessed by one client, we can aggressively cache writes not just reads to it. Add "cache=" option (cache=singleclient) for mounting read write shares (that will not be read or written to from other clients while we have it mounted) in order to improve performance. Signed-off-by: Steve French <stfrench@microsoft.com>
2019-09-16smb3: add mount option to allow forced caching of read only shareSteve French1-2/+4
If a share is immutable (at least for the period that it will be mounted) it would be helpful to not have to revalidate dentries repeatedly that we know can not be changed remotely. Add "cache=" option (cache=ro) for mounting read only shares in order to improve performance in cases in which we know that the share will not be changing while it is in use. Signed-off-by: Steve French <stfrench@microsoft.com>
2019-07-07smb3: do not send compression info by defaultSteve French1-0/+1
Since in theory a server could respond with compressed read responses even if not requested on read request (assuming that a compression negcontext is sent in negotiate protocol) - do not send compression information during negotiate protocol unless the user asks for compression explicitly (compression is experimental), and add a mount warning that compression is experimental. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-07-07smb3: add new mount option to retrieve mode from special ACESteve French1-1/+2
There is a special ACE used by some servers to allow the mode bits to be stored. This can be especially helpful in scenarios in which the client is trusted, and access checking on the client vs the POSIX mode bits is sufficient. Add mount option to allow enabling this behavior. Follow on patch will add support for chmod and queryinfo (stat) by retrieving the POSIX mode bits from the special ACE, SID: S-1-5-88-3 See e.g. https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2008-R2-and-2008/hh509017(v=ws.10) Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-07-07cifs: simplify code by removing CONFIG_CIFS_ACL ifdefSteve French1-2/+0
SMB3 ACL support is needed for many use cases now and should not be ifdeffed out, even for SMB1 (CIFS). Remove the CONFIG_CIFS_ACL ifdef so ACL support is always built into cifs.ko Signed-off-by: Steve French <stfrench@microsoft.com>
2019-07-07cifs: Fix check for matching with existing mountSteve French1-0/+1
If we mount the same share twice, we check the flags to see if the second mount matches the earlier mount, but we left some flags out. Signed-off-by: Steve French <stfrench@microsoft.com>
2019-06-13cifs: add spinlock for the openFileList to cifsInodeInfoRonnie Sahlberg1-0/+5
We can not depend on the tcon->open_file_lock here since in multiuser mode we may have the same file/inode open via multiple different tcons. The current code is race prone and will crash if one user deletes a file at the same time a different user opens/create the file. To avoid this we need to have a spinlock attached to the inode and not the tcon. RHBZ: 1580165 CC: Stable <stable@vger.kernel.org> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-05-15cifs: add support for SEEK_DATA and SEEK_HOLERonnie Sahlberg1-0/+2
Add llseek op for SEEK_DATA and SEEK_HOLE. Improves xfstests/285,286,436,445,448 and 490 Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-07SMB3: Clean up query symlink when reparse pointRonnie Sahlberg1-1/+2
Two of the common symlink formats use reparse points (unlike mfsymlinks and also unlike the SMB1 posix extensions). This is the first part of the fixes to allow these reparse points (NFS style and Windows symlinks) to be resolved properly as symlinks by the client. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-07Negotiate and save preferred compression algorithmsSteve French1-0/+1
New negotiate context (3) allows the server and client to negotiate which compression algorithms to use. Add support for this and save it off in the server structure. Also now displayed in /proc/fs/cifs/DebugData (see below example to Windows 10) where compression algoirthm "LZ77" was negotiated: Servers: Number of credits: 326 Dialect 0x311 COMPRESS_LZ77 signed 1) Name: 192.168.92.17 Uses: 1 Capability: 0x300067 Session Status: 1 TCP status: 1 Instance: 1 See MS-XCA and MS-SMB2 2.2.3.1 for more details. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-05-07cifs: rename and clarify CIFS_ASYNC_OP and CIFS_NO_RESPRonnie Sahlberg1-2/+2
The flags were named confusingly. CIFS_ASYNC_OP now just means that we will not block waiting for credits to become available so we thus rename this to be CIFS_NON_BLOCKING. Change CIFS_NO_RESP to CIFS_NO_RSP_BUF to clarify that we will actually get a response from the server but we will not get/do not want a response buffer. Delete CIFSSMBNotify. This is an SMB1 function that is not used. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-05-07cifs: fix credits leak for SMB1 oplock breaksRonnie Sahlberg1-0/+1
For SMB1 oplock breaks we would grab one credit while sending the PDU but we would never relese the credit back since we will never receive a response to this from the server. Eventuallt this would lead to a hang once all credits are leaked. Fix this by defining a new flag CIFS_NO_SRV_RSP which indicates that there is no server response to this command and thus we need to add any credits back immediately after sending the PDU. CC: Stable <stable@vger.kernel.org> #v5.0+ Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-07cifs: add fiemap supportRonnie Sahlberg1-0/+3
Useful for improved copy performance as well as for applications which query allocated ranges of sparse files. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-07CIFS: check CIFS_MOUNT_NO_DFS when trying to reuse existing sbAurelien Aptel1-1/+10
if we mount A then mount A again with nodfs, we shouldn't reuse the superblock. document the purpose of the defines as well. there are most likely more flags that needs to be added to this mask, in fact the logic to find them should be which flag should be *ignored* when trying to reuse an existing sb. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-07SMB3: Track total time spent on roundtrips for each SMB3 commandSteve French1-0/+4
Also track minimum and maximum time by command in /proc/fs/cifs/Stats Signed-off-by: Steve French <stfrench@microsoft.com>
2019-04-16CIFS: keep FileInfo handle live during oplock breakAurelien Aptel1-0/+2
In the oplock break handler, writing pending changes from pages puts the FileInfo handle. If the refcount reaches zero it closes the handle and waits for any oplock break handler to return, thus causing a deadlock. To prevent this situation: * We add a wait flag to cifsFileInfo_put() to decide whether we should wait for running/pending oplock break handlers * We keep an additionnal reference of the SMB FileInfo handle so that for the rest of the handler putting the handle won't close it. - The ref is bumped everytime we queue the handler via the cifs_queue_oplock_break() helper. - The ref is decremented at the end of the handler This bug was triggered by xfstest 464. Also important fix to address the various reports of oops in smb2_push_mandatory_locks Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> CC: Stable <stable@vger.kernel.org>
2019-04-01SMB3: Allow persistent handle timeout to be configurable on mountSteve French1-0/+8
Reconnecting after server or network failure can be improved (to maintain availability and protect data integrity) by allowing the client to choose the default persistent (or resilient) handle timeout in some use cases. Today we default to 0 which lets the server pick the default timeout (usually 120 seconds) but this can be problematic for some workloads. Add the new mount parameter to cifs.ko for SMB3 mounts "handletimeout" which enables the user to override the default handle timeout for persistent (mount option "persistenthandles") or resilient handles (mount option "resilienthandles"). Maximum allowed is 16 minutes (960000 ms). Units for the timeout are expressed in milliseconds. See section 2.2.14.2.12 and 2.2.31.3 of the MS-SMB2 protocol specification for more information. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> CC: Stable <stable@vger.kernel.org>
2019-03-14CIFS: make mknod() an smb_version_opAurelien Aptel1-0/+8
This cleanup removes cifs specific code from SMB2/SMB3 code paths which is cleaner and easier to maintain as the code to handle special files is improved. Below is an example creating special files using 'sfu' mount option over SMB3 to Windows (with this patch) (Note that to Samba server, support for saving dos attributes has to be enabled for the SFU mount option to work). In the future this will also make implementation of creating special files as reparse points easier (as Windows NFS server does for example). root@smf-Thinkpad-P51:~# stat -c "%F" /mnt2/char character special file root@smf-Thinkpad-P51:~# stat -c "%F" /mnt2/block block special file Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-03-14cifs: minor documentation updatesSteve French1-0/+1
Also updated a comment describing use of the GlobalMid_Lock Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-14cifs: cache FILE_ALL_INFO for the shared root handleRonnie Sahlberg1-0/+3
When we open the shared root handle also ask for FILE_ALL_INFORMATION since we can do this at zero cost as part of a compound. Cache this information as long as the lease is held and return and serve any future requests from cache. This allows us to serve "stat /<mountpoint>" directly from cache and avoid a network roundtrip. Since clients often want to do this quite a lot this improve performance slightly. As an example: xfstest generic/533 performs 43 stat operations on the root of the share while it is run. Which are eliminated with this patch. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-03-14cifs: wait_for_free_credits() make it possible to wait for >=1 creditsRonnie Sahlberg1-2/+2
Change wait_for_free_credits() to allow waiting for >=1 credits instead of just a single credit. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-03-05CIFS: Adjust MTU credits before reopening a filePavel Shilovsky1-0/+12
Currently we adjust MTU credits before sending an IO request and after reopening a file. This approach doesn't allow the reopen routine to use existing credits that are not needed for IO. Reorder credit adjustment and reopening a file to use credits available to the client more efficiently. Also unwrap complex if statement into few pieces to improve readability. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Respect reconnect in non-MTU credits calculationsPavel Shilovsky1-4/+2
Every time after a session reconnect we don't need to account for credits obtained in previous sessions. Make use of the recently added cifs_credits structure to properly calculate credits for non-MTU requests the same way we did for MTU ones. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-05CIFS: Respect reconnect in MTU credits calculationsPavel Shilovsky1-10/+19
Every time after a session reconnect we don't need to account for credits obtained in previous sessions. Introduce new struct cifs_credits which contains both credits value and reconnect instance of the time those credits were taken. Modify a routine that add credits back to handle the reconnect instance by assuming zero credits if the reconnect happened after the credits were obtained and before we decided to add them back due to some errors during sending. This patch fixes the MTU credits cases. The subsequent patch will handle non-MTU ones. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-04CIFS: Count SMB3 credits for malformed pending responsesPavel Shilovsky1-2/+2
Even if a response is malformed, we should count credits granted by the server to avoid miscalculations and unnecessary reconnects due to client or server bugs. If the response has been received partially, the session will be reconnected anyway on the next iteration of the demultiplex thread, so counting credits for such cases shouldn't break things. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-04CIFS: Do not skip SMB2 message IDs on send failuresPavel Shilovsky1-0/+19
When we hit failures during constructing MIDs or sending PDUs through the network, we end up not using message IDs assigned to the packet. The next SMB packet will skip those message IDs and continue with the next one. This behavior may lead to a server not granting us credits until we use the skipped IDs. Fix this by reverting the current ID to the original value if any errors occur before we push the packet through the network stack. This patch fixes the generic/310 test from the xfs-tests. Cc: <stable@vger.kernel.org> # 4.19.x Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-04smb3: make default i/o size for smb3 mounts largerSteve French1-0/+1
We negotiate rsize mounts (and it can be overridden by user) to typically 4MB, so using larger default I/O sizes from userspace (changing to 1MB default i/o size returned by stat) the performance is much better (and not just for long latency network connections) in most use cases for SMB3 than the default I/O size (which ends up being 128K for cp and can be even smaller for cp). This can be 4x slower or worse depending on network latency. By changing inode->blocksize from 32K (which was perhaps ok for very old SMB1/CIFS) to a larger value, 1MB (but still less than max size negotiated with the server which is 4MB, in order to minimize risk) it significantly increases performance for the noncached case, and slightly increases it for the cached case. This can be changed by the user on mount (specifying bsize= values from 16K to 16MB) to tune better for performance for applications that depend on blocksize. Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> CC: Stable <stable@vger.kernel.org>