aboutsummaryrefslogtreecommitdiffstats
path: root/fs/nfs (follow)
AgeCommit message (Collapse)AuthorFilesLines
2013-12-05Merge tag 'nfs-for-3.13-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds7-9/+51
Pull NFS client bugfixes from Trond Myklebust: - Stable fix for a NFSv4.1 delegation and state recovery deadlock - Stable fix for a loop on irrecoverable errors when returning delegations - Fix a 3-way deadlock between layoutreturn, open, and state recovery - Update the MAINTAINERS file with contact information for Trond Myklebust - Close needs to handle NFS4ERR_ADMIN_REVOKED - Enabling v4.2 should not recompile nfsd and lockd - Fix a couple of compile warnings * tag 'nfs-for-3.13-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: nfs: fix do_div() warning by instead using sector_div() MAINTAINERS: Update contact information for Trond Myklebust NFSv4.1: Prevent a 3-way deadlock between layoutreturn, open and state recovery SUNRPC: do not fail gss proc NULL calls with EACCES NFSv4: close needs to handle NFS4ERR_ADMIN_REVOKED NFSv4: Update list of irrecoverable errors on DELEGRETURN NFSv4 wait on recovery for async session errors NFS: Fix a warning in nfs_setsecurity NFS: Enabling v4.2 should not recompile nfsd and lockd
2013-12-04nfs: fix do_div() warning by instead using sector_div()Helge Deller1-1/+1
When compiling a 32bit kernel with CONFIG_LBDAF=n the compiler complains like shown below. Fix this warning by instead using sector_div() which is provided by the kernel.h header file. fs/nfs/blocklayout/extents.c: In function ‘normalize’: include/asm-generic/div64.h:43:28: warning: comparison of distinct pointer types lacks a cast [enabled by default] fs/nfs/blocklayout/extents.c:47:13: note: in expansion of macro ‘do_div’ nfs/blocklayout/extents.c:47:2: warning: right shift count >= width of type [enabled by default] fs/nfs/blocklayout/extents.c:47:2: warning: passing argument 1 of ‘__div64_32’ from incompatible pointer type [enabled by default] include/asm-generic/div64.h:35:17: note: expected ‘uint64_t *’ but argument is of type ‘sector_t *’ extern uint32_t __div64_32(uint64_t *dividend, uint32_t divisor); Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-12-04NFSv4.1: Prevent a 3-way deadlock between layoutreturn, open and state recoveryTrond Myklebust1-1/+8
Andy Adamson reports: The state manager is recovering expired state and recovery OPENs are being processed. If kswapd is pruning inodes at the same time, a deadlock can occur when kswapd calls evict_inode on an NFSv4.1 inode with a layout, and the resultant layoutreturn gets an error that the state mangager is to handle, causing the layoutreturn to wait on the (NFS client) cl_rpcwaitq. At the same time an open is waiting for the inode deletion to complete in __wait_on_freeing_inode. If the open is either the open called by the state manager, or an open from the same open owner that is holding the NFSv4 sequence id which causes the OPEN from the state manager to wait for the sequence id on the Seqid_waitqueue, then the state is deadlocked with kswapd. The fix is simply to have layoutreturn ignore all errors except NFS4ERR_DELAY. We already know that layouts are dropped on all server reboots, and that it has to be coded to deal with the "forgetful client model" that doesn't send layoutreturns. Reported-by: Andy Adamson <andros@netapp.com> Link: http://lkml.kernel.org/r/1385402270-14284-1-git-send-email-andros@netapp.com Signed-off-by: Trond Myklebust <Trond.Myklebust@primarydata.com>
2013-11-20NFSv4: close needs to handle NFS4ERR_ADMIN_REVOKEDTrond Myklebust1-3/+6
Also ensure that we zero out the stateid mode when exiting Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-20NFSv4: Update list of irrecoverable errors on DELEGRETURNTrond Myklebust1-2/+8
If the DELEGRETURN errors out with something like NFS4ERR_BAD_STATEID then there is no recovery possible. Just quit without returning an error. Also, note that the client must not assume that the NFSv4 lease has been renewed when it sees an error on DELEGRETURN. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
2013-11-20NFSv4 wait on recovery for async session errorsAndy Adamson1-1/+1
When the state manager is processing the NFS4CLNT_DELEGRETURN flag, session draining is off, but DELEGRETURN can still get a session error. The async handler calls nfs4_schedule_session_recovery returns -EAGAIN, and the DELEGRETURN done then restarts the RPC task in the prepare state. With the state manager still processing the NFS4CLNT_DELEGRETURN flag with session draining off, these DELEGRETURNs will cycle with errors filling up the session slots. This prevents OPEN reclaims (from nfs_delegation_claim_opens) required by the NFS4CLNT_DELEGRETURN state manager processing from completing, hanging the state manager in the __rpc_wait_for_completion_task in nfs4_run_open_task as seen in this kernel thread dump: kernel: 4.12.32.53-ma D 0000000000000000 0 3393 2 0x00000000 kernel: ffff88013995fb60 0000000000000046 ffff880138cc5400 ffff88013a9df140 kernel: ffff8800000265c0 ffffffff8116eef0 ffff88013fc10080 0000000300000001 kernel: ffff88013a4ad058 ffff88013995ffd8 000000000000fbc8 ffff88013a4ad058 kernel: Call Trace: kernel: [<ffffffff8116eef0>] ? cache_alloc_refill+0x1c0/0x240 kernel: [<ffffffffa0358110>] ? rpc_wait_bit_killable+0x0/0xa0 [sunrpc] kernel: [<ffffffffa0358152>] rpc_wait_bit_killable+0x42/0xa0 [sunrpc] kernel: [<ffffffff8152914f>] __wait_on_bit+0x5f/0x90 kernel: [<ffffffffa0358110>] ? rpc_wait_bit_killable+0x0/0xa0 [sunrpc] kernel: [<ffffffff815291f8>] out_of_line_wait_on_bit+0x78/0x90 kernel: [<ffffffff8109b520>] ? wake_bit_function+0x0/0x50 kernel: [<ffffffffa035810d>] __rpc_wait_for_completion_task+0x2d/0x30 [sunrpc] kernel: [<ffffffffa040d44c>] nfs4_run_open_task+0x11c/0x160 [nfs] kernel: [<ffffffffa04114e7>] nfs4_open_recover_helper+0x87/0x120 [nfs] kernel: [<ffffffffa0411646>] nfs4_open_recover+0xc6/0x150 [nfs] kernel: [<ffffffffa040cc6f>] ? nfs4_open_recoverdata_alloc+0x2f/0x60 [nfs] kernel: [<ffffffffa0414e1a>] nfs4_open_delegation_recall+0x6a/0xa0 [nfs] kernel: [<ffffffffa0424020>] nfs_end_delegation_return+0x120/0x2e0 [nfs] kernel: [<ffffffff8109580f>] ? queue_work+0x1f/0x30 kernel: [<ffffffffa0424347>] nfs_client_return_marked_delegations+0xd7/0x110 [nfs] kernel: [<ffffffffa04225d8>] nfs4_run_state_manager+0x548/0x620 [nfs] kernel: [<ffffffffa0422090>] ? nfs4_run_state_manager+0x0/0x620 [nfs] kernel: [<ffffffff8109b0f6>] kthread+0x96/0xa0 kernel: [<ffffffff8100c20a>] child_rip+0xa/0x20 kernel: [<ffffffff8109b060>] ? kthread+0x0/0xa0 kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20 The state manager can not therefore process the DELEGRETURN session errors. Change the async handler to wait for recovery on session errors. Signed-off-by: Andy Adamson <andros@netapp.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-19NFS: Fix a warning in nfs_setsecurityTrond Myklebust1-1/+1
Fix the following warning: linux-nfs/fs/nfs/inode.c:315:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-19NFS: Enabling v4.2 should not recompile nfsd and lockdAnna Schumaker4-0/+26
When CONFIG_NFS_V4_2 is toggled nfsd and lockd will be recompiled, instead of only the nfs client. This patch moves a small amount of code into the client directory to avoid unnecessary recompiles. Signed-off-by: Anna Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-16Merge tag 'nfs-for-3.13-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds3-5/+10
Pull NFS client bugfixes: - Stable fix for data corruption when retransmitting O_DIRECT writes - Stable fix for a deep recursion/stack overflow bug in rpc_release_client - Stable fix for infinite looping when mounting a NFSv4.x volume - Fix a typo in the nfs mount option parser - Allow pNFS layouts to be compiled into the kernel when NFSv4.1 is * tag 'nfs-for-3.13-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: nfs: fix pnfs Kconfig defaults NFS: correctly report misuse of "migration" mount option. nfs: don't retry detect_trunking with RPC_AUTH_UNIX more than once SUNRPC: Avoid deep recursion in rpc_release_client SUNRPC: Fix a data corruption issue when retransmitting RPC calls
2013-11-15nfs: fix pnfs Kconfig defaultsChristoph Hellwig1-3/+3
Defaulting to m seem to prevent building the pnfs layout modules into the kernel. Default to the value of CONFIG_NFS_V4 make sure they are built in for built-in NFSv4 support and modular for a modular NFSv4. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-15NFS: correctly report misuse of "migration" mount option.NeilBrown1-1/+1
The current test on valid use of the "migration" mount option can never report an error as it will only do so if mnt->version !=4 && mnt->minor_version != 0 (and some other condition), but if that test would succeed, then the previous test has already gone-to out_minorversion_mismatch. So change the && to an || to get correct semantics. Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-15tree-wide: use reinit_completion instead of INIT_COMPLETIONWolfram Sang1-1/+1
Use this new function to make code more comprehensible, since we are reinitialzing the completion, not initializing. [akpm@linux-foundation.org: linux-next resyncs] Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Acked-by: Linus Walleij <linus.walleij@linaro.org> (personally at LCE13) Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13nfs: don't retry detect_trunking with RPC_AUTH_UNIX more than onceJeff Layton1-1/+6
Currently, when we try to mount and get back NFS4ERR_CLID_IN_USE or NFS4ERR_WRONGSEC, we create a new rpc_clnt and then try the call again. There is no guarantee that doing so will work however, so we can end up retrying the call in an infinite loop. Worse yet, we create the new client using rpc_clone_client_set_auth, which creates the new client as a child of the old one. Thus, we can end up with a *very* long lineage of rpc_clnts. When we go to put all of the references to them, we can end up with a long call chain that can smash the stack as each rpc_free_client() call can recurse back into itself. This patch fixes this by simply ensuring that the SETCLIENTID call will only be retried in this situation if the last attempt did not use RPC_AUTH_UNIX. Note too that with this change, we don't need the (i > 2) check in the -EACCES case since we now have a more reliable test as to whether we should reattempt. Cc: stable@vger.kernel.org # v3.10+ Cc: Chuck Lever <chuck.lever@oracle.com> Tested-by/Acked-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-13Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds11-186/+119
Pull vfs updates from Al Viro: "All kinds of stuff this time around; some more notable parts: - RCU'd vfsmounts handling - new primitives for coredump handling - files_lock is gone - Bruce's delegations handling series - exportfs fixes plus misc stuff all over the place" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (101 commits) ecryptfs: ->f_op is never NULL locks: break delegations on any attribute modification locks: break delegations on link locks: break delegations on rename locks: helper functions for delegation breaking locks: break delegations on unlink namei: minor vfs_unlink cleanup locks: implement delegations locks: introduce new FL_DELEG lock flag vfs: take i_mutex on renamed file vfs: rename I_MUTEX_QUOTA now that it's not used for quotas vfs: don't use PARENT/CHILD lock classes for non-directories vfs: pull ext4's double-i_mutex-locking into common code exportfs: fix quadratic behavior in filehandle lookup exportfs: better variable name exportfs: move most of reconnect_path to helper function exportfs: eliminate unused "noprogress" counter exportfs: stop retrying once we race with rename/remove exportfs: clear DISCONNECTED on all parents sooner exportfs: more detailed comment for path_reconnect ...
2013-11-04NFSv4.2: Remove redundant checks in nfs_setsecurity+nfs4_label_init_securityTrond Myklebust2-9/+0
We already check for nfs_server_capable(inode, NFS_CAP_SECURITY_LABEL) in nfs4_label_alloc() We check the minor version in _nfs4_server_capabilities before setting NFS_CAP_SECURITY_LABEL. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-04NFSv4: Sanity check the server reply in _nfs4_server_capabilitiesTrond Myklebust1-5/+20
We don't want to be setting capabilities and/or requesting attributes that are not appropriate for the NFSv4 minor version. - Ensure that we clear the NFS_CAP_SECURITY_LABEL capability when appropriate - Ensure that we limit the attribute bitmasks to the mounted_on_fileid attribute and less for NFSv4.0 - Ensure that we limit the attribute bitmasks to suppattr_exclcreat and less for NFSv4.1 - Ensure that we limit it to change_sec_label or less for NFSv4.2 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-04NFSv4.2: encode_readdir - only ask for labels when doing readdirplusTrond Myklebust1-13/+12
Currently, if the server is doing NFSv4.2 and supports labeled NFS, then our on-the-wire READDIR request ends up asking for the label information, which is then ignored unless we're doing readdirplus. This patch ensures that READDIR doesn't ask the server for label information at all unless the readdir->bitmask contains the FATTR4_WORD2_SECURITY_LABEL attribute, and the readdir->plus flag is set. While we're at it, optimise away the 3rd bitmap field if it is zero. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-04nfs: set security label when revalidating inodeJeff Layton1-0/+2
Currently, we fetch the security label when revalidating an inode's attributes, but don't apply it. This is in contrast to the readdir() codepath where we do apply label changes. Cc: Dave Quigley <dpquigl@davequigley.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-01NFS: Fix a missing initialisation when reading the SELinux labelTrond Myklebust1-3/+3
Ensure that _nfs4_do_get_security_label() also initialises the SEQUENCE call correctly, by having it call into nfs4_call_sync(). Reported-by: Jeff Layton <jlayton@redhat.com> Cc: stable@vger.kernel.org # 3.11+ Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-01nfs: fix oops when trying to set SELinux labelJeff Layton1-4/+4
Chao reported the following oops when testing labeled NFS: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffffa0568703>] nfs4_xdr_enc_setattr+0x43/0x110 [nfsv4] PGD 277bbd067 PUD 2777ea067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache sg coretemp kvm_intel kvm crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul iTCO_wdt glue_helper ablk_helper cryptd iTCO_vendor_support bnx2 pcspkr serio_raw i7core_edac cdc_ether microcode usbnet edac_core mii lpc_ich i2c_i801 mfd_core shpchp ioatdma dca acpi_cpufreq mperf nfsd auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sr_mod sd_mod cdrom crc_t10dif mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ata_generic ttm pata_acpi drm ata_piix libata megaraid_sas i2c_core dm_mirror dm_region_hash dm_log dm_mod CPU: 4 PID: 25657 Comm: chcon Not tainted 3.10.0-33.el7.x86_64 #1 Hardware name: IBM System x3550 M3 -[7944OEJ]-/90Y4784 , BIOS -[D6E150CUS-1.11]- 02/08/2011 task: ffff880178397220 ti: ffff8801595d2000 task.ti: ffff8801595d2000 RIP: 0010:[<ffffffffa0568703>] [<ffffffffa0568703>] nfs4_xdr_enc_setattr+0x43/0x110 [nfsv4] RSP: 0018:ffff8801595d3888 EFLAGS: 00010296 RAX: 0000000000000000 RBX: ffff8801595d3b30 RCX: 0000000000000b4c RDX: ffff8801595d3b30 RSI: ffff8801595d38e0 RDI: ffff880278b6ec00 RBP: ffff8801595d38c8 R08: ffff8801595d3b30 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801595d38e0 R13: ffff880277a4a780 R14: ffffffffa05686c0 R15: ffff8802765f206c FS: 00007f2c68486800(0000) GS:ffff88027fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000027651a000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 ffff880277865800 ffff880278b6ec00 ffff880277a4a780 ffff8801595d3948 ffffffffa02ad926 ffff8801595d3b30 ffff8802765f206c Call Trace: [<ffffffffa02ad926>] rpcauth_wrap_req+0x86/0xd0 [sunrpc] [<ffffffffa02a1d40>] ? call_connect+0xb0/0xb0 [sunrpc] [<ffffffffa02a1d40>] ? call_connect+0xb0/0xb0 [sunrpc] [<ffffffffa02a1ecb>] call_transmit+0x18b/0x290 [sunrpc] [<ffffffffa02a1d40>] ? call_connect+0xb0/0xb0 [sunrpc] [<ffffffffa02aae14>] __rpc_execute+0x84/0x400 [sunrpc] [<ffffffffa02ac40e>] rpc_execute+0x5e/0xa0 [sunrpc] [<ffffffffa02a2ea0>] rpc_run_task+0x70/0x90 [sunrpc] [<ffffffffa02a2f03>] rpc_call_sync+0x43/0xa0 [sunrpc] [<ffffffffa055284d>] _nfs4_do_set_security_label+0x11d/0x170 [nfsv4] [<ffffffffa0558861>] nfs4_set_security_label.isra.69+0xf1/0x1d0 [nfsv4] [<ffffffff815fca8b>] ? avc_alloc_node+0x24/0x125 [<ffffffff815fcd2f>] ? avc_compute_av+0x1a3/0x1b5 [<ffffffffa055897b>] nfs4_xattr_set_nfs4_label+0x3b/0x50 [nfsv4] [<ffffffff811bc772>] generic_setxattr+0x62/0x80 [<ffffffff811bcfc3>] __vfs_setxattr_noperm+0x63/0x1b0 [<ffffffff811bd1c5>] vfs_setxattr+0xb5/0xc0 [<ffffffff811bd2fe>] setxattr+0x12e/0x1c0 [<ffffffff811a4d22>] ? final_putname+0x22/0x50 [<ffffffff811a4f2b>] ? putname+0x2b/0x40 [<ffffffff811aa1cf>] ? user_path_at_empty+0x5f/0x90 [<ffffffff8119bc29>] ? __sb_start_write+0x49/0x100 [<ffffffff811bd66f>] SyS_lsetxattr+0x8f/0xd0 [<ffffffff8160cf99>] system_call_fastpath+0x16/0x1b Code: 48 8b 02 48 c7 45 c0 00 00 00 00 48 c7 45 c8 00 00 00 00 48 c7 45 d0 00 00 00 00 48 c7 45 d8 00 00 00 00 48 c7 45 e0 00 00 00 00 <48> 8b 00 48 8b 00 48 85 c0 0f 84 ae 00 00 00 48 8b 80 b8 03 00 RIP [<ffffffffa0568703>] nfs4_xdr_enc_setattr+0x43/0x110 [nfsv4] RSP <ffff8801595d3888> CR2: 0000000000000000 The problem is that _nfs4_do_set_security_label calls rpc_call_sync() directly which fails to do any setup of the SEQUENCE call. Have it use nfs4_call_sync() instead which does the right thing. While we're at it change the name of "args" to "arg" to better match the pattern in _nfs4_do_setattr. Reported-by: Chao Ye <cye@redhat.com> Cc: David Quigley <dpquigl@davequigley.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Cc: stable@vger.kernel.org # 3.11+ Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-31nfs: fix inverted test for delegation in nfs4_reclaim_open_stateJeff Layton1-1/+1
commit 6686390bab6a0e0 (NFS: remove incorrect "Lock reclaim failed!" warning.) added a test for a delegation before checking to see if any reclaimed locks failed. The test however is backward and is only doing that check when a delegation is held instead of when one isn't. Cc: NeilBrown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Fixes: 6686390bab6a: NFS: remove incorrect "Lock reclaim failed!" warning. Cc: stable@vger.kernel.org # 3.12 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28Merge branch 'fscache' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into linux-nextTrond Myklebust5-160/+70
Pull fs-cache fixes from David Howells: Can you pull these commits to fix an issue with NFS whereby caching can be enabled on a file that is open for writing by subsequently opening it for reading. This can be made to crash by opening it for writing again if you're quick enough. The gist of the patchset is that the cookie should be acquired at inode creation only and subsequently enabled and disabled as appropriate (which dispenses with the backing objects when they're not needed). The extra synchronisation that NFS does can then be dispensed with as it is thenceforth managed by FS-Cache. Could you send these on to Linus? This likely will need fixing also in CIFS and 9P also once the FS-Cache changes are upstream. AFS and Ceph are probably safe. * 'fscache' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: NFS: Use i_writecount to control whether to get an fscache cookie in nfs_open() FS-Cache: Provide the ability to enable/disable cookies FS-Cache: Add use/unuse/wake cookie wrappers
2013-10-28nfs: use IS_ROOT not DCACHE_DISCONNECTEDJ. Bruce Fields1-1/+7
This check was added by Al Viro with d9e80b7de91db05c1c4d2e5ebbfd70b3b3ba0e0f "nfs d_revalidate() is too trigger-happy with d_drop()", with the explanation that we don't want to remove the root of a disconnected tree, which will still be included on the s_anon list. But DCACHE_DISCONNECTED does *not* actually identify dentries that are disconnected from the dentry tree or hashed on s_anon. IS_ROOT() is the way to do that. Also add a comment from Al's commit to remind us why this check is there. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28nfs: Use PTR_ERR_OR_ZERO in 'nfs/nfs4super.c'Geyslan G. Bem1-6/+6
Use 'PTR_ERR_OR_ZERO()' rather than 'IS_ERR(...) ? PTR_ERR(...) : 0'. Signed-off-by: Geyslan G. Bem <geyslan@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28nfs: Use PTR_ERR_OR_ZERO in 'nfs41_callback_up' functionGeyslan G. Bem1-2/+1
Use 'PTR_ERR_OR_ZERO()' rather than 'IS_ERR(...) ? PTR_ERR(...) : 0'. Signed-off-by: Geyslan G. Bem <geyslan@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28nfs: Remove useless 'error' assignmentGeyslan G. Bem1-2/+1
the 'error' variable was been assigned twice in vain. Signed-off-by: Geyslan G. Bem <geyslan@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: add support for multiple sec= mount optionsWeston Andros Adamson5-70/+143
This patch adds support for multiple security options which can be specified using a colon-delimited list of security flavors (the same syntax as nfsd's exports file). This is useful, for instance, when NFSv4.x mounts cross SECINFO boundaries. With this patch a user can use "sec=krb5i,krb5p" to mount a remote filesystem using krb5i, but can still cross into krb5p-only exports. New mounts will try all security options before failing. NFSv4.x SECINFO results will be compared against the sec= flavors to find the first flavor in both lists or if no match is found will return -EPERM. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: stop using NFS_MOUNT_SECFLAVOUR server flagWeston Andros Adamson4-7/+8
Since the parsed sec= flavor is now stored in nfs_server->auth_info, we no longer need an nfs_server flag to determine if a sec= option was used. This flag has not been completely removed because it is still needed for the (old but still supported) non-text parsed mount options ABI compatability. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: cache parsed auth_info in nfs_serverWeston Andros Adamson3-2/+4
Cache the auth_info structure in nfs_server and pass these values to submounts. This lays the groundwork for supporting multiple sec= options. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: separate passed security flavs from selectedWeston Andros Adamson4-35/+35
When filling parsed_mount_data, store the parsed sec= mount option in the new struct nfs_auth_info and the chosen flavor in selected_flavor. This patch lays the groundwork for supporting multiple sec= options. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFSv4: make nfs_find_best_sec staticWeston Andros Adamson2-2/+1
It's not used outside of nfs4namespace.c anymore. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Fix possible endless state recovery waitChuck Lever1-4/+6
In nfs4_wait_clnt_recover(), hold a reference to the clp being waited on. The state manager can reduce clp->cl_count to 1, in which case the nfs_put_client() in nfs4_run_state_manager() can free *clp before wait_on_bit() returns and allows nfs4_wait_clnt_recover() to run again. The behavior at that point is non-deterministic. If the waited-on bit still happens to be zero, wait_on_bit() will wake the waiter as expected. If the bit is set again (say, if the memory was poisoned when freed) wait_on_bit() can leave the waiter asleep. This is a narrow fix which ensures the safety of accessing *clp in nfs4_wait_clnt_recover(), but does not address the continued use of a possibly freed *clp after nfs4_wait_clnt_recover() returns (see nfs_end_delegation_return(), for example). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Set EXCHGID4_FLAG_SUPP_MOVED_MIGRChuck Lever2-1/+18
Broadly speaking, v4.1 migration is untested. There are no servers in the wild that support NFSv4.1 migration. However, as server implementations become available, we do want to enable testing by developers, while leaving it disabled for environments for which broken migration support would be an unpleasant surprise. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Handle SEQ4_STATUS_LEASE_MOVEDChuck Lever1-2/+3
With the advent of NFSv4 sessions in NFSv4.1 and following, a "lease moved" condition is reported differently than it is in NFSv4.0. NFSv4 minor version 0 servers return an error status code, NFS4ERR_LEASE_MOVED, to signal that a lease has moved. This error causes the whole compound operation to fail. Normal compounds against this server continue to fail until the client performs migration recovery on the migrated share. Minor version 1 and later servers assert a bit flag in the reply to a compound's SEQUENCE operation to signal LEASE_MOVED. This is not a fatal condition: operations against this server continue normally. The server asserts this flag until the client performs migration recovery on the migrated share. Note that servers MUST NOT return NFS4ERR_LEASE_MOVED to NFSv4 clients not using NFSv4.0. After the server asserts any of the sr_status_flags in the SEQUENCE operation in a typical compound, our client initiates standard lease recovery. For NFSv4.1+, a stand-alone SEQUENCE operation is performed to discover what recovery is needed. If SEQ4_STATUS_LEASE_MOVED is asserted in this stand-alone SEQUENCE operation, our client attempts to discover which FSIDs have been migrated, and then performs migration recovery on each. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Handle NFS4ERR_LEASE_MOVED during async RENEWChuck Lever1-1/+7
With NFSv4 minor version 0, the asynchronous lease RENEW heartbeat can return NFS4ERR_LEASE_MOVED. Error recovery logic for async RENEW is a separate code path from the generic NFS proc paths, so it must be updated to handle NFS4ERR_LEASE_MOVED as well. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Migration support for RELEASE_LOCKOWNERChuck Lever1-0/+16
Currently the Linux NFS client ignores the operation status code for the RELEASE_LOCKOWNER operation. Like NFSv3's UMNT operation, RELEASE_LOCKOWNER is a courtesy to help servers manage their resources, and the outcome is not consequential for the client. During a migration, a server may report NFS4ERR_LEASE_MOVED, in which case the client really should retry, since typically LEASE_MOVED has nothing to do with the current operation, but does prevent it from going forward. Also, it's important for a client to respond as soon as possible to a moved lease condition, since the client's lease could expire on the destination without further action by the client. NFS4ERR_DELAY is not included in the list of valid status codes for RELEASE_LOCKOWNER in RFC 3530bis. However, rfc3530-migration-update does permit migration-capable servers to return DELAY to clients, but only in the context of an ongoing migration. In this case the server has frozen lock state in preparation for migration, and a client retry would help the destination server purge unneeded state once migration recovery is complete. Interestly, NFS4ERR_MOVED is not valid for RELEASE_LOCKOWNER, even though lock owners can be migrated with Transparent State Migration. Note that RFC 3530bis section 9.5 includes RELEASE_LOCKOWNER in the list of operations that renew a client's lease on the server if they succeed. Now that our client pays attention to the operation's status code, we can note that renewal appropriately. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Implement support for NFS4ERR_LEASE_MOVEDChuck Lever1-0/+9
Trigger lease-moved recovery when a request returns NFS4ERR_LEASE_MOVED. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Support NFS4ERR_LEASE_MOVED recovery in state managerChuck Lever2-1/+74
A migration on the FSID in play for the current NFS operation is reported via the error status code NFS4ERR_MOVED. "Lease moved" means that a migration has occurred on some other FSID than the one for the current operation. It's a signal that the client should take action immediately to handle a migration that it may not have noticed otherwise. This is so that the client's lease does not expire unnoticed on the destination server. In NFSv4.0, a moved lease is reported with the NFS4ERR_LEASE_MOVED error status code. To recover from NFS4ERR_LEASE_MOVED, check each FSID for that server to see if it is still present. Invoke nfs4_try_migration() if the FSID is no longer present on the server. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Add method to detect whether an FSID is still on the serverChuck Lever3-2/+195
Introduce a mechanism for probing a server to determine if an FSID is present or absent. The on-the-wire compound is different between minor version 0 and 1. Minor version 0 appends a RENEW operation to identify which client ID is probing. Minor version 1 has a SEQUENCE operation in the compound which effectively carries the same information. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Handle NFS4ERR_MOVED during delegation recallChuck Lever1-0/+3
When a server returns NFS4ERR_MOVED during a delegation recall, trigger the new migration recovery logic in the state manager. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Add migration recovery callouts in nfs4proc.cChuck Lever1-2/+20
When a server returns NFS4ERR_MOVED, trigger the new migration recovery logic in the state manager. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Rename "stateid_invalid" labelChuck Lever1-3/+3
I'm going to use this exit label also for migration recovery failures. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Re-use exit code in nfs4_async_handle_error()Chuck Lever1-6/+3
Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Add basic migration support to state manager threadChuck Lever3-3/+161
Migration recovery and state recovery must be serialized, so handle both in the state manager thread. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Add a super_block backpointer to the nfs_server structChuck Lever1-0/+1
NFS_SB() returns the pointer to an nfs_server struct, given a pointer to a super_block. But we have no way to go back the other way. Add a super_block backpointer field so that, given an nfs_server struct, it is easy to get to the filesystem's root dentry. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Add method to retrieve fs_locations during migration recoveryChuck Lever3-11/+192
The nfs4_proc_fs_locations() function is invoked during referral processing to perform a GETATTR(fs_locations) on an object's parent directory in order to discover the target of the referral. It performs a LOOKUP in the compound, so the client needs to know the parent's file handle a priori. Unfortunately this function is not adequate for handling migration recovery. We need to probe fs_locations information on an FSID, but there's no parent directory available for many operations that can return NFS4ERR_MOVED. Another subtlety: recovering from NFS4ERR_LEASE_MOVED is a process of walking over a list of known FSIDs that reside on the server, and probing whether they have migrated. Once the server has detected that the client has probed all migrated file systems, it stops returning NFS4ERR_LEASE_MOVED. A minor version zero server needs to know what client ID is requesting fs_locations information so it can clear the flag that forces it to continue returning NFS4ERR_LEASE_MOVED. This flag is set per client ID and per FSID. However, the client ID is not an argument of either the PUTFH or GETATTR operations. Later minor versions have client ID information embedded in the compound's SEQUENCE operation. Therefore, by convention, minor version zero clients send a RENEW operation in the same compound as the GETATTR(fs_locations), since RENEW's one argument is a clientid4. This allows a minor version zero server to identify correctly the client that is probing for a migration. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Export _nfs_display_fhandle()Chuck Lever2-1/+3
Allow code in nfsv4.ko to use _nfs_display_fhandle(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Introduce a vector of migration recovery opsChuck Lever2-0/+14
The differences between minor version 0 and minor version 1 migration will be abstracted by the addition of a set of migration recovery ops. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Add functions to swap transports during migration recoveryChuck Lever2-0/+103
Introduce functions that can walk through an array of returned fs_locations information and connect a transport to one of the destination servers listed therein. Note that NFS minor version 1 introduces "fs_locations_info" which extends the locations array sorting criteria available to clients. This is not supported yet. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-10-28NFS: Add nfs4_update_serverChuck Lever3-1/+113
New function nfs4_update_server() moves an nfs_server to a different nfs_client. This is done as part of migration recovery. Though it may be appealing to think of them as the same thing, migration recovery is not the same as following a referral. For a referral, the client has not descended into the file system yet: it has no nfs_server, no super block, no inodes or open state. It is enough to simply instantiate the nfs_server and super block, and perform a referral mount. For a migration, however, we have all of those things already, and they have to be moved to a different nfs_client. No local namespace changes are needed here. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>