linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2012-11-28	nfsd4: remove state lock from nfsd4_load_reboot_recovery_data	J. Bruce Fields	1	-2/+0
	That function is only called under nfsd_mutex: we know that because the only caller is nfsd_svc, via nfsd_svc nfsd_startup nfs4_state_start nfsd4_client_tracking_init client_tracking_ops->init == nfsd4_load_reboot_recovery_data The shared state accessed here includes: - user_recovery_dirname: used here, modified only by nfs4_reset_recoverydir, which can be verified to only be called under nfsd_mutex. - filesystem state, protected by i_mutex (handwaving slightly here) - rec_file, reclaim_str_hashtbl, reclaim_str_hashtbl_size: other than here, used only from code called from nfsd or laundromat threads, both of which should be started only after this runs (see nfsd_svc) and stopped before this could run again (see nfsd_shutdown, called from nfsd_last_thread). Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-27	nfsd4: return badname, not inval, on "." or "..", or "/"	J. Bruce Fields	1	-13/+12
	The spec requires badname, not inval, in these cases. Some callers want us to return enoent, but I can see no justification for that. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-26	nfsd4: downgrade some fs/nfsd/nfs4state.c BUG's	J. Bruce Fields	1	-11/+15
	Linus has pointed out that indiscriminate use of BUG's can make it harder to diagnose bugs because they can bring a machine down, often before we manage to get any useful debugging information to the logs. (Consider, for example, a BUG() that fires in a workqueue, or while holding a spinlock). Most of these BUG's won't do much more than kill an nfsd thread, but it would still probably be safer to get out the warning without dying. There's still more of this to do in nfsd/. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-26	nfsd4: delay filling in write iovec array till after xdr decoding	J. Bruce Fields	3	-22/+23
	Our server rejects compounds containing more than one write operation. It's unclear whether this is really permitted by the spec; with 4.0, it's possibly OK, with 4.1 (which has clearer limits on compound parameters), it's probably not OK. No client that we're aware of has ever done this, but in theory it could be useful. The source of the limitation: we need an array of iovecs to pass to the write operation. In the worst case that array of iovecs could have hundreds of elements (the maximum rwsize divided by the page size), so it's too big to put on the stack, or in each compound op. So we instead keep a single such array in the compound argument. We fill in that array at the time we decode the xdr operation. But we decode every op in the compound before executing any of them. So once we've used that array we can't decode another write. If we instead delay filling in that array till the time we actually perform the write, we can reuse it. Another option might be to switch to decoding compound ops one at a time. I considered doing that, but it has a number of other side effects, and I'd rather fix just this one problem for now. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-26	nfsd4: move more write parameters into xdr argument	J. Bruce Fields	2	-11/+11
	In preparation for moving some of this elsewhere. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-26	nfsd4: reorganize write decoding	J. Bruce Fields	1	-21/+41
	In preparation for moving some of it elsewhere. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-26	nfsd4: simplify reading of opnum	J. Bruce Fields	1	-32/+2
	The comment here is totally bogus: - OP_WRITE + 1 is RELEASE_LOCKOWNER. Maybe there was some older version of the spec in which that served as a sort of OP_ILLEGAL? No idea, but it's clearly wrong now. - In any case, I can't see that the spec says anything about what to do if the client sends us less ops than promised. It's clearly nutty client behavior, and we should do whatever's easiest: returning an xdr error (even though it won't be consistent with the error on the last op returned) seems fine to me. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-26	nfsd4: no, we're not going to check tags for utf8	J. Bruce Fields	1	-6/+0
	Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-26	nfsd: fix v4 reply caching	J. Bruce Fields	1	-1/+1
	Very embarassing: 1091006c5eb15cba56785bd5b498a8d0b9546903 "nfsd: turn on reply cache for NFSv4" missed a line, effectively leaving the reply cache off in the v4 case. I thought I'd tested that, but I guess not. This time, wrote a pynfs test to confirm it works. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15	nfsd: make laundromat network namespace aware	Stanislav Kinsbursky	2	-8/+15
	This patch moves laundromat_work to nfsd per-net context, thus allowing to run multiple laundries. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15	nfsd: pass nfsd_net instead of net to grace enders	Stanislav Kinsbursky	3	-14/+10
	Passing net context looks as overkill. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15	nfsd: use service net instead of hard-coded init_net	Stanislav Kinsbursky	4	-31/+49
	This patch replaces init_net by SVC_NET(), where possible and also passes proper context to nested functions where required. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15	nfsd: make close_lru list per net	Stanislav Kinsbursky	2	-13/+13
	This list holds nfs4 clients (open) stateowner queue for last close replay, which are network namespace aware. So let's make this list per network namespace too. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15	nfsd: make client_lru list per net	Stanislav Kinsbursky	2	-8/+13
	This list holds nfs4 clients queue for lease renewal, which are network namespace aware. So let's make this list per network namespace too. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15	nfsd: make sessionid_hashtbl allocated per net	Stanislav Kinsbursky	2	-11/+20
	This hash holds established sessions state and closely associated with nfs4_clients info, which are network namespace aware. So let's make it allocated per network namespace too. Note: this hash can be allocated in per-net operations. But it looks better to allocate it on nfsd state start and thus don't waste resources if server is not running. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15	nfsd: make lockowner_ino_hashtbl allocated per net	Stanislav Kinsbursky	2	-11/+20
	This hash holds file lock owners and closely associated with nfs4_clients info, which are network namespace aware. So let's make it allocated per network namespace too. Note: this hash can be allocated in per-net operations. But it looks better to allocate it on nfsd state start and thus don't waste resources if server is not running. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15	nfsd: make ownerstr_hashtbl allocated per net	Stanislav Kinsbursky	2	-15/+27
	This hash holds open owner state and closely associated with nfs4_clients info, which are network namespace aware. So let's make it allocated per network namespace too. Note: this hash can be allocated in per-net operations. But it looks better to allocate it on nfsd state start and thus don't waste resources if server is not running. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15	nfsd: make unconf_name_tree per net	Stanislav Kinsbursky	2	-23/+23
	This hash holds nfs4_clients info, which are network namespace aware. So let's make it allocated per network namespace. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15	nfsd: make unconf_id_hashtbl allocated per net	Stanislav Kinsbursky	2	-10/+16
	This hash holds nfs4_clients info, which are network namespace aware. So let's make it allocated per network namespace. Note: this hash can be allocated in per-net operations. But it looks better to allocate it on nfsd state start and thus don't waste resources if server is not running. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15	nfsd: make conf_name_tree per net	Stanislav Kinsbursky	2	-15/+20
	This tree holds nfs4_clients info, which are network namespace aware. So let's make it per network namespace. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15	nfsd: make conf_id_hashtbl allocated per net	Stanislav Kinsbursky	2	-21/+55
	This hash holds nfs4_clients info, which are network namespace aware. So let's make it allocated per network namespace. Note: this hash can be allocated in per-net operations. But it looks better to allocate it on nfsd state start and thus don't waste resources if server is not running. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15	nfsd: make reclaim_str_hashtbl allocated per net	Stanislav Kinsbursky	4	-55/+111
	This hash holds nfs4_clients info, which are network namespace aware. So let's make it allocated per network namespace. Note: this hash is used only by legacy tracker. So let's allocate hash in tracker init. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15	nfsd: make nfs4_client network namespace dependent	Stanislav Kinsbursky	4	-13/+14
	And use it's net where possible. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15	nfsd: use service net instead of hard-coded net where possible	Stanislav Kinsbursky	1	-5/+5
	Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-14	nfsd4: get_backchannel_cred should be static	Fengguang Wu	1	-1/+1
	Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-14	nfsd4: init_session should be declared static	Fengguang Wu	1	-1/+1
	Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-12	nfsd: release the legacy reclaimable clients list in grace_done	Jeff Layton	1	-0/+1
	The current code holds on to this list until nfsd is shut down, but it's never touched once the grace period ends. Release that memory back into the wild when the grace period ends. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-12	nfsd: get rid of cl_recdir field	Jeff Layton	3	-36/+77
	Remove the cl_recdir field from the nfs4_client struct. Instead, just compute it on the fly when and if it's needed, which is now only when the legacy client tracking code is in effect. The error handling in the legacy client tracker is also changed to handle the case where md5 is unavailable. In that case, we'll warn the admin with a KERN_ERR message and disable the client tracking. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-12	nfsd: move the confirmed and unconfirmed hlists to a rbtree	Jeff Layton	2	-52/+95
	The current code requires that we md5 hash the name in order to store the client in the confirmed and unconfirmed trees. Change it instead to store the clients in a pair of rbtrees, and simply compare the cl_names directly instead of hashing them. This also necessitates that we add a new flag to the clp->cl_flags field to indicate which tree the client is currently in. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-12	nfsd: don't search for client by hash on legacy reboot recovery gracedone	Jeff Layton	3	-13/+34
	When nfsd starts, the legacy reboot recovery code creates a tracking struct for each directory in the v4recoverydir. When the grace period ends, it basically does a "readdir" on the directory again, and matches each dentry in there to an existing client id to see if it should be removed or not. If the matching client doesn't exist, or hasn't reclaimed its state then it will remove that dentry. This is pretty inefficient since it involves doing a lot of hash-bucket searching. It also means that we have to keep relying on being able to search for a nfs4_client by md5 hashed cl_recdir name. Instead, add a pointer to the nfs4_client that indicates the association between the nfs4_client_reclaim and nfs4_client. When a reclaim operation comes in, we set the pointer to make that association. On gracedone, the legacy client tracker will keep the recdir around iff: 1/ there is a reclaim record for the directory ...and... 2/ there's an association between the reclaim record and a client record -- that is, a create or check operation was performed on the client that matches that directory. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-12	nfsd: make nfs4_client_to_reclaim return a pointer to the reclaim record	Jeff Layton	2	-11/+11
	Later callers will need to make changes to the record. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-12	nfsd: break out reclaim record removal into separate function	Jeff Layton	2	-3/+10
	We'll need to be able to call this from nfs4recover.c eventually. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-12	nfsd: have nfsd4_find_reclaim_client take a char * argument	Jeff Layton	3	-9/+6
	Currently, it takes a client pointer, but later we're going to need to search for these records without knowing whether a matching client even exists. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-12	nfsd: warn about impending removal of nfsdcld upcall	Jeff Layton	1	-0/+3
	Let's shoot for removing the nfsdcld upcall in 3.10. Most likely, no one is actually using it so I don't expect this warning to fire often (except maybe on misconfigured systems). Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-12	nfsd: pass info about the legacy recoverydir in environment variables	Jeff Layton	1	-8/+82
	The usermodehelper upcall program can then decide to use this info as a (one-way) transition mechanism to the new scheme. When a "check" upcall occurs and the client doesn't exist in the database, we can look to see whether the directory exists. If it does, then we'd add the client to the database, remove the legacy recdir, and return success to the kernel to allow the recovery to proceed. For gracedone, we simply pass the v4recovery "topdir" so that the upcall can clean it out prior to returning to the kernel. A module parm is also added to disable the legacy conversion if the admin chooses. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-12	nfsd: change heuristic for selecting the client_tracking_ops	Jeff Layton	1	-9/+27
	First, try to use the new usermodehelper upcall. It should succeed or fail quickly, so there's little cost to doing so. If it fails, and the legacy tracking dir exists, use that. If it doesn't exist then fall back to using nfsdcld. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-12	nfsd: add a usermodehelper upcall for NFSv4 client ID tracking	Jeff Layton	1	-1/+133
	Add a new client tracker upcall type that uses call_usermodehelper to call out to a program. This seems to be the preferred method of calling out to usermode these days for seldom-called upcalls. It's simple and doesn't require a running daemon, so it should "just work" as long as the binary is installed. The client tracking exit operation is also changed to check for a NULL pointer before running. The UMH upcall doesn't need to do anything at module teardown time. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-10	nfsd: remove unused argument to nfs4_has_reclaimed_state	Jeff Layton	3	-3/+3
	Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-10	nfsd: fix error handling in nfsd4_remove_clid_dir	Jeff Layton	1	-1/+2
	If the credential save fails, then we'll leak our mnt_want_write_file reference. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07	nfsd4: backchannel should use client-provided security flavor	J. Bruce Fields	3	-5/+13
	For now this only adds support for AUTH_NULL. (Previously we assumed AUTH_UNIX.) We'll also need AUTH_GSS, which is trickier. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07	nfsd4: common helper to initialize callback work	J. Bruce Fields	3	-4/+9
	I've found it confusing having the only references to nfsd4_do_callback_rpc() in a different file. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07	nfsd4: implement backchannel_ctl operation	J. Bruce Fields	5	-1/+39
	This operation is mandatory for servers to implement. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07	nfsd4: use callback security parameters in create_session	J. Bruce Fields	3	-14/+37
	We're currently ignoring the callback security parameters specified in create_session, and just assuming the client wants auth_sys, because that's all the current linux client happens to care about. But this could cause us callbacks to fail to a client that wanted something different. For now, all we're doing is no longer ignoring the uid and gid passed in the auth_sys case. Further patches will add support for auth_null and gss (and possibly use more of the auth_sys information; the spec wants us to use exactly the credential we're passed, though it's hard to imagine why a client would care). Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07	nfsd4: clean up callback security parsing	J. Bruce Fields	2	-57/+70
	Move the callback parsing into a separate function. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07	nfsd: use vfs_fsync_range(), not O_SYNC, for stable writes	J. Bruce Fields	1	-7/+6
	NFSv4 shares the same struct file across multiple writes. (And we'd like NFSv2 and NFSv3 to do that as well some day.) So setting O_SYNC on the struct file as a way to request a synchronous write doesn't work. Instead, do a vfs_fsync_range() in that case. Reported-by: Peter Staubach <pstaubach@exagrid.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07	nfsd: assume writeable exportabled filesystems have f_sync	J. Bruce Fields	1	-13/+0
	I don't really see how you could claim to support nfsd and not support fsync somehow. And in practice a quick look through the exportable filesystems suggests the only ones without an ->fsync are read-only (efs, isofs, squashfs) or in-memory (shmem). Also, performing a write and then returning an error if the sync fails (as we would do here in the wgather case) seems unhelpful to clients. Also remove an incorrect comment. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07	nfsd4: don't BUG in delegation break callback	J. Bruce Fields	1	-3/+8
	These conditions would indeed indicate bugs in the code, but if we want to hear about them we're likely better off warning and returning than immediately dying while holding file_lock_lock. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07	nfsd4: remove unused init_session return	J. Bruce Fields	1	-2/+1
	Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07	nfsd4: helper function for getting mounted_on ino	J. Bruce Fields	1	-12/+18
	Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07	nfs: fix wrong object type in lockowner_slab	Yanchuan Nian	1	-1/+1
	The object type in the cache of lockowner_slab is wrong, and it is better to fix it. Cc: stable@vger.kernel.org Signed-off-by: Yanchuan Nian <ycnian@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>