aboutsummaryrefslogtreecommitdiffstats
path: root/net/sunrpc (follow)
AgeCommit message (Collapse)AuthorFilesLines
2018-11-04Merge tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds3-34/+14
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Bugfix: - Fix build issues on architectures that don't provide 64-bit cmpxchg Cleanups: - Fix a spelling mistake" * tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: fix spelling mistake, EACCESS -> EACCES SUNRPC: Use atomic(64)_t for seq_send(64)
2018-11-01Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds2-3/+3
Pull AFS updates from Al Viro: "AFS series, with some iov_iter bits included" * 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits) missing bits of "iov_iter: Separate type from direction and use accessor functions" afs: Probe multiple fileservers simultaneously afs: Fix callback handling afs: Eliminate the address pointer from the address list cursor afs: Allow dumping of server cursor on operation failure afs: Implement YFS support in the fs client afs: Expand data structure fields to support YFS afs: Get the target vnode in afs_rmdir() and get a callback on it afs: Calc callback expiry in op reply delivery afs: Fix FS.FetchStatus delivery from updating wrong vnode afs: Implement the YFS cache manager service afs: Remove callback details from afs_callback_break struct afs: Commit the status on a new file/dir/symlink afs: Increase to 64-bit volume ID and 96-bit vnode ID for YFS afs: Don't invoke the server to read data beyond EOF afs: Add a couple of tracepoints to log I/O errors afs: Handle EIO from delivery function afs: Fix TTL on VL server and address lists afs: Implement VL server rotation afs: Improve FS server rotation error handling ...
2018-11-01missing bits of "iov_iter: Separate type from direction and use accessor functions"Al Viro1-2/+2
sunrpc patches from nfs tree conflict with calling conventions change done in iov_iter work. Trivial fixup... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-11-01Merge tag 'nfs-for-4.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsAl Viro26-1590/+1921
backmerge to do fixup of iov_iter_kvec() conflict
2018-11-01SUNRPC: Use atomic(64)_t for seq_send(64)Paul Burton3-34/+14
The seq_send & seq_send64 fields in struct krb5_ctx are used as atomically incrementing counters. This is implemented using cmpxchg() & cmpxchg64() to implement what amount to custom versions of atomic_fetch_inc() & atomic64_fetch_inc(). Besides the duplication, using cmpxchg64() has another major drawback in that some 32 bit architectures don't provide it. As such commit 571ed1fd2390 ("SUNRPC: Replace krb5_seq_lock with a lockless scheme") resulted in build failures for some architectures. Change seq_send to be an atomic_t and seq_send64 to be an atomic64_t, then use atomic(64)_* functions to manipulate the values. The atomic64_t type & associated functions are provided even on architectures which lack real 64 bit atomic memory access via CONFIG_GENERIC_ATOMIC64 which uses spinlocks to serialize access. This fixes the build failures for architectures lacking cmpxchg64(). A potential alternative that was raised would be to provide cmpxchg64() on the 32 bit architectures that currently lack it, using spinlocks. However this would provide a version of cmpxchg64() with semantics a little different to the implementations on architectures with real 64 bit atomics - the spinlock-based implementation would only work if all access to the memory used with cmpxchg64() is *always* performed using cmpxchg64(). That is not currently a requirement for users of cmpxchg64(), and making it one seems questionable. As such avoiding cmpxchg64() outside of architecture-specific code seems best, particularly in cases where atomic64_t seems like a better fit anyway. The CONFIG_GENERIC_ATOMIC64 implementation of atomic64_* functions will use spinlocks & so faces the same issue, but with the key difference that the memory backing an atomic64_t ought to always be accessed via the atomic64_* functions anyway making the issue moot. Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: 571ed1fd2390 ("SUNRPC: Replace krb5_seq_lock with a lockless scheme") Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Anna Schumaker <anna.schumaker@netapp.com> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Jeff Layton <jlayton@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: linux-nfs@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-10-30Merge tag 'nfsd-4.20' of git://linux-nfs.org/~bfields/linuxLinus Torvalds8-161/+219
Pull nfsd updates from Bruce Fields: "Olga added support for the NFSv4.2 asynchronous copy protocol. We already supported COPY, by copying a limited amount of data and then returning a short result, letting the client resend. The asynchronous protocol should offer better performance at the expense of some complexity. The other highlight is Trond's work to convert the duplicate reply cache to a red-black tree, and to move it and some other server caches to RCU. (Previously these have meant taking global spinlocks on every RPC) Otherwise, some RDMA work and miscellaneous bugfixes" * tag 'nfsd-4.20' of git://linux-nfs.org/~bfields/linux: (30 commits) lockd: fix access beyond unterminated strings in prints nfsd: Fix an Oops in free_session() nfsd: correctly decrement odstate refcount in error path svcrdma: Increase the default connection credit limit svcrdma: Remove try_module_get from backchannel svcrdma: Remove ->release_rqst call in bc reply handler svcrdma: Reduce max_send_sges nfsd: fix fall-through annotations knfsd: Improve lookup performance in the duplicate reply cache using an rbtree knfsd: Further simplify the cache lookup knfsd: Simplify NFS duplicate replay cache knfsd: Remove dead code from nfsd_cache_lookup SUNRPC: Simplify TCP receive code SUNRPC: Replace the cache_detail->hash_lock with a regular spinlock SUNRPC: Remove non-RCU protected lookup NFS: Fix up a typo in nfs_dns_ent_put NFS: Lockless DNS lookups knfsd: Lockless lookup of NFSv4 identities. SUNRPC: Lockless server RPCSEC_GSS context lookup knfsd: Allow lockless lookups of the exports ...
2018-10-29nfsd: Fix an Oops in free_session()Trond Myklebust1-1/+1
In call_xpt_users(), we delete the entry from the list, but we do not reinitialise it. This triggers the list poisoning when we later call unregister_xpt_user() in nfsd4_del_conns(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-29svcrdma: Remove try_module_get from backchannelChuck Lever1-14/+0
Since commit ffe1f0df5862 ("rpcrdma: Merge svcrdma and xprtrdma modules into one"), the forward and backchannel components are part of the same kernel module. A separate try_module_get() call in the backchannel code is no longer necessary. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-29svcrdma: Remove ->release_rqst call in bc reply handlerChuck Lever1-5/+4
Similar to a change made in the client's forward channel reply handler: The xprt_release_rqst_cong() call is not necessary. Also, release xprt->recv_lock when taking xprt->transport_lock to avoid disabling and enabling BH's while holding another spin lock. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-29svcrdma: Reduce max_send_sgesChuck Lever1-4/+6
There's no need to request a large number of send SGEs because the inline threshold already constrains the number of SGEs per Send. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-29SUNRPC: Simplify TCP receive codeTrond Myklebust1-39/+14
Use the fact that the iov iterators already have functionality for skipping a base offset. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-29SUNRPC: Replace the cache_detail->hash_lock with a regular spinlockTrond Myklebust1-23/+23
Now that the reader functions are all RCU protected, use a regular spinlock rather than a reader/writer lock. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-29SUNRPC: Remove non-RCU protected lookupTrond Myklebust1-57/+4
Clean up the cache code by removing the non-RCU protected lookup. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-29SUNRPC: Lockless server RPCSEC_GSS context lookupTrond Myklebust1-6/+26
Use RCU protection for looking up the RPCSEC_GSS context. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-29SUNRPC: Make server side AUTH_UNIX use lockless lookupsTrond Myklebust1-6/+8
Convert structs ip_map and unix_gid to use RCU protected lookups. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-29SUNRPC: Allow cache lookups to use RCU protection rather than the r/w spinlockTrond Myklebust1-14/+79
Instead of the reader/writer spinlock, allow cache lookups to use RCU for looking up entries. This is more efficient since modifications can occur while other entries are being looked up. Note that for now, we keep the reader/writer spinlock until all users have been converted to use RCU-safe freeing of their cache entries. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-26Merge tag 'nfs-for-4.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds26-1590/+1921
Pull NFS client updates from Trond Myklebust: "Highlights include: Stable fixes: - Fix the NFSv4.1 r/wsize sanity checking - Reset the RPC/RDMA credit grant properly after a disconnect - Fix a missed page unlock after pg_doio() Features and optimisations: - Overhaul of the RPC client socket code to eliminate a locking bottleneck and reduce the latency when transmitting lots of requests in parallel. - Allow parallelisation of the RPCSEC_GSS encoding of an RPC request. - Convert the RPC client socket receive code to use iovec_iter() for improved efficiency. - Convert several NFS and RPC lookup operations to use RCU instead of taking global locks. - Avoid the need for BH-safe locks in the RPC/RDMA back channel. Bugfixes and cleanups: - Fix lock recovery during NFSv4 delegation recalls - Fix the NFSv4 + NFSv4.1 "lookup revalidate + open file" case. - Fixes for the RPC connection metrics - Various RPC client layer cleanups to consolidate stream based sockets - RPC/RDMA connection cleanups - Simplify the RPC/RDMA cleanup after memory operation failures - Clean ups for NFS v4.2 copy completion and NFSv4 open state reclaim" * tag 'nfs-for-4.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (97 commits) SUNRPC: Convert the auth cred cache to use refcount_t SUNRPC: Convert auth creds to use refcount_t SUNRPC: Simplify lookup code SUNRPC: Clean up the AUTH cache code NFS: change sign of nfs_fh length sunrpc: safely reallow resvport min/max inversion nfs: remove redundant call to nfs_context_set_write_error() nfs: Fix a missed page unlock after pg_doio() SUNRPC: Fix a compile warning for cmpxchg64() NFSv4.x: fix lock recovery during delegation recall SUNRPC: use cmpxchg64() in gss_seq_send64_fetch_and_inc() xprtrdma: Squelch a sparse warning xprtrdma: Clean up xprt_rdma_disconnect_inject xprtrdma: Add documenting comments xprtrdma: Report when there were zero posted Receives xprtrdma: Move rb_flags initialization xprtrdma: Don't disable BH's in backchannel server xprtrdma: Remove memory address of "ep" from an error message xprtrdma: Rename rpcrdma_qp_async_error_upcall xprtrdma: Simplify RPC wake-ups on connect ...
2018-10-24iov_iter: Separate type from direction and use accessor functionsDavid Howells1-1/+1
In the iov_iter struct, separate the iterator type from the iterator direction and use accessor functions to access them in most places. Convert a bunch of places to use switch-statements to access them rather then chains of bitwise-AND statements. This makes it easier to add further iterator types. Also, this can be more efficient as to implement a switch of small contiguous integers, the compiler can use ~50% fewer compare instructions than it has to use bitwise-and instructions. Further, cease passing the iterator type into the iterator setup function. The iterator function can set that itself. Only the direction is required. Signed-off-by: David Howells <dhowells@redhat.com>
2018-10-23SUNRPC: Convert the auth cred cache to use refcount_tTrond Myklebust5-8/+8
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-10-23SUNRPC: Convert auth creds to use refcount_tTrond Myklebust2-8/+8
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-10-23SUNRPC: Simplify lookup codeTrond Myklebust1-11/+8
We no longer need to worry about whether or not the entry is hashed in order to figure out if the contents are valid. We only care whether or not the refcount is non-zero. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-10-23SUNRPC: Clean up the AUTH cache codeTrond Myklebust2-62/+87
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-10-18Merge tag 'nfs-rdma-for-4.20-1' of git://git.linux-nfs.org/projects/anna/linux-nfsTrond Myklebust10-342/+293
NFS RDMA client updates for Linux 4.20 Stable bugfixes: - Reset credit grant properly after a disconnect Other bugfixes and cleanups: - xprt_release_rqst_cong is called outside of transport_lock - Create more MRs at a time and toss out old ones during recovery - Various improvements to the RDMA connection and disconnection code: - Improve naming of trace events, functions, and variables - Add documenting comments - Fix metrics and stats reporting - Fix a tracepoint sparse warning Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-10-18sunrpc: safely reallow resvport min/max inversionJ. Bruce Fields1-16/+18
Commits ffb6ca33b04b and e08ea3a96fc7 prevent setting xprt_min_resvport greater than xprt_max_resvport, but may also break simple code that sets one parameter then the other, if the new range does not overlap the old. Also it looks racy to me, unless there's some serialization I'm not seeing. Granted it would probably require malicious privileged processes (unless there's a chance these might eventually be settable in unprivileged containers), but still it seems better not to let userspace panic the kernel. Simpler seems to be to allow setting the parameters to whatever you want but interpret xprt_min_resvport > xprt_max_resvport as the empty range. Fixes: ffb6ca33b04b "sunrpc: Prevent resvport min/max inversion..." Fixes: e08ea3a96fc7 "sunrpc: Prevent rexvport min/max inversion..." Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-10-18SUNRPC: Fix a compile warning for cmpxchg64()Trond Myklebust1-0/+1
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-10-05SUNRPC: use cmpxchg64() in gss_seq_send64_fetch_and_inc()Arnd Bergmann1-1/+1
The newly introduced gss_seq_send64_fetch_and_inc() fails to build on 32-bit architectures: net/sunrpc/auth_gss/gss_krb5_seal.c:144:14: note: in expansion of macro 'cmpxchg' seq_send = cmpxchg(&ctx->seq_send64, old, old + 1); ^~~~~~~ arch/x86/include/asm/cmpxchg.h:128:3: error: call to '__cmpxchg_wrong_size' declared with attribute error: Bad argument size for cmpxchg __cmpxchg_wrong_size(); \ As the message tells us, cmpxchg() cannot be used on 64-bit arguments, that's what cmpxchg64() does. Fixes: 571ed1fd2390 ("SUNRPC: Replace krb5_seq_lock with a lockless scheme") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-10-03xprtrdma: Clean up xprt_rdma_disconnect_injectChuck Lever1-2/+1
Clean up: Use the appropriate C macro instead of open-coding container_of() . Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-03xprtrdma: Add documenting commentsChuck Lever1-20/+23
Clean up: fill in or update documenting comments for transport switch entry points. For xprt_rdma_allocate: The first paragraph is no longer true since commit 5a6d1db45569 ("SUNRPC: Add a transport-specific private field in rpc_rqst"). The second paragraph is no longer true since commit 54cbd6b0c6b9 ("xprtrdma: Delay DMA mapping Send and Receive buffers"). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-03xprtrdma: Report when there were zero posted ReceivesChuck Lever1-2/+5
To show that a caller did attempt to allocate and post more Receive buffers, the trace point in rpcrdma_post_recvs() should report when rpcrdma_post_recvs() was invoked but no new Receive buffers were posted. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-03xprtrdma: Move rb_flags initializationChuck Lever1-1/+1
Clean up: rb_flags might be used for other things besides RPCRDMA_BUF_F_EMPTY_SCQ, so initialize it in a generic spot instead of in a send-completion-queue-related helper. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-03SUNRPC: Refactor sunrpc_cache_lookupTrond Myklebust1-8/+25
This is a trivial split into lookup and insert functions, no change in behavior. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-03SUNRPC: Add lockless lookup of the server's auth domainTrond Myklebust3-6/+35
Avoid taking the global auth_domain_lock in most lookups of the auth domain by adding an RCU protected lookup. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-03SUNRPC: Remove the server 'authtab_lock' and just use RCUTrond Myklebust1-18/+34
Module removal is RCU safe by design, so we really have no need to lock the 'authtab[]' array. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-10-03xprtrdma: Don't disable BH's in backchannel serverChuck Lever1-8/+8
Clean up: This code was copied from xprtsock.c and backchannel_rqst.c. For rpcrdma, the backchannel server runs exclusively in process context, thus disabling bottom-halves is unnecessary. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-03xprtrdma: Remove memory address of "ep" from an error messageChuck Lever1-2/+3
Clean up: Replace the hashed memory address of the target rpcrdma_ep with the server's IP address and port. The server address is more useful in an administrative error message. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-03xprtrdma: Rename rpcrdma_qp_async_error_upcallChuck Lever1-3/+11
Clean up: Use a function name that is consistent with the RDMA core API and with other consumers. Because this is a function that is invoked from outside the rpcrdma.ko module, add an appropriate documenting comment. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-03xprtrdma: Simplify RPC wake-ups on connectChuck Lever3-48/+52
Currently, when a connection is established, rpcrdma_conn_upcall invokes rpcrdma_conn_func and then wake_up_all(&ep->rep_connect_wait). The former wakes waiting RPCs, but the connect worker is not done yet, and that leads to races, double wakes, and difficulty understanding how this logic is supposed to work. Instead, collect all the "connection established" logic in the connect worker (xprt_rdma_connect_worker). A disconnect worker is retained to handle provider upcalls safely. Fixes: 254f91e2fa1f ("xprtrdma: RPC/RDMA must invoke ... ") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-03xprtrdma: Re-organize the switch() in rpcrdma_conn_upcallChuck Lever1-9/+8
Clean up: Eliminate the FALLTHROUGH into the default arm to make the switch easier to understand. Also, as long as I'm here, do not display the memory address of the target rpcrdma_ep. A hashed memory address is of marginal use here. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-02xprtrdma: Eliminate "connstate" variable from rpcrdma_conn_upcall()Chuck Lever1-8/+6
Clean up. Since commit 173b8f49b3af ("xprtrdma: Demote "connect" log messages") there has been no need to initialize connstat to zero. In fact, in this code path there's now no reason not to set rep_connected directly. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-02xprtrdma: Conventional variable names in rpcrdma_conn_upcallChuck Lever1-11/+12
Clean up: The convention throughout other parts of xprtrdma is to name variables of type struct rpcrdma_xprt "r_xprt", not "xprt". This convention enables the use of the name "xprt" for a "struct rpc_xprt" type variable, as in other parts of the RPC client. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-02xprtrdma: Rename rpcrdma_conn_upcallChuck Lever1-3/+13
Clean up: Use a function name that is consistent with the RDMA core API and with other consumers. Because this is a function that is invoked from outside the rpcrdma.ko module, add an appropriate documenting comment. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-02sunrpc: Report connect_time in secondsChuck Lever2-3/+3
The way connection-oriented transports report connect_time is wrong: it's supposed to be in seconds, not in jiffies. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-02sunrpc: Fix connect metricsChuck Lever3-15/+15
For TCP, the logic in xprt_connect_status is currently never invoked to record a successful connection. Commit 2a4919919a97 ("SUNRPC: Return EAGAIN instead of ENOTCONN when waking up xprt->pending") changed the way TCP xprt's are awoken after a connect succeeds. Instead, change connection-oriented transports to bump connect_count and compute connect_time the moment that XPRT_CONNECTED is set. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-02xprtrdma: Name MR trace events consistentlyChuck Lever3-8/+8
Clean up the names of trace events related to MRs so that it's easy to enable these with a glob. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-02xprtrdma: Explicitly resetting MRs is no longer necessaryChuck Lever6-197/+113
When a memory operation fails, the MR's driver state might not match its hardware state. The only reliable recourse is to dereg the MR. This is done in ->ro_recover_mr, which then attempts to allocate a fresh MR to replace the released MR. Since commit e2ac236c0b651 ("xprtrdma: Allocate MRs on demand"), xprtrdma dynamically allocates MRs. It can add more MRs whenever they are needed. That makes it possible to simply release an MR when a memory operation fails, instead of "recovering" it. It will automatically be replaced by the on-demand MR allocator. This commit is a little larger than I wanted, but it replaces ->ro_recover_mr, rb_recovery_lock, rb_recovery_worker, and the rb_stale_mrs list with a generic work queue. Since MRs are no longer orphaned, the mrs_orphaned metric is no longer used. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-02xprtrdma: Create more MRs at a timeChuck Lever4-3/+3
Some devices require more than 3 MRs to build a single 1MB I/O. Ensure that rpcrdma_mrs_create() will add enough MRs to build that I/O. In a subsequent patch I'm changing the MR recovery logic to just toss out the MRs. In that case it's possible for ->send_request to loop acquiring some MRs, not getting enough, getting called again, recycling the previous MRs, then not getting enough, lather rinse repeat. Thus first we need to ensure enough MRs are created to prevent that loop. I'm "reusing" ia->ri_max_segs. All of its accessors seem to want the maximum number of data segments plus two, so I'm going to bake that into the initial calculation. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-02xprtrdma: Reset credit grant properly after a disconnectChuck Lever2-0/+7
On a fresh connection, an RPC/RDMA client is supposed to send only one RPC Call until it gets a credit grant in the first RPC Reply from the server [RFC 8166, Section 3.3.3]. There is a bug in the Linux client's credit accounting mechanism introduced by commit e7ce710a8802 ("xprtrdma: Avoid deadlock when credit window is reset"). On connect, it simply dumps all pending RPC Calls onto the new connection. Servers have been tolerant of this bad behavior. Currently no server implementation ever changes its credit grant over reconnects, and servers always repost enough Receives before connections are fully established. To correct this issue, ensure that the client resets both the credit grant _and_ the congestion window when handling a reconnect. Fixes: e7ce710a8802 ("xprtrdma: Avoid deadlock when credit ... ") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: stable@kernel.org Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-10-02xprtrdma: xprt_release_rqst_cong is called outside of transport_lockChuck Lever1-9/+7
Since commit ce7c252a8c74 ("SUNRPC: Add a separate spinlock to protect the RPC request receive list") the RPC/RDMA reply handler has been calling xprt_release_rqst_cong without holding xprt->transport_lock. I think the only way this call is ever made is if the credit grant increases and there are RPCs pending. Current server implementations do not change their credit grant during operation (except at connect time). Commit e7ce710a8802 ("xprtrdma: Avoid deadlock when credit window is reset") added the ->release_rqst call because UDP invokes xprt_adjust_cwnd(), which calls __xprt_put_cong() after adjusting xprt->cwnd. Both xprt_release() and ->xprt_release_xprt already wake another task in this case, so it is safe to remove this call from the reply handler. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-09-30SUNRPC: Replace krb5_seq_lock with a lockless schemeTrond Myklebust2-17/+28
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-09-30SUNRPC: Lockless lookup of RPCSEC_GSS mechanismsTrond Myklebust1-14/+14
Use RCU protected lookups for discovering the supported mechanisms. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>