aboutsummaryrefslogtreecommitdiffstats
path: root/fs/nfs/pagelist.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-09-11NFS: various changes relating to reporting IO errors.NeilBrown1-2/+2
1/ remove 'start' and 'end' args from nfs_file_fsync_commit(). They aren't used. 2/ Make nfs_context_set_write_error() a "static inline" in internal.h so we can... 3/ Use nfs_context_set_write_error() instead of mapping_set_error() if nfs_pageio_add_request() fails before sending any request. NFS generally keeps errors in the open_context, not the mapping, so this is more consistent. 4/ If filemap_write_and_write_range() reports any error, still check ctx->error. The value in ctx->error is likely to be more useful. As part of this, NFS_CONTEXT_ERROR_WRITE is cleared slightly earlier, before nfs_file_fsync_commit() is called, rather than at the start of that function. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-09-08NFS: Fix 2 use after free issues in the I/O codeTrond Myklebust1-14/+12
The writeback code wants to send a commit after processing the pages, which is why we want to delay releasing the struct path until after that's done. Also, the layout code expects that we do not free the inode before we've put the layout segments in pnfs_writehdr_free() and pnfs_readhdr_free() Fixes: 919e3bd9a875 ("NFS: Ensure we commit after writeback is complete") Fixes: 4714fb51fd03 ("nfs: remove pgio_header refcount, related cleanup") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-08-20Merge branch 'bugfixes'Trond Myklebust1-37/+40
2017-08-20NFS: Remove unused parameter gfp_flags from nfs_pageio_init()Trond Myklebust1-3/+1
Now that the mirror allocation has been moved, the parameter can go. Also remove the redundant symbol export. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-08-20NFSv4: Fix up mirror allocationTrond Myklebust1-34/+39
There are a number of callers of nfs_pageio_complete() that want to continue using the nfs_pageio_descriptor without needing to call nfs_pageio_init() again. Examples include nfs_pageio_resend() and nfs_pageio_cond_complete(). The problem is that nfs_pageio_complete() also calls nfs_pageio_cleanup_mirroring(), which frees up the array of mirrors. This can lead to writeback errors, in the next call to nfs_pageio_setup_mirroring(). Fix by simply moving the allocation of the mirrors to nfs_pageio_setup_mirroring(). Link: https://bugzilla.kernel.org/show_bug.cgi?id=196709 Reported-by: JianhongYin <yin-jianhong@163.com> Cc: stable@vger.kernel.org # 4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-08-15NFS: Wait for requests that are locked on the commit listTrond Myklebust1-0/+2
If a request is on the commit list, but is locked, we will currently skip it, which can lead to livelocking when the commit count doesn't reduce to zero. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-08-15NFS: Use an atomic_long_t to count the number of requestsTrond Myklebust1-3/+1
Rather than forcing us to take the inode->i_lock just in order to bump the number. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-08-15NFS: Remove unused parameter from nfs_page_group_lock()Trond Myklebust1-20/+11
nfs_page_group_lock() is now always called with the 'nonblock' parameter set to 'false'. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-08-15NFS: Remove unuse function nfs_page_group_lock_wait()Trond Myklebust1-21/+0
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-08-15NFS: Ensure we always dereference the page head lastTrond Myklebust1-5/+6
This fixes a race with nfs_page_group_sync_on_bit() whereby the call to wake_up_bit() in nfs_page_group_unlock() could occur after the page header had been freed. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-07-13NFS: Don't run wake_up_bit() when nobody is waiting...Trond Myklebust1-1/+16
"perf lock" shows fairly heavy contention for the bit waitqueue locks when doing an I/O heavy workload. Use a bit to tell whether or not there has been contention for a lock so that we can optimise away the bit waitqueue options in those cases. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13NFS: Fix initialization of nfs_page_array->npagesBenjamin Coddington1-0/+1
Commit 8ef9b0b9e1c0 open-coded nfs_pgarray_set(), and left out the initialization of the nfs_page_array's npages. This mistake didn't show up until testing with block layouts, and there shows that all pNFS reads return -EIO. Fixes: 8ef9b0b9e1c0 ("NFS: move nfs_pgarray_set() to open code") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Cc: stable@vger.kernel.org # 4.12 Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13NFS: Ensure we commit after writeback is completeTrond Myklebust1-0/+3
If the page cache is being flushed, then we want to ensure that we do start a commit once the pages are done being flushed. If we just wait until all I/O is done to that file, we can end up livelocking until the balance_dirty_pages() mechanism puts its foot down and forces I/O to stop. So instead we do more or less the same thing that O_DIRECT does, and set up a counter to tell us when the flush is done, Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-07-13NFS: Remove unused fields in the page I/O structuresTrond Myklebust1-2/+0
Remove the 'layout_private' fields that were only used by the pNFS OSD layout driver. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-04-21NFS: Add an iocounter wait function for async RPC tasksBenjamin Coddington1-1/+33
By sleeping on a new NFS Unlock-On-Close waitqueue, rpc tasks may wait for a lock context's iocounter to reach zero. The rpc waitqueue is only woken when the open_context has the NFS_CONTEXT_UNLOCK flag set in order to mitigate spurious wake-ups for any iocounter reaching zero. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20NFS: move rw_mode to nfs_pageio_headerBenjamin Coddington1-5/+3
Let's try to have it in a cacheline in nfs4_proc_pgio_rpc_prepare(). Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20NFS: move nfs_pgarray_set() to open codeBenjamin Coddington1-20/+14
Since commit 00bfa30abe86 ("NFS: Create a common pgio_alloc and pgio_release function"), nfs_pgarray_set() has only a single caller. Let's open code it. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20NFS: Use GFP_NOIO for two allocations in writebackBenjamin Coddington1-4/+11
Prevent a deadlock that can occur if we wait on allocations that try to write back our pages. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: 00bfa30abe869 ("NFS: Create a common pgio_alloc and pgio_release...") Cc: stable@vger.kernel.org # 3.16+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-04-20NFS: Fix missing pg_cleanup after nfs_pageio_cond_complete()Benjamin Coddington1-2/+4
Commit a7d42ddb3099727f58366fa006f850a219cce6c8 ("nfs: add mirroring support to pgio layer") moved pg_cleanup out of the path when there was non-sequental I/O that needed to be flushed. The result is that for layouts that have more than one layout segment per file, the pg_lseg is not cleared, so we can end up hitting the WARN_ON_ONCE(req_start >= seg_end) in pnfs_generic_pg_test since the pg_lseg will be pointing to that previously-flushed layout segment. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: a7d42ddb3099 ("nfs: add mirroring support to pgio layer") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-01NFS: discard nfs_lockowner structure.NeilBrown1-1/+1
It now has only one field and is only used in one structure. So replaced it in that structure by the field it contains. Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-01NFS: remove l_pid field from nfs_lockownerNeilBrown1-2/+1
this field is not used in any important way and probably should have been removed by Commit: 8003d3c4aaa5 ("nfs4: treat lock owners as opaque values") which removed the pid argument from nfs4_get_lock_state. Except in unusual and uninteresting cases, two threads with the same ->tgid will have the same ->files pointer, so keeping them both for comparison brings no benefit. Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-10-07mm: remove page_file_indexHuang Ying1-1/+1
After using the offset of the swap entry as the key of the swap cache, the page_index() becomes exactly same as page_file_index(). So the page_file_index() is removed and the callers are changed to use page_index() instead. Link: http://lkml.kernel.org/r/1473270649-27229-2-git-send-email-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: Anna Schumaker <anna.schumaker@netapp.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-17NFS: Add nfs_commit_file()Anna Schumaker1-2/+4
Copy will use this to set up a commit request for a generic range. I don't want to allocate a new pagecache entry for the file, so I needed to change parts of the commit path to handle requests with a null wb_page. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-04-04mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macrosKirill A. Shutemov1-3/+3
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-08NFS: Fix a compile warning about unused variable in nfs_generic_pg_pgios()Trond Myklebust1-3/+0
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-07NFS: Use wait_on_atomic_t() for unlock after readaheadBenjamin Coddington1-41/+7
The use of wait_on_atomic_t() for waiting on I/O to complete before unlocking allows us to git rid of the NFS_IO_INPROGRESS flag, and thus the nfs_iocounter's flags member, and finally the nfs_iocounter altogether. The count of I/O is moved to the lock context, and the counter increment/decrement functions become simple enough to open-code. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> [Trond: Fix up conflict with existing function nfs_wait_atomic_killable()] Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-01-04Merge branch 'pnfs_generic'Trond Myklebust1-6/+0
* pnfs_generic: NFSv4.1/pNFS: Cleanup constify struct pnfs_layout_range arguments NFSv4.1/pnfs: Cleanup copying of pnfs_layout_range structures NFSv4.1/pNFS: Cleanup pnfs_mark_matching_lsegs_invalid() NFSv4.1/pNFS: Fix a race in initiate_file_draining() NFSv4.1/pNFS: pnfs_error_mark_layout_for_return() must always return layout NFSv4.1/pNFS: pnfs_mark_matching_lsegs_return() should set the iomode NFSv4.1/pNFS: Use nfs4_stateid_copy for copying stateids NFSv4.1/pNFS: Don't pass stateids by value to pnfs_send_layoutreturn() NFS: Relax requirements in nfs_flush_incompatible NFSv4.1/pNFS: Don't queue up a new commit if the layout segment is invalid NFS: Allow multiple commit requests in flight per file NFS/pNFS: Fix up pNFS write reschedule layering violations and bugs NFSv4: List stateid information in the callback tracepoints NFSv4.1/pNFS: Don't return NFS4ERR_DELAY unnecessarily in CB_LAYOUTRECALL NFSv4.1/pNFS: Ensure we enforce RFC5661 Section 12.5.5.2.1 pNFS: If we have to delay the layout callback, mark the layout for return NFSv4.1/pNFS: Add a helper to mark the layout as returned pNFS: Ensure nfs4_layoutget_prepare returns the correct error
2015-12-31NFS: Relax requirements in nfs_flush_incompatibleTrond Myklebust1-6/+0
If two processes share the same credentials and NFSv4 open stateid, then allow them both to dirty the same page, even if their nfs_open_context differs. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28nfs: centralize pgio error cleanupPeng Tao1-24/+29
In case we fail during setting things up for read/write IO, set pg_error in IO descriptor and do the cleanup in nfs_pageio_add_request, where we clean up all pages that are still hanging around on the IO descriptor. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28nfs: clean up rest of reqs when failing to add onePeng Tao1-3/+14
If we fail to set up things before sending anything over wire, we need to clean up the reqs that are still attached to the IO descriptor. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-28NFS41: pop some layoutget errors to applicationPeng Tao1-1/+8
For ERESTARTSYS/EIO/EROFS/ENOSPC/E2BIG in layoutget, we should just bail out instead of hiding the error and retrying inband IO. Change all the call sites to pop the error all the way up. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-12-13sched/wait: Fix the signal handling fixPeter Zijlstra1-1/+1
Jan Stancek reported that I wrecked things for him by fixing things for Vladimir :/ His report was due to an UNINTERRUPTIBLE wait getting -EINTR, which should not be possible, however my previous patch made this possible by unconditionally checking signal_pending(). We cannot use current->state as was done previously, because the instruction after the store to that variable it can be changed. We must instead pass the initial state along and use that. Fixes: 68985633bccb ("sched/wait: Fix signal handling in bit wait helpers") Reported-by: Jan Stancek <jstancek@redhat.com> Reported-by: Chris Mason <clm@fb.com> Tested-by: Jan Stancek <jstancek@redhat.com> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Chris Mason <clm@fb.com> Reviewed-by: Paul Turner <pjt@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: tglx@linutronix.de Cc: Oleg Nesterov <oleg@redhat.com> Cc: hpa@zytor.com Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-17nfs: fix pg_test page count calculationPeng Tao1-1/+1
We really want sizeof(struct page *) instead. Otherwise we limit maximum IO size to 64 pages rather than 512 pages on a 64bit system. Fixes 2e11f829(nfs: cap request size to fit a kmalloced page array). Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Peng Tao <tao.peng@primarydata.com> Fixes: 2e11f8296d22 ("nfs: cap request size to fit a kmalloced page array") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-08-17NFS: nfs_set_pgio_error sometimes misses errorsTrond Myklebust1-2/+2
We should ensure that we always set the pgio_header's error field if a READ or WRITE RPC call returns an error. The current code depends on 'hdr->good_bytes' always being initialised to a large value, which is not always done correctly by callers. When this happens, applications may end up missing important errors. Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-07-27NFS: Don't clear desc->pg_moreio in nfs_do_recoalesce()Trond Myklebust1-2/+0
Recoalescing does not affect whether or not we've already sent off I/O, and doing so means that we end up sending a bunch of synchronous for cases where we actually need to be using unstable writes. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-07-27NFS: Fix a memory leak in nfs_do_recoalesceTrond Myklebust1-1/+4
If the function exits early, then we must put those requests that were not processed back onto the &mirror->pg_list so they can be cleaned up by nfs_pgio_error(). Fixes: a7d42ddb30997 ("nfs: add mirroring support to pgio layer") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-07-01nfs: Remove invalid tk_pid from debug messageKinglong Mee1-2/+1
Before rpc_run_task(), tk_pid is uninitiated as 0 always. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-06-18nfs: Fix comment for nfs_pageio_init() and nfs_pageio_complete_mirror()Yijing Wang1-1/+4
Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-06-10NFS: Remove unused nfs_rw_ops->rw_release() functionAnna Schumaker1-2/+0
This was only ever set to nfs_writeback_release_common(), a function which is completely empty. Let's just drop this function pointer and simplify the code a bit. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-04-15VFS: normal filesystems (and lustre): d_inode() annotationsDavid Howells1-1/+1
that's the bulk of filesystem drivers dealing with inodes of their own Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-11Merge tag 'nfs-for-3.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds1-49/+245
Pull NFS client updates from Trond Myklebust: "Highlights incluse: Features: - Removing the forced serialisation of open()/close() calls in NFSv4.x (x>0) makes for a significant performance improvement in metadata intensive workloads. - Full support for the pNFS "flexible files" layout type - Further RPC/RDMA client improvements from Chuck Bugfixes: - Stable fix: NFSv4.1 backchannel calls blocking operations with !TASK_RUNNING - Stable fix: pnfs_generic_pg_init_read/write can be called with lseg == NULL - Stable fix: Fix an Oopsable condition when nsm_mon_unmon is called as part of the namespace cleanup, - Stable fix: Ensure we reference the inode for return-on-close in delegreturn - Use SO_REUSEPORT to ensure that NFSv3 TCP connections can rebind to the same source address/port combination during a disconnect/ reconnect event. This is a requirement imposed by most NFSv3 server duplicate reply cache implementations. Optimisations: - Ask for no NFSv4.1 delegations on OPEN if using O_DIRECT Other: - Add Anna Schumaker as co-maintainer for the NFS client" * tag 'nfs-for-3.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (119 commits) SUNRPC: Cleanup to remove xs_tcp_close() pnfs: delete an unintended goto pnfs/flexfiles: Do not dprintk after the free SUNRPC: Fix stupid typo in xs_sock_set_reuseport SUNRPC: Define xs_tcp_fin_timeout only if CONFIG_SUNRPC_DEBUG SUNRPC: Handle connection reset more efficiently. SUNRPC: Remove the redundant XPRT_CONNECTION_CLOSE flag SUNRPC: Make xs_tcp_close() do a socket shutdown rather than a sock_release SUNRPC: Ensure xs_tcp_shutdown() requests a full close of the connection SUNRPC: Cleanup to remove remaining uses of XPRT_CONNECTION_ABORT SUNRPC: Remove TCP socket linger code SUNRPC: Remove TCP client connection reset hack SUNRPC: TCP/UDP always close the old socket before reconnecting SUNRPC: Add helpers to prevent socket create from racing SUNRPC: Ensure xs_reset_transport() resets the close connection flags SUNRPC: Do not clear the source port in xs_reset_transport SUNRPC: Handle EADDRINUSE on connect SUNRPC: Set SO_REUSEPORT socket option for TCP connections NFSv4.1: Fix pnfs_put_lseg races NFSv4.1: pnfs_send_layoutreturn should use GFP_NOFS ...
2015-02-03nfs: add nfs_pgio_current_mirror helperPeng Tao1-8/+17
Let it return current nfs_pgio_mirror in use depending on pg_mirror_count. For read, we always use pg_mirrors[0], so this effectively gives us freedom to use pg_mirror_idx to track the actual mirror to read from through out the IO stack. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <loghyr@primarydata.com>
2015-02-03nfs: only reset desc->pg_mirror_idx when mirroring is supportedPeng Tao1-4/+4
so that we don't reset desc->pg_mirror_idx for read unnecessarily. Remove WARN_ON_ONCE from __nfs_pageio_add_request to allow LD to set pg_mirror_idx for read where pg_mirror_count is always 1. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <loghyr@primarydata.com>
2015-02-03nfs: add mirroring support to pgio layerWeston Andros Adamson1-45/+225
This patch adds mirrored write support to the pgio layer. The default is to use one mirror, but pgio callers may define callbacks to change this to any value up to the (arbitrarily selected) limit of 16. The basic idea is to break out members of nfs_pageio_descriptor that cannot be shared between mirrored DSes and put them in a new structure. Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
2015-02-03nfs: introduce pg_cleanup op for pgio descriptorsWeston Andros Adamson1-1/+4
Add a new operation to nfs_pageio_ops that is called on nfs_pageio_complete. Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
2015-02-03nfs: allow to specify cred in nfs_initiate_pgioPeng Tao1-3/+5
so that flexfile layout client can pass in DS credential instead of using user cred, which will be done in the next patch. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
2015-02-03pnfs: Add nfs_rpc_ops in calls to nfs_initiate_pgioTom Haynes1-2/+4
Signed-off-by: Tom Haynes <loghyr@primarydata.com>
2015-01-16locks: convert posix locks to file_lock_contextJeff Layton1-5/+3
Signed-off-by: Jeff Layton <jlayton@primarydata.com> Acked-by: Christoph Hellwig <hch@lst.de>
2015-01-16locks: move flock locks to file_lock_contextJeff Layton1-0/+6
Signed-off-by: Jeff Layton <jlayton@primarydata.com> Acked-by: Christoph Hellwig <hch@lst.de>
2014-11-24NFS: fix subtle change in COMMIT behaviorWeston Andros Adamson1-3/+8
Recent work in the pgio layer made it possible for there to be more than one request per page. This caused a subtle change in commit behavior, because write.c:nfs_commit_unstable_pages compares the number of *pages* waiting for writeback against the number of requests on a commit list to choose when to send a COMMIT in a non-blocking flush. This is probably hard to hit in normal operation - you have to be using rsize/wsize < PAGE_SIZE, or pnfs with lots of boundaries that are not page aligned to have a noticeable change in behavior. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>