path: root/fs/nfs/flexfilelayout
Age | Commit message | Author | Files | Lines
2017-03-31 | Merge tag 'nfs-for-4.11-3' of git://git.linux-nfs.org/projects/anna/linux-nfs | Linus Torvalds | 1 | -0/+4
Pull NFS client fixes from Anna Schumaker:

"Here are a few more bugfixes that came in over the last couple of weeks. Most of these fix various hangs and loops that people found, but we also had a few error handling fixes.

Stable Bugfixes:
- fix infinite loop on BAD_STATEID error

Other Bugfixes:
- fix old dentry rehash after move
- fix pnfs GETDEVINFO hangs
- fix pnfs fallback to MDS on commit errors
- fix flexfiles kernel oops"

* tag 'nfs-for-4.11-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  nfs: flexfiles: fix kernel OOPS if MDS returns unsupported DS type
  NFSv4.1 fix infinite loop on IO BAD_STATEID error
  PNFS fix fallback to MDS if got error on commit to DS
  NFS filelayout:call GETDEVICEINFO after pnfs_layout_process completes
  NFS store nfs4_deviceid in struct nfs4_filelayout_segment
  NFS cleanup struct nfs4_filelayout_segment
  NFS: Fix old dentry rehash after move
2017-03-31 | nfs: flexfiles: fix kernel OOPS if MDS returns unsupported DS type | Tigran Mkrtchyan | 1 | -0/+4
This fix avoids dereferencing a mirror that is in an error state when the MDS returns an unsupported DS type (IOW, not v3), which causes the following oops:

[ 220.370709] BUG: unable to handle kernel NULL pointer dereference at 0000000000000065
[ 220.370842] IP: ff_layout_mirror_valid+0x2d/0x110 [nfs_layout_flexfiles]
[ 220.370920] PGD 0
[ 220.370972] Oops: 0000 [#1] SMP
[ 220.371013] Modules linked in: nfnetlink_queue nfnetlink_log bluetooth nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_raw ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security iptable_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_security ebtable_filter ebtables ip6table_filter ip6_tables binfmt_misc intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel btrfs kvm arc4 snd_hda_codec_hdmi iwldvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate mac80211 xor uvcvideo
[ 220.371814] videobuf2_vmalloc videobuf2_memops snd_hda_codec_idt mei_wdt videobuf2_v4l2 snd_hda_codec_generic iTCO_wdt ppdev videobuf2_core iTCO_vendor_support dell_rbtn dell_wmi iwlwifi sparse_keymap dell_laptop dell_smbios snd_hda_intel dcdbas videodev snd_hda_codec dell_smm_hwmon snd_hda_core media cfg80211 intel_uncore snd_hwdep raid6_pq snd_seq intel_rapl_perf snd_seq_device joydev i2c_i801 rfkill lpc_ich snd_pcm parport_pc mei_me parport snd_timer dell_smo8800 mei snd shpchp soundcore tpm_tis tpm_tis_core tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc i915 nouveau mxm_wmi ttm i2c_algo_bit drm_kms_helper crc32c_intel e1000e drm sdhci_pci firewire_ohci sdhci serio_raw mmc_core firewire_core ptp crc_itu_t pps_core wmi fjes video
[ 220.372568] CPU: 7 PID: 4988 Comm: cat Not tainted 4.10.5-200.fc25.x86_64 #1
[ 220.372647] Hardware name: Dell Inc. Latitude E6520/0J4TFW, BIOS A06 07/11/2011
[ 220.372729] task: ffff94791f6ea580 task.stack: ffffb72b88c0c000
[ 220.372802] RIP: 0010:ff_layout_mirror_valid+0x2d/0x110 [nfs_layout_flexfiles]
[ 220.372883] RSP: 0018:ffffb72b88c0f970 EFLAGS: 00010246
[ 220.372945] RAX: 0000000000000000 RBX: ffff9479015ca600 RCX: ffffffffffffffed
[ 220.373025] RDX: ffffffffffffffed RSI: ffff9479753dc980 RDI: 0000000000000000
[ 220.373104] RBP: ffffb72b88c0f988 R08: 000000000001c980 R09: ffffffffc0ea6112
[ 220.373184] R10: ffffef17477d9640 R11: ffff9479753dd6c0 R12: ffff9479211c7440
[ 220.373264] R13: ffff9478f45b7790 R14: 0000000000000001 R15: ffff9479015ca600
[ 220.373345] FS: 00007f555fa3e700(0000) GS:ffff9479753c0000(0000) knlGS:0000000000000000
[ 220.373435] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 220.373506] CR2: 0000000000000065 CR3: 0000000196044000 CR4: 00000000000406e0
[ 220.373586] Call Trace:
[ 220.373627] nfs4_ff_layout_prepare_ds+0x5e/0x200 [nfs_layout_flexfiles]
[ 220.373708] ff_layout_pg_init_read+0x81/0x160 [nfs_layout_flexfiles]
[ 220.373806] __nfs_pageio_add_request+0x11f/0x4a0 [nfs]
[ 220.373886] ? nfs_create_request.part.14+0x37/0x330 [nfs]
[ 220.373967] nfs_pageio_add_request+0xb2/0x260 [nfs]
[ 220.374042] readpage_async_filler+0xaf/0x280 [nfs]
[ 220.374103] read_cache_pages+0xef/0x1b0
[ 220.374166] ? nfs_read_completion+0x210/0x210 [nfs]
[ 220.374239] nfs_readpages+0x129/0x200 [nfs]
[ 220.374293] __do_page_cache_readahead+0x1d0/0x2f0
[ 220.374352] ondemand_readahead+0x17d/0x2a0
[ 220.374403] page_cache_sync_readahead+0x2e/0x50
[ 220.374460] generic_file_read_iter+0x6c8/0x950
[ 220.374532] ? nfs_mapping_need_revalidate_inode+0x17/0x40 [nfs]
[ 220.374617] nfs_file_read+0x6e/0xc0 [nfs]
[ 220.374670] __vfs_read+0xe2/0x150
[ 220.374715] vfs_read+0x96/0x130
[ 220.374758] SyS_read+0x55/0xc0
[ 220.374801] entry_SYSCALL_64_fastpath+0x1a/0xa9
[ 220.374856] RIP: 0033:0x7f555f570bd0
[ 220.374900] RSP: 002b:00007ffeb73e1b38 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 220.374986] RAX: ffffffffffffffda RBX: 00007f555f839ae0 RCX: 00007f555f570bd0
[ 220.375066] RDX: 0000000000020000 RSI: 00007f555fa41000 RDI: 0000000000000003
[ 220.375145] RBP: 0000000000021010 R08: ffffffffffffffff R09: 0000000000000000
[ 220.375226] R10: 00007f555fa40010 R11: 0000000000000246 R12: 0000000000022000
[ 220.375305] R13: 0000000000021010 R14: 0000000000001000 R15: 0000000000002710
[ 220.375386] Code: 66 66 90 55 48 89 e5 41 54 53 49 89 fc 48 83 ec 08 48 85 f6 74 2e 48 8b 4e 30 48 89 f3 48 81 f9 00 f0 ff ff 77 1e 48 85 c9 74 15 <48> 83 79 78 00 b8 01 00 00 00 74 2c 48 83 c4 08 5b 41 5c 5d c3
[ 220.375653] RIP: ff_layout_mirror_valid+0x2d/0x110 [nfs_layout_flexfiles] RSP: ffffb72b88c0f970
[ 220.375748] CR2: 0000000000000065
[ 220.403538] ---[ end trace bcdca752211b7da9 ]---

Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
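The register dump in this oops is consistent with an error-encoded pointer being dereferenced: RCX holds ffffffffffffffed (that is, -19), and the faulting instruction bytes `48 83 79 78 00` read a field at offset 0x78 from RCX, which lands on the CR2 address 0x65. The following is a minimal userspace sketch of that arithmetic, assuming kernel-style ERR_PTR encoding. The helper names are hypothetical stand-ins, not the functions changed by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095 /* same bound the kernel uses for error pointers */

/* Hypothetical userspace stand-in for the kernel's IS_ERR_OR_NULL(). */
static bool is_err_or_null(uint64_t ptr)
{
	return ptr == 0 || ptr >= (uint64_t)-MAX_ERRNO;
}

/* Address touched when a field at 'offset' is read through 'ptr'. */
static uint64_t field_address(uint64_t ptr, uint64_t offset)
{
	return ptr + offset; /* wraps modulo 2^64, as on x86_64 */
}
```

With the RCX value from the dump, `field_address(0xffffffffffffffedULL, 0x78)` yields 0x65, matching CR2, while `is_err_or_null()` on the same value returns true, which is the kind of check that would have rejected the pointer before the read.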
2017-03-17 | Merge tag 'nfs-for-4.11-2' of git://git.linux-nfs.org/projects/anna/linux-nfs | Linus Torvalds | 2 | -3/+16
Pull NFS client fixes from Anna Schumaker:

"We have a handful of stable fixes to fix kernel warnings and other bugs that have been around for a while. We've also found a few other reference counting bugs and memory leaks since the initial 4.11 pull.

Stable Bugfixes:
- Fix decrementing nrequests in NFS v4.2 COPY to fix kernel warnings
- Prevent a double free in async nfs4_exchange_id()
- Squelch a kbuild sparse complaint for xprtrdma

Other Bugfixes:
- Fix a typo (NFS_ATTR_FATTR_GROUP_NAME) that causes a memory leak
- Fix a reference leak that causes kernel warnings
- Make nfs4_cb_sv_ops static to fix a sparse warning
- Respect a server's max size in CREATE_SESSION
- Handle errors from nfs4_pnfs_ds_connect
- Flexfiles layout shouldn't mark devices as unavailable"

* tag 'nfs-for-4.11-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  pNFS/flexfiles: never nfs4_mark_deviceid_unavailable
  pNFS: return status from nfs4_pnfs_ds_connect
  NFSv4.1 respect server's max size in CREATE_SESSION
  NFS prevent double free in async nfs4_exchange_id
  nfs: make nfs4_cb_sv_ops static
  xprtrdma: Squelch kbuild sparse complaint
  NFS: fix the fault nrequests decreasing for nfs_inode COPY
  NFSv4: fix a reference leak caused WARNING messages
  nfs4: fix a typo of NFS_ATTR_FATTR_GROUP_NAME
2017-03-17 | pNFS/flexfiles: never nfs4_mark_deviceid_unavailable | Weston Andros Adamson | 2 | -2/+14
The flexfiles layout should never mark a device unavailable. Move nfs4_mark_deviceid_unavailable out of nfs4_pnfs_ds_connect and call directly from files layout where it's still needed. The flexfiles driver still handles marked devices in error paths, but will now print a rate limited warning. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-03-17 | pNFS: return status from nfs4_pnfs_ds_connect | Weston Andros Adamson | 1 | -1/+2
The nfs4_pnfs_ds_connect path can call rpc_create which can fail or it can wait on another context to reach the same failure. This checks that the rpc_create succeeded and returns the error to the caller. When an error is returned, both the files and flexfiles layouts will return NULL from _prepare_ds(). The flexfiles layout will also return the layout with the error NFS4ERR_NXIO. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-03-01 | Merge tag 'nfs-for-4.11-1' of git://git.linux-nfs.org/projects/anna/linux-nfs | Linus Torvalds | 1 | -39/+21
Pull NFS client updates from Anna Schumaker:

"Highlights include:

Stable bugfixes:
- NFSv4: Fix memory and state leak in _nfs4_open_and_get_state
- xprtrdma: Fix Read chunk padding
- xprtrdma: Per-connection pad optimization
- xprtrdma: Disable pad optimization by default
- xprtrdma: Reduce required number of send SGEs
- nlm: Ensure callback code also checks that the files match
- pNFS/flexfiles: If the layout is invalid, it must be updated before retrying
- NFSv4: Fix reboot recovery in copy offload
- Revert "NFSv4.1: Handle NFS4ERR_BADSESSION/NFS4ERR_DEADSESSION replies to OP_SEQUENCE"
- NFSv4: fix getacl head length estimation
- NFSv4: fix getacl ERANGE for some ACL buffer sizes

Features:
- Add and use dprintk_cont macros
- Various cleanups to NFS v4.x to reduce code duplication and complexity
- Remove unused cr_magic related code
- Improvements to sunrpc "read from buffer" code
- Clean up sunrpc timeout code and allow changing TCP timeout parameters
- Remove duplicate mw_list management code in xprtrdma
- Add generic functions for encoding and decoding xdr streams

Bugfixes:
- Clean up nfs_show_mountd_netid
- Make layoutreturn_ops static and use NULL instead of 0 to fix sparse warnings
- Properly handle -ERESTARTSYS in nfs_rename()
- Check if register_shrinker() failed during rpcauth_init()
- Properly clean up procfs/pipefs entries
- Various NFS over RDMA related fixes
- Silence uninitialized variable warning in sunrpc"

* tag 'nfs-for-4.11-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (64 commits)
  NFSv4: fix getacl ERANGE for some ACL buffer sizes
  NFSv4: fix getacl head length estimation
  Revert "NFSv4.1: Handle NFS4ERR_BADSESSION/NFS4ERR_DEADSESSION replies to OP_SEQUENCE"
  NFSv4: Fix reboot recovery in copy offload
  pNFS/flexfiles: If the layout is invalid, it must be updated before retrying
  NFSv4: Clean up owner/group attribute decode
  SUNRPC: Add a helper function xdr_stream_decode_string_dup()
  NFSv4: Remove bogus "struct nfs_client" argument from decode_ace()
  NFSv4: Fix the underestimation of delegation XDR space reservation
  NFSv4: Replace callback string decode function with a generic
  NFSv4: Replace the open coded decode_opaque_inline() with the new generic
  NFSv4: Replace ad-hoc xdr encode/decode helpers with xdr_stream_* generics
  SUNRPC: Add generic helpers for xdr_stream encode/decode
  sunrpc: silence uninitialized variable warning
  nlm: Ensure callback code also checks that the files match
  sunrpc: Allow xprt->ops->timer method to sleep
  xprtrdma: Refactor management of mw_list field
  xprtrdma: Handle stale connection rejection
  xprtrdma: Properly recover FRWRs with in-flight FASTREG WRs
  xprtrdma: Shrink send SGEs array
  ...
2017-02-27 | lib/vsprintf.c: remove %Z support | Alexey Dobriyan | 1 | -2/+2
Now that %z is standardised in C99 there is no reason to support %Z. Unlike %L it doesn't even make format strings smaller. Use BUILD_BUG_ON in a couple ATM drivers. In case anyone didn't notice, lib/vsprintf.o is about half of SLUB, which in my opinion is quite an achievement. Hopefully this patch inspires someone else to trim vsprintf.c more. Link: http://lkml.kernel.org/r/20170103230126.GA30170@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
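For illustration, the C99 `z` length modifier that replaces `%Z` can be exercised in userspace as below. This is a minimal sketch using the standard C library rather than the kernel's vsprintf implementation, and `format_size()` is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdio.h>

/* Format a size_t value using the C99 'z' length modifier. */
static int format_size(char *buf, size_t buflen, size_t value)
{
	return snprintf(buf, buflen, "%zu", value);
}
```

A call such as `format_size(buf, sizeof(buf), sizeof(long))` writes the decimal size without needing a cast to `unsigned long`.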
2017-02-22 | pNFS/flexfiles: If the layout is invalid, it must be updated before retrying | Trond Myklebust | 1 | -6/+7
If we see that our pNFS READ/WRITE/COMMIT operation failed, but we also see that our layout segment is no longer valid, then we need to get a new layout segment before retrying. Fixes: 90816d1ddacf ("NFSv4.1/flexfiles: Don't mark the entire deviceid...") Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-02-21 | NFSv4: Replace ad-hoc xdr encode/decode helpers with xdr_stream_* generics | Trond Myklebust | 1 | -4/+1
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-01-30 | pNFS/flexfiles: Make local symbol layoutreturn_ops static | Wei Yongjun | 1 | -1/+1
Fixes the following sparse warning: fs/nfs/flexfilelayout/flexfilelayout.c:2114:34: warning: symbol 'layoutreturn_ops' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2017-01-30 | NFS: Use nfs4_setup_sequence() everywhere | Anna Schumaker | 1 | -28/+12
This does the right thing depending on if we have a session, rather than needing to handle this manually in multiple places. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-12-25 | ktime: Get rid of ktime_equal() | Thomas Gleixner | 1 | -1/+1
No point in going through loops and hoops instead of just comparing the values. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
2016-12-25 | ktime: Get rid of the union | Thomas Gleixner | 1 | -2/+1
ktime is a union because the initial implementation stored the time in scalar nanoseconds on 64-bit machines and in an endianness-optimized timespec variant on 32-bit machines. The Y2038 cleanup removed the timespec variant and switched everything to scalar nanoseconds. The union remained, but became completely pointless. Get rid of the union and just keep ktime_t as a simple typedef of type s64. The conversion was done with coccinelle and some manual mopping up. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
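The two ktime commits above amount to treating a timestamp as a plain signed 64-bit nanosecond count, so equality becomes an ordinary comparison. A minimal userspace sketch, assuming `int64_t` in place of the kernel's `s64`; the `ktime_sketch_*` names are hypothetical and are not the kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for ktime_t as a scalar nanosecond count. */
typedef int64_t ktime_sketch_t;

/* What a ktime_equal()-style helper reduces to once the union is gone. */
static bool ktime_sketch_same(ktime_sketch_t a, ktime_sketch_t b)
{
	return a == b;
}

/* Arithmetic is likewise plain integer arithmetic. */
static ktime_sketch_t ktime_sketch_add_ns(ktime_sketch_t kt, int64_t ns)
{
	return kt + ns;
}
```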
2016-12-19 | pNFS/flexfiles: delete deviceid, don't mark inactive | Weston Andros Adamson | 2 | -3/+5
Instead of marking a device inactive, remove it from the cache entirely. Flexfiles has a way to report errors back to the server, so we don't want to stop devices from being tried again for 120 seconds. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-09 | pNFS/flexfiles: Ensure we have enough buffer for layoutreturn | Trond Myklebust | 2 | -6/+27
The flexfiles client can piggyback both layout errors and layoutstats as part of the layoutreturn. Both these payloads can get large, with 20 layout error entries taking up about 1.2K, and 4 layoutstats entries taking up another 1K. This patch allows a maximum payload of 4k by allocating a full page. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
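The sizing argument can be checked with a little arithmetic. The sketch below assumes per-entry sizes derived by simple division from the figures in the commit message (about 1.2K / 20 = 60 bytes per layout error entry and about 1K / 4 = 256 bytes per layoutstats entry) and a 4096-byte page. All names and constants are hypothetical approximations, not values taken from the kernel source.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_BYTES   4096u /* assumed page size */
#define SKETCH_IOERR_BYTES  60u   /* approx. 1.2K / 20 entries */
#define SKETCH_IOSTAT_BYTES 256u  /* approx. 1K / 4 entries */

/* Estimated layoutreturn payload for the given entry counts. */
static size_t layoutreturn_payload_bytes(size_t nr_ioerr, size_t nr_iostat)
{
	return nr_ioerr * SKETCH_IOERR_BYTES + nr_iostat * SKETCH_IOSTAT_BYTES;
}

/* True if the estimated payload fits in one page. */
static bool payload_fits_in_page(size_t nr_ioerr, size_t nr_iostat)
{
	return layoutreturn_payload_bytes(nr_ioerr, nr_iostat) <= SKETCH_PAGE_BYTES;
}
```

Under these assumptions, 20 error entries plus 4 layoutstats entries come to roughly 2.2K, which leaves headroom within a single page.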
2016-12-09 | pNFS/flexfiles: Remove a redundant parameter in ff_layout_encode_ioerr() | Trond Myklebust | 1 | -6/+4
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-08 | pNFS/flexfiles: Fix a deadlock on LAYOUTGET | Fred Isaman | 3 | -46/+43
We encountered a deadlock where the SEQUENCE that accompanied the LAYOUTGET triggered a session drain, while ff_layout_alloc_lseg triggered a GETDEVICEINFO. The GETDEVICEINFO hung waiting for the session drain, while the LAYOUTGET held the slot waiting for alloc_lseg to finish. Avoid this by moving the call to nfs4_find_get_deviceid out of ff_layout_alloc_lseg and into nfs4_ff_layout_prepare_ds. Signed-off-by: Fred Isaman <fred.isaman@gmail.com> [dros@primarydata.com: pNFS/flexfiles: fix races in ff_layout_mirror_valid] Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-07 | pNFS/flexfiles: Fix ff_layout_add_ds_error_locked() | Trond Myklebust | 1 | -1/+2
When we're merging an old entry into our new entry, we want to ensure that we add the list entry in the correct place. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-03 | pNFS/flexfiles: Support sending layoutstats in layoutreturn | Trond Myklebust | 2 | -6/+79
Add the ability to send an array of layoutstats entries as part of layoutreturn. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-03 | pNFS/flexfiles: Minor refactoring before adding iostats to layoutreturn | Trond Myklebust | 1 | -25/+34
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-03 | NFS: Fix up read of mirror stats | Trond Myklebust | 1 | -0/+2
Need to lock while reading in order to ensure 64-bit reads are correct. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-03 | pNFS/flexfiles: Clean up layoutstats | Trond Myklebust | 1 | -20/+7
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-03 | pNFS/flexfiles: Refactor encoding of the layoutreturn payload | Trond Myklebust | 3 | -31/+124
Add the layout error payload to the flexfiles layoutreturn private data, and set up the encoding mechanisms. This is a refactoring in preparation for adding the layout iostats payload. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS/flexfiles: Only send layoutstats updates for mirrors that were updated | Trond Myklebust | 2 | -0/+9
If there have been no reads or writes to a given mirror since the last layoutstats update, then don't resend the same data. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-02 | pNFS/flexfiles: Don't attempt to send layoutstats if there are no entries | Trond Myklebust | 1 | -0/+5
If the list of mirrors is empty, then don't send an RPC call. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
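The two layoutstats commits above describe a filter followed by an early exit: report only mirrors that saw I/O since the last update, and skip the RPC when nothing is left to report. A minimal sketch of that pattern, with a hypothetical per-mirror `updated` flag and hypothetical function names; the real driver's data structures differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-mirror state: a flag set on each read or write. */
struct mirror_sketch {
	bool updated;
	unsigned long long bytes_read;
};

/*
 * Collect the indices of mirrors that changed since the last report
 * and clear their flags. Returns the number of entries collected.
 */
static size_t collect_updated_mirrors(struct mirror_sketch *mirrors,
				      size_t nr, size_t *out_idx)
{
	size_t i, count = 0;

	for (i = 0; i < nr; i++) {
		if (!mirrors[i].updated)
			continue;
		mirrors[i].updated = false;
		out_idx[count++] = i;
	}
	return count;
}

/* An empty report means no RPC call should be sent at all. */
static bool should_send_layoutstats(size_t nr_entries)
{
	return nr_entries > 0;
}
```

Calling `collect_updated_mirrors()` twice in a row without intervening I/O returns zero the second time, so `should_send_layoutstats()` suppresses the redundant report.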
2016-12-01 | NFS: Remove unused authflavour parameter from nfs_get_client() | Anna Schumaker | 1 | -2/+1
This parameter hasn't been used since f8407299 (Linux 3.11-rc2), so let's remove it from this function and callers. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-01 | pNFS: Get rid of unnecessary layout parameter in encode_layoutreturn callback | Trond Myklebust | 1 | -2/+2
The parameter is already present in the "args" structure. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-01 | pNFS: consolidate the different range intersection tests | Trond Myklebust | 1 | -25/+8
Both pnfs.c and the flexfiles code have their own versions of the range intersection testing, and the "end_offset" helper. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
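A range intersection test with an "end_offset" helper is typically written so that a length that overflows the offset space means "to end of file". The sketch below shows one such formulation in userspace C. The function names are hypothetical, and it is modeled on the general idea rather than copied from pnfs.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* End of [start, start + len), saturating at UINT64_MAX on overflow. */
static uint64_t end_offset(uint64_t start, uint64_t len)
{
	uint64_t end = start + len;

	return end >= start ? end : UINT64_MAX;
}

/* True if the half-open ranges [s1, s1+l1) and [s2, s2+l2) overlap. */
static bool ranges_intersect(uint64_t s1, uint64_t l1,
			     uint64_t s2, uint64_t l2)
{
	uint64_t e1 = end_offset(s1, l1);
	uint64_t e2 = end_offset(s2, l2);

	return (e1 == UINT64_MAX || e1 > s2) &&
	       (e2 == UINT64_MAX || e2 > s1);
}
```

Adjacent ranges such as [0, 10) and [10, 20) do not intersect under this definition, while a range with length `UINT64_MAX` overlaps anything that ends after its start.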
2016-12-01 | pNFS: Fix a deadlock between read resends and layoutreturn | Trond Myklebust | 1 | -0/+4
We must not call nfs_pageio_init_read() on a new nfs_pageio_descriptor while holding a reference to a layout segment, as that can deadlock pnfs_update_layout(). Fixes: d67ae825a59d6 ("pnfs/flexfiles: Add the FlexFile Layout Driver") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org # v4.0+
2016-09-27 | NFSv4.x: Allow callers of nfs_remove_bad_delegation() to specify a stateid | Trond Myklebust | 1 | -1/+1
Allow the callers of nfs_remove_bad_delegation() to specify the stateid that needs to be marked as bad. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-08-29 | pNFS/flexfiles: Fix an Oopsable condition when connection to the DS fails | Trond Myklebust | 2 | -28/+28
If the attempt to connect to a DS fails inside ff_layout_pg_init_read or ff_layout_pg_init_write, then we currently end up clearing the layout segment carried by the struct nfs_pageio_descriptor, causing an Oops when we later call into ff_layout_read_pagelist/ff_layout_write_pagelist. The fix is to ensure we return the layout and then retry. Fixes: 446ca2195303 ("pNFS/flexfiles: When initing reads or writes, we...") Cc: stable@vger.kernel.org # v4.7+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-08-16 | pNFS/flexfiles: Set reasonable default retrans values for the data channel | Trond Myklebust | 1 | -2/+2
Prior to this patch, the retrans value was set at 5, meaning that we could see a maximum retransmission timeout value of more than 6 minutes. That's a tad high for NFSv3 where the protocol does allow the server to drop requests at any time. Since this is a data channel, let's just set retrans to 0, and the default timeout to 60s. The user can continue to adjust these defaults using the dataserver_retrans and dataserver_timeo module parameters. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
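To see how the retrans count drives the worst-case wait, here is a rough sketch. It assumes a constant per-attempt timeout with no backoff and one initial attempt plus `retrans` retransmissions; the actual SUNRPC timeout calculation may differ, and the function name is hypothetical.

```c
/*
 * Worst-case wait before giving up: the initial attempt plus
 * 'retrans' retransmissions, each waiting 'timeo_seconds'.
 * Assumes no backoff between attempts.
 */
static unsigned int worst_case_wait_seconds(unsigned int timeo_seconds,
					    unsigned int retrans)
{
	return timeo_seconds * (retrans + 1);
}
```

Under these assumptions, a 60 s timeout with retrans set to 5 gives 360 s (6 minutes) as a lower bound, and any backoff only lengthens it. With retrans set to 0 the wait is a single 60 s timeout.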
2016-08-14 | pNFS/flexfiles: Fix layoutstat periodic reporting | Trond Myklebust | 2 | -5/+5
Putting the periodicity timer in the mirror instances is causing non-scalable reporting behaviour and missed reporting intervals. When you recall layouts and/or implement client side mirroring, it leads to consecutive reports with only a few ms between RPC calls. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Fixes: d0379a5d066a9 ("pNFS/flexfiles: Support server-supplied...")
2016-07-05 | pNFS: Files and flexfiles always need to commit before layoutcommit | Trond Myklebust | 1 | -2/+5
So ensure that we mark the layout for commit once the write is done, and then ensure that the commit to ds is finished before sending layoutcommit. Note that by doing this, we're able to optimise away the commit for the case of servers that don't need layoutcommit in order to return updated attributes. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 | pNFS/flexfiles: Clean up calls to pnfs_set_layoutcommit() | Trond Myklebust | 1 | -9/+10
Let's just have one place where we check ff_layout_need_layoutcommit(). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 | pNFS/flexfiles: Fix layoutcommit after a commit to DS | Trond Myklebust | 1 | -2/+1
We should always do a layoutcommit after commit to DS, except if the layout segment we're using has set FF_FLAGS_NO_LAYOUTCOMMIT. Fixes: d67ae825a59d ("pnfs/flexfiles: Add the FlexFile Layout Driver") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-05-26 | pnfs: pnfs_update_layout needs to consider if strict iomode checking is on | Tom Haynes | 1 | -13/+36
As flexfiles has FF_FLAGS_NO_READ_IO, there is a need to generically support enforcing that an IOMODE_RW segment will not allow READ I/O. Signed-off-by: Tom Haynes <loghyr@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-26 | nfs/flexfiles: Use the layout segment for reading unless it a IOMODE_RW and reading is disabled | Tom Haynes | 1 | -2/+3
Signed-off-by: Tom Haynes <loghyr@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-25 | nfs/flexfiles: Helper function to detect FF_FLAGS_NO_READ_IO | Tom Haynes | 2 | -1/+16
The MDS can inform the client not to use the IOMODE_RW layout segment for doing READs. I.e., it is basically an IOMODE_WRITE layout segment. It would do this to not interfere with the WRITEs. Signed-off-by: Tom Haynes <loghyr@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
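The rule described here can be expressed as a small predicate: a segment may serve READs unless it is an IOMODE_RW segment whose flags include FF_FLAGS_NO_READ_IO. The sketch below is a userspace illustration with hypothetical names. The numeric values (READ = 1, RW = 2, flag = 0x4) follow the NFSv4.1 and flexfiles protocol definitions as I understand them and should be treated as assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed protocol values; the SKETCH_ names are hypothetical. */
enum sketch_iomode {
	SKETCH_IOMODE_READ = 1,
	SKETCH_IOMODE_RW = 2,
};

#define SKETCH_FF_FLAGS_NO_READ_IO 0x4u

/* True if a layout segment with this iomode and flags may serve READs. */
static bool lseg_usable_for_read(enum sketch_iomode iomode, uint32_t ff_flags)
{
	return !(iomode == SKETCH_IOMODE_RW &&
		 (ff_flags & SKETCH_FF_FLAGS_NO_READ_IO));
}
```

A READ segment is always usable for reads in this sketch; only an RW segment carrying the flag is excluded.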
2016-05-17 | flexfiles: add kerneldoc header to nfs4_ff_layout_prepare_ds | Jeff Layton | 1 | -1/+17
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 | flexfiles: remove pointless setting of NFS_LAYOUT_RETURN_REQUESTED | Jeff Layton | 2 | -9/+1
Setting just the NFS_LAYOUT_RETURN_REQUESTED flag doesn't do anything, unless there are lsegs that are also being marked for return. At the point where that happens this flag is also set, so these set_bit calls don't do anything useful. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 | pnfs: don't merge new ff lsegs with ones that have LAYOUTRETURN bit set | Jeff Layton | 1 | -2/+2
Otherwise, we'll end up returning layouts that we've just received if the client issues a new LAYOUTGET prior to the LAYOUTRETURN. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 | pNFS/flexfiles: When initing reads or writes, we might have to retry connecting to DSes | Tom Haynes | 1 | -4/+25
If we are initializing reads or writes and can not connect to a DS, then check whether or not IO is allowed through the MDS. If it is allowed, reset to the MDS. Else, fail the layout segment and force a retry of a new layout segment. Signed-off-by: Tom Haynes <loghyr@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
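The decision described in this commit has three outcomes, which can be sketched as follows. The enum, the function name, and the use of FF_FLAGS_NO_IO_THRU_MDS (assumed value 0x2) as the test for "IO is allowed through the MDS" are illustrative assumptions rather than the driver's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_FF_FLAGS_NO_IO_THRU_MDS 0x2u /* assumed protocol value */

/* Hypothetical outcomes when initializing a read or write. */
enum sketch_ds_action {
	SKETCH_USE_DS,           /* DS connection is up: do pNFS I/O */
	SKETCH_RESET_TO_MDS,     /* fall back to I/O through the MDS */
	SKETCH_RETRY_NEW_LAYOUT, /* fail the lseg and fetch a new one */
};

static enum sketch_ds_action on_ds_connect(bool connected, uint32_t ff_flags)
{
	if (connected)
		return SKETCH_USE_DS;
	if (!(ff_flags & SKETCH_FF_FLAGS_NO_IO_THRU_MDS))
		return SKETCH_RESET_TO_MDS;
	return SKETCH_RETRY_NEW_LAYOUT;
}
```

The point of the third branch is that a layout forbidding MDS I/O leaves no fallback path, so the only safe option is to discard the segment and ask for a fresh layout.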
2016-05-17 | pNFS/flexfiles: When checking for available DSes, conditionally check for MDS io | Tom Haynes | 3 | -3/+9
Whenever we check to see if we have the needed number of DSes for the action, we may also have to check to see whether IO is allowed to go to the MDS or not. [jlayton: fix merge conflict due to lack of localio patches here] Signed-off-by: Tom Haynes <loghyr@primarydata.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 | pNFS/flexfile: Fix erroneous fall back to read/write through the MDS | Trond Myklebust | 1 | -17/+6
This patch fixes a problem whereby the pNFS client falls back to doing reads and writes through the metadata server even when the layout flag FF_FLAGS_NO_IO_THRU_MDS is set. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-17 | NFSv4: Label stateids with the type | Trond Myklebust | 2 | -4/+6
In order to more easily distinguish what kind of stateid we are dealing with, introduce a type that can be used to label the stateid structure. The label will be useful both for debugging and when dealing with operations like SETATTR, READ and WRITE that can take several different types of stateid as arguments. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-09 | nfs: have flexfiles mirror keep creds for both ro and rw layouts | Jeff Layton | 3 | -10/+30
A mirror can be shared between multiple layouts, even with different iomodes. That makes stats gathering simpler, but it causes a problem when we get different creds in READ vs. RW layouts. The current code drops the newer credentials onto the floor when this occurs. That's problematic when you fetch a READ layout first, and then a RW. If the READ layout doesn't have the correct creds to do a write, then writes will fail. We could just overwrite the READ credentials with the RW ones, but that would break the ability for the server to fence the layout for reads if things go awry. We need to be able to revert to the earlier READ creds if the RW layout is returned afterward. The simplest fix is to just keep two sets of creds per mirror. One for READ layouts and one for RW, and then use the appropriate set depending on the iomode of the layout segment. Also fix up some RCU nits that sparse found. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
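The "two sets of creds per mirror" approach can be pictured as a pair of pointers with a selector keyed on the layout segment's iomode. This is a minimal sketch with hypothetical struct and function names and an assumed iomode encoding (READ = 1, RW = 2). Reference counting and RCU, which the real code needs, are deliberately left out.

```c
#include <stddef.h>

enum sketch_iomode {
	SKETCH_IOMODE_READ = 1,
	SKETCH_IOMODE_RW = 2,
};

/* Hypothetical credential: just the identity used on the wire. */
struct cred_sketch {
	unsigned int uid;
	unsigned int gid;
};

/* One slot per iomode, so a later RW layout never clobbers READ creds. */
struct mirror_creds_sketch {
	const struct cred_sketch *ro_cred;
	const struct cred_sketch *rw_cred;
};

/* Pick the credential that matches the layout segment's iomode. */
static const struct cred_sketch *
mirror_cred_for(const struct mirror_creds_sketch *m, enum sketch_iomode iomode)
{
	return iomode == SKETCH_IOMODE_RW ? m->rw_cred : m->ro_cred;
}
```

Because the READ slot is kept intact, returning the RW layout leaves the earlier READ credentials available, which preserves the server's ability to fence reads separately.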
2016-05-09 | nfs: get a reference to the credential in ff_layout_alloc_lseg | Jeff Layton | 3 | -54/+36
We're just as likely to have allocation problems here as we would if we delay looking up the credential like we currently do. Fix the code to get a rpc_cred reference early, as soon as the mirror is set up. This allows us to eliminate the mirror early if there is a problem getting an rpc credential. This also allows us to drop the uid/gid from the layout_mirror struct as well. In the event that we find an existing mirror where this one would go, we swap in the new creds unconditionally, and drop the reference to the old one. Note that the old ff_layout_update_mirror_cred function wouldn't set this pointer unless the DS version was 3, but we don't know what the DS version is at this point. I'm a little unclear on why it did that as you still need creds to talk to v4 servers as well. I have the code set it regardless of the DS version here. Also note the change to using generic creds instead of calling lookup_cred directly. With that change, we also need to populate the group_info pointer in the acred as some functions expect that to never be NULL. Instead of allocating one every time however, we can allocate one when the module is loaded and share it since the group_info is refcounted. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-09 | nfs: have ff_layout_get_ds_cred take a reference to the cred | Jeff Layton | 2 | -10/+35
In later patches, we're going to want to allow the creds to be updated when we get a new layout with updated creds. Have this function take a reference to the cred that is later put once the call has been dispatched. Also, prepare for this change by ensuring we follow RCU rules when getting a reference to the cred as well. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-05-09 | nfs: don't call nfs4_ff_layout_prepare_ds from ff_layout_get_ds_cred | Jeff Layton | 1 | -5/+1
All the callers already call that function before calling into here, so it ends up being a no-op anyway. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>