path: root/fs
Age | Commit message | Author | Files | Lines
2011-08-01 | Btrfs: remove a BUG_ON() in btrfs_commit_transaction() | Li Zefan | 1 | -4/+2
wait_for_commit() always returns 0. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-08-01 | Btrfs: use wait_event() | Li Zefan | 1 | -52/+7
Use wait_event() when possible to avoid code duplication. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-08-01 | Btrfs: check the nodatasum flag when writing compressed files | Li Zefan | 1 | -4/+10
When mounted with the nodatasum option, we don't checksum file data for general or direct-io writes, and the same rule should apply when writing compressed files. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-08-01 | Btrfs: copy string correctly in INO_LOOKUP ioctl | Li Zefan | 1 | -2/+1
Memory areas [ptr, ptr+total_len] and [name, name+total_len] may overlap, so it's wrong to use memcpy(). Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
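The overlap problem described in this message can be illustrated in user space. The following is a minimal sketch rather than the btrfs code: `shift_name_left` is a hypothetical helper, and it only shows why `memmove()` is the safe call when the source and destination ranges share one buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from the btrfs source): move `len` bytes
 * that start at buf + offset down to the start of buf. The two ranges
 * overlap whenever offset < len, so memcpy() would be undefined behaviour
 * here; memmove() is specified to handle overlapping ranges.
 */
static void shift_name_left(char *buf, size_t offset, size_t len)
{
    assert(buf != NULL);
    memmove(buf, buf + offset, len);
}
```

A caller that has built a path at the tail of a buffer could use such a helper to slide it to the front, passing the string length plus one so that the terminating NUL moves as well.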
2011-08-01 | Btrfs: don't print the leaf if we had an error | Josef Bacik | 1 | -1/+3
In __btrfs_free_extent we print the leaf if we fail to find the extent we wanted. The problem is that if we get an error we won't have a leaf, so this often leads to a NULL pointer dereference and we lose the error that actually occurred. So only print the leaf if ret > 0, which means we didn't find the item we were looking for but didn't hit an error either. This way the error is preserved. Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-08-01 | btrfs: make btrfs_set_root_node void | Mark Fasheh | 2 | -5/+4
This is fairly trivial: btrfs_set_root_node() always returns zero, so we can just make it void. All callers ignore the return code now anyway. I also checked that none of the functions that btrfs_set_root_node() calls returns an error that we might have needed to catch and pass back. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-08-01 | Btrfs: fix oops while writing data to SSD partitions | liubo | 1 | -1/+3
I have a btrfs filesystem on two SSD partitions, set by default to "data=raid0, metadata=raid1". When I fill the filesystem until "No space left on device" via "dd if=/dev/zero of=/mnt/btrfs/tmp", I get an oops from kernel BUG at fs/btrfs/extent-tree.c:5199!, which refers to find_free_extent's BUG_ON(index != get_block_group_index(block_group)); In SSD mode, in order to find enough space to allocate, we may recheck a cached block group that was checked earlier, but the index is not updated, so it hits the BUG_ON. Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Acked-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-08-01 | Btrfs: Protect the readonly flag of block group | WuBo | 1 | -3/+7
Access to the ro flag in btrfs_block_group_cache should be protected because of the racy locking in relocation. Signed-off-by: Wu Bo <wu.bo@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-08-01 | btrfs: Make extent-io callbacks that never fail return void | Jeff Mahoney | 3 | -62/+34
The set/clear bit and the extent split/merge hooks only ever return 0. Changing them to return void simplifies the error handling cases later. This patch changes the hook prototypes, the single implementation of each, and the functions that call them to return void instead. Since all four of these hooks execute under a spinlock, they're necessarily simple. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-08-01 | Btrfs: fix readahead in file defrag | Li Zefan | 2 | -16/+8
We passed the wrong value to btrfs_force_ra(). Fix this by changing the argument of btrfs_force_ra() from last_index to nr_page. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-08-01 | Btrfs: return error to caller when btrfs_unlink() failes | Tsutomu Itoh | 2 | -4/+9
When btrfs_unlink_inode() or btrfs_orphan_add() fails in btrfs_unlink(), return the error code to the caller instead of calling BUG_ON(). Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-08-01 | Btrfs:don't check the return value of __btrfs_add_inode_defrag | Wanlong Gao | 1 | -6/+5
There is no need to check the return value of __btrfs_add_inode_defrag(), since it always returns 0. Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-08-01 | Merge branch 'alloc_path' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/btrfs-error-handling into for-linus | Chris Mason | 6 | -29/+78
2011-08-01 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 | Linus Torvalds | 9 | -376/+370
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  CIFS: Cleanup demupltiplex thread exiting code
  CIFS: Move mid search to a separate function
  CIFS: Move RFC1002 check to a separate function
  CIFS: Simplify socket reading in demultiplex thread
  CIFS: Move buffer allocation to a separate function
  cifs: remove unneeded variable initialization in cifs_reconnect_tcon
  cifs: simplify refcounting for oplock breaks
  cifs: fix compiler warning in CIFSSMBQAllEAs
  cifs: fix name parsing in CIFSSMBQAllEAs
  cifs: don't start signing too early
  cifs: trivial: goto out here is unnecessary
  cifs: advertise the right receive buffer size to the server
2011-08-01 | CIFS: Cleanup demupltiplex thread exiting code | Pavel Shilovsky | 1 | -77/+96
Reviewed-and-Tested-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-08-01 | CIFS: Move mid search to a separate function | Pavel Shilovsky | 1 | -65/+61
Reviewed-and-Tested-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-08-01 | CIFS: Move RFC1002 check to a separate function | Pavel Shilovsky | 1 | -49/+67
Reviewed-and-Tested-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-08-01 | CIFS: Simplify socket reading in demultiplex thread | Pavel Shilovsky | 1 | -83/+71
Move reading to a separate function and remove the csocket variable. Also change the semantics slightly: goto incomplete_rcv only when we get -EAGAIN (or a similar error) while reading the rfc1002 header. In this case we don't check for echo timeout when we don't get the whole header at once, as it was before. Reviewed-and-Tested-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-08-01 | ext4: use ext4_kvzalloc()/ext4_kvmalloc() for s_group_desc and s_group_info | Theodore Ts'o | 3 | -13/+15
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-08-01 | ext4: introduce ext4_kvmalloc(), ext4_kzalloc(), and ext4_kvfree() | Theodore Ts'o | 2 | -18/+39
Introduce new helper functions which try kmalloc, and then fall back to vmalloc if necessary, and use them for allocating and deallocating s_flex_groups. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-08-01 | CIFS: Move buffer allocation to a separate function | Pavel Shilovsky | 1 | -37/+55
Reviewed-and-Tested-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-08-01 | ext4: use the correct error exit path in ext4_init_inode_table() | Yongqiang Yang | 1 | -1/+1
This patch makes ext4_init_inode_table() handle errors correctly. ext4_init_inode_table() should down_write() alloc_sem which has been up_write()ed and stop the started journal handle. Signed-off-by: Yongqiang Yang <xiaoqiangnk@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-08-01 | xfs: Fix build breakage in xfs_iops.c when CONFIG_FS_POSIX_ACL is not set | Markus Trippelsdorf | 1 | -1/+4
Commit 4e34e719e45, which takes the ACL checks to common code, accidentally broke the build when CONFIG_FS_POSIX_ACL is not set: CC fs/xfs/linux-2.6/xfs_iops.o fs/xfs/linux-2.6/xfs_iops.c:1025:14: error: ‘xfs_get_acl’ undeclared here (not in a function) Fix this by declaring xfs_get_acl a static inline function. Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-08-01 | VFS: Reorganise shrink_dcache_for_umount_subtree() after demise of dcache_lock | David Howells | 1 | -17/+5
Reorganise shrink_dcache_for_umount_subtree() in light of the demise of dcache_lock. Without that dcache_lock, there is no need for the batching of removal of dentries from the system under it (we wanted to make intensive use of the locked data whilst we held it, but didn't want to hold it for long at a time). This works, provided the preceding patch is correct in its removal of locking on dentry->d_lock on the basis that no one should be locking these dentries any more as the whole superblock is defunct. With this patch, the calls to dentry_lru_del() and __d_shrink() are placed at the point where each dentry is detached handled. It is possible that, as an alternative, the batching should still be done - but only for dentry_lru_del() of all a dentry's children in one go. In such a case, the batching would be done under dcache_lru_lock. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-08-01 | VFS: Remove dentry->d_lock locking from shrink_dcache_for_umount_subtree() | David Howells | 1 | -24/+24
Locks of the dcache_lock were replaced by locks of dentry->d_lock in commits such as: 2304450783dfde7b0b94ae234edd0dbffa865073 2fd6b7f50797f2e993eea59e0a0b8c6399c811dc as part of the RCU-based pathwalk changes, despite the fact that the caller (shrink_dcache_for_umount()) notes in the banner comment the reasons that d_lock is not necessary in these functions: /* * destroy the dentries attached to a superblock on unmounting * - we don't need to use dentry->d_lock because: * - the superblock is detached from all mountings and open files, so the * dentry trees will not be rearranged by the VFS * - s_umount is write-locked, so the memory pressure shrinker will ignore * any dentries belonging to this superblock that it comes across * - the filesystem itself is no longer permitted to rearrange the dentries * in this superblock */ So remove these locks. If the locks are actually necessary, then this banner comment should be altered instead. The hash table chains are protected by 1-bit locks in the hash table heads, so those shouldn't be a problem. Note that to make this work, __d_drop() has to be split so that the RCUwalk barrier can be avoided. This causes problems otherwise as it has an assertion that dentry->d_lock is locked - but there is no need for that as no one else can be trying to access this dentry, except to step over it (and that should be handled by d_free(), I think). Signed-off-by: David Howells <dhowells@redhat.com> Cc: Nick Piggin <npiggin@kernel.dk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-08-01 | VFS: Remove detached-dentry counter from shrink_dcache_for_umount_subtree() | David Howells | 1 | -3/+0
Remove the detached-dentry counter from shrink_dcache_for_umount_subtree() as the value it computes is no longer used as of commit 312d3ca856d369bb04d0443846b85b4cdde6fa8a which made the nr_dentry counters summed per-CPU rather than global atomic. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-08-01 | switch posix_acl_chmod() to umode_t | Al Viro | 1 | -2/+2
again, that's what all callers pass to it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-08-01 | switch posix_acl_from_mode() to umode_t | Al Viro | 1 | -1/+1
... seeing that this is what all callers pass to it anyway. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-08-01 | switch posix_acl_equiv_mode() to umode_t * | Al Viro | 13 | -31/+14
... so that &inode->i_mode could be passed to it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-08-01 | switch posix_acl_create() to umode_t * | Al Viro | 20 | -50/+32
so we can pass &inode->i_mode to it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-08-01 | block: initialise bd_super in bdget() | Lachlan McIlroy | 1 | -0/+1
bd_super is currently reset to NULL in kill_block_super() so we rely on previous users of the block_device object to initialise this value for the next user. This quirk was exposed on RHEL5 when a third party filesystem did not always use kill_block_super() and therefore bd_super wasn't being reset when a block_device object was recycled within the cache. This may not be a problem upstream but makes sense to be defensive. Signed-off-by: Lachlan McIlroy <lmcilroy@redhat.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-08-01 | vfs: avoid call to inode_lru_list_del() if possible | Eric Dumazet | 1 | -1/+2
inode_lru_list_del() is expensive because of per-superblock lru locking, while some inodes are not on the lru list. Adding a check in iput_final() can speed up pipe/socket workloads on SMP. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-08-01 | vfs: avoid taking inode_hash_lock on pipes and sockets | Eric Dumazet | 1 | -3/+3
Some inodes (pipes, sockets, ...) are not hashed, so there is no need to take the contended inode_hash_lock at dismantle time. This gives a nice speedup on SMP machines on socket-intensive workloads. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-08-01 | vfs: conditionally call inode_wb_list_del() | Eric Dumazet | 1 | -1/+3
Some inodes (pipes, sockets, ...) are not on the bdi writeback list. evict() can avoid calling inode_wb_list_del() and its expensive spinlock by checking whether the inode's i_wb_list is empty. At this point, no other cpu/user can concurrently manipulate this inode's i_wb_list. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
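The pattern in this message, checking list membership before paying for a contended lock, can be sketched in user space. This is an illustration under stated assumptions, not the kernel code: `struct list_node`, `locked_list_del`, `teardown`, and the `lock_acquisitions` counter are all hypothetical names, and the counter merely stands in for where a spinlock would be taken. As the message notes, the unlocked check is only safe because nothing else can touch the node at that point.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list node, in the style of a list head. */
struct list_node {
    struct list_node *prev, *next;
};

static void list_init(struct list_node *n) { n->prev = n->next = n; }
static int list_is_empty(const struct list_node *n) { return n->next == n; }

static void list_add_after(struct list_node *head, struct list_node *n)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Counts how often the (simulated) contended lock is taken. */
static unsigned long lock_acquisitions;

/* Stand-in for the expensive path: take the lock, unlink, release. */
static void locked_list_del(struct list_node *n)
{
    lock_acquisitions++;        /* where lock/unlock calls would go */
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

/* Hypothetical teardown step: skip the locked path for nodes on no list. */
static void teardown(struct list_node *n)
{
    assert(n != NULL);
    if (!list_is_empty(n))
        locked_list_del(n);
}
```

A node that was never linked costs no lock acquisition in `teardown()`, while a linked node takes the locked path exactly once.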
2011-08-01 | VFS: Fix automount for negative autofs dentries | David Howells | 1 | -9/+15
Autofs may set the DCACHE_NEED_AUTOMOUNT flag on negative dentries. These need attention from the automounter daemon regardless of the LOOKUP_FOLLOW flag. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-08-01 | Btrfs: load the key from the dir item in readdir into a fake dentry | Josef Bacik | 1 | -2/+45
In btrfs we have 2 indexes for inodes. One is for readdir; it's in a nice sequential order and works out brilliantly for readdir. However, if you use ls, it usually stats each file it gets from readdir. This is where the second index comes in, which is based on a hash of the name of the file. So the lookup has to look up this index, and then look up the inode. The index lookup is going to be in random order (since it's based on the name hash), which gives us less than stellar performance. Since we know the inode location from the readdir index, I create a dummy dentry and copy the location key into dentry->d_fsdata. Then on lookup, if we have d_fsdata, we use that location to look up the inode, avoiding the lookup in the other directory index. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-31 | NFS: Re-enable compilation of nfs with !CONFIG_NFS_V4 || !CONFIG_NFS_V4_1 | Trond Myklebust | 1 | -1/+1
Fix two recently introduced compile problems: fix a typo in fs/nfs/pnfs.h, and move the pnfs_blksize declaration outside the CONFIG_NFS_V4 section in struct nfs_server. Reported-by: Jens Axboe <jaxboe@fusionio.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-31 | cifs: remove unneeded variable initialization in cifs_reconnect_tcon | Jeff Layton | 1 | -1/+1
Reported-and-acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-07-31 | cifs: simplify refcounting for oplock breaks | Jeff Layton | 5 | -58/+4
Currently, we take a sb->s_active reference and a cifsFileInfo reference when an oplock break workqueue job is queued. This is unnecessary and more complicated than it needs to be. Also as Al points out, deactivate_super has non-trivial locking implications so it's best to avoid that if we can. Instead, just cancel any pending oplock breaks for this filehandle synchronously in cifsFileInfo_put after taking it off the lists. That should ensure that this job doesn't outlive the structures it depends on. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-07-31 | cifs: fix compiler warning in CIFSSMBQAllEAs | Jeff Layton | 1 | -5/+1
The recent fix to the above function causes this compiler warning to pop on some gcc versions: CC [M] fs/cifs/cifssmb.o fs/cifs/cifssmb.c: In function ‘CIFSSMBQAllEAs’: fs/cifs/cifssmb.c:5708: warning: ‘ea_name_len’ may be used uninitialized in this function Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-07-31 | cifs: fix name parsing in CIFSSMBQAllEAs | Jeff Layton | 1 | -1/+7
The code that matches EA names in CIFSSMBQAllEAs is incorrect. It uses strncmp to do the comparison with the length limited to the name_len sent in the response. Problem: Suppose we're looking for an attribute named "foobar" and have an attribute before it in the EA list named "foo". The comparison will succeed since we're only looking at the first 3 characters. Fix this by also comparing the length of the provided ea_name with the name_len in the response. If they're not equal then it shouldn't match. Reported-by: Jian Li <jiali@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
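The "foo" versus "foobar" problem in this message can be reproduced in a few lines of user-space C. This is a sketch with hypothetical names (`ea_name_matches`, `ea_name_matches_buggy`), not the CIFS implementation; it assumes the name in the response is `entry_len` bytes long and not necessarily NUL-terminated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical matcher (names are illustrative, not the CIFS code).
 * `entry` is a name from the response that is `entry_len` bytes long;
 * `wanted` is the NUL-terminated name the caller asked for.
 */
static int ea_name_matches(const char *wanted, const char *entry,
                           size_t entry_len)
{
    assert(wanted != NULL && entry != NULL);
    /* Comparing lengths first rejects "foo" when looking for "foobar". */
    if (strlen(wanted) != entry_len)
        return 0;
    return strncmp(wanted, entry, entry_len) == 0;
}

/* The flawed variant: any entry that is a prefix of `wanted` is accepted. */
static int ea_name_matches_buggy(const char *wanted, const char *entry,
                                 size_t entry_len)
{
    return strncmp(wanted, entry, entry_len) == 0;
}
```

With `wanted` set to "foobar" and an entry "foo" of length 3, the flawed variant reports a match because only the first three characters are compared, while the length-checked version rejects it.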
2011-07-31 | cifs: don't start signing too early | Jeff Layton | 1 | -2/+14
Sniffing traffic on the wire shows that Windows clients send a zeroed-out signature field in a NEGOTIATE request, and send "BSRSPYL" in the signature field during SESSION_SETUP. Make the cifs client behave the same way. It doesn't seem to make much difference with any server that I've tested against, but it's probably best to follow Windows behavior as closely as possible here. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-07-31 | cifs: trivial: goto out here is unnecessary | Jeff Layton | 1 | -6/+0
...and remove some obsolete comments. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-07-31 | cifs: advertise the right receive buffer size to the server | Jeff Layton | 1 | -1/+2
Currently, we mirror the same size back to the server that it sends us. That makes little sense. Instead we should be sending the server the maximum buffer size that we can handle -- CIFSMaxBufSize minus the 4 byte RFC1001 header. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
2011-07-31 | Merge branch 'nfs-for-3.1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs | Linus Torvalds | 16 | -87/+3090
* 'nfs-for-3.1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (28 commits)
  pnfsblock: write_pagelist handle zero invalid extents
  pnfsblock: note written INVAL areas for layoutcommit
  pnfsblock: bl_write_pagelist
  pnfsblock: bl_read_pagelist
  pnfsblock: cleanup_layoutcommit
  pnfsblock: encode_layoutcommit
  pnfsblock: merge rw extents
  pnfsblock: add extent manipulation functions
  pnfsblock: bl_find_get_extent
  pnfsblock: xdr decode pnfs_block_layout4
  pnfsblock: call and parse getdevicelist
  pnfsblock: merge extents
  pnfsblock: lseg alloc and free
  pnfsblock: remove device operations
  pnfsblock: add device operations
  pnfsblock: basic extent code
  pnfsblock: use pageio_ops api
  pnfsblock: add blocklayout Kconfig option, Makefile, and stubs
  pnfs: cleanup_layoutcommit
  pnfs: ask for layout_blksize and save it in nfs_server
  ...
2011-07-31 | pnfsblock: write_pagelist handle zero invalid extents | Peng Tao | 1 | -42/+233
For invalid extents, find other pages in the same fsblock and write them out. [pnfsblock: write_begin] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31 | pnfsblock: note written INVAL areas for layoutcommit | Fred Isaman | 3 | -0/+129
Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31 | pnfsblock: bl_write_pagelist | Fred Isaman | 1 | -3/+126
Note: When the upper layer's read/write request cannot be fulfilled, the block layout driver shouldn't silently mark the page as an error. It should do what can be done and leave the rest to the upper layer. To do so, we should set rdata/wdata->res.count properly. When the upper layer re-sends the read/write request to finish the rest of the request, pgbase is the position where we should start. [pnfsblock: bl_write_pagelist support functions] [pnfsblock: bl_write_pagelist adjust for missing PG_USE_PNFS] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> [pnfsblock: handle errors when read or write pagelist.] Signed-off-by: Zhang Jingwang <yyalone@gmail.com> [pnfs-block: use new write_pagelist api] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jim Rees <rees@umich.edu> [SQUASHME: pnfsblock: mds_offset is set in the generic layer] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> [pnfsblock: mark IO error with NFS_LAYOUT_{RW|RO}_FAILED] Signed-off-by: Peng Tao <peng_tao@emc.com> [pnfsblock: SQUASHME: adjust to API change] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> [pnfsblock: fixup blksize alignment in bl_setup_layoutcommit] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> [pnfsblock: bl_write_pagelist adjust for missing PG_USE_PNFS] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> [pnfsblock: handle errors when read or write pagelist.] Signed-off-by: Zhang Jingwang <yyalone@gmail.com> [pnfs-block: use new write_pagelist api] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31 | pnfsblock: bl_read_pagelist | Fred Isaman | 1 | -0/+265
Note: When the upper layer's read/write request cannot be fulfilled, the block layout driver shouldn't silently mark the page as an error. It should do what can be done and leave the rest to the upper layer. To do so, we should set rdata/wdata->res.count properly. When the upper layer re-sends the read/write request to finish the rest of the request, pgbase is the position where we should start. [pnfsblock: mark IO error with NFS_LAYOUT_{RW|RO}_FAILED] Signed-off-by: Peng Tao <peng_tao@emc.com> [pnfsblock: read path error handling] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> [pnfsblock: handle errors when read or write pagelist.] Signed-off-by: Zhang Jingwang <yyalone@gmail.com> [pnfs-block: use new read_pagelist api] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31 | pnfsblock: cleanup_layoutcommit | Fred Isaman | 3 | -0/+217
In blocklayout driver. There are two things happening while layoutcommit/cleanup. 1. the modified extents are encoded. 2. On cleanup the extents are put back on the layout rw extents list, for reads. In the new system where actual xdr encoding is done in encode_layoutcommit() directly into xdr buffer, these are the new commit stages: 1. On setup_layoutcommit, the range is adjusted as before and a structure is allocated for communication with bl_encode_layoutcommit && bl_cleanup_layoutcommit (Generic layer provides a void-star to hang it on) 2. bl_encode_layoutcommit is called to do the actual encoding directly into xdr. The commit-extent-list is not freed and is stored on above structure. FIXME: The code is not yet converted to the new XDR cleanup 3. On cleanup the commit-extent-list is put back by a call to set_to_rw() as before, but with no need for XDR decoding of the list as before. And the commit-extent-list is freed. Finally allocated structure is freed. [rm inode and pnfs_layout_hdr args from cleanup_layoutcommit()] Signed-off-by: Jim Rees <rees@umich.edu> [pnfsblock: introduce bl_committing list] Signed-off-by: Peng Tao <peng_tao@emc.com> [pnfsblock: SQUASHME: adjust to API change] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> [blocklayout: encode_layoutcommit implementation] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [pnfsblock: fix bug setting up layoutcommit.] Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn> [pnfsblock: cleanup_layoutcommit wants a status parameter] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>