path: root/lib/hexdump.c
2009-04-04  x86, mtrr: remove debug message  (Ingo Molnar)  [1 file, -3/+0]

The MTRR code grew a new debug message which triggers commonly:

    [ 40.142276] get_mtrr: cpu0 reg00 base=0000000000 size=0000080000 write-back
    [ 40.142280] get_mtrr: cpu0 reg01 base=0000080000 size=0000040000 write-back
    [ 40.142284] get_mtrr: cpu0 reg02 base=0000100000 size=0000040000 write-back
    [ 40.142311] get_mtrr: cpu0 reg00 base=0000000000 size=0000080000 write-back
    [ 40.142314] get_mtrr: cpu0 reg01 base=0000080000 size=0000040000 write-back
    [ 40.142317] get_mtrr: cpu0 reg02 base=0000100000 size=0000040000 write-back

Remove this annoyance.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-03  LANANA: Change of management and resync  (Alan Cox)  [1 file, -63/+58]

Bring the devices.txt back into some relationship with reality. Update the documentation a bit.

Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-03  x86, PAT: Remove duplicate memtype reserve in pci mmap  (Suresh Siddha)  [1 file, -46/+0]

The PCI mmap code had been doing its own memtype reservation for a while. We recently added memtype tracking in remap_pfn_range(), and the PCI code indirectly calls remap_pfn_range(), so we no longer need separate tracking in the PCI code. Which means a patch that removes ~50 lines of code :-).

Also, we recently found out that the PCI tracking is not working as we expect it to in some cases. Specifically, userlevel X mmap of PCI, with some recent versions of X, has a problem with vm_page_prot getting reset. The PCI tracking uses vm_page_prot to pass the protection type from parent to child during fork:

a) Parent does a PCI mmap
b) We look at PAT and get either a UC_MINUS or WC mapping for the parent
c) Store that mapping type in the vma's vm_page_prot for future use
d) This thread does a fork
e) Fork results in mmap_ops->open for the child process
f) We get the vm_page_prot from the vma and reserve that type for the child process

But between c) and e) above, the vma's vm_page_prot is getting reset to zero. This results in the PAT reserve failing at fork time, as reported here:

    http://marc.info/?l=linux-kernel&m=123858163103240&w=2

This cleanup makes the above problem go away, as we no longer depend on vm_page_prot in the PAT code.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
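For illustration, a minimal sketch of the fork-time path in steps d)-f), assuming the era's reserve_memtype() interface; the helper name and body are assumptions, not the actual x86 PAT code:

    /* Hypothetical helper the child's ->open callback would call
     * (steps e-f above): it re-reserves whatever memtype the parent
     * stored in vma->vm_page_prot, so a vm_page_prot that was reset
     * to zero makes the reservation fail. */
    static int child_reserve_memtype(struct vm_area_struct *vma)
    {
            u64 paddr = (u64)vma->vm_pgoff << PAGE_SHIFT;
            u64 size  = vma->vm_end - vma->vm_start;
            unsigned long want = pgprot_val(vma->vm_page_prot) &
                                 _PAGE_CACHE_MASK;      /* step c) value */

            /* Fails at fork time if vm_page_prot was zeroed. */
            return reserve_memtype(paddr, paddr + size, want, NULL);
    }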
2009-04-03  ocfs2: recover orphans in offline slots during recovery and mount  (Srinivas Eeda)  [4 files, -18/+132]

During recovery, a node recovers orphans in its slot and the dead node(s). But if the dead nodes were holding orphans in offline slots, those orphans will be left unrecovered. If the dead node is the last one to die, is holding orphans in other slots, and is the first one to mount again, then it only recovers its own slot, which leaves orphans in the offline slots.

This patch queues complete_recovery to clean orphans for all offline slots during mount and node recovery.

Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
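A minimal sketch of that queuing loop; ocfs2_slot_is_offline() and the two-argument queuing call are illustrative assumptions, not the patch's actual symbols:

    /* Queue orphan cleanup for every slot that is not currently in
     * use, so orphans stranded in offline slots get recovered too. */
    static void queue_offline_slot_recovery(struct ocfs2_super *osb)
    {
            int slot;

            for (slot = 0; slot < osb->max_slots; slot++) {
                    if (!ocfs2_slot_is_offline(osb, slot))  /* assumed helper */
                            continue;
                    /* complete_recovery worker replays the orphan dir */
                    ocfs2_queue_recovery_completion(osb->journal, slot);
            }
    }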
2009-04-03  ocfs2: Pagecache usage optimization on ocfs2  (Hisashi Hifumi)  [1 file, -11/+12]

A page can have multiple buffers, and even if the page is not uptodate, some of its buffers can be uptodate in a pagesize != blocksize environment. This aop checks whether all buffers which correspond to the part of the file we want to read are uptodate. If so, we do not have to issue an actual read IO to the HDD even though the page is not uptodate, because the portion we want to read is uptodate. The block_is_partially_uptodate() function is already used by ext2/3/4.

With this patch, random read/write mixed workloads and random-read-after-random-write workloads can be optimized, giving a performance improvement.

Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
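Concretely, the hookup is a single extra entry in the address_space_operations table; a sketch assuming ocfs2's table is named ocfs2_aops (block_is_partially_uptodate() itself is the generic helper from fs/buffer.c):

    const struct address_space_operations ocfs2_aops = {
            .readpage               = ocfs2_readpage,
            .writepage              = ocfs2_writepage,
            /* Answer "is this byte range uptodate?" from buffer_head
             * state, avoiding a full-page read when possible. */
            .is_partially_uptodate  = block_is_partially_uptodate,
    };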
2009-04-03  ocfs2: fix rare stale inode errors when exporting via nfs  (Wengang Wang)  [9 files, -8/+319]

For nfs exporting, ocfs2_get_dentry() returns the dentry for a file handle. ocfs2_get_dentry() may read from disk when the inode is not in memory, without taking any cross-cluster lock. This can lead to the file system loading a stale inode.

This patch fixes the above problem. The solution is that, when the inode is not in memory, we take the cluster lock (PR) of the alloc inode the inode in question was allocated from (this causes the node on which the deletion was done to sync the alloc inode) before reading the inode itself. Then we check the bitmap in the group the inode was allocated from to see if the bit is clear. If it's clear, the inode is stale. If the bit is set, we then check the generation as the existing code does.

We have to read the inode in question from disk first to learn its alloc slot and alloc bit. If it's not stale, we read it out using ocfs2_iget(); that second read should then come from cache.

We also have to add a per-superblock nfs_sync_lock to cover the lock on the alloc inode and the lock on the inode in question, because ocfs2_get_dentry() and ocfs2_delete_inode() take them in reverse order. nfs_sync_lock is taken in EX mode in ocfs2_get_dentry() and in PR mode in ocfs2_delete_inode(), so that multiple ocfs2_delete_inode() calls can run concurrently in the normal case.

[mfasheh@suse.com: build warning fixes and comment cleanups]
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
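A rough sketch of the resulting lookup path; ocfs2_iget() is real, but the nfs-sync lock wrappers and the bitmap check are assumed names standing in for the mechanics described above:

    static struct inode *get_checked_inode(struct ocfs2_super *osb,
                                           u64 blkno, u32 generation)
    {
            struct inode *inode = ERR_PTR(-ESTALE);

            ocfs2_nfs_sync_lock(osb, 1);    /* EX here, PR in delete_inode */

            /* A raw read tells us the alloc slot/bit; the PR lock on the
             * alloc inode (taken inside, not shown) syncs the deleter. */
            if (inode_bit_is_clear(osb, blkno))     /* assumed helper */
                    goto out;                       /* bit clear => stale */

            inode = ocfs2_iget(osb, blkno, 0, 0);   /* 2nd read hits cache */
            if (!IS_ERR(inode) && inode->i_generation != generation) {
                    iput(inode);
                    inode = ERR_PTR(-ESTALE);
            }
    out:
            ocfs2_nfs_sync_unlock(osb, 1);
            return inode;
    }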
2009-04-03  ocfs2/dlm: Tweak mle_state output  (Sunil Mushran)  [1 file, -2/+5]

The debugfs file, mle_state, now also prints the largest number of mles found in any one hash chain.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03  ocfs2/dlm: Do not purge lockres that is being migrated in dlm_purge_lockres()  (Sunil Mushran)  [1 file, -2/+18]

This patch attempts to close a narrow race between lockres purging and migration.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
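The guard presumably amounts to checking the lockres state under its spinlock before tearing it down; DLM_LOCK_RES_MIGRATING is a real ocfs2/dlm flag, the surrounding sketch is an assumption:

    /* Sketch: refuse to purge a lockres that is mid-migration. */
    static int can_purge_lockres(struct dlm_lock_resource *res)
    {
            int ok;

            spin_lock(&res->spinlock);
            ok = !(res->state & DLM_LOCK_RES_MIGRATING);
            spin_unlock(&res->spinlock);
            return ok;      /* 0 => skip now; purge retries on a later pass */
    }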
2009-04-03  ocfs2/dlm: Remove struct dlm_lock_name in struct dlm_master_list_entry  (Sunil Mushran)  [3 files, -71/+23]

This patch removes struct dlm_lock_name and adds its entries directly to struct dlm_master_list_entry. Under the new scheme, all mles, whether or not they are backed by a lockres, have the name populated in mle->mname. This lets us get rid of the code that figured out the location of the mle name.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03  ocfs2/dlm: Show the number of lockres/mles in dlm_state  (Sunil Mushran)  [1 file, -0/+36]

This patch shows the number of lockres' and mles in the debugfs file, dlm_state.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03  ocfs2/dlm: dlm_set_lockres_owner() and dlm_change_lockres_owner() inlined  (Sunil Mushran)  [2 files, -22/+18]

This patch inlines dlm_set_lockres_owner() and dlm_change_lockres_owner().

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03  ocfs2/dlm: Improve lockres counts  (Sunil Mushran)  [4 files, -38/+11]

This patch replaces the lockres counts that tracked the number of locally and remotely mastered lockres' with a current and a total count. The total count is the number of lockres' that have been created since the dlm domain was created. The locally and remotely mastered counts can be computed using the locking_state output.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03  ocfs2/dlm: Track number of mles  (Sunil Mushran)  [3 files, -1/+14]

The lifetime of an mle is limited to the duration of the lockres mastery process. While this lifetime is typically fairly short, we have seen the number of mles explode under certain circumstances. This patch tracks the number of mles of each type and should help us determine how best to speed up the mastery process.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
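A sketch of what such per-type accounting can look like; the enum values match ocfs2/dlm's mle types, while the counter names and helpers are assumptions:

    enum dlm_mle_type {
            DLM_MLE_BLOCK,
            DLM_MLE_MASTER,
            DLM_MLE_MIGRATION,
            DLM_MLE_NUM_TYPES,
    };

    /* Live count and total-ever-created count, per mle type. */
    static atomic_t mle_cur_count[DLM_MLE_NUM_TYPES];
    static atomic_t mle_tot_count[DLM_MLE_NUM_TYPES];

    static void mle_count_init(enum dlm_mle_type type)
    {
            atomic_inc(&mle_tot_count[type]);
            atomic_inc(&mle_cur_count[type]);
    }

    static void mle_count_release(enum dlm_mle_type type)
    {
            atomic_dec(&mle_cur_count[type]);
    }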
2009-04-03  ocfs2/dlm: Indent dlm_cleanup_master_list()  (Sunil Mushran)  [1 file, -54/+52]

The previous patch explicitly did not indent dlm_cleanup_master_list() so as to make the patch readable. This patch properly indents the function.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03  ocfs2/dlm: Activate dlm->master_hash for master list entries  (Sunil Mushran)  [4 files, -30/+60]

With this patch, the mles are stored in a hash and not a simple list. This should improve mle lookup time when the number of outstanding masteries is large.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
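A minimal sketch of the bucketed lookup this enables; jhash() and the hlist API (the four-argument iterator of this era) are the kernel's own, while the bucket count, struct layout, and helper names are assumptions:

    #include <linux/jhash.h>
    #include <linux/list.h>
    #include <linux/string.h>

    #define MLE_HASH_BUCKETS 16     /* assumed size */

    struct mle {
            struct hlist_node hash_node;
            unsigned int namelen;
            unsigned char name[32];
    };

    static struct hlist_head mle_hash[MLE_HASH_BUCKETS];

    static struct hlist_head *mle_bucket(const unsigned char *name,
                                         unsigned int len)
    {
            return &mle_hash[jhash(name, len, 0) % MLE_HASH_BUCKETS];
    }

    /* O(chain length) instead of O(total mles). */
    static struct mle *mle_find(const unsigned char *name, unsigned int len)
    {
            struct hlist_head *head = mle_bucket(name, len);
            struct hlist_node *pos;
            struct mle *m;

            hlist_for_each_entry(m, pos, head, hash_node)
                    if (m->namelen == len && !memcmp(m->name, name, len))
                            return m;
            return NULL;
    }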
2009-04-03  ocfs2/dlm: Create and destroy the dlm->master_hash  (Sunil Mushran)  [2 files, -0/+26]

This patch adds code to create and destroy the dlm->master_hash.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03  ocfs2/dlm: Refactor dlm_clean_master_list()  (Sunil Mushran)  [1 file, -63/+85]

This patch refactors dlm_clean_master_list() so as to make it easier to convert the mle list to a hash.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03  ocfs2/dlm: Clean up struct dlm_lock_name  (Sunil Mushran)  [3 files, -44/+53]

For a master mle, the name is stored in the attached lockres, in a struct qstr. For block and migration mles, the name is stored inline in struct dlm_lock_name. This patch makes struct dlm_lock_name look like a struct qstr. While we could use struct qstr directly, we don't, because we want to avoid having to malloc and free the lockname string; the mle's lifetime is fairly short.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
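Side by side, the two shapes; struct qstr below matches the VFS definition of that era, while the dlm_lock_name layout is a sketch reconstructed from the description:

    #define DLM_LOCKID_NAME_MAX 32  /* ocfs2's lockname length limit */

    /* VFS struct qstr: the name is a pointer to storage that must be
     * allocated and freed separately. */
    struct qstr {
            unsigned int hash;
            unsigned int len;
            const unsigned char *name;
    };

    /* Same shape, but with the name inline: no malloc/free over the
     * short lifetime of an mle. */
    struct dlm_lock_name {
            unsigned int hash;
            unsigned int len;
            unsigned char name[DLM_LOCKID_NAME_MAX];
    };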
2009-04-03  ocfs2/dlm: Encapsulate adding and removing of mle from dlm->master_list  (Sunil Mushran)  [2 files, -11/+26]

This patch encapsulates adding and removing of the mle from the dlm->master_list. This patch is part of the series of patches that converts the mle list to a mle hash.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03  ocfs2: Optimize inode group allocation by recording last used group.  (Tao Ma)  [2 files, -4/+31]

In ocfs2, the block group search looks for the "emptiest" group to allocate from. So if the allocator has many equally (or almost equally) empty groups, new block groups will tend to get spread out amongst them. So we add osb_inode_alloc_group in ocfs2_super to record the last used inode allocation group.

For more details, please see http://oss.oracle.com/osswiki/OCFS2/DesignDocs/InodeAllocationStrategy.

I have done some basic testing, and the results show a ten-times improvement on some cold-cache stat workloads.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03  ocfs2: Allocate inode groups from global_bitmap.  (Tao Ma)  [1 file, -10/+19]

Inode groups used to be allocated from the local alloc file, but since we want all inodes to be contiguous enough, we now try to allocate them directly from global_bitmap.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03  ocfs2: Optimize inode allocation by remembering last group  (Tao Ma)  [5 files, -2/+46]

In ocfs2, the inode block search looks for the "emptiest" inode group to allocate from. So if an inode alloc file has many equally (or almost equally) empty groups, new inodes will tend to get spread out amongst them, which in turn can put them all over the disk. This is undesirable because directory operations on conceptually "nearby" inodes then force a large number of seeks.

So we add ip_last_used_group to core directory inodes, which records the last used allocation group. Another field, ip_last_used_slot, is also added in case inode stealing happens. When claiming a new inode, we pass in the directory's inode so that the allocation can use this information.

For more details, please see http://oss.oracle.com/osswiki/OCFS2/DesignDocs/InodeAllocationStrategy.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
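The hint pattern, sketched; ip_last_used_group and ip_last_used_slot come from the description, everything else (struct and helpers) is an illustrative assumption:

    #include <linux/types.h>

    struct dir_alloc_hint {
            u64 ip_last_used_group;  /* group we last allocated from */
            u16 ip_last_used_slot;   /* slot that hint belongs to */
    };

    /* Try the remembered group first; fall back to the emptiest-group
     * scan and remember the answer for next time. */
    static u64 pick_inode_group(struct dir_alloc_hint *hint, u16 my_slot,
                                u64 (*emptiest_scan)(void))
    {
            u64 group = 0;

            /* Inode stealing can move us to another slot, which
             * invalidates the hint. */
            if (hint->ip_last_used_slot == my_slot)
                    group = hint->ip_last_used_group;

            if (!group)
                    group = emptiest_scan();

            hint->ip_last_used_group = group;
            hint->ip_last_used_slot = my_slot;
            return group;
    }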
2009-04-03  ocfs2: fix leaf start calculation in ocfs2_dx_dir_rebalance()  (Mark Fasheh)  [2 files, -2/+11]

ocfs2_dx_dir_rebalance() is passed the block offset of a dx leaf which needs rebalancing. Since we rebalance an entire cluster at a time, however, this function needs to calculate the beginning of that cluster, in blocks. The calculation was wrong, which would result in a read of non-leaf blocks. Fix the calculation by adding ocfs2_block_to_cluster_start(), which is a more straightforward way of determining this.

Reported-by: Tristan Ye <tristan.ye@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
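The helper is essentially a round-down in block units; a sketch built from the cluster-size/block-size shift relationship (the function name comes from the message above, the body is an assumption):

    #include <linux/types.h>

    /* Round blkno down to the first block of its cluster.
     * c_bits/b_bits are log2(cluster size) and log2(block size). */
    static u64 block_to_cluster_start(u64 blkno, int c_bits, int b_bits)
    {
            int shift = c_bits - b_bits;    /* log2(blocks per cluster) */

            return (blkno >> shift) << shift;
    }

For example, with 4K blocks (b_bits = 12) and 1M clusters (c_bits = 20), block 700 rounds down to block 512, the first block of its cluster.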
2009-04-03  ocfs2: re-order ocfs2_empty_dir checks  (Mark Fasheh)  [1 file, -6/+3]

ocfs2_empty_dir() is far more expensive than checking link count. Since both need to be checked at the same time, we can improve performance by checking link count first.

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03  ocfs2: Enable indexed directories  (Mark Fasheh)  [1 file, -1/+2]

Since the disk format is finalized, we can set this feature bit in the supported mask.

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Acked-by: Joel Becker <Joel.Becker@oracle.com>
2009-04-03  ocfs2: Add total entry count to dx_root_block  (Mark Fasheh)  [2 files, -44/+124]

This little bit of extra accounting speeds up ocfs2_empty_dir() dramatically by allowing us to short-circuit the full directory scan.

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03  ocfs2: Increase max links count  (Mark Fasheh)  [4 files, -26/+66]

Since we've now got a directory format capable of handling a large number of entries, we can increase the maximum link count supported. This only gets increased if the directory indexing feature is turned on.

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
2009-04-03  ocfs2: Introduce dir free space list  (Mark Fasheh)  [4 files, -93/+490]

The only operation which doesn't get faster with directory indexing is insert, which still has to walk the entire unindexed directory portion to find a free block. This patch provides an improvement in directory insert performance by maintaining a singly linked list of directory leaf blocks which have space for additional dirents.

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
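A sketch of the insert-time walk over such a list; the in-memory struct and field names are assumptions standing in for the on-disk free-space list:

    #include <linux/types.h>

    struct dir_leaf {
            u64 blkno;                /* this leaf block */
            u64 next_free;            /* next leaf with space; 0 terminates */
            unsigned int free_bytes;  /* free dirent space in this leaf */
    };

    /* Walk only leaves known to have space, instead of the whole
     * unindexed directory. read_leaf() abstracts the block read. */
    static struct dir_leaf *find_leaf_with_space(u64 list_head,
                                                 unsigned int rec_len,
                                                 struct dir_leaf *(*read_leaf)(u64))
    {
            u64 blkno = list_head;

            while (blkno) {
                    struct dir_leaf *leaf = read_leaf(blkno);

                    if (leaf->free_bytes >= rec_len)
                            return leaf;    /* insert the dirent here */
                    blkno = leaf->next_free;
            }
            return NULL;    /* no luck: extend the directory */
    }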
2009-04-03  ocfs2: Store dir index records inline  (Mark Fasheh)  [5 files, -145/+471]

Allow us to store a small number of directory index records in the ocfs2_dx_root_block. This saves us a disk read on small to medium sized directories (fewer than about 250 entries). The inline root is automatically turned into a root block with extents if the directory grows beyond its capacity.

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
2009-04-03  ocfs2: Add a name indexed b-tree to directory inodes  (Mark Fasheh)  [13 files, -104/+2157]

This patch makes use of Ocfs2's flexible btree code to add an additional tree to directory inodes. The new tree stores an array of small, fixed-length records in each leaf block. Each record stores a hash value and a pointer to a block in the traditional (unindexed) directory tree where a dirent with the given name hash resides. Lookup exclusively uses this tree to find dirents, thus providing us with constant-time name lookups.

Some of the hashing code was copied from ext3. Unfortunately, it has lots of unfixed checkpatch errors. I left that as-is so that tracking changes would be easier.

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
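Each leaf record pairs a name hash with the block of the unindexed leaf that holds the dirent; a sketch of its shape (the real on-disk struct may differ in field names and layout):

    #include <linux/types.h>

    /* One fixed-size index record: name hash -> unindexed dir block. */
    struct dx_entry {
            __le32 dx_major_hash;   /* hash of the dirent's name */
            __le32 dx_minor_hash;   /* tie-breaker for hash collisions */
            __le64 dx_dirent_blk;   /* block holding the actual dirent */
    };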
2009-04-03  ocfs2: Introduce dir lookup helper struct  (Mark Fasheh)  [3 files, -131/+148]

Many directory manipulation calls pass around a tuple of a dirent and its containing buffer_head. Dir indexing adds a bit more state, but instead of adding yet more arguments to functions, we introduce struct ocfs2_dir_lookup_result. In this patch, it simply holds the same tuple, but future patches will add more state.

Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
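In this initial form the struct is just the old pair; a sketch (field names are assumptions, and later patches in the series grow the struct):

    struct ocfs2_dir_lookup_result {
            struct buffer_head *dl_leaf_bh;     /* block holding the dirent */
            struct ocfs2_dir_entry *dl_entry;   /* the dirent itself */
    };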
2009-04-03  ocfs2: Remove debugfs file local_alloc_stats  (Sunil Mushran)  [2 files, -91/+0]

This patch removes the debugfs file local_alloc_stats as that information is now included in the fs_state debugfs file.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03  ocfs2: Expose the file system state via debugfs  (Sunil Mushran)  [2 files, -0/+177]

This patch creates a per-mount debugfs file, fs_state, which exposes information such as the cluster stack in use, the states of the downconvert, recovery and commit threads, the number of journal txns, some allocation stats, the list of all slots, etc.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03  ocfs2: Move struct recovery_map to a header file  (Sunil Mushran)  [2 files, -12/+11]

Move the definition of struct recovery_map from journal.c to journal.h. This is preparation for the next patch.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03  ocfs2/hb: Expose the list of heartbeating nodes via debugfs  (Sunil Mushran)  [3 files, -4/+104]

This patch creates a debugfs file, o2hb/livesnodes, which exposes the aggregate list of heartbeating nodes across all heartbeat regions.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-04-03  x86: disable stack-protector for __restore_processor_state()  (Joseph Cihula)  [1 file, -0/+5]

The __restore_processor_state() fn restores %gs on resume from S3. As such, it cannot be protected by the stack-protector guard since %gs will not be correct on function entry. There are only a few other fns in this file and it should not negatively impact kernel security that they will also have the stack-protector guard removed (and so it's not worth moving them to another file).

Without this change, S3 resume on a kernel built with CONFIG_CC_STACKPROTECTOR_ALL=y will fail.

Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
Tested-by: Chris Wright <chrisw@sous-sol.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Tejun Heo <tj@kernel.org>
LKML-Reference: <49D13385.5060900@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-03  mm: fix misuse of debug_kmap_atomic  (Akinobu Mita)  [2 files, -2/+1]

Commit 7ca43e7564679604d86e9ed834e7bbcffd8a4a3f ("mm: use debug_kmap_atomic") introduced some debug_kmap_atomic() in wrong places.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-03  Fix highmem PPC build failure  (Kumar Gala)  [1 file, -15/+14]

Commit f4112de6b679d84bd9b9681c7504be7bdfb7c7d5 ("mm: introduce debug_kmap_atomic") broke PPC builds with CONFIG_HIGHMEM=y:

      CC      init/main.o
    In file included from include/linux/highmem.h:25,
                     from include/linux/pagemap.h:11,
                     from include/linux/mempolicy.h:63,
                     from init/main.c:53:
    arch/powerpc/include/asm/highmem.h: In function 'kmap_atomic_prot':
    arch/powerpc/include/asm/highmem.h:98: error: implicit declaration of function 'debug_kmap_atomic'
    In file included from include/linux/pagemap.h:11,
                     from include/linux/mempolicy.h:63,
                     from init/main.c:53:
    include/linux/highmem.h: At top level:
    include/linux/highmem.h:196: warning: conflicting types for 'debug_kmap_atomic'
    include/linux/highmem.h:196: error: static declaration of 'debug_kmap_atomic' follows non-static declaration
    include/asm/highmem.h:98: error: previous implicit declaration of 'debug_kmap_atomic' was here
    make[1]: *** [init/main.o] Error 1
    make: *** [init] Error 2

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-03  x86: fix is_io_mapping_possible() build warning on i386 allnoconfig  (Andrew Morton)  [1 file, -1/+1]

i386 allnoconfig:

    arch/x86/mm/iomap_32.c: In function 'is_io_mapping_possible':
    arch/x86/mm/iomap_32.c:27: warning: comparison is always false due to limited range of data type

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-03  NFS: Add mount options to enable local caching on NFS  (David Howells)  [3 files, -0/+28]

Add NFS mount options to allow the local caching support to be enabled.

The attached patch makes it possible for the NFS filesystem to be told to make use of the network filesystem local caching service (FS-Cache). To be able to use this, a recent nfsutils package is required.

There are three variant NFS mount options that can be added to a mount command to control caching for a mount. Only the last one specified takes effect:

 (*) Adding "fsc" will request caching.

 (*) Adding "fsc=<string>" will request caching and also specify a uniquifier.

 (*) Adding "nofsc" will disable caching.

For example:

    mount warthog:/ /a -o fsc

The cache of a particular superblock (NFS FSID) will be shared between all mounts of that volume, provided they have the same connection parameters and are not marked 'nosharecache'.

Where it is otherwise impossible to distinguish superblocks because all the parameters are identical, but the 'nosharecache' option is supplied, a uniquifying string must be supplied, else only the first mount will be permitted to use the cache. If there's a key collision, then the second mount will disable caching and give a warning into the kernel log.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03  NFS: Display local caching state  (David Howells)  [2 files, -3/+19]

Display the local caching state in /proc/fs/nfsfs/volumes.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03  NFS: Store pages from an NFS inode into a local cache  (David Howells)  [3 files, -0/+49]

Store pages from an NFS inode into the cache data storage object associated with that inode.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
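The storage side reduces to handing pages to FS-Cache; fscache_write_page() and fscache_uncache_page() are the FS-Cache netfs API of this series, while the wrapper below is a sketch with an assumed name:

    /* Sketch: push one page into the inode's cache object. */
    static void cache_store_page(struct fscache_cookie *cookie,
                                 struct page *page)
    {
            if (fscache_write_page(cookie, page, GFP_KERNEL) != 0)
                    /* Could not be written; drop any stale copy so a
                     * later read can't see old data. */
                    fscache_uncache_page(cookie, page);
    }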
2009-04-03  NFS: Read pages from FS-Cache into an NFS inode  (David Howells)  [3 files, -0/+177]

Read pages from an FS-Cache data storage object representing an inode into an NFS inode.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
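fscache_read_or_alloc_page() returns a tristate that decides whether NFS must fall back to reading from the server; the call below is the FS-Cache API, the wrapper around it is a sketch with an assumed name:

    /* 0 => read submitted; end_io runs on completion.
     * Non-zero => caller reads from the server instead. */
    static int cache_read_page(struct fscache_cookie *cookie,
                               struct page *page,
                               fscache_rw_complete_t end_io, void *ctx)
    {
            int ret = fscache_read_or_alloc_page(cookie, page, end_io,
                                                 ctx, GFP_KERNEL);

            switch (ret) {
            case 0:         /* read submitted from the cache */
                    return 0;
            case -ENODATA:  /* cache object exists but has no data */
            case -ENOBUFS:  /* cache unavailable right now */
            default:
                    return 1;       /* fall back to the NFS server */
            }
    }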
2009-04-03  NFS: nfs_readpage_async() needs to be accessible as a fallback for local caching  (David Howells)  [2 files, -2/+4]

nfs_readpage_async() needs to be non-static so that it can be used as a fallback for the local on-disk caching, should an EIO crop up when reading the cache.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03  NFS: Add read context retention for FS-Cache to call back with  (David Howells)  [1 file, -0/+26]

Add read context retention so that FS-Cache can call back into NFS when a read operation on the cache fails with EIO rather than returning data. This permits NFS to then fetch the data from the server instead, using the appropriate security context.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03  NFS: FS-Cache page management  (David Howells)  [3 files, -4/+101]

FS-Cache page management for NFS. This includes hooking the releasing and invalidation of pages marked with PG_fscache (aka PG_private_2) and waiting for completion of the write-to-cache flag (PG_fscache_write aka PG_owner_priv_2).

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03  NFS: Add some new I/O counters for FS-Cache doing things for NFS  (David Howells)  [3 files, -0/+41]

Add some new NFS I/O counters for FS-Cache doing things for NFS. A new line is emitted into /proc/pid/mountstats if caching is enabled, which looks like:

    fsc: <rok> <rfl> <wok> <wfl> <unc>

Where:

 <rok> is the number of pages read successfully from the cache,
 <rfl> is the number of failed page reads against the cache,
 <wok> is the number of successful page writes to the cache,
 <wfl> is the number of failed page writes to the cache, and
 <unc> is the number of NFS pages that have been disconnected from the cache.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03  NFS: Invalidate FsCache page flags when cache removed  (David Howells)  [1 file, -0/+40]

Invalidate the FsCache page flags on the pages belonging to an inode when the cache backing that NFS inode is removed. This allows a live cache to be withdrawn.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03  NFS: Use local disk inode cache  (David Howells)  [4 files, -0/+191]

Bind data storage objects in the local cache to NFS inodes.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03  NFS: Define and create inode-level cache objects  (David Howells)  [2 files, -0/+124]

Define and create inode-level cache data storage objects (as managed by nfs_inode structs). Each inode-level object is created in a superblock-level index object and is itself a data storage object into which pages from the inode are stored.

The inode object key is the NFS file handle for the inode.

The inode object is given coherency data to carry in the auxiliary data permitted by the cache. This is a sequence made up of:

 (1) i_mtime from the NFS inode.

 (2) i_ctime from the NFS inode.

 (3) i_size from the NFS inode.

 (4) change_attr from the NFSv4 attribute data.

As the cache is a persistent cache, the auxiliary data is checked when a new NFS in-memory inode is set up that matches an already existing data storage object in the cache. If the coherency data is the same, the on-disk object is retained and used; if not, it is scrapped and a new one created.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
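That coherency sequence fits in a small auxiliary blob; a sketch of its shape (the struct name and exact packing are assumptions, the four fields come straight from the list above):

    #include <linux/time.h>
    #include <linux/types.h>

    /* Checked against the live inode whenever a cached object is
     * matched; a mismatch scraps the on-disk object. */
    struct nfs_cache_auxdata {
            struct timespec mtime;          /* (1) i_mtime */
            struct timespec ctime;          /* (2) i_ctime */
            loff_t          size;           /* (3) i_size */
            u64             change_attr;    /* (4) NFSv4 change attribute */
    };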