aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation/filesystems
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/filesystems')
-rw-r--r--Documentation/filesystems/00-INDEX2
-rw-r--r--Documentation/filesystems/caching/fscache.txt110
-rw-r--r--Documentation/filesystems/caching/netfs-api.txt21
-rw-r--r--Documentation/filesystems/exofs.txt23
-rw-r--r--Documentation/filesystems/ext3.txt4
-rw-r--r--Documentation/filesystems/ext4.txt10
-rw-r--r--Documentation/filesystems/nilfs2.txt7
-rw-r--r--Documentation/filesystems/ocfs2.txt6
-rw-r--r--Documentation/filesystems/proc.txt16
-rw-r--r--Documentation/filesystems/seq_file.txt4
-rw-r--r--Documentation/filesystems/vfs.txt2
11 files changed, 177 insertions, 28 deletions
diff --git a/Documentation/filesystems/00-INDEX b/Documentation/filesystems/00-INDEX
index 658154f52557..875d49696b6e 100644
--- a/Documentation/filesystems/00-INDEX
+++ b/Documentation/filesystems/00-INDEX
@@ -34,6 +34,8 @@ dnotify.txt
- info about directory notification in Linux.
ecryptfs.txt
- docs on eCryptfs: stacked cryptographic filesystem for Linux.
+exofs.txt
+ - info, usage, mount options, design about EXOFS.
ext2.txt
- info, mount options and specifications for the Ext2 filesystem.
ext3.txt
diff --git a/Documentation/filesystems/caching/fscache.txt b/Documentation/filesystems/caching/fscache.txt
index 9e94b9491d89..a91e2e2095b0 100644
--- a/Documentation/filesystems/caching/fscache.txt
+++ b/Documentation/filesystems/caching/fscache.txt
@@ -235,6 +235,7 @@ proc files.
neg=N Number of negative lookups made
pos=N Number of positive lookups made
crt=N Number of objects created by lookup
+ tmo=N Number of lookups timed out and requeued
Updates n=N Number of update cookie requests seen
nul=N Number of upd reqs given a NULL parent
run=N Number of upd reqs granted CPU time
@@ -250,8 +251,10 @@ proc files.
ok=N Number of successful alloc reqs
wt=N Number of alloc reqs that waited on lookup completion
nbf=N Number of alloc reqs rejected -ENOBUFS
+ int=N Number of alloc reqs aborted -ERESTARTSYS
ops=N Number of alloc reqs submitted
owt=N Number of alloc reqs waited for CPU time
+ abt=N Number of alloc reqs aborted due to object death
Retrvls n=N Number of retrieval (read) requests seen
ok=N Number of successful retr reqs
wt=N Number of retr reqs that waited on lookup completion
@@ -261,6 +264,7 @@ proc files.
oom=N Number of retr reqs failed -ENOMEM
ops=N Number of retr reqs submitted
owt=N Number of retr reqs waited for CPU time
+ abt=N Number of retr reqs aborted due to object death
Stores n=N Number of storage (write) requests seen
ok=N Number of successful store reqs
agn=N Number of store reqs on a page already pending storage
@@ -268,12 +272,37 @@ proc files.
oom=N Number of store reqs failed -ENOMEM
ops=N Number of store reqs submitted
run=N Number of store reqs granted CPU time
+ pgs=N Number of pages given store req processing time
+ rxd=N Number of store reqs deleted from tracking tree
+ olm=N Number of store reqs over store limit
+ VmScan nos=N Number of release reqs against pages with no pending store
+ gon=N Number of release reqs against pages stored by time lock granted
+ bsy=N Number of release reqs ignored due to in-progress store
+ can=N Number of page stores cancelled due to release req
Ops pend=N Number of times async ops added to pending queues
run=N Number of times async ops given CPU time
enq=N Number of times async ops queued for processing
+ can=N Number of async ops cancelled
+ rej=N Number of async ops rejected due to object lookup/create failure
dfr=N Number of async ops queued for deferred release
rel=N Number of async ops released
gc=N Number of deferred-release async ops garbage collected
+ CacheOp alo=N Number of in-progress alloc_object() cache ops
+ luo=N Number of in-progress lookup_object() cache ops
+ luc=N Number of in-progress lookup_complete() cache ops
+ gro=N Number of in-progress grab_object() cache ops
+ upo=N Number of in-progress update_object() cache ops
+ dro=N Number of in-progress drop_object() cache ops
+ pto=N Number of in-progress put_object() cache ops
+ syn=N Number of in-progress sync_cache() cache ops
+ atc=N Number of in-progress attr_changed() cache ops
+ rap=N Number of in-progress read_or_alloc_page() cache ops
+ ras=N Number of in-progress read_or_alloc_pages() cache ops
+ alp=N Number of in-progress allocate_page() cache ops
+ als=N Number of in-progress allocate_pages() cache ops
+ wrp=N Number of in-progress write_page() cache ops
+ ucp=N Number of in-progress uncache_page() cache ops
+ dsp=N Number of in-progress dissociate_pages() cache ops
(*) /proc/fs/fscache/histogram
@@ -299,6 +328,87 @@ proc files.
jiffy range covered, and the SECS field the equivalent number of seconds.
+===========
+OBJECT LIST
+===========
+
+If CONFIG_FSCACHE_OBJECT_LIST is enabled, the FS-Cache facility will maintain a
+list of all the objects currently allocated and allow them to be viewed
+through:
+
+ /proc/fs/fscache/objects
+
+This will look something like:
+
+ [root@andromeda ~]# head /proc/fs/fscache/objects
+ OBJECT PARENT STAT CHLDN OPS OOP IPR EX READS EM EV F S | NETFS_COOKIE_DEF TY FL NETFS_DATA OBJECT_KEY, AUX_DATA
+ ======== ======== ==== ===== === === === == ===== == == = = | ================ == == ================ ================
+ 17e4b 2 ACTV 0 0 0 0 0 0 7b 4 0 8 | NFS.fh DT 0 ffff88001dd82820 010006017edcf8bbc93b43298fdfbe71e50b57b13a172c0117f38472, e567634700000000000000000000000063f2404a000000000000000000000000c9030000000000000000000063f2404a
+ 1693a 2 ACTV 0 0 0 0 0 0 7b 4 0 8 | NFS.fh DT 0 ffff88002db23380 010006017edcf8bbc93b43298fdfbe71e50b57b1e0162c01a2df0ea6, 420ebc4a000000000000000000000000420ebc4a0000000000000000000000000e1801000000000000000000420ebc4a
+
+where the first set of columns before the '|' describe the object:
+
+ COLUMN DESCRIPTION
+ ======= ===============================================================
+ OBJECT Object debugging ID (appears as OBJ%x in some debug messages)
+ PARENT Debugging ID of parent object
+ STAT Object state
+ CHLDN Number of child objects of this object
+ OPS Number of outstanding operations on this object
+ OOP Number of outstanding child object management operations
+ IPR
+ EX Number of outstanding exclusive operations
+ READS Number of outstanding read operations
+ EM Object's event mask
+ EV Events raised on this object
+ F Object flags
+ S Object slow-work work item flags
+
+and the second set of columns describe the object's cookie, if present:
+
+ COLUMN DESCRIPTION
+ =============== =======================================================
+ NETFS_COOKIE_DEF Name of netfs cookie definition
+ TY Cookie type (IX - index, DT - data, hex - special)
+ FL Cookie flags
+ NETFS_DATA Netfs private data stored in the cookie
+ OBJECT_KEY Object key } 1 column, with separating comma
+ AUX_DATA Object aux data } presence may be configured
+
+The data shown may be filtered by attaching the a key to an appropriate keyring
+before viewing the file. Something like:
+
+ keyctl add user fscache:objlist <restrictions> @s
+
+where <restrictions> are a selection of the following letters:
+
+ K Show hexdump of object key (don't show if not given)
+ A Show hexdump of object aux data (don't show if not given)
+
+and the following paired letters:
+
+ C Show objects that have a cookie
+ c Show objects that don't have a cookie
+ B Show objects that are busy
+ b Show objects that aren't busy
+ W Show objects that have pending writes
+ w Show objects that don't have pending writes
+ R Show objects that have outstanding reads
+ r Show objects that don't have outstanding reads
+ S Show objects that have slow work queued
+ s Show objects that don't have slow work queued
+
+If neither side of a letter pair is given, then both are implied. For example:
+
+ keyctl add user fscache:objlist KB @s
+
+shows objects that are busy, and lists their object keys, but does not dump
+their auxiliary data. It also implies "CcWwRrSs", but as 'B' is given, 'b' is
+not implied.
+
+By default all objects and all fields will be shown.
+
+
=========
DEBUGGING
=========
diff --git a/Documentation/filesystems/caching/netfs-api.txt b/Documentation/filesystems/caching/netfs-api.txt
index 2666b1ed5e9e..1902c57b72ef 100644
--- a/Documentation/filesystems/caching/netfs-api.txt
+++ b/Documentation/filesystems/caching/netfs-api.txt
@@ -641,7 +641,7 @@ data file must be retired (see the relinquish cookie function below).
Furthermore, note that this does not cancel the asynchronous read or write
operation started by the read/alloc and write functions, so the page
-invalidation and release functions must use:
+invalidation functions must use:
bool fscache_check_page_write(struct fscache_cookie *cookie,
struct page *page);
@@ -654,6 +654,25 @@ to see if a page is being written to the cache, and:
to wait for it to finish if it is.
+When releasepage() is being implemented, a special FS-Cache function exists to
+manage the heuristics of coping with vmscan trying to eject pages, which may
+conflict with the cache trying to write pages to the cache (which may itself
+need to allocate memory):
+
+ bool fscache_maybe_release_page(struct fscache_cookie *cookie,
+ struct page *page,
+ gfp_t gfp);
+
+This takes the netfs cookie, and the page and gfp arguments as supplied to
+releasepage(). It will return false if the page cannot be released yet for
+some reason and if it returns true, the page has been uncached and can now be
+released.
+
+To make a page available for release, this function may wait for an outstanding
+storage request to complete, or it may attempt to cancel the storage request -
+in which case the page will not be stored in the cache this time.
+
+
==========================
INDEX AND DATA FILE UPDATE
==========================
diff --git a/Documentation/filesystems/exofs.txt b/Documentation/filesystems/exofs.txt
index 0ced74c2f73c..abd2a9b5b787 100644
--- a/Documentation/filesystems/exofs.txt
+++ b/Documentation/filesystems/exofs.txt
@@ -60,13 +60,13 @@ USAGE
mkfs.exofs --pid=65536 --format /dev/osd0
- The --format is optional if not specified no OSD_FORMAT will be
- preformed and a clean file system will be created in the specified pid,
+ The --format is optional. If not specified, no OSD_FORMAT will be
+ performed and a clean file system will be created in the specified pid,
in the available space of the target. (Use --format=size_in_meg to limit
the total LUN space available)
- If pid already exist it will be deleted and a new one will be created in it's
- place. Be careful.
+ If pid already exists, it will be deleted and a new one will be created in
+ its place. Be careful.
An exofs lives inside a single OSD partition. You can create multiple exofs
filesystems on the same device using multiple pids.
@@ -81,7 +81,7 @@ USAGE
7. For reference (See do-exofs example script):
do-exofs start - an example of how to perform the above steps.
- do-exofs stop - an example of how to unmount the file system.
+ do-exofs stop - an example of how to unmount the file system.
do-exofs format - an example of how to format and mkfs a new exofs.
8. Extra compilation flags (uncomment in fs/exofs/Kbuild):
@@ -104,8 +104,8 @@ Where:
exofs specific options: Options are separated by commas (,)
pid=<integer> - The partition number to mount/create as
container of the filesystem.
- This option is mandatory
- to=<integer> - Timeout in ticks for a single command
+ This option is mandatory.
+ to=<integer> - Timeout in ticks for a single command.
default is (60 * HZ) [for debugging only]
===============================================================================
@@ -116,7 +116,7 @@ DESIGN
with a special ID (defined in common.h).
Information included in the file system control block is used to fill the
in-memory superblock structure at mount time. This object is created before
- the file system is used by mkexofs.c It contains information such as:
+ the file system is used by mkexofs.c. It contains information such as:
- The file system's magic number
- The next inode number to be allocated
@@ -134,8 +134,8 @@ DESIGN
attributes. This applies to both regular files and other types (directories,
device files, symlinks, etc.).
-* Credentials are generated per object (inode and superblock) when they is
- created in memory (read off disk or created). The credential works for all
+* Credentials are generated per object (inode and superblock) when they are
+ created in memory (read from disk or created). The credential works for all
operations and is used as long as the object remains in memory.
* Async OSD operations are used whenever possible, but the target may execute
@@ -145,7 +145,8 @@ DESIGN
from executing in reverse order:
- The following are handled with the OBJ_CREATED and OBJ_2BCREATED
flags. OBJ_CREATED is set when we know the object exists on the OSD -
- in create's callback function, and when we successfully do a read_inode.
+ in create's callback function, and when we successfully do a
+ read_inode.
OBJ_2BCREATED is set in the beginning of the create function, so we
know that we should wait.
- create/delete: delete should wait until the object is created
diff --git a/Documentation/filesystems/ext3.txt b/Documentation/filesystems/ext3.txt
index 05d5cf1d743f..867c5b50cb42 100644
--- a/Documentation/filesystems/ext3.txt
+++ b/Documentation/filesystems/ext3.txt
@@ -32,8 +32,8 @@ journal_dev=devnum When the external journal device's major/minor numbers
identified through its new major/minor numbers encoded
in devnum.
-noload Don't load the journal on mounting. Note that this forces
- mount of inconsistent filesystem, which can lead to
+norecovery Don't load the journal on mounting. Note that this forces
+noload mount of inconsistent filesystem, which can lead to
various problems.
data=journal All data are committed into the journal prior to being
diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt
index 6d94e0696f8c..af6885c3c821 100644
--- a/Documentation/filesystems/ext4.txt
+++ b/Documentation/filesystems/ext4.txt
@@ -153,8 +153,8 @@ journal_dev=devnum When the external journal device's major/minor numbers
identified through its new major/minor numbers encoded
in devnum.
-noload Don't load the journal on mounting. Note that
- if the filesystem was not unmounted cleanly,
+norecovery Don't load the journal on mounting. Note that
+noload if the filesystem was not unmounted cleanly,
skipping the journal replay will lead to the
filesystem containing inconsistencies that can
lead to any number of problems.
@@ -353,6 +353,12 @@ noauto_da_alloc replacing existing files via patterns such as
system crashes before the delayed allocation
blocks are forced to disk.
+discard Controls whether ext4 should issue discard/TRIM
+nodiscard(*) commands to the underlying block device when
+ blocks are freed. This is useful for SSD devices
+ and sparse/thinly-provisioned LUNs, but it is off
+ by default until sufficient testing has been done.
+
Data Mode
=========
There are 3 different data modes:
diff --git a/Documentation/filesystems/nilfs2.txt b/Documentation/filesystems/nilfs2.txt
index 01539f410676..4949fcaa6b6a 100644
--- a/Documentation/filesystems/nilfs2.txt
+++ b/Documentation/filesystems/nilfs2.txt
@@ -49,8 +49,7 @@ Mount options
NILFS2 supports the following mount options:
(*) == default
-barrier=on(*) This enables/disables barriers. barrier=off disables
- it, barrier=on enables it.
+nobarrier Disables barriers.
errors=continue(*) Keep going on a filesystem error.
errors=remount-ro Remount the filesystem read-only on an error.
errors=panic Panic and halt the machine if an error occurs.
@@ -71,6 +70,10 @@ order=strict Apply strict in-order semantics that preserves sequence
blocks. That means, it is guaranteed that no
overtaking of events occurs in the recovered file
system after a crash.
+norecovery Disable recovery of the filesystem on mount.
+ This disables every write access on the device for
+ read-only mounts or snapshots. This option will fail
+ for r/w mounts on an unclean volume.
NILFS2 usage
============
diff --git a/Documentation/filesystems/ocfs2.txt b/Documentation/filesystems/ocfs2.txt
index c2a0871280a0..c58b9f5ba002 100644
--- a/Documentation/filesystems/ocfs2.txt
+++ b/Documentation/filesystems/ocfs2.txt
@@ -20,15 +20,16 @@ Lots of code taken from ext3 and other projects.
Authors in alphabetical order:
Joel Becker <joel.becker@oracle.com>
Zach Brown <zach.brown@oracle.com>
-Mark Fasheh <mark.fasheh@oracle.com>
+Mark Fasheh <mfasheh@suse.com>
Kurt Hackel <kurt.hackel@oracle.com>
+Tao Ma <tao.ma@oracle.com>
Sunil Mushran <sunil.mushran@oracle.com>
Manish Singh <manish.singh@oracle.com>
+Tiger Yang <tiger.yang@oracle.com>
Caveats
=======
Features which OCFS2 does not support yet:
- - quotas
- Directory change notification (F_NOTIFY)
- Distributed Caching (F_SETLEASE/F_GETLEASE/break_lease)
@@ -70,7 +71,6 @@ commit=nrsec (*) Ocfs2 can be told to sync all its data and metadata
performance.
localalloc=8(*) Allows custom localalloc size in MB. If the value is too
large, the fs will silently revert it to the default.
- Localalloc is not enabled for local mounts.
localflocks This disables cluster aware flock.
inode64 Indicates that Ocfs2 is allowed to create inodes at
any location in the filesystem, including those which
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 2c48f945546b..220cc6376ef8 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -38,6 +38,7 @@ Table of Contents
3.3 /proc/<pid>/io - Display the IO accounting fields
3.4 /proc/<pid>/coredump_filter - Core dump filtering settings
3.5 /proc/<pid>/mountinfo - Information about mounts
+ 3.6 /proc/<pid>/comm & /proc/<pid>/task/<tid>/comm
------------------------------------------------------------------------------
@@ -1072,7 +1073,8 @@ second). The meanings of the columns are as follows, from left to right:
- irq: servicing interrupts
- softirq: servicing softirqs
- steal: involuntary wait
-- guest: running a guest
+- guest: running a normal guest
+- guest_nice: running a niced guest
The "intr" line gives counts of interrupts serviced since boot time, for each
of the possible system interrupts. The first column is the total of all
@@ -1088,8 +1090,8 @@ The "processes" line gives the number of processes and threads created, which
includes (but is not limited to) those created by calls to the fork() and
clone() system calls.
-The "procs_running" line gives the number of processes currently running on
-CPUs.
+The "procs_running" line gives the total number of threads that are
+running or ready to run (i.e., the total number of runnable threads).
The "procs_blocked" line gives the number of processes currently blocked,
waiting for I/O to complete.
@@ -1408,3 +1410,11 @@ For more information on mount propagation see:
Documentation/filesystems/sharedsubtree.txt
+
+3.6 /proc/<pid>/comm & /proc/<pid>/task/<tid>/comm
+--------------------------------------------------------
+These files provide a method to access a tasks comm value. It also allows for
+a task to set its own or one of its thread siblings comm value. The comm value
+is limited in size compared to the cmdline value, so writing anything longer
+then the kernel's TASK_COMM_LEN (currently 16 chars) will result in a truncated
+comm value.
diff --git a/Documentation/filesystems/seq_file.txt b/Documentation/filesystems/seq_file.txt
index 0d15ebccf5b0..a1e2e0dda907 100644
--- a/Documentation/filesystems/seq_file.txt
+++ b/Documentation/filesystems/seq_file.txt
@@ -248,9 +248,7 @@ code, that is done in the initialization code in the usual way:
{
struct proc_dir_entry *entry;
- entry = create_proc_entry("sequence", 0, NULL);
- if (entry)
- entry->proc_fops = &ct_file_ops;
+ proc_create("sequence", 0, NULL, &ct_file_ops);
return 0;
}
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 623f094c9d8d..3de2f32edd90 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -472,7 +472,7 @@ __sync_single_inode) to check if ->writepages has been successful in
writing out the whole address_space.
The Writeback tag is used by filemap*wait* and sync_page* functions,
-via wait_on_page_writeback_range, to wait for all writeback to
+via filemap_fdatawait_range, to wait for all writeback to
complete. While waiting ->sync_page (if defined) will be called on
each page that is found to require writeback.