aboutsummaryrefslogtreecommitdiffstats
path: root/security/selinux/ss (follow)
AgeCommit message (Collapse)AuthorFilesLines
2011-03-28SELinux: Write class field in role_trans_write.Harry Ciao1-2/+9
If kernel policy version is >= 26, then write the class field of the role_trans structure into the binary reprensentation. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Eric Paris <eparis@redhat.com>
2011-03-28SELinux: Compute role in newcontext for all classesHarry Ciao1-11/+9
Apply role_transition rules for all kinds of classes. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Eric Paris <eparis@redhat.com>
2011-03-28SELinux: Add class support to the role_trans structureHarry Ciao2-1/+16
If kernel policy version is >= 26, then the binary representation of the role_trans structure supports specifying the class for the current subject or the newly created object. If kernel policy version is < 26, then the class field would be default to the process class. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Eric Paris <eparis@redhat.com>
2011-03-08Merge branch 'master' of git://git.infradead.org/users/eparis/selinux into nextJames Morris6-33/+214
2011-03-08Merge branch 'master'; commit 'v2.6.38-rc7' into nextJames Morris2-3/+3
2011-03-03SELinux: Socket retains creator role and MLS attributeHarry Ciao3-7/+29
The socket SID would be computed on creation and no longer inherit its creator's SID by default. Socket may have a different type but needs to retain the creator's role and MLS attribute in order not to break labeled networking and network access control. The kernel value for a class would be used to determine if the class if one of socket classes. If security_compute_sid is called from userspace the policy value for a class would be mapped to the relevant kernel value first. Signed-off-by: Harry Ciao <qingtao.cao@windriver.com> Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2011-02-01SELinux: Use dentry name in new object labelingEric Paris4-26/+185
Currently SELinux has rules which label new objects according to 3 criteria. The label of the process creating the object, the label of the parent directory, and the type of object (reg, dir, char, block, etc.) This patch adds a 4th criteria, the dentry name, thus we can distinguish between creating a file in an etc_t directory called shadow and one called motd. There is no file globbing, regex parsing, or anything mystical. Either the policy exactly (strcmp) matches the dentry name of the object or it doesn't. This patch has no changes from today if policy does not implement the new rules. Signed-off-by: Eric Paris <eparis@redhat.com>
2011-01-24selinux: return -ENOMEM when memory allocation failsDavidlohr Bueso2-3/+3
Return -ENOMEM when memory allocation fails in cond_init_bool_indexes, correctly propagating error code to caller. Signed-off-by: Davidlohr Bueso <dave@gnu.org> Signed-off-by: James Morris <jmorris@namei.org>
2011-01-24security:selinux: kill unused MAX_AVTAB_HASH_MASK and ebitmap_startbitShan Wei2-2/+0
Kill unused MAX_AVTAB_HASH_MASK and ebitmap_startbit. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-12-07selinux: cache sidtab_context_to_sid resultsEric Paris2-2/+39
sidtab_context_to_sid takes up a large share of time when creating large numbers of new inodes (~30-40% in oprofile runs). This patch implements a cache of 3 entries which is checked before we do a full context_to_sid lookup. On one system this showed over a x3 improvement in the number of inodes that could be created per second and around a 20% improvement on another system. Any time we look up the same context string sucessivly (imagine ls -lZ) we should hit this cache hot. A cache miss should have a relatively minor affect on performance next to doing the full table search. All operations on the cache are done COMPLETELY lockless. We know that all struct sidtab_node objects created will never be deleted until a new policy is loaded thus we never have to worry about a pointer being dereferenced. Since we also know that pointer assignment is atomic we know that the cache will always have valid pointers. Given this information we implement a FIFO cache in an array of 3 pointers. Every result (whether a cache hit or table lookup) will be places in the 0 spot of the cache and the rest of the entries moved down one spot. The 3rd entry will be lost. Races are possible and are even likely to happen. Lets assume that 4 tasks are hitting sidtab_context_to_sid. The first task checks against the first entry in the cache and it is a miss. Now lets assume a second task updates the cache with a new entry. This will push the first entry back to the second spot. Now the first task might check against the second entry (which it already checked) and will miss again. Now say some third task updates the cache and push the second entry to the third spot. The first task my check the third entry (for the third time!) and again have a miss. At which point it will just do a full table lookup. No big deal! Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30SELinux: merge policydb_index_classes and policydb_index_othersEric Paris1-59/+10
We duplicate functionality in policydb_index_classes() and policydb_index_others(). This patch merges those functions just to make it clear there is nothing special happening here. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30selinux: convert part of the sym_val_to_name array to use flex_arrayEric Paris5-68/+127
The sym_val_to_name type array can be quite large as it grows linearly with the number of types. With known policies having over 5k types these allocations are growing large enough that they are likely to fail. Convert those to flex_array so no allocation is larger than PAGE_SIZE Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30selinux: convert type_val_to_struct to flex_arrayEric Paris3-13/+34
In rawhide type_val_to_struct will allocate 26848 bytes, an order 3 allocations. While this hasn't been seen to fail it isn't outside the realm of possibiliy on systems with severe memory fragmentation. Convert to flex_array so no allocation will ever be bigger than PAGE_SIZE. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30selinux: rework security_netlbl_secattr_to_sidEric Paris1-21/+21
security_netlbl_secattr_to_sid is difficult to follow, especially the return codes. Try to make the function obvious. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30SELinux: standardize return code handling in selinuxfs.cEric Paris1-171/+157
selinuxfs.c has lots of different standards on how to handle return paths on error. For the most part transition to rc=errno if (failure) goto out; [...] out: cleanup() return rc; Instead of doing cleanup mid function, or having multiple returns or other options. This doesn't do that for every function, but most of the complex functions which have cleanup routines on error. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-11-30SELinux: standardize return code handling in policydb.cEric Paris1-287/+268
policydb.c has lots of different standards on how to handle return paths on error. For the most part transition to rc=errno if (failure) goto out; [...] out: cleanup() return rc; Instead of doing cleanup mid function, or having multiple returns or other options. This doesn't do that for every function, but most of the complex functions which have cleanup routines on error. Signed-off-by: Eric Paris <eparis@redhat.com>
2010-10-21selinux: include vmalloc.h for vmalloc_userStephen Rothwell1-0/+1
Include vmalloc.h for vmalloc_user (fixes ppc build warning). Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21selinux: implement mmap on /selinux/policyEric Paris1-1/+1
/selinux/policy allows a user to copy the policy back out of the kernel. This patch allows userspace to actually mmap that file and use it directly. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21SELinux: allow userspace to read policy back out of the kernelEric Paris9-2/+1158
There is interest in being able to see what the actual policy is that was loaded into the kernel. The patch creates a new selinuxfs file /selinux/policy which can be read by userspace. The actual policy that is loaded into the kernel will be written back out to userspace. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21SELinux: drop useless (and incorrect) AVTAB_MAX_SIZEEric Paris2-3/+2
AVTAB_MAX_SIZE was a define which was supposed to be used in userspace to define a maximally sized avtab when userspace wasn't sure how big of a table it needed. It doesn't make sense in the kernel since we always know our table sizes. The only place it is used we have a more appropiately named define called AVTAB_MAX_HASH_BUCKETS, use that instead. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21SELinux: deterministic ordering of range transition rulesEric Paris1-3/+13
Range transition rules are placed in the hash table in an (almost) arbitrary order. This patch inserts them in a fixed order to make policy retrival more predictable. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21security: secid_to_secctx returns len when data is NULLEric Paris1-2/+9
With the (long ago) interface change to have the secid_to_secctx functions do the string allocation instead of having the caller do the allocation we lost the ability to query the security server for the length of the upcoming string. The SECMARK code would like to allocate a netlink skb with enough length to hold the string but it is just too unclean to do the string allocation twice or to do the allocation the first time and hold onto the string and slen. This patch adds the ability to call security_secid_to_secctx() with a NULL data pointer and it will just set the slen pointer. Signed-off-by: Eric Paris <eparis@redhat.com> Reviewed-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21selinux: fix up style problem on /selinux/statusKaiGai Kohei1-9/+6
This patch fixes up coding-style problem at this commit: 4f27a7d49789b04404eca26ccde5f527231d01d5 selinux: fast status update interface (/selinux/status) Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21selinux: really fix dependency causing parallel compile failure.Paul Gortmaker1-9/+0
While the previous change to the selinux Makefile reduced the window significantly for this failure, it is still possible to see a compile failure where cpp starts processing selinux files before the auto generated flask.h file is completed. This is easily reproduced by adding the following temporary change to expose the issue everytime: - cmd_flask = scripts/selinux/genheaders/genheaders ... + cmd_flask = sleep 30 ; scripts/selinux/genheaders/genheaders ... This failure happens because the creation of the object files in the ss subdir also depends on flask.h. So simply incorporate them into the parent Makefile, as the ss/Makefile really doesn't do anything unique. With this change, compiling of all selinux files is dependent on completion of the header file generation, and this test case with the "sleep 30" now confirms it is functioning as expected. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21selinux: fast status update interface (/selinux/status)KaiGai Kohei3-1/+133
This patch provides a new /selinux/status entry which allows applications read-only mmap(2). This region reflects selinux_kernel_status structure in kernel space. struct selinux_kernel_status { u32 length; /* length of this structure */ u32 sequence; /* sequence number of seqlock logic */ u32 enforcing; /* current setting of enforcing mode */ u32 policyload; /* times of policy reloaded */ u32 deny_unknown; /* current setting of deny_unknown */ }; When userspace object manager caches access control decisions provided by SELinux, it needs to invalidate the cache on policy reload and setenforce to keep consistency. However, the applications need to check the kernel state for each accesses on userspace avc, or launch a background worker process. In heuristic, frequency of invalidation is much less than frequency of making access control decision, so it is annoying to invoke a system call to check we don't need to invalidate the userspace cache. If we can use a background worker thread, it allows to receive invalidation messages from the kernel. But it requires us an invasive coding toward the base application in some cases; E.g, when we provide a feature performing with SELinux as a plugin module, it is unwelcome manner to launch its own worker thread from the module. If we could map /selinux/status to process memory space, application can know updates of selinux status; policy reload or setenforce. A typical application checks selinux_kernel_status::sequence when it tries to reference userspace avc. If it was changed from the last time when it checked userspace avc, it means something was updated in the kernel space. Then, the application can reset userspace avc or update current enforcing mode, without any system call invocations. This sequence number is updated according to the seqlock logic, so we need to wait for a while if it is odd number. Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com> Acked-by: Eric Paris <eparis@redhat.com> -- security/selinux/include/security.h | 21 ++++++ security/selinux/selinuxfs.c | 56 +++++++++++++++ security/selinux/ss/Makefile | 2 +- security/selinux/ss/services.c | 3 + security/selinux/ss/status.c | 129 +++++++++++++++++++++++++++++++++++ 5 files changed, 210 insertions(+), 1 deletions(-) Signed-off-by: James Morris <jmorris@namei.org>
2010-10-21selinux: type_bounds_sanity_check has a meaningless variable declarationEric Paris1-2/+2
type is not used at all, stop declaring and assigning it. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02selinux: convert the policy type_attr_map to flex_arrayEric Paris3-13/+39
Current selinux policy can have over 3000 types. The type_attr_map in policy is an array sized by the number of types times sizeof(struct ebitmap) (12 on x86_64). Basic math tells us the array is going to be of length 3000 x 12 = 36,000 bytes. The largest 'safe' allocation on a long running system is 16k. Most of the time a 32k allocation will work. But on long running systems a 64k allocation (what we need) can fail quite regularly. In order to deal with this I am converting the type_attr_map to use flex_arrays. Let the library code deal with breaking this into PAGE_SIZE pieces. -v2 rework some of the if(!obj) BUG() to be BUG_ON(!obj) drop flex_array_put() calls and just use a _get() object directly -v3 make apply to James' tree (drop the policydb_write changes) Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02SELinux: break ocontext reading into a separate functionEric Paris1-111/+133
Move the reading of ocontext type data out of policydb_read() in a separate function ocontext_read() Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02SELinux: move genfs read to a separate functionEric Paris1-105/+133
move genfs read functionality out of policydb_read() and into a new function called genfs_read() Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02selinux: fix error codes in symtab_init()Dan Carpenter1-1/+1
hashtab_create() only returns NULL on allocation failures to -ENOMEM is appropriate here. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02selinux: fix error codes in cond_read_bool()Dan Carpenter1-5/+8
The original code always returned -1 (-EPERM) on error. The new code returns either -ENOMEM, or -EINVAL or it propagates the error codes from lower level functions next_entry() or hashtab_insert(). next_entry() returns -EINVAL. hashtab_insert() returns -EINVAL, -EEXIST, or -ENOMEM. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02selinux: fix error codes in cond_policydb_init()Dan Carpenter1-2/+6
It's better to propagate the error code from avtab_init() instead of returning -1 (-EPERM). It turns out that avtab_init() never fails so this patch doesn't change how the code runs but it's still a clean up. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02selinux: fix error codes in cond_read_node()Dan Carpenter1-8/+12
Originally cond_read_node() returned -1 (-EPERM) on errors which was incorrect. Now it either propagates the error codes from lower level functions next_entry() or cond_read_av_list() or it returns -ENOMEM or -EINVAL. next_entry() returns -EINVAL. cond_read_av_list() returns -EINVAL or -ENOMEM. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02selinux: fix error codes in cond_read_av_list()Dan Carpenter1-6/+8
After this patch cond_read_av_list() no longer returns -1 for any errors. It just propagates error code back from lower levels. Those can either be -EINVAL or -ENOMEM. I also modified cond_insertf() since cond_read_av_list() passes that as a function pointer to avtab_read_item(). It isn't used anywhere else. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02selinux: propagate error codes in cond_read_list()Dan Carpenter1-4/+6
These are passed back when the security module gets loaded. The original code always returned -1 (-EPERM) on error but after this patch it can return -EINVAL, or -ENOMEM or propagate the error code from cond_read_node(). cond_read_node() still returns -1 all the time, but I fix that in a later patch. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02selinux: cleanup return codes in avtab_read_item()Dan Carpenter1-20/+19
The avtab_read_item() function tends to return -1 as a default error code which is wrong (-1 means -EPERM). I modified it to return appropriate error codes which is -EINVAL or the error code from next_entry() or insertf(). next_entry() returns -EINVAL. insertf() is a function pointer to either avtab_insert() or cond_insertf(). avtab_insert() returns -EINVAL, -ENOMEM, and -EEXIST. cond_insertf() currently returns -1, but I will fix it in a later patch. There is code in avtab_read() which translates the -1 returns from avtab_read_item() to -EINVAL. The translation is no longer needed, so I removed it. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02SELinux: seperate range transition rules to a seperate functionEric Paris1-64/+75
Move the range transition rule to a separate function, range_read(), rather than doing it all in policydb_read() Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-05-17security/selinux/ss: Use kstrdupJulia Lawall1-2/+1
Use kstrdup when the goal of an allocation is copy a string into the allocated region. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression from,to; expression flag,E1,E2; statement S; @@ - to = kmalloc(strlen(from) + 1,flag); + to = kstrdup(from, flag); ... when != \(from = E1 \| to = E1 \) if (to==NULL || ...) S ... when != \(from = E2 \| to = E2 \) - strcpy(to, from); // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-04-21SELinux: return error codes on policy load failureEric Paris1-15/+22
policy load failure always return EINVAL even if the failure was for some other reason (usually ENOMEM). This patch passes error codes back up the stack where they will make their way to userspace. This might help in debugging future problems with policy load. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-04-09Security: Fix coding style in security/wzt.wzt@gmail.com3-9/+9
Fix coding style in security/ Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-03-16SELinux: Reduce max avtab size to avoid page allocation failuresStephen Smalley1-1/+1
Reduce MAX_AVTAB_HASH_BITS so that the avtab allocation is an order 2 allocation rather than an order 4 allocation on x86_64. This addresses reports of page allocation failures: http://marc.info/?l=selinux&m=126757230625867&w=2 https://bugzilla.redhat.com/show_bug.cgi?id=570433 Reported-by: Russell Coker <russell@coker.com.au> Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-03-09Merge branch 'next-queue' into nextJames Morris2-2/+1
2010-03-08selinux: const strings in tablesStephen Hemminger1-1/+1
Several places strings tables are used that should be declared const. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-03-03Selinux: Remove unused headers slab.h in selinux/ss/symtab.cwzt.wzt@gmail.com1-1/+0
slab.h is unused in symtab.c, so remove it. Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-03-01Merge branch 'next' into for-linusJames Morris7-213/+266
2010-02-25netlabel: fix export of SELinux categories > 127Joshua Roys1-1/+1
This fixes corrupted CIPSO packets when SELinux categories greater than 127 are used. The bug occured on the second (and later) loops through the while; the inner for loop through the ebitmap->maps array used the same index as the NetLabel catmap->bitmap array, even though the NetLabel bitmap is twice as long as the SELinux bitmap. Signed-off-by: Joshua Roys <joshua.roys@gtri.gatech.edu> Acked-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
2010-02-22selinux: libsepol: remove dead code in check_avtab_hierarchy_callback()KaiGai Kohei1-4/+39
This patch revert the commit of 7d52a155e38d5a165759dbbee656455861bf7801 which removed a part of type_attribute_bounds_av as a dead code. However, at that time, we didn't find out the target side boundary allows to handle some of pseudo /proc/<pid>/* entries with its process's security context well. Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> -- security/selinux/ss/services.c | 43 ++++++++++++++++++++++++++++++++++++--- 1 files changed, 39 insertions(+), 4 deletions(-) Signed-off-by: James Morris <jmorris@namei.org>
2010-02-16security: fix a couple of sparse warningsJames Morris1-2/+3
Fix a couple of sparse warnings for callers of context_struct_to_string, which takes a *u32, not an *int. These cases are harmless as the values are not used. Signed-off-by: James Morris <jmorris@namei.org> Acked-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
2010-02-04selinux: allow MLS->non-MLS and vice versa upon policy reloadGuido Trentalancia7-57/+80
Allow runtime switching between different policy types (e.g. from a MLS/MCS policy to a non-MLS/non-MCS policy or viceversa). Signed-off-by: Guido Trentalancia <guido@trentalancia.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2010-02-04selinux: load the initial SIDs upon every policy loadGuido Trentalancia1-4/+12
Always load the initial SIDs, even in the case of a policy reload and not just at the initial policy load. This comes particularly handy after the introduction of a recent patch for enabling runtime switching between different policy types, although this patch is in theory independent from that feature. Signed-off-by: Guido Trentalancia <guido@trentalancia.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>