aboutsummaryrefslogtreecommitdiffstats
path: root/include/linux/security.h (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-10-28powerpc/xmon: Restrict when kernel is locked downChristopher M. Riedl1-0/+2
Xmon should be either fully or partially disabled depending on the kernel lockdown state. Put xmon into read-only mode for lockdown=integrity and prevent user entry into xmon when lockdown=confidentiality. Xmon checks the lockdown state on every attempted entry: (1) during early xmon'ing (2) when triggered via sysrq (3) when toggled via debugfs (4) when triggered via a previously enabled breakpoint The following lockdown state transitions are handled: (1) lockdown=none -> lockdown=integrity set xmon read-only mode (2) lockdown=none -> lockdown=confidentiality clear all breakpoints, set xmon read-only mode, prevent user re-entry into xmon (3) lockdown=integrity -> lockdown=confidentiality clear all breakpoints, set xmon read-only mode, prevent user re-entry into xmon Suggested-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Christopher M. Riedl <cmr@informatik.wtf> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190907061124.1947-3-cmr@informatik.wtf
2019-09-28Merge branch 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-securityLinus Torvalds1-0/+59
Pull kernel lockdown mode from James Morris: "This is the latest iteration of the kernel lockdown patchset, from Matthew Garrett, David Howells and others. From the original description: This patchset introduces an optional kernel lockdown feature, intended to strengthen the boundary between UID 0 and the kernel. When enabled, various pieces of kernel functionality are restricted. Applications that rely on low-level access to either hardware or the kernel may cease working as a result - therefore this should not be enabled without appropriate evaluation beforehand. The majority of mainstream distributions have been carrying variants of this patchset for many years now, so there's value in providing a doesn't meet every distribution requirement, but gets us much closer to not requiring external patches. There are two major changes since this was last proposed for mainline: - Separating lockdown from EFI secure boot. Background discussion is covered here: https://lwn.net/Articles/751061/ - Implementation as an LSM, with a default stackable lockdown LSM module. This allows the lockdown feature to be policy-driven, rather than encoding an implicit policy within the mechanism. The new locked_down LSM hook is provided to allow LSMs to make a policy decision around whether kernel functionality that would allow tampering with or examining the runtime state of the kernel should be permitted. The included lockdown LSM provides an implementation with a simple policy intended for general purpose use. This policy provides a coarse level of granularity, controllable via the kernel command line: lockdown={integrity|confidentiality} Enable the kernel lockdown feature. If set to integrity, kernel features that allow userland to modify the running kernel are disabled. If set to confidentiality, kernel features that allow userland to extract confidential information from the kernel are also disabled. This may also be controlled via /sys/kernel/security/lockdown and overriden by kernel configuration. New or existing LSMs may implement finer-grained controls of the lockdown features. Refer to the lockdown_reason documentation in include/linux/security.h for details. The lockdown feature has had signficant design feedback and review across many subsystems. This code has been in linux-next for some weeks, with a few fixes applied along the way. Stephen Rothwell noted that commit 9d1f8be5cf42 ("bpf: Restrict bpf when kernel lockdown is in confidentiality mode") is missing a Signed-off-by from its author. Matthew responded that he is providing this under category (c) of the DCO" * 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (31 commits) kexec: Fix file verification on S390 security: constify some arrays in lockdown LSM lockdown: Print current->comm in restriction messages efi: Restrict efivar_ssdt_load when the kernel is locked down tracefs: Restrict tracefs when the kernel is locked down debugfs: Restrict debugfs when the kernel is locked down kexec: Allow kexec_file() with appropriate IMA policy when locked down lockdown: Lock down perf when in confidentiality mode bpf: Restrict bpf when kernel lockdown is in confidentiality mode lockdown: Lock down tracing and perf kprobes when in confidentiality mode lockdown: Lock down /proc/kcore x86/mmiotrace: Lock down the testmmiotrace module lockdown: Lock down module params that specify hardware parameters (eg. ioport) lockdown: Lock down TIOCSSERIAL lockdown: Prohibit PCMCIA CIS storage when the kernel is locked down acpi: Disable ACPI table override if the kernel is locked down acpi: Ignore acpi_rsdp kernel param when the kernel has been locked down ACPI: Limit access to custom_method when the kernel is locked down x86/msr: Restrict MSR access when the kernel is locked down x86: Lock down IO port access when the kernel is locked down ...
2019-09-23Merge tag 'selinux-pr-20190917' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinuxLinus Torvalds1-2/+8
Pull selinux updates from Paul Moore: - Add LSM hooks, and SELinux access control hooks, for dnotify, fanotify, and inotify watches. This has been discussed with both the LSM and fs/notify folks and everybody is good with these new hooks. - The LSM stacking changes missed a few calls to current_security() in the SELinux code; we fix those and remove current_security() for good. - Improve our network object labeling cache so that we always return the object's label, even when under memory pressure. Previously we would return an error if we couldn't allocate a new cache entry, now we always return the label even if we can't create a new cache entry for it. - Convert the sidtab atomic_t counter to a normal u32 with READ/WRITE_ONCE() and memory barrier protection. - A few patches to policydb.c to clean things up (remove forward declarations, long lines, bad variable names, etc) * tag 'selinux-pr-20190917' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: lsm: remove current_security() selinux: fix residual uses of current_security() for the SELinux blob selinux: avoid atomic_t usage in sidtab fanotify, inotify, dnotify, security: add security hook for fs notifications selinux: always return a secid from the network caches if we find one selinux: policydb - rename type_val_to_struct_array selinux: policydb - fix some checkpatch.pl warnings selinux: shuffle around policydb.c to get rid of forward declarations
2019-08-19tracefs: Restrict tracefs when the kernel is locked downMatthew Garrett1-0/+1
Tracefs may release more information about the kernel than desirable, so restrict it when the kernel is locked down in confidentiality mode by preventing open(). (Fixed by Ben Hutchings to avoid a null dereference in default_file_open()) Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19debugfs: Restrict debugfs when the kernel is locked downDavid Howells1-0/+1
Disallow opening of debugfs files that might be used to muck around when the kernel is locked down as various drivers give raw access to hardware through debugfs. Given the effort of auditing all 2000 or so files and manually fixing each one as necessary, I've chosen to apply a heuristic instead. The following changes are made: (1) chmod and chown are disallowed on debugfs objects (though the root dir can be modified by mount and remount, but I'm not worried about that). (2) When the kernel is locked down, only files with the following criteria are permitted to be opened: - The file must have mode 00444 - The file must not have ioctl methods - The file must not have mmap (3) When the kernel is locked down, files may only be opened for reading. Normal device interaction should be done through configfs, sysfs or a miscdev, not debugfs. Note that this makes it unnecessary to specifically lock down show_dsts(), show_devs() and show_call() in the asus-wmi driver. I would actually prefer to lock down all files by default and have the the files unlocked by the creator. This is tricky to manage correctly, though, as there are 19 creation functions and ~1600 call sites (some of them in loops scanning tables). Signed-off-by: David Howells <dhowells@redhat.com> cc: Andy Shevchenko <andy.shevchenko@gmail.com> cc: acpi4asus-user@lists.sourceforge.net cc: platform-driver-x86@vger.kernel.org cc: Matthew Garrett <mjg59@srcf.ucam.org> cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <greg@kroah.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Matthew Garrett <matthewgarrett@google.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19lockdown: Lock down perf when in confidentiality modeDavid Howells1-0/+1
Disallow the use of certain perf facilities that might allow userspace to access kernel data. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19bpf: Restrict bpf when kernel lockdown is in confidentiality modeDavid Howells1-0/+1
bpf_read() and bpf_read_str() could potentially be abused to (eg) allow private keys in kernel memory to be leaked. Disable them if the kernel has been locked down in confidentiality mode. Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> cc: netdev@vger.kernel.org cc: Chun-Yi Lee <jlee@suse.com> cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19lockdown: Lock down tracing and perf kprobes when in confidentiality modeDavid Howells1-0/+1
Disallow the creation of perf and ftrace kprobes when the kernel is locked down in confidentiality mode by preventing their registration. This prevents kprobes from being used to access kernel memory to steal crypto data, but continues to allow the use of kprobes from signed modules. Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: davem@davemloft.net Cc: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19lockdown: Lock down /proc/kcoreDavid Howells1-0/+1
Disallow access to /proc/kcore when the kernel is locked down to prevent access to cryptographic data. This is limited to lockdown confidentiality mode and is still permitted in integrity mode. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19x86/mmiotrace: Lock down the testmmiotrace moduleDavid Howells1-0/+1
The testmmiotrace module shouldn't be permitted when the kernel is locked down as it can be used to arbitrarily read and write MMIO space. This is a runtime check rather than buildtime in order to allow configurations where the same kernel may be run in both locked down or permissive modes depending on local policy. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Howells <dhowells@redhat.com Signed-off-by: Matthew Garrett <mjg59@google.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Kees Cook <keescook@chromium.org> cc: Thomas Gleixner <tglx@linutronix.de> cc: Steven Rostedt <rostedt@goodmis.org> cc: Ingo Molnar <mingo@kernel.org> cc: "H. Peter Anvin" <hpa@zytor.com> cc: x86@kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19lockdown: Lock down module params that specify hardware parameters (eg. ioport)David Howells1-0/+1
Provided an annotation for module parameters that specify hardware parameters (such as io ports, iomem addresses, irqs, dma channels, fixed dma buffers and other types). Suggested-by: Alan Cox <gnomes@lxorguk.ukuu.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Jessica Yu <jeyu@kernel.org> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19lockdown: Lock down TIOCSSERIALDavid Howells1-0/+1
Lock down TIOCSSERIAL as that can be used to change the ioport and irq settings on a serial port. This only appears to be an issue for the serial drivers that use the core serial code. All other drivers seem to either ignore attempts to change port/irq or give an error. Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> cc: Jiri Slaby <jslaby@suse.com> Cc: linux-serial@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19lockdown: Prohibit PCMCIA CIS storage when the kernel is locked downDavid Howells1-0/+1
Prohibit replacement of the PCMCIA Card Information Structure when the kernel is locked down. Suggested-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19ACPI: Limit access to custom_method when the kernel is locked downMatthew Garrett1-0/+1
custom_method effectively allows arbitrary access to system memory, making it possible for an attacker to circumvent restrictions on module loading. Disable it if the kernel is locked down. Signed-off-by: Matthew Garrett <mjg59@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> cc: linux-acpi@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19x86/msr: Restrict MSR access when the kernel is locked downMatthew Garrett1-0/+1
Writing to MSRs should not be allowed if the kernel is locked down, since it could lead to execution of arbitrary code in kernel mode. Based on a patch by Kees Cook. Signed-off-by: Matthew Garrett <mjg59@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> cc: x86@kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19x86: Lock down IO port access when the kernel is locked downMatthew Garrett1-0/+1
IO port access would permit users to gain access to PCI configuration registers, which in turn (on a lot of hardware) give access to MMIO register space. This would potentially permit root to trigger arbitrary DMA, so lock it down by default. This also implicitly locks down the KDADDIO, KDDELIO, KDENABIO and KDDISABIO console ioctls. Signed-off-by: Matthew Garrett <mjg59@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> cc: x86@kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19PCI: Lock down BAR access when the kernel is locked downMatthew Garrett1-0/+1
Any hardware that can potentially generate DMA has to be locked down in order to avoid it being possible for an attacker to modify kernel code, allowing them to circumvent disabled module loading or module signing. Default to paranoid - in future we can potentially relax this for sufficiently IOMMU-isolated devices. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> cc: linux-pci@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19hibernate: Disable when the kernel is locked downJosh Boyer1-0/+1
There is currently no way to verify the resume image when returning from hibernate. This might compromise the signed modules trust model, so until we can work with signed hibernate images we disable it when the kernel is locked down. Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: rjw@rjwysocki.net Cc: pavel@ucw.cz cc: linux-pm@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19kexec_load: Disable at runtime if the kernel is locked downMatthew Garrett1-0/+1
The kexec_load() syscall permits the loading and execution of arbitrary code in ring 0, which is something that lock-down is meant to prevent. It makes sense to disable kexec_load() in this situation. This does not affect kexec_file_load() syscall which can check for a signature on the image to be booted. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Acked-by: Dave Young <dyoung@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> cc: kexec@lists.infradead.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19lockdown: Restrict /dev/{mem,kmem,port} when the kernel is locked downMatthew Garrett1-0/+1
Allowing users to read and write to core kernel memory makes it possible for the kernel to be subverted, avoiding module loading restrictions, and also to steal cryptographic information. Disallow /dev/mem and /dev/kmem from being opened this when the kernel has been locked down to prevent this. Also disallow /dev/port from being opened to prevent raw ioport access and thus DMA from being used to accomplish the same thing. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: x86@kernel.org Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19lockdown: Enforce module signatures if the kernel is locked downDavid Howells1-0/+1
If the kernel is locked down, require that all modules have valid signatures that we can verify. I have adjusted the errors generated: (1) If there's no signature (ENODATA) or we can't check it (ENOPKG, ENOKEY), then: (a) If signatures are enforced then EKEYREJECTED is returned. (b) If there's no signature or we can't check it, but the kernel is locked down then EPERM is returned (this is then consistent with other lockdown cases). (2) If the signature is unparseable (EBADMSG, EINVAL), the signature fails the check (EKEYREJECTED) or a system error occurs (eg. ENOMEM), we return the error we got. Note that the X.509 code doesn't check for key expiry as the RTC might not be valid or might not have been transferred to the kernel's clock yet. [Modified by Matthew Garrett to remove the IMA integration. This will be replaced with integration with the IMA architecture policy patchset.] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <matthewgarrett@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Jessica Yu <jeyu@kernel.org> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19security: Add a static lockdown policy LSMMatthew Garrett1-0/+3
While existing LSMs can be extended to handle lockdown policy, distributions generally want to be able to apply a straightforward static policy. This patch adds a simple LSM that can be configured to reject either integrity or all lockdown queries, and can be configured at runtime (through securityfs), boot time (via a kernel parameter) or build time (via a kconfig option). Based on initial code by David Howells. Signed-off-by: Matthew Garrett <mjg59@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19security: Add a "locked down" LSM hookMatthew Garrett1-0/+32
Add a mechanism to allow LSMs to make a policy decision around whether kernel functionality that would allow tampering with or examining the runtime state of the kernel should be permitted. Signed-off-by: Matthew Garrett <mjg59@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-19security: Support early LSMsMatthew Garrett1-0/+6
The lockdown module is intended to allow for kernels to be locked down early in boot - sufficiently early that we don't have the ability to kmalloc() yet. Add support for early initialisation of some LSMs, and then add them to the list of names when we do full initialisation later. Early LSMs are initialised in link order and cannot be overridden via boot parameters, and cannot make use of kmalloc() (since the allocator isn't initialised yet). (Fixed by Stephen Rothwell to include a stub to fix builds when !CONFIG_SECURITY) Signed-off-by: Matthew Garrett <mjg59@google.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: James Morris <jmorris@namei.org>
2019-08-12fanotify, inotify, dnotify, security: add security hook for fs notificationsAaron Goidel1-2/+8
As of now, setting watches on filesystem objects has, at most, applied a check for read access to the inode, and in the case of fanotify, requires CAP_SYS_ADMIN. No specific security hook or permission check has been provided to control the setting of watches. Using any of inotify, dnotify, or fanotify, it is possible to observe, not only write-like operations, but even read access to a file. Modeling the watch as being merely a read from the file is insufficient for the needs of SELinux. This is due to the fact that read access should not necessarily imply access to information about when another process reads from a file. Furthermore, fanotify watches grant more power to an application in the form of permission events. While notification events are solely, unidirectional (i.e. they only pass information to the receiving application), permission events are blocking. Permission events make a request to the receiving application which will then reply with a decision as to whether or not that action may be completed. This causes the issue of the watching application having the ability to exercise control over the triggering process. Without drawing a distinction within the permission check, the ability to read would imply the greater ability to control an application. Additionally, mount and superblock watches apply to all files within the same mount or superblock. Read access to one file should not necessarily imply the ability to watch all files accessed within a given mount or superblock. In order to solve these issues, a new LSM hook is implemented and has been placed within the system calls for marking filesystem objects with inotify, fanotify, and dnotify watches. These calls to the hook are placed at the point at which the target path has been resolved and are provided with the path struct, the mask of requested notification events, and the type of object on which the mark is being set (inode, superblock, or mount). The mask and obj_type have already been translated into common FS_* values shared by the entirety of the fs notification infrastructure. The path struct is passed rather than just the inode so that the mount is available, particularly for mount watches. This also allows for use of the hook by pathname-based security modules. However, since the hook is intended for use even by inode based security modules, it is not placed under the CONFIG_SECURITY_PATH conditional. Otherwise, the inode-based security modules would need to enable all of the path hooks, even though they do not use any of them. This only provides a hook at the point of setting a watch, and presumes that permission to set a particular watch implies the ability to receive all notification about that object which match the mask. This is all that is required for SELinux. If other security modules require additional hooks or infrastructure to control delivery of notification, these can be added by them. It does not make sense for us to propose hooks for which we have no implementation. The understanding that all notifications received by the requesting application are all strictly of a type for which the application has been granted permission shows that this implementation is sufficient in its coverage. Security modules wishing to provide complete control over fanotify must also implement a security_file_open hook that validates that the access requested by the watching application is authorized. Fanotify has the issue that it returns a file descriptor with the file mode specified during fanotify_init() to the watching process on event. This is already covered by the LSM security_file_open hook if the security module implements checking of the requested file mode there. Otherwise, a watching process can obtain escalated access to a file for which it has not been authorized. The selinux_path_notify hook implementation works by adding five new file permissions: watch, watch_mount, watch_sb, watch_reads, and watch_with_perm (descriptions about which will follow), and one new filesystem permission: watch (which is applied to superblock checks). The hook then decides which subset of these permissions must be held by the requesting application based on the contents of the provided mask and the obj_type. The selinux_file_open hook already checks the requested file mode and therefore ensures that a watching process cannot escalate its access through fanotify. The watch, watch_mount, and watch_sb permissions are the baseline permissions for setting a watch on an object and each are a requirement for any watch to be set on a file, mount, or superblock respectively. It should be noted that having either of the other two permissions (watch_reads and watch_with_perm) does not imply the watch, watch_mount, or watch_sb permission. Superblock watches further require the filesystem watch permission to the superblock. As there is no labeled object in view for mounts, there is no specific check for mount watches beyond watch_mount to the inode. Such a check could be added in the future, if a suitable labeled object existed representing the mount. The watch_reads permission is required to receive notifications from read-exclusive events on filesystem objects. These events include accessing a file for the purpose of reading and closing a file which has been opened read-only. This distinction has been drawn in order to provide a direct indication in the policy for this otherwise not obvious capability. Read access to a file should not necessarily imply the ability to observe read events on a file. Finally, watch_with_perm only applies to fanotify masks since it is the only way to set a mask which allows for the blocking, permission event. This permission is needed for any watch which is of this type. Though fanotify requires CAP_SYS_ADMIN, this is insufficient as it gives implicit trust to root, which we do not do, and does not support least privilege. Signed-off-by: Aaron Goidel <acgoide@tycho.nsa.gov> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-06-14LSM: switch to blocking policy update notifiersJanne Karhunen1-6/+6
Atomic policy updaters are not very useful as they cannot usually perform the policy updates on their own. Since it seems that there is no strict need for the atomicity, switch to the blocking variant. While doing so, rename the functions accordingly. Signed-off-by: Janne Karhunen <janne.karhunen@gmail.com> Acked-by: Paul Moore <paul@paul-moore.com> Acked-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2019-05-07Merge branch 'work.mount-syscalls' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-0/+7
Pull mount ABI updates from Al Viro: "The syscalls themselves, finally. That's not all there is to that stuff, but switching individual filesystems to new methods is fortunately independent from everything else, so e.g. NFS series can go through NFS tree, etc. As those conversions get done, we'll be finally able to get rid of a bunch of duplication in fs/super.c introduced in the beginning of the entire thing. I expect that to be finished in the next window..." * 'work.mount-syscalls' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: Add a sample program for the new mount API vfs: syscall: Add fspick() to select a superblock for reconfiguration vfs: syscall: Add fsmount() to create a mount for a superblock vfs: syscall: Add fsconfig() for configuring and managing a context vfs: Implement logging through fs_context vfs: syscall: Add fsopen() to prepare for superblock creation Make anon_inodes unconditional teach move_mount(2) to work with OPEN_TREE_CLONE vfs: syscall: Add move_mount(2) to move mounts around vfs: syscall: Add open_tree(2) to reference or clone a mount
2019-03-20LSM: add new hook for kernfs node initializationOndrej Mosnacek1-0/+9
This patch introduces a new security hook that is intended for initializing the security data for newly created kernfs nodes, which provide a way of storing a non-default security context, but need to operate independently from mounts (and therefore may not have an associated inode at the moment of creation). The main motivation is to allow kernfs nodes to inherit the context of the parent under SELinux, similar to the behavior of security_inode_init_security(). Other LSMs may implement their own logic for handling the creation of new nodes. This patch also adds helper functions to <linux/kernfs.h> for getting/setting security xattrs of a kernfs node so that LSMs hooks are able to do their job. Other important attributes should be accessible direcly in the kernfs_node fields (in case there is need for more, then new helpers should be added to kernfs.h along with the patch that needs them). Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> [PM: more manual merge fixes] Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-03-20vfs: syscall: Add move_mount(2) to move mounts aroundDavid Howells1-0/+7
Add a move_mount() system call that will move a mount from one place to another and, in the next commit, allow to attach an unattached mount tree. The new system call looks like the following: int move_mount(int from_dfd, const char *from_path, int to_dfd, const char *to_path, unsigned int flags); Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-api@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-03-12Merge branch 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-1/+17
Pull vfs mount infrastructure updates from Al Viro: "The rest of core infrastructure; no new syscalls in that pile, but the old parts are switched to new infrastructure. At that point conversions of individual filesystems can happen independently; some are done here (afs, cgroup, procfs, etc.), there's also a large series outside of that pile dealing with NFS (quite a bit of option-parsing stuff is getting used there - it's one of the most convoluted filesystems in terms of mount-related logics), but NFS bits are the next cycle fodder. It got seriously simplified since the last cycle; documentation is probably the weakest bit at the moment - I considered dropping the commit introducing Documentation/filesystems/mount_api.txt (cutting the size increase by quarter ;-), but decided that it would be better to fix it up after -rc1 instead. That pile allows to do followup work in independent branches, which should make life much easier for the next cycle. fs/super.c size increase is unpleasant; there's a followup series that allows to shrink it considerably, but I decided to leave that until the next cycle" * 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (41 commits) afs: Use fs_context to pass parameters over automount afs: Add fs_context support vfs: Add some logging to the core users of the fs_context log vfs: Implement logging through fs_context vfs: Provide documentation for new mount API vfs: Remove kern_mount_data() hugetlbfs: Convert to fs_context cpuset: Use fs_context kernfs, sysfs, cgroup, intel_rdt: Support fs_context cgroup: store a reference to cgroup_ns into cgroup_fs_context cgroup1_get_tree(): separate "get cgroup_root to use" into a separate helper cgroup_do_mount(): massage calling conventions cgroup: stash cgroup_root reference into cgroup_fs_context cgroup2: switch to option-by-option parsing cgroup1: switch to option-by-option parsing cgroup: take options parsing into ->parse_monolithic() cgroup: fold cgroup1_mount() into cgroup1_get_tree() cgroup: start switching to fs_context ipc: Convert mqueue fs to fs_context proc: Add fs_context support to procfs ...
2019-03-07Merge tag 'audit-pr-20190305' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/auditLinus Torvalds1-3/+2
Pull audit updates from Paul Moore: "A lucky 13 audit patches for v5.1. Despite the rather large diffstat, most of the changes are from two bug fix patches that move code from one Kconfig option to another. Beyond that bit of churn, the remaining changes are largely cleanups and bug-fixes as we slowly march towards container auditing. It isn't all boring though, we do have a couple of new things: file capabilities v3 support, and expanded support for filtering on filesystems to solve problems with remote filesystems. All changes pass the audit-testsuite. Please merge for v5.1" * tag 'audit-pr-20190305' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: mark expected switch fall-through audit: hide auditsc_get_stamp and audit_serial prototypes audit: join tty records to their syscall audit: remove audit_context when CONFIG_ AUDIT and not AUDITSYSCALL audit: remove unused actx param from audit_rule_match audit: ignore fcaps on umount audit: clean up AUDITSYSCALL prototypes and stubs audit: more filter PATH records keyed on filesystem magic audit: add support for fcaps v3 audit: move loginuid and sessionid from CONFIG_AUDITSYSCALL to CONFIG_AUDIT audit: add syscall information to CONFIG_CHANGE records audit: hand taken context to audit_kill_trees for syscall logging audit: give a clue what CONFIG_CHANGE op was involved
2019-02-28introduce cloning of fs_contextAl Viro1-0/+6
new primitive: vfs_dup_fs_context(). Comes with fs_context method (->dup()) for copying the filesystem-specific parts of fs_context, along with LSM one (->fs_context_dup()) for doing the same to LSM parts. [needs better commit message, and change of Author:, anyway] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28vfs: Put security flags into the fs_context structDavid Howells1-1/+1
Put security flags, such as SECURITY_LSM_NATIVE_LABELS, into the filesystem context so that the filesystem can communicate them to the LSM more easily. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28vfs: Add LSM hooks for the new mount APIDavid Howells1-0/+10
Add LSM hooks for use by the new mount API and filesystem context code. This includes: (1) Hooks to handle allocation, duplication and freeing of the security record attached to a filesystem context. (2) A hook to snoop source specifications. There may be multiple of these if the filesystem supports it. They will to be local files/devices if fs_context::source_is_dev is true and will be something else, possibly remote server specifications, if false. (3) A hook to snoop superblock configuration options in key[=val] form. If the LSM decides it wants to handle it, it can suppress the option being passed to the filesystem. Note that 'val' may include commas and binary data with the fsopen patch. (4) A hook to perform validation and allocation after the configuration has been done but before the superblock is allocated and set up. (5) A hook to transfer the security from the context to a newly created superblock. (6) A hook to rule on whether a path point can be used as a mountpoint. These are intended to replace: security_sb_copy_data security_sb_kern_mount security_sb_mount security_sb_set_mnt_opts security_sb_clone_mnt_opts security_sb_parse_opts_str [AV -- some of the methods being replaced are already gone, some of the methods are not added for the lack of need] Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-security-module@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-31audit: remove unused actx param from audit_rule_matchRichard Guy Briggs1-3/+2
The audit_rule_match() struct audit_context *actx parameter is not used by any in-tree consumers (selinux, apparmour, integrity, smack). The audit context is an internal audit structure that should only be accessed by audit accessor functions. It was part of commit 03d37d25e0f9 ("LSM/Audit: Introduce generic Audit LSM hooks") but appears to have never been used. Remove it. Please see the github issue https://github.com/linux-audit/audit-kernel/issues/107 Signed-off-by: Richard Guy Briggs <rgb@redhat.com> [PM: fixed the referenced commit title] Signed-off-by: Paul Moore <paul@paul-moore.com>
2019-01-10LSM: generalize flag passing to security_capableMicah Morton1-14/+14
This patch provides a general mechanism for passing flags to the security_capable LSM hook. It replaces the specific 'audit' flag that is used to tell security_capable whether it should log an audit message for the given capability check. The reason for generalizing this flag passing is so we can add an additional flag that signifies whether security_capable is being called by a setid syscall (which is needed by the proposed SafeSetID LSM). Signed-off-by: Micah Morton <mortonm@chromium.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <james.morris@microsoft.com>
2019-01-08procfs: add smack subdir to attrsCasey Schaufler1-5/+10
Back in 2007 I made what turned out to be a rather serious mistake in the implementation of the Smack security module. The SELinux module used an interface in /proc to manipulate the security context on processes. Rather than use a similar interface, I used the same interface. The AppArmor team did likewise. Now /proc/.../attr/current will tell you the security "context" of the process, but it will be different depending on the security module you're using. This patch provides a subdirectory in /proc/.../attr for Smack. Smack user space can use the "current" file in this subdirectory and never have to worry about getting SELinux attributes by mistake. Programs that use the old interface will continue to work (or fail, as the case may be) as before. The proposed S.A.R.A security module is dependent on the mechanism to create its own attr subdirectory. The original implementation is by Kees Cook. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2018-12-21LSM: new method: ->sb_add_mnt_opt()Al Viro1-2/+4
Adding options to growing mnt_opts. NFS kludge with passing context= down into non-text-options mount switched to it, and with that the last use of ->sb_parse_opts_str() is gone. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-21LSM: bury struct security_mnt_optsAl Viro1-8/+0
no users left Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-21LSM: hide struct security_mnt_opts from any generic codeAl Viro1-33/+10
Keep void * instead, allocate on demand (in parse_str_opts, at the moment). Eventually both selinux and smack will be better off with private structures with several strings in those, rather than this "counter and two pointers to dynamically allocated arrays" ugliness. This commit allows to do that at leisure, without disrupting anything outside of given module. Changes: * instead of struct security_mnt_opt use an opaque pointer initialized to NULL. * security_sb_eat_lsm_opts(), security_sb_parse_opts_str() and security_free_mnt_opts() take it as var argument (i.e. as void **); call sites are unchanged. * security_sb_set_mnt_opts() and security_sb_remount() take it by value (i.e. as void *). * new method: ->sb_free_mnt_opts(). Takes void *, does whatever freeing that needs to be done. * ->sb_set_mnt_opts() and ->sb_remount() might get NULL as mnt_opts argument, meaning "empty". Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-21LSM: split ->sb_set_mnt_opts() out of ->sb_kern_mount()Al Viro1-4/+2
... leaving the "is it kernel-internal" logics in the caller. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-21new helper: security_sb_eat_lsm_opts()Al Viro1-25/+3
combination of alloc_secdata(), security_sb_copy_data(), security_sb_parse_opt_str() and free_secdata(). Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-21LSM: lift extracting and parsing LSM options into the caller of ->sb_remount()Al Viro1-2/+3
This paves the way for retaining the LSM options from a common filesystem mount context during a mount parameter parsing phase to be instituted prior to actual mount/reconfiguration actions. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-12-21LSM: lift parsing LSM options into the caller of ->sb_kern_mount()Al Viro1-2/+4
This paves the way for retaining the LSM options from a common filesystem mount context during a mount parameter parsing phase to be instituted prior to actual mount/reconfiguration actions. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-10-03signal: Distinguish between kernel_siginfo and siginfoEric W. Biederman1-3/+3
Linus recently observed that if we did not worry about the padding member in struct siginfo it is only about 48 bytes, and 48 bytes is much nicer than 128 bytes for allocating on the stack and copying around in the kernel. The obvious thing of only adding the padding when userspace is including siginfo.h won't work as there are sigframe definitions in the kernel that embed struct siginfo. So split siginfo in two; kernel_siginfo and siginfo. Keeping the traditional name for the userspace definition. While the version that is used internally to the kernel and ultimately will not be padded to 128 bytes is called kernel_siginfo. The definition of struct kernel_siginfo I have put in include/signal_types.h A set of buildtime checks has been added to verify the two structures have the same field offsets. To make it easy to verify the change kernel_siginfo retains the same size as siginfo. The reduction in size comes in a following change. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2018-08-15Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-securityLinus Torvalds1-0/+27
Pull security subsystem updates from James Morris: - kstrdup() return value fix from Eric Biggers - Add new security_load_data hook to differentiate security checking of kernel-loaded binaries in the case of there being no associated file descriptor, from Mimi Zohar. - Add ability to IMA to specify a policy at build-time, rather than just via command line params or by loading a custom policy, from Mimi. - Allow IMA and LSMs to prevent sysfs firmware load fallback (e.g. if using signed firmware), from Mimi. - Allow IMA to deny loading of kexec kernel images, as they cannot be measured by IMA, from Mimi. * 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: security: check for kstrdup() failure in lsm_append() security: export security_kernel_load_data function ima: based on policy warn about loading firmware (pre-allocated buffer) module: replace the existing LSM hook in init_module ima: add build time policy ima: based on policy require signed firmware (sysfs fallback) firmware: add call to LSM hook before firmware sysfs fallback ima: based on policy require signed kexec kernel images kexec: add call to LSM hook in original kexec_load syscall security: define new LSM hook named security_kernel_load_data MAINTAINERS: remove the outdated "LINUX SECURITY MODULE (LSM) FRAMEWORK" entry
2018-07-16security: define new LSM hook named security_kernel_load_dataMimi Zohar1-0/+27
Differentiate between the kernel reading a file specified by userspace from the kernel loading a buffer containing data provided by userspace. This patch defines a new LSM hook named security_kernel_load_data(). Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Luis R. Rodriguez <mcgrof@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Casey Schaufler <casey@schaufler-ca.com> Acked-by: Serge Hallyn <serge@hallyn.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: James Morris <james.morris@microsoft.com>
2018-07-12security_file_open(): lose cred argumentAl Viro1-3/+2
Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2018-05-04security: add hook for socketpair()David Herrmann1-0/+7
Right now the LSM labels for socketpairs are always uninitialized, since there is no security hook for the socketpair() syscall. This patch adds the required hooks so LSMs can properly label socketpairs. This allows SO_PEERSEC to return useful information on those sockets. Note that the behavior of socketpair() can be emulated by creating a listener socket, connecting to it, and then discarding the initial listener socket. With this workaround, SO_PEERSEC would return the caller's security context. However, with socketpair(), the uninitialized context is returned unconditionally. This is unexpected and makes socketpair() less useful in situations where the security context is crucial to the application. With the new socketpair-hook this disparity can be solved by making socketpair() return the expected security context. Acked-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Tom Gundersen <teg@jklm.no> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: James Morris <james.morris@microsoft.com>
2018-04-24Merge tag 'v4.17-rc2' into next-generalJames Morris1-35/+58
Sync to Linux 4.17-rc2 for developers.