Age | Commit message (Collapse) | Author | Files | Lines |
|
Hulk Robot reported a BUG_ON:
==================================================================
kernel BUG at fs/ext4/extents_status.c:199!
[...]
RIP: 0010:ext4_es_end fs/ext4/extents_status.c:199 [inline]
RIP: 0010:__es_tree_search+0x1e0/0x260 fs/ext4/extents_status.c:217
[...]
Call Trace:
ext4_es_cache_extent+0x109/0x340 fs/ext4/extents_status.c:766
ext4_cache_extents+0x239/0x2e0 fs/ext4/extents.c:561
ext4_find_extent+0x6b7/0xa20 fs/ext4/extents.c:964
ext4_ext_map_blocks+0x16b/0x4b70 fs/ext4/extents.c:4384
ext4_map_blocks+0xe26/0x19f0 fs/ext4/inode.c:567
ext4_getblk+0x320/0x4c0 fs/ext4/inode.c:980
ext4_bread+0x2d/0x170 fs/ext4/inode.c:1031
ext4_quota_read+0x248/0x320 fs/ext4/super.c:6257
v2_read_header+0x78/0x110 fs/quota/quota_v2.c:63
v2_check_quota_file+0x76/0x230 fs/quota/quota_v2.c:82
vfs_load_quota_inode+0x5d1/0x1530 fs/quota/dquot.c:2368
dquot_enable+0x28a/0x330 fs/quota/dquot.c:2490
ext4_quota_enable fs/ext4/super.c:6137 [inline]
ext4_enable_quotas+0x5d7/0x960 fs/ext4/super.c:6163
ext4_fill_super+0xa7c9/0xdc00 fs/ext4/super.c:4754
mount_bdev+0x2e9/0x3b0 fs/super.c:1158
mount_fs+0x4b/0x1e4 fs/super.c:1261
[...]
==================================================================
Above issue may happen as follows:
-------------------------------------
ext4_fill_super
ext4_enable_quotas
ext4_quota_enable
ext4_iget
__ext4_iget
ext4_ext_check_inode
ext4_ext_check
__ext4_ext_check
ext4_valid_extent_entries
Check for overlapping extents does't take effect
dquot_enable
vfs_load_quota_inode
v2_check_quota_file
v2_read_header
ext4_quota_read
ext4_bread
ext4_getblk
ext4_map_blocks
ext4_ext_map_blocks
ext4_find_extent
ext4_cache_extents
ext4_es_cache_extent
ext4_es_cache_extent
__es_tree_search
ext4_es_end
BUG_ON(es->es_lblk + es->es_len < es->es_lblk)
The error ext4 extents is as follows:
0af3 0300 0400 0000 00000000 extent_header
00000000 0100 0000 12000000 extent1
00000000 0100 0000 18000000 extent2
02000000 0400 0000 14000000 extent3
In the ext4_valid_extent_entries function,
if prev is 0, no error is returned even if lblock<=prev.
This was intended to skip the check on the first extent, but
in the error image above, prev=0+1-1=0 when checking the second extent,
so even though lblock<=prev, the function does not return an error.
As a result, bug_ON occurs in __es_tree_search and the system panics.
To solve this problem, we only need to check that:
1. The lblock of the first extent is not less than 0.
2. The lblock of the next extent is not less than
the next block of the previous extent.
The same applies to extent_idx.
Cc: stable@kernel.org
Fixes: 5946d089379a ("ext4: check for overlapping extents in ext4_valid_extent_entries()")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220518120816.1541863-1-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
A maliciously corrupted filesystem can contain cycles in the h-tree
stored inside a directory. That can easily lead to the kernel corrupting
tree nodes that were already verified under its hands while doing a node
split and consequently accessing unallocated memory. Fix the problem by
verifying traversed block numbers are unique.
Cc: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220518093332.13986-2-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Before splitting a directory block verify its directory entries are sane
so that the splitting code does not access memory it should not.
Cc: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220518093332.13986-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
The EXT4_FC_REPLAY bit in sbi->s_mount_state is used to indicate that
we are in the middle of replay the fast commit journal. This was
actually a mistake, since the sbi->s_mount_info is initialized from
es->s_state. Arguably s_mount_state is misleadingly named, but the
name is historical --- s_mount_state and s_state dates back to ext2.
What should have been used is the ext4_{set,clear,test}_mount_flag()
inline functions, which sets EXT4_MF_* bits in sbi->s_mount_flags.
The problem with using EXT4_FC_REPLAY is that a maliciously corrupted
superblock could result in EXT4_FC_REPLAY getting set in
s_mount_state. This bypasses some sanity checks, and this can trigger
a BUG() in ext4_es_cache_extent(). As a easy-to-backport-fix, filter
out the EXT4_FC_REPLAY bit for now. We should eventually transition
away from EXT4_FC_REPLAY to something like EXT4_MF_REPLAY.
Cc: stable@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20220420192312.1655305-1-phind.uet@gmail.com
Link: https://lore.kernel.org/r/20220517174028.942119-1-tytso@mit.edu
Reported-by: syzbot+c7358a3cd05ee786eb31@syzkaller.appspotmail.com
|
|
Before this patch, function bh_get used block_map to figure out the
block it needed to read in from the quota_change file. This patch
changes it to use iomap directly to make it more efficient.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Before this patch, functions gfs2_qa_get and _put used the i_rw_mutex to
prevent simultaneous access to its i_qadata. But i_rw_mutex is now used
for many other things, including iomap_begin and end, which causes a
conflict according to lockdep. We cannot just remove the lock since
simultaneous opens (gfs2_open -> gfs2_open_common -> gfs2_qa_get) can
then stomp on each others values for i_qadata.
This patch solves the conflict by using the i_lock spin_lock in the inode
to prevent simultaneous access.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
The bug that 27ca8273f ("gfs2: Make sure FITRIM minlen is rounded up to
fs block size") fixes was a little confusing as the user saw
"Input/output error" which masked the -EINVAL that sb_issue_discard()
returned.
sb_issue_discard() can fail for various reasons, so we should return its
return value from gfs2_rgrp_send_discards() to avoid all errors being
reported as IO errors.
This improves error reporting for FITRIM and makes no difference to the
-o discard code path because the return value from
gfs2_rgrp_send_discards() gets thrown away in that case (and the option
switches off). Presumably that's why it was ok to just return -EIO in
the past, before FITRIM was implemented.
Tested with xfstests.
Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Clang's structure layout randomization feature gets upset when it sees
struct address_space (which is randomized) cast to struct gfs2_glock.
This is due to seeing the mapping pointer as being treated as an array
of gfs2_glock, rather than "something else, before struct address_space":
In file included from fs/gfs2/acl.c:23:
fs/gfs2/meta_io.h:44:12: error: casting from randomized structure pointer type 'struct address_space *' to 'struct gfs2_glock *'
return (((struct gfs2_glock *)mapping) - 1)->gl_name.ln_sbd;
^
Replace the instances of open-coded pointer math with container_of()
usage, and update the allocator to match.
Some cleanups and conversion of gfs2_glock_get() and
gfs2_glock_dealloc() by Andreas.
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/202205041550.naKxwCBj-lkp@intel.com
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Bill Wendling <morbo@google.com>
Cc: cluster-devel@redhat.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Add some comments explaining the oddities of partial direct I/O reads
and writes.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
KGDB and KDB allow read and write access to kernel memory, and thus
should be restricted during lockdown. An attacker with access to a
serial port (for example, via a hypervisor console, which some cloud
vendors provide over the network) could trigger the debugger so it is
important that the debugger respect the lockdown mode when/if it is
triggered.
Fix this by integrating lockdown into kdb's existing permissions
mechanism. Unfortunately kgdb does not have any permissions mechanism
(although it certainly could be added later) so, for now, kgdb is simply
and brutally disabled by immediately exiting the gdb stub without taking
any action.
For lockdowns established early in the boot (e.g. the normal case) then
this should be fine but on systems where kgdb has set breakpoints before
the lockdown is enacted than "bad things" will happen.
CVE: CVE-2022-21499
Co-developed-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 87c9ce3ffec9 ("zonefs: Add active seq file accounting") wrongly
changed zonefs_init_file_inode() to always return 0 even if the call to
zonefs_zone_mgmt() fails. Fix this by propagating zonefs_zone_mgmt()
return value as the return value for zonefs_init_file_inode().
Fixes: 87c9ce3ffec9 ("zonefs: Add active seq file accounting")
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
|
|
Get rid of redundant assignments which end up in values not being
read either because they are overwritten or the function ends.
Reported by clang-tidy [deadcode.DeadStores]
Signed-off-by: Michal Orzel <michalorzel.eng@gmail.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
|
|
Create a maintainer entry for CAAM trusted keys in the Linux keyring.
Reviewed-by: Pankaj Gupta <pankaj.gupta@nxp.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Update documentation for trusted key use with the Cryptographic
Acceleration and Assurance Module (CAAM), an IP on NXP SoCs.
Reviewed-by: Pankaj Gupta <pankaj.gupta@nxp.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
The Cryptographic Acceleration and Assurance Module (CAAM) is an IP core
built into many newer i.MX and QorIQ SoCs by NXP.
The CAAM does crypto acceleration, hardware number generation and
has a blob mechanism for encapsulation/decapsulation of sensitive material.
This blob mechanism depends on a device specific random 256-bit One Time
Programmable Master Key that is fused in each SoC at manufacturing
time. This key is unreadable and can only be used by the CAAM for AES
encryption/decryption of user data.
This makes it a suitable backend (source) for kernel trusted keys.
Previous commits generalized trusted keys to support multiple backends
and added an API to access the CAAM blob mechanism. Based on these,
provide the necessary glue to use the CAAM for trusted keys.
Reviewed-by: David Gstir <david@sigma-star.at>
Reviewed-by: Pankaj Gupta <pankaj.gupta@nxp.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Tim Harvey <tharvey@gateworks.com>
Tested-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Tested-by: Pankaj Gupta <pankaj.gupta@nxp.com>
Tested-by: Michael Walle <michael@walle.cc> # on ls1028a (non-E and E)
Tested-by: John Ernberg <john.ernberg@actia.se> # iMX8QXP
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
The NXP Cryptographic Acceleration and Assurance Module (CAAM)
can be used to protect user-defined data across system reboot:
- When the system is fused and boots into secure state, the master
key is a unique never-disclosed device-specific key
- random key is encrypted by key derived from master key
- data is encrypted using the random key
- encrypted data and its encrypted random key are stored alongside
- This blob can now be safely stored in non-volatile memory
On next power-on:
- blob is loaded into CAAM
- CAAM writes decrypted data either into memory or key register
Add functions to realize encrypting and decrypting into memory alongside
the CAAM driver.
They will be used in a later commit as a source for the trusted key
seal/unseal mechanism.
Reviewed-by: David Gstir <david@sigma-star.at>
Reviewed-by: Pankaj Gupta <pankaj.gupta@nxp.com>
Tested-by: Tim Harvey <tharvey@gateworks.com>
Tested-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Tested-by: Pankaj Gupta <pankaj.gupta@nxp.com>
Tested-by: Michael Walle <michael@walle.cc> # on ls1028a (non-E and E)
Tested-by: John Ernberg <john.ernberg@actia.se> # iMX8QXP
Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Depending on SoC variant, a CAAM may be available, but with some futures
fused out. The LS1028A (non-E) SoC is one such SoC and while it
indicates BLOB support, BLOB operations will ultimately fail, because
there is no AES support. Add a new blob_present member to reflect
whether both BLOB support and the AES support it depends on is
available.
These will be used in a follow-up commit to allow blob driver
initialization to error out on SoCs without the necessary hardware
support instead of failing at runtime with a cryptic
caam_jr 8020000.jr: 20000b0f: CCB: desc idx 11: : Invalid CHA selected.
Co-developed-by: Michael Walle <michael@walle.cc>
Signed-off-by: Michael Walle <michael@walle.cc>
Tested-by: Michael Walle <michael@walle.cc> # on ls1028a (non-E and E)
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Reviewed-by: Pankaj Gupta <pankaj.gupta@nxp.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
The two existing trusted key sources don't make use of the kernel RNG,
but instead let the hardware doing the sealing/unsealing also
generate the random key material. However, both users and future
backends may want to place less trust into the quality of the trust
source's random number generator and instead reuse the kernel entropy
pool, which can be seeded from multiple entropy sources.
Make this possible by adding a new trusted.rng parameter,
that will force use of the kernel RNG. In its absence, it's up
to the trust source to decide, which random numbers to use,
maintaining the existing behavior.
Suggested-by: Jarkko Sakkinen <jarkko@kernel.org>
Acked-by: Sumit Garg <sumit.garg@linaro.org>
Acked-by: Pankaj Gupta <pankaj.gupta@nxp.com>
Reviewed-by: David Gstir <david@sigma-star.at>
Reviewed-by: Pankaj Gupta <pankaj.gupta@nxp.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Pankaj Gupta <pankaj.gupta@nxp.com>
Tested-by: Michael Walle <michael@walle.cc> # on ls1028a (non-E and E)
Tested-by: John Ernberg <john.ernberg@actia.se> # iMX8QXP
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
With recent rework, trusted keys are no longer limited to TPM as trust
source. The Kconfig symbol is unchanged however leading to a few issues:
- TCG_TPM is required, even if only TEE is to be used
- Enabling TCG_TPM, but excluding it from available trusted sources
is not possible
- TEE=m && TRUSTED_KEYS=y will lead to TEE support being silently
dropped, which is not the best user experience
Remedy these issues by introducing two new boolean Kconfig symbols:
TRUSTED_KEYS_TPM and TRUSTED_KEYS_TEE with the appropriate
dependencies.
Any new code depending on the TPM trusted key backend in particular
or symbols exported by it will now need to explicitly state that it
depends on TRUSTED_KEYS && TRUSTED_KEYS_TPM
The latter to ensure the dependency is built and the former to ensure
it's reachable for module builds. There are no such users yet.
Reviewed-by: Sumit Garg <sumit.garg@linaro.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Pankaj Gupta <pankaj.gupta@nxp.com>
Tested-by: Pankaj Gupta <pankaj.gupta@nxp.com>
Tested-by: Andreas Rammhold <andreas@rammhold.de>
Tested-by: Tim Harvey <tharvey@gateworks.com>
Tested-by: Michael Walle <michael@walle.cc> # on ls1028a (non-E and E)
Tested-by: John Ernberg <john.ernberg@actia.se> # iMX8QXP
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
TPM2_GetCapability with a capability that has the property type value
of TPM_PT_TOTAL_COMMANDS returns a zero length list, when an Infineon
TPM2 is in field upgrade mode.
Since an Infineon TPM2.0 in field upgrade mode returns RC_SUCCESS on
TPM2_Startup, the field upgrade mode has to be detected by
TPM2_GetCapability.
Signed-off-by: Stefan Mahnke-Hartmann <stefan.mahnke-hartmann@infineon.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Under certain conditions uninitialized memory will be accessed.
As described by TCG Trusted Platform Module Library Specification,
rev. 1.59 (Part 3: Commands), if a TPM2_GetCapability is received,
requesting a capability, the TPM in field upgrade mode may return a
zero length list.
Check the property count in tpm2_get_tpm_pt().
Fixes: 2ab3241161b3 ("tpm: migrate tpm2_get_tpm_pt() to use struct tpm_buf")
Cc: stable@vger.kernel.org
Signed-off-by: Stefan Mahnke-Hartmann <stefan.mahnke-hartmann@infineon.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Returning an error value in an i2c remove callback results in an error
message being emitted by the i2c core, but otherwise it doesn't make a
difference. The device goes away anyhow and the devm cleanups are
called.
As tpm_cr50_i2c_remove() emits an error message already and the
additional error message by the i2c core doesn't add any useful
information, change the return value to zero to suppress this error
message.
Note that if i2c_clientdata is NULL, there is something really fishy.
Assuming no memory corruption happened (then all bets are lost anyhow),
tpm_cr50_i2c_remove() is only called after tpm_cr50_i2c_probe() returned
successfully. So there was a tpm chip registered before and after
tpm_cr50_i2c_remove() its privdata is freed but the associated character
device isn't removed. If after that happened userspace accesses the
character device it's likely that the freed memory is accessed. For that
reason the warning message is made a bit more frightening.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Accept one additional numerical value of DID:VID for next generation
Google TPM with new firmware, to be used in future Chromebooks.
The TPM with the new firmware has the code name TI50, and is going to
use the same interfaces.
Signed-off-by: Jes B. Klinke <jbk@chromium.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Only tpm_tis and tpm_tis_synquacer have a dedicated way to access
multiple bytes at once, every other driver will just fall back to
read_bytes/write_bytes. Therefore, remove the read16/read32/write32
calls and move their logic to read_bytes/write_bytes.
Suggested-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Johannes Holland <johannes.holland@infineon.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Currently it returns zero when CRQ response timed out, it should return
an error code instead.
Fixes: d8d74ea3c002 ("tpm: ibmvtpm: Wait for buffer to be set before proceeding")
Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Return boolean values ("true" or "false") instead of 1 or 0 from bool
functions.
Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
The blacklist_init() function calls panic() for memory allocation
errors. This change documents the reason why we don't return -ENODEV.
Link: https://lore.kernel.org/r/20220322111323.542184-2-mic@digikod.net
Link: https://lore.kernel.org/r/YjeW2r6Wv55Du0bJ@iki.fi
Suggested-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Add a kernel option SYSTEM_BLACKLIST_AUTH_UPDATE to enable the root user
to dynamically add new keys to the blacklist keyring. This enables to
invalidate new certificates, either from being loaded in a keyring, or
from being trusted in a PKCS#7 certificate chain. This also enables to
add new file hashes to be denied by the integrity infrastructure.
Being able to untrust a certificate which could have normaly been
trusted is a sensitive operation. This is why adding new hashes to the
blacklist keyring is only allowed when these hashes are signed and
vouched by the builtin trusted keyring. A blacklist hash is stored as a
key description. The PKCS#7 signature of this description must be
provided as the key payload.
Marking a certificate as untrusted should be enforced while the system
is running. It is then forbiden to remove such blacklist keys.
Update blacklist keyring, blacklist key and revoked certificate access
rights:
* allows the root user to search for a specific blacklisted hash, which
make sense because the descriptions are already viewable;
* forbids key update (blacklist and asymmetric ones);
* restricts kernel rights on the blacklist keyring to align with the
root user rights.
See help in tools/certs/print-cert-tbs-hash.sh .
Cc: David Howells <dhowells@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Eric Snowberg <eric.snowberg@oracle.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Link: https://lore.kernel.org/r/20210712170313.884724-6-mic@digikod.net
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Add and use a check-blacklist-hashes.awk script to make sure that the
builtin blacklist hashes set with CONFIG_SYSTEM_BLACKLIST_HASH_LIST will
effectively be taken into account as blacklisted hashes. This is useful
to debug invalid hash formats, and it make sure that previous hashes
which could have been loaded in the kernel, but silently ignored, are
now noticed and deal with by the user at kernel build time.
This also prevent stricter blacklist key description checking (provided
by following commits) to failed for builtin hashes.
Update CONFIG_SYSTEM_BLACKLIST_HASH_LIST help to explain the content of
a hash string and how to generate certificate ones.
Cc: David Howells <dhowells@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Eric Snowberg <eric.snowberg@oracle.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Link: https://lore.kernel.org/r/20210712170313.884724-3-mic@digikod.net
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Before exposing this new key type to user space, make sure that only
meaningful blacklisted hashes are accepted. This is also checked for
builtin blacklisted hashes, but a following commit make sure that the
user will notice (at built time) and will fix the configuration if it
already included errors.
Check that a blacklist key description starts with a valid prefix and
then a valid hexadecimal string.
Cc: David Howells <dhowells@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Eric Snowberg <eric.snowberg@oracle.com>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lore.kernel.org/r/20210712170313.884724-4-mic@digikod.net
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Factor out the blacklist hash creation with the get_raw_hash() helper.
This also centralize the "tbs" and "bin" prefixes and make them private,
which help to manage them consistently.
Cc: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Eric Snowberg <eric.snowberg@oracle.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Link: https://lore.kernel.org/r/20210712170313.884724-5-mic@digikod.net
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Add a new helper print-cert-tbs-hash.sh to generate a TBSCertificate
hash from a given certificate. This is useful to generate a blacklist
key description used to forbid loading a specific certificate in a
keyring, or to invalidate a certificate provided by a PKCS#7 file.
This kind of hash formatting is required to populate the file pointed
out by CONFIG_SYSTEM_BLACKLIST_HASH_LIST, but only the kernel code was
available to understand how to effectively create such hash.
Cc: David Howells <dhowells@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Eric Snowberg <eric.snowberg@oracle.com>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lore.kernel.org/r/20210712170313.884724-2-mic@digikod.net
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Let's help users by documenting how to enable and check for Landlock in
the kernel and the running system. The userspace-api section may not be
the best place for this but it still makes sense to put all the user
documentation at the same place.
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20220513112743.156414-1-mic@digikod.net
Reviewed-by: Paul Moore <paul@paul-moore.com>
|
|
Summarize the rationale of filesystem access rights according to the
file type.
Update the document date.
Reviewed-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20220506161102.525323-13-mic@digikod.net
|
|
Explain how to set access rights per hierarchy in an efficient and safe
way, especially with the LANDLOCK_ACCESS_FS_REFER side effect (i.e.
partial ordering and constraints for access rights per hierarchy).
Reviewed-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20220506161102.525323-12-mic@digikod.net
|
|
Add LANDLOCK_ACCESS_FS_REFER in the example and properly check to only
use it if the current kernel support it thanks to the Landlock ABI
version.
Move the file renaming and linking limitation to a new "Previous
limitations" section.
Improve documentation about the backward and forward compatibility,
including the rational for ruleset's handled_access_fs.
Update the document date.
Reviewed-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20220506161102.525323-11-mic@digikod.net
|
|
Add LANDLOCK_ACCESS_FS_REFER to the "roughly write" access rights and
leverage the Landlock ABI version to only try to enforce it if it is
supported by the running kernel.
Reviewed-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20220506161102.525323-10-mic@digikod.net
|
|
These test suites try to check all edge cases for directory and file
renaming or linking involving a new parent directory, with and without
LANDLOCK_ACCESS_FS_REFER and other access rights.
layout1:
* reparent_refer: Tests simple FS_REFER usage.
* reparent_link: Tests a mix of FS_MAKE_REG and FS_REFER with links.
* reparent_rename: Tests a mix of FS_MAKE_REG and FS_REFER with renames
and RENAME_EXCHANGE.
* reparent_exdev_layers_rename1/2: Tests renames with two layers.
* reparent_exdev_layers_exchange1/2/3: Tests exchanges with two layers.
* reparent_remove: Tests file and directory removal with rename.
* reparent_dom_superset: Tests access partial ordering.
layout1_bind:
* reparent_cross_mount: Tests FS_REFER propagation across mount points.
Test coverage for security/landlock is 95.4% of 604 lines according to
gcc/gcov-11.
Cc: Paul Moore <paul@paul-moore.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20220506161102.525323-9-mic@digikod.net
|
|
Add a new LANDLOCK_ACCESS_FS_REFER access right to enable policy writers
to allow sandboxed processes to link and rename files from and to a
specific set of file hierarchies. This access right should be composed
with LANDLOCK_ACCESS_FS_MAKE_* for the destination of a link or rename,
and with LANDLOCK_ACCESS_FS_REMOVE_* for a source of a rename. This
lift a Landlock limitation that always denied changing the parent of an
inode.
Renaming or linking to the same directory is still always allowed,
whatever LANDLOCK_ACCESS_FS_REFER is used or not, because it is not
considered a threat to user data.
However, creating multiple links or renaming to a different parent
directory may lead to privilege escalations if not handled properly.
Indeed, we must be sure that the source doesn't gain more privileges by
being accessible from the destination. This is handled by making sure
that the source hierarchy (including the referenced file or directory
itself) restricts at least as much the destination hierarchy. If it is
not the case, an EXDEV error is returned, making it potentially possible
for user space to copy the file hierarchy instead of moving or linking
it.
Instead of creating different access rights for the source and the
destination, we choose to make it simple and consistent for users.
Indeed, considering the previous constraint, it would be weird to
require such destination access right to be also granted to the source
(to make it a superset). Moreover, RENAME_EXCHANGE would also add to
the confusion because of paths being both a source and a destination.
See the provided documentation for additional details.
New tests are provided with a following commit.
Reviewed-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20220506161102.525323-8-mic@digikod.net
|
|
In order to be able to identify a file exchange with renameat2(2) and
RENAME_EXCHANGE, which will be useful for Landlock [1], propagate the
rename flags to LSMs. This may also improve performance because of the
switch from two set of LSM hook calls to only one, and because LSMs
using this hook may optimize the double check (e.g. only one lock,
reduce the number of path walks).
AppArmor, Landlock and Tomoyo are updated to leverage this change. This
should not change the current behavior (same check order), except
(different level of) speed boosts.
[1] https://lore.kernel.org/r/20220221212522.320243-1-mic@digikod.net
Cc: James Morris <jmorris@namei.org>
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: Serge E. Hallyn <serge@hallyn.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20220506161102.525323-7-mic@digikod.net
|
|
Move the SB_NOUSER and IS_PRIVATE dentry check to a standalone
is_nouser_or_private() helper. This will be useful for a following
commit.
Move get_mode_access() and maybe_remove() to make them usable by new
code provided by a following commit.
Reviewed-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20220506161102.525323-6-mic@digikod.net
|
|
The original behavior was to check if the full set of requested accesses
was allowed by at least a rule of every relevant layer. This didn't
take into account requests for multiple accesses and same-layer rules
allowing the union of these accesses in a complementary way. As a
result, multiple accesses requested on a file hierarchy matching rules
that, together, allowed these accesses, but without a unique rule
allowing all of them, was illegitimately denied. This case should be
rare in practice and it can only be triggered by the path_rename or
file_open hook implementations.
For instance, if, for the same layer, a rule allows execution
beneath /a/b and another rule allows read beneath /a, requesting access
to read and execute at the same time for /a/b should be allowed for this
layer.
This was an inconsistency because the union of same-layer rule accesses
was already allowed if requested once at a time anyway.
This fix changes the way allowed accesses are gathered over a path walk.
To take into account all these rule accesses, we store in a matrix all
layer granting the set of requested accesses, according to the handled
accesses. To avoid heap allocation, we use an array on the stack which
is 2*13 bytes. A following commit bringing the LANDLOCK_ACCESS_FS_REFER
access right will increase this size to reach 112 bytes (2*14*4) in case
of link or rename actions.
Add a new layout1.layer_rule_unions test to check that accesses from
different rules pertaining to the same layer are ORed in a file
hierarchy. Also test that it is not the case for rules from different
layers.
Reviewed-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20220506161102.525323-5-mic@digikod.net
Cc: stable@vger.kernel.org
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
This refactoring will be useful in a following commit.
Reviewed-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20220506161102.525323-4-mic@digikod.net
Cc: stable@vger.kernel.org
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
The maximum number of nested Landlock domains is currently 64. Because
of the following fix and to help reduce the stack size, let's reduce it
to 16. This seems large enough for a lot of use cases (e.g. sandboxed
init service, spawning a sandboxed SSH service, in nested sandboxed
containers). Reducing the number of nested domains may also help to
discover misuse of Landlock (e.g. creating a domain per rule).
Add and use a dedicated layer_mask_t typedef to fit with the number of
layers. This might be useful when changing it and to keep it consistent
with the maximum number of layers.
Reviewed-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20220506161102.525323-3-mic@digikod.net
Cc: stable@vger.kernel.org
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Create and use the access_mask_t typedef to enforce a consistent access
mask size and uniformly use a 16-bits type. This will helps transition
to a 32-bits value one day.
Add a build check to make sure all (filesystem) access rights fit in.
This will be extended with a following commit.
Reviewed-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20220506161102.525323-2-mic@digikod.net
Cc: stable@vger.kernel.org
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Add inval_create_ruleset_arguments, extension of
inval_create_ruleset_flags, to also check error ordering for
landlock_create_ruleset(2).
This is similar to the previous commit checking landlock_add_rule(2).
Test coverage for security/landlock is 94.4% of 504 lines accorging to
gcc/gcov-11.
Link: https://lore.kernel.org/r/20220506160820.524344-11-mic@digikod.net
Cc: stable@vger.kernel.org
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
According to the Landlock goal to be a security feature available to
unprivileges processes, it makes more sense to first check for
no_new_privs before checking anything else (i.e. syscall arguments).
Merge inval_fd_enforce and unpriv_enforce_without_no_new_privs tests
into the new restrict_self_checks_ordering. This is similar to the
previous commit checking other syscalls.
Link: https://lore.kernel.org/r/20220506160820.524344-10-mic@digikod.net
Cc: stable@vger.kernel.org
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
This makes more sense to first check the ruleset FD and then the rule
attribute. It will be useful to factor out code for other rule types.
Add inval_add_rule_arguments tests, extension of empty_path_beneath_attr
tests, to also check error ordering for landlock_add_rule(2).
Link: https://lore.kernel.org/r/20220506160820.524344-9-mic@digikod.net
Cc: stable@vger.kernel.org
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
The O_PATH flag is currently not handled by Landlock. Let's make sure
this behavior will remain consistent with the same ruleset over time.
Cc: Shuah Khan <shuah@kernel.org>
Link: https://lore.kernel.org/r/20220506160820.524344-8-mic@digikod.net
Cc: stable@vger.kernel.org
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
These tests were missing to check the check_access_path() call with all
combinations of maybe_remove(old_dentry) and maybe_remove(new_dentry).
Extend layout1.link with a new complementary test and check that
REMOVE_FILE is not required to link a file.
Cc: Shuah Khan <shuah@kernel.org>
Link: https://lore.kernel.org/r/20220506160820.524344-7-mic@digikod.net
Cc: stable@vger.kernel.org
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|