linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2019-02-28	proc: Add fs_context support to procfs	David Howells	3	-68/+129
	Add fs_context support to procfs. Signed-off-by: David Howells <dhowells@redhat.com> cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28	procfs: Move proc_fill_super() to fs/proc/root.c	David Howells	3	-54/+52
	Move proc_fill_super() to fs/proc/root.c as that's where the other superblock stuff is. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com> cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28	introduce cloning of fs_context	Al Viro	7	-0/+175
	new primitive: vfs_dup_fs_context(). Comes with fs_context method (->dup()) for copying the filesystem-specific parts of fs_context, along with LSM one (->fs_context_dup()) for doing the same to LSM parts. [needs better commit message, and change of Author:, anyway] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28	convenience helpers: vfs_get_super() and sget_fc()	Al Viro	3	-0/+190
	the former is an analogue of mount_{single,nodev} for use in ->get_tree() instances, the latter - analogue of sget() for the same. These are fairly similar to the originals, but the callback signature for sget_fc() is different from sget() ones, so getting bits and pieces shared would be too convoluted; we might get around to that later, but for now let's just remember to keep them in sync. They do live next to each other, and changes in either won't be hard to spot. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28	vfs: Implement a filesystem superblock creation/configuration context	David Howells	5	-17/+319
	[AV - unfuck kern_mount_data(); we want non-NULL ->mnt_ns on long-living mounts] [AV - reordering fs/namespace.c is badly overdue, but let's keep it separate from that series] [AV - drop simple_pin_fs() change] [AV - clean vfs_kern_mount() failure exits up] Implement a filesystem context concept to be used during superblock creation for mount and superblock reconfiguration for remount. The mounting procedure then becomes: (1) Allocate new fs_context context. (2) Configure the context. (3) Create superblock. (4) Query the superblock. (5) Create a mount for the superblock. (6) Destroy the context. Rather than calling fs_type->mount(), an fs_context struct is created and fs_type->init_fs_context() is called to set it up. Pointers exist for the filesystem and LSM to hang their private data off. A set of operations has to be set by ->init_fs_context() to provide freeing, duplication, option parsing, binary data parsing, validation, mounting and superblock filling. Legacy filesystems are supported by the provision of a set of legacy fs_context operations that build up a list of mount options and then invoke fs_type->mount() from within the fs_context ->get_tree() operation. This allows all filesystems to be accessed using fs_context. It should be noted that, whilst this patch adds a lot of lines of code, there is quite a bit of duplication with existing code that can be eliminated should all filesystems be converted over. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28	vfs: Put security flags into the fs_context struct	David Howells	2	-1/+2
	Put security flags, such as SECURITY_LSM_NATIVE_LABELS, into the filesystem context so that the filesystem can communicate them to the LSM more easily. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28	smack: Implement filesystem context security hooks	David Howells	2	-15/+47
	Implement filesystem context security hooks for the smack LSM. Signed-off-by: David Howells <dhowells@redhat.com> cc: Casey Schaufler <casey@schaufler-ca.com> cc: linux-security-module@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28	selinux: Implement the new mount API LSM hooks	David Howells	2	-10/+49
	Implement the new mount API LSM hooks for SELinux. At some point the old hooks will need to be removed. Signed-off-by: David Howells <dhowells@redhat.com> cc: Paul Moore <paul@paul-moore.com> cc: Stephen Smalley <sds@tycho.nsa.gov> cc: selinux@tycho.nsa.gov cc: linux-security-module@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28	vfs: Add LSM hooks for the new mount API	David Howells	3	-0/+29
	Add LSM hooks for use by the new mount API and filesystem context code. This includes: (1) Hooks to handle allocation, duplication and freeing of the security record attached to a filesystem context. (2) A hook to snoop source specifications. There may be multiple of these if the filesystem supports it. They will to be local files/devices if fs_context::source_is_dev is true and will be something else, possibly remote server specifications, if false. (3) A hook to snoop superblock configuration options in key[=val] form. If the LSM decides it wants to handle it, it can suppress the option being passed to the filesystem. Note that 'val' may include commas and binary data with the fsopen patch. (4) A hook to perform validation and allocation after the configuration has been done but before the superblock is allocated and set up. (5) A hook to transfer the security from the context to a newly created superblock. (6) A hook to rule on whether a path point can be used as a mountpoint. These are intended to replace: security_sb_copy_data security_sb_kern_mount security_sb_mount security_sb_set_mnt_opts security_sb_clone_mnt_opts security_sb_parse_opts_str [AV -- some of the methods being replaced are already gone, some of the methods are not added for the lack of need] Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-security-module@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-02-28	vfs: Add configuration parser helpers	David Howells	8	-3/+640
	Because the new API passes in key,value parameters, match_token() cannot be used with it. Instead, provide three new helpers to aid with parsing: (1) fs_parse(). This takes a parameter and a simple static description of all the parameters and maps the key name to an ID. It returns 1 on a match, 0 on no match if unknowns should be ignored and some other negative error code on a parse error. The parameter description includes a list of key names to IDs, desired parameter types and a list of enumeration name -> ID mappings. [!] Note that for the moment I've required that the key->ID mapping array is expected to be sorted and unterminated. The size of the array is noted in the fsconfig_parser struct. This allows me to use bsearch(), but I'm not sure any performance gain is worth the hassle of requiring people to keep the array sorted. The parameter type array is sized according to the number of parameter IDs and is indexed directly. The optional enum mapping array is an unterminated, unsorted list and the size goes into the fsconfig_parser struct. The function can do some additional things: (a) If it's not ambiguous and no value is given, the prefix "no" on a key name is permitted to indicate that the parameter should be considered negatory. (b) If the desired type is a single simple integer, it will perform an appropriate conversion and store the result in a union in the parse result. (c) If the desired type is an enumeration, {key ID, name} will be looked up in the enumeration list and the matching value will be stored in the parse result union. (d) Optionally generate an error if the key is unrecognised. This is called something like: enum rdt_param { Opt_cdp, Opt_cdpl2, Opt_mba_mpbs, nr__rdt_params }; const struct fs_parameter_spec rdt_param_specs[nr__rdt_params] = { [Opt_cdp] = { fs_param_is_bool }, [Opt_cdpl2] = { fs_param_is_bool }, [Opt_mba_mpbs] = { fs_param_is_bool }, }; const const char const rdt_param_keys[nr__rdt_params] = { [Opt_cdp] = "cdp", [Opt_cdpl2] = "cdpl2", [Opt_mba_mpbs] = "mba_mbps", }; const struct fs_parameter_description rdt_parser = { .name = "rdt", .nr_params = nr__rdt_params, .keys = rdt_param_keys, .specs = rdt_param_specs, .no_source = true, }; int rdt_parse_param(struct fs_context fc, struct fs_parameter param) { struct fs_parse_result parse; struct rdt_fs_context ctx = rdt_fc2context(fc); int ret; ret = fs_parse(fc, &rdt_parser, param, &parse); if (ret < 0) return ret; switch (parse.key) { case Opt_cdp: ctx->enable_cdpl3 = true; return 0; case Opt_cdpl2: ctx->enable_cdpl2 = true; return 0; case Opt_mba_mpbs: ctx->enable_mba_mbps = true; return 0; } return -EINVAL; } (2) fs_lookup_param(). This takes a { dirfd, path, LOOKUP_EMPTY? } or string value and performs an appropriate path lookup to convert it into a path object, which it will then return. If the desired type was a blockdev, the type of the looked up inode will be checked to make sure it is one. This can be used like: enum foo_param { Opt_source, nr__foo_params }; const struct fs_parameter_spec foo_param_specs[nr__foo_params] = { [Opt_source] = { fs_param_is_blockdev }, }; const char char foo_param_keys[nr__foo_params] = { [Opt_source] = "source", }; const struct constant_table foo_param_alt_keys[] = { { "device", Opt_source }, }; const struct fs_parameter_description foo_parser = { .name = "foo", .nr_params = nr__foo_params, .nr_alt_keys = ARRAY_SIZE(foo_param_alt_keys), .keys = foo_param_keys, .alt_keys = foo_param_alt_keys, .specs = foo_param_specs, }; int foo_parse_param(struct fs_context fc, struct fs_parameter param) { struct fs_parse_result parse; struct foo_fs_context ctx = foo_fc2context(fc); int ret; ret = fs_parse(fc, &foo_parser, param, &parse); if (ret < 0) return ret; switch (parse.key) { case Opt_source: return fs_lookup_param(fc, &foo_parser, param, &parse, &ctx->source); default: return -EINVAL; } } (3) lookup_constant(). This takes a table of named constants and looks up the given name within it. The table is expected to be sorted such that bsearch() be used upon it. Possibly I should require the table be terminated and just use a for-loop to scan it instead of using bsearch() to reduce hassle. Tables look something like: static const struct constant_table bool_names[] = { { "0", false }, { "1", true }, { "false", false }, { "no", false }, { "true", true }, { "yes", true }, }; and a lookup is done with something like: b = lookup_constant(bool_names, param->string, -1); Additionally, optional validation routines for the parameter description are provided that can be enabled at compile time. A later patch will invoke these when a filesystem is registered. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-30	vfs: Introduce logging functions	David Howells	1	-0/+42
	Introduce a set of logging functions through which informational messages, warnings and error messages incurred by the mount procedure can be logged and, in a future patch, passed to userspace instead by way of the filesystem configuration context file descriptor. There are four functions: (1) infof(const char fmt, ...); Logs an informational message. (2) warnf(const char fmt, ...); Logs a warning message. (3) errorf(const char fmt, ...); Logs an error message. (4) invalf(const char fmt, ...); As errof(), but returns -EINVAL so can be used on a return statement. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-30	introduce fs_context methods	Al Viro	5	-16/+65
	Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-30	fs_context flavour for submounts	Al Viro	2	-0/+13
	This is an eventual replacement for vfs_submount() uses. Unlike the "mount" and "remount" cases, the users of that thing are not in VFS - they are buried in various ->d_automount() instances and rather than converting them all at once we introduce the (thankfully small and simple) infrastructure here and deal with the prospective users in afs, nfs, etc. parts of the series. Here we just introduce a new constructor (fs_context_for_submount()) along with the corresponding enum constant to be put into fc->purpose for those. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-30	convert do_remount_sb() to fs_context	David Howells	6	-59/+152
	Replace do_remount_sb() with a function, reconfigure_super(), that's fs_context aware. The fs_context is expected to be parameterised already and have ->root pointing to the superblock to be reconfigured. A legacy wrapper is provided that is intended to be called from the fs_context ops when those appear, but for now is called directly from reconfigure_super(). This wrapper invokes the ->remount_fs() superblock op for the moment. It is intended that the remount_fs() op will be phased out. The fs_context->purpose is set to FS_CONTEXT_FOR_RECONFIGURE to indicate that the context is being used for reconfiguration. do_umount_root() is provided to consolidate remount-to-R/O for umount and emergency remount by creating a context and invoking reconfiguration. do_remount(), do_umount() and do_emergency_remount_callback() are switched to use the new process. [AV -- fold UMOUNT and EMERGENCY_REMOUNT in; fixes the umount / bug, gets rid of pointless complexity] [AV -- set ->net_ns in all cases; nfs remount will need that] [AV -- shift security_sb_remount() call into reconfigure_super(); the callers that didn't do security_sb_remount() have NULL fc->security anyway, so it's a no-op for them] Signed-off-by: David Howells <dhowells@redhat.com> Co-developed-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-30	vfs_get_tree(): evict the call of security_sb_kern_mount()	Al Viro	4	-17/+19
	Right now vfs_get_tree() calls security_sb_kern_mount() (i.e. mount MAC) unless it gets MS_KERNMOUNT or MS_SUBMOUNT in flags. Doing it that way is both clumsy and imprecise. Consider the callers' tree of vfs_get_tree(): vfs_get_tree() <- do_new_mount() <- vfs_kern_mount() <- simple_pin_fs() <- vfs_submount() <- kern_mount_data() <- init_mount_tree() <- btrfs_mount() <- vfs_get_tree() <- nfs_do_root_mount() <- nfs4_try_mount() <- nfs_fs_mount() <- vfs_get_tree() <- nfs4_referral_mount() do_new_mount() always does need MAC (we are guaranteed that neither MS_KERNMOUNT nor MS_SUBMOUNT will be passed there). simple_pin_fs(), vfs_submount() and kern_mount_data() pass explicit flags inhibiting that check. So does nfs4_referral_mount() (the flags there are ulimately coming from vfs_submount()). init_mount_tree() is called too early for anything LSM-related; it doesn't matter whether we attempt those checks, they'll do nothing. Finally, in case of btrfs_mount() and nfs_fs_mount(), doing MAC is pointless - either the caller will do it, or the flags are such that we wouldn't have done it either. In other words, the one and only case when we want that check done is when we are called from do_new_mount(), and there we want it unconditionally. So let's simply move it there. The superblock is still locked, so nobody is going to get access to it (via ustat(2), etc.) until we get a chance to apply the checks - we are free to move them to any point up to where we drop ->s_umount (in do_new_mount_fc()). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-30	new helper: do_new_mount_fc()	David Howells	1	-26/+39
	Create an fs_context-aware version of do_new_mount(). This takes an fs_context with a superblock already attached to it. Make do_new_mount() use do_new_mount_fc() rather than do_new_mount(); this allows the consolidation of the mount creation, check and add steps. To make this work, mount_too_revealing() is changed to take a superblock rather than a mount (which the fs_context doesn't have available), allowing this check to be done before the mount object is created. Signed-off-by: David Howells <dhowells@redhat.com> Co-developed-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-30	teach vfs_get_tree() to handle subtype, switch do_new_mount() to it	Al Viro	3	-32/+52
	Roll the handling of subtypes into do_new_mount() and vfs_get_tree(). The former determines any subtype string and hangs it off the fs_context; the latter applies it. Make do_new_mount() create, parameterise and commit an fs_context and create a mount for itself rather than calling vfs_kern_mount(). [AV -- missing kstrdup()] [AV -- ... and no kstrdup() if we get to setting ->s_submount - we simply transfer it from fc, leaving NULL behind] [AV -- constify ->s_submount, while we are at it] Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-30	new helpers: vfs_create_mount(), fc_mount()	Al Viro	2	-24/+55
	Create a new helper, vfs_create_mount(), that creates a detached vfsmount object from an fs_context that has a superblock attached to it. Almost all uses will be paired with immediately preceding vfs_get_tree(); add a helper for such combination. Switch vfs_kern_mount() to use this. NOTE: mild behaviour change; passing NULL as 'device name' to something like procfs will change /proc/*/mountstats - "device none" instead on "no device". That is consistent with /proc/mounts et.al. [do'h - EXPORT_SYMBOL_GPL slipped in by mistake; removed] [AV -- remove confused comment from vfs_create_mount()] [AV -- removed the second argument] Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-30	vfs: Introduce fs_context, switch vfs_kern_mount() to it.	David Howells	6	-44/+310
	Introduce a filesystem context concept to be used during superblock creation for mount and superblock reconfiguration for remount. This is allocated at the beginning of the mount procedure and into it is placed: (1) Filesystem type. (2) Namespaces. (3) Source/Device names (there may be multiple). (4) Superblock flags (SB_*). (5) Security details. (6) Filesystem-specific data, as set by the mount options. Accessor functions are then provided to set up a context, parameterise it from monolithic mount data (the data page passed to mount(2)) and tear it down again. A legacy wrapper is provided that implements what will be the basic operations, wrapping access to filesystems that aren't yet aware of the fs_context. Finally, vfs_kern_mount() is changed to make use of the fs_context and mount_fs() is replaced by vfs_get_tree(), called from vfs_kern_mount(). [AV -- add missing kstrdup()] [AV -- put_cred() can be unconditional - fc->cred can't be NULL] [AV -- take legacy_validate() contents into legacy_parse_monolithic()] [AV -- merge KERNEL_MOUNT and USER_MOUNT] [AV -- don't unlock superblock on success return from vfs_get_tree()] [AV -- kill 'reference' argument of init_fs_context()] Signed-off-by: David Howells <dhowells@redhat.com> Co-developed-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-30	saner handling of temporary namespaces	Al Viro	2	-39/+40
	mount_subtree() creates (and soon destroys) a temporary namespace, so that automounts could function normally. These beasts should never become anyone's current namespaces; they don't, but it would be better to make prevention of that more straightforward. And since they don't become anyone's current namespace, we don't need to bother with reserving procfs inums for those. Teach alloc_mnt_ns() to skip inum allocation if told so, adjust put_mnt_ns() accordingly, make mount_subtree() use temporary (anon) namespace. is_anon_ns() checks if a namespace is such. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-30	separate copying and locking mount tree on cross-userns copies	Al Viro	3	-29/+38
	Rather than having propagate_mnt() check doing unprivileged copies, lock them before commit_tree(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-17	kill kernfs_pin_sb()	Al Viro	2	-31/+0
	unused now and impossible to use safely anyway. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-17	cgroup: saner refcounting for cgroup_root	Al Viro	3	-55/+21
	* make the reference from superblock to cgroup_root counting - do cgroup_put() in cgroup_kill_sb() whether we'd done percpu_ref_kill() or not; matching grab is done when we allocate a new root. That gives the same refcounting rules for all callers of cgroup_do_mount() - a reference to cgroup_root has been grabbed by caller and it either is transferred to new superblock or dropped. * have cgroup_kill_sb() treat an already killed refcount as "just don't bother killing it, then". * after successful cgroup_do_mount() have cgroup1_mount() recheck if we'd raced with mount/umount from somebody else and cgroup_root got killed. In that case we drop the superblock and bugger off with -ERESTARTSYS, same as if we'd found it in the list already dying. * don't bother with delayed initialization of refcount - it's unreliable and not needed. No need to prevent attempts to bump the refcount if we find cgroup_root of another mount in progress - sget will reuse an existing superblock just fine and if the other sb manages to die before we get there, we'll catch that immediately after cgroup_do_mount(). * don't bother with kernfs_pin_sb() - no need for doing that either. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-17	fix cgroup_do_mount() handling of failure exits	Al Viro	2	-5/+12
	same story as with last May fixes in sysfs (7b745a4e4051 "unfuck sysfs_mount()"); new_sb is left uninitialized in case of early errors in kernfs_mount_ns() and papering over it by treating any error from kernfs_mount_ns() as equivalent to !new_ns ends up conflating the cases when objects had never been transferred to a superblock with ones when that has happened and resulting new superblock had been dropped. Easily fixed (same way as in sysfs case). Additionally, there's a superblock leak on kernfs_node_dentry() failure and a dentry leak inside kernfs_node_dentry() itself - the latter on probably impossible errors, but the former not impossible to trigger (as the matter of fact, injecting allocation failures at that point does trigger it). Cc: stable@kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-01-14	Linux 5.0-rc2	Linus Torvalds	1	-1/+1

2019-01-14	kernel/sys.c: Clarify that UNAME26 does not generate unique versions anymore	Jonathan Neuschäfer	1	-1/+2
	UNAME26 is a mechanism to report Linux's version as 2.6.x, for compatibility with old/broken software. Due to the way it is implemented, it would have to be updated after 5.0, to keep the resulting versions unique. Linus Torvalds argued: "Do we actually need this? I'd rather let it bitrot, and just let it return random versions. It will just start again at 2.4.60, won't it? Anybody who uses UNAME26 for a 5.x kernel might as well think it's still 4.x. The user space is so old that it can't possibly care about differences between 4.x and 5.x, can it? The only thing that matters is that it shows "2.4.<largeenough>", which it will do regardless" Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-12	phy: fix build breakage: add PHY_MODE_SATA	John Hubbard	2	-2/+4
	Commit 49e54187ae0b ("ata: libahci_platform: comply to PHY framework") uses the PHY_MODE_SATA, but that enum had not yet been added. This caused a build failure for me, with today's linux.git. Also, there is a potentially conflicting (mis-named) PHY_MODE_SATA, hiding in the Marvell Berlin SATA PHY driver. Fix the build by: 1) Renaming Marvell's defined value to a more scoped name, in order to avoid any potential conflicts: PHY_BERLIN_MODE_SATA. 2) Adding the missing enum, which was going to be added anyway as part of [1]. [1] https://lkml.kernel.org/r/20190108163124.6409-3-miquel.raynal@bootlin.com Fixes: 49e54187ae0b ("ata: libahci_platform: comply to PHY framework") Signed-off-by: John Hubbard <jhubbard@nvidia.com> Acked-by: Jens Axboe <axboe@kernel.dk> Acked-by: Olof Johansson <olof@lixom.net> Cc: Grzegorz Jaszczyk <jaz@semihalf.com> Cc: Miquel Raynal <miquel.raynal@bootlin.com> Cc: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-11	ata: ahci: mvebu: request PHY suspend/resume for Armada 3700	Miquel Raynal	1	-0/+3
	A feature has been added in the libahci driver: the possibility to set a new flag in hpriv->flags to let the core handle PHY suspend/resume automatically. Make use of this feature to make suspend to RAM work with SATA drives on A3700. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-11	ata: ahci: mvebu: add Armada 3700 initialization needed for S2RAM	Miquel Raynal	1	-9/+18
	A3700 comphy initialization is done in the firmware (TF-A). Looking at the SATA PHY initialization routine, there is a comment about "vendor specific" registers. Two registers are mentioned. They are not initialized there in the firmware because they are AHCI related, while the firmware at this location does only PHY configuration. The solution to avoid doing such initialization is relying on U-Boot. While this work at boot time, U-Boot is definitely not going to run during a resume after suspending to RAM. Two possible solutions were considered: * Fixing the firmware. * Fixing the kernel driver. The first solution would take ages to propagate, while the second solution is easy to implement as the driver as been a little bit reworked to prepare for such platform configuration. Hence, this patch adds an Armada 3700 configuration function to set these two registers both at boot time (in the probe) and after a suspend (in the resume path). Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-11	ata: ahci: mvebu: do Armada 38x configuration only on relevant SoCs	Miquel Raynal	1	-17/+51
	At the beginning, only Armada 38x SoCs where supported by the ahci_mvebu.c driver. Commit 15d3ce7b63bd ("ata: ahci_mvebu: add support for Armada 3700 variant") introduced Armada 3700 support. As opposed to Armada 38x SoCs, the 3700 variants do not have to configure mbus and the regret option. This patch took care of avoiding such configuration when not needed in the probe function, but failed to do the same in the resume path. While doing so looks harmless by experience, let's clean the driver logic and avoid doing this useless configuration with Armada 3700 SoCs. Because the logic is very similar between these two places, it has been decided to factorize this code and put it in a "Armada 38x configuration function". This function is part of a new (per-compatible) platform data structure, so that the addition of such configuration function for Armada 3700 will be eased. Fixes: 15d3ce7b63bd ("ata: ahci_mvebu: add support for Armada 3700 variant") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-11	ata: ahci: mvebu: remove stale comment	Miquel Raynal	1	-5/+0
	For Armada-38x (32-bit) SoCs, PM platform support has been added since: commit 32f9494c9dfd ("ARM: mvebu: prepare pm-board.c for the introduction of Armada 38x support") commit 3cbd6a6ca81c ("ARM: mvebu: Add standby support") For Armada 64-bit SoCs, like the A3700 also using this AHCI driver, PM platform support has always existed. There are even suspend/resume hooks in this driver since: commit d6ecf15814888 ("ata: ahci_mvebu: add suspend/resume support") Remove the stale comment at the end of this driver stating that all the above does not exist yet. Fixes: d6ecf15814888 ("ata: ahci_mvebu: add suspend/resume support") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-11	ata: libahci_platform: comply to PHY framework	Miquel Raynal	2	-0/+15
	Current implementation of the libahci does not take into account the new PHY framework. Correct the situation by adding a call to phy_set_mode() before phy_power_on(). PHYs should also be handled at suspend/resume time. For this, call ahci_platform_enable/disable_phys() at suspend/resume_host() time. These calls are guarded by a HFLAG (AHCI_HFLAG_SUSPEND_PHYS) that the user of the libahci driver must set manually in hpriv->flags at probe time. This is to avoid breaking users that have not been tested with this change. Reviewed-by: Hans de Goede <hdegoede@redhat.com> Suggested-by: Grzegorz Jaszczyk <jaz@semihalf.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-11	x86/kvm/nVMX: don't skip emulated instruction twice when vmptr address is not backed	Vitaly Kuznetsov	1	-2/+1
	Since commit 09abb5e3e5e50 ("KVM: nVMX: call kvm_skip_emulated_instruction in nested_vmx_{fail,succeed}") nested_vmx_failValid() results in kvm_skip_emulated_instruction() so doing it again in handle_vmptrld() when vmptr address is not backed is wrong, we end up advancing RIP twice. Fixes: fca91f6d60b6e ("kvm: nVMX: Set VM instruction error for VMPTRLD of unbacked page") Reported-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2019-01-11	Documentation/virtual/kvm: Update URL for AMD SEV API specification	Christophe de Dinechin	1	-1/+1
	The URL of [api-spec] in Documentation/virtual/kvm/amd-memory-encryption.rst is no longer valid, replaced space with underscore. Signed-off-by: Christophe de Dinechin <dinechin@redhat.com> Reviewed-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2019-01-11	KVM/VMX: Avoid return error when flush tlb successfully in the hv_remote_flush_tlb_with_range()	Lan Tianyu	1	-1/+1
	The "ret" is initialized to be ENOTSUPP. The return value of __hv_remote_flush_tlb_with_range() will be Or with "ret" when ept table potiners are mismatched. This will cause return ENOTSUPP even if flush tlb successfully. This patch is to fix the issue and set "ret" to 0. Fixes: a5c214dad198 ("KVM/VMX: Change hv flush logic when ept tables are mismatched.") Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2019-01-11	kvm: sev: Fail KVM_SEV_INIT if already initialized	David Rientjes	1	-0/+3
	By code inspection, it was found that multiple calls to KVM_SEV_INIT could deplete asid bits and overwrite kvm_sev_info's regions_list. Multiple calls to KVM_SVM_INIT is not likely to occur with QEMU, but this should likely be fixed anyway. This code is serialized by kvm->lock. Fixes: 1654efcbc431 ("KVM: SVM: Add KVM_SEV_INIT command") Reported-by: Cfir Cohen <cfir@google.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2019-01-11	KVM: validate userspace input in kvm_clear_dirty_log_protect()	Tomas Bortoli	1	-2/+7
	The function at issue does not fully validate the content of the structure pointed by the log parameter, though its content has just been copied from userspace and lacks validation. Fix that. Moreover, change the type of n to unsigned long as that is the type returned by kvm_dirty_bitmap_bytes(). Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com> Reported-by: syzbot+028366e52c9ace67deb3@syzkaller.appspotmail.com [Squashed the fix from Paolo. - Radim.] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2019-01-11	KVM: x86: Fix bit shifting in update_intel_pt_cfg	Gustavo A. R. Silva	1	-1/+1
	ctl_bitmask in pt_desc is of type u64. When an integer like 0xf is being left shifted more than 32 bits, the behavior is undefined. Fix this by adding suffix ULL to integer 0xf. Addresses-Coverity-ID: 1476095 ("Bad bit shift operation") Fixes: 6c0f0bba85a0 ("KVM: x86: Introduce a function to initialize the PT configuration") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2019-01-11	tty: Don't hold ldisc lock in tty_reopen() if ldisc present	Dmitry Safonov	1	-7/+13
	Try to get reference for ldisc during tty_reopen(). If ldisc present, we don't need to do tty_ldisc_reinit() and lock the write side for line discipline semaphore. Effectively, it optimizes fast-path for tty_reopen(), but more importantly it won't interrupt ongoing IO on the tty as no ldisc change is needed. Fixes user-visible issue when tty_reopen() interrupted login process for user with a long password, observed and reported by Lukas. Fixes: c96cf923a98d ("tty: Don't block on IO when ldisc change is pending") Fixes: 83d817f41070 ("tty: Hold tty_ldisc_lock() during tty_reopen()") Cc: Jiri Slaby <jslaby@suse.com> Reported-by: Lukas F. Hartmann <lukas@mntmn.com> Tested-by: Lukas F. Hartmann <lukas@mntmn.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-11	cifs: update internal module version number	Steve French	1	-1/+1
	To 2.16 Signed-off-by: Steve French <stfrench@microsoft.com>
2019-01-11	CIFS: Fix error paths in writeback code	Pavel Shilovsky	4	-9/+56
	This patch aims to address writeback code problems related to error paths. In particular it respects EINTR and related error codes and stores and returns the first error occurred during writeback. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-01-11	CIFS: Move credit processing to mid callbacks for SMB3	Pavel Shilovsky	1	-17/+34
	Currently we account for credits in the thread initiating a request and waiting for a response. The demultiplex thread receives the response, wakes up the thread and the latter collects credits from the response buffer and add them to the server structure on the client. This approach is not accurate, because it may race with reconnect events in the demultiplex thread which resets the number of credits. Fix this by moving credit processing to new mid callbacks that collect credits granted by the server from the response in the demultiplex thread. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-01-11	CIFS: Fix credits calculation for cancelled requests	Pavel Shilovsky	2	-2/+27
	If a request is cancelled, we can't assume that the server returns 1 credit back. Instead we need to wait for a response and process the number of credits granted by the server. Create a separate mid callback for cancelled request, parse the number of credits in a response buffer and add them to the client's credits. If the didn't get a response (no response buffer available) assume 0 credits granted. The latter most probably happens together with session reconnect, so the client's credits are adjusted anyway. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-01-11	cifs: Fix potential OOB access of lock element array	Ross Lagerwall	2	-6/+6
	If maxBuf is small but non-zero, it could result in a zero sized lock element array which we would then try and access OOB. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org>
2019-01-11	cifs: Limit memory used by lock request calls to a page	Ross Lagerwall	2	-0/+12
	The code tries to allocate a contiguous buffer with a size supplied by the server (maxBuf). This could fail if memory is fragmented since it results in high order allocations for commonly used server implementations. It is also wasteful since there are probably few locks in the usual case. Limit the buffer to be no larger than a page to avoid memory allocation failures due to fragmentation. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-01-11	cifs: move large array from stack to heap	Aurelien Aptel	2	-14/+32
	This addresses some compile warnings that you can see depending on configuration settings. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-01-11	CIFS: Do not hide EINTR after sending network packets	Pavel Shilovsky	1	-1/+1
	Currently we hide EINTR code returned from sock_sendmsg() and return 0 instead. This makes a caller think that we successfully completed the network operation which is not true. Fix this by properly returning EINTR to callers. Cc: <stable@vger.kernel.org> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2019-01-11	ARM: integrator: impd1: use struct_size() in devm_kzalloc()	Gustavo A. R. Silva	1	-1/+1
	One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; void entry[]; }; instance = devm_kzalloc(dev, sizeof(struct foo) + sizeof(void ) * count, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = devm_kzalloc(dev, struct_size(instance, entry, count), GFP_KERNEL); This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
2019-01-11	arm64: kexec_file: return successfully even if kaslr-seed doesn't exist	AKASHI Takahiro	1	-1/+3
	In kexec_file_load, kaslr-seed property of the current dtb will be deleted any way before setting a new value if possible. It doesn't matter whether it exists in the current dtb. So "ret" should be reset to 0 here. Fixes: commit 884143f60c89 ("arm64: kexec_file: add kaslr support") Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-01-11	ACPI/IORT: Fix rc_dma_get_range()	Jean-Philippe Brucker	1	-1/+2
	When executed for a PCI_ROOT_COMPLEX type, iort_match_node_callback() expects the opaque pointer argument to be a PCI bus device. At the moment rc_dma_get_range() passes the PCI endpoint instead of the bus, and we've been lucky to have pci_domain_nr(ptr) return 0 instead of crashing. Pass the bus device to iort_scan_node(). Fixes: 5ac65e8c8941 ("ACPI/IORT: Support address size limit for root complexes") Reported-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Cc: stable@vger.kernel.org Cc: Will Deacon <will.deacon@arm.com> Cc: Hanjun Guo <hanjun.guo@linaro.org> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Will Deacon <will.deacon@arm.com>