aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2020-02-12selftests/cgroup: add tests for cloning into cgroupsChristian Brauner5-5/+214
Expand the cgroup test-suite to include tests for CLONE_INTO_CGROUP. This adds the following tests: - CLONE_INTO_CGROUP manages to clone a process directly into a correctly delegated cgroup - CLONE_INTO_CGROUP fails to clone a process into a cgroup that has been removed after we've opened an fd to it - CLONE_INTO_CGROUP fails to clone a process into an invalid domain cgroup - CLONE_INTO_CGROUP adheres to the no internal process constraint - CLONE_INTO_CGROUP works with the freezer feature Cc: Tejun Heo <tj@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Cc: cgroups@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Acked-by: Roman Gushchin <guro@fb.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2020-02-12clone3: allow spawning processes into cgroupsChristian Brauner7-39/+214
This adds support for creating a process in a different cgroup than its parent. Callers can limit and account processes and threads right from the moment they are spawned: - A service manager can directly spawn new services into dedicated cgroups. - A process can be directly created in a frozen cgroup and will be frozen as well. - The initial accounting jitter experienced by process supervisors and daemons is eliminated with this. - Threaded applications or even thread implementations can choose to create a specific cgroup layout where each thread is spawned directly into a dedicated cgroup. This feature is limited to the unified hierarchy. Callers need to pass a directory file descriptor for the target cgroup. The caller can choose to pass an O_PATH file descriptor. All usual migration restrictions apply, i.e. there can be no processes in inner nodes. In general, creating a process directly in a target cgroup adheres to all migration restrictions. One of the biggest advantages of this feature is that CLONE_INTO_GROUP does not need to grab the write side of the cgroup cgroup_threadgroup_rwsem. This global lock makes moving tasks/threads around super expensive. With clone3() this lock is avoided. Cc: Tejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: cgroups@vger.kernel.org Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2020-02-12cgroup: add cgroup_may_write() helperChristian Brauner1-7/+17
Add a cgroup_may_write() helper which we can use in the CLONE_INTO_CGROUP patch series to verify that we can write to the destination cgroup. Cc: Tejun Heo <tj@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Li Zefan <lizefan@huawei.com> Cc: cgroups@vger.kernel.org Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2020-02-12cgroup: refactor fork helpersChristian Brauner2-23/+27
This refactors the fork helpers so they can be easily modified in the next patches. The patch just moves the cgroup threadgroup rwsem grab and release into the helpers. They don't need to be directly exposed in fork.c. Cc: Tejun Heo <tj@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Li Zefan <lizefan@huawei.com> Cc: cgroups@vger.kernel.org Acked-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2020-02-12cgroup: add cgroup_get_from_file() helperChristian Brauner1-11/+19
Add a helper cgroup_get_from_file(). The helper will be used in subsequent patches to retrieve a cgroup while holding a reference to the struct file it was taken from. Cc: Tejun Heo <tj@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Li Zefan <lizefan@huawei.com> Cc: cgroups@vger.kernel.org Acked-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2020-02-12cgroup: unify attach permission checkingChristian Brauner1-14/+25
The core codepaths to check whether a process can be attached to a cgroup are the same for threads and thread-group leaders. Only a small piece of code verifying that source and destination cgroup are in the same domain differentiates the thread permission checking from thread-group leader permission checking. Since cgroup_migrate_vet_dst() only matters cgroup2 - it is a noop on cgroup1 - we can move it out of cgroup_attach_task(). All checks can now be consolidated into a new helper cgroup_attach_permissions() callable from both cgroup_procs_write() and cgroup_threads_write(). Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: cgroups@vger.kernel.org Acked-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2020-02-12cpuset: Make cpuset hotplug synchronousPrateek Sood3-17/+19
Convert cpuset_hotplug_workfn() into synchronous call for cpu hotplug path. For memory hotplug path it still gets queued as a work item. Since cpuset_hotplug_workfn() can be made synchronous for cpu hotplug path, it is not required to wait for cpuset hotplug while thawing processes. Signed-off-by: Prateek Sood <prsood@codeaurora.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2020-02-12cgroup.c: Use built-in RCU list checkingMadhuparna Bhowmik1-1/+2
list_for_each_entry_rcu has built-in RCU and lock checking. Pass cond argument to list_for_each_entry_rcu() to silence false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled by default. Even though the function css_next_child() already checks if cgroup_mutex or rcu_read_lock() is held using cgroup_assert_mutex_or_rcu_locked(), there is a need to pass cond to list_for_each_entry_rcu() to avoid false positive lockdep warning. Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Acked-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2020-02-12kselftest/cgroup: add cgroup destruction testSuren Baghdasaryan1-0/+113
Add new test to verify that a cgroup with dead processes can be destroyed. The test spawns a child process which allocates and touches 100MB of RAM to ensure prolonged exit. Subsequently it kills the child, waits until the cgroup containing the child is empty and destroys the cgroup. Signed-off-by: Suren Baghdasaryan <surenb@google.com> [mkoutny@suse.com: Fix typo in test_cgcore_destroy comment] Acked-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2020-02-12cgroup: Clean up css_set task traversalMichal Koutný2-36/+28
css_task_iter stores pointer to head of each iterable list, this dates back to commit 0f0a2b4fa621 ("cgroup: reorganize css_task_iter") when we did not store cur_cset. Let us utilize list heads directly in cur_cset and streamline css_task_iter_advance_css_set a bit. This is no intentional function change. Signed-off-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2020-02-12cgroup: Iterate tasks that did not finish do_exit()Michal Koutný2-7/+17
PF_EXITING is set earlier than actual removal from css_set when a task is exitting. This can confuse cgroup.procs readers who see no PF_EXITING tasks, however, rmdir is checking against css_set membership so it can transitionally fail with EBUSY. Fix this by listing tasks that weren't unlinked from css_set active lists. It may happen that other users of the task iterator (without CSS_TASK_ITER_PROCS) spot a PF_EXITING task before cgroup_exit(). This is equal to the state before commit c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations") but it may be reviewed later. Reported-by: Suren Baghdasaryan <surenb@google.com> Fixes: c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations") Signed-off-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2020-02-12cgroup: cgroup_procs_next should increase position indexVasily Averin1-3/+7
If seq_file .next fuction does not change position index, read after some lseek can generate unexpected output: 1) dd bs=1 skip output of each 2nd elements $ dd if=/sys/fs/cgroup/cgroup.procs bs=8 count=1 2 3 4 5 1+0 records in 1+0 records out 8 bytes copied, 0,000267297 s, 29,9 kB/s [test@localhost ~]$ dd if=/sys/fs/cgroup/cgroup.procs bs=1 count=8 2 4 <<< NB! 3 was skipped 6 <<< ... and 5 too 8 <<< ... and 7 8+0 records in 8+0 records out 8 bytes copied, 5,2123e-05 s, 153 kB/s This happen because __cgroup_procs_start() makes an extra extra cgroup_procs_next() call 2) read after lseek beyond end of file generates whole last line. 3) read after lseek into middle of last line generates expected rest of last line and unexpected whole line once again. Additionally patch removes an extra position index changes in __cgroup_procs_start() Cc: stable@vger.kernel.org https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2020-02-12cgroup-v1: cgroup_pidlist_next should update position indexVasily Averin1-0/+1
if seq_file .next fuction does not change position index, read after some lseek can generate unexpected output. # mount | grep cgroup # dd if=/mnt/cgroup.procs bs=1 # normal output ... 1294 1295 1296 1304 1382 584+0 records in 584+0 records out 584 bytes copied dd: /mnt/cgroup.procs: cannot skip to specified offset 83 <<< generates end of last line 1383 <<< ... and whole last line once again 0+1 records in 0+1 records out 8 bytes copied dd: /mnt/cgroup.procs: cannot skip to specified offset 1386 <<< generates last line anyway 0+1 records in 0+1 records out 5 bytes copied https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2020-02-12linux/pipe_fs_i.h: fix kernel-doc warnings after @wait was splitRandy Dunlap1-1/+2
Fix kernel-doc warnings in struct pipe_inode_info after @wait was split into @rd_wait and @wr_wait. include/linux/pipe_fs_i.h:66: warning: Function parameter or member 'rd_wait' not described in 'pipe_inode_info' include/linux/pipe_fs_i.h:66: warning: Function parameter or member 'wr_wait' not described in 'pipe_inode_info' Fixes: 0ddad21d3e99 ("pipe: use exclusive waits when reading or writing") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-12Merge tag 'kbuild-fixes-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuildLinus Torvalds2-3/+3
Pull Kbuild fixes from Masahiro Yamada: - fix memory corruption in scripts/kallsyms - fix the vmlinux link stage to correctly update compile.h * tag 'kbuild-fixes-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: fix mismatch between .version and include/generated/compile.h scripts/kallsyms: fix memory corruption caused by write over-run
2020-02-11Merge tag 'dax-fixes-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimmLinus Torvalds6-24/+12
Pull dax fixes from Dan Williams: "A fix for an xfstest failure and some and an update that removes an fsdax dependency on block devices. Summary: - Fix RWF_NOWAIT writes to properly return -EAGAIN - Clean up an unused helper - Update dax_writeback_mapping_range to not need a block_device argument" * tag 'dax-fixes-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: dax: pass NOWAIT flag to iomap_apply dax: Get rid of fs_dax_get_by_host() helper dax: Pass dax_dev instead of bdev to dax_writeback_mapping_range()
2020-02-11Merge tag 'trace-v5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds12-173/+167
Pull tracing fixes from Steven Rostedt: "Various fixes: - Fix an uninitialized variable - Fix compile bug to bootconfig userspace tool (in tools directory) - Suppress some error messages of bootconfig userspace tool - Remove unneded CONFIG_LIBXBC from bootconfig - Allocate bootconfig xbc_nodes dynamically. To ease complaints about taking up static memory at boot up - Use of parse_args() to parse bootconfig instead of strstr() usage Prevents issues of double quotes containing the interested string - Fix missing ring_buffer_nest_end() on synthetic event error path - Return zero not -EINVAL on soft disabled synthetic event (soft disabling must be the same as hard disabling, which returns zero) - Consolidate synthetic event code (remove duplicate code)" * tag 'trace-v5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Consolidate trace() functions tracing: Don't return -EINVAL when tracing soft disabled synth events tracing: Add missing nest end to synth_event_trace_start() error case tools/bootconfig: Suppress non-error messages bootconfig: Allocate xbc_nodes array dynamically bootconfig: Use parse_args() to find bootconfig and '--' tracing/kprobe: Fix uninitialized variable bug bootconfig: Remove unneeded CONFIG_LIBXBC tools/bootconfig: Fix wrong __VA_ARGS__ usage
2020-02-10tracing: Consolidate trace() functionsTom Zanussi2-135/+87
Move the checking, buffer reserve and buffer commit code in synth_event_trace_start/end() into inline functions __synth_event_trace_start/end() so they can also be used by synth_event_trace() and synth_event_trace_array(), and then have all those functions use them. Also, change synth_event_trace_state.enabled to disabled so it only needs to be set if the event is disabled, which is not normally the case. Link: http://lkml.kernel.org/r/b1f3108d0f450e58192955a300e31d0405ab4149.1581374549.git.zanussi@kernel.org Signed-off-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-02-10tracing: Don't return -EINVAL when tracing soft disabled synth eventsTom Zanussi1-14/+6
There's no reason to return -EINVAL when tracing a synthetic event if it's soft disabled - treat it the same as if it were hard disabled and return normally. Have synth_event_trace() and synth_event_trace_array() just return normally, and have synth_event_trace_start set the trace state to disabled and return. Link: http://lkml.kernel.org/r/df5d02a1625aff97c9866506c5bada6a069982ba.1581374549.git.zanussi@kernel.org Fixes: 8dcc53ad956d2 ("tracing: Add synth_event_trace() and related functions") Signed-off-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-02-10tracing: Add missing nest end to synth_event_trace_start() error caseTom Zanussi1-0/+1
If the ring_buffer reserve in synth_event_trace_start() fails, the matching ring_buffer_nest_end() should be called in the error code, since nothing else will ever call it in this case. Link: http://lkml.kernel.org/r/20abc444b3eeff76425f895815380abe7aa53ff8.1581374549.git.zanussi@kernel.org Fixes: 8dcc53ad956d2 ("tracing: Add synth_event_trace() and related functions") Signed-off-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-02-10Merge branch 'for-5.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroupLinus Torvalds1-5/+8
Pull cgroup fix from Tejun Heo: "I made a mistake while removing cgroup task list lazy init optimization making the root cgroup.procs show entries for the init_tasks. The zero entries doesn't cause critical failures but does make systemd print out warning messages during boot. Fix it by omitting init_tasks as they should be" * 'for-5.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: init_tasks shouldn't be linked to the root cgroup
2020-02-10Merge tag 'selinux-pr-20200210' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinuxLinus Torvalds2-10/+4
Pull SELinux fixes from Paul Moore: "Two small fixes: one fixes a locking problem in the recently merged label translation code, the other fixes an embarrassing 'binderfs' / 'binder' filesystem name check" * tag 'selinux-pr-20200210' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: fix sidtab string cache locking selinux: fix typo in filesystem name
2020-02-10tools/bootconfig: Suppress non-error messagesMasami Hiramatsu2-14/+23
Suppress non-error messages when applying new bootconfig to initrd image. To enable it, replace printf for error message with pr_err() macro. This also adds a testcase for this fix. Link: http://lkml.kernel.org/r/158125351377.16911.13283712972275131160.stgit@devnote2 Reported-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-02-10bootconfig: Allocate xbc_nodes array dynamicallyMasami Hiramatsu2-3/+24
To reduce the large static array from kernel data, allocate xbc_nodes array dynamically only if the kernel loads a bootconfig. Note that this also add dummy memblock.h for user-spacae bootconfig tool. Link: http://lkml.kernel.org/r/158108569699.3187.6512834527603883707.stgit@devnote2 Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-02-11kbuild: fix mismatch between .version and include/generated/compile.hMasahiro Yamada1-1/+1
Since commit 56d589361572 ("kbuild: do not create orphan built-in.a or obj-y objects"), scripts/link-vmlinux.sh does nothing when descending into init/. Once the version number becomes out of sync between .version and include/generated/compile.h, it is not self-healing. [How to reproduce] $ echo 100 > .version $ make You will see the number in the .version is always bigger than that in compile.h by one. After this, every time you run 'make', the vmlinux is re-linked even when none of source files is updated. Fixes: 56d589361572 ("kbuild: do not create orphan built-in.a or obj-y objects") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-02-11scripts/kallsyms: fix memory corruption caused by write over-runMasahiro Yamada1-2/+2
memcpy() writes one more byte than allocated. Fixes: 8d60526999aa ("scripts/kallsyms: change table to store (strcut sym_entry *)") Reported-by: youling257 <youling257@gmail.com> Reported-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Pavel Machek <pavel@ucw.cz>
2020-02-10bootconfig: Use parse_args() to find bootconfig and '--'Steven Rostedt (VMware)1-7/+30
The current implementation does a naive search of "bootconfig" on the kernel command line. But this could find "bootconfig" that is part of another option in quotes (although highly unlikely). But it also needs to find '--' on the kernel command line to know if it should append a '--' or not when a bootconfig in the initrd file has an "init" section. The check uses the naive strstr() to find to see if it exists. But this can return a false positive if it exists in an option and then the "init" section in the initrd will not be appended properly. Using parse_args() to find both of these will solve both of these problems. Link: https://lore.kernel.org/r/202002070954.C18E7F58B@keescook Fixes: 7495e0926fdf3 ("bootconfig: Only load bootconfig if "bootconfig" is on the kernel cmdline") Fixes: 1319916209ce8 ("bootconfig: init: Allow admin to use bootconfig for init command line") Reported-by: Kees Cook <keescook@chromium.org> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-02-10tracing/kprobe: Fix uninitialized variable bugGustavo A. R. Silva1-1/+1
There is a potential execution path in which variable *ret* is returned without being properly initialized, previously. Fix this by initializing variable *ret* to 0. Link: http://lkml.kernel.org/r/20200205223404.GA3379@embeddedor Addresses-Coverity-ID: 1491142 ("Uninitialized scalar variable") Fixes: 2a588dd1d5d6 ("tracing: Add kprobe event command generation functions") Reviewed-by: Tom Zanussi <zanussi@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-02-10bootconfig: Remove unneeded CONFIG_LIBXBCMasami Hiramatsu3-5/+1
Since there is no user except CONFIG_BOOT_CONFIG and no plan to use it from other functions, CONFIG_LIBXBC can be removed and we can use CONFIG_BOOT_CONFIG directly. Link: http://lkml.kernel.org/r/158098769281.939.16293492056419481105.stgit@devnote2 Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-02-10tools/bootconfig: Fix wrong __VA_ARGS__ usageMasami Hiramatsu1-1/+1
Since printk() wrapper macro uses __VA_ARGS__ without "##" prefix, it causes a build error if there is no variable arguments (e.g. only fmt is specified.) To fix this error, use ##__VA_ARGS__ instead of __VAR_ARGS__. Link: http://lkml.kernel.org/r/158108370130.2758.10893830923800978011.stgit@devnote2 Fixes: 950313ebf79c ("tools: bootconfig: Add bootconfig command") Reported-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-02-09Linux 5.6-rc1Linus Torvalds1-2/+2
2020-02-09Merge tag 'kbuild-v5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuildLinus Torvalds53-261/+252
Pull more Kbuild updates from Masahiro Yamada: - fix randconfig to generate a sane .config - rename hostprogs-y / always to hostprogs / always-y, which are more natual syntax. - optimize scripts/kallsyms - fix yes2modconfig and mod2yesconfig - make multiple directory targets ('make foo/ bar/') work * tag 'kbuild-v5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: make multiple directory targets work kconfig: Invalidate all symbols after changing to y or m. kallsyms: fix type of kallsyms_token_table[] scripts/kallsyms: change table to store (strcut sym_entry *) scripts/kallsyms: rename local variables in read_symbol() kbuild: rename hostprogs-y/always to hostprogs/always-y kbuild: fix the document to use extra-y for vmlinux.lds kconfig: fix broken dependency in randconfig-generated .config
2020-02-09Merge tag 'zonefs-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefsLinus Torvalds9-0/+2058
Pull new zonefs file system from Damien Le Moal: "Zonefs is a very simple file system exposing each zone of a zoned block device as a file. Unlike a regular file system with native zoned block device support (e.g. f2fs or the on-going btrfs effort), zonefs does not hide the sequential write constraint of zoned block devices to the user. As a result, zonefs is not a POSIX compliant file system. Its goal is to simplify the implementation of zoned block devices support in applications by replacing raw block device file accesses with a richer file based API, avoiding relying on direct block device file ioctls which may be more obscure to developers. One example of this approach is the implementation of LSM (log-structured merge) tree structures (such as used in RocksDB and LevelDB) on zoned block devices by allowing SSTables to be stored in a zone file similarly to a regular file system rather than as a range of sectors of a zoned device. The introduction of the higher level construct "one file is one zone" can help reducing the amount of changes needed in the application while at the same time allowing the use of zoned block devices with various programming languages other than C. Zonefs IO management implementation uses the new iomap generic code. Zonefs has been successfully tested using a functional test suite (available with zonefs userland format tool on github) and a prototype implementation of LevelDB on top of zonefs" * tag 'zonefs-5.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs: zonefs: Add documentation fs: New zonefs file system
2020-02-09irqchip/gic-v4.1: Avoid 64bit division for the sake of 32bit ARMMarc Zyngier1-2/+2
In order to allow the GICv4 code to link properly on 32bit ARM, make sure we don't use 64bit divisions when it isn't strictly necessary. Fixes: 4e6437f12d6e ("irqchip/gic-v4.1: Ensure L2 vPE table is allocated at RD level") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-09Merge tag '5.6-rc-smb3-plugfest-patches' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds22-129/+247
Pull cifs fixes from Steve French: "13 cifs/smb3 patches, most from testing at the SMB3 plugfest this week: - Important fix for multichannel and for modefromsid mounts. - Two reconnect fixes - Addition of SMB3 change notify support - Backup tools fix - A few additional minor debug improvements (tracepoints and additional logging found useful during testing this week)" * tag '5.6-rc-smb3-plugfest-patches' of git://git.samba.org/sfrench/cifs-2.6: smb3: Add defines for new information level, FileIdInformation smb3: print warning once if posix context returned on open smb3: add one more dynamic tracepoint missing from strict fsync path cifs: fix mode bits from dir listing when mounted with modefromsid cifs: fix channel signing cifs: add SMB3 change notification support cifs: make multichannel warning more visible cifs: fix soft mounts hanging in the reconnect code cifs: Add tracepoints for errors on flush or fsync cifs: log warning message (once) if out of disk space cifs: fail i/o on soft mounts if sessionsetup errors out smb3: fix problem with null cifs super block with previous patch SMB3: Backup intent flag missing from some more ops
2020-02-09Merge branch 'work.vboxsf' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds12-0/+3280
Pull vboxfs from Al Viro: "This is the VirtualBox guest shared folder support by Hans de Goede, with fixups for fs_parse folded in to avoid bisection hazards from those API changes..." * 'work.vboxsf' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Add VirtualBox guest shared folder (vboxsf) support
2020-02-09Merge tag 'x86-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds13-11/+260
Pull x86 fixes from Thomas Gleixner: "A set of fixes for X86: - Ensure that the PIT is set up when the local APIC is disable or configured in legacy mode. This is caused by an ordering issue introduced in the recent changes which skip PIT initialization when the TSC and APIC frequencies are already known. - Handle malformed SRAT tables during early ACPI parsing which caused an infinite loop anda boot hang. - Fix a long standing race in the affinity setting code which affects PCI devices with non-maskable MSI interrupts. The problem is caused by the non-atomic writes of the MSI address (destination APIC id) and data (vector) fields which the device uses to construct the MSI message. The non-atomic writes are mandated by PCI. If both fields change and the device raises an interrupt after writing address and before writing data, then the MSI block constructs a inconsistent message which causes interrupts to be lost and subsequent malfunction of the device. The fix is to redirect the interrupt to the new vector on the current CPU first and then switch it over to the new target CPU. This allows to observe an eventually raised interrupt in the transitional stage (old CPU, new vector) to be observed in the APIC IRR and retriggered on the new target CPU and the new vector. The potential spurious interrupts caused by this are harmless and can in the worst case expose a buggy driver (all handlers have to be able to deal with spurious interrupts as they can and do happen for various reasons). - Add the missing suspend/resume mechanism for the HYPERV hypercall page which prevents resume hibernation on HYPERV guests. This change got lost before the merge window. - Mask the IOAPIC before disabling the local APIC to prevent potentially stale IOAPIC remote IRR bits which cause stale interrupt lines after resume" * tag 'x86-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Mask IOAPIC entries when disabling the local APIC x86/hyperv: Suspend/resume the hypercall page for hibernation x86/apic/msi: Plug non-maskable MSI affinity race x86/boot: Handle malformed SRAT tables during early ACPI parsing x86/timer: Don't skip PIT setup when APIC is disabled or in legacy mode
2020-02-09Merge tag 'smp-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-2/+3
Pull SMP fixes from Thomas Gleixner: "Two fixes for the SMP related functionality: - Make the UP version of smp_call_function_single() match SMP semantics when called for a not available CPU. Instead of emitting a warning and assuming that the function call target is CPU0, return a proper error code like the SMP version does. - Remove a superfluous check in smp_call_function_many_cond()" * tag 'smp-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: smp/up: Make smp_call_function_single() match SMP semantics smp: Remove superfluous cond_func check in smp_call_function_many_cond()
2020-02-09Merge tag 'perf-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds10-36/+88
Pull perf fixes from Thomas Gleixner: "A set of fixes and improvements for the perf subsystem: Kernel fixes: - Install cgroup events to the correct CPU context to prevent a potential list double add - Prevent an integer underflow in the perf mlock accounting - Add a missing prototype for arch_perf_update_userpage() Tooling: - Add a missing unlock in the error path of maps__insert() in perf maps. - Fix the build with the latest libbfd - Fix the perf parser so it does not delete parse event terms, which caused a regression for using perf with the ARM CoreSight as the sink configuration was missing due to the deletion. - Fix the double free in the perf CPU map merging test case - Add the missing ustring support for the perf probe command" * tag 'perf-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf maps: Add missing unlock to maps__insert() error case perf probe: Add ustring support for perf probe command perf: Make perf able to build with latest libbfd perf test: Fix test case Merge cpu map perf parse: Copy string to perf_evsel_config_term perf parse: Refactor 'struct perf_evsel_config_term' kernel/events: Add a missing prototype for arch_perf_update_userpage() perf/cgroups: Install cgroup events to correct cpuctx perf/core: Fix mlock accounting in perf_mmap()
2020-02-09Merge tag 'timers-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-3/+10
Pull timer fixes from Thomas Gleixner: "Two small fixes for the time(r) subsystem: - Handle a subtle race between the clocksource watchdog and a concurrent clocksource watchdog stop/start sequence correctly to prevent a timer double add bug. - Fix the file path for the core time namespace file" * tag 'timers-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource: Prevent double add_timer_on() for watchdog_timer MAINTAINERS: Correct path to time namespace source file
2020-02-09Merge tag 'irq-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds7-35/+127
Pull interrupt fixes from Thomas Gleixner: "A set of fixes for the interrupt subsystem: - Provision only ACPI enabled redistributors on GICv3 - Use the proper command colums when building the INVALL command for the GICv3-ITS - Ensure the allocation of the L2 vPE table for GICv4.1 - Correct the GICv4.1 VPROBASER programming so it uses the proper size - A set of small GICv4.1 tidy up patches - Configuration cleanup for C-SKY interrupt chip - Clarify the function documentation for irq_set_wake() to document that the wakeup functionality is orthogonal to the irq disable/enable mechanism" * tag 'irq-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic-v3-its: Rename VPENDBASER/VPROPBASER accessors irqchip/gic-v3-its: Remove superfluous WARN_ON irqchip/gic-v4.1: Drop 'tmp' in inherit_vpe_l1_table_from_rd() irqchip/gic-v4.1: Ensure L2 vPE table is allocated at RD level irqchip/gic-v4.1: Set vpe_l1_base for all redistributors irqchip/gic-v4.1: Fix programming of GICR_VPROPBASER_4_1_SIZE genirq: Clarify that irq wake state is orthogonal to enable/disable irqchip/gic-v3-its: Reference to its_invall_cmd descriptor when building INVALL irqchip: Some Kconfig cleanup for C-SKY irqchip/gic-v3: Only provision redistributors that are enabled in ACPI
2020-02-09Merge tag 'efi-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+1
Pull EFI fix from Thomas Gleixner: "A single fix for a EFI boot regression on X86 which was caused by the recent rework of the EFI memory map parsing. On systems with invalid memmap entries the cleanup function uses an value which cannot be relied on in this stage. Use the actual EFI memmap entry instead" * tag 'efi-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/x86: Fix boot regression on systems with invalid memmap entries
2020-02-08Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds7-20/+29
Pull misc SCSI fixes from James Bottomley: "Five small patches, all in drivers or doc, which missed the initial pull request. The qla2xxx and megaraid_sas are actual fixes and the rest are spelling and doc changes" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: fix spelling mistake "initilized" -> "initialized" scsi: pm80xx: fix spelling mistake "to" -> "too" scsi: MAINTAINERS: ufs: remove pedrom.sousa@synopsys.com scsi: megaraid_sas: fixup MSIx interrupt setup during resume scsi: qla2xxx: Fix unbound NVME response length
2020-02-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds80-327/+782
Pull networking fixes from David Miller: 1) Unbalanced locking in mwifiex_process_country_ie, from Brian Norris. 2) Fix thermal zone registration in iwlwifi, from Andrei Otcheretianski. 3) Fix double free_irq in sgi ioc3 eth, from Thomas Bogendoerfer. 4) Use after free in mptcp, from Florian Westphal. 5) Use after free in wireguard's root_remove_peer_lists, from Eric Dumazet. 6) Properly access packets heads in bonding alb code, from Eric Dumazet. 7) Fix data race in skb_queue_len(), from Qian Cai. 8) Fix regression in r8169 on some chips, from Heiner Kallweit. 9) Fix XDP program ref counting in hv_netvsc, from Haiyang Zhang. 10) Certain kinds of set link netlink operations can cause a NULL deref in the ipv6 addrconf code. Fix from Eric Dumazet. 11) Don't cancel uninitialized work queue in drop monitor, from Ido Schimmel. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (84 commits) net: thunderx: use proper interface type for RGMII mt76: mt7615: fix max_nss in mt7615_eeprom_parse_hw_cap bpf: Improve bucket_log calculation logic selftests/bpf: Test freeing sockmap/sockhash with a socket in it bpf, sockhash: Synchronize_rcu before free'ing map bpf, sockmap: Don't sleep while holding RCU lock on tear-down bpftool: Don't crash on missing xlated program instructions bpf, sockmap: Check update requirements after locking drop_monitor: Do not cancel uninitialized work item mlxsw: spectrum_dpipe: Add missing error path mlxsw: core: Add validation of hardware device types for MGPIR register mlxsw: spectrum_router: Clear offload indication from IPv6 nexthops on abort selftests: mlxsw: Add test cases for local table route replacement mlxsw: spectrum_router: Prevent incorrect replacement of local table routes net: dsa: microchip: enable module autoprobe ipv6/addrconf: fix potential NULL deref in inet6_set_link_af() dpaa_eth: support all modes with rate adapting PHYs net: stmmac: update pci platform data to use phy_interface net: stmmac: xgmac: fix missing IFF_MULTICAST checki in dwxgmac2_set_filter net: stmmac: fix missing IFF_MULTICAST check in dwmac4_set_filter ...
2020-02-08fs: Add VirtualBox guest shared folder (vboxsf) supportHans de Goede12-0/+3280
VirtualBox hosts can share folders with guests, this commit adds a VFS driver implementing the Linux-guest side of this, allowing folders exported by the host to be mounted under Linux. This driver depends on the guest <-> host IPC functions exported by the vboxguest driver. Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-02-08Merge tag 'powerpc-5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds2-5/+7
Pull powerpc fixes from Michael Ellerman: - Fix an existing bug in our user access handling, exposed by one of the bug fixes we merged this cycle. - A fix for a boot hang on 32-bit with CONFIG_TRACE_IRQFLAGS and the recently added CONFIG_VMAP_STACK. Thanks to: Christophe Leroy, Guenter Roeck. * tag 'powerpc-5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc: Fix CONFIG_TRACE_IRQFLAGS with CONFIG_VMAP_STACK powerpc/futex: Fix incorrect user access blocking
2020-02-08Fix up remaining devm_ioremap_nocache() in SGI IOC3 8250 UART driverLinus Torvalds1-1/+1
This is a merge error on my part - the driver was merged into mainline by commit c5951e7c8ee5 ("Merge tag 'mips_5.6' of git://../mips/linux") over a week ago, but nobody apparently noticed that it didn't actually build due to still having a reference to the devm_ioremap_nocache() function, removed a few days earlier through commit 6a1000bd2703 ("Merge tag 'ioremap-5.6' of git://../ioremap"). Apparently this didn't get any build testing anywhere. Not perhaps all that surprising: it's restricted to 64-bit MIPS only, and only with the new SGI_MFD_IOC3 support enabled. I only noticed because the ioremap conflicts in the ARM SoC driver update made me check there weren't any others hiding, and I found this one. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-08Merge tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds35-675/+697
Pull ARM SoC late updates from Olof Johansson: "This is some material that we picked up into our tree late, or that had more complex dependencies on more than one topic branch that makes sense to keep separately. - TI support for secure accelerators and hwrng on OMAP4/5 - TI camera changes for dra7 and am437x and SGX improvement due to better reset control support on am335x, am437x and dra7 - Davinci moves to proper clocksource on DM365, and regulator/audio improvements for DM365 and DM644x eval boards" * tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (32 commits) ARM: dts: omap4-droid4: Enable hdq for droid4 ds250x 1-wire battery nvmem ARM: dts: motorola-cpcap-mapphone: Configure calibration interrupt ARM: dts: Configure interconnect target module for am437x sgx ARM: dts: Configure sgx for dra7 ARM: dts: Configure rstctrl reset for am335x SGX ARM: dts: dra7: Add ti-sysc node for VPE ARM: dts: dra7: add vpe clkctrl node ARM: dts: am43x-epos-evm: Add VPFE and OV2659 entries ARM: dts: am437x-sk-evm: Add VPFE and OV2659 entries ARM: dts: am43xx: add support for clkout1 clock arm: dts: dra76-evm: Add CAL and OV5640 nodes arm: dtsi: dra76x: Add CAL dtsi node arm: dts: dra72-evm-common: Add entries for the CSI2 cameras ARM: dts: DRA72: Add CAL dtsi node ARM: dts: dra7-l4: Add ti-sysc node for CAM ARM: OMAP: DRA7xx: Make CAM clock domain SWSUP only ARM: dts: dra7: add cam clkctrl node ARM: OMAP2+: Drop legacy platform data for omap4 des ARM: OMAP2+: Drop legacy platform data for omap4 sham ARM: OMAP2+: Drop legacy platform data for omap4 aes ...
2020-02-08Merge tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds9-36/+113
Pull ARM SoC defconfig updates from Olof Johansson: "We keep this in a separate branch to avoid cross-branch conflicts, but most of the material here is fairly boring -- some new drivers turned on for hardware since they were merged, and some refreshed files due to time having moved a lot of entries around" * tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (38 commits) ARM: configs: at91: enable MMC_SDHCI_OF_AT91 and MICROCHIP_PIT64B arm64: defconfig: Enable Broadcom's GENET Ethernet controller ARM: multi_v7_defconfig: Enable devfreq thermal integration ARM: exynos_defconfig: Enable devfreq thermal integration ARM: multi_v7_defconfig: Enable NFS v4.1 and v4.2 ARM: exynos_defconfig: Enable NFS v4.1 and v4.2 arm64: defconfig: Enable Actions Semi specific drivers arm64: defconfig: Enable Broadcom's STB PCIe controller arm64: defconfig: Enable CONFIG_CLK_IMX8MP by default ARM: configs: at91: enable config flags for sam9x60 SoC ARM: configs: at91: use savedefconfig arm64: defconfig: Enable tegra XUDC support ARM: defconfig: gemini: Update defconfig arm64: defconfig: enable CONFIG_ARM_QCOM_CPUFREQ_NVMEM arm64: defconfig: enable CONFIG_QCOM_CPR arm64: defconfig: Enable HFPLL arm64: defconfig: Enable CRYPTO_DEV_FSL_CAAM ARM: imx_v6_v7_defconfig: Select the TFP410 driver ARM: imx_v6_v7_defconfig: Enable NFS_V4_1 and NFS_V4_2 support arm64: defconfig: Enable ATH10K_SNOC ...
2020-02-08Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds142-3276/+6579
Pull ARM SoC-related driver updates from Olof Johansson: "Various driver updates for platforms: - Nvidia: Fuse support for Tegra194, continued memory controller pieces for Tegra30 - NXP/FSL: Refactorings of QuickEngine drivers to support ARM/ARM64/PPC - NXP/FSL: i.MX8MP SoC driver pieces - TI Keystone: ring accelerator driver - Qualcomm: SCM driver cleanup/refactoring + support for new SoCs. - Xilinx ZynqMP: feature checking interface for firmware. Mailbox communication for power management - Overall support patch set for cpuidle on more complex hierarchies (PSCI-based) and misc cleanups, refactorings of Marvell, TI, other platforms" * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (166 commits) drivers: soc: xilinx: Use mailbox IPI callback dt-bindings: power: reset: xilinx: Add bindings for ipi mailbox drivers: soc: ti: knav_qmss_queue: Pass lockdep expression to RCU lists MAINTAINERS: Add brcmstb PCIe controller entry soc/tegra: fuse: Unmap registers once they are not needed anymore soc/tegra: fuse: Correct straps' address for older Tegra124 device trees soc/tegra: fuse: Warn if straps are not ready soc/tegra: fuse: Cache values of straps and Chip ID registers memory: tegra30-emc: Correct error message for timed out auto calibration memory: tegra30-emc: Firm up hardware programming sequence memory: tegra30-emc: Firm up suspend/resume sequence soc/tegra: regulators: Do nothing if voltage is unchanged memory: tegra: Correct reset value of xusb_hostr soc/tegra: fuse: Add APB DMA dependency for Tegra20 bus: tegra-aconnect: Remove PM_CLK dependency dt-bindings: mediatek: add MT6765 power dt-bindings soc: mediatek: cmdq: delete not used define memory: tegra: Add support for the Tegra194 memory controller memory: tegra: Only include support for enabled SoCs memory: tegra: Support DVFS on Tegra186 and later ...