linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2020-05-29	exec: Add a per bprm->file version of per_clear	Eric W. Biederman	4	-3/+12
	There is a small bug in the code that recomputes parts of bprm->cred for every bprm->file. The code never recomputes the part of clear_dangerous_personality_flags it is responsible for. Which means that in practice if someone creates a sgid script the interpreter will not be able to use any of: READ_IMPLIES_EXEC ADDR_NO_RANDOMIZE ADDR_COMPAT_LAYOUT MMAP_PAGE_ZERO. This accentially clearing of personality flags probably does not matter in practice because no one has complained but it does make the code more difficult to understand. Further remaining bug compatible prevents the recomputation from being removed and replaced by simply computing bprm->cred once from the final bprm->file. Making this change removes the last behavior difference between computing bprm->creds from the final file and recomputing bprm->cred several times. Which allows this behavior change to be justified for it's own reasons, and for any but hunts looking into why the behavior changed to wind up here instead of in the code that will follow that computes bprm->cred from the final bprm->file. This small logic bug appears to have existed since the code started clearing dangerous personality bits. History Tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Fixes: 1bb0fa189c6a ("[PATCH] NX: clean up legacy binary support") Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-27	binfmt_elf_fdpic: fix execfd build regression	Arnd Bergmann	1	-1/+1
	The change to bprm->have_execfd was incomplete, leading to a build failure: fs/binfmt_elf_fdpic.c: In function 'create_elf_fdpic_tables': fs/binfmt_elf_fdpic.c:591:27: error: 'BINPRM_FLAGS_EXECFD' undeclared Change the last user of BINPRM_FLAGS_EXECFD in a corresponding way. Reported-by: Valdis Klētnieks <valdis.kletnieks@vt.edu> Fixes: b8a61c9e7b4a ("exec: Generic execfd support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2020-05-26	exec: Always set cap_ambient in cap_bprm_set_creds	Eric W. Biederman	1	-0/+1
	An invariant of cap_bprm_set_creds is that every field in the new cred structure that cap_bprm_set_creds might set, needs to be set every time to ensure the fields does not get a stale value. The field cap_ambient is not set every time cap_bprm_set_creds is called, which means that if there is a suid or sgid script with an interpreter that has neither the suid nor the sgid bits set the interpreter should be able to accept ambient credentials. Unfortuantely because cap_ambient is not reset to it's original value the interpreter can not accept ambient credentials. Given that the ambient capability set is expected to be controlled by the caller, I don't think this is particularly serious. But it is definitely worth fixing so the code works correctly. I have tested to verify my reading of the code is correct and the interpreter of a sgid can receive ambient capabilities with this change and cannot receive ambient capabilities without this change. Cc: stable@vger.kernel.org Cc: Andy Lutomirski <luto@kernel.org> Fixes: 58319057b784 ("capabilities: ambient capabilities") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-21	selftests/exec: Add binfmt_script regression test	Kees Cook	2	-0/+172
	While working on commit b5372fe5dc84 ("exec: load_script: Do not exec truncated interpreter path"), I wrote a series of test scripts to verify corner cases. However, soon after, commit 6eb3c3d0a52d ("exec: increase BINPRM_BUF_SIZE to 256") landed, resulting in the tests needing to be refactored for the larger BINPRM_BUF_SIZE, which got lost on my TODO list. During the recent exec refactoring work[1], the need for these tests resurfaced, so I've finished them up for addition to the kernel selftests. [1] https://lore.kernel.org/lkml/202005191144.E3112135@keescook/ Link: https://lkml.kernel.org/r/202005200204.D07DF079@keescook Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2020-05-21	exec: Remove recursion from search_binary_handler	Eric W. Biederman	6	-55/+43
	Recursion in kernel code is generally a bad idea as it can overflow the kernel stack. Recursion in exec also hides that the code is looping and that the loop changes bprm->file. Instead of recursing in search_binary_handler have the methods that would recurse set bprm->interpreter and return 0. Modify exec_binprm to loop when bprm->interpreter is set. Consolidate all of the reassignments of bprm->file in that loop to make it clear what is going on. The structure of the new loop in exec_binprm is that all errors return immediately, while successful completion (ret == 0 && !bprm->interpreter) just breaks out of the loop and runs what exec_bprm has always run upon successful completion. Fail if the an interpreter is being call after execfd has been set. The code has never properly handled an interpreter being called with execfd being set and with reassignments of bprm->file and the assignment of bprm->executable in generic code it has finally become possible to test and fail when if this problematic condition happens. With the reassignments of bprm->file and the assignment of bprm->executable moved into the generic code add a test to see if bprm->executable is being reassigned. In search_binary_handler remove the test for !bprm->file. With all reassignments of bprm->file moved to exec_binprm bprm->file can never be NULL in search_binary_handler. Link: https://lkml.kernel.org/r/87sgfwyd84.fsf_-_@x220.int.ebiederm.org Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-21	exec: Generic execfd support	Eric W. Biederman	5	-41/+32
	Most of the support for passing the file descriptor of an executable to an interpreter already lives in the generic code and in binfmt_elf. Rework the fields in binfmt_elf that deal with executable file descriptor passing to make executable file descriptor passing a first class concept. Move the fd_install from binfmt_misc into begin_new_exec after the new creds have been installed. This means that accessing the file through /proc/<pid>/fd/N is able to see the creds for the new executable before allowing access to the new executables files. Performing the install of the executables file descriptor after the point of no return also means that nothing special needs to be done on error. The exiting of the process will close all of it's open files. Move the would_dump from binfmt_misc into begin_new_exec right after would_dump is called on the bprm->file. This makes it obvious this case exists and that no nesting of bprm->file is currently supported. In binfmt_misc the movement of fd_install into generic code means that it's special error exit path is no longer needed. Link: https://lkml.kernel.org/r/87y2poyd91.fsf_-_@x220.int.ebiederm.org Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-21	exec/binfmt_script: Don't modify bprm->buf and then return -ENOEXEC	Eric W. Biederman	1	-42/+38
	The return code -ENOEXEC serves to tell search_binary_handler that it should continue searching for the binfmt to handle a given file. This makes return -ENOEXEC with a bprm->buf that is needed to continue the search problematic. The current binfmt_script manages to escape problems as it closes and clears bprm->file before return -ENOEXEC with bprm->buf modified. This prevents search_binary_handler from looping as it explicitly handles a NULL bprm->file. I plan on moving all of the bprm->file managment into fs/exec.c and out of the binary handlers so this will become a problem. Move closing bprm->file and the test for BINPRM_PATH_INACCESSIBLE down below the last return of -ENOEXEC. Introduce i_sep and i_end to track the end of the first argument and the end of the parameters respectively. Using those, constification of all char * pointers, and the helpers next_terminator and next_non_spacetab guarantee the parameter parsing will not modify bprm->buf. Only modify bprm->buf to terminate the strings i_arg and i_name with '\0' for passing to copy_strings_kernel. When replacing loops with next_non_spacetab and next_terminator care has been take that the logic of the parsing code (short of replacing characters by '\0') remains the same. Link: https://lkml.kernel.org/r/874ksczru6.fsf_-_@x220.int.ebiederm.org Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-21	exec: Move the call of prepare_binprm into search_binary_handler	Eric W. Biederman	6	-22/+5
	The code in prepare_binary_handler needs to be run every time search_binary_handler is called so move the call into search_binary_handler itself to make the code simpler and easier to understand. Link: https://lkml.kernel.org/r/87d070zrvx.fsf_-_@x220.int.ebiederm.org Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-21	exec: Allow load_misc_binary to call prepare_binprm unconditionally	Eric W. Biederman	3	-19/+17
	Add a flag preserve_creds that binfmt_misc can set to prevent credentials from being updated. This allows binfmt_misc to always call prepare_binprm. Allowing the credential computation logic to be consolidated. Not replacing the credentials with the interpreters credentials is safe because because an open file descriptor to the executable is passed to the interpreter. As the interpreter does not need to reopen the executable it is guaranteed to see the same file that exec sees. Ref: c407c033de84 ("[PATCH] binfmt_misc: improve calculation of interpreter's credentials") Link: https://lkml.kernel.org/r/87imgszrwo.fsf_-_@x220.int.ebiederm.org Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-21	exec: Convert security_bprm_set_creds into security_bprm_repopulate_creds	Eric W. Biederman	7	-20/+19
	Rename bprm->cap_elevated to bprm->active_secureexec and initialize it in prepare_binprm instead of in cap_bprm_set_creds. Initializing bprm->active_secureexec in prepare_binprm allows multiple implementations of security_bprm_repopulate_creds to play nicely with each other. Rename security_bprm_set_creds to security_bprm_reopulate_creds to emphasize that this path recomputes part of bprm->cred. This recomputation avoids the time of check vs time of use problems that are inherent in unix #! interpreters. In short two renames and a move in the location of initializing bprm->active_secureexec. Link: https://lkml.kernel.org/r/87o8qkzrxp.fsf_-_@x220.int.ebiederm.org Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-20	exec: Factor security_bprm_creds_for_exec out of security_bprm_set_creds	Eric W. Biederman	12	-63/+63
	Today security_bprm_set_creds has several implementations: apparmor_bprm_set_creds, cap_bprm_set_creds, selinux_bprm_set_creds, smack_bprm_set_creds, and tomoyo_bprm_set_creds. Except for cap_bprm_set_creds they all test bprm->called_set_creds and return immediately if it is true. The function cap_bprm_set_creds ignores bprm->calld_sed_creds entirely. Create a new LSM hook security_bprm_creds_for_exec that is called just before prepare_binprm in __do_execve_file, resulting in a LSM hook that is called exactly once for the entire of exec. Modify the bits of security_bprm_set_creds that only want to be called once per exec into security_bprm_creds_for_exec, leaving only cap_bprm_set_creds behind. Remove bprm->called_set_creds all of it's former users have been moved to security_bprm_creds_for_exec. Add or upate comments a appropriate to bring them up to date and to reflect this change. Link: https://lkml.kernel.org/r/87v9kszrzh.fsf_-_@x220.int.ebiederm.org Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Casey Schaufler <casey@schaufler-ca.com> # For the LSM and Smack bits Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-20	exec: Teach prepare_exec_creds how exec treats uids & gids	Eric W. Biederman	1	-0/+3
	It is almost possible to use the result of prepare_exec_creds with no modifications during exec. Update prepare_exec_creds to initialize the suid and the fsuid to the euid, and the sgid and the fsgid to the egid. This is all that is needed to handle the common case of exec when nothing special like a setuid exec is happening. That this preserves the existing behavior of exec can be verified by examing bprm_fill_uid and cap_bprm_set_creds. This change makes it clear that the later parts of exec that update bprm->cred are just need to handle special cases such as setuid exec and change of domains. Link: https://lkml.kernel.org/r/871rng22dm.fsf_-_@x220.int.ebiederm.org Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-17	exec: Move would_dump into flush_old_exec	Eric W. Biederman	1	-2/+2
	I goofed when I added mm->user_ns support to would_dump. I missed the fact that in the case of binfmt_loader, binfmt_em86, binfmt_misc, and binfmt_script bprm->file is reassigned. Which made the move of would_dump from setup_new_exec to __do_execve_file before exec_binprm incorrect as it can result in would_dump running on the script instead of the interpreter of the script. The net result is that the code stopped making unreadable interpreters undumpable. Which allows them to be ptraced and written to disk without special permissions. Oops. The move was necessary because the call in set_new_exec was after bprm->mm was no longer valid. To correct this mistake move the misplaced would_dump from __do_execve_file into flos_old_exec, before exec_mmap is called. I tested and confirmed that without this fix I can attach with gdb to a script with an unreadable interpreter, and with this fix I can not. Cc: stable@vger.kernel.org Fixes: f84df2a6f268 ("exec: Ensure mm->user_ns contains the execed files") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-11	exec: Set the point of no return sooner	Eric W. Biederman	1	-7/+5
	Make the code more robust by marking the point of no return sooner. This ensures that future code changes don't need to worry about how they return errors if they are past this point. This results in no actual change in behavior as __do_execve_file does not force SIGSEGV when there is a pending fatal signal pending past the point of no return. Further the only error returns from de_thread and exec_mmap that can occur result in fatal signals being pending. Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/87sgga5klu.fsf_-_@x220.int.ebiederm.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-11	exec: Move handling of the point of no return to the top level	Eric W. Biederman	1	-9/+12
	Move the handing of the point of no return from search_binary_handler into __do_execve_file so that it is easier to find, and to keep things robust in the face of change. Make it clear that an existing fatal signal will take precedence over a forced SIGSEGV by not forcing SIGSEGV if a fatal signal is already pending. This does not change the behavior but it saves a reader of the code the tedium of reading and understanding force_sig and the signal delivery code. Update the comment in begin_new_exec about where SIGSEGV is forced. Keep point_of_no_return from being a mystery by documenting what the code is doing where it forces SIGSEGV if the code is past the point of no return. Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/87y2q25knl.fsf_-_@x220.int.ebiederm.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-11	exec: Run sync_mm_rss before taking exec_update_mutex	Eric W. Biederman	1	-1/+2
	Like exec_mm_release sync_mm_rss is about flushing out the state of the old_mm, which does not need to happen under exec_update_mutex. Make this explicit by moving sync_mm_rss outside of exec_update_mutex. Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/875zd66za3.fsf_-_@x220.int.ebiederm.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-09	exec: Fix spelling of search_binary_handler in a comment	Eric W. Biederman	1	-1/+1
	Link: https://lkml.kernel.org/r/87h7wq6zc1.fsf_-_@x220.int.ebiederm.org Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-09	exec: Move the comment from above de_thread to above unshare_sighand	Eric W. Biederman	1	-6/+6
	The comment describes work that now happens in unshare_sighand so move the comment where it makes sense. Link: https://lkml.kernel.org/r/87mu6i6zcs.fsf_-_@x220.int.ebiederm.org Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-07	exec: Rename flush_old_exec begin_new_exec	Eric W. Biederman	8	-9/+9
	There is and has been for a very long time been a lot more going on in flush_old_exec than just flushing the old state. After the movement of code from setup_new_exec there is a whole lot more going on than just flushing the old executables state. Rename flush_old_exec to begin_new_exec to more accurately reflect what this function does. Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-07	exec: Move most of setup_new_exec into flush_old_exec	Eric W. Biederman	1	-41/+44
	The current idiom for the callers is: flush_old_exec(bprm); set_personality(...); setup_new_exec(bprm); In 2010 Linus split flush_old_exec into flush_old_exec and setup_new_exec. With the intention that setup_new_exec be what is called after the processes new personality is set. Move the code that doesn't depend upon the personality from setup_new_exec into flush_old_exec. This is to facilitate future changes by having as much code together in one function as possible. To see why it is safe to move this code please note that effectively this change moves the personality setting in the binfmt and the following three lines of code after everything except unlocking the mutexes: arch_pick_mmap_layout arch_setup_new_exec mm->task_size = TASK_SIZE The function arch_pick_mmap_layout at most sets: mm->get_unmapped_area mm->mmap_base mm->mmap_legacy_base mm->mmap_compat_base mm->mmap_compat_legacy_base which nothing in flush_old_exec or setup_new_exec depends on. The function arch_setup_new_exec only sets architecture specific state and the rest of the functions only deal in state that applies to all architectures. The last line just sets mm->task_size and again nothing in flush_old_exec or setup_new_exec depend on task_size. Ref: 221af7f87b97 ("Split 'flush_old_exec' into two functions") Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-07	exec: In setup_new_exec cache current in the local variable me	Eric W. Biederman	1	-11/+12
	At least gcc 8.3 when generating code for x86_64 has a hard time consolidating multiple calls to current aka get_current(), and winds up unnecessarily rereading %gs:current_task several times in setup_new_exec. Caching the value of current in the local variable of me generates slightly better and shorter assembly. Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-07	exec: Merge install_exec_creds into setup_new_exec	Eric W. Biederman	8	-37/+27
	The two functions are now always called one right after the other so merge them together to make future maintenance easier. Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-07	exec: Rename the flag called_exec_mmap point_of_no_return	Eric W. Biederman	2	-9/+9
	Update the comments and make the code easier to understand by renaming this flag. Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-07	exec: Make unlocking exec_update_mutex explict	Eric W. Biederman	2	-5/+4
	With install_exec_creds updated to follow immediately after setup_new_exec, the failure of unshare_sighand is the only code path where exec_update_mutex is held but not explicitly unlocked. Update that code path to explicitly unlock exec_update_mutex. Remove the unlocking of exec_update_mutex from free_bprm. Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-07	binfmt: Move install_exec_creds after setup_new_exec to match binfmt_elf	Eric W. Biederman	4	-6/+4
	In 2016 Linus moved install_exec_creds immediately after setup_new_exec, in binfmt_elf as a cleanup and as part of closing a potential information leak. Perform the same cleanup for the other binary formats. Different binary formats doing the same things the same way makes exec easier to reason about and easier to maintain. Greg Ungerer reports: > I tested the the whole series on non-MMU m68k and non-MMU arm > (exercising binfmt_flat) and it all tested out with no problems, > so for the binfmt_flat changes: Tested-by: Greg Ungerer <gerg@linux-m68k.org> Ref: 9f834ec18def ("binfmt_elf: switch to new creds when switching to new mm") Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-04-26	Linux 5.7-rc3	Linus Torvalds	1	-1/+1

2020-04-26	firmware_loader: revert removal of the fw_fallback_config export	Luis Chamberlain	1	-0/+1
	Christoph's patch removed two unsused exported symbols, however, one symbol is used by the firmware_loader itself. If CONFIG_FW_LOADER=m so the firmware_loader is modular but CONFIG_FW_LOADER_USER_HELPER=y we fail the build at mostpost. ERROR: modpost: "fw_fallback_config" [drivers/base/firmware_loader/firmware_class.ko] undefined! This happens because the variable fw_fallback_config is built into the kernel if CONFIG_FW_LOADER_USER_HELPER=y always, so we need to grant access to the firmware loader module by exporting it. Revert only one hunk from his patch. Fixes: 739604734bd8 ("firmware_loader: remove unused exports") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20200424184916.22843-1-mcgrof@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-25	s390/protvirt: fix compilation issue	Claudio Imbrenda	2	-3/+2
	The kernel fails to compile with CONFIG_PROTECTED_VIRTUALIZATION_GUEST set but CONFIG_KVM unset. This patch fixes the issue by making the needed variable always available. Link: https://lkml.kernel.org/r/20200423120114.2027410-1-imbrenda@linux.ibm.com Fixes: a0f60f843199 ("s390/protvirt: Add sysfs firmware interface for Ultravisor information") Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Philipp Rudo <prudo@linux.ibm.com> Suggested-by: Philipp Rudo <prudo@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-04-24	selftests/bpf: Fix a couple of broken test_btf cases	Stanislav Fomichev	4	-40/+16
	Commit 51c39bb1d5d1 ("bpf: Introduce function-by-function verification") introduced function linkage flag and changed the error message from "vlen != 0" to "Invalid func linkage" and broke some fake BPF programs. Adjust the test accordingly. AFACT, the programs don't really need any arguments and only look at BTF for maps, so let's drop the args altogether. Before: BTF raw test[103] (func (Non zero vlen)): do_test_raw:3703:FAIL expected err_str:vlen != 0 magic: 0xeb9f version: 1 flags: 0x0 hdr_len: 24 type_off: 0 type_len: 72 str_off: 72 str_len: 10 btf_total_size: 106 [1] INT (anon) size=4 bits_offset=0 nr_bits=32 encoding=SIGNED [2] INT (anon) size=4 bits_offset=0 nr_bits=32 encoding=(none) [3] FUNC_PROTO (anon) return=0 args=(1 a, 2 b) [4] FUNC func type_id=3 Invalid func linkage BTF libbpf test[1] (test_btf_haskv.o): libbpf: load bpf program failed: Invalid argument libbpf: -- BEGIN DUMP LOG --- libbpf: Validating test_long_fname_2() func#1... Arg#0 type PTR in test_long_fname_2() is not supported yet. processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 libbpf: -- END LOG -- libbpf: failed to load program 'dummy_tracepoint' libbpf: failed to load object 'test_btf_haskv.o' do_test_file:4201:FAIL bpf_object__load: -4007 BTF libbpf test[2] (test_btf_newkv.o): libbpf: load bpf program failed: Invalid argument libbpf: -- BEGIN DUMP LOG --- libbpf: Validating test_long_fname_2() func#1... Arg#0 type PTR in test_long_fname_2() is not supported yet. processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 libbpf: -- END LOG -- libbpf: failed to load program 'dummy_tracepoint' libbpf: failed to load object 'test_btf_newkv.o' do_test_file:4201:FAIL bpf_object__load: -4007 BTF libbpf test[3] (test_btf_nokv.o): libbpf: load bpf program failed: Invalid argument libbpf: -- BEGIN DUMP LOG --- libbpf: Validating test_long_fname_2() func#1... Arg#0 type PTR in test_long_fname_2() is not supported yet. processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 libbpf: -- END LOG -- libbpf: failed to load program 'dummy_tracepoint' libbpf: failed to load object 'test_btf_nokv.o' do_test_file:4201:FAIL bpf_object__load: -4007 Fixes: 51c39bb1d5d1 ("bpf: Introduce function-by-function verification") Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200422003753.124921-1-sdf@google.com
2020-04-24	tools/runqslower: Ensure own vmlinux.h is picked up first	Andrii Nakryiko	1	-1/+1
	Reorder include paths to ensure that runqslower sources are picking up vmlinux.h, generated by runqslower's own Makefile. When runqslower is built from selftests/bpf, due to current -I$(BPF_INCLUDE) -I$(OUTPUT) ordering, it might pick up not-yet-complete vmlinux.h, generated by selftests Makefile, which could lead to compilation errors like [0]. So ensure that -I$(OUTPUT) goes first and rely on runqslower's Makefile own dependency chain to ensure vmlinux.h is properly completed before source code relying on it is compiled. [0] https://travis-ci.org/github/libbpf/libbpf/jobs/677905925 Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200422012407.176303-1-andriin@fb.com
2020-04-24	bpf: Make bpf_link_fops static	Zou Wei	1	-1/+1
	Fix the following sparse warning: kernel/bpf/syscall.c:2289:30: warning: symbol 'bpf_link_fops' was not declared. Should it be static? Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/1587609160-117806-1-git-send-email-zou_wei@huawei.com
2020-04-24	bpftool: Respect the -d option in struct_ops cmd	Martin KaFai Lau	1	-1/+7
	In the prog cmd, the "-d" option turns on the verifier log. This is missed in the "struct_ops" cmd and this patch fixes it. Fixes: 65c93628599d ("bpftool: Add struct_ops support") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200424182911.1259355-1-kafai@fb.com
2020-04-24	selftests/bpf: Add test for freplace program with expected_attach_type	Toke Høiland-Jørgensen	3	-18/+58
	This adds a new selftest that tests the ability to attach an freplace program to a program type that relies on the expected_attach_type of the target program to pass verification. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/158773526831.293902.16011743438619684815.stgit@toke.dk
2020-04-24	bpf: Propagate expected_attach_type when verifying freplace programs	Toke Høiland-Jørgensen	1	-0/+8
	For some program types, the verifier relies on the expected_attach_type of the program being verified in the verification process. However, for freplace programs, the attach type was not propagated along with the verifier ops, so the expected_attach_type would always be zero for freplace programs. This in turn caused the verifier to sometimes make the wrong call for freplace programs. For all existing uses of expected_attach_type for this purpose, the result of this was only false negatives (i.e., freplace functions would be rejected by the verifier even though they were valid programs for the target they were replacing). However, should a false positive be introduced, this can lead to out-of-bounds accesses and/or crashes. The fix introduced in this patch is to propagate the expected_attach_type to the freplace program during verification, and reset it after that is done. Fixes: be8704ff07d2 ("bpf: Introduce dynamic program extensions") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/158773526726.293902.13257293296560360508.stgit@toke.dk
2020-04-24	bpf: Fix leak in LINK_UPDATE and enforce empty old_prog_fd	Andrii Nakryiko	1	-2/+9
	Fix bug of not putting bpf_link in LINK_UPDATE command. Also enforce zeroed old_prog_fd if no BPF_F_REPLACE flag is specified. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200424052045.4002963-1-andriin@fb.com
2020-04-24	bpf, x86_32: Fix logic error in BPF_LDX zero-extension	Wang YanQing	1	-1/+1
	When verifier_zext is true, we don't need to emit code for zero-extension. Fixes: 836256bf5f37 ("x32: bpf: eliminate zero extension code-gen") Signed-off-by: Wang YanQing <udknight@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200423050637.GA4029@udknight
2020-04-24	bpf, x86_32: Fix clobbering of dst for BPF_JSET	Luke Nelson	1	-4/+18
	The current JIT clobbers the destination register for BPF_JSET BPF_X and BPF_K by using "and" and "or" instructions. This is fine when the destination register is a temporary loaded from a register stored on the stack but not otherwise. This patch fixes the problem (for both BPF_K and BPF_X) by always loading the destination register into temporaries since BPF_JSET should not modify the destination register. This bug may not be currently triggerable as BPF_REG_AX is the only register not stored on the stack and the verifier uses it in a limited way. Fixes: 03f5781be2c7b ("bpf, x86_32: add eBPF JIT compiler for ia32") Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Luke Nelson <luke.r.nels@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Wang YanQing <udknight@gmail.com> Link: https://lore.kernel.org/bpf/20200422173630.8351-2-luke.r.nels@gmail.com
2020-04-24	bpf, x86_32: Fix incorrect encoding in BPF_LDX zero-extension	Luke Nelson	1	-1/+3
	The current JIT uses the following sequence to zero-extend into the upper 32 bits of the destination register for BPF_LDX BPF_{B,H,W}, when the destination register is not on the stack: EMIT3(0xC7, add_1reg(0xC0, dst_hi), 0); The problem is that C7 /0 encodes a MOV instruction that requires a 4-byte immediate; the current code emits only 1 byte of the immediate. This means that the first 3 bytes of the next instruction will be treated as the rest of the immediate, breaking the stream of instructions. This patch fixes the problem by instead emitting "xor dst_hi,dst_hi" to clear the upper 32 bits. This fixes the problem and is more efficient than using MOV to load a zero immediate. This bug may not be currently triggerable as BPF_REG_AX is the only register not stored on the stack and the verifier uses it in a limited way, and the verifier implements a zero-extension optimization. But the JIT should avoid emitting incorrect encodings regardless. Fixes: 03f5781be2c7b ("bpf, x86_32: add eBPF JIT compiler for ia32") Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Luke Nelson <luke.r.nels@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: H. Peter Anvin (Intel) <hpa@zytor.com> Acked-by: Wang YanQing <udknight@gmail.com> Link: https://lore.kernel.org/bpf/20200422173630.8351-1-luke.r.nels@gmail.com
2020-04-24	bpf: Fix reStructuredText markup	Jakub Wilk	2	-2/+2
	The patch fixes: $ scripts/bpf_helpers_doc.py > bpf-helpers.rst $ rst2man bpf-helpers.rst > bpf-helpers.7 bpf-helpers.rst:1105: (WARNING/2) Inline strong start-string without end-string. Signed-off-by: Jakub Wilk <jwilk@jwilk.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200422082324.2030-1-jwilk@jwilk.net
2020-04-24	net: systemport: suppress warnings on failed Rx SKB allocations	Doug Berger	1	-1/+2
	The driver is designed to drop Rx packets and reclaim the buffers when an allocation fails, and the network interface needs to safely handle this packet loss. Therefore, an allocation failure of Rx SKBs is relatively benign. However, the output of the warning message occurs with a high scheduling priority that can cause excessive jitter/latency for other high priority processing. This commit suppresses the warning messages to prevent scheduling problems while retaining the failure count in the statistics of the network interface. Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24	net: bcmgenet: suppress warnings on failed Rx SKB allocations	Doug Berger	1	-1/+2
	The driver is designed to drop Rx packets and reclaim the buffers when an allocation fails, and the network interface needs to safely handle this packet loss. Therefore, an allocation failure of Rx SKBs is relatively benign. However, the output of the warning message occurs with a high scheduling priority that can cause excessive jitter/latency for other high priority processing. This commit suppresses the warning messages to prevent scheduling problems while retaining the failure count in the statistics of the network interface. Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24	macsec: avoid to set wrong mtu	Taehee Yoo	1	-4/+8
	When a macsec interface is created, the mtu is calculated with the lower interface's mtu value. If the mtu of lower interface is lower than the length, which is needed by macsec interface, macsec's mtu value will be overflowed. So, if the lower interface's mtu is too low, macsec interface's mtu should be set to 0. Test commands: ip link add dummy0 mtu 10 type dummy ip link add macsec0 link dummy0 type macsec ip link show macsec0 Before: 11: macsec0@dummy0: <BROADCAST,MULTICAST,M-DOWN> mtu 4294967274 After: 11: macsec0@dummy0: <BROADCAST,MULTICAST,M-DOWN> mtu 0 Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-24	dt-bindings: phy: qcom-qusb2: Fix defaults	Douglas Anderson	1	-3/+3
	The defaults listed in the bindings don't match what the code is actually doing. Presumably existing users care more about keeping existing behavior the same, so change the bindings to match the code in Linux. The "qcom,preemphasis-level" default has been wrong for quite a long time (May 2018). The other two were recently added. As some evidence that these values are wrong, this is from the Linux driver: - qcom,preemphasis-level: sets "PORT_TUNE1", lower 2 bits. Driver programs PORT_TUNE1 to 0x30 by default and (0x30 & 0x3) = 0. - qcom,bias-ctrl-value: sets "PLL_BIAS_CONTROL_2", lower 6 bits. Driver programs PLL_BIAS_CONTROL_2 to 0x20 by default and (0x20 & 0x3f) = 0x20 = 32. - qcom,hsdisc-trim-value: sets "PORT_TUNE2", lower 2 bits. Driver programs PORT_TUNE2 to 0x29 by default and (0x29 & 0x3) = 1. Fixes: 1e6f134eb67a ("dt-bindings: phy: qcom-qusb2: Add support for overriding Phy tuning parameters") Fixes: a8b70ccf10e3 ("dt-bindings: phy-qcom-usb2: Add support to override tuning values") Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Rob Herring <robh@kernel.org>
2020-04-24	dt-bindings: Fix erroneous 'additionalProperties'	Rob Herring	6	-7/+17
	There's several cases of json-schema 'additionalProperties' at the wrong indentation level which has the effect of making them DT properties. This is harmless, but let's fix them so a meta-schema check for this can be added. In all the cases, either the 'additionalProperties' was extra or doesn't work because there's a $ref to more properties. In the latter case, we can use 'unevaluatedProperties' instead. Reported-by: Iskren Chernev <iskren.chernev@gmail.com> Cc: Lee Jones <lee.jones@linaro.org> Cc: Saravanan Sekar <sravanhome@gmail.com> Cc: Liam Girdwood <lgirdwood@gmail.com> Acked-by: Mark Brown <broonie@kernel.org> Signed-off-by: Rob Herring <robh@kernel.org>
2020-04-24	proc: Put thread_pid in release_task not proc_flush_pid	Eric W. Biederman	2	-1/+1
	Oleg pointed out that in the unlikely event the kernel is compiled with CONFIG_PROC_FS unset that release_task will now leak the pid. Move the put_pid out of proc_flush_pid into release_task to fix this and to guarantee I don't make that mistake again. When possible it makes sense to keep get and put in the same function so it can easily been seen how they pair up. Fixes: 7bc3e6e55acf ("proc: Use a list of inodes to flush from proc") Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-04-24	mm: check that mm is still valid in madvise()	Linus Torvalds	1	-0/+18
	IORING_OP_MADVISE can end up basically doing mprotect() on the VM of another process, which means that it can race with our crazy core dump handling which accesses the VM state without holding the mmap_sem (because it incorrectly thinks that it is the final user). This is clearly a core dumping problem, but we've never fixed it the right way, and instead have the notion of "check that the mm is still ok" using mmget_still_valid() after getting the mmap_sem for writing in any situation where we're not the original VM thread. See commit 04f5866e41fb ("coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping") for more background on this whole mmget_still_valid() thing. You might want to have a barf bag handy when you do. We're discussing just fixing this properly in the only remaining core dumping routines. But even if we do that, let's make do_madvise() do the right thing, and then when we fix core dumping, we can remove all these mmget_still_valid() checks. Reported-and-tested-by: Jann Horn <jannh@google.com> Fixes: c1ca757bd6f4 ("io_uring: add IORING_OP_MADVISE") Acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-24	afs: Make record checking use TASK_UNINTERRUPTIBLE when appropriate	David Howells	4	-12/+11
	When an operation is meant to be done uninterruptibly (such as FS.StoreData), we should not be allowing volume and server record checking to be interrupted. Fixes: d2ddc776a458 ("afs: Overhaul volume and server record caching and fileserver rotation") Signed-off-by: David Howells <dhowells@redhat.com>
2020-04-24	afs: Fix to actually set AFS_SERVER_FL_HAVE_EPOCH	David Howells	1	-1/+1
	AFS keeps track of the epoch value from the rxrpc protocol to note (a) when a fileserver appears to have restarted and (b) when different endpoints of a fileserver do not appear to be associated with the same fileserver (ie. all probes back from a fileserver from all of its interfaces should carry the same epoch). However, the AFS_SERVER_FL_HAVE_EPOCH flag that indicates that we've received the server's epoch is never set, though it is used. Fix this to set the flag when we first receive an epoch value from a probe sent to the filesystem client from the fileserver. Fixes: 3bf0fb6f33dd ("afs: Probe multiple fileservers simultaneously") Signed-off-by: David Howells <dhowells@redhat.com>
2020-04-24	afs: Remove some unused bits	David Howells	3	-8/+3
	Remove three bits: (1) afs_server::no_epoch is neither set nor used. (2) afs_server::have_result is set and a wakeup is applied to it, but nothing looks at it or waits on it. (3) afs_vl_dump_edestaddrreq() prints afs_addr_list::probed, but nothing sets it for VL servers. Signed-off-by: David Howells <dhowells@redhat.com>
2020-04-24	dt-bindings: Fix command line length limit calling dt-mk-schema	Rob Herring	1	-8/+10
	As the number of schemas has increased, we're starting to hit the error "execvp: /bin/sh: Argument list too long". This is due to passing all the schema files on the command line to dt-mk-schema. It currently is only with out of tree builds and is intermittent depending on the file path lengths. Commit 2ba06cd8565b ("kbuild: Always validate DT binding examples") made hitting this proplem more likely since the example validation now always gets the full list of schemas. Fix this by passing the schema file list in a pipe and using xargs. We end up doing the find twice, but the time is insignificant compared to the dt-mk-schema time. Reported-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Rob Herring <robh@kernel.org>