aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/export-to-sqlite.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2020-05-19sched/fair: Optimize enqueue_task_fair()Vincent Guittot1-20/+19
enqueue_task_fair jumps to enqueue_throttle label when cfs_rq_of(se) is throttled which means that se can't be NULL in such case and we can move the label after the if (!se) statement. Futhermore, the latter can be removed because se is always NULL when reaching this point. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Phil Auld <pauld@redhat.com> Link: https://lkml.kernel.org/r/20200513135502.4672-1-vincent.guittot@linaro.org
2020-05-19sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq listVincent Guittot1-12/+30
Although not exactly identical, unthrottle_cfs_rq() and enqueue_task_fair() are quite close and follow the same sequence for enqueuing an entity in the cfs hierarchy. Modify unthrottle_cfs_rq() to use the same pattern as enqueue_task_fair(). This fixes a problem already faced with the latter and add an optimization in the last for_each_sched_entity loop. Fixes: fe61468b2cb (sched/fair: Fix enqueue_task_fair warning) Reported-by Tao Zhou <zohooouoto@zoho.com.cn> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Phil Auld <pauld@redhat.com> Reviewed-by: Ben Segall <bsegall@google.com> Link: https://lkml.kernel.org/r/20200513135528.4742-1-vincent.guittot@linaro.org
2020-05-19sched/debug: Fix requested task uclamp values shown in procfsPavankumar Kondeti1-2/+2
The intention of commit 96e74ebf8d59 ("sched/debug: Add task uclamp values to SCHED_DEBUG procfs") was to print requested and effective task uclamp values. The requested values printed are read from p->uclamp, which holds the last effective values. Fix this by printing the values from p->uclamp_req. Fixes: 96e74ebf8d59 ("sched/debug: Add task uclamp values to SCHED_DEBUG procfs") Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/1589115401-26391-1-git-send-email-pkondeti@codeaurora.org
2020-05-19sched/fair: Fix enqueue_task_fair() warning some morePhil Auld1-0/+7
sched/fair: Fix enqueue_task_fair warning some more The recent patch, fe61468b2cb (sched/fair: Fix enqueue_task_fair warning) did not fully resolve the issues with the rq->tmp_alone_branch != &rq->leaf_cfs_rq_list warning in enqueue_task_fair. There is a case where the first for_each_sched_entity loop exits due to on_rq, having incompletely updated the list. In this case the second for_each_sched_entity loop can further modify se. The later code to fix up the list management fails to do what is needed because se does not point to the sched_entity which broke out of the first loop. The list is not fixed up because the throttled parent was already added back to the list by a task enqueue in a parallel child hierarchy. Address this by calling list_add_leaf_cfs_rq if there are throttled parents while doing the second for_each_sched_entity loop. Fixes: fe61468b2cb ("sched/fair: Fix enqueue_task_fair warning") Suggested-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Phil Auld <pauld@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20200512135222.GC2201@lorien.usersys.redhat.com
2020-05-17Linux 5.7-rc6Linus Torvalds1-1/+1
2020-05-17exec: Move would_dump into flush_old_execEric W. Biederman1-2/+2
I goofed when I added mm->user_ns support to would_dump. I missed the fact that in the case of binfmt_loader, binfmt_em86, binfmt_misc, and binfmt_script bprm->file is reassigned. Which made the move of would_dump from setup_new_exec to __do_execve_file before exec_binprm incorrect as it can result in would_dump running on the script instead of the interpreter of the script. The net result is that the code stopped making unreadable interpreters undumpable. Which allows them to be ptraced and written to disk without special permissions. Oops. The move was necessary because the call in set_new_exec was after bprm->mm was no longer valid. To correct this mistake move the misplaced would_dump from __do_execve_file into flos_old_exec, before exec_mmap is called. I tested and confirmed that without this fix I can attach with gdb to a script with an unreadable interpreter, and with this fix I can not. Cc: stable@vger.kernel.org Fixes: f84df2a6f268 ("exec: Ensure mm->user_ns contains the execed files") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-15KVM: x86: Fix off-by-one error in kvm_vcpu_ioctl_x86_setup_mceJim Mattson1-1/+1
Bank_num is a one-based count of banks, not a zero-based index. It overflows the allocated space only when strictly greater than KVM_MAX_MCE_BANKS. Fixes: a9e38c3e01ad ("KVM: x86: Catch potential overrun in MCE setup") Signed-off-by: Jue Wang <juew@google.com> Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Peter Shier <pshier@google.com> Message-Id: <20200511225616.19557-1-jmattson@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-15selftests: mptcp: pm: rm the right tmp fileMatthieu Baerts1-1/+1
"$err" is a variable pointing to a temp file. "$out" is not: only used as a local variable in "check()" and representing the output of a command line. Fixes: eedbc685321b (selftests: add PM netlink functional tests) Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15dpaa2-eth: properly handle buffer size restrictionsIoana Ciornei2-12/+18
Depending on the WRIOP version, the buffer size on the RX path must by a multiple of 64 or 256. Handle this restriction properly by aligning down the buffer size to the necessary value. Also, use the new buffer size dynamically computed instead of the compile time one. Fixes: 27c874867c4e ("dpaa2-eth: Use a single page per Rx buffer") Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15bpf: Restrict bpf_trace_printk()'s %s usage and add %pks, %pus specifierDaniel Borkmann3-32/+88
Usage of plain %s conversion specifier in bpf_trace_printk() suffers from the very same issue as bpf_probe_read{,str}() helpers, that is, it is broken on archs with overlapping address ranges. While the helpers have been addressed through work in 6ae08ae3dea2 ("bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers"), we need an option for bpf_trace_printk() as well to fix it. Similarly as with the helpers, force users to make an explicit choice by adding %pks and %pus specifier to bpf_trace_printk() which will then pick the corresponding strncpy_from_unsafe*() variant to perform the access under KERNEL_DS or USER_DS. The %pk* (kernel specifier) and %pu* (user specifier) can later also be extended for other objects aside strings that are probed and printed under tracing, and reused out of other facilities like bpf_seq_printf() or BTF based type printing. Existing behavior of %s for current users is still kept working for archs where it is not broken and therefore gated through CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE. For archs not having this property we fall-back to pick probing under KERNEL_DS as a sensible default. Fixes: 8d3b7dce8622 ("bpf: add support for %s specifier to bpf_trace_printk()") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Link: https://lore.kernel.org/bpf/20200515101118.6508-4-daniel@iogearbox.net
2020-05-15bpf: Add bpf_probe_read_{user, kernel}_str() to do_refine_retval_rangeDaniel Borkmann1-1/+3
Given bpf_probe_read{,str}() BPF helpers are now only available under CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE, we need to add the drop-in replacements of bpf_probe_read_{kernel,user}_str() to do_refine_retval_range() as well to avoid hitting the same issue as in 849fa50662fbc ("bpf/verifier: refine retval R0 state for bpf_get_stack helper"). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200515101118.6508-3-daniel@iogearbox.net
2020-05-15bpf: Restrict bpf_probe_read{, str}() only to archs where they workDaniel Borkmann5-2/+10
Given the legacy bpf_probe_read{,str}() BPF helpers are broken on archs with overlapping address ranges, we should really take the next step to disable them from BPF use there. To generally fix the situation, we've recently added new helper variants bpf_probe_read_{user,kernel}() and bpf_probe_read_{user,kernel}_str(). For details on them, see 6ae08ae3dea2 ("bpf: Add probe_read_{user, kernel} and probe_read_{user,kernel}_str helpers"). Given bpf_probe_read{,str}() have been around for ~5 years by now, there are plenty of users at least on x86 still relying on them today, so we cannot remove them entirely w/o breaking the BPF tracing ecosystem. However, their use should be restricted to archs with non-overlapping address ranges where they are working in their current form. Therefore, move this behind a CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE and have x86, arm64, arm select it (other archs supporting it can follow-up on it as well). For the remaining archs, they can workaround easily by relying on the feature probe from bpftool which spills out defines that can be used out of BPF C code to implement the drop-in replacement for old/new kernels via: bpftool feature probe macro Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/bpf/20200515101118.6508-2-daniel@iogearbox.net
2020-05-15USB: gadget: fix illegal array access in binding with UDCKyungtae Kim1-0/+3
FuzzUSB (a variant of syzkaller) found an illegal array access using an incorrect index while binding a gadget with UDC. Reference: https://www.spinics.net/lists/linux-usb/msg194331.html This bug occurs when a size variable used for a buffer is misused to access its strcpy-ed buffer. Given a buffer along with its size variable (taken from user input), from which, a new buffer is created using kstrdup(). Due to the original buffer containing 0 value in the middle, the size of the kstrdup-ed buffer becomes smaller than that of the original. So accessing the kstrdup-ed buffer with the same size variable triggers memory access violation. The fix makes sure no zero value in the buffer, by comparing the strlen() of the orignal buffer with the size variable, so that the access to the kstrdup-ed buffer is safe. BUG: KASAN: slab-out-of-bounds in gadget_dev_desc_UDC_store+0x1ba/0x200 drivers/usb/gadget/configfs.c:266 Read of size 1 at addr ffff88806a55dd7e by task syz-executor.0/17208 CPU: 2 PID: 17208 Comm: syz-executor.0 Not tainted 5.6.8 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xce/0x128 lib/dump_stack.c:118 print_address_description.constprop.4+0x21/0x3c0 mm/kasan/report.c:374 __kasan_report+0x131/0x1b0 mm/kasan/report.c:506 kasan_report+0x12/0x20 mm/kasan/common.c:641 __asan_report_load1_noabort+0x14/0x20 mm/kasan/generic_report.c:132 gadget_dev_desc_UDC_store+0x1ba/0x200 drivers/usb/gadget/configfs.c:266 flush_write_buffer fs/configfs/file.c:251 [inline] configfs_write_file+0x2f1/0x4c0 fs/configfs/file.c:283 __vfs_write+0x85/0x110 fs/read_write.c:494 vfs_write+0x1cd/0x510 fs/read_write.c:558 ksys_write+0x18a/0x220 fs/read_write.c:611 __do_sys_write fs/read_write.c:623 [inline] __se_sys_write fs/read_write.c:620 [inline] __x64_sys_write+0x73/0xb0 fs/read_write.c:620 do_syscall_64+0x9e/0x510 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: Kyungtae Kim <kt0755@gmail.com> Reported-and-tested-by: Kyungtae Kim <kt0755@gmail.com> Cc: Felipe Balbi <balbi@kernel.org> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200510054326.GA19198@pizza01 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-15usb: core: hub: limit HUB_QUIRK_DISABLE_AUTOSUSPEND to USB5534BEugeniu Rosca1-1/+5
On Tue, May 12, 2020 at 09:36:07PM +0800, Kai-Heng Feng wrote [1]: > This patch prevents my Raven Ridge xHCI from getting runtime suspend. The problem described in v5.6 commit 1208f9e1d758c9 ("USB: hub: Fix the broken detection of USB3 device in SMSC hub") applies solely to the USB5534B hub [2] present on the Kingfisher Infotainment Carrier Board, manufactured by Shimafuji Electric Inc [3]. Despite that, the aforementioned commit applied the quirk to _all_ hubs carrying vendor ID 0x424 (i.e. SMSC), of which there are more [4] than initially expected. Consequently, the quirk is now enabled on platforms carrying SMSC/Microchip hub models which potentially don't exhibit the original issue. To avoid reports like [1], further limit the quirk's scope to USB5534B [2], by employing both Vendor and Product ID checks. Tested on H3ULCB + Kingfisher rev. M05. [1] https://lore.kernel.org/linux-renesas-soc/73933975-6F0E-40F5-9584-D2B8F615C0F3@canonical.com/ [2] https://www.microchip.com/wwwproducts/en/USB5534B [3] http://www.shimafuji.co.jp/wp/wp-content/uploads/2018/08/SBEV-RCAR-KF-M06Board_HWSpecificationEN_Rev130.pdf [4] https://devicehunt.com/search/type/usb/vendor/0424/device/any Fixes: 1208f9e1d758c9 ("USB: hub: Fix the broken detection of USB3 device in SMSC hub") Cc: stable@vger.kernel.org # v4.14+ Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Hardik Gajjar <hgajjar@de.adit-jv.com> Cc: linux-renesas-soc@vger.kernel.org Cc: linux-usb@vger.kernel.org Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com> Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Link: https://lore.kernel.org/r/20200514220246.13290-1-erosca@de.adit-jv.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-15x86: Fix early boot crash on gcc-10, third tryBorislav Petkov5-1/+23
... or the odyssey of trying to disable the stack protector for the function which generates the stack canary value. The whole story started with Sergei reporting a boot crash with a kernel built with gcc-10: Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139 Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013 Call Trace: dump_stack panic ? start_secondary __stack_chk_fail start_secondary secondary_startup_64 -—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary This happens because gcc-10 tail-call optimizes the last function call in start_secondary() - cpu_startup_entry() - and thus emits a stack canary check which fails because the canary value changes after the boot_init_stack_canary() call. To fix that, the initial attempt was to mark the one function which generates the stack canary with: __attribute__((optimize("-fno-stack-protector"))) ... start_secondary(void *unused) however, using the optimize attribute doesn't work cumulatively as the attribute does not add to but rather replaces previously supplied optimization options - roughly all -fxxx options. The key one among them being -fno-omit-frame-pointer and thus leading to not present frame pointer - frame pointer which the kernel needs. The next attempt to prevent compilers from tail-call optimizing the last function call cpu_startup_entry(), shy of carving out start_secondary() into a separate compilation unit and building it with -fno-stack-protector, was to add an empty asm(""). This current solution was short and sweet, and reportedly, is supported by both compilers but we didn't get very far this time: future (LTO?) optimization passes could potentially eliminate this, which leads us to the third attempt: having an actual memory barrier there which the compiler cannot ignore or move around etc. That should hold for a long time, but hey we said that about the other two solutions too so... Reported-by: Sergei Trofimovich <slyfox@gentoo.org> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Kalle Valo <kvalo@codeaurora.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200314164451.346497-1-slyfox@gentoo.org
2020-05-15ARM: dts: iwg20d-q7-dbcm-ca: Remove unneeded properties in hdmi@39Ricardo Cañuelo1-2/+0
Remove the adi,input-style and adi,input-justification properties of hdmi@39 to make it compliant with the "adi,adv7511w" DT binding. Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Link: https://lore.kernel.org/r/20200511110611.3142-6-ricardo.canuelo@collabora.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2020-05-15ARM: dts: renesas: Make hdmi encoder nodes compliant with DT bindingsRicardo Cañuelo9-24/+4
Small fixes to make these DTs compliant with the adi,adv7511w and adi,adv7513 bindings: r8a7745-iwg22d-sodimm-dbhd-ca.dts r8a7790-lager.dts r8a7790-stout.dts r8a7791-koelsch.dts r8a7791-porter.dts r8a7792-blanche.dts r8a7793-gose.dts r8a7794-silk.dts: Remove the adi,input-style and adi,input-justification properties. r8a7792-wheat.dts: Reorder the I2C slave addresses of hdmi@3d and hdmi@39 and remove the adi,input-style and adi,input-justification properties. Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Link: https://lore.kernel.org/r/20200511110611.3142-3-ricardo.canuelo@collabora.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2020-05-15arm64: dts: renesas: Make hdmi encoder nodes compliant with DT bindingsRicardo Cañuelo6-14/+2
Small fixes to make these DTs compliant with the adi,adv7511w binding. r8a77970-eagle.dts, r8a77970-v3msk.dts, r8a77980-condor.dts, r8a77980-v3hsk.dts, r8a77990-ebisu.dts: Remove the adi,input-style and adi,input-justification properties. r8a77995-draak.dts: Reorder the I2C slave addresses of the hdmi-encoder@39 node and remove the adi,input-style and adi,input-justification properties. Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Link: https://lore.kernel.org/r/20200511110611.3142-2-ricardo.canuelo@collabora.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2020-05-15x86/unwind/orc: Fix error handling in __unwind_start()Josh Poimboeuf1-7/+9
The unwind_state 'error' field is used to inform the reliable unwinding code that the stack trace can't be trusted. Set this field for all errors in __unwind_start(). Also, move the zeroing out of the unwind_state struct to before the ORC table initialization check, to prevent the caller from reading uninitialized data if the ORC table is corrupted. Fixes: af085d9084b4 ("stacktrace/x86: add function for detecting reliable stack traces") Fixes: d3a09104018c ("x86/unwinder/orc: Dont bail on stack overflow") Fixes: 98d0c8ebf77e ("x86/unwind/orc: Prevent unwinding before ORC initialization") Reported-by: Pavel Machek <pavel@denx.de> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/d6ac7215a84ca92b895fdd2e1aa546729417e6e6.1589487277.git.jpoimboe@redhat.com
2020-05-14MAINTAINERS: Mark networking drivers as Maintained.David S. Miller1-1/+1
Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14ipmr: Add lockdep expression to ipmr_for_each_table macroAmol Grover1-3/+4
During the initialization process, ipmr_new_table() is called to create new tables which in turn calls ipmr_get_table() which traverses net->ipv4.mr_tables without holding the writer lock. However, this is safe to do so as no tables exist at this time. Hence add a suitable lockdep expression to silence the following false-positive warning: ============================= WARNING: suspicious RCU usage 5.7.0-rc3-next-20200428-syzkaller #0 Not tainted ----------------------------- net/ipv4/ipmr.c:136 RCU-list traversed in non-reader section!! ipmr_get_table+0x130/0x160 net/ipv4/ipmr.c:136 ipmr_new_table net/ipv4/ipmr.c:403 [inline] ipmr_rules_init net/ipv4/ipmr.c:248 [inline] ipmr_net_init+0x133/0x430 net/ipv4/ipmr.c:3089 Fixes: f0ad0860d01e ("ipv4: ipmr: support multiple tables") Reported-by: syzbot+1519f497f2f9f08183c6@syzkaller.appspotmail.com Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Amol Grover <frextrite@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14ipmr: Fix RCU list debugging warningAmol Grover1-1/+2
ipmr_for_each_table() macro uses list_for_each_entry_rcu() for traversing outside of an RCU read side critical section but under the protection of rtnl_mutex. Hence, add the corresponding lockdep expression to silence the following false-positive warning at boot: [ 4.319347] ============================= [ 4.319349] WARNING: suspicious RCU usage [ 4.319351] 5.5.4-stable #17 Tainted: G E [ 4.319352] ----------------------------- [ 4.319354] net/ipv4/ipmr.c:1757 RCU-list traversed in non-reader section!! Fixes: f0ad0860d01e ("ipv4: ipmr: support multiple tables") Signed-off-by: Amol Grover <frextrite@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14drivers: net: hamradio: Fix suspicious RCU usage warning in bpqether.cMadhuparna Bhowmik1-1/+2
This patch fixes the following warning: ============================= WARNING: suspicious RCU usage 5.7.0-rc5-next-20200514-syzkaller #0 Not tainted ----------------------------- drivers/net/hamradio/bpqether.c:149 RCU-list traversed in non-reader section!! Since rtnl lock is held, pass this cond in list_for_each_entry_rcu(). Reported-by: syzbot+bb82cafc737c002d11ca@syzkaller.appspotmail.com Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14net: phy: broadcom: fix BCM54XX_SHD_SCR3_TRDDAPD value for BCM54810Kevin Lo2-2/+7
Set the correct bit when checking for PHY_BRCM_DIS_TXCRXC_NOENRGY on the BCM54810 PHY. Fixes: 0ececcfc9267 ("net: phy: broadcom: Allow BCM54810 to use bcm54xx_adjust_rxrefclk()") Signed-off-by: Kevin Lo <kevlo@kevlo.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14cifs: fix leaked reference on requeued writeAdam McCoy1-1/+1
Failed async writes that are requeued may not clean up a refcount on the file, which can result in a leaked open. This scenario arises very reliably when using persistent handles and a reconnect occurs while writing. cifs_writev_requeue only releases the reference if the write fails (rc != 0). The server->ops->async_writev operation will take its own reference, so the initial reference can always be released. Signed-off-by: Adam McCoy <adam@forsedomani.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2020-05-14NFSv3: fix rpc receive buffer size for MOUNT callOlga Kornievskaia1-1/+2
Prior to commit e3d3ab64dd66 ("SUNRPC: Use au_rslack when computing reply buffer size"), there was enough slack in the reply buffer to commodate filehandles of size 60bytes. However, the real problem was that the reply buffer size for the MOUNT operation was not correctly calculated. Received buffer size used the filehandle size for NFSv2 (32bytes) which is much smaller than the allowed filehandle size for the v3 mounts. Fix the reply buffer size (decode arguments size) for the MNT command. Fixes: 2c94b8eca1a2 ("SUNRPC: Use au_rslack when computing reply buffer size") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-05-14tcp: fix error recovery in tcp_zerocopy_receive()Eric Dumazet1-3/+4
If user provides wrong virtual address in TCP_ZEROCOPY_RECEIVE operation we want to return -EINVAL error. But depending on zc->recv_skip_hint content, we might return -EIO error if the socket has SOCK_DONE set. Make sure to return -EINVAL in this case. BUG: KMSAN: uninit-value in tcp_zerocopy_receive net/ipv4/tcp.c:1833 [inline] BUG: KMSAN: uninit-value in do_tcp_getsockopt+0x4494/0x6320 net/ipv4/tcp.c:3685 CPU: 1 PID: 625 Comm: syz-executor.0 Not tainted 5.7.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:121 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 tcp_zerocopy_receive net/ipv4/tcp.c:1833 [inline] do_tcp_getsockopt+0x4494/0x6320 net/ipv4/tcp.c:3685 tcp_getsockopt+0xf8/0x1f0 net/ipv4/tcp.c:3728 sock_common_getsockopt+0x13f/0x180 net/core/sock.c:3131 __sys_getsockopt+0x533/0x7b0 net/socket.c:2177 __do_sys_getsockopt net/socket.c:2192 [inline] __se_sys_getsockopt+0xe1/0x100 net/socket.c:2189 __x64_sys_getsockopt+0x62/0x80 net/socket.c:2189 do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:297 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45c829 Code: 0d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f1deeb72c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000037 RAX: ffffffffffffffda RBX: 00000000004e01e0 RCX: 000000000045c829 RDX: 0000000000000023 RSI: 0000000000000006 RDI: 0000000000000009 RBP: 000000000078bf00 R08: 0000000020000200 R09: 0000000000000000 R10: 00000000200001c0 R11: 0000000000000246 R12: 00000000ffffffff R13: 00000000000001d8 R14: 00000000004d3038 R15: 00007f1deeb736d4 Local variable ----zc@do_tcp_getsockopt created at: do_tcp_getsockopt+0x1a74/0x6320 net/ipv4/tcp.c:3670 do_tcp_getsockopt+0x1a74/0x6320 net/ipv4/tcp.c:3670 Fixes: 05255b823a61 ("tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14char: ipmi: convert to use i2c_new_client_device()Wolfram Sang1-2/+2
Move away from the deprecated API. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Message-Id: <20200326210958.13051-2-wsa+renesas@sang-engineering.com> Signed-off-by: Corey Minyard <cminyard@mvista.com>
2020-05-14SUNRPC: 'Directory with parent 'rpc_clnt' already present!'J. Bruce Fields1-1/+1
Each rpc_client has a cl_clid which is allocated from a global ida, and a debugfs directory which is named after cl_clid. We're releasing the cl_clid before we free the debugfs directory named after it. As soon as the cl_clid is released, that value is available for another newly created client. That leaves a window where another client may attempt to create a new debugfs directory with the same name as the not-yet-deleted debugfs directory from the dying client. Symptoms are log messages like Directory 4 with parent 'rpc_clnt' already present! Fixes: 7c4310ff5642 "SUNRPC: defer slow parts of rpc_free_client() to a workqueue." Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2020-05-14MAINTAINERS: Add Jakub to networking drivers.David S. Miller1-0/+1
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14MAINTAINERS: another add of Karsten Graul for S390 networkingUrsula Braun1-0/+1
Complete adding of Karsten as maintainer for all S390 networking parts in the kernel. Cc: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14drivers: ipa: fix typos for ipa_smp2p structure docWang Wenhu1-1/+1
Remove the duplicate "mutex", and change "Motex" to "Mutex". Also I recommend it's easier for understanding to make the "ready-interrupt" a bundle for it is a parallel description as "shutdown" which is appended after the slash. Signed-off-by: Wang Wenhu <wenhu.wang@vivo.com> Cc: Alex Elder <elder@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14pppoe: only process PADT targeted at local interfacesGuillaume Nault1-0/+3
We don't want to disconnect a session because of a stray PADT arriving while the interface is in promiscuous mode. Furthermore, multicast and broadcast packets make no sense here, so only PACKET_HOST is accepted. Reported-by: David Balažic <xerces9@gmail.com> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14selftests/bpf: Enforce returning 0 for fentry/fexit programsYonghong Song1-2/+2
There are a few fentry/fexit programs returning non-0. The tests with these programs will break with the previous patch which enfoced return-0 rules. Fix them properly. Fixes: ac065870d928 ("selftests/bpf: Add BPF_PROG, BPF_KPROBE, and BPF_KRETPROBE macros") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200514053207.1298479-1-yhs@fb.com
2020-05-14bpf: Enforce returning 0 for fentry/fexit progsYonghong Song1-0/+17
Currently, tracing/fentry and tracing/fexit prog return values are not enforced. In trampoline codes, the fentry/fexit prog return values are ignored. Let us enforce it to be 0 to avoid confusion and allows potential future extension. This patch also explicitly added return value checking for tracing/raw_tp, tracing/fmod_ret, and freplace programs such that these program return values can be anything. The purpose are two folds: 1. to make it explicit about return value expectations for these programs in verifier. 2. for tracing prog_type, if a future attach type is added, the default is -ENOTSUPP which will enforce to specify return value ranges explicitly. Fixes: fec56f5890d9 ("bpf: Introduce BPF trampoline") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200514053206.1298415-1-yhs@fb.com
2020-05-14net: stmmac: fix num_por initializationVinod Koul1-2/+15
Driver missed initializing num_por which is one of the por values that driver configures to hardware. In order to get these values, add a new structure ethqos_emac_driver_data which holds por and num_por values and populate that in driver probe. Fixes: a7c30e62d4b8 ("net: stmmac: Add driver for Qualcomm ethqos") Reported-by: Rahul Ankushrao Kawadgave <rahulak@qti.qualcomm.com> Signed-off-by: Vinod Koul <vkoul@kernel.org> Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-14security: Fix the default value of secid_to_secctx hookAnders Roxell1-1/+1
security_secid_to_secctx is called by the bpf_lsm hook and a successful return value (i.e 0) implies that the parameter will be consumed by the LSM framework. The current behaviour return success when the pointer isn't initialized when CONFIG_BPF_LSM is enabled, with the default return from kernel/bpf/bpf_lsm.c. This is the internal error: [ 1229.341488][ T2659] usercopy: Kernel memory exposure attempt detected from null address (offset 0, size 280)! [ 1229.374977][ T2659] ------------[ cut here ]------------ [ 1229.376813][ T2659] kernel BUG at mm/usercopy.c:99! [ 1229.378398][ T2659] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 1229.380348][ T2659] Modules linked in: [ 1229.381654][ T2659] CPU: 0 PID: 2659 Comm: systemd-journal Tainted: G B W 5.7.0-rc5-next-20200511-00019-g864e0c6319b8-dirty #13 [ 1229.385429][ T2659] Hardware name: linux,dummy-virt (DT) [ 1229.387143][ T2659] pstate: 80400005 (Nzcv daif +PAN -UAO BTYPE=--) [ 1229.389165][ T2659] pc : usercopy_abort+0xc8/0xcc [ 1229.390705][ T2659] lr : usercopy_abort+0xc8/0xcc [ 1229.392225][ T2659] sp : ffff000064247450 [ 1229.393533][ T2659] x29: ffff000064247460 x28: 0000000000000000 [ 1229.395449][ T2659] x27: 0000000000000118 x26: 0000000000000000 [ 1229.397384][ T2659] x25: ffffa000127049e0 x24: ffffa000127049e0 [ 1229.399306][ T2659] x23: ffffa000127048e0 x22: ffffa000127048a0 [ 1229.401241][ T2659] x21: ffffa00012704b80 x20: ffffa000127049e0 [ 1229.403163][ T2659] x19: ffffa00012704820 x18: 0000000000000000 [ 1229.405094][ T2659] x17: 0000000000000000 x16: 0000000000000000 [ 1229.407008][ T2659] x15: 0000000000000000 x14: 003d090000000000 [ 1229.408942][ T2659] x13: ffff80000d5b25b2 x12: 1fffe0000d5b25b1 [ 1229.410859][ T2659] x11: 1fffe0000d5b25b1 x10: ffff80000d5b25b1 [ 1229.412791][ T2659] x9 : ffffa0001034bee0 x8 : ffff00006ad92d8f [ 1229.414707][ T2659] x7 : 0000000000000000 x6 : ffffa00015eacb20 [ 1229.416642][ T2659] x5 : ffff0000693c8040 x4 : 0000000000000000 [ 1229.418558][ T2659] x3 : ffffa0001034befc x2 : d57a7483a01c6300 [ 1229.420610][ T2659] x1 : 0000000000000000 x0 : 0000000000000059 [ 1229.422526][ T2659] Call trace: [ 1229.423631][ T2659] usercopy_abort+0xc8/0xcc [ 1229.425091][ T2659] __check_object_size+0xdc/0x7d4 [ 1229.426729][ T2659] put_cmsg+0xa30/0xa90 [ 1229.428132][ T2659] unix_dgram_recvmsg+0x80c/0x930 [ 1229.429731][ T2659] sock_recvmsg+0x9c/0xc0 [ 1229.431123][ T2659] ____sys_recvmsg+0x1cc/0x5f8 [ 1229.432663][ T2659] ___sys_recvmsg+0x100/0x160 [ 1229.434151][ T2659] __sys_recvmsg+0x110/0x1a8 [ 1229.435623][ T2659] __arm64_sys_recvmsg+0x58/0x70 [ 1229.437218][ T2659] el0_svc_common.constprop.1+0x29c/0x340 [ 1229.438994][ T2659] do_el0_svc+0xe8/0x108 [ 1229.440587][ T2659] el0_svc+0x74/0x88 [ 1229.441917][ T2659] el0_sync_handler+0xe4/0x8b4 [ 1229.443464][ T2659] el0_sync+0x17c/0x180 [ 1229.444920][ T2659] Code: aa1703e2 aa1603e1 910a8260 97ecc860 (d4210000) [ 1229.447070][ T2659] ---[ end trace 400497d91baeaf51 ]--- [ 1229.448791][ T2659] Kernel panic - not syncing: Fatal exception [ 1229.450692][ T2659] Kernel Offset: disabled [ 1229.452061][ T2659] CPU features: 0x240002,20002004 [ 1229.453647][ T2659] Memory Limit: none [ 1229.455015][ T2659] ---[ end Kernel panic - not syncing: Fatal exception ]--- Rework the so the default return value is -EOPNOTSUPP. There are likely other callbacks such as security_inode_getsecctx() that may have the same problem, and that someone that understand the code better needs to audit them. Thank you Arnd for helping me figure out what went wrong. Fixes: 98e828a0650f ("security: Refactor declaration of LSM hooks") Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: James Morris <jamorris@linux.microsoft.com> Cc: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/bpf/20200512174607.9630-1-anders.roxell@linaro.org
2020-05-14libbpf: Fix register naming in PT_REGS s390 macrosSumanth Korikkar1-2/+2
Fix register naming in PT_REGS s390 macros Fixes: b8ebce86ffe6 ("libbpf: Provide CO-RE variants of PT_REGS macros") Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200513154414.29972-1-sumanthk@linux.ibm.com
2020-05-14bpf: Fix bug in mmap() implementation for BPF array mapAndrii Nakryiko2-1/+14
mmap() subsystem allows user-space application to memory-map region with initial page offset. This wasn't taken into account in initial implementation of BPF array memory-mapping. This would result in wrong pages, not taking into account requested page shift, being memory-mmaped into user-space. This patch fixes this gap and adds a test for such scenario. Fixes: fc9702273e2e ("bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200512235925.3817805-1-andriin@fb.com
2020-05-14samples: bpf: Fix build errorMatteo Croce1-2/+0
GCC 10 is very strict about symbol clash, and lwt_len_hist_user contains a symbol which clashes with libbpf: /usr/bin/ld: samples/bpf/lwt_len_hist_user.o:(.bss+0x0): multiple definition of `bpf_log_buf'; samples/bpf/bpf_load.o:(.bss+0x8c0): first defined here collect2: error: ld returned 1 exit status bpf_log_buf here seems to be a leftover, so removing it. Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200511113234.80722-1-mcroce@redhat.com
2020-05-14kasan: add missing functions declarations to kasan.hAndrey Konovalov1-2/+32
KASAN is currently missing declarations for __asan_report* and __hwasan* functions. This can lead to compiler warnings. Reported-by: Leon Romanovsky <leon@kernel.org> Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Leon Romanovsky <leon@kernel.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Link: http://lkml.kernel.org/r/45b445a76a79208918f0cc44bfabebaea909b54d.1589297433.git.andreyknvl@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-14kasan: consistently disable debugging featuresAndrey Konovalov1-5/+10
KASAN is incompatible with some kernel debugging/tracing features. There's been multiple patches that disable those feature for some of KASAN files one by one. Instead of prolonging that, disable these features for all KASAN files at once. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Leon Romanovsky <leonro@mellanox.com> Link: http://lkml.kernel.org/r/29bd753d5ff5596425905b0b07f51153e2345cc1.1589297433.git.andreyknvl@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-14ipc/util.c: sysvipc_find_ipc() incorrectly updates position indexVasily Averin1-6/+6
Commit 89163f93c6f9 ("ipc/util.c: sysvipc_find_ipc() should increase position index") is causing this bug (seen on 5.6.8): # ipcs -q ------ Message Queues -------- key msqid owner perms used-bytes messages # ipcmk -Q Message queue id: 0 # ipcs -q ------ Message Queues -------- key msqid owner perms used-bytes messages 0x82db8127 0 root 644 0 0 # ipcmk -Q Message queue id: 1 # ipcs -q ------ Message Queues -------- key msqid owner perms used-bytes messages 0x82db8127 0 root 644 0 0 0x76d1fb2a 1 root 644 0 0 # ipcrm -q 0 # ipcs -q ------ Message Queues -------- key msqid owner perms used-bytes messages 0x76d1fb2a 1 root 644 0 0 0x76d1fb2a 1 root 644 0 0 # ipcmk -Q Message queue id: 2 # ipcrm -q 2 # ipcs -q ------ Message Queues -------- key msqid owner perms used-bytes messages 0x76d1fb2a 1 root 644 0 0 0x76d1fb2a 1 root 644 0 0 # ipcmk -Q Message queue id: 3 # ipcrm -q 1 # ipcs -q ------ Message Queues -------- key msqid owner perms used-bytes messages 0x7c982867 3 root 644 0 0 0x7c982867 3 root 644 0 0 0x7c982867 3 root 644 0 0 0x7c982867 3 root 644 0 0 Whenever an IPC item with a low id is deleted, the items with higher ids are duplicated, as if filling a hole. new_pos should jump through hole of unused ids, pos can be updated inside "for" cycle. Fixes: 89163f93c6f9 ("ipc/util.c: sysvipc_find_ipc() should increase position index") Reported-by: Andreas Schwab <schwab@suse.de> Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Waiman Long <longman@redhat.com> Cc: NeilBrown <neilb@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/4921fe9b-9385-a2b4-1dc4-1099be6d2e39@virtuozzo.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-14userfaultfd: fix remap event with MREMAP_DONTUNMAPBrian Geffon1-1/+1
A user is not required to set a new address when using MREMAP_DONTUNMAP as it can be used without MREMAP_FIXED. When doing so the remap event will use new_addr which may not have been set and we didn't propagate it back other then in the return value of remap_to. Because ret is always the new address it's probably more correct to use it rather than new_addr on the remap_event_complete call, and it resolves this bug. Fixes: e346b3813067d4b ("mm/mremap: add MREMAP_DONTUNMAP to mremap()") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Brian Geffon <bgeffon@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Lokesh Gidra <lokeshgidra@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: "Michael S . Tsirkin" <mst@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Sonny Rao <sonnyrao@google.com> Cc: Joel Fernandes <joel@joelfernandes.org> Link: http://lkml.kernel.org/r/20200506172158.218366-1-bgeffon@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-14mm/gup: fix fixup_user_fault() on multiple retriesPeter Xu1-5/+7
This part was overlooked when reworking the gup code on multiple retries. When we get the 2nd+ retry, we'll be with TRIED flag set. Current code will bail out on the 2nd retry because the !TRIED check will fail so the retry logic will be skipped. What's worse is that, it will also return zero which errornously hints the caller that the page is faulted in while it's not. The !TRIED flag check seems to not be needed even before the mutliple retries change because if we get a VM_FAULT_RETRY, it must be the 1st retry, and we should not have TRIED set for that. Fix it by removing the !TRIED check, at the meantime check against fatal signals properly before the page fault so we can still properly respond to the user killing the process during retries. Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Brian Geffon <bgeffon@google.com> Link: http://lkml.kernel.org/r/20200502003523.8204-1-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-14epoll: call final ep_events_available() check under the lockRoman Penyaev1-20/+28
There is a possible race when ep_scan_ready_list() leaves ->rdllist and ->obflist empty for a short period of time although some events are pending. It is quite likely that ep_events_available() observes empty lists and goes to sleep. Since commit 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll") we are conservative in wakeups (there is only one place for wakeup and this is ep_poll_callback()), thus ep_events_available() must always observe correct state of two lists. The easiest and correct way is to do the final check under the lock. This does not impact the performance, since lock is taken anyway for adding a wait entry to the wait queue. The discussion of the problem can be found here: https://lore.kernel.org/linux-fsdevel/a2f22c3c-c25a-4bda-8339-a7bdaf17849e@akamai.com/ In this patch barrierless __set_current_state() is used. This is safe since waitqueue_active() is called under the same lock on wakeup side. Short-circuit for fatal signals (i.e. fatal_signal_pending() check) is moved to the line just before actual events harvesting routine. This is fully compliant to what is said in the comment of the patch where the actual fatal_signal_pending() check was added: c257a340ede0 ("fs, epoll: short circuit fetching events if thread has been killed"). Fixes: 339ddb53d373 ("fs/epoll: remove unnecessary wakeups of nested epoll") Reported-by: Jason Baron <jbaron@akamai.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Roman Penyaev <rpenyaev@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jason Baron <jbaron@akamai.com> Cc: Khazhismel Kumykov <khazhy@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200505145609.1865152-1-rpenyaev@suse.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-14mm, memcg: fix inconsistent oom event behaviorYafang Shao1-0/+2
A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events") changed the behavior of memcg events, which will now consider subtrees in memory.events. But oom_kill event is a special one as it is used in both cgroup1 and cgroup2. In cgroup1, it is displayed in memory.oom_control. The file memory.oom_control is in both root memcg and non root memcg, that is different with memory.event as it only in non-root memcg. That commit is okay for cgroup2, but it is not okay for cgroup1 as it will cause inconsistent behavior between root memcg and non-root memcg. Here's an example on why this behavior is inconsistent in cgroup1. root memcg / memcg foo / memcg bar Suppose there's an oom_kill in memcg bar, then the oon_kill will be root memcg : memory.oom_control(oom_kill) 0 / memcg foo : memory.oom_control(oom_kill) 1 / memcg bar : memory.oom_control(oom_kill) 1 For the non-root memcg, its memory.oom_control(oom_kill) includes its descendants' oom_kill, but for root memcg, it doesn't include its descendants' oom_kill. That means, memory.oom_control(oom_kill) has different meanings in different memcgs. That is inconsistent. Then the user has to know whether the memcg is root or not. If we can't fully support it in cgroup1, for example by adding memory.events.local into cgroup1 as well, then let's don't touch its original behavior. Fixes: 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Chris Down <chris@chrisdown.name> Acked-by: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200502141055.7378-1-laoar.shao@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-14USB: usbfs: fix mmap dma mismatchGreg Kroah-Hartman1-3/+13
In commit 2bef9aed6f0e ("usb: usbfs: correct kernel->user page attribute mismatch") we switched from always calling remap_pfn_range() to call dma_mmap_coherent() to handle issues with systems with non-coherent USB host controller drivers. Unfortunatly, as syzbot quickly told us, not all the world is host controllers with DMA support, so we need to check what host controller we are attempting to talk to before doing this type of allocation. Thanks to Christoph for the quick idea of how to fix this. Fixes: 2bef9aed6f0e ("usb: usbfs: correct kernel->user page attribute mismatch") Cc: Christoph Hellwig <hch@lst.de> Cc: Hillf Danton <hdanton@sina.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jeremy Linton <jeremy.linton@arm.com> Cc: stable <stable@vger.kernel.org> Reported-by: syzbot+353be47c9ce21b68b7ed@syzkaller.appspotmail.com Reviewed-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20200514112711.1858252-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-15csky: Fixup raw_copy_from_user()Al Viro2-29/+28
If raw_copy_from_user(to, from, N) returns K, callers expect the first N - K bytes starting at to to have been replaced with the contents of corresponding area starting at from and the last K bytes of destination *left* *unmodified*. What arch/sky/lib/usercopy.c is doing is broken - it can lead to e.g. data corruption on write(2). raw_copy_to_user() is inaccurate about return value, which is a bug, but consequences are less drastic than for raw_copy_from_user(). And just what are those access_ok() doing in there? I mean, look into linux/uaccess.h; that's where we do that check (as well as zero tail on failure in the callers that need zeroing). AFAICS, all of that shouldn't be hard to fix; something like a patch below might make a useful starting point. I would suggest moving these macros into usercopy.c (they are never used anywhere else) and possibly expanding them there; if you leave them alive, please at least rename __copy_user_zeroing(). Again, it must not zero anything on failed read. Said that, I'm not sure we won't be better off simply turning usercopy.c into usercopy.S - all that is left there is a couple of functions, each consisting only of inline asm. Guo Ren reply: Yes, raw_copy_from_user is wrong, it's no need zeroing code. unsigned long _copy_from_user(void *to, const void __user *from, unsigned long n) { unsigned long res = n; might_fault(); if (likely(access_ok(from, n))) { kasan_check_write(to, n); res = raw_copy_from_user(to, from, n); } if (unlikely(res)) memset(to + (n - res), 0, res); return res; } EXPORT_SYMBOL(_copy_from_user); You are right and access_ok() should be removed. but, how about: do { ... "2: stw %3, (%1, 0) \n" \ + " subi %0, 4 \n" \ "9: stw %4, (%1, 4) \n" \ + " subi %0, 4 \n" \ "10: stw %5, (%1, 8) \n" \ + " subi %0, 4 \n" \ "11: stw %6, (%1, 12) \n" \ + " subi %0, 4 \n" \ " addi %2, 16 \n" \ " addi %1, 16 \n" \ Don't expand __ex_table AI Viro reply: Hey, I've no idea about the instruction scheduling on csky - if that doesn't slow the things down, all the better. It's just that copy_to_user() and friends are on fairly hot codepaths, and in quite a few situations they will dominate the speed of e.g. read(2). So I tried to keep the fast path unchanged. Up to the architecture maintainers, obviously. Which would be you... As for the fixups size increase (__ex_table size is unchanged)... You have each of those macros expanded exactly once. So the size is not a serious argument, IMO - useless complexity would be, if it is, in fact, useless; the size... not really, especially since those extra subi will at least offset it. Again, up to you - asm optimizations of (essentially) memcpy()-style loops are tricky and can depend upon the fairly subtle details of architecture. So even on something I know reasonably well I would resort to direct experiments if I can't pass the buck to architecture maintainers. It *is* worth optimizing - this is where read() from a file that is already in page cache spends most of the time, etc. Guo Ren reply: Thx, after fixup some typo “sub %0, 4”, apply the patch. TODO: - user copy/from codes are still need optimizing. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-05-15csky: Fixup gdbmacros.txt with name sp in thread_structGuo Ren4-9/+9
The gdbmacros.txt use sp in thread_struct, but csky use ksp. This cause bttnobp fail to excute. TODO: - Still couldn't display the contents of stack. Signed-off-by: Guo Ren <guoren@linux.alibaba.com>