aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2022-10-25perf test: Do not fail Intel-PT misc test w/o libpythonNamhyung Kim1-0/+6
The virtual LBR test uses a python script to check the max size of branch stack in the Intel-PT generated LBR. But it didn't check whether python scripting is available (as it's optional). Let's skip the test if the python support is not available. Fixes: f77811a0f62577d2 ("perf test: test_intel_pt.sh: Add 9 tests") Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Ammy Yi <ammy.yi@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20221021181055.60183-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25perf list: Fix PMU name pai_crypto in perf list on s390Thomas Richter1-0/+0
Commit e0b23af82d6f454c ("perf list: Add PMU pai_crypto event description for IBM z16") introduced the "Processor Activity Instrumentation" for cryptographic counters for z16. The PMU device driver exports the counters via sysfs files listed in directory /sys/devices/pai_crypto. To specify an event from that PMU, use 'perf stat -e pai_crypto/XXX/'. However the JSON file mentioned in above commit exports the counter decriptions in file pmu-events/arch/s390/cf_z16/pai.json. Rename this file to pmu-events/arch/s390/cf_z16/pai_crypto.json to make the naming consistent. Now 'perf list' shows the counter names under pai_crypto section: pai_crypto: CRYPTO_ALL [CRYPTO ALL. Unit: pai_crypto] ... Output before was pai: CRYPTO_ALL [CRYPTO ALL. Unit: pai_crypto] ... Fixes: e0b23af82d6f454c ("perf list: Add PMU pai_crypto event description for IBM z16") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20221021082557.2695382-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25perf record: Fix event fd racesIan Rogers1-16/+25
The write call may set errno which is problematic if occurring in a function also setting errno. Save and restore errno around the write call. done_fd may be used after close, clear it as part of the close and check its validity in the signal handler. Suggested-by: <gthelen@google.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anand K Mistry <amistry@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20221024011024.462518-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25perf bpf: Fix build with libbpf 0.7.0 by checking if bpf_program__set_insns() is availableArnaldo Carvalho de Melo5-0/+36
During the transition to libbpf 1.0 some functions that perf used were deprecated and finally removed from libbpf, so bpf_program__set_insns() was introduced for perf to continue to use its bpf loader. But when build with LIBBPF_DYNAMIC=1 we now need to check if that function is available so that perf can build with older libbpf versions, even if the end result is emitting a warning to the user that the use of the perf BPF loader requires a newer libbpf, since bpf_program__set_insns() touches libbpf objects internal state. This affects only 'perf trace' when using bpf C code or pre-compiled bytecode as an event. Noticed on RHEL9, that has libbpf 0.7.0, where bpf_program__set_insns() isn't available. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25perf bpf: Fix build with libbpf 0.7.0 by adding prototype for bpf_load_program()Arnaldo Carvalho de Melo1-0/+5
The bpf_load_program() prototype appeared in tools/lib/bpf/bpf.h as deprecated, but nowadays its completely removed, so add it back for building with the system libbpf when using 'make LIBBPF_DYNAMIC=1'. This is a stop gap hack till we do like tools/bpf does with bpftool, i.e. bootstrap the libbpf build and install it in the perf build directory when not using 'make LIBBPF_DYNAMIC=1'. That has to be done to all libraries in tools/lib/, so tha we can remove -Itools/lib/ from the tools/perf CFLAGS. Noticed when building with LIBBPF_DYNAMIC=1 and libbpf 0.7.0 on RHEL9. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25perf vendor events power10: Fix hv-24x7 metric eventsKajol Jain1-36/+36
Testcase stat_all_metrics.sh fails in powerpc: 90: perf all metrics test : FAILED! The testcase "stat_all_metrics.sh" verifies perf stat result for all the metric events present in perf list. It runs perf metric events with various commands and expects non-empty metric result. Incase of powerpc:hv-24x7 events, some of the event count can be 0 based on system configuration. And if that event used as denominator in divide equation, it can cause divide by 0 error. The current nest_metric.json file creating divide by 0 issue for some of the metric events, which results in failure of the "stat_all_metrics.sh" test case. Most of the metrics events have cycles or an event which expect to have a larger value as denominator, so adding 1 to the denominator of the metric expression as a fix. Result in powerpc box after this patch changes: 90: perf all metrics test : Ok Fixes: a3cbcadfdfc330c2 ("perf vendor events power10: Adds 24x7 nest metric events for power10 platform") Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Link: https://lore.kernel.org/r/20221014140220.122251-1-kjain@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25perf docs: Fix man page build wrt perf-arm-coresight.txtAdrian Hunter1-0/+0
perf build assumes documentation files starting with "perf-" are man pages but perf-arm-coresight.txt is not a man page: asciidoc: ERROR: perf-arm-coresight.txt: line 2: malformed manpage title asciidoc: ERROR: perf-arm-coresight.txt: line 3: name section expected asciidoc: FAILED: perf-arm-coresight.txt: line 3: section title expected make[3]: *** [Makefile:266: perf-arm-coresight.xml] Error 1 make[3]: *** Waiting for unfinished jobs.... make[2]: *** [Makefile.perf:895: man] Error 2 Fix by renaming it. Fixes: dc2e0fb00bb2b24f ("perf test coresight: Add relevant documentation about ARM64 CoreSight testing") Reported-by: Christian Borntraeger <borntraeger@linux.ibm.com> Reported-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Carsten Haitzler <carsten.haitzler@arm.com> Cc: coresight@lists.linaro.org Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/a176a3e1-6ddc-bb63-e41c-15cda8c2d5d2@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25tools headers UAPI: Sync powerpc syscall tables with the kernel sourcesArnaldo Carvalho de Melo1-6/+10
To pick the changes in these csets: e237506238352f3b ("powerpc/32: fix syscall wrappers with 64-bit arguments of unaligned register-pairs") That doesn't cause any changes in the perf tools. As a reminder, this table is used in tools perf to allow features such as: [root@five ~]# perf trace -e set_mempolicy_home_node ^C[root@five ~]# [root@five ~]# perf trace -v -e set_mempolicy_home_node Using CPUID AuthenticAMD-25-21-0 event qualifier tracepoint filter: (common_pid != 253729 && common_pid != 3585) && (id == 450) mmap size 528384B ^C[root@five ~] [root@five ~]# perf trace -v -e set* --max-events 5 Using CPUID AuthenticAMD-25-21-0 event qualifier tracepoint filter: (common_pid != 253734 && common_pid != 3585) && (id == 38 || id == 54 || id == 105 || id == 106 || id == 109 || id == 112 || id == 113 || id == 114 || id == 116 || id == 117 || id == 119 || id == 122 || id == 123 || id == 141 || id == 160 || id == 164 || id == 170 || id == 171 || id == 188 || id == 205 || id == 218 || id == 238 || id == 273 || id == 308 || id == 450) mmap size 528384B 0.000 ( 0.008 ms): bash/253735 setpgid(pid: 253735 (bash), pgid: 253735 (bash)) = 0 6849.011 ( 0.008 ms): bash/16046 setpgid(pid: 253736 (bash), pgid: 253736 (bash)) = 0 6849.080 ( 0.005 ms): bash/253736 setpgid(pid: 253736 (bash), pgid: 253736 (bash)) = 0 7437.718 ( 0.009 ms): gnome-shell/253737 set_robust_list(head: 0x7f34b527e920, len: 24) = 0 13445.986 ( 0.010 ms): bash/16046 setpgid(pid: 253738 (bash), pgid: 253738 (bash)) = 0 [root@five ~]# That is the filter expression attached to the raw_syscalls:sys_{enter,exit} tracepoints. $ find tools/perf/arch/ -name "syscall*tbl" | xargs grep -w set_mempolicy_home_node tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl:450 common set_mempolicy_home_node sys_set_mempolicy_home_node tools/perf/arch/powerpc/entry/syscalls/syscall.tbl:450 nospu set_mempolicy_home_node sys_set_mempolicy_home_node tools/perf/arch/s390/entry/syscalls/syscall.tbl:450 common set_mempolicy_home_node sys_set_mempolicy_home_node sys_set_mempolicy_home_node tools/perf/arch/x86/entry/syscalls/syscall_64.tbl:450 common set_mempolicy_home_node sys_set_mempolicy_home_node $ $ grep -w set_mempolicy_home_node /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c [450] = "set_mempolicy_home_node", $ This addresses these perf build warnings: Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h' diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl' diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl' diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl' diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl Warning: Kernel ABI header at 'tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl' differs from latest version at 'arch/mips/kernel/syscalls/syscall_n64.tbl' diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Link: https://lore.kernel.org/lkml/Y01HN2DGkWz8tC%2FJ@kernel.org/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25scsi: mpt3sas: re-do lost mpt3sas DMA mask fixSreekanth Reddy1-1/+1
This is a re-do of commit e0e0747de0ea ("scsi: mpt3sas: Fix return value check of dma_get_required_mask()"), which I ended up undoing in a mis-merge in commit 62e6e5940c0c ("Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi"). The original commit message was scsi: mpt3sas: Fix return value check of dma_get_required_mask() Fix the incorrect return value check of dma_get_required_mask(). Due to this incorrect check, the driver was always setting the DMA mask to 63 bit. Link: https://lore.kernel.org/r/20220913120538.18759-2-sreekanth.reddy@broadcom.com Fixes: ba27c5cf286d ("scsi: mpt3sas: Don't change the DMA coherent mask after allocations") Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> and this fix was lost when I mis-merged the conflict with commit 9df650963bf6 ("scsi: mpt3sas: Don't change DMA mask while reallocating pools"). Reported-by: Juergen Gross <jgross@suse.com> Fixes: 62e6e5940c0c ("Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi") Link: https://lore.kernel.org/all/CAHk-=wjaK-TxrNaGtFDpL9qNHL1MVkWXO1TT6vObD5tXMSC4Zg@mail.gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-10-24x86/mm: Do not verify W^X at boot upSteven Rostedt (Google)1-0/+4
Adding on the kernel command line "ftrace=function" triggered: CPA detected W^X violation: 8000000000000063 -> 0000000000000063 range: 0xffffffffc0013000 - 0xffffffffc0013fff PFN 10031b WARNING: CPU: 0 PID: 0 at arch/x86/mm/pat/set_memory.c:609 verify_rwx+0x61/0x6d Call Trace: __change_page_attr_set_clr+0x146/0x8a6 change_page_attr_set_clr+0x135/0x268 change_page_attr_clear.constprop.0+0x16/0x1c set_memory_x+0x2c/0x32 arch_ftrace_update_trampoline+0x218/0x2db ftrace_update_trampoline+0x16/0xa1 __register_ftrace_function+0x93/0xb2 ftrace_startup+0x21/0xf0 register_ftrace_function_nolock+0x26/0x40 register_ftrace_function+0x4e/0x143 function_trace_init+0x7d/0xc3 tracer_init+0x23/0x2c tracing_set_tracer+0x1d5/0x206 register_tracer+0x1c0/0x1e4 init_function_trace+0x90/0x96 early_trace_init+0x25c/0x352 start_kernel+0x424/0x6e4 x86_64_start_reservations+0x24/0x2a x86_64_start_kernel+0x8c/0x95 secondary_startup_64_no_verify+0xe0/0xeb This is because at boot up, kernel text is writable, and there's no reason to do tricks to updated it. But the verifier does not distinguish updates at boot up and at run time, and causes a warning at time of boot. Add a check for system_state == SYSTEM_BOOTING and allow it if that is the case. [ These SYSTEM_BOOTING special cases are all pretty horrid, but the x86 text_poke() code does some odd things at bootup, forcing this for now - Linus ] Link: https://lore.kernel.org/r/20221024112730.180916b3@gandalf.local.home Fixes: 652c5bf380ad0 ("x86/mm: Refuse W^X violations") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-10-24net-memcg: avoid stalls when under memory pressureJakub Kicinski1-1/+1
As Shakeel explains the commit under Fixes had the unintended side-effect of no longer pre-loading the cached memory allowance. Even tho we previously dropped the first packet received when over memory limit - the consecutive ones would get thru by using the cache. The charging was happening in batches of 128kB, so we'd let in 128kB (truesize) worth of packets per one drop. After the change we no longer force charge, there will be no cache filling side effects. This causes significant drops and connection stalls for workloads which use a lot of page cache, since we can't reclaim page cache under GFP_NOWAIT. Some of the latency can be recovered by improving SACK reneg handling but nowhere near enough to get back to the pre-5.15 performance (the application I'm experimenting with still sees 5-10x worst latency). Apply the suggested workaround of using GFP_ATOMIC. We will now be more permissive than previously as we'll drop _no_ packets in softirq when under pressure. But I can't think of any good and simple way to address that within networking. Link: https://lore.kernel.org/all/20221012163300.795e7b86@kernel.org/ Suggested-by: Shakeel Butt <shakeelb@google.com> Fixes: 4b1327be9fe5 ("net-memcg: pass in gfp_t mask to mem_cgroup_charge_skmem()") Acked-by: Shakeel Butt <shakeelb@google.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Link: https://lore.kernel.org/r/20221021160304.1362511-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-24tcp: fix indefinite deferral of RTO with SACK renegingNeal Cardwell1-1/+2
This commit fixes a bug that can cause a TCP data sender to repeatedly defer RTOs when encountering SACK reneging. The bug is that when we're in fast recovery in a scenario with SACK reneging, every time we get an ACK we call tcp_check_sack_reneging() and it can note the apparent SACK reneging and rearm the RTO timer for srtt/2 into the future. In some SACK reneging scenarios that can happen repeatedly until the receive window fills up, at which point the sender can't send any more, the ACKs stop arriving, and the RTO fires at srtt/2 after the last ACK. But that can take far too long (O(10 secs)), since the connection is stuck in fast recovery with a low cwnd that cannot grow beyond ssthresh, even if more bandwidth is available. This fix changes the logic in tcp_check_sack_reneging() to only rearm the RTO timer if data is cumulatively ACKed, indicating forward progress. This avoids this kind of nearly infinite loop of RTO timer re-arming. In addition, this meets the goals of tcp_check_sack_reneging() in handling Windows TCP behavior that looks temporarily like SACK reneging but is not really. Many thanks to Jakub Kicinski and Neil Spring, who reported this issue and provided critical packet traces that enabled root-causing this issue. Also, many thanks to Jakub Kicinski for testing this fix. Fixes: 5ae344c949e7 ("tcp: reduce spurious retransmits due to transient SACK reneging") Reported-by: Jakub Kicinski <kuba@kernel.org> Reported-by: Neil Spring <ntspring@fb.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Tested-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20221021170821.1093930-1-ncardwell.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-24ACPI: video: Fix missing native backlight on ChromebooksDmitry Osipenko1-0/+12
Chromebooks don't have backlight in ACPI table, they suppose to use native backlight in this case. Check presence of the CrOS embedded controller ACPI device and prefer the native backlight if EC found. Suggested-by: Hans de Goede <hdegoede@redhat.com> Fixes: 2600bfa3df99 ("ACPI: video: Add acpi_video_backlight_use_native() helper") Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20221024141210.67784-1-dmitry.osipenko@collabora.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-10-24tcp: fix a signed-integer-overflow bug in tcp_add_backlog()Lu Wei1-1/+3
The type of sk_rcvbuf and sk_sndbuf in struct sock is int, and in tcp_add_backlog(), the variable limit is caculated by adding sk_rcvbuf, sk_sndbuf and 64 * 1024, it may exceed the max value of int and overflow. This patch reduces the limit budget by halving the sndbuf to solve this issue since ACK packets are much smaller than the payload. Fixes: c9c3321257e1 ("tcp: add tcp_add_backlog()") Signed-off-by: Lu Wei <luwei32@huawei.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24net: lantiq_etop: don't free skb when returning NETDEV_TX_BUSYZhang Changzhong1-1/+0
The ndo_start_xmit() method must not free skb when returning NETDEV_TX_BUSY, since caller is going to requeue freed skb. Fixes: 504d4721ee8e ("MIPS: Lantiq: Add ethernet driver") Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24net: fix UAF issue in nfqnl_nf_hook_drop() when ops_init() failedZhengchao Shao1-0/+7
When the ops_init() interface is invoked to initialize the net, but ops->init() fails, data is released. However, the ptr pointer in net->gen is invalid. In this case, when nfqnl_nf_hook_drop() is invoked to release the net, invalid address access occurs. The process is as follows: setup_net() ops_init() data = kzalloc(...) ---> alloc "data" net_assign_generic() ---> assign "date" to ptr in net->gen ... ops->init() ---> failed ... kfree(data); ---> ptr in net->gen is invalid ... ops_exit_list() ... nfqnl_nf_hook_drop() *q = nfnl_queue_pernet(net) ---> q is invalid The following is the Call Trace information: BUG: KASAN: use-after-free in nfqnl_nf_hook_drop+0x264/0x280 Read of size 8 at addr ffff88810396b240 by task ip/15855 Call Trace: <TASK> dump_stack_lvl+0x8e/0xd1 print_report+0x155/0x454 kasan_report+0xba/0x1f0 nfqnl_nf_hook_drop+0x264/0x280 nf_queue_nf_hook_drop+0x8b/0x1b0 __nf_unregister_net_hook+0x1ae/0x5a0 nf_unregister_net_hooks+0xde/0x130 ops_exit_list+0xb0/0x170 setup_net+0x7ac/0xbd0 copy_net_ns+0x2e6/0x6b0 create_new_namespaces+0x382/0xa50 unshare_nsproxy_namespaces+0xa6/0x1c0 ksys_unshare+0x3a4/0x7e0 __x64_sys_unshare+0x2d/0x40 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 </TASK> Allocated by task 15855: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 __kasan_kmalloc+0xa1/0xb0 __kmalloc+0x49/0xb0 ops_init+0xe7/0x410 setup_net+0x5aa/0xbd0 copy_net_ns+0x2e6/0x6b0 create_new_namespaces+0x382/0xa50 unshare_nsproxy_namespaces+0xa6/0x1c0 ksys_unshare+0x3a4/0x7e0 __x64_sys_unshare+0x2d/0x40 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Freed by task 15855: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 kasan_save_free_info+0x2a/0x40 ____kasan_slab_free+0x155/0x1b0 slab_free_freelist_hook+0x11b/0x220 __kmem_cache_free+0xa4/0x360 ops_init+0xb9/0x410 setup_net+0x5aa/0xbd0 copy_net_ns+0x2e6/0x6b0 create_new_namespaces+0x382/0xa50 unshare_nsproxy_namespaces+0xa6/0x1c0 ksys_unshare+0x3a4/0x7e0 __x64_sys_unshare+0x2d/0x40 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fixes: f875bae06533 ("net: Automatically allocate per namespace data.") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24docs: netdev: offer performance feedback to contributorsJakub Kicinski1-0/+10
Some of us gotten used to producing large quantities of peer feedback at work, every 3 or 6 months. Extending the same courtesy to community members seems like a logical step. It may be hard for some folks to get validation of how important their work is internally, especially at smaller companies which don't employ many kernel experts. The concept of "peer feedback" may be a hyperscaler / silicon valley thing so YMMV. Hopefully we can build more context as we go. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24kcm: annotate data-races around kcm->rx_waitEric Dumazet1-6/+11
kcm->rx_psock can be read locklessly in kcm_rfree(). Annotate the read and writes accordingly. syzbot reported: BUG: KCSAN: data-race in kcm_rcv_strparser / kcm_rfree write to 0xffff88810784e3d0 of 1 bytes by task 1823 on cpu 1: reserve_rx_kcm net/kcm/kcmsock.c:283 [inline] kcm_rcv_strparser+0x250/0x3a0 net/kcm/kcmsock.c:363 __strp_recv+0x64c/0xd20 net/strparser/strparser.c:301 strp_recv+0x6d/0x80 net/strparser/strparser.c:335 tcp_read_sock+0x13e/0x5a0 net/ipv4/tcp.c:1703 strp_read_sock net/strparser/strparser.c:358 [inline] do_strp_work net/strparser/strparser.c:406 [inline] strp_work+0xe8/0x180 net/strparser/strparser.c:415 process_one_work+0x3d3/0x720 kernel/workqueue.c:2289 worker_thread+0x618/0xa70 kernel/workqueue.c:2436 kthread+0x1a9/0x1e0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306 read to 0xffff88810784e3d0 of 1 bytes by task 17869 on cpu 0: kcm_rfree+0x121/0x220 net/kcm/kcmsock.c:181 skb_release_head_state+0x8e/0x160 net/core/skbuff.c:841 skb_release_all net/core/skbuff.c:852 [inline] __kfree_skb net/core/skbuff.c:868 [inline] kfree_skb_reason+0x5c/0x260 net/core/skbuff.c:891 kfree_skb include/linux/skbuff.h:1216 [inline] kcm_recvmsg+0x226/0x2b0 net/kcm/kcmsock.c:1161 ____sys_recvmsg+0x16c/0x2e0 ___sys_recvmsg net/socket.c:2743 [inline] do_recvmmsg+0x2f1/0x710 net/socket.c:2837 __sys_recvmmsg net/socket.c:2916 [inline] __do_sys_recvmmsg net/socket.c:2939 [inline] __se_sys_recvmmsg net/socket.c:2932 [inline] __x64_sys_recvmmsg+0xde/0x160 net/socket.c:2932 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd value changed: 0x01 -> 0x00 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 17869 Comm: syz-executor.2 Not tainted 6.1.0-rc1-syzkaller-00010-gbb1a1146467a-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022 Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24kcm: annotate data-races around kcm->rx_psockEric Dumazet1-3/+5
kcm->rx_psock can be read locklessly in kcm_rfree(). Annotate the read and writes accordingly. We do the same for kcm->rx_wait in the following patch. syzbot reported: BUG: KCSAN: data-race in kcm_rfree / unreserve_rx_kcm write to 0xffff888123d827b8 of 8 bytes by task 2758 on cpu 1: unreserve_rx_kcm+0x72/0x1f0 net/kcm/kcmsock.c:313 kcm_rcv_strparser+0x2b5/0x3a0 net/kcm/kcmsock.c:373 __strp_recv+0x64c/0xd20 net/strparser/strparser.c:301 strp_recv+0x6d/0x80 net/strparser/strparser.c:335 tcp_read_sock+0x13e/0x5a0 net/ipv4/tcp.c:1703 strp_read_sock net/strparser/strparser.c:358 [inline] do_strp_work net/strparser/strparser.c:406 [inline] strp_work+0xe8/0x180 net/strparser/strparser.c:415 process_one_work+0x3d3/0x720 kernel/workqueue.c:2289 worker_thread+0x618/0xa70 kernel/workqueue.c:2436 kthread+0x1a9/0x1e0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306 read to 0xffff888123d827b8 of 8 bytes by task 5859 on cpu 0: kcm_rfree+0x14c/0x220 net/kcm/kcmsock.c:181 skb_release_head_state+0x8e/0x160 net/core/skbuff.c:841 skb_release_all net/core/skbuff.c:852 [inline] __kfree_skb net/core/skbuff.c:868 [inline] kfree_skb_reason+0x5c/0x260 net/core/skbuff.c:891 kfree_skb include/linux/skbuff.h:1216 [inline] kcm_recvmsg+0x226/0x2b0 net/kcm/kcmsock.c:1161 ____sys_recvmsg+0x16c/0x2e0 ___sys_recvmsg net/socket.c:2743 [inline] do_recvmmsg+0x2f1/0x710 net/socket.c:2837 __sys_recvmmsg net/socket.c:2916 [inline] __do_sys_recvmmsg net/socket.c:2939 [inline] __se_sys_recvmmsg net/socket.c:2932 [inline] __x64_sys_recvmmsg+0xde/0x160 net/socket.c:2932 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd value changed: 0xffff88812971ce00 -> 0x0000000000000000 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 5859 Comm: syz-executor.3 Not tainted 6.0.0-syzkaller-12189-g19d17ab7c68b-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/22/2022 Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24net: fman: Use physical address for userspace interfacesSean Anderson4-10/+10
Before 262f2b782e25 ("net: fman: Map the base address once"), the physical address of the MAC was exposed to userspace in two places: via sysfs and via SIOCGIFMAP. While this is not best practice, it is an external ABI which is in use by userspace software. The aforementioned commit inadvertently modified these addresses and made them virtual. This constitutes and ABI break. Additionally, it leaks the kernel's memory layout to userspace. Partially revert that commit, reintroducing the resource back into struct mac_device, while keeping the intended changes (the rework of the address mapping). Fixes: 262f2b782e25 ("net: fman: Map the base address once") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24platform/x86/intel: pmc/core: Add Raptor Lake support to pmc core driverGayatri Kammela1-0/+2
Add Raptor Lake client parts (both RPL and RPL_S) support to pmc core driver. Raptor Lake client parts reuse all the Alder Lake PCH IPs. Cc: Srinivas Pandruvada <srinivas.pandruvada@intel.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: David Box <david.e.box@intel.com> Acked-by: Rajneesh Bhardwaj <irenic.rajneesh@gmail.com> Signed-off-by: Gayatri Kammela <gayatri.kammela@linux.intel.com> Link: https://lore.kernel.org/r/20220912233307.409954-2-gayatri.kammela@linux.intel.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-10-24leds: simatic-ipc-leds-gpio: fix incorrect LED to GPIO mappingHenning Schild1-6/+6
For apollolake the mapping between LEDs and GPIO pins was off because of a refactoring when we introduced a new device model. In addition to the reordering the indices in the lookup table need to be updated as well. Fixes: a97126265dfe ("leds: simatic-ipc-leds-gpio: add new model 227G") Signed-off-by: Henning Schild <henning.schild@siemens.com> Link: https://lore.kernel.org/r/20221024092027.4529-1-henning.schild@siemens.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-10-24net/mlx5e: Cleanup MACsec uninitialization routineLeon Romanovsky1-10/+1
The mlx5e_macsec_cleanup() routine has NULL pointer dereferencing if mlx5 device doesn't support MACsec (priv->macsec will be NULL). While at it delete comment line, assignment and extra blank lines, so fix everything in one patch. Fixes: 1f53da676439 ("net/mlx5e: Create advanced steering operation (ASO) object for MACsec") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24platform/x86/amd: pmc: Read SMU version during suspend on Cezanne systemsMario Limonciello1-0/+7
commit b0c07116c894 ("platform/x86: amd-pmc: Avoid reading SMU version at probe time") adjusted the behavior for amd-pmc to avoid reading the SMU version at startup but rather on first use to improve boot time. However the SMU version is also used to decide whether to place a timer based wakeup in the OS_HINT message. If the idlemask hasn't been read before this message was sent then the SMU version will not have been cached. Ensure the SMU version has been read before deciding whether or not to run this codepath. Cc: stable@vger.kernel.org # 6.0 Reported-by: You-Sheng Yang <vicamo.yang@canonical.com> Tested-by: Anson Tsao <anson.tsao@amd.com> Fixes: b0c07116c894 ("platform/x86: amd-pmc: Avoid reading SMU version at probe time") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20221020113749.6621-2-mario.limonciello@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-10-24atlantic: fix deadlock at aq_nic_stopÍñigo Huguet2-24/+74
NIC is stopped with rtnl_lock held, and during the stop it cancels the 'service_task' work and free irqs. However, if CONFIG_MACSEC is set, rtnl_lock is acquired both from aq_nic_service_task and aq_linkstate_threaded_isr. Then a deadlock happens if aq_nic_stop tries to cancel/disable them when they've already started their execution. As the deadlock is caused by rtnl_lock, it causes many other processes to stall, not only atlantic related stuff. Fix it by introducing a mutex that protects each NIC's macsec related data, and locking it instead of the rtnl_lock from the service task and the threaded IRQ. Before this patch, all macsec data was protected with rtnl_lock, but maybe not all of it needs to be protected. With this new mutex, further efforts can be made to limit the protected data only to that which requires it. However, probably it doesn't worth it because all macsec's data accesses are infrequent, and almost all are done from macsec_ops or ethtool callbacks, called holding rtnl_lock, so macsec_mutex won't never be much contended. The issue appeared repeteadly attaching and deattaching the NIC to a bond interface. Doing that after this patch I cannot reproduce the bug. Fixes: 62c1c2e606f6 ("net: atlantic: MACSec offload skeleton") Reported-by: Li Liang <liali@redhat.com> Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24platform/x86: thinkpad_acpi: Fix reporting a non present second fan on some modelsJelle van der Waa1-1/+3
thinkpad_acpi was reporting 2 fans on a ThinkPad T14s gen 1, even though the laptop has only 1 fan. The second, not present fan always reads 65535 (-1 in 16 bit signed), ignore fans which report 65535 to avoid reporting the non present fan. Signed-off-by: Jelle van der Waa <jvanderwaa@redhat.com> Link: https://lore.kernel.org/r/20221019194751.5392-1-jvanderwaa@redhat.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-10-24platform/x86: asus-wmi: Add support for ROG X16 tablet modeLuke D. Jones1-0/+9
Add quirk for ASUS ROG X16 Flow 2-in-1 to enable tablet mode with lid flip (all screen rotations). Signed-off-by: Luke D. Jones <luke@ljones.dev> Link: https://lore.kernel.org/r/20221010063009.32293-1-luke@ljones.dev Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-10-23Linux 6.1-rc2Linus Torvalds1-1/+1
2022-10-23Revert "mfd: syscon: Remove repetition of the regmap_get_val_endian()"Jason A. Donenfeld1-0/+8
This reverts commit 72a95859728a7866522e6633818bebc1c2519b17. It broke reboots on big-endian MIPS and MIPS64 malta QEMU instances, which use the syscon driver. Little-endian is not effected, which means likely it's important to handle regmap_get_val_endian() in this function after all. Fixes: 72a95859728a ("mfd: syscon: Remove repetition of the regmap_get_val_endian()") Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Lee Jones <lee@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-10-23kernel/utsname_sysctl.c: Fix hostname pollingLinus Torvalds2-0/+2
Commit bfca3dd3d068 ("kernel/utsname_sysctl.c: print kernel arch") added a new entry to the uts_kern_table[] array, but didn't update the UTS_PROC_xyz enumerators of older entries, breaking anything that used them. Which is admittedly not many cases: it's really just the two uses of uts_proc_notify() in kernel/sys.c. But apparently journald-systemd actually uses this to detect hostname changes. Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Fixes: bfca3dd3d068 ("kernel/utsname_sysctl.c: print kernel arch") Link: https://lore.kernel.org/lkml/0c2b92a6-0f25-9538-178f-eee3b06da23f@secunet.com/ Link: https://linux-regtracking.leemhuis.info/regzbot/regression/0c2b92a6-0f25-9538-178f-eee3b06da23f@secunet.com/ Cc: Petr Vorel <pvorel@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-10-22io_uring/net: fail zc sendmsg when unsupported by socketPavel Begunkov1-0/+2
The previous patch fails zerocopy send requests for protocols that don't support it, do the same for zerocopy sendmsg. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0854e7bb4c3d810a48ec8b5853e2f61af36a0467.1666346426.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-22io_uring/net: fail zc send when unsupported by socketPavel Begunkov1-0/+2
If a protocol doesn't support zerocopy it will silently fall back to copying. This type of behaviour has always been a source of troubles so it's better to fail such requests instead. Cc: <stable@vger.kernel.org> # 6.0 Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/2db3c7f16bb6efab4b04569cd16e6242b40c5cb3.1666346426.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-22net: flag sockets supporting msghdr originated zerocopyPavel Begunkov3-0/+3
We need an efficient way in io_uring to check whether a socket supports zerocopy with msghdr provided ubuf_info. Add a new flag into the struct socket flags fields. Cc: <stable@vger.kernel.org> # 6.0 Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/3dafafab822b1c66308bb58a0ac738b1e3f53f74.1666346426.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-22hwmon: (corsair-psu) Add USB id of the new HX1500i psuWilken Gottwalt2-0/+3
Also update the documentation accordingly. Signed-off-by: Wilken Gottwalt <wilken.gottwalt@posteo.net> Link: https://lore.kernel.org/r/Y0FghqQCHG/cX5Jz@monster.localdomain Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-10-22tools: include: sync include/api/linux/kvm.hPaolo Bonzini1-0/+1
Provide a definition of KVM_CAP_DIRTY_LOG_RING_ACQ_REL. Fixes: 17601bfed909 ("KVM: Add KVM_CAP_DIRTY_LOG_RING_ACQ_REL capability and config option") Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-22KVM: x86: Add compat handler for KVM_X86_SET_MSR_FILTERAlexander Graf1-0/+56
The KVM_X86_SET_MSR_FILTER ioctls contains a pointer in the passed in struct which means it has a different struct size depending on whether it gets called from 32bit or 64bit code. This patch introduces compat code that converts from the 32bit struct to its 64bit counterpart which then gets used going forward internally. With this applied, 32bit QEMU can successfully set MSR bitmaps when running on 64bit kernels. Reported-by: Andrew Randrianasulu <randrianasulu@gmail.com> Fixes: 1a155254ff937 ("KVM: x86: Introduce MSR filtering") Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20221017184541.2658-4-graf@amazon.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-22KVM: x86: Copy filter arg outside kvm_vm_ioctl_set_msr_filter()Alexander Graf1-14/+17
In the next patch we want to introduce a second caller to set_msr_filter() which constructs its own filter list on the stack. Refactor the original function so it takes it as argument instead of reading it through copy_from_user(). Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20221017184541.2658-3-graf@amazon.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-22kvm: Add support for arch compat vm ioctlsAlexander Graf2-0/+13
We will introduce the first architecture specific compat vm ioctl in the next patch. Add all necessary boilerplate to allow architectures to override compat vm ioctls when necessary. Signed-off-by: Alexander Graf <graf@amazon.com> Message-Id: <20221017184541.2658-2-graf@amazon.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-21nfp: only clean `sp_indiff` when application firmware is unloadedYinjun Zhang1-23/+15
Currently `sp_indiff` is cleaned when driver is removed. This will cause problem in multi-PF/multi-host case, considering one PF is removed while another is still in use. Since `sp_indiff` is the application firmware property, it should only be cleaned when the firmware is unloaded. Now let management firmware to clean it when necessary, driver only set it. Fixes: b1e4f11e426d ("nfp: refine the ABI of getting `sp_indiff` info") Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20221020081411.80186-1-simon.horman@corigine.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-21amd-xgbe: add the bit rate quirk for Molex cablesRaju Rangoju1-1/+8
The offset 12 (bit-rate) of EEPROM SFP DAC (passive) cables is expected to be in the range 0x64 to 0x68. However, the 5 meter and 7 meter Molex passive cables have the rate ceiling 0x78 at offset 12. Add a quirk for Molex passive cables to extend the rate ceiling to 0x78. Fixes: abf0a1c2b26a ("amd-xgbe: Add support for SFP+ modules") Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-21amd-xgbe: fix the SFP compliance codes check for DAC cablesRaju Rangoju1-4/+4
The current XGBE code assumes that offset 6 of EEPROM SFP DAC (passive) cables is NULL. However, some cables (the 5 meter and 7 meter Molex passive cables) have non-zero data at offset 6. Fix the logic by moving the passive cable check above the active checks, so as not to be improperly identified as an active cable. This will fix the issue for any passive cable that advertises 1000Base-CX in offset 6. Fixes: abf0a1c2b26a ("amd-xgbe: Add support for SFP+ modules") Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-21amd-xgbe: enable PLL_CTL for fixed PHY modes onlyRaju Rangoju1-2/+8
PLL control setting(RRC) is needed only in fixed PHY configuration to fix the peer-peer issues. Without the PLL control setting, the link up takes longer time in a fixed phy configuration. Driver implements SW RRC for Autoneg On configuration, hence PLL control setting (RRC) is not needed for AN On configuration, and can be skipped. Also, PLL re-initialization is not needed for PHY Power Off and RRC commands. Otherwise, they lead to mailbox errors. Added the changes accordingly. Fixes: daf182d360e5 ("net: amd-xgbe: Toggle PLL settings during rate change") Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-21amd-xgbe: use enums for mailbox cmd and sub_cmdsRaju Rangoju2-13/+41
Instead of using hardcoded values, use enumerations for mailbox command and sub commands. Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-21amd-xgbe: Yellow carp devices do not need rrcRaju Rangoju3-1/+7
Link stability issues are noticed on Yellow carp platforms when Receiver Reset Cycle is issued. Since the CDR workaround is disabled on these platforms, the Receiver Reset Cycle is not needed. So, avoid issuing rrc on Yellow carp platforms. Fixes: dbb6c58b5a61 ("net: amd-xgbe: Add Support for Yellow Carp Ethernet device") Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-21bpf: Use __llist_del_all() whenever possbile during memory drainingHou Tao1-2/+5
Except for waiting_for_gp list, there are no concurrent operations on free_by_rcu, free_llist and free_llist_extra lists, so use __llist_del_all() instead of llist_del_all(). waiting_for_gp list can be deleted by RCU callback concurrently, so still use llist_del_all(). Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20221021114913.60508-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-21bpf: Wait for busy refill_work when destroying bpf memory allocatorHou Tao1-0/+11
A busy irq work is an unfinished irq work and it can be either in the pending state or in the running state. When destroying bpf memory allocator, refill_work may be busy for PREEMPT_RT kernel in which irq work is invoked in a per-CPU RT-kthread. It is also possible for kernel with arch_irq_work_has_interrupt() being false (e.g. 1-cpu arm32 host or mips) and irq work is inovked in timer interrupt. The busy refill_work leads to various issues. The obvious one is that there will be concurrent operations on free_by_rcu and free_list between irq work and memory draining. Another one is call_rcu_in_progress will not be reliable for the checking of pending RCU callback because do_call_rcu() may have not been invoked by irq work yet. The other is there will be use-after-free if irq work is freed before the callback of irq work is invoked as shown below: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor instruction fetch in kernel mode #PF: error_code(0x0010) - not-present page PGD 12ab94067 P4D 12ab94067 PUD 1796b4067 PMD 0 Oops: 0010 [#1] PREEMPT_RT SMP CPU: 5 PID: 64 Comm: irq_work/5 Not tainted 6.0.0-rt11+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:0x0 Code: Unable to access opcode bytes at 0xffffffffffffffd6. RSP: 0018:ffffadc080293e78 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffffcdc07fb6a388 RCX: ffffa05000a2e000 RDX: ffffa05000a2e000 RSI: ffffffff96cc9827 RDI: ffffcdc07fb6a388 ...... Call Trace: <TASK> irq_work_single+0x24/0x60 irq_work_run_list+0x24/0x30 run_irq_workd+0x23/0x30 smpboot_thread_fn+0x203/0x300 kthread+0x126/0x150 ret_from_fork+0x1f/0x30 </TASK> Considering the ease of concurrency handling, no overhead for irq_work_sync() under non-PREEMPT_RT kernel and has-irq-work-interrupt kernel and the short wait time used for irq_work_sync() under PREEMPT_RT (When running two test_maps on PREEMPT_RT kernel and 72-cpus host, the max wait time is about 8ms and the 99th percentile is 10us), just using irq_work_sync() to wait for busy refill_work to complete before memory draining and memory freeing. Fixes: 7c8199e24fa0 ("bpf: Introduce any context BPF specific memory allocator.") Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20221021114913.60508-2-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-21x86/fpu: Fix copy_xstate_to_uabi() to copy init states correctlyChang S. Bae1-0/+9
When an extended state component is not present in fpstate, but in init state, the function copies from init_fpstate via copy_feature(). But, dynamic states are not present in init_fpstate because of all-zeros init states. Then retrieving them from init_fpstate will explode like this: BUG: kernel NULL pointer dereference, address: 0000000000000000 ... RIP: 0010:memcpy_erms+0x6/0x10 ? __copy_xstate_to_uabi_buf+0x381/0x870 fpu_copy_guest_fpstate_to_uabi+0x28/0x80 kvm_arch_vcpu_ioctl+0x14c/0x1460 [kvm] ? __this_cpu_preempt_check+0x13/0x20 ? vmx_vcpu_put+0x2e/0x260 [kvm_intel] kvm_vcpu_ioctl+0xea/0x6b0 [kvm] ? kvm_vcpu_ioctl+0xea/0x6b0 [kvm] ? __fget_light+0xd4/0x130 __x64_sys_ioctl+0xe3/0x910 ? debug_smp_processor_id+0x17/0x20 ? fpregs_assert_state_consistent+0x27/0x50 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd Adjust the 'mask' to zero out the userspace buffer for the features that are not available both from fpstate and from init_fpstate. The dynamic features depend on the compacted XSAVE format. Ensure it is enabled before reading XCOMP_BV in init_fpstate. Fixes: 2308ee57d93d ("x86/fpu/amx: Enable the AMX feature in 64-bit mode") Reported-by: Yuan Yao <yuan.yao@intel.com> Suggested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/lkml/BYAPR11MB3717EDEF2351C958F2C86EED95259@BYAPR11MB3717.namprd11.prod.outlook.com/ Link: https://lkml.kernel.org/r/20221021185844.13472-1-chang.seok.bae@intel.com
2022-10-21x86/unwind/orc: Fix unreliable stack dump with gcovChen Zhongjin1-1/+1
When a console stack dump is initiated with CONFIG_GCOV_PROFILE_ALL enabled, show_trace_log_lvl() gets out of sync with the ORC unwinder, causing the stack trace to show all text addresses as unreliable: # echo l > /proc/sysrq-trigger [ 477.521031] sysrq: Show backtrace of all active CPUs [ 477.523813] NMI backtrace for cpu 0 [ 477.524492] CPU: 0 PID: 1021 Comm: bash Not tainted 6.0.0 #65 [ 477.525295] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014 [ 477.526439] Call Trace: [ 477.526854] <TASK> [ 477.527216] ? dump_stack_lvl+0xc7/0x114 [ 477.527801] ? dump_stack+0x13/0x1f [ 477.528331] ? nmi_cpu_backtrace.cold+0xb5/0x10d [ 477.528998] ? lapic_can_unplug_cpu+0xa0/0xa0 [ 477.529641] ? nmi_trigger_cpumask_backtrace+0x16a/0x1f0 [ 477.530393] ? arch_trigger_cpumask_backtrace+0x1d/0x30 [ 477.531136] ? sysrq_handle_showallcpus+0x1b/0x30 [ 477.531818] ? __handle_sysrq.cold+0x4e/0x1ae [ 477.532451] ? write_sysrq_trigger+0x63/0x80 [ 477.533080] ? proc_reg_write+0x92/0x110 [ 477.533663] ? vfs_write+0x174/0x530 [ 477.534265] ? handle_mm_fault+0x16f/0x500 [ 477.534940] ? ksys_write+0x7b/0x170 [ 477.535543] ? __x64_sys_write+0x1d/0x30 [ 477.536191] ? do_syscall_64+0x6b/0x100 [ 477.536809] ? entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 477.537609] </TASK> This happens when the compiled code for show_stack() has a single word on the stack, and doesn't use a tail call to show_stack_log_lvl(). (CONFIG_GCOV_PROFILE_ALL=y is the only known case of this.) Then the __unwind_start() skip logic hits an off-by-one bug and fails to unwind all the way to the intended starting frame. Fix it by reverting the following commit: f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks") The original justification for that commit no longer exists. That original issue was later fixed in a different way, with the following commit: f2ac57a4c49d ("x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels") Fixes: f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> [jpoimboe: rewrite commit log] Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2022-10-21MAINTAINERS: add keyword match on PTPJakub Kicinski1-0/+1
Most of PTP drivers live under ethernet and we have to keep telling people to CC the PTP maintainers. Let's try a keyword match, we can refine as we go if it causes false positives. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-21ethtool: pse-pd: fix null-deref on genl_info in dumpJakub Kicinski1-1/+1
ethnl_default_dump_one() passes NULL as info. It's correct not to set extack during dump, as we should just silently skip interfaces which can't provide the information. Reported-by: syzbot+81c4b4bbba6eea2cfcae@syzkaller.appspotmail.com Fixes: 18ff0bcda6d1 ("ethtool: add interface to interact with Ethernet Power Equipment") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>