aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/syscall-counts.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2022-01-12perf stat: Add aggr creators that are passed a cpuIan Rogers2-34/+51
The cpu_map and index can get confused. Add variants of the cpu_map__get routines that are passed a cpu. Make the existing cpu_map__get routines use the new functions with a view to remove them when no longer used. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Vineet Singh <vineet.singh@intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: zhengjun.xing@intel.com Link: https://lore.kernel.org/r/20220105061351.120843-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-12libperf: Add comments to 'struct perf_cpu_map'Ian Rogers1-0/+9
A particular observed problem is confusing the index with the CPU value, documentation should hopefully reduce this type of problem. Reviewed-by: James Clark <james.clark@arm.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Vineet Singh <vineet.singh@intel.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: zhengjun.xing@intel.com Link: https://lore.kernel.org/r/20220105061351.120843-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-12perf evsel: Improve error message for uncore eventsIan Rogers1-0/+4
When a group has multiple events and the leader fails it can yield errors like: $ perf stat -e '{uncore_imc/cas_count_read/},instructions' /bin/true Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (uncore_imc/cas_count_read/). /bin/dmesg | grep -i perf may provide additional information. However, when not the group leader <not supported> is given: $ perf stat -e '{instructions,uncore_imc/cas_count_read/}' /bin/true ... 1,619,057 instructions <not supported> MiB uncore_imc/cas_count_read/ This is necessary because get_group_fd will fail if the leader fails and is the direct result of the check on line 750 of builtin-stat.c in stat_handle_error that returns COUNTER_SKIP for the latter case. This patch improves the error message to: $ perf stat -e '{uncore_imc/cas_count_read/},instructions' /bin/true Error: Invalid event (uncore_imc/cas_count_read/) in per-thread mode, enable system wide with '-a'. v2. Changed the test to use !target__has_cpu as suggested by Namhyung Kim. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20211223183948.3423989-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-12Revert "perf powerpc: Add data source encodings for power10 platform"Arnaldo Carvalho de Melo1-42/+12
This was in a patchkit mixing up kernel with tools/ parts and I mistakenly got it merged in the perf tools tree, revert it, it'll go via the PowerPC kernel tree. This reverts commit af2b24f228a0373ac65eb7a502e0bc31e2c0269d. Cc: kajoljain <kjain@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Link: http://lore.kernel.org/lkml/20220112171659.531d22ce@canb.auug.org.au Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-12Revert "perf powerpc: Add encodings to represent data based on newer composite PERF_MEM_LVLNUM* fields"Arnaldo Carvalho de Melo1-3/+3
This was in a patchkit mixing up kernel with tools/ parts and I mistakenly got it merged in the perf tools tree, revert it, it'll go via the PowerPC kernel tree. This reverts commit 0ebce3d65f1f53c936fdd51e975bd876ba7ed64f. Cc: kajoljain <kjain@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Link: http://lore.kernel.org/lkml/20220112171659.531d22ce@canb.auug.org.au Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-12perf script: Fix hex dump character outputAdrian Hunter1-1/+1
Using grep -C with perf script -D can give erroneous results as grep loses lines due to non-printable characters, for example, below the 0020, 0060 and 0070 lines are missing: $ perf script -D | grep -C10 AUX | head . 0010: 08 00 00 00 00 00 00 00 1f 00 00 00 00 00 00 00 ................ . 0030: 01 00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 ................ . 0040: 00 08 00 00 00 00 00 00 02 00 00 00 00 00 00 00 ................ . 0050: 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................ . 0080: 02 00 00 00 00 00 00 00 1b 00 00 00 00 00 00 00 ................ . 0090: 00 00 00 00 00 00 00 00 ........ 0 0 0x450 [0x98]: PERF_RECORD_AUXTRACE_INFO type: 1 PMU Type 8 Time Shift 31 perf's isprint() is a custom implementation from the kernel, but the kernel's _ctype appears to include characters from Latin-1 Supplement which is not compatible with, for example, UTF-8. Fix by checking also isascii(). After: $ tools/perf/perf script -D | grep -C10 AUX | head . 0010: 08 00 00 00 00 00 00 00 1f 00 00 00 00 00 00 00 ................ . 0020: 03 84 32 2f 00 00 00 00 63 7c 4f d2 fa ff ff ff ..2/....c|O..... . 0030: 01 00 00 00 00 00 00 00 00 04 00 00 00 00 00 00 ................ . 0040: 00 08 00 00 00 00 00 00 02 00 00 00 00 00 00 00 ................ . 0050: 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................ . 0060: 00 02 00 00 00 00 00 00 00 c0 03 00 00 00 00 00 ................ . 0070: e2 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 ................ . 0080: 02 00 00 00 00 00 00 00 1b 00 00 00 00 00 00 00 ................ . 0090: 00 00 00 00 00 00 00 00 ........ Fixes: 3052ba56bcb58904 ("tools perf: Move from sane_ctype.h obtained from git to the Linux's original") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20220112085057.277205-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-11perf test: Enable system wide for metricgroups testIan Rogers1-1/+1
Uncore events as group leaders fail in per-thread mode causing exit errors. Enable system-wide for metricgroup testing. This fixes the HPC metric group when tested on skylakex. Fixes: 4a87dea9e60fe100 ("perf test: Workload test of metric and metricgroups") Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20211223183948.3423989-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-10perf annotate: Avoid TUI crash when navigating in the annotation of recursive functionsDario Petrillo1-9/+14
In 'perf report', entering a recursive function from inside of itself (either directly of indirectly through some other function) results in calling symbol__annotate2 multiple() times, and freeing the whole disassembly when exiting from the innermost instance. The first issue causes the function's disassembly to be duplicated, and the latter a heap use-after-free (and crash) when trying to access the disassembly again. I reproduced the bug on perf 5.11.22 (Ubuntu 20.04.3 LTS) and 5.16.rc8 with the following testcase (compile with gcc recursive.c -o recursive). To reproduce: - perf record ./recursive - perf report - enter fibonacci and annotate it - move the cursor on one of the "callq fibonacci" instructions and press enter - at this point there will be two copies of the function in the disassembly - go back by pressing q, and perf will crash #include <stdio.h> int fibonacci(int n) { if(n <= 2) return 1; return fibonacci(n-1) + fibonacci(n-2); } int main() { printf("%d\n", fibonacci(40)); } This patch addresses the issue by annotating a function and freeing the associated memory on exit only if no annotation is already present, so that a recursive function is only annotated on entry. Signed-off-by: Dario Petrillo <dario.pk1@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@kernel.org Link: http://lore.kernel.org/lkml/20220109234441.325106-1-dario.pk1@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-10perf powerpc: Update global/local variants for p_stage_cycAthira Rajeev1-1/+7
Update the arch_support_sort_key() function in powerpc to enable presenting local and global variants of sort key 'p_stage_cyc'. Update the "se_header" strings for these in arch_perf_header_entry() along with instruction latency. Reported-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20211203022038.48240-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-10perf sort: Include global and local variants for p_stage_cyc sort keyAthira Rajeev4-12/+32
Sort key 'p_stage_cyc' is used to present the latency cycles spent in pipeline stages. perf has local 'p_stage_cyc' sort key to display this info. There is no global variant available for this sort key. The local variant shows latency in a single sample, whereas the global value will be useful to present the total latency (sum of latencies) in the hist entry. It represents the latency number multiplied by the number of samples. Add global ('p_stage_cyc') and local variant ('local_p_stage_cyc') for this sort key. Use 'local_p_stage_cyc' as default option for "mem" sort mode. Also add this to the list of dynamic sort keys and made the "dynamic_headers" and "arch_specific_sort_keys" as static. Reported-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20211203022038.48240-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-09Linux 5.16Linus Torvalds1-1/+1
2022-01-09Revert "drm/amdgpu: stop scheduler when calling hw_fini (v2)"Len Brown1-8/+0
This reverts commit f7d6779df642720e22bffd449e683bb8690bd3bf. This bisected regression has impacted suspend-resume stability since 5.15-rc1. It regressed -stable via 5.14.10. Link: https://bugzilla.kernel.org/show_bug.cgi?id=215315 Fixes: f7d6779df64 ("drm/amdgpu: stop scheduler when calling hw_fini (v2)") Cc: Guchun Chen <guchun.chen@amd.com> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: <stable@vger.kernel.org> # 5.14+ Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-01-08Input: zinitix - make sure the IRQ is allocated before it gets enabledNikita Travkin1-9/+9
Since irq request is the last thing in the driver probe, it happens later than the input device registration. This means that there is a small time window where if the open method is called the driver will attempt to enable not yet available irq. Fix that by moving the irq request before the input device registration. Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Fixes: 26822652c85e ("Input: add zinitix touchscreen driver") Signed-off-by: Nikita Travkin <nikita@trvn.ru> Link: https://lore.kernel.org/r/20220106072840.36851-2-nikita@trvn.ru Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-01-08ARM: dts: gpio-ranges property is now requiredPhil Elwell2-0/+4
Since [1], added in 5.7, the absence of a gpio-ranges property has prevented GPIOs from being restored to inputs when released. Add those properties for BCM283x and BCM2711 devices. [1] commit 2ab73c6d8323 ("gpio: Support GPIO controllers without pin-ranges") Link: https://lore.kernel.org/r/20220104170247.956760-1-linus.walleij@linaro.org Fixes: 2ab73c6d8323 ("gpio: Support GPIO controllers without pin-ranges") Fixes: 266423e60ea1 ("pinctrl: bcm2835: Change init order for gpio hogs") Reported-by: Stefan Wahren <stefan.wahren@i2se.com> Reported-by: Florian Fainelli <f.fainelli@gmail.com> Reported-by: Jan Kiszka <jan.kiszka@web.de> Signed-off-by: Phil Elwell <phil@raspberrypi.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20211206092237.4105895-3-phil@raspberrypi.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Olof Johansson <olof@lixom.net>
2022-01-08s390/dasd: use default_groups in kobj_typeGreg Kroah-Hartman1-1/+2
There are currently 2 ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the s390 dasd sysfs code to use default_groups field which has been the preferred way since commit aa30f47cf666 ("kobject: Add support for default attribute groups to kobj_type") so that we can soon get rid of the obsolete default_attrs field. Cc: Stefan Haberland <sth@linux.ibm.com> Cc: Jan Hoeppner <hoeppner@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20220106095401.3274637-1-gregkh@linuxfoundation.org Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-01-08s390/sclp_sd: use default_groups in kobj_typeGreg Kroah-Hartman1-1/+2
There are currently 2 ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the sclp_sd sysfs code to use default_groups field which has been the preferred way since commit aa30f47cf666 ("kobject: Add support for default attribute groups to kobj_type") so that we can soon get rid of the obsolete default_attrs field. Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20220106095252.3273905-1-gregkh@linuxfoundation.org Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-01-07Revert "i2c: core: support bus regulator controlling in adapter"Wolfram Sang1-95/+0
This largely reverts commit 5a7b95fb993ec399c8a685552aa6a8fc995c40bd. It breaks suspend with AMD GPUs, and we couldn't incrementally fix it. So, let's remove the code and go back to the drawing board. We keep the header extension to not break drivers already populating the regulator. We expect to re-add the code handling it soon. Fixes: 5a7b95fb993e ("i2c: core: support bus regulator controlling in adapter") Reported-by: "Tareque Md.Hanif" <tarequemd.hanif@yahoo.com> Link: https://lore.kernel.org/r/1295184560.182511.1639075777725@mail.yahoo.com Reported-by: Konstantin Kharlamov <hi-angel@yandex.ru> Link: https://lore.kernel.org/r/7143a7147978f4104171072d9f5225d2ce355ec1.camel@yandex.ru BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1850 Tested-by: "Tareque Md.Hanif" <tarequemd.hanif@yahoo.com> Tested-by: Konstantin Kharlamov <hi-angel@yandex.ru> Signed-off-by: Wolfram Sang <wsa@kernel.org> Cc: <stable@vger.kernel.org> # 5.14+
2022-01-07Revert "libtraceevent: Increase libtraceevent logging when verbose"Arnaldo Carvalho de Melo1-19/+0
This reverts commit 08efcb4a638d260ef7fcbae64ecf7ceceb3f1841. This breaks the build as it will prefer using libbpf-devel header files, even when not using LIBBPF_DYNAMIC=1, breaking the build. This was detected on OpenSuSE Tumbleweed with libtraceevent-devel 1.3.0, as described by Jiri Slaby: ======================================================================= It breaks build with LIBTRACEEVENT_DYNAMIC and version 1.3.0: > util/debug.c: In function ‘perf_debug_option’: > util/debug.c:243:17: error: implicit declaration of function ‘tep_set_loglevel’ [-Werror=implicit-function-declaration] > 243 | tep_set_loglevel(TEP_LOG_INFO); > | ^~~~~~~~~~~~~~~~ > util/debug.c:243:34: error: ‘TEP_LOG_INFO’ undeclared (first use in this function); did you mean ‘TEP_PRINT_INFO’? > 243 | tep_set_loglevel(TEP_LOG_INFO); > | ^~~~~~~~~~~~ > | TEP_PRINT_INFO > util/debug.c:243:34: note: each undeclared identifier is reported only once for each function it appears in > util/debug.c:245:34: error: ‘TEP_LOG_DEBUG’ undeclared (first use in this function) > 245 | tep_set_loglevel(TEP_LOG_DEBUG); > | ^~~~~~~~~~~~~ > util/debug.c:247:34: error: ‘TEP_LOG_ALL’ undeclared (first use in this function) > 247 | tep_set_loglevel(TEP_LOG_ALL); > | ^~~~~~~~~~~ It is because the gcc's command line looks like: gcc ... -I/home/abuild/rpmbuild/BUILD/tools/lib/ ... -DLIBTRACEEVENT_VERSION=65790 ... ======================================================================= The proper way to fix this is more involved and so not suitable for this late in the 5.16-rc stage. Reported-by: Jiri Slaby <jirislaby@kernel.org> Link: https://lore.kernel.org/lkml/bc2b0786-8965-1bcd-2316-9d9bb37b9c31@kernel.org Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: https://lore.kernel.org/lkml/YddGjjmlMZzxUZbN@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-07perf trace: Avoid early exit due to running SIGCHLD handler before it makes sense toJiri Olsa1-1/+1
When running 'perf trace' with an BPF object like: # perf trace -e openat,tools/perf/examples/bpf/hello.c the event parsing eventually calls llvm__get_kbuild_opts() that runs a script and that ends up with SIGCHLD delivered to the 'perf trace' handler, which assumes the workload process is done and quits 'perf trace'. Move the SIGCHLD handler setup directly to trace__run(), where the event is parsed and the object is already compiled. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Christy Lee <christyc.y.lee@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220106222030.227499-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-07KVM: x86: Check for rmaps allocationNikunj A Dadhania1-0/+3
With TDP MMU being the default now, access to mmu_rmaps_stat debugfs file causes following oops: BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 7 PID: 3185 Comm: cat Not tainted 5.16.0-rc4+ #204 RIP: 0010:pte_list_count+0x6/0x40 Call Trace: <TASK> ? kvm_mmu_rmaps_stat_show+0x15e/0x320 seq_read_iter+0x126/0x4b0 ? aa_file_perm+0x124/0x490 seq_read+0xf5/0x140 full_proxy_read+0x5c/0x80 vfs_read+0x9f/0x1a0 ksys_read+0x67/0xe0 __x64_sys_read+0x19/0x20 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7fca6fc13912 Return early when rmaps are not present. Reported-by: Vasant Hegde <vasant.hegde@amd.com> Tested-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220105040337.4234-1-nikunj@amd.com> Cc: stable@vger.kernel.org Fixes: 3bcd0662d66f ("KVM: X86: Introduce mmu_rmaps_stat per-vm debugfs file") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07KVM: SEV: Mark nested locking of kvm->lockWanpeng Li1-1/+1
Both source and dest vms' kvm->locks are held in sev_lock_two_vms. Mark one with a different subtype to avoid false positives from lockdep. Fixes: c9d61dcb0bc26 (KVM: SEV: accept signals in sev_lock_two_vms) Reported-by: Yiru Xu <xyru1999@gmail.com> Tested-by: Jinrong Liang <cloudliang@tencent.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1641364863-26331-1-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-01-07x86/sgx: Fix NULL pointer dereference on non-SGX systemsDave Hansen1-18/+47
== Problem == Nathan Chancellor reported an oops when aceessing the 'sgx_total_bytes' sysfs file: https://lore.kernel.org/all/YbzhBrimHGGpddDM@archlinux-ax161/ The sysfs output code accesses the sgx_numa_nodes[] array unconditionally. However, this array is allocated during SGX initialization, which only occurs on systems where SGX is supported. If the sysfs file is accessed on systems without SGX support, sgx_numa_nodes[] is NULL and an oops occurs. == Solution == To fix this, hide the entire nodeX/x86/ attribute group on systems without SGX support using the ->is_visible attribute group callback. Unfortunately, SGX is initialized via a device_initcall() which occurs _after_ the ->is_visible() callback. Instead of moving SGX initialization earlier, call sysfs_update_group() during SGX initialization to update the group visiblility. This update requires moving the SGX sysfs code earlier in sgx/main.c. There are no code changes other than the addition of arch_update_sysfs_visibility() and a minor whitespace fixup to arch_node_attr_is_visible() which checkpatch caught. CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-sgx@vger.kernel.org Cc: x86@kernel.org Fixes: 50468e431335 ("x86/sgx: Add an attribute for the amount of SGX memory in a NUMA node") Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Link: https://lkml.kernel.org/r/20220104171527.5E8416A8@davehans-spike.ostc.intel.com
2022-01-07s390/pci: simplify __pciwb_mio() inline asmNiklas Schnelle1-4/+1
The PCI Write Barrier instruction ignores the registers encoded in it. There is thus no need to explicitly set the register to zero or to associate it with a variable at all. In the resulting binary this removes an unnecessary lghi and it makes the code simpler. Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-01-06selftests: cgroup: Test open-time cgroup namespace usage for migration checksTejun Heo1-0/+97
When a task is writing to an fd opened by a different task, the perm check should use the cgroup namespace of the latter task. Add a test for it. Tested-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06selftests: cgroup: Test open-time credential usage for migration checksTejun Heo1-0/+68
When a task is writing to an fd opened by a different task, the perm check should use the credentials of the latter task. Add a test for it. Tested-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644Tejun Heo1-1/+1
0644 is an odd perm to create a cgroup which is a directory. Use the regular 0755 instead. This is necessary for euid switching test case. Reviewed-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06cgroup: Use open-time cgroup namespace for process migration perm checksTejun Heo2-9/+21
cgroup process migration permission checks are performed at write time as whether a given operation is allowed or not is dependent on the content of the write - the PID. This currently uses current's cgroup namespace which is a potential security weakness as it may allow scenarios where a less privileged process tricks a more privileged one into writing into a fd that it created. This patch makes cgroup remember the cgroup namespace at the time of open and uses it for migration permission checks instad of current's. Note that this only applies to cgroup2 as cgroup1 doesn't have namespace support. This also fixes a use-after-free bug on cgroupns reported in https://lore.kernel.org/r/00000000000048c15c05d0083397@google.com Note that backporting this fix also requires the preceding patch. Reported-by: "Eric W. Biederman" <ebiederm@xmission.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Michal Koutný <mkoutny@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Michal Koutný <mkoutny@suse.com> Reported-by: syzbot+50f5cf33a284ce738b62@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/00000000000048c15c05d0083397@google.com Fixes: 5136f6365ce3 ("cgroup: implement "nsdelegate" mount option") Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06cgroup: Allocate cgroup_file_ctx for kernfs_open_file->privTejun Heo3-31/+65
of->priv is currently used by each interface file implementation to store private information. This patch collects the current two private data usages into struct cgroup_file_ctx which is allocated and freed by the common path. This allows generic private data which applies to multiple files, which will be used to in the following patch. Note that cgroup_procs iterator is now embedded as procs.iter in the new cgroup_file_ctx so that it doesn't need to be allocated and freed separately. v2: union dropped from cgroup_file_ctx and the procs iterator is embedded in cgroup_file_ctx as suggested by Linus. v3: Michal pointed out that cgroup1's procs pidlist uses of->priv too. Converted. Didn't change to embedded allocation as cgroup1 pidlists get stored for caching. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Michal Koutný <mkoutny@suse.com>
2022-01-06cgroup: Use open-time credentials for process migraton perm checksTejun Heo2-4/+12
cgroup process migration permission checks are performed at write time as whether a given operation is allowed or not is dependent on the content of the write - the PID. This currently uses current's credentials which is a potential security weakness as it may allow scenarios where a less privileged process tricks a more privileged one into writing into a fd that it created. This patch makes both cgroup2 and cgroup1 process migration interfaces to use the credentials saved at the time of open (file->f_cred) instead of current's. Reported-by: "Eric W. Biederman" <ebiederm@xmission.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Fixes: 187fe84067bd ("cgroup: require write perm on common ancestor when moving processes on the default hierarchy") Reviewed-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2022-01-06i2c: mpc: Avoid out of bounds memory accessChris Packham1-6/+9
When performing an I2C transfer where the last message was a write KASAN would complain: BUG: KASAN: slab-out-of-bounds in mpc_i2c_do_action+0x154/0x630 Read of size 2 at addr c814e310 by task swapper/2/0 CPU: 2 PID: 0 Comm: swapper/2 Tainted: G B 5.16.0-rc8 #1 Call Trace: [e5ee9d50] [c08418e8] dump_stack_lvl+0x4c/0x6c (unreliable) [e5ee9d70] [c02f8a14] print_address_description.constprop.13+0x64/0x3b0 [e5ee9da0] [c02f9030] kasan_report+0x1f0/0x204 [e5ee9de0] [c0c76ee4] mpc_i2c_do_action+0x154/0x630 [e5ee9e30] [c0c782c4] mpc_i2c_isr+0x164/0x240 [e5ee9e60] [c00f3a04] __handle_irq_event_percpu+0xf4/0x3b0 [e5ee9ec0] [c00f3d40] handle_irq_event_percpu+0x80/0x110 [e5ee9f40] [c00f3e48] handle_irq_event+0x78/0xd0 [e5ee9f60] [c00fcfec] handle_fasteoi_irq+0x19c/0x370 [e5ee9fa0] [c00f1d84] generic_handle_irq+0x54/0x80 [e5ee9fc0] [c0006b54] __do_irq+0x64/0x200 [e5ee9ff0] [c0007958] __do_IRQ+0xe8/0x1c0 [c812dd50] [e3eaab20] 0xe3eaab20 [c812dd90] [c0007a4c] do_IRQ+0x1c/0x30 [c812dda0] [c0000c04] ExternalInput+0x144/0x160 --- interrupt: 500 at arch_cpu_idle+0x34/0x60 NIP: c000b684 LR: c000b684 CTR: c0019688 REGS: c812ddb0 TRAP: 0500 Tainted: G B (5.16.0-rc8) MSR: 00029002 <CE,EE,ME> CR: 22000488 XER: 20000000 GPR00: c10ef7fc c812de90 c80ff200 c2394718 00000001 00000001 c10e3f90 00000003 GPR08: 00000000 c0019688 c2394718 fc7d625b 22000484 00000000 21e17000 c208228c GPR16: e3e99284 00000000 ffffffff c2390000 c001bac0 c2082288 c812df60 c001ba60 GPR24: c23949c0 00000018 00080000 00000004 c80ff200 00000002 c2348ee4 c2394718 NIP [c000b684] arch_cpu_idle+0x34/0x60 LR [c000b684] arch_cpu_idle+0x34/0x60 --- interrupt: 500 [c812de90] [c10e3f90] rcu_eqs_enter.isra.60+0xc0/0x110 (unreliable) [c812deb0] [c10ef7fc] default_idle_call+0xbc/0x230 [c812dee0] [c00af0e8] do_idle+0x1c8/0x200 [c812df10] [c00af3c0] cpu_startup_entry+0x20/0x30 [c812df20] [c001e010] start_secondary+0x5d0/0xba0 [c812dff0] [c00028a0] __secondary_start+0x90/0xdc This happened because we would overrun the i2c->msgs array on the final interrupt for the I2C STOP. This didn't happen if the last message was a read because there is no interrupt in that case. Ensure that we only access the current message if we are not processing a I2C STOP condition. Fixes: 1538d82f4647 ("i2c: mpc: Interrupt driven transfer") Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2022-01-05tracing: Tag trace_percpu_buffer as a percpu pointerNaveen N. Rao1-2/+2
Tag trace_percpu_buffer as a percpu pointer to resolve warnings reported by sparse: /linux/kernel/trace/trace.c:3218:46: warning: incorrect type in initializer (different address spaces) /linux/kernel/trace/trace.c:3218:46: expected void const [noderef] __percpu *__vpp_verify /linux/kernel/trace/trace.c:3218:46: got struct trace_buffer_struct * /linux/kernel/trace/trace.c:3234:9: warning: incorrect type in initializer (different address spaces) /linux/kernel/trace/trace.c:3234:9: expected void const [noderef] __percpu *__vpp_verify /linux/kernel/trace/trace.c:3234:9: got int * Link: https://lkml.kernel.org/r/ebabd3f23101d89cb75671b68b6f819f5edc830b.1640255304.git.naveen.n.rao@linux.vnet.ibm.com Cc: stable@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Fixes: 07d777fe8c398 ("tracing: Add percpu buffers for trace_printk()") Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2022-01-05tracing: Fix check for trace_percpu_buffer validity in get_trace_buf()Naveen N. Rao1-1/+1
With the new osnoise tracer, we are seeing the below splat: Kernel attempted to read user page (c7d880000) - exploit attempt? (uid: 0) BUG: Unable to handle kernel data access on read at 0xc7d880000 Faulting instruction address: 0xc0000000002ffa10 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries ... NIP [c0000000002ffa10] __trace_array_vprintk.part.0+0x70/0x2f0 LR [c0000000002ff9fc] __trace_array_vprintk.part.0+0x5c/0x2f0 Call Trace: [c0000008bdd73b80] [c0000000001c49cc] put_prev_task_fair+0x3c/0x60 (unreliable) [c0000008bdd73be0] [c000000000301430] trace_array_printk_buf+0x70/0x90 [c0000008bdd73c00] [c0000000003178b0] trace_sched_switch_callback+0x250/0x290 [c0000008bdd73c90] [c000000000e70d60] __schedule+0x410/0x710 [c0000008bdd73d40] [c000000000e710c0] schedule+0x60/0x130 [c0000008bdd73d70] [c000000000030614] interrupt_exit_user_prepare_main+0x264/0x270 [c0000008bdd73de0] [c000000000030a70] syscall_exit_prepare+0x150/0x180 [c0000008bdd73e10] [c00000000000c174] system_call_vectored_common+0xf4/0x278 osnoise tracer on ppc64le is triggering osnoise_taint() for negative duration in get_int_safe_duration() called from trace_sched_switch_callback()->thread_exit(). The problem though is that the check for a valid trace_percpu_buffer is incorrect in get_trace_buf(). The check is being done after calculating the pointer for the current cpu, rather than on the main percpu pointer. Fix the check to be against trace_percpu_buffer. Link: https://lkml.kernel.org/r/a920e4272e0b0635cf20c444707cbce1b2c8973d.1640255304.git.naveen.n.rao@linux.vnet.ibm.com Cc: stable@vger.kernel.org Fixes: e2ace001176dc9 ("tracing: Choose static tp_printk buffer by explicit nesting count") Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2022-01-05ftrace/samples: Add missing prototypes direct functionsJiri Olsa4-0/+11
There's another compilation fail (first here [1]) reported by kernel test robot for W=1 clang build: >> samples/ftrace/ftrace-direct-multi-modify.c:7:6: warning: no previous prototype for function 'my_direct_func1' [-Wmissing-prototypes] void my_direct_func1(unsigned long ip) Direct functions in ftrace direct sample modules need to have prototypes defined. They are already global in order to be visible for the inline assembly, so there's no problem. The kernel test robot reported just error for ftrace-direct-multi-modify, but I got same errors also for the rest of the modules touched by this patch. [1] 67d4f6e3bf5d ftrace/samples: Add missing prototype for my_direct_func Link: https://lkml.kernel.org/r/20211219135317.212430-1-jolsa@kernel.org Reported-by: kernel test robot <lkp@intel.com> Fixes: e1067a07cfbc ("ftrace/samples: Add module to test multi direct modify interface") Fixes: ae0cc3b7e7f5 ("ftrace/samples: Add a sample module that implements modify_ftrace_direct()") Fixes: 156473a0ff4f ("ftrace: Add another example of register_ftrace_direct() use case") Fixes: b06457c83af6 ("ftrace: Add sample module that uses register_ftrace_direct()") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2022-01-05RDMA/core: Don't infoleak GRH fieldsLeon Romanovsky1-1/+1
If dst->is_global field is not set, the GRH fields are not cleared and the following infoleak is reported. ===================================================== BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:121 [inline] BUG: KMSAN: kernel-infoleak in _copy_to_user+0x1c9/0x270 lib/usercopy.c:33 instrument_copy_to_user include/linux/instrumented.h:121 [inline] _copy_to_user+0x1c9/0x270 lib/usercopy.c:33 copy_to_user include/linux/uaccess.h:209 [inline] ucma_init_qp_attr+0x8c7/0xb10 drivers/infiniband/core/ucma.c:1242 ucma_write+0x637/0x6c0 drivers/infiniband/core/ucma.c:1732 vfs_write+0x8ce/0x2030 fs/read_write.c:588 ksys_write+0x28b/0x510 fs/read_write.c:643 __do_sys_write fs/read_write.c:655 [inline] __se_sys_write fs/read_write.c:652 [inline] __ia32_sys_write+0xdb/0x120 fs/read_write.c:652 do_syscall_32_irqs_on arch/x86/entry/common.c:114 [inline] __do_fast_syscall_32+0x96/0xf0 arch/x86/entry/common.c:180 do_fast_syscall_32+0x34/0x70 arch/x86/entry/common.c:205 do_SYSENTER_32+0x1b/0x20 arch/x86/entry/common.c:248 entry_SYSENTER_compat_after_hwframe+0x4d/0x5c Local variable resp created at: ucma_init_qp_attr+0xa4/0xb10 drivers/infiniband/core/ucma.c:1214 ucma_write+0x637/0x6c0 drivers/infiniband/core/ucma.c:1732 Bytes 40-59 of 144 are uninitialized Memory access of size 144 starts at ffff888167523b00 Data copied to user address 0000000020000100 CPU: 1 PID: 25910 Comm: syz-executor.1 Not tainted 5.16.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 ===================================================== Fixes: 4ba66093bdc6 ("IB/core: Check for global flag when using ah_attr") Link: https://lore.kernel.org/r/0e9dd51f93410b7b2f4f5562f52befc878b71afa.1641298868.git.leonro@nvidia.com Reported-by: syzbot+6d532fa8f9463da290bc@syzkaller.appspotmail.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-01-05selftests: set amt.sh executableTaehee Yoo1-0/+0
amt.sh test script will not work because it doesn't have execution permission. So, it adds execution permission. Reported-by: Hangbin Liu <liuhangbin@gmail.com> Fixes: c08e8baea78e ("selftests: add amt interface selftest script") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://lore.kernel.org/r/20220105144436.13415-1-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-05RDMA/uverbs: Check for null return of kmalloc_arrayJiasheng Jiang1-0/+3
Because of the possible failure of the allocation, data might be NULL pointer and will cause the dereference of the NULL pointer later. Therefore, it might be better to check it and return -ENOMEM. Fixes: 6884c6c4bd09 ("RDMA/verbs: Store the write/write_ex uapi entry points in the uverbs_api") Link: https://lore.kernel.org/r/20211231093315.1917667-1-jiasheng@iscas.ac.cn Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-01-05Revert "net: usb: r8152: Add MAC passthrough support for more Lenovo Docks"Aaron Ma1-3/+6
This reverts commit f77b83b5bbab53d2be339184838b19ed2c62c0a5. This change breaks multiple usb to ethernet dongles attached on Lenovo USB hub. Fixes: f77b83b5bbab ("net: usb: r8152: Add MAC passthrough support for more Lenovo Docks") Signed-off-by: Aaron Ma <aaron.ma@canonical.com> Link: https://lore.kernel.org/r/20220105155102.8557-1-aaron.ma@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-05arm64: Use correct method to calculate nomap region boundariesHuacai Chen1-2/+4
Nomap regions are treated as "reserved". When region boundaries are not page aligned, we usually increase the "reserved" regions rather than decrease them. So, we should use memblock_region_reserved_base_pfn()/ memblock_region_reserved_end_pfn() instead of memblock_region_memory_ base_pfn()/memblock_region_memory_base_pfn() to calculate boundaries. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://lore.kernel.org/r/20211022070646.41923-1-chenhuacai@loongson.cn Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-01-05Revert "RDMA/mlx5: Fix releasing unallocated memory in dereg MR flow"Maor Gottlieb2-15/+17
This patch is not the full fix and still causes to call traces during mlx5_ib_dereg_mr(). This reverts commit f0ae4afe3d35e67db042c58a52909e06262b740f. Fixes: f0ae4afe3d35 ("RDMA/mlx5: Fix releasing unallocated memory in dereg MR flow") Link: https://lore.kernel.org/r/20211222101312.1358616-1-maorg@nvidia.com Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Acked-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-01-05arm64: Drop outdated links in commentsKees Cook1-4/+0
As started by commit 05a5f51ca566 ("Documentation: Replace lkml.org links with lore"), an effort was made to replace lkml.org links with lore to better use a single source that's more likely to stay available long-term. However, it seems these links don't offer much value here, so just remove them entirely. Cc: Joe Perches <joe@perches.com> Suggested-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/lkml/20210211100213.GA29813@willie-the-truck/ Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20211215191835.1420010-1-keescook@chromium.org [catalin.marinas@arm.com: removed the arch/arm changes] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-01-04sfc: The RX page_ring is optionalMartin Habets2-0/+10
The RX page_ring is an optional feature that improves performance. When allocation fails the driver can still function, but possibly with a lower bandwidth. Guard against dereferencing a NULL page_ring. Fixes: 2768935a4660 ("sfc: reuse pages to avoid DMA mapping/unmapping costs") Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com> Reported-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Link: https://lore.kernel.org/r/164111288276.5798.10330502993729113868.stgit@palantir17.mph.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-04iavf: Fix limit of total number of queues to active queues of VFKaren Sornek1-1/+4
In the absence of this validation, if the user requests to configure queues more than the enabled queues, it results in sending the requested number of queues to the kernel stack (due to the asynchronous nature of VF response), in which case the stack might pick a queue to transmit that is not enabled and result in Tx hang. Fix this bug by limiting the total number of queues allocated for VF to active queues of VF. Fixes: d5b33d024496 ("i40evf: add ndo_setup_tc callback to i40evf") Signed-off-by: Ashwin Vijayavel <ashwin.vijayavel@intel.com> Signed-off-by: Karen Sornek <karen.sornek@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-04i40e: Fix incorrect netdev's real number of RX/TX queuesJedrzej Jagielski1-7/+25
There was a wrong queues representation in sysfs during driver's reinitialization in case of online cpus number is less than combined queues. It was caused by stopped NetworkManager, which is responsible for calling vsi_open function during driver's initialization. In specific situation (ex. 12 cpus online) there were 16 queues in /sys/class/net/<iface>/queues. In case of modifying queues with value higher, than number of online cpus, then it caused write errors and other errors. Add updating of sysfs's queues representation during driver initialization. Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Lukasz Cieplicki <lukaszx.cieplicki@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-04i40e: Fix for displaying message regarding NVM versionMateusz Palczewski1-2/+2
When loading the i40e driver, it prints a message like: 'The driver for the device detected a newer version of the NVM image v1.x than expected v1.y. Please install the most recent version of the network driver.' This is misleading as the driver is working as expected. Fix that by removing the second part of message and changing it from dev_info to dev_dbg. Fixes: 4fb29bddb57f ("i40e: The driver now prints the API version in error message") Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-04i40e: fix use-after-free in i40e_sync_filters_subtask()Di Zhu1-0/+24
Using ifconfig command to delete the ipv6 address will cause the i40e network card driver to delete its internal mac_filter and i40e_service_task kernel thread will concurrently access the mac_filter. These two processes are not protected by lock so causing the following use-after-free problems. print_address_description+0x70/0x360 ? vprintk_func+0x5e/0xf0 kasan_report+0x1b2/0x330 i40e_sync_vsi_filters+0x4f0/0x1850 [i40e] i40e_sync_filters_subtask+0xe3/0x130 [i40e] i40e_service_task+0x195/0x24c0 [i40e] process_one_work+0x3f5/0x7d0 worker_thread+0x61/0x6c0 ? process_one_work+0x7d0/0x7d0 kthread+0x1c3/0x1f0 ? kthread_park+0xc0/0xc0 ret_from_fork+0x35/0x40 Allocated by task 2279810: kasan_kmalloc+0xa0/0xd0 kmem_cache_alloc_trace+0xf3/0x1e0 i40e_add_filter+0x127/0x2b0 [i40e] i40e_add_mac_filter+0x156/0x190 [i40e] i40e_addr_sync+0x2d/0x40 [i40e] __hw_addr_sync_dev+0x154/0x210 i40e_set_rx_mode+0x6d/0xf0 [i40e] __dev_set_rx_mode+0xfb/0x1f0 __dev_mc_add+0x6c/0x90 igmp6_group_added+0x214/0x230 __ipv6_dev_mc_inc+0x338/0x4f0 addrconf_join_solict.part.7+0xa2/0xd0 addrconf_dad_work+0x500/0x980 process_one_work+0x3f5/0x7d0 worker_thread+0x61/0x6c0 kthread+0x1c3/0x1f0 ret_from_fork+0x35/0x40 Freed by task 2547073: __kasan_slab_free+0x130/0x180 kfree+0x90/0x1b0 __i40e_del_filter+0xa3/0xf0 [i40e] i40e_del_mac_filter+0xf3/0x130 [i40e] i40e_addr_unsync+0x85/0xa0 [i40e] __hw_addr_sync_dev+0x9d/0x210 i40e_set_rx_mode+0x6d/0xf0 [i40e] __dev_set_rx_mode+0xfb/0x1f0 __dev_mc_del+0x69/0x80 igmp6_group_dropped+0x279/0x510 __ipv6_dev_mc_dec+0x174/0x220 addrconf_leave_solict.part.8+0xa2/0xd0 __ipv6_ifa_notify+0x4cd/0x570 ipv6_ifa_notify+0x58/0x80 ipv6_del_addr+0x259/0x4a0 inet6_addr_del+0x188/0x260 addrconf_del_ifaddr+0xcc/0x130 inet6_ioctl+0x152/0x190 sock_do_ioctl+0xd8/0x2b0 sock_ioctl+0x2e5/0x4c0 do_vfs_ioctl+0x14e/0xa80 ksys_ioctl+0x7c/0xa0 __x64_sys_ioctl+0x42/0x50 do_syscall_64+0x98/0x2c0 entry_SYSCALL_64_after_hwframe+0x65/0xca Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Di Zhu <zhudi2@huawei.com> Signed-off-by: Rui Zhang <zhangrui182@huawei.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-04i40e: Fix to not show opcode msg on unsuccessful VF MAC changeMateusz Palczewski1-8/+32
Hide i40e opcode information sent during response to VF in case when untrusted VF tried to change MAC on the VF interface. This is implemented by adding an additional parameter 'hide' to the response sent to VF function that hides the display of error information, but forwards the error code to VF. Previously it was not possible to send response with some error code to VF without displaying opcode information. Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface") Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Reviewed-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-04ieee802154: atusb: fix uninit value in atusb_set_extended_addrPavel Skripkin1-4/+6
Alexander reported a use of uninitialized value in atusb_set_extended_addr(), that is caused by reading 0 bytes via usb_control_msg(). Fix it by validating if the number of bytes transferred is actually correct, since usb_control_msg() may read less bytes, than was requested by caller. Fail log: BUG: KASAN: uninit-cmp in ieee802154_is_valid_extended_unicast_addr include/linux/ieee802154.h:310 [inline] BUG: KASAN: uninit-cmp in atusb_set_extended_addr drivers/net/ieee802154/atusb.c:1000 [inline] BUG: KASAN: uninit-cmp in atusb_probe.cold+0x29f/0x14db drivers/net/ieee802154/atusb.c:1056 Uninit value used in comparison: 311daa649a2003bd stack handle: 000000009a2003bd ieee802154_is_valid_extended_unicast_addr include/linux/ieee802154.h:310 [inline] atusb_set_extended_addr drivers/net/ieee802154/atusb.c:1000 [inline] atusb_probe.cold+0x29f/0x14db drivers/net/ieee802154/atusb.c:1056 usb_probe_interface+0x314/0x7f0 drivers/usb/core/driver.c:396 Fixes: 7490b008d123 ("ieee802154: add support for atusb transceiver") Reported-by: Alexander Potapenko <glider@google.com> Acked-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Link: https://lore.kernel.org/r/20220104182806.7188-1-paskripkin@gmail.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2022-01-04EDAC/i10nm: Release mdev/mbase when failing to detect HBMQiuxu Zhuo1-0/+9
On systems without HBM (High Bandwidth Memory) mdev/mbase are not released/unmapped. Add the code to release mdev/mbase when failing to detect HBM. [Tony: re-word commit message] Cc: <stable@vger.kernel.org> Fixes: c945088384d0 ("EDAC/i10nm: Add support for high bandwidth memory") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20211224091126.1246-1-qiuxu.zhuo@intel.com
2022-01-04arm64: perf: Don't register user access sysctl handler multiple timesWill Deacon1-2/+9
Commit e2012600810c ("arm64: perf: Add userspace counter access disable switch") introduced a new 'perf_user_access' sysctl file to enable and disable direct userspace access to the PMU counters. Sadly, Geert reports that on his big.LITTLE SoC ('Renesas Salvator-XS w/ R-Car H3'), the file is created for each PMU type probed, resulting in a splat during boot: | hw perfevents: enabled with armv8_cortex_a53 PMU driver, 7 counters available | sysctl duplicate entry: /kernel//perf_user_access | CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.16.0-rc3-arm64-renesas-00003-ge2012600810c #1420 | Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT) | Call trace: | dump_backtrace+0x0/0x190 | show_stack+0x14/0x20 | dump_stack_lvl+0x88/0xb0 | dump_stack+0x14/0x2c | __register_sysctl_table+0x384/0x818 | register_sysctl+0x20/0x28 | armv8_pmu_init.constprop.0+0x118/0x150 | armv8_a57_pmu_init+0x1c/0x28 | arm_pmu_device_probe+0x1b4/0x558 | armv8_pmu_device_probe+0x18/0x20 | platform_probe+0x64/0xd0 | hw perfevents: enabled with armv8_cortex_a57 PMU driver, 7 counters available Introduce a state variable to track creation of the sysctl file and ensure that it is only created once. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Fixes: e2012600810c ("arm64: perf: Add userspace counter access disable switch") Link: https://lore.kernel.org/r/CAMuHMdVcDxR9sGzc5pcnORiotonERBgc6dsXZXMd6wTvLGA9iw@mail.gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2022-01-04RDMA/rxe: Prevent double freeing rxe_map_set()Li Zhijian1-9/+7
The same rxe_map_set could be freed twice: rxe_reg_user_mr() -> rxe_mr_init_user() -> rxe_mr_free_map_set() # 1st -> rxe_drop_ref() ... -> rxe_mr_cleanup() -> rxe_mr_free_map_set() # 2nd Follow normal convection and put resource cleanup either in the error unwind of the allocator, or the overall free function. Leave the object unchanged with a NULL cur_map_set on failure and remove the unncessary free in rxe_mr_init_user(). Link: https://lore.kernel.org/r/20211228014406.1033444-1-lizhijian@cn.fujitsu.com Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>