aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/exported-sql-viewer.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2020-09-03tools/power turbostat: Build with _FILE_OFFSET_BITS=64Alexander Monakov1-0/+1
For compatibility reasons, Glibc off_t is a 32-bit type on 32-bit x86 unless _FILE_OFFSET_BITS=64 is defined. Add this define, as otherwise reading MSRs with index 0x80000000 and above attempts a pread with a negative offset, which fails. Signed-off-by: Alexander Monakov <amonakov@ispras.ru> Tested-by: Liwei Song <liwei.song@windriver.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03tools/power turbostat: Support AMD Family 19hKim Phillips1-23/+11
Family 19h processors have the same RAPL (Running average power limit) hardware register interface as Family 17h processors. Change the family checks to succeed for Family 17h and above to enable core and package energy measurement on Family 19h machines. Also update the TDP to the largest found at the bottom of the page at amd.com->processors->servers->epyc->2nd-gen-epyc, i.e., the EPYC 7H12. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Cc: Len Brown <len.brown@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03tools/power turbostat: Remove empty columns for JacobsvilleAntti Laakso1-3/+30
Jacobsville doesn't have Package C2 and C6. Also Core and DRAM RAPL are not available. Adjust output accordingly. Signed-off-by: Antti Laakso <antti.laakso@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03tools/power turbostat: Add a new GFXAMHz column that exposes gt_act_freq_mhz.Rafael Antognolli1-0/+50
The column already present called GFXMHz reads from gt_cur_freq_mhz, which represents the GT frequency that was requested, but power management might not be able to do that. So the new column will display what the actual frequency GT is running at. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03tools/power x86_energy_perf_policy: Input/output error in a VMOndřej Lysoněk1-13/+54
I've encountered an issue with x86_energy_perf_policy. If I run it on a machine that I'm told is a qemu-kvm virtual machine running inside a privileged container, I get the following error: x86_energy_perf_policy: /dev/cpu/0/msr offset 0x1ad read failed: Input/output error I get the same error in a Digital Ocean droplet, so that might be a similar environment. I created the following patch which is intended to give a more user-friendly message. It's based on a patch for turbostat from Prarit Bhargava that was posted some time ago. The patch is "[v2] turbostat: Running on virtual machine is not supported" [1]. Given my limited knowledge of the topic, I can't say with confidence that this is the right solution, though (that's why this is not an official patch submission). Also, I'm not sure what the convention with exit codes is in this tool. Also, instead of the error message, perhaps the tool should just not print anything in this case, which is how it behaves in a "regular" VM? [1] https://patchwork.kernel.org/patch/9868587/ Signed-off-by: Ondřej Lysoněk <olysonek@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03tools/power turbostat: Skip pc8, pc9, pc10 columns, if they are disabledLen Brown1-3/+6
Like we skip PC3 and PC6 columns when the package C-state limit disables them, skip PC8/PC9/CP10 under analogous conditions. Reported-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03tools/power turbostat: Support additional CPU model numbersLen Brown1-0/+4
Initial support for models recently added to intel-family.h. Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03tools/power turbostat: Fix output formatting for ACPI CST enumerationDavid Arcari1-0/+20
turbostat formatting is broken with ACPI CST for enumeration. The problem is that the CX_ACPI% is eight characters long which does not work with tab formatting. One simple solution is to remove the underbar from the state name such that C1_ACPI will be displayed as C1ACPI. Signed-off-by: David Arcari <darcari@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03tools/power turbostat: Replace HTTP links with HTTPS ones: TURBOSTAT UTILITYAlexander A. Klimov1-1/+1
Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`: If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03tools/power turbostat: Use sched_getcpu() instead of hardcoded cpu 0Prarit Bhargava1-3/+10
Disabling cpu 0 results in an error turbostat: /sys/devices/system/cpu/cpu0/topology/thread_siblings: open failed: No such file or directory Use sched_getcpu() instead of a hardcoded cpu 0 to get the max cpu number. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03tools/power turbostat: Enable accumulate RAPL displayChen Yu1-22/+16
Enable the accumulated RAPL display by default. Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03tools/power turbostat: Introduce functions to accumulate RAPL consumptionChen Yu2-7/+219
Since the RAPL Joule Counter is 32 bit, turbostat would only print a *star* instead of printing the actual energy consumed to indicate the overflow due to long duration. This does not meet the requirement from servers as the sampling time of turbostat is usually very long on servers. So maintain a set of MSR buffer, and update them periodically before the 32bit MSR register is wrapped round, so as to avoid the overflow. The idea is similar to the implementation of ktime_get(): Periodical MSR timer: total_rapl_sum += (current_rapl_msr - last_rapl_msr); Using get_msr_sum() to get the accumulated RAPL: return (current_rapl_msr - last_rapl_msr) + total_rapl_sum; The accumulated RAPL mechanism will be turned on in next patch. Originally-by: Aaron Lu <aaron.lwe@gmail.com> Reviewed-by: Doug Smythies <dsmythies@telus.net> Tested-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03tools/power turbostat: Make the energy variable to be 64 bitChen Yu1-17/+13
Change the energy variable from 32bit to 64bit, so that it can record long time duration. After this conversion, adjust the DELTA_WRAP32() accordingly. Signed-off-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03tools/power turbostat: Always print idle in the system configuration headerDoug Smythies1-3/+0
If the --quiet option is not used, turbostat prints a useful system configuration header during startup. But inclusion of idle system configuration information in this header is currently a function of inclusion in the columns chosen to be displayed. Always list this idle system configuration. Signed-off-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Len Brown <len.brown@intel.com>
2020-09-03tools/power turbostat: Print /dev/cpu_dma_latencyLen Brown1-0/+28
Users are puzzled when they use tuned performance and all their C-states vanish. Dump /dev/cpu_dma_latency and state whether the value is default, or constraining, to explain this situation. Signed-off-by: Len Brown <len.brown@intel.com>
2020-07-25x86/cpu: Add Lakefield, Alder Lake and Rocket Lake models to the to Intel CPU familyTony Luck1-0/+7
Add three new Intel CPU models. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200721043749.31567-1-tony.luck@intel.com
2020-07-19Linux 5.8-rc6Linus Torvalds1-1/+1
2020-07-19x86/boot: Don't add the EFI stub to targetsArvind Sankar1-2/+2
vmlinux-objs-y is added to targets, which currently means that the EFI stub gets added to the targets as well. It shouldn't be added since it is built elsewhere. This confuses Makefile.build which interprets the EFI stub as a target $(obj)/$(objtree)/drivers/firmware/efi/libstub/lib.a and will create drivers/firmware/efi/libstub/ underneath arch/x86/boot/compressed, to hold this supposed target, if building out-of-tree. [0] Fix this by pulling the stub out of vmlinux-objs-y into efi-obj-y. [0] See scripts/Makefile.build near the end: # Create directories for object files if they do not exist Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lkml.kernel.org/r/20200715032631.1562882-1-nivedita@alum.mit.edu
2020-07-19x86/entry: Actually disable stack protectorKees Cook1-3/+11
Some builds of GCC enable stack protector by default. Simply removing the arguments is not sufficient to disable stack protector, as the stack protector for those GCC builds must be explicitly disabled. Remove the argument removals and add -fno-stack-protector. Additionally include missed x32 argument updates, and adjust whitespace for readability. Fixes: 20355e5f73a7 ("x86/entry: Exclude low level entry code from sanitizing") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/202006261333.585319CA6B@keescook
2020-07-18hwmon: (drivetemp) Avoid SCT usage on Toshiba DT01ACA family drivesMaciej S. Szmigiero1-0/+43
It has been observed that Toshiba DT01ACA family drives have WRITE FPDMA QUEUED command timeouts and sometimes just freeze until power-cycled under heavy write loads when their temperature is getting polled in SCT mode. The SMART mode seems to be fine, though. Let's make sure we don't use SCT mode for these drives then. While only the 3 TB model was actually caught exhibiting the problem let's play safe here to avoid data corruption and extend the ban to the whole family. Fixes: 5b46903d8bf3 ("hwmon: Driver for disk and solid state drives with temperature sensors") Cc: stable@vger.kernel.org Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Link: https://lore.kernel.org/r/0cb2e7022b66c6d21d3f189a12a97878d0e7511b.1595075458.git.mail@maciej.szmigiero.name Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2020-07-18x86/ioperm: Fix io bitmap invalidation on Xen PVAndy Lutomirski6-17/+38
tss_invalidate_io_bitmap() wasn't wired up properly through the pvop machinery, so the TSS and Xen's io bitmap would get out of sync whenever disabling a valid io bitmap. Add a new pvop for tss_invalidate_io_bitmap() to fix it. This is XSA-329. Fixes: 22fe5b0439dd ("x86/ioperm: Move TSS bitmap update to exit to user work") Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/d53075590e1f91c19f8af705059d3ff99424c020.1595030016.git.luto@kernel.org
2020-07-17genirq/affinity: Handle affinity setting on inactive interrupts correctlyThomas Gleixner2-19/+40
Setting interrupt affinity on inactive interrupts is inconsistent when hierarchical irq domains are enabled. The core code should just store the affinity and not call into the irq chip driver for inactive interrupts because the chip drivers may not be in a state to handle such requests. X86 has a hacky workaround for that but all other irq chips have not which causes problems e.g. on GIC V3 ITS. Instead of adding more ugly hacks all over the place, solve the problem in the core code. If the affinity is set on an inactive interrupt then: - Store it in the irq descriptors affinity mask - Update the effective affinity to reflect that so user space has a consistent view - Don't call into the irq chip driver This is the core equivalent of the X86 workaround and works correctly because the affinity setting is established in the irq chip when the interrupt is activated later on. Note, that this is only effective when hierarchical irq domains are enabled by the architecture. Doing it unconditionally would break legacy irq chip implementations. For hierarchial irq domains this works correctly as none of the drivers can have a dependency on affinity setting in inactive state by design. Remove the X86 workaround as it is not longer required. Fixes: 02edee152d6e ("x86/apic/vector: Ignore set_affinity call for inactive interrupts") Reported-by: Ali Saidi <alisaidi@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Ali Saidi <alisaidi@amazon.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200529015501.15771-1-alisaidi@amazon.com Link: https://lkml.kernel.org/r/877dv2rv25.fsf@nanos.tec.linutronix.de
2020-07-17timer: Fix wheel index calculation on last levelFrederic Weisbecker1-2/+2
When an expiration delta falls into the last level of the wheel, that delta has be compared against the maximum possible delay and reduced to fit in if necessary. However instead of comparing the delta against the maximum, the code compares the actual expiry against the maximum. Then instead of fixing the delta to fit in, it sets the maximum delta as the expiry value. This can result in various undesired outcomes, the worst possible one being a timer expiring 15 days ahead to fire immediately. Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel") Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200717140551.29076-2-frederic@kernel.org
2020-07-17SUNRPC reverting d03727b248d0 ("NFSv4 fix CLOSE not waiting for direct IO compeletion")Olga Kornievskaia2-10/+4
Reverting commit d03727b248d0 "NFSv4 fix CLOSE not waiting for direct IO compeletion". This patch made it so that fput() by calling inode_dio_done() in nfs_file_release() would wait uninterruptably for any outstanding directIO to the file (but that wait on IO should be killable). The problem the patch was also trying to address was REMOVE returning ERR_ACCESS because the file is still opened, is supposed to be resolved by server returning ERR_FILE_OPEN and not ERR_ACCESS. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-07-17RISC-V: Upgrade smp_mb__after_spinlock() to iorw,iorwPalmer Dabbelt1-1/+9
While digging through the recent mmiowb preemption issue it came up that we aren't actually preventing IO from crossing a scheduling boundary. While it's a bit ugly to overload smp_mb__after_spinlock() with this behavior, it's what PowerPC is doing so there's some precedent. Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-07-17tools arch kvm: Sync kvm headers with the kernel sourcesArnaldo Carvalho de Melo1-2/+3
To pick up the changes from: 83d31e5271ac ("KVM: nVMX: fixes for preemption timer migration") That don't entail changes in tooling. This silences these tools/perf build warnings: Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h' diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-17perf tools: Sync hashmap.h with libbpf'sArnaldo Carvalho de Melo1-4/+8
To pick up the changes in: b2f9f1535bb9 ("libbpf: Fix libbpf hashmap on (I)LP32 architectures") Silencing this warning: Warning: Kernel ABI header at 'tools/perf/util/hashmap.h' differs from latest version at 'tools/lib/bpf/hashmap.h' diff -u tools/perf/util/hashmap.h tools/lib/bpf/hashmap.h I'll eventually update the warning to remove the "Kernel ABI" part and instead state libbpf when noticing that the original is at "tools/lib/something". Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andriin@fb.com> Cc: Jakub Bogusz <qboosh@pld-linux.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-17libsubcmd: Fix OPT_CALLBACK_SET()Ravi Bangoria1-0/+3
Any option macro with _SET suffix should set opt->set variable which is not happening for OPT_CALLBACK_SET(). This is causing issues with perf record --switch-output-event. Fix that. Before: # ./perf record --overwrite -e sched:*switch,syscalls:sys_enter_mmap \ --switch-output-event syscalls:sys_enter_mmap ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.297 MB perf.data (657 samples) ] After: $ ./perf record --overwrite -e sched:*switch,syscalls:sys_enter_mmap \ --switch-output-event syscalls:sys_enter_mmap [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2020061918144542 ] [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2020061918144608 ] [ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2020061918144660 ] ^C[ perf record: dump data: Woken up 1 times ] [ perf record: Dump perf.data.2020061918144784 ] [ perf record: Woken up 0 times to write data ] [ perf record: Dump perf.data.2020061918144803 ] [ perf record: Captured and wrote 0.419 MB perf.data.<timestamp> ] Fixes: 636eb4d001b1 ("libsubcmd: Introduce OPT_CALLBACK_SET()") Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20200619133412.50705-1-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-17drivers/perf: Prevent forced unbinding of PMU driversQi Liu13-0/+13
Forcefully unbinding PMU drivers during perf sampling will lead to a kernel panic, because the perf upper-layer framework call a NULL pointer in this situation. To solve this issue, "suppress_bind_attrs" should be set to true, so that bind/unbind can be disabled via sysfs and prevent unbinding PMU drivers during perf sampling. Signed-off-by: Qi Liu <liuqi115@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1594975763-32966-1-git-send-email-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2020-07-17asm-generic/mmiowb: Allow mmiowb_set_pending() when preemptible()Will Deacon1-2/+4
Although mmiowb() is concerned only with serialising MMIO writes occuring in contexts where a spinlock is held, the call to mmiowb_set_pending() from the MMIO write accessors can occur in preemptible contexts, such as during driver probe() functions where ordering between CPUs is not usually a concern, assuming that the task migration path provides the necessary ordering guarantees. Unfortunately, the default implementation of mmiowb_set_pending() is not preempt-safe, as it makes use of a a per-cpu variable to track its internal state. This has been reported to generate the following splat on riscv: | BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1 | caller is regmap_mmio_write32le+0x1c/0x46 | CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-hfu+ #1 | Call Trace: | walk_stackframe+0x0/0x7a | dump_stack+0x6e/0x88 | regmap_mmio_write32le+0x18/0x46 | check_preemption_disabled+0xa4/0xaa | regmap_mmio_write32le+0x18/0x46 | regmap_mmio_write+0x26/0x44 | regmap_write+0x28/0x48 | sifive_gpio_probe+0xc0/0x1da Although it's possible to fix the driver in this case, other splats have been seen from other drivers, including the infamous 8250 UART, and so it's better to address this problem in the mmiowb core itself. Fix mmiowb_set_pending() by using the raw_cpu_ptr() to get at the mmiowb state and then only updating the 'mmiowb_pending' field if we are not preemptible (i.e. we have a non-zero nesting count). Cc: Arnd Bergmann <arnd@arndb.de> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Guo Ren <guoren@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Reported-by: Palmer Dabbelt <palmer@dabbelt.com> Reported-by: Emil Renner Berthing <kernel@esmil.dk> Tested-by: Emil Renner Berthing <kernel@esmil.dk> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> Link: https://lore.kernel.org/r/20200716112816.7356-1-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16sched/fair: handle case of task_h_load() returning 0Vincent Guittot1-2/+13
task_h_load() can return 0 in some situations like running stress-ng mmapfork, which forks thousands of threads, in a sched group on a 224 cores system. The load balance doesn't handle this correctly because env->imbalance never decreases and it will stop pulling tasks only after reaching loop_max, which can be equal to the number of running tasks of the cfs. Make sure that imbalance will be decreased by at least 1. misfit task is the other feature that doesn't handle correctly such situation although it's probably more difficult to face the problem because of the smaller number of CPUs and running tasks on heterogenous system. We can't simply ensure that task_h_load() returns at least one because it would imply to handle underflow in other places. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: <stable@vger.kernel.org> # v4.4+ Link: https://lkml.kernel.org/r/20200710152426.16981-1-vincent.guittot@linaro.org
2020-07-16regmap: debugfs: Don't sleep while atomic for fast_io regmapsDouglas Anderson1-23/+29
If a regmap has "fast_io" set then its lock function uses a spinlock. That doesn't work so well with the functions: * regmap_cache_only_write_file() * regmap_cache_bypass_write_file() Both of the above functions have the pattern: 1. Lock the regmap. 2. Call: debugfs_write_file_bool() copy_from_user() __might_fault() __might_sleep() Let's reorder things a bit so that we do all of our sleepable functions before we grab the lock. Fixes: d3dc5430d68f ("regmap: debugfs: Allow writes to cache state settings") Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20200715164611.1.I35b3533e8a80efde0cec1cc70f71e1e74b2fa0da@changeid Signed-off-by: Mark Brown <broonie@kernel.org>
2020-07-16x86: math-emu: Fix up 'cmp' insn for clang iasArnd Bergmann1-1/+1
The clang integrated assembler requires the 'cmp' instruction to have a length prefix here: arch/x86/math-emu/wm_sqrt.S:212:2: error: ambiguous instructions require an explicit suffix (could be 'cmpb', 'cmpw', or 'cmpl') cmp $0xffffffff,-24(%ebp) ^ Make this a 32-bit comparison, which it was clearly meant to be. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lkml.kernel.org/r/20200527135352.1198078-1-arnd@arndb.de
2020-07-16x86/entry: Fix vectors to IDTENTRY_SYSVEC for CONFIG_HYPERVSedat Dilek1-2/+2
When assembling with Clang via `make LLVM_IAS=1` and CONFIG_HYPERV enabled, we observe the following error: <instantiation>:9:6: error: expected absolute expression .if HYPERVISOR_REENLIGHTENMENT_VECTOR == 3 ^ <instantiation>:1:1: note: while in macro instantiation idtentry HYPERVISOR_REENLIGHTENMENT_VECTOR asm_sysvec_hyperv_reenlightenment sysvec_hyperv_reenlightenment has_error_code=0 ^ ./arch/x86/include/asm/idtentry.h:627:1: note: while in macro instantiation idtentry_sysvec HYPERVISOR_REENLIGHTENMENT_VECTOR sysvec_hyperv_reenlightenment; ^ <instantiation>:9:6: error: expected absolute expression .if HYPERVISOR_STIMER0_VECTOR == 3 ^ <instantiation>:1:1: note: while in macro instantiation idtentry HYPERVISOR_STIMER0_VECTOR asm_sysvec_hyperv_stimer0 sysvec_hyperv_stimer0 has_error_code=0 ^ ./arch/x86/include/asm/idtentry.h:628:1: note: while in macro instantiation idtentry_sysvec HYPERVISOR_STIMER0_VECTOR sysvec_hyperv_stimer0; This is caused by typos in arch/x86/include/asm/idtentry.h: HYPERVISOR_REENLIGHTENMENT_VECTOR -> HYPERV_REENLIGHTENMENT_VECTOR HYPERVISOR_STIMER0_VECTOR -> HYPERV_STIMER0_VECTOR For more details see ClangBuiltLinux issue #1088. Fixes: a16be368dd3f ("x86/entry: Convert various hypervisor vectors to IDTENTRY_SYSVEC") Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Wei Liu <wei.liu@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://github.com/ClangBuiltLinux/linux/issues/1088 Link: https://github.com/ClangBuiltLinux/linux/issues/1043 Link: https://lore.kernel.org/patchwork/patch/1272115/ Link: https://lkml.kernel.org/r/20200714194740.4548-1-sedat.dilek@gmail.com
2020-07-16x86/entry: Add compatibility with IASJian Cai1-8/+6
Clang's integrated assembler does not allow symbols with non-absolute values to be reassigned. Modify the interrupt entry loop macro to be compatible with IAS by using a label and an offset. Reported-by: Nick Desaulniers <ndesaulniers@google.com> Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Suggested-by: Brian Gerst <brgerst@gmail.com> Suggested-by: Arvind Sankar <nivedita@alum.mit.edu> Signed-off-by: Jian Cai <caij2003@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # Link: https://github.com/ClangBuiltLinux/linux/issues/1043 Link: https://lkml.kernel.org/r/20200714233024.1789985-1-caij2003@gmail.com
2020-07-16nvme: explicitly update mpath disk capacity on revalidationAnthony Iliopoulos2-0/+14
Commit 3b4b19721ec652 ("nvme: fix possible deadlock when I/O is blocked") reverted multipath head disk revalidation due to deadlocks caused by holding the bd_mutex during revalidate. Updating the multipath disk blockdev size is still required though for userspace to be able to observe any resizing while the device is mounted. Directly update the bdev inode size to avoid unnecessarily holding the bdev->bd_mutex. Fixes: 3b4b19721ec652 ("nvme: fix possible deadlock when I/O is blocked") Signed-off-by: Anthony Iliopoulos <ailiop@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-16arm64: Use test_tsk_thread_flag() for checking TIF_SINGLESTEPWill Deacon1-2/+2
Rather than open-code test_tsk_thread_flag() at each callsite, simply replace the couple of offenders with calls to test_tsk_thread_flag() directly. Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16arm64: ptrace: Use NO_SYSCALL instead of -1 in syscall_trace_enter()Will Deacon1-2/+2
Setting a system call number of -1 is special, as it indicates that the current system call should be skipped. Use NO_SYSCALL instead of -1 when checking for this scenario, which is different from the -1 returned due to a seccomp failure. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Keno Fischer <keno@juliacomputing.com> Cc: Luis Machado <luis.machado@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16arm64: syscall: Expand the comment about ptrace and syscall(-1)Will Deacon1-1/+15
If a task executes syscall(-1), we intercept this early and force x0 to be -ENOSYS so that we don't need to distinguish this scenario from one where the scno is -1 because a tracer wants to skip the system call using ptrace. With the return value set, the return path is the same as the skip case. Although there is a one-line comment noting this in el0_svc_common(), it misses out most of the detail. Expand the comment to describe a bit more about what is going on. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Keno Fischer <keno@juliacomputing.com> Cc: Luis Machado <luis.machado@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16arm64: ptrace: Add a comment describing our syscall entry/exit trap ABIWill Deacon1-2/+14
Our tracehook logic for syscall entry/exit raises a SIGTRAP back to the tracer following a ptrace request such as PTRACE_SYSCALL. As part of this procedure, we clobber the reported value of one of the tracee's general purpose registers (x7 for native tasks, r12 for compat) to indicate whether the stop occurred on syscall entry or exit. This is a slightly unfortunate ABI, as it prevents the tracer from accessing the real register value and is at odds with other similar stops such as seccomp traps. Since we're stuck with this ABI, expand the comment in our tracehook logic to acknowledge the issue and describe the behaviour in more detail. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Luis Machado <luis.machado@linaro.org> Reported-by: Keno Fischer <keno@juliacomputing.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16arm64: compat: Ensure upper 32 bits of x0 are zero on syscall returnWill Deacon2-1/+14
Although we zero the upper bits of x0 on entry to the kernel from an AArch32 task, we do not clear them on the exception return path and can therefore expose 64-bit sign extended syscall return values to userspace via interfaces such as the 'perf_regs' ABI, which deal exclusively with 64-bit registers. Explicitly clear the upper 32 bits of x0 on return from a compat system call. Cc: <stable@vger.kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Keno Fischer <keno@juliacomputing.com> Cc: Luis Machado <luis.machado@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16arm64: ptrace: Override SPSR.SS when single-stepping is enabledWill Deacon3-6/+20
Luis reports that, when reverse debugging with GDB, single-step does not function as expected on arm64: | I've noticed, under very specific conditions, that a PTRACE_SINGLESTEP | request by GDB won't execute the underlying instruction. As a consequence, | the PC doesn't move, but we return a SIGTRAP just like we would for a | regular successful PTRACE_SINGLESTEP request. The underlying problem is that when the CPU register state is restored as part of a reverse step, the SPSR.SS bit is cleared and so the hardware single-step state can transition to the "active-pending" state, causing an unexpected step exception to be taken immediately if a step operation is attempted. In hindsight, we probably shouldn't have exposed SPSR.SS in the pstate accessible by the GPR regset, but it's a bit late for that now. Instead, simply prevent userspace from configuring the bit to a value which is inconsistent with the TIF_SINGLESTEP state for the task being traced. Cc: <stable@vger.kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Keno Fischer <keno@juliacomputing.com> Link: https://lore.kernel.org/r/1eed6d69-d53d-9657-1fc9-c089be07f98c@linaro.org Reported-by: Luis Machado <luis.machado@linaro.org> Tested-by: Luis Machado <luis.machado@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16arm64: ptrace: Consistently use pseudo-singlestep exceptionsWill Deacon4-16/+23
Although the arm64 single-step state machine can be fast-forwarded in cases where we wish to generate a SIGTRAP without actually executing an instruction, this has two major limitations outside of simply skipping an instruction due to emulation. 1. Stepping out of a ptrace signal stop into a signal handler where SIGTRAP is blocked. Fast-forwarding the stepping state machine in this case will result in a forced SIGTRAP, with the handler reset to SIG_DFL. 2. The hardware implicitly fast-forwards the state machine when executing an SVC instruction for issuing a system call. This can interact badly with subsequent ptrace stops signalled during the execution of the system call (e.g. SYSCALL_EXIT or seccomp traps), as they may corrupt the stepping state by updating the PSTATE for the tracee. Resolve both of these issues by injecting a pseudo-singlestep exception on entry to a signal handler and also on return to userspace following a system call. Cc: <stable@vger.kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Tested-by: Luis Machado <luis.machado@linaro.org> Reported-by: Keno Fischer <keno@juliacomputing.com> Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16drivers/perf: Fix kernel panic when rmmod PMU modules during perf samplingQi Liu5-0/+5
When users try to remove PMU modules during perf sampling, kernel panic will happen because the pmu->read() is a NULL pointer here. INFO on HiSilicon hip08 platform as follow: pc : hisi_uncore_pmu_event_update+0x30/0xa4 [hisi_uncore_pmu] lr : hisi_uncore_pmu_read+0x20/0x2c [hisi_uncore_pmu] sp : ffff800010103e90 x29: ffff800010103e90 x28: ffff0027db0c0e40 x27: ffffa29a76f129d8 x26: ffffa29a77ceb000 x25: ffffa29a773a5000 x24: ffffa29a77392000 x23: ffffddffe5943f08 x22: ffff002784285960 x21: ffff002784285800 x20: ffff0027d2e76c80 x19: ffff0027842859e0 x18: ffff80003498bcc8 x17: ffffa29a76afe910 x16: ffffa29a7583f530 x15: 16151a1512061a1e x14: 0000000000000000 x13: ffffa29a76f1e238 x12: 0000000000000001 x11: 0000000000000400 x10: 00000000000009f0 x9 : ffff8000107b3e70 x8 : ffff0027db0c1890 x7 : ffffa29a773a7000 x6 : 00000007f5131013 x5 : 00000007f5131013 x4 : 09f257d417c00000 x3 : 00000002187bd7ce x2 : ffffa29a38f0f0d8 x1 : ffffa29a38eae268 x0 : ffff0027d2e76c80 Call trace: hisi_uncore_pmu_event_update+0x30/0xa4 [hisi_uncore_pmu] hisi_uncore_pmu_read+0x20/0x2c [hisi_uncore_pmu] __perf_event_read+0x1a0/0x1f8 flush_smp_call_function_queue+0xa0/0x160 generic_smp_call_function_single_interrupt+0x18/0x20 handle_IPI+0x31c/0x4dc gic_handle_irq+0x2c8/0x310 el1_irq+0xcc/0x180 arch_cpu_idle+0x4c/0x20c default_idle_call+0x20/0x30 do_idle+0x1b4/0x270 cpu_startup_entry+0x28/0x30 secondary_start_kernel+0x1a4/0x1fc To solve the above issue, current module should be registered to kernel, so that try_module_get() can be invoked when perf sampling starts. This adds the reference counting of module and could prevent users from removing modules during sampling. Reported-by: Haifeng Wang <wang.wanghaifeng@huawei.com> Signed-off-by: Qi Liu <liuqi115@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/1594891165-8228-1-git-send-email-liuqi115@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2020-07-16ALSA: hda/realtek - fixup for yet another Intel reference boardPeiSen Hou1-0/+1
Add headset_jack for the intel reference board support with 10ec:1230. Signed-off-by: PeiSen Hou <pshou@realtek.com.tw> Link: https://lore.kernel.org/r/20200716090134.9811-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-07-16USB: serial: iuu_phoenix: fix memory corruptionJohan Hovold1-3/+5
The driver would happily overwrite its write buffer with user data in 256 byte increments due to a removed buffer-space sanity check. Fixes: 5fcf62b0f1f2 ("tty: iuu_phoenix: fix locking.") Cc: stable <stable@vger.kernel.org> # 2.6.31 Signed-off-by: Johan Hovold <johan@kernel.org>
2020-07-16ALSA: hda/realtek - Enable Speaker for ASUS UX563Kailang Yang1-0/+1
ASUS UX563 speaker can't output. Add quirk to link suitable model will enable it. This model also could enable headset Mic. Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/96dee3ab01a04c28a7b44061e88009dd@realtek.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-07-16ALSA: hda/realtek - Enable Speaker for ASUS UX533 and UX534Kailang Yang1-0/+2
ASUS UX533 and UX534 speaker still can't output. End User feedback speaker didn't have output. Add this COEF value will enable it. Fixes: 4e051106730d ("ALSA: hda/realtek: Enable audio jacks of ASUS UX533FD with ALC294") Cc: <stable@vger.kernel.org> Signed-off-by: Kailang Yang <kailang@realtek.com> Link: https://lore.kernel.org/r/80334402a93b48e385f8f4841b59ae09@realtek.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-07-16ovl: fix lookup of indexed hardlinks with metacopyAmir Goldstein1-0/+4
We recently moved setting inode flag OVL_UPPERDATA to ovl_lookup(). When looking up an overlay dentry, upperdentry may be found by index and not by name. In that case, we fail to read the metacopy xattr and falsly set the OVL_UPPERDATA on the overlay inode. This caused a regression in xfstest overlay/033 when run with OVERLAY_MOUNT_OPTIONS="-o metacopy=on". Fixes: 28166ab3c875 ("ovl: initialize OVL_UPPERDATA in ovl_lookup()") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-07-16ovl: fix unneeded call to ovl_change_flags()Amir Goldstein1-4/+6
The check if user has changed the overlay file was wrong, causing unneeded call to ovl_change_flags() including taking f_lock on every file access. Fixes: d989903058a8 ("ovl: do not generate duplicate fsnotify events for "fake" path") Cc: <stable@vger.kernel.org> # v4.19+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>