Age | Commit message (Collapse) | Author | Files | Lines |
|
console_verbose() increases console loglevel to
CONSOLE_LOGLEVEL_MOTORMOUTH, which provides more information
to debug a panic/oops.
Unfortunately, in Arista we maintain some DUTs (Device Under Test) that
are configured to have 9600 baud rate. While verbose console messages
have their value to post-analyze crashes, on such setup they:
- may prevent panic/oops messages being printed
- take too long to flush on console resulting in watchdog reboot
In all our setups we use kdump which saves dmesg buffer after panic,
so in reality those extra messages on console provide no additional value,
but rather add risk of not getting to __crash_kexec().
Provide printk.console_no_auto_verbose boot parameter, which allows
to switch off printk being verbose on oops/panic/lockdep.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20210727130635.675184-3-dima@arista.com
|
|
It' unused since removal of mn10300:
commit 739d875dd698 ("mn10300: Remove the architecture")
x86 stopped using it in v2.6.12 (see history git):
commit 7574828b3dbb ("[PATCH] x86_64: add nmi button support")
Let's clean it up from the header.
Signed-off-by: Dmitry Safonov <dima@arista.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20210727130635.675184-2-dima@arista.com
|
|
UBSAN reported (via LKP)
[ 11.021349][ T1] UBSAN: shift-out-of-bounds in lib/test_scanf.c:275:51
[ 11.022782][ T1] shift exponent 32 is too large for 32-bit type 'unsigned int'
When n_bits == 0, the shift is out of range. Switch code to use GENMASK
to handle this case.
Fixes: 50f530e176ea ("lib: test_scanf: Add tests for sscanf number conversion")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20210727150132.28920-1-andriy.shevchenko@linux.intel.com
|
|
Commit 337015573718 ("printk: Userspace format indexing support") turned
printk() into a macro, but left the kerneldoc comment for it with the (now)
_printk() function, resulting in this docs-build warning:
kernel/printk/printk.c:1: warning: 'printk' not found
Move the kerneldoc comment back next to the (now) macro it's meant to
describe and have the docs build find it there.
Fixes: 337015573718b161 ("printk: Userspace format indexing support")
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/87o8aqt7qn.fsf@meer.lwn.net
|
|
The commit 337015573718b161 ("printk: Userspace format indexing support")
triggered the following build failure:
kernel/printk/index.c:140:6: warning: no previous prototype for ‘pi_create_file’ [-Wmissing-prototypes]
void pi_create_file(struct module *mod)
^~~~~~~~~~~~~~
kernel/printk/index.c:146:6: warning: no previous prototype for ‘pi_remove_file’ [-Wmissing-prototypes]
void pi_remove_file(struct module *mod)
^~~~~~~~~~~~~~
Fixes: 337015573718b161 ("printk: Userspace format indexing support")
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Chris Down <chris@chrisdown.name>
[pmladek@suse.com: Let the compiler decide about inlining.]
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/lkml/YPql089IwSpudw%2F1@alley/
|
|
The commit 337015573718b161 ("printk: Userspace format indexing support")
caused the following build failure:
arch/mips/kernel/genex.o: In function `handle_mcheck_int':
(.text+0x190c): undefined reference to `printk'
arch/mips/kernel/genex.o: In function `handle_reserved_int':
(.text+0x1c8c): undefined reference to `printk'
Fixes: 337015573718b161 ("printk: Userspace format indexing support")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Suggested-by: Chris Down <chris@chrisdown.name>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/YPbBfdz9srIpI+bb@chrisdown.name
|
|
While for most kinds of issues we have counters, tracepoints, or metrics
with a stable interface which can reliably be used to indicate issues,
in order to react to production issues quickly we sometimes need to work
with the interface which most kernel developers naturally use when
developing: printk, and printk-esques like dev_printk.
dev_printk is by far the most likely custom subsystem printk to benefit
from the printk indexing infrastructure, since niche device issues
brought about by production changes, firmware upgrades, and the like are
one of the most common things that we need printk infrastructure's
assistance to monitor.
Often these errors were never expected to practically manifest in
reality, and exhibit in code without extensive (or any) metrics present.
As such, there are typically very few options for issue detection
available to those with large fleets at the time the incident happens,
and we thus benefit strongly from monitoring netconsole in these
instances.
As such, add the infrastructure for dev_printk to be indexed in the
printk index. Even on a minimal kernel config, the coverage of the base
kernel's printk index is significantly improved:
Before:
[root@ktst ~]# wc -l /sys/kernel/debug/printk/index/vmlinux
4497 /sys/kernel/debug/printk/index/vmlinux
After:
[root@ktst ~]# wc -l /sys/kernel/debug/printk/index/vmlinux
5573 /sys/kernel/debug/printk/index/vmlinux
In terms of implementation, in order to trivially disambiguate them,
dev_printk is now a macro which wraps _dev_printk.
Signed-off-by: Chris Down <chris@chrisdown.name>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/959c7aed1017cb2c9de922e0a820d397e29c6a5a.1623775748.git.chris@chrisdown.name
|
|
We have a number of systems industry-wide that have a subset of their
functionality that works as follows:
1. Receive a message from local kmsg, serial console, or netconsole;
2. Apply a set of rules to classify the message;
3. Do something based on this classification (like scheduling a
remediation for the machine), rinse, and repeat.
As a couple of examples of places we have this implemented just inside
Facebook, although this isn't a Facebook-specific problem, we have this
inside our netconsole processing (for alarm classification), and as part
of our machine health checking. We use these messages to determine
fairly important metrics around production health, and it's important
that we get them right.
While for some kinds of issues we have counters, tracepoints, or metrics
with a stable interface which can reliably indicate the issue, in order
to react to production issues quickly we need to work with the interface
which most kernel developers naturally use when developing: printk.
Most production issues come from unexpected phenomena, and as such
usually the code in question doesn't have easily usable tracepoints or
other counters available for the specific problem being mitigated. We
have a number of lines of monitoring defence against problems in
production (host metrics, process metrics, service metrics, etc), and
where it's not feasible to reliably monitor at another level, this kind
of pragmatic netconsole monitoring is essential.
As one would expect, monitoring using printk is rather brittle for a
number of reasons -- most notably that the message might disappear
entirely in a new version of the kernel, or that the message may change
in some way that the regex or other classification methods start to
silently fail.
One factor that makes this even harder is that, under normal operation,
many of these messages are never expected to be hit. For example, there
may be a rare hardware bug which one wants to detect if it was to ever
happen again, but its recurrence is not likely or anticipated. This
precludes using something like checking whether the printk in question
was printed somewhere fleetwide recently to determine whether the
message in question is still present or not, since we don't anticipate
that it should be printed anywhere, but still need to monitor for its
future presence in the long-term.
This class of issue has happened on a number of occasions, causing
unhealthy machines with hardware issues to remain in production for
longer than ideal. As a recent example, some monitoring around
blk_update_request fell out of date and caused semi-broken machines to
remain in production for longer than would be desirable.
Searching through the codebase to find the message is also extremely
fragile, because many of the messages are further constructed beyond
their callsite (eg. btrfs_printk and other module-specific wrappers,
each with their own functionality). Even if they aren't, guessing the
format and formulation of the underlying message based on the aesthetics
of the message emitted is not a recipe for success at scale, and our
previous issues with fleetwide machine health checking demonstrate as
much.
This provides a solution to the issue of silently changed or deleted
printks: we record pointers to all printk format strings known at
compile time into a new .printk_index section, both in vmlinux and
modules. At runtime, this can then be iterated by looking at
<debugfs>/printk/index/<module>, which emits the following format, both
readable by humans and able to be parsed by machines:
$ head -1 vmlinux; shuf -n 5 vmlinux
# <level[,flags]> filename:line function "format"
<5> block/blk-settings.c:661 disk_stack_limits "%s: Warning: Device %s is misaligned\n"
<4> kernel/trace/trace.c:8296 trace_create_file "Could not create tracefs '%s' entry\n"
<6> arch/x86/kernel/hpet.c:144 _hpet_print_config "hpet: %s(%d):\n"
<6> init/do_mounts.c:605 prepare_namespace "Waiting for root device %s...\n"
<6> drivers/acpi/osl.c:1410 acpi_no_auto_serialize_setup "ACPI: auto-serialization disabled\n"
This mitigates the majority of cases where we have a highly-specific
printk which we want to match on, as we can now enumerate and check
whether the format changed or the printk callsite disappeared entirely
in userspace. This allows us to catch changes to printks we monitor
earlier and decide what to do about it before it becomes problematic.
There is no additional runtime cost for printk callers or printk itself,
and the assembly generated is exactly the same.
Signed-off-by: Chris Down <chris@chrisdown.name>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Jessica Yu <jeyu@kernel.org> # for module.{c,h}
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/e42070983637ac5e384f17fbdbe86d19c7b212a5.1623775748.git.chris@chrisdown.name
|
|
parse_prefix is needed externally by later patches, so move it into a
context where it can be used as such. Also give it the printk_ prefix to
reduce the chance of collisions.
Signed-off-by: Chris Down <chris@chrisdown.name>
Cc: Petr Mladek <pmladek@suse.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/b22ba314a860e5c7f887958f1eab2649f9bd1d06.1623775748.git.chris@chrisdown.name
|
|
In the past, `enum log_flags` was part of `struct log`, hence the name.
`struct log` has since been reworked and now this struct is stored
inside `struct printk_info`. However, the name was never updated, which
is somewhat confusing -- especially since these flags operate at the
record level rather than at the level of an abstract log.
printk_info_flags also joins its other metadata struct friends in
printk_ringbuffer.h.
Signed-off-by: Chris Down <chris@chrisdown.name>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/3dd801982f02603e6e3aa4f8bc4f5ebb830a4949.1623775748.git.chris@chrisdown.name
|
|
From an abstract point of view, escape_special's counterpart,
unescape_special, already handles the unescaping of blackslashed double
quote sequences.
As a more practical example, printk indexing is an example case where
this is already practically useful. Compare an example with
`ESCAPE_SPECIAL | ESCAPE_SPACE`, with quotes not escaped:
[root@ktst ~]# grep drivers/pci/pci-stub.c:69 /sys/kernel/debug/printk/index/vmlinux
<4> drivers/pci/pci-stub.c:69 pci_stub_init "pci-stub: invalid ID string "%s"\n"
...and the same after this patch:
[root@ktst ~]# grep drivers/pci/pci-stub.c:69 /sys/kernel/debug/printk/index/vmlinux
<4> drivers/pci/pci-stub.c:69 pci_stub_init "pci-stub: invalid ID string \"%s\"\n"
One can of course, alternatively, use ESCAPE_APPEND with a quote in
@only, but without this patch quotes are coerced into hex or octal which
can hurt readability quite significantly.
I've checked uses of ESCAPE_SPECIAL and %pE across the codebase, and I'm
pretty confident that this shouldn't affect any stable interfaces.
Signed-off-by: Chris Down <chris@chrisdown.name>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/af144c5b75e41ce417386253ba2694456bc04118.1623775748.git.chris@chrisdown.name
|
|
|
|
I know nothing about zone_device pages and !device_private pages; but if
try_to_migrate_one() will do nothing for them, then it's better that
try_to_migrate() filter them first, than trawl through all their vmas.
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Link: https://lore.kernel.org/lkml/1241d356-8ec9-f47b-a5ec-9b2bf66d242@google.com/
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
In the unlikely race case that page_mlock_one() finds VM_LOCKED has been
cleared by the time it got page table lock, page_vma_mapped_walk_done()
must be called before returning, either explicitly, or by a final call
to page_vma_mapped_walk() - otherwise the page table remains locked.
Fixes: cd62734ca60d ("mm/rmap: split try_to_munlock from try_to_unmap")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Link: https://lore.kernel.org/lkml/20210711151446.GB4070@xsang-OptiPlex-9020/
Link: https://lore.kernel.org/lkml/f71f8523-cba7-3342-40a7-114abc5d1f51@google.com/
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The kernel recovers in due course from missing Mlocked pages: but there
was no point in calling page_mlock() (formerly known as
try_to_munlock()) on a THP, because nothing got done even when it was
found to be mapped in another VM_LOCKED vma.
It's true that we need to be careful: Mlocked accounting of pte-mapped
THPs is too difficult (so consistently avoided); but Mlocked accounting
of only-pmd-mapped THPs is supposed to work, even when multiple mappings
are mlocked and munlocked or munmapped. Refine the tests.
There is already a VM_BUG_ON_PAGE(PageDoubleMap) in page_mlock(), so
page_mlock_one() does not even have to worry about that complication.
(I said the kernel recovers: but would page reclaim be likely to split
THP before rediscovering that it's VM_LOCKED? I've not followed that up)
Fixes: 9a73f61bdb8a ("thp, mlock: do not mlock PTE-mapped file huge pages")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: https://lore.kernel.org/lkml/cfa154c-d595-406-eb7d-eb9df730f944@google.com/
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Parallel developments in mm/rmap.c have left behind some out-of-date
comments: try_to_migrate_one() also accepts TTU_SYNC (already commented
in try_to_migrate() itself), and try_to_migrate() returns nothing at
all.
TTU_SPLIT_FREEZE has just been deleted, so reword the comment about it
in mm/huge_memory.c; and TTU_IGNORE_ACCESS was removed in 5.11, so
delete the "recently referenced" comment from try_to_unmap_one() (once
upon a time the comment was near the removed codeblock, but they drifted
apart).
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Link: https://lore.kernel.org/lkml/563ce5b2-7a44-5b4d-1dfd-59a0e65932a9@google.com/
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit dbbee9d5cd83 ("mm/page_alloc: convert per-cpu list protection to
local_lock") folded in a workaround patch for pahole that was unable to
deal with zero-sized percpu structures.
A superior workaround is achieved with commit a0b8200d06ad ("kbuild:
skip per-CPU BTF generation for pahole v1.18-v1.21").
This patch reverts the dummy field and the pahole version check.
Fixes: dbbee9d5cd83 ("mm/page_alloc: convert per-cpu list protection to local_lock")
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
arch/arm/mach-ixp4xx/include/mach/platform.h now gets included indirectly
and defines REG_OFFSET. Rename the register and bit definition to something
specific to the driver.
Fixes: 7fd70c65faac ("ARM: irqstat: Get rid of duplicated declaration")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210710211431.1393589-1-alexandre.belloni@bootlin.com
|
|
commit 03623b4b041c ("rtc: pcf2127: add tamper detection support")
added support for timestamp interrupts. However they are not being
handled in the irq handler. If a timestamp interrupt occurs it
results in kernel disabling the interrupt and displaying the call
trace:
[ 121.145580] irq 78: nobody cared (try booting with the "irqpoll" option)
...
[ 121.238087] [<00000000c4d69393>] irq_default_primary_handler threaded [<000000000a90d25b>] pcf2127_rtc_irq [rtc_pcf2127]
[ 121.248971] Disabling IRQ #78
Handle timestamp interrupts in pcf2127_rtc_irq(). Save time stamp
before clearing TSF1 and TSF2 flags so that it can't be overwritten.
Set a flag to mark if the timestamp is valid and only report to sysfs
if the flag is set. To mimic the hardware behavior, don’t save
another timestamp until the first one has been read by the userspace.
However, if the alarm irq is not configured, keep the old way of
handling timestamp interrupt in the timestamp0 sysfs calls.
Signed-off-by: Mian Yousaf Kaukab <ykaukab@suse.de>
Reviewed-by: Bruno Thomsen <bruno.thomsen@gmail.com>
Tested-by: Bruno Thomsen <bruno.thomsen@gmail.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210629150643.31551-1-ykaukab@suse.de
|
|
The offset variable is checked by at91_rtc_readalarm(), but this check
is unnecessary because the previous check knew that the value of this
variable was not 0.
This removes that unnecessary offset variable checks.
Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
Cc: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210708051340.341345-1-nobuhiro1.iwamatsu@toshiba.co.jp
|
|
s5m_check_peding_alarm_interrupt() in s5m_rtc_read_alarm() gets the return
value, but doesn't use it.
This modifies using the s5m_check_peding_alarm_interrupt()"s return value
as the s5m_rtc_read_alarm()'s return value.
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: linux-samsung-soc@vger.kernel.org
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210708051304.341278-1-nobuhiro1.iwamatsu@toshiba.co.jp
|
|
Use SPDX-License-Identifier instead of a verbose license text.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210707075804.337458-11-nobuhiro1.iwamatsu@toshiba.co.jp
|
|
Use SPDX-License-Identifier instead of a verbose license text.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210707075804.337458-9-nobuhiro1.iwamatsu@toshiba.co.jp
|
|
Use SPDX-License-Identifier instead of a verbose license text.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210707075804.337458-8-nobuhiro1.iwamatsu@toshiba.co.jp
|
|
For C files, use the C99 format (//).
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210707075804.337458-7-nobuhiro1.iwamatsu@toshiba.co.jp
|
|
For C files, use the C99 format (//).
Cc: Orson Zhai <orsonzhai@gmail.com>
Cc: Baolin Wang <baolin.wang7@gmail.com>
Cc: Chunyan Zhang <zhang.lyra@gmail.com>
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210707075804.337458-6-nobuhiro1.iwamatsu@toshiba.co.jp
|
|
Use SPDX-License-Identifier instead of a verbose license text.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210707075804.337458-5-nobuhiro1.iwamatsu@toshiba.co.jp
|
|
Use SPDX-License-Identifier instead of a verbose license text.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210707075804.337458-4-nobuhiro1.iwamatsu@toshiba.co.jp
|
|
Use SPDX-License-Identifier instead of a verbose license text.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210707075804.337458-3-nobuhiro1.iwamatsu@toshiba.co.jp
|
|
Use SPDX-License-Identifier instead of a verbose license text.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210707075804.337458-2-nobuhiro1.iwamatsu@toshiba.co.jp
|
|
This reverts commit 65db04053efea3f3e412a7e0cc599962999c96b4.
Guenter reported that after 65db04053efe, the ppc:sam460ex qemu emulation
no longer boots from nvme:
nvme nvme0: Device not ready; aborting initialisation, CSTS=0x0
nvme nvme0: Removing after probe failure status: -19
Link: https://lore.kernel.org/r/20210709231529.GA3270116@roeck-us.net
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
|
After updating the datasheet URL, the PCF85063A datasheet revision
has changed.
Adjust it accordingly.
Reported-by: Nobuhiro Iwamatsu <iwamatsu@nigauri.org>
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210624120953.2313378-1-festevam@gmail.com
|
|
Take maintainership of the binding as PAvel said he doesn't have the
hardware anymore.
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Link: https://lore.kernel.org/r/20210620224030.1115356-1-alexandre.belloni@bootlin.com
|
|
ASan reported a memory leak for items of the entlist returned from scandir().
In fact, scandir() returns a malloc'd array of malloc'd dirents.
This patch adds the missing (z)frees.
Fixes: da963834fe6975a1 ("perf test: Iterate over shell tests in alphabetical order")
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Link: http://lore.kernel.org/lkml/20210709163454.672082-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a test for the newly added perf_evlist__set_leader() function.
Committer testing:
$ cd tools/lib/perf/
$ sudo make tests
[sudo] password for acme:
running static:
- running tests/test-cpumap.c...OK
- running tests/test-threadmap.c...OK
- running tests/test-evlist.c...OK
- running tests/test-evsel.c...OK
running dynamic:
- running tests/test-cpumap.c...OK
- running tests/test-threadmap.c...OK
- running tests/test-evlist.c...OK
- running tests/test-evsel.c...OK
$
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We shouldn't just panic, return a value that doesn't clash with what
perf_evsel__open() was already returning in case of error, i.e. errno
when sys_perf_event_open() fails.
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Link: http://lore.kernel.org/lkml/YOiOA5zOtVH9IBbE@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To 2.33
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
The optional @ref parameter might contain an NULL node_name, so
prevent dereferencing it in cifs_compose_mount_options().
Addresses-Coverity: 1476408 ("Explicit null dereferenced")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Add support to set group_fd in perf_evsel__open() and make it follow the
group setup.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Support for faster packet signing (using GMAC instead of CMAC) can
now be negotiated to some newer servers, including Windows.
See MS-SMB2 section 2.2.3.17.
This patch adds support for sending the new negotiate context
with the first of three supported signing algorithms (AES-CMAC)
and decoding the response. A followon patch will add support
for sending the other two (including AES-GMAC, which is fastest)
and changing the signing algorithm used based on what was
negotiated.
To allow the client to request GMAC signing set module parameter
"enable_negotiate_signing" to 1.
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Some different PMU types may have the same substring. For example, on
Icelake server we have PMU types "uncore_imc" and
"uncore_imc_free_running". Both PMU types have the substring
"uncore_imc". But the parser wrongly thinks they are the same PMU type.
We enable an imc event,
perf stat -e uncore_imc/event=0xe3/ -a -- sleep 1
Perf actually expands the event to:
uncore_imc_0/event=0xe3/
uncore_imc_1/event=0xe3/
uncore_imc_2/event=0xe3/
uncore_imc_3/event=0xe3/
uncore_imc_4/event=0xe3/
uncore_imc_5/event=0xe3/
uncore_imc_6/event=0xe3/
uncore_imc_7/event=0xe3/
uncore_imc_free_running_0/event=0xe3/
uncore_imc_free_running_1/event=0xe3/
uncore_imc_free_running_3/event=0xe3/
uncore_imc_free_running_4/event=0xe3/
That's because the "uncore_imc_free_running" matches the
pattern "uncore_imc*".
Now we check that the last characters of PMU name is '_<digit>'.
For example, for pattern "uncore_imc*", "uncore_imc_0" is parsed ok, but
"uncore_imc_free_running_0" fails.
Fixes: b2b9d3a3f0211c5d ("perf pmu: Support wildcards on pmu name in dynamic pmu events")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210701064253.1175-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Some symbols may not be resolved if a user only monitors one type of
PMU.
$ sudo perf record -e cpu_atom/branch-instructions/ ./big_small_workload
$ sudo perf report –stdio
# Overhead Command Shared Object Symbol
# ........ ......... ................. .....................
#
28.02% perf-exec [unknown] [.] 0x0000000000401cf6
11.32% perf-exec [unknown] [.] 0x0000000000401d04
10.90% perf-exec [unknown] [.] 0x0000000000401d11
10.61% perf-exec [unknown] [.] 0x0000000000401cfc
To parse symbols the metadata records, e.g., PERF_RECORD_COMM, which are
generated by the kernel, are required.
To decide whether to generate the metadata records, the kernel relies on
the event_filter_match() to filter the unrelated events.
On a hybrid system, event_filter_match() further checks the CPU mask of
the current enabled PMU. If an event is collected on the CPU which
doesn't have an enabled PMU, it's treated as an unrelated event.
The "big_small_workload" is created in a big core, but runs on a small
core. The metadata records are filtered, because the user only monitors
the PMU of the small core. The big core PMU is not enabled.
For a hybrid system, a dummy event is required to generate the complete
side-band events.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/1625760212-18441-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The Topdown Microarchitecture Analysis (TMA) Method is a structured
analysis methodology to identify critical performance bottlenecks in
out-of-order processors.
The Topdown metrics L1 event was added as default in 42641d6f4d15e6db
("perf stat: Add Topdown metrics events as default events")
From the Sapphire Rapids server and later platforms, the same dedicated
"metrics" register is extended to support both L1 and L2 events.
Add both L1 and L2 Topdown metrics events as default to enrich the
default measuring information if the new measurement register is
available.
On legacy systems there is no change to avoid extra multiplexing.
The topdown_level indicates the max metrics level for the top-down
statistics. Set it to 2 to display all L1 and L2 Topdown metrics events.
With the patch:
$ perf stat sleep 1
Performance counter stats for 'sleep 1':
0.59 msec task-clock # 0.001 CPUs utilized
1 context-switches # 1.687 K/sec
0 cpu-migrations # 0.000 /sec
76 page-faults # 128.198 K/sec
1,405,318 cycles # 2.371 GHz
1,471,136 instructions # 1.05 insn per cycle
310,132 branches # 523.136 M/sec
10,435 branch-misses # 3.36% of all branches
8,431,908 slots # 14.223 G/sec
1,554,116 topdown-retiring # 18.4% retiring
1,289,585 topdown-bad-spec # 15.2% bad speculation
2,810,636 topdown-fe-bound # 33.2% frontend bound
2,810,636 topdown-be-bound # 33.2% backend bound
231,464 topdown-heavy-ops # 2.7% heavy operations # 15.6% light operations
1,223,453 topdown-br-mispredict # 14.5% branch mispredict # 0.8% machine clears
1,884,779 topdown-fetch-lat # 22.3% fetch latency # 10.9% fetch bandwidth
1,454,917 topdown-mem-bound # 17.2% memory bound # 16.0% Core bound
1.001179699 seconds time elapsed
0.000000000 seconds user
0.001238000 seconds sys
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/1625760169-18396-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move the implementation of evlist__set_leader() to a new libperf
perf_evlist__set_leader() function with the same functionality make it a
libperf exported API.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move evsel::nr_groups to perf_evsel::nr_groups, so we can move the group
interface to libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move evsel::leader to perf_evsel::leader, so we can move the group
interface to libperf.
Also add several evsel helpers to ease up the transition:
struct evsel *evsel__leader(struct evsel *evsel);
- get leader evsel
bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
- true if evsel has leader as leader
bool evsel__is_leader(struct evsel *evsel);
- true if evsel is itw own leader
void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
- set leader for evsel
Committer notes:
Fix this when building with 'make BUILD_BPF_SKEL=1'
tools/perf/util/bpf_counter.c
- if (evsel->leader->core.nr_members > 1) {
+ if (evsel->core.leader->nr_members > 1) {
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move evsel::idx to perf_evsel::idx, so we can move the group interface
to libperf.
Committer notes:
Fixup evsel->idx usage in tools/perf/util/bpf_counter_cgroup.c, that
appeared in my tree in my local tree.
Also fixed up these:
$ find tools/perf/ -name "*.[ch]" | xargs grep 'evsel->idx'
tools/perf/ui/gtk/annotate.c: evsel->idx + i);
tools/perf/ui/gtk/annotate.c: evsel->idx);
$
That running 'make -C tools/perf build-test' caught.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Colin reports that Coverity complains about checking for poll being
non-zero after having dereferenced it multiple times. This is a valid
complaint, and actually a leftover from back when this code was based
on the aio poll code.
Kill the redundant check.
Link: https://lore.kernel.org/io-uring/fe70c532-e2a7-3722-58a1-0fa4e5c5ff2c@canonical.com/
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Accessing raw timers (currently only CLOCK_MONOTONIC_RAW) through VDSO
doesn't return the correct time when using the GIC as clock source.
The address of the GIC mapped page is in this case not calculated
correctly. The GIC mapped page is calculated from the VDSO data by
subtracting PAGE_SIZE:
void *get_gic(const struct vdso_data *data) {
return (void __iomem *)data - PAGE_SIZE;
}
However, the data pointer is not page aligned for raw clock sources.
This is because the VDSO data for raw clock sources (CS_RAW = 1) is
stored after the VDSO data for coarse clock sources (CS_HRES_COARSE = 0).
Therefore, only the VDSO data for CS_HRES_COARSE is page aligned:
+--------------------+
| |
| vd[CS_RAW] | ---+
| vd[CS_HRES_COARSE] | |
+--------------------+ | -PAGE_SIZE
| | |
| GIC mapped page | <--+
| |
+--------------------+
When __arch_get_hw_counter() is called with &vd[CS_RAW], get_gic returns
the wrong address (somewhere inside the GIC mapped page). The GIC counter
values are not returned which results in an invalid time.
Fixes: a7f4df4e21dd ("MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()")
Signed-off-by: Martin Fäcknitz <faecknitz@hotsplots.de>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
|
|
Since d4a45c68dc81 ("irqdomain: Protect the linear revmap with RCU"),
any irqdomain lookup requires the RCU read lock to be held.
This assumes that the architecture code will be structured such as
irq_enter() will be called *before* the interrupt is looked up
in the irq domain. However, this isn't the case for MIPS, and a number
of drivers are structured to do it the other way around when handling
an interrupt in their root irqchip (secondary irqchips are OK by
construction).
This results in a RCU splat on a lockdep-enabled kernel when the kernel
takes an interrupt from idle, as reported by Guenter Roeck.
Note that this could have fired previously if any driver had used
tree-based irqdomain, which always had the RCU requirement.
To solve this, provide a MIPS-specific helper (do_domain_IRQ())
as the pendent of do_IRQ() that will do thing in the right order
(and maybe save some cycles in the process).
Ideally, MIPS would be moved over to using handle_domain_irq(),
but that's much more ambitious.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
[maz: add dependency on CONFIG_IRQ_DOMAIN after report from the kernelci bot]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Serge Semin <fancer.lancer@gmail.com>
Link: https://lore.kernel.org/r/20210705172352.GA56304@roeck-us.net
Link: https://lore.kernel.org/r/20210706110647.3979002-1-maz@kernel.org
|