aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2019-07-15docs: accounting: convert to ReSTMauro Carvalho Chehab8-93/+140
Rename the accounting documentation files to ReST, add an index for them and adjust in order to produce a nice html output via the Sphinx build system. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15docs: DMA-API-HOWTO.txt: fix an unmarked code blockMauro Carvalho Chehab1-1/+1
When building with Sphinx, it would produce this warning: docs/Documentation/DMA-API-HOWTO.rst:222: WARNING: Definition list ends without a blank line; unexpected unindent. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15docs: rbtree.txt: fix Sphinx build warningsMauro Carvalho Chehab1-3/+3
Ths file is already at ReST format. Yet, some recent changes made it to produce a few warnings when building it with Sphinx. Those are trivially fixed by marking some literal blocks. Fix them before adding it to the docs building system. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15docs: phy: convert samsung-usb2.txt to ReST formatMauro Carvalho Chehab2-30/+34
In order to merge it into a Sphinx book, we need first to convert to ReST. While this is not part of any book, mark it as :orphan:, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15docs: nvmem: convert docs to ReST and rename to *.rstMauro Carvalho Chehab1-53/+59
In order to be able to add it into a doc book, we need first convert it to ReST. The conversion is actually: - add blank lines and identation in order to identify paragraphs; - mark literal blocks; - adjust title markups. While this is not part of any book, mark it as :orphan:, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15docs: bus-devices: ti-gpmc.rst: convert it to ReSTMauro Carvalho Chehab1-53/+110
In order to be able to add this file to a book, it needs first to be converted to ReST and renamed. While this is not part of any book, mark it as :orphan:, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15docs: xen-tpmfront.txt: convert it to .rstMauro Carvalho Chehab1-47/+60
In order to be able to add this file to the security book, we need first to convert it to reST. While this is not part of any book, mark it as :orphan:, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15docs: memory-devices: convert ti-emif.txt to ReSTMauro Carvalho Chehab1-10/+17
Prepare this file to be moved to a kernel book by converting it to ReST format and renaming it to ti-emif.rst. While this is not part of any book, mark it as :orphan:, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15docs: arm: convert docs to ReST and rename to *.rstMauro Carvalho Chehab115-1426/+1991
Converts ARM the text files to ReST, preparing them to be an architecture book. The conversion is actually: - add blank lines and identation in order to identify paragraphs; - fix tables markups; - add some lists markups; - mark literal blocks; - adjust title markups. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Reviewed-by Corentin Labbe <clabbe.montjoie@gmail.com> # For sun4i-ss
2019-07-15docs: early-userspace: convert docs to ReST and rename to *.rstMauro Carvalho Chehab6-10/+38
The two files there describes a Kernel API feature, used to support early userspace stuff. Prepare for moving them to the kernel API book by converting to ReST format. The conversion itself was quite trivial: just add/mark a few titles as such, add a literal block markup, add a table markup and a few blank lines, in order to make Sphinx to properly parse it. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15docs: pti_intel_mid.txt: convert it to pti_intel_mid.rstMauro Carvalho Chehab2-99/+106
Convert this small file to ReST format and rename it. Most of the conversion were related to adjusting whitespaces in order for each section to be properly parsed. While this is not part of any book, mark it as :orphan:, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15docs: console.txt: convert docs to ReST and rename to *.rstMauro Carvalho Chehab3-31/+38
Convert this small file to ReST in preparation for adding it to the driver-api book. While this is not part of the driver-api book, mark it as :orphan:, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
2019-07-15docs: cma/debugfs.txt: convert docs to ReST and rename to *.rstMauro Carvalho Chehab1-1/+7
The debugfs interface for CMA should be there together with other mm-related documents. Convert this small file to ReST and move it to its rightful place. The conversion is actually quite simple: just add a title for the document. In order to make it to look better for the audience, also mark the "echo" command as a literal block. While this is not part of any book, mark it as :orphan:, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15docs: m68k: convert docs to ReST and rename to *.rstMauro Carvalho Chehab3-147/+191
Convert the m68k kernel-options.txt file to ReST. The conversion is trivial, as the document is already on a format close enough to ReST. Just some small adjustments were needed in order to make it both good for being parsed while keeping it on a good txt shape. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15docs: lp855x-driver.txt: convert to ReST and move to kernel-apiMauro Carvalho Chehab3-67/+84
This small file seems to be an attempt to start documenting backlight drivers. It contains descriptions of the controls for the driver with could sound as an somewhat user-faced description, but it's main focus is to describe, instead, the data that should be passed via platform data and some driver-specific stuff. While this is not part of the driver-api book, mark it as :orphan:, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15docs: lcd-panel-cgram.txt: convert docs to ReST and rename to *.rstMauro Carvalho Chehab2-3/+8
This small text file describes the usage of parallel port LCD displays from userspace PoV. So, a good candidate for the admin guide. While this is not part of the admin-guide book, mark it as :orphan:, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15docs: connector: convert to ReST and rename to connector.rstMauro Carvalho Chehab4-88/+109
As it has some function definitions, move them to connector.h. The remaining conversion is actually: - add blank lines and identation in order to identify paragraphs; - fix tables markups; - add some lists markups; - mark literal blocks; - adjust title markups. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-15docs: locking: convert docs to ReST and rename to *.rstMauro Carvalho Chehab20-387/+511
Convert the locking documents to ReST and add them to the kernel development book where it belongs. Most of the stuff here is just to make Sphinx to properly parse the text file, as they're already in good shape, not requiring massive changes in order to be parsed. The conversion is actually: - add blank lines and identation in order to identify paragraphs; - fix tables markups; - add some lists markups; - mark literal blocks; - adjust title markups. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: Federico Vaga <federico.vaga@vaga.pv.it>
2019-07-14*: convert stream-like files -> stream_open, even if they use noop_llseekKirill Smelkov3-3/+10
This patch continues 10dce8af3422 (fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock) and c5bf68fe0c86 (*: convert stream-like files from nonseekable_open -> stream_open) and teaches steam_open.cocci to consider files as being stream-like not only if they have .llseek=no_llseek, but also if they have .llseek=noop_llseek. This is safe to do: the comment about noop_llseek says This is an implementation of ->llseek useable for the rare special case when userspace expects the seek to succeed but the (device) file is actually not able to perform the seek. In this case you use noop_llseek() instead of falling back to the default implementation of ->llseek. and in general noop_llseek was massively added to drivers in 6038f373a3dc (llseek: automatically add .llseek fop) when changing default for NULL .llseek from NOP to no_llseek with the idea to avoid breaking compatibility, if maybe some user-space program was using lseek on a device without caring about the result, but caring if it was an error or not. Amended semantic patch produces two changes when applied tree-wide: drivers/hid/hid-sensor-custom.c:690:8-24: WARNING: hid_sensor_custom_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open. drivers/input/mousedev.c:564:1-17: ERROR: mousedev_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix. Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Jan Blunck <jblunck@suse.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jiri Kosina <jikos@kernel.org> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
2019-07-13locking/lockdep: Fix lock used or unused stats errorYuyang Du1-2/+3
The stats variable nr_unused_locks is incremented every time a new lock class is register and decremented when the lock is first used in __lock_acquire(). And after all, it is shown and checked in lockdep_stats. However, under configurations that either CONFIG_TRACE_IRQFLAGS or CONFIG_PROVE_LOCKING is not defined: The commit: 091806515124b20 ("locking/lockdep: Consolidate lock usage bit initialization") missed marking the LOCK_USED flag at IRQ usage initialization because as mark_usage() is not called. And the commit: 886532aee3cd42d ("locking/lockdep: Move mark_lock() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING") further made mark_lock() not defined such that the LOCK_USED cannot be marked at all when the lock is first acquired. As a result, we fix this by not showing and checking the stats under such configurations for lockdep_stats. Reported-by: Qian Cai <cai@lca.pw> Signed-off-by: Yuyang Du <duyuyang@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: arnd@arndb.de Cc: frederic@kernel.org Link: https://lkml.kernel.org/r/20190709101522.9117-1-duyuyang@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-07-13sched/core: Fix preempt warning in ttwuPeter Zijlstra1-1/+3
John reported a DEBUG_PREEMPT warning caused by commit: aacedf26fb76 ("sched/core: Optimize try_to_wake_up() for local wakeups") I overlooked that ttwu_stat() requires preemption disabled. Reported-by: John Stultz <john.stultz@linaro.org> Tested-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: aacedf26fb76 ("sched/core: Optimize try_to_wake_up() for local wakeups") Link: https://lkml.kernel.org/r/20190710105736.GK3402@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-07-13perf/x86/intel: Fix spurious NMI on fixed counterKan Liang1-5/+3
If a user first sample a PEBS event on a fixed counter, then sample a non-PEBS event on the same fixed counter on Icelake, it will trigger spurious NMI. For example: perf record -e 'cycles:p' -a perf record -e 'cycles' -a The error message for spurious NMI: [June 21 15:38] Uhhuh. NMI received for unknown reason 30 on CPU 2. [ +0.000000] Do you have a strange power saving mode enabled? [ +0.000000] Dazed and confused, but trying to continue The bug was introduced by the following commit: commit 6f55967ad9d9 ("perf/x86/intel: Fix race in intel_pmu_disable_event()") The commit moves the intel_pmu_pebs_disable() after intel_pmu_disable_fixed(), which returns immediately. The related bit of PEBS_ENABLE MSR will never be cleared for the fixed counter. Then a non-PEBS event runs on the fixed counter, but the bit on PEBS_ENABLE is still set, which triggers spurious NMIs. Check and disable PEBS for fixed counters after intel_pmu_disable_fixed(). Reported-by: Yi, Ammy <ammy.yi@intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 6f55967ad9d9 ("perf/x86/intel: Fix race in intel_pmu_disable_event()") Link: https://lkml.kernel.org/r/20190625142135.22112-1-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-07-13perf/core: Fix exclusive events' groupingAlexander Shishkin2-12/+27
So far, we tried to disallow grouping exclusive events for the fear of complications they would cause with moving between contexts. Specifically, moving a software group to a hardware context would violate the exclusivity rules if both groups contain matching exclusive events. This attempt was, however, unsuccessful: the check that we have in the perf_event_open() syscall is both wrong (looks at wrong PMU) and insufficient (group leader may still be exclusive), as can be illustrated by running: $ perf record -e '{intel_pt//,cycles}' uname $ perf record -e '{cycles,intel_pt//}' uname ultimately successfully. Furthermore, we are completely free to trigger the exclusivity violation by: perf -e '{cycles,intel_pt//}' -e '{intel_pt//,instructions}' even though the helpful perf record will not allow that, the ABI will. The warning later in the perf_event_open() path will also not trigger, because it's also wrong. Fix all this by validating the original group before moving, getting rid of broken safeguards and placing a useful one to perf_install_in_context(). Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: mathieu.poirier@linaro.org Cc: will.deacon@arm.com Fixes: bed5b25ad9c8a ("perf: Add a pmu capability for "exclusive" events") Link: https://lkml.kernel.org/r/20190701110755.24646-1-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-07-13perf/x86/amd/uncore: Set the thread mask for F17h L3 PMCsKim Phillips1-4/+11
Fill in the L3 performance event select register ThreadMask bitfield, to enable per hardware thread accounting. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Gary Hook <Gary.Hook@amd.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Liska <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pu Wen <puwen@hygon.cn> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: https://lkml.kernel.org/r/20190628215906.4276-2-kim.phillips@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-07-13perf/x86/amd/uncore: Do not set 'ThreadMask' and 'SliceMask' for non-L3 PMCsKim Phillips1-1/+1
The following commit: d7cbbe49a930 ("perf/x86/amd/uncore: Set ThreadMask and SliceMask for L3 Cache perf events") enables L3 PMC events for all threads and slices by writing 1's in 'ChL3PmcCfg' (L3 PMC PERF_CTL) register fields. Those bitfields overlap with high order event select bits in the Data Fabric PMC control register, however. So when a user requests raw Data Fabric events (-e amd_df/event=0xYYY/), the two highest order bits get inadvertently set, changing the counter select to events that don't exist, and for which no counts are read. This patch changes the logic to write the L3 masks only when dealing with L3 PMC counters. AMD Family 16h and below Northbridge (NB) counters were not affected. Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Gary Hook <Gary.Hook@amd.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Liska <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pu Wen <puwen@hygon.cn> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: d7cbbe49a930 ("perf/x86/amd/uncore: Set ThreadMask and SliceMask for L3 Cache perf events") Link: https://lkml.kernel.org/r/20190628215906.4276-1-kim.phillips@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-07-13perf/core: Fix race between close() and fork()Peter Zijlstra1-8/+41
Syzcaller reported the following Use-after-Free bug: close() clone() copy_process() perf_event_init_task() perf_event_init_context() mutex_lock(parent_ctx->mutex) inherit_task_group() inherit_group() inherit_event() mutex_lock(event->child_mutex) // expose event on child list list_add_tail() mutex_unlock(event->child_mutex) mutex_unlock(parent_ctx->mutex) ... goto bad_fork_* bad_fork_cleanup_perf: perf_event_free_task() perf_release() perf_event_release_kernel() list_for_each_entry() mutex_lock(ctx->mutex) mutex_lock(event->child_mutex) // event is from the failing inherit // on the other CPU perf_remove_from_context() list_move() mutex_unlock(event->child_mutex) mutex_unlock(ctx->mutex) mutex_lock(ctx->mutex) list_for_each_entry_safe() // event already stolen mutex_unlock(ctx->mutex) delayed_free_task() free_task() list_for_each_entry_safe() list_del() free_event() _free_event() // and so event->hw.target // is the already freed failed clone() if (event->hw.target) put_task_struct(event->hw.target) // WHOOPSIE, already quite dead Which puts the lie to the the comment on perf_event_free_task(): 'unexposed, unused context' not so much. Which is a 'fun' confluence of fail; copy_process() doing an unconditional free_task() and not respecting refcounts, and perf having creative locking. In particular: 82d94856fa22 ("perf/core: Fix lock inversion between perf,trace,cpuhp") seems to have overlooked this 'fun' parade. Solve it by using the fact that detached events still have a reference count on their (previous) context. With this perf_event_free_task() can detect when events have escaped and wait for their destruction. Debugged-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Reported-by: syzbot+a24c397a29ad22d86c98@syzkaller.appspotmail.com Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 82d94856fa22 ("perf/core: Fix lock inversion between perf,trace,cpuhp") Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-07-12ppp: mppe: Revert "ppp: mppe: Add softdep to arc4"Eric Biggers1-1/+0
Commit 0e5a610b5ca5 ("ppp: mppe: switch to RC4 library interface"), which was merged through the crypto tree for v5.3, changed ppp_mppe.c to use the new arc4_crypt() library function rather than access RC4 through the dynamic crypto_skcipher API. Meanwhile commit aad1dcc4f011 ("ppp: mppe: Add softdep to arc4") was merged through the net tree and added a module soft-dependency on "arc4". The latter commit no longer makes sense because the code now uses the "libarc4" module rather than "arc4", and also due to the direct use of arc4_crypt(), no module soft-dependency is required. So revert the latter commit. Cc: Takashi Iwai <tiwai@suse.de> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-12net: dsa: qca8k: replace legacy gpio includeChristian Lamparter1-1/+1
This patch replaces the legacy bulk gpio.h include with the proper gpio/consumer.h variant. This was caught by the kbuild test robot that was running into an error because of this. For more information why linux/gpio.h is bad can be found in: commit 56a46b6144e7 ("gpio: Clarify that <linux/gpio.h> is legacy") Reported-by: kbuild test robot <lkp@intel.com> Link: https://www.spinics.net/lists/netdev/msg584447.html Fixes: a653f2f538f9 ("net: dsa: qca8k: introduce reset via gpio feature") Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-12net: hisilicon: Use devm_platform_ioremap_resourceJiangfeng Xiao4-18/+7
Use devm_platform_ioremap_resource instead of devm_ioremap_resource. Make the code simpler. Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-12cxgb4: reduce kernel stack usage in cudbg_collect_mem_region()Arnd Bergmann1-6/+13
The cudbg_collect_mem_region() and cudbg_read_fw_mem() both use several hundred kilobytes of kernel stack space. One gets inlined into the other, which causes the stack usage to be combined beyond the warning limit when building with clang: drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c:1057:12: error: stack frame size of 1244 bytes in function 'cudbg_collect_mem_region' [-Werror,-Wframe-larger-than=] Restructuring cudbg_collect_mem_region() lets clang do the same optimization that gcc does and reuse the stack slots as it can see that the large variables are never used together. A better fix might be to avoid using cudbg_meminfo on the stack altogether, but that requires a larger rewrite. Fixes: a1c69520f785 ("cxgb4: collect MC memory dump") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-12tipc: ensure head->lock is initialisedChris Packham1-1/+1
tipc_named_node_up() creates a skb list. It passes the list to tipc_node_xmit() which has some code paths that can call skb_queue_purge() which relies on the list->lock being initialised. The spin_lock is only needed if the messages end up on the receive path but when the list is created in tipc_named_node_up() we don't necessarily know if it is going to end up there. Once all the skb list users are updated in tipc it will then be possible to update them to use the unlocked variants of the skb list functions and initialise the lock when we know the message will follow the receive path. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-12tc-tests: updated skbedit testsRoman Mashak1-0/+117
- Added mask upper bound test case - Added mask validation test case - Added mask replacement case Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-12nfp: flower: ensure ip protocol is specified for L4 matchesJohn Hurley1-9/+6
Flower rules on the NFP firmware are able to match on an IP protocol field. When parsing rules in the driver, unknown IP protocols are only rejected when further matches are to be carried out on layer 4 fields, as the firmware will not be able to extract such fields from packets. L4 protocol dissectors such as FLOW_DISSECTOR_KEY_PORTS are only parsed if an IP protocol is specified. This leaves a loophole whereby a rule that attempts to match on transport layer information such as port numbers but does not explicitly give an IP protocol type can be incorrectly offloaded (in this case with wildcard port numbers matches). Fix this by rejecting the offload of flows that attempt to match on L4 information, not only when matching on an unknown IP protocol type, but also when the protocol is wildcarded. Fixes: 2a04784594f6 ("nfp: flower: check L4 matches on unknown IP protocols") Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-12nfp: flower: fix ethernet check on match fieldsJohn Hurley1-8/+5
NFP firmware does not explicitly match on an ethernet type field. Rather, each rule has a bitmask of match fields that can be used to infer the ethernet type. Currently, if a flower rule contains an unknown ethernet type, a check is carried out for matches on other fields of the packet. If matches on layer 3 or 4 are found, then the offload is rejected as firmware will not be able to extract these fields from a packet with an ethernet type it does not currently understand. However, if a rule contains an unknown ethernet type without any L3 (or above) matches then this will effectively be offloaded as a rule with a wildcarded ethertype. This can lead to misclassifications on the firmware. Fix this issue by rejecting all flower rules that specify a match on an unknown ethernet type. Further ensure correct offloads by moving the 'L3 and above' check to any rule that does not specify an ethernet type and rejecting rules with further matches. This means that we can still offload rules with a wildcarded ethertype if they only match on L2 fields but will prevent rules which match on further fields that we cannot be sure if the firmware will be able to extract. Fixes: af9d842c1354 ("nfp: extend flower add flow offload") Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-12net/mlx5e: Provide cb_list pointer when setting up tc block on repVlad Buslov1-1/+4
Recent refactoring of tc block offloads infrastructure introduced new flow_block_cb_setup_simple() method intended to be used as unified way for all drivers to register offload callbacks. However, commit that actually extended all users (drivers) with block cb list and provided it to flow_block infra missed mlx5 en_rep. This leads to following NULL-pointer dereference when creating Qdisc: [ 278.385175] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 278.393233] #PF: supervisor read access in kernel mode [ 278.399446] #PF: error_code(0x0000) - not-present page [ 278.405847] PGD 8000000850e73067 P4D 8000000850e73067 PUD 8620cd067 PMD 0 [ 278.414141] Oops: 0000 [#1] SMP PTI [ 278.419019] CPU: 7 PID: 3369 Comm: tc Not tainted 5.2.0-rc6+ #492 [ 278.426580] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017 [ 278.435853] RIP: 0010:flow_block_cb_setup_simple+0xc4/0x190 [ 278.442953] Code: 10 48 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 49 89 00 48 05 00 01 00 00 49 89 40 08 31 c0 c3 b8 a1 ff ff ff c3 f3 c3 <48> 8b 06 48 39 c6 75 0a eb 1a 48 8b 00 48 39 c6 74 12 48 3b 50 28 [ 278.464829] RSP: 0018:ffffaf07c3f97990 EFLAGS: 00010246 [ 278.471648] RAX: 0000000000000000 RBX: ffff9b43ed4c7680 RCX: ffff9b43d5f80840 [ 278.480408] RDX: ffffffffc0491650 RSI: 0000000000000000 RDI: ffffaf07c3f97998 [ 278.489110] RBP: ffff9b43ddff9000 R08: ffff9b43d5f80840 R09: 0000000000000001 [ 278.497838] R10: 0000000000000009 R11: 00000000000003ad R12: ffffaf07c3f97c08 [ 278.506595] R13: ffff9b43d5f80000 R14: ffff9b43ed4c7680 R15: ffff9b43dfa20b40 [ 278.515374] FS: 00007f796be1b400(0000) GS:ffff9b43ef840000(0000) knlGS:0000000000000000 [ 278.525099] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 278.532453] CR2: 0000000000000000 CR3: 0000000840398002 CR4: 00000000001606e0 [ 278.541197] Call Trace: [ 278.545252] tcf_block_offload_cmd.isra.52+0x7e/0xb0 [ 278.551871] tcf_block_get_ext+0x365/0x3e0 [ 278.557569] qdisc_create+0x15c/0x4e0 [ 278.562859] ? kmem_cache_alloc_trace+0x1a2/0x1c0 [ 278.569235] tc_modify_qdisc+0x1c8/0x780 [ 278.574761] rtnetlink_rcv_msg+0x291/0x340 [ 278.580518] ? _cond_resched+0x15/0x40 [ 278.585856] ? rtnl_calcit.isra.29+0x120/0x120 [ 278.591868] netlink_rcv_skb+0x4a/0x110 [ 278.597198] netlink_unicast+0x1a0/0x250 [ 278.602601] netlink_sendmsg+0x2c1/0x3c0 [ 278.608022] sock_sendmsg+0x5b/0x60 [ 278.612969] ___sys_sendmsg+0x289/0x310 [ 278.618231] ? do_wp_page+0x99/0x730 [ 278.623216] ? page_add_new_anon_rmap+0xbe/0x140 [ 278.629298] ? __handle_mm_fault+0xc84/0x1360 [ 278.635113] ? __sys_sendmsg+0x5e/0xa0 [ 278.640285] __sys_sendmsg+0x5e/0xa0 [ 278.645239] do_syscall_64+0x5b/0x1b0 [ 278.650274] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 278.656697] RIP: 0033:0x7f796abdeb87 [ 278.661628] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 80 00 00 00 00 8b 05 6a 2b 2c 00 48 63 d2 48 63 ff 85 c0 75 18 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 f3 c3 0f 1f 80 00 00 00 00 53 48 89 f3 48 [ 278.683248] RSP: 002b:00007ffde213ba48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 278.692245] RAX: ffffffffffffffda RBX: 000000005d261e6f RCX: 00007f796abdeb87 [ 278.700862] RDX: 0000000000000000 RSI: 00007ffde213bab0 RDI: 0000000000000003 [ 278.709527] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000006 [ 278.718167] R10: 000000000000000c R11: 0000000000000246 R12: 0000000000000001 [ 278.726743] R13: 000000000067b580 R14: 0000000000000000 R15: 0000000000000000 [ 278.735302] Modules linked in: dummy vxlan ip6_udp_tunnel udp_tunnel sch_ingress nfsv3 nfs_acl nfs lockd grace fscache bridge stp llc sunrpc mlx5_ib ib_uverbs intel_rapl ib_core sb_edac x86_pkg_temp_ thermal intel_powerclamp coretemp kvm_intel kvm mlx5_core irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel igb ghash_clmulni_intel ses mei_me enclosure mlxfw ipmi_ssif intel_cstate iTCO_wdt ptp mei pps_core iTCO_vendor_support pcspkr joydev intel_uncore i2c_i801 ipmi_si lpc_ich intel_rapl_perf ioatdma wmi dca pcc_cpufreq ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad ast i2c_algo_bit drm_k ms_helper ttm drm mpt3sas raid_class scsi_transport_sas [ 278.802263] CR2: 0000000000000000 [ 278.807170] ---[ end trace b1f0a442a279e66f ]--- Extend en_rep with new static mlx5e_rep_block_cb_list list and pass it to flow_block_cb_setup_simple() function instead of hardcoded NULL pointer. Fixes: 955bcb6ea0df ("drivers: net: use flow block API") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-12net: phy: make exported variables non-staticDenis Efremov2-3/+6
The variables phy_basic_ports_array, phy_fibre_port_array and phy_all_ports_features_array are declared static and marked EXPORT_SYMBOL_GPL(), which is at best an odd combination. Because the variables were decided to be a part of API, this commit removes the static attributes and adds the declarations to the header. Fixes: 3c1bcc8614db ("net: ethernet: Convert phydev advertize and supported from u32 to link mode") Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-12net: sched: Fix NULL-pointer dereference in tc_indr_block_ing_cmd()Vlad Buslov2-1/+11
After recent refactoring of block offlads infrastructure, indr_dev->block pointer is dereferenced before it is verified to be non-NULL. Example stack trace where this behavior leads to NULL-pointer dereference error when creating vxlan dev on system with mlx5 NIC with offloads enabled: [ 1157.852938] ================================================================== [ 1157.866877] BUG: KASAN: null-ptr-deref in tc_indr_block_ing_cmd.isra.41+0x9c/0x160 [ 1157.880877] Read of size 4 at addr 0000000000000090 by task ip/3829 [ 1157.901637] CPU: 22 PID: 3829 Comm: ip Not tainted 5.2.0-rc6+ #488 [ 1157.914438] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017 [ 1157.929031] Call Trace: [ 1157.938318] dump_stack+0x9a/0xeb [ 1157.948362] ? tc_indr_block_ing_cmd.isra.41+0x9c/0x160 [ 1157.960262] ? tc_indr_block_ing_cmd.isra.41+0x9c/0x160 [ 1157.972082] __kasan_report+0x176/0x192 [ 1157.982513] ? tc_indr_block_ing_cmd.isra.41+0x9c/0x160 [ 1157.994348] kasan_report+0xe/0x20 [ 1158.004324] tc_indr_block_ing_cmd.isra.41+0x9c/0x160 [ 1158.015950] ? tcf_block_setup+0x430/0x430 [ 1158.026558] ? kasan_unpoison_shadow+0x30/0x40 [ 1158.037464] __tc_indr_block_cb_register+0x5f5/0xf20 [ 1158.049288] ? mlx5e_rep_indr_tc_block_unbind+0xa0/0xa0 [mlx5_core] [ 1158.062344] ? tc_indr_block_dev_put.part.47+0x5c0/0x5c0 [ 1158.074498] ? rdma_roce_rescan_device+0x20/0x20 [ib_core] [ 1158.086580] ? br_device_event+0x98/0x480 [bridge] [ 1158.097870] ? strcmp+0x30/0x50 [ 1158.107578] mlx5e_nic_rep_netdevice_event+0xdd/0x180 [mlx5_core] [ 1158.120212] notifier_call_chain+0x6d/0xa0 [ 1158.130753] register_netdevice+0x6fc/0x7e0 [ 1158.141322] ? netdev_change_features+0xa0/0xa0 [ 1158.152218] ? vxlan_config_apply+0x210/0x310 [vxlan] [ 1158.163593] __vxlan_dev_create+0x2ad/0x520 [vxlan] [ 1158.174770] ? vxlan_changelink+0x490/0x490 [vxlan] [ 1158.185870] ? rcu_read_unlock+0x60/0x60 [vxlan] [ 1158.196798] vxlan_newlink+0x99/0xf0 [vxlan] [ 1158.207303] ? __vxlan_dev_create+0x520/0x520 [vxlan] [ 1158.218601] ? rtnl_create_link+0x3d0/0x450 [ 1158.228900] __rtnl_newlink+0x8a7/0xb00 [ 1158.238701] ? stack_access_ok+0x35/0x80 [ 1158.248450] ? rtnl_link_unregister+0x1a0/0x1a0 [ 1158.258735] ? find_held_lock+0x6d/0xd0 [ 1158.268379] ? is_bpf_text_address+0x67/0xf0 [ 1158.278330] ? lock_acquire+0xc1/0x1f0 [ 1158.287686] ? is_bpf_text_address+0x5/0xf0 [ 1158.297449] ? is_bpf_text_address+0x86/0xf0 [ 1158.307310] ? kernel_text_address+0xec/0x100 [ 1158.317155] ? arch_stack_walk+0x92/0xe0 [ 1158.326497] ? __kernel_text_address+0xe/0x30 [ 1158.336213] ? unwind_get_return_address+0x2f/0x50 [ 1158.346267] ? create_prof_cpu_mask+0x20/0x20 [ 1158.355936] ? arch_stack_walk+0x92/0xe0 [ 1158.365117] ? stack_trace_save+0x8a/0xb0 [ 1158.374272] ? stack_trace_consume_entry+0x80/0x80 [ 1158.384226] ? match_held_lock+0x33/0x210 [ 1158.393216] ? kasan_unpoison_shadow+0x30/0x40 [ 1158.402593] rtnl_newlink+0x53/0x80 [ 1158.410925] rtnetlink_rcv_msg+0x3a5/0x600 [ 1158.419777] ? validate_linkmsg+0x400/0x400 [ 1158.428620] ? find_held_lock+0x6d/0xd0 [ 1158.437117] ? match_held_lock+0x1b/0x210 [ 1158.445760] ? validate_linkmsg+0x400/0x400 [ 1158.454642] netlink_rcv_skb+0xc7/0x1f0 [ 1158.463150] ? netlink_ack+0x470/0x470 [ 1158.471538] ? netlink_deliver_tap+0x1f3/0x5a0 [ 1158.480607] netlink_unicast+0x2ae/0x350 [ 1158.489099] ? netlink_attachskb+0x340/0x340 [ 1158.497935] ? _copy_from_iter_full+0xde/0x3b0 [ 1158.506945] ? __virt_addr_valid+0xb6/0xf0 [ 1158.515578] ? __check_object_size+0x159/0x240 [ 1158.524515] netlink_sendmsg+0x4d3/0x630 [ 1158.532879] ? netlink_unicast+0x350/0x350 [ 1158.541400] ? netlink_unicast+0x350/0x350 [ 1158.549805] sock_sendmsg+0x94/0xa0 [ 1158.557561] ___sys_sendmsg+0x49d/0x570 [ 1158.565625] ? copy_msghdr_from_user+0x210/0x210 [ 1158.574457] ? __fput+0x1e2/0x330 [ 1158.581948] ? __kasan_slab_free+0x130/0x180 [ 1158.590407] ? kmem_cache_free+0xb6/0x2d0 [ 1158.598574] ? mark_lock+0xc7/0x790 [ 1158.606177] ? task_work_run+0xcf/0x100 [ 1158.614165] ? exit_to_usermode_loop+0x102/0x110 [ 1158.622954] ? __lock_acquire+0x963/0x1ee0 [ 1158.631199] ? lockdep_hardirqs_on+0x260/0x260 [ 1158.639777] ? match_held_lock+0x1b/0x210 [ 1158.647918] ? lockdep_hardirqs_on+0x260/0x260 [ 1158.656501] ? match_held_lock+0x1b/0x210 [ 1158.664643] ? __fget_light+0xa6/0xe0 [ 1158.672423] ? __sys_sendmsg+0xd2/0x150 [ 1158.680334] __sys_sendmsg+0xd2/0x150 [ 1158.688063] ? __ia32_sys_shutdown+0x30/0x30 [ 1158.696435] ? lock_downgrade+0x2e0/0x2e0 [ 1158.704541] ? mark_held_locks+0x1a/0x90 [ 1158.712611] ? mark_held_locks+0x1a/0x90 [ 1158.720619] ? do_syscall_64+0x1e/0x2c0 [ 1158.728530] do_syscall_64+0x78/0x2c0 [ 1158.736254] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 1158.745414] RIP: 0033:0x7f62d505cb87 [ 1158.753070] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 80 00 00 00 00 8b 05 6a 2b 2c 00 48 63 d2 48 63 ff 85 c0 75 18 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 59 f3 c3 0f 1f 80 00 00[87/1817] 48 89 f3 48 [ 1158.780924] RSP: 002b:00007fffd9832268 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 1158.793204] RAX: ffffffffffffffda RBX: 000000005d26048f RCX: 00007f62d505cb87 [ 1158.805111] RDX: 0000000000000000 RSI: 00007fffd98322d0 RDI: 0000000000000003 [ 1158.817055] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000006 [ 1158.828987] R10: 00007f62d50ce260 R11: 0000000000000246 R12: 0000000000000001 [ 1158.840909] R13: 000000000067e540 R14: 0000000000000000 R15: 000000000067ed20 [ 1158.852873] ================================================================== Introduce new function tcf_block_non_null_shared() that verifies block pointer before dereferencing it to obtain index. Use the function in tc_indr_block_ing_cmd() to prevent NULL pointer dereference. Fixes: 955bcb6ea0df ("drivers: net: use flow block API") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-12davinci_cpdma: don't cast dma_addr_t to pointerArnd Bergmann1-13/+13
dma_addr_t may be 64-bit wide on 32-bit architectures, so it is not valid to cast between it and a pointer: drivers/net/ethernet/ti/davinci_cpdma.c: In function 'cpdma_chan_submit_si': drivers/net/ethernet/ti/davinci_cpdma.c:1047:12: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast] drivers/net/ethernet/ti/davinci_cpdma.c: In function 'cpdma_chan_idle_submit_mapped': drivers/net/ethernet/ti/davinci_cpdma.c:1114:12: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] drivers/net/ethernet/ti/davinci_cpdma.c: In function 'cpdma_chan_submit_mapped': drivers/net/ethernet/ti/davinci_cpdma.c:1164:12: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] Solve this by using two separate members in 'struct submit_info'. Since this avoids the use of the 'flag' member, the structure does not even grow in typical configurations. Fixes: 6670acacd59e ("net: ethernet: ti: davinci_cpdma: add dma mapped submit") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-12net: openvswitch: do not update max_headroom if new headroom is equal to old headroomTaehee Yoo1-11/+28
When a vport is deleted, the maximum headroom size would be changed. If the vport which has the largest headroom is deleted, the new max_headroom would be set. But, if the new headroom size is equal to the old headroom size, updating routine is unnecessary. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Tested-by: Greg Rose <gvrose8192@gmail.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-12coresight: Make the coresight_device_fwnode_match declaration's fwnode parameter constNathan Chancellor1-1/+1
Fix Linus' merge error in the parent commit, causing: drivers/hwtracing/coresight/coresight.c:1051:11: error: incompatible pointer types passing 'int (struct device *, void *)' to parameter of type 'int (*)(struct device *, const void *)' [-Werror,-Wincompatible-pointer-types] coresight_device_fwnode_match); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/linux/device.h:173:17: note: passing argument to parameter 'match' here int (*match)(struct device *dev, const void *data)); ^ due to missed header file fixup. Fixes: f632a8170a6b ("Merge tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> [ Greg even sent this patch with his pull request, but I stupidly thought it was the merge resolution fix I had already done as part of the merge. But no, this was the extra fix for the header file that goes with the definition I _had_ caught - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12net/mlx5e: Convert single case statement switch statements into if statementsNathan Chancellor1-23/+11
During the review of commit 1ff2f0fa450e ("net/mlx5e: Return in default case statement in tx_post_resync_params"), Leon and Nick pointed out that the switch statements can be converted to single if statements that return early so that the code is easier to follow. Suggested-by: Leon Romanovsky <leon@kernel.org> Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-12mm/oom_kill.c: remove redundant OOM score normalization in select_bad_process()Tetsuo Handa1-2/+0
Since commit bbbe48029720 ("mm, oom: remove 'prefer children over parent' heuristic") removed the "%s: Kill process %d (%s) score %u or sacrifice child\n" line, oc->chosen_points is no longer used after select_bad_process(). Link: http://lkml.kernel.org/r/1560853435-15575-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Roman Gushchin <guro@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12oom: decouple mems_allowed from oom_unkillable_taskShakeel Butt3-28/+33
Commit ef08e3b4981a ("[PATCH] cpusets: confine oom_killer to mem_exclusive cpuset") introduces a heuristic where a potential oom-killer victim is skipped if the intersection of the potential victim and the current (the process triggered the oom) is empty based on the reason that killing such victim most probably will not help the current allocating process. However the commit 7887a3da753e ("[PATCH] oom: cpuset hint") changed the heuristic to just decrease the oom_badness scores of such potential victim based on the reason that the cpuset of such processes might have changed and previously they may have allocated memory on mems where the current allocating process can allocate from. Unintentionally 7887a3da753e ("[PATCH] oom: cpuset hint") introduced a side effect as the oom_badness is also exposed to the user space through /proc/[pid]/oom_score, so, readers with different cpusets can read different oom_score of the same process. Later, commit 6cf86ac6f36b ("oom: filter tasks not sharing the same cpuset") fixed the side effect introduced by 7887a3da753e by moving the cpuset intersection back to only oom-killer context and out of oom_badness. However the combination of ab290adbaf8f ("oom: make oom_unkillable_task() helper function") and 26ebc984913b ("oom: /proc/<pid>/oom_score treat kernel thread honestly") unintentionally brought back the cpuset intersection check into the oom_badness calculation function. Other than doing cpuset/mempolicy intersection from oom_badness, the memcg oom context is also doing cpuset/mempolicy intersection which is quite wrong and is caught by syzcaller with the following report: kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 28426 Comm: syz-executor.5 Not tainted 5.2.0-rc3-next-20190607 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__read_once_size include/linux/compiler.h:194 [inline] RIP: 0010:has_intersects_mems_allowed mm/oom_kill.c:84 [inline] RIP: 0010:oom_unkillable_task mm/oom_kill.c:168 [inline] RIP: 0010:oom_unkillable_task+0x180/0x400 mm/oom_kill.c:155 Code: c1 ea 03 80 3c 02 00 0f 85 80 02 00 00 4c 8b a3 10 07 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8d 74 24 10 4c 89 f2 48 c1 ea 03 <80> 3c 02 00 0f 85 67 02 00 00 49 8b 44 24 10 4c 8d a0 68 fa ff ff RSP: 0018:ffff888000127490 EFLAGS: 00010a03 RAX: dffffc0000000000 RBX: ffff8880a4cd5438 RCX: ffffffff818dae9c RDX: 100000000c3cc602 RSI: ffffffff818dac8d RDI: 0000000000000001 RBP: ffff8880001274d0 R08: ffff888000086180 R09: ffffed1015d26be0 R10: ffffed1015d26bdf R11: ffff8880ae935efb R12: 8000000061e63007 R13: 0000000000000000 R14: 8000000061e63017 R15: 1ffff11000024ea6 FS: 00005555561f5940(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000607304 CR3: 000000009237e000 CR4: 00000000001426f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Call Trace: oom_evaluate_task+0x49/0x520 mm/oom_kill.c:321 mem_cgroup_scan_tasks+0xcc/0x180 mm/memcontrol.c:1169 select_bad_process mm/oom_kill.c:374 [inline] out_of_memory mm/oom_kill.c:1088 [inline] out_of_memory+0x6b2/0x1280 mm/oom_kill.c:1035 mem_cgroup_out_of_memory+0x1ca/0x230 mm/memcontrol.c:1573 mem_cgroup_oom mm/memcontrol.c:1905 [inline] try_charge+0xfbe/0x1480 mm/memcontrol.c:2468 mem_cgroup_try_charge+0x24d/0x5e0 mm/memcontrol.c:6073 mem_cgroup_try_charge_delay+0x1f/0xa0 mm/memcontrol.c:6088 do_huge_pmd_wp_page_fallback+0x24f/0x1680 mm/huge_memory.c:1201 do_huge_pmd_wp_page+0x7fc/0x2160 mm/huge_memory.c:1359 wp_huge_pmd mm/memory.c:3793 [inline] __handle_mm_fault+0x164c/0x3eb0 mm/memory.c:4006 handle_mm_fault+0x3b7/0xa90 mm/memory.c:4053 do_user_addr_fault arch/x86/mm/fault.c:1455 [inline] __do_page_fault+0x5ef/0xda0 arch/x86/mm/fault.c:1521 do_page_fault+0x71/0x57d arch/x86/mm/fault.c:1552 page_fault+0x1e/0x30 arch/x86/entry/entry_64.S:1156 RIP: 0033:0x400590 Code: 06 e9 49 01 00 00 48 8b 44 24 10 48 0b 44 24 28 75 1f 48 8b 14 24 48 8b 7c 24 20 be 04 00 00 00 e8 f5 56 00 00 48 8b 74 24 08 <89> 06 e9 1e 01 00 00 48 8b 44 24 08 48 8b 14 24 be 04 00 00 00 8b RSP: 002b:00007fff7bc49780 EFLAGS: 00010206 RAX: 0000000000000001 RBX: 0000000000760000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 000000002000cffc RDI: 0000000000000001 RBP: fffffffffffffffe R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000075 R11: 0000000000000246 R12: 0000000000760008 R13: 00000000004c55f2 R14: 0000000000000000 R15: 00007fff7bc499b0 Modules linked in: ---[ end trace a65689219582ffff ]--- RIP: 0010:__read_once_size include/linux/compiler.h:194 [inline] RIP: 0010:has_intersects_mems_allowed mm/oom_kill.c:84 [inline] RIP: 0010:oom_unkillable_task mm/oom_kill.c:168 [inline] RIP: 0010:oom_unkillable_task+0x180/0x400 mm/oom_kill.c:155 Code: c1 ea 03 80 3c 02 00 0f 85 80 02 00 00 4c 8b a3 10 07 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8d 74 24 10 4c 89 f2 48 c1 ea 03 <80> 3c 02 00 0f 85 67 02 00 00 49 8b 44 24 10 4c 8d a0 68 fa ff ff RSP: 0018:ffff888000127490 EFLAGS: 00010a03 RAX: dffffc0000000000 RBX: ffff8880a4cd5438 RCX: ffffffff818dae9c RDX: 100000000c3cc602 RSI: ffffffff818dac8d RDI: 0000000000000001 RBP: ffff8880001274d0 R08: ffff888000086180 R09: ffffed1015d26be0 R10: ffffed1015d26bdf R11: ffff8880ae935efb R12: 8000000061e63007 R13: 0000000000000000 R14: 8000000061e63017 R15: 1ffff11000024ea6 FS: 00005555561f5940(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b2f823000 CR3: 000000009237e000 CR4: 00000000001426f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 The fix is to decouple the cpuset/mempolicy intersection check from oom_unkillable_task() and make sure cpuset/mempolicy intersection check is only done in the global oom context. [shakeelb@google.com: change function name and update comment] Link: http://lkml.kernel.org/r/20190628152421.198994-3-shakeelb@google.com Link: http://lkml.kernel.org/r/20190624212631.87212-3-shakeelb@google.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Reported-by: syzbot+d0fc9d3c166bc5e4a94b@syzkaller.appspotmail.com Acked-by: Roman Gushchin <guro@fb.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Paul Jackson <pj@sgi.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12mm, oom: remove redundant task_in_mem_cgroup() checkShakeel Butt5-47/+9
oom_unkillable_task() can be called from three different contexts i.e. global OOM, memcg OOM and oom_score procfs interface. At the moment oom_unkillable_task() does a task_in_mem_cgroup() check on the given process. Since there is no reason to perform task_in_mem_cgroup() check for global OOM and oom_score procfs interface, those contexts provide NULL memcg and skips the task_in_mem_cgroup() check. However for memcg OOM context, the oom_unkillable_task() is always called from mem_cgroup_scan_tasks() and thus task_in_mem_cgroup() check becomes redundant and effectively dead code. So, just remove the task_in_mem_cgroup() check altogether. Link: http://lkml.kernel.org/r/20190624212631.87212-2-shakeelb@google.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Roman Gushchin <guro@fb.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Paul Jackson <pj@sgi.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12mm, oom: refactor dump_tasks for memcg OOMsShakeel Butt1-28/+40
dump_tasks() traverses all the existing processes even for the memcg OOM context which is not only unnecessary but also wasteful. This imposes a long RCU critical section even from a contained context which can be quite disruptive. Change dump_tasks() to be aligned with select_bad_process and use mem_cgroup_scan_tasks to selectively traverse only processes of the target memcg hierarchy during memcg OOM. Link: http://lkml.kernel.org/r/20190617231207.160865-1-shakeelb@google.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Roman Gushchin <guro@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Paul Jackson <pj@sgi.com> Cc: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12mm: memcontrol: use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks()Tetsuo Handa2-4/+1
Since commit c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations") corrected how CSS_TASK_ITER_PROCS works, mem_cgroup_scan_tasks() can use CSS_TASK_ITER_PROCS in order to check only one thread from each thread group. [penguin-kernel@I-love.SAKURA.ne.jp: remove thread group leader check in oom_evaluate_task()] Link: http://lkml.kernel.org/r/1560853257-14934-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp Link: http://lkml.kernel.org/r/c763afc8-f0ae-756a-56a7-395f625b95fc@i-love.sakura.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12mm/memory-failure.c: clarify error messageJane Chu1-1/+1
Some user who install SIGBUS handler that does longjmp out therefore keeping the process alive is confused by the error message "[188988.765862] Memory failure: 0x1840200: Killing cellsrv:33395 due to hardware memory corruption" Slightly modify the error message to improve clarity. Link: http://lkml.kernel.org/r/1558403523-22079-1-git-send-email-jane.chu@oracle.com Signed-off-by: Jane Chu <jane.chu@oracle.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Acked-by: Pankaj Gupta <pagupta@redhat.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12mm: vmalloc: show number of vmalloc pages in /proc/meminfoRoman Gushchin3-1/+13
Vmalloc() is getting more and more used these days (kernel stacks, bpf and percpu allocator are new top users), and the total % of memory consumed by vmalloc() can be pretty significant and changes dynamically. /proc/meminfo is the best place to display this information: its top goal is to show top consumers of the memory. Since the VmallocUsed field in /proc/meminfo is not in use for quite a long time (it has been defined to 0 by a5ad88ce8c7f ("mm: get rid of 'vmalloc_info' from /proc/meminfo")), let's reuse it for showing the actual physical memory consumption of vmalloc(). Link: http://lkml.kernel.org/r/20190417194002.12369-3-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12mm: smaps: split PSS into componentsLuigi Semenzato3-42/+105
Report separate components (anon, file, and shmem) for PSS in smaps_rollup. This helps understand and tune the memory manager behavior in consumer devices, particularly mobile devices. Many of them (e.g. chromebooks and Android-based devices) use zram for anon memory, and perform disk reads for discarded file pages. The difference in latency is large (e.g. reading a single page from SSD is 30 times slower than decompressing a zram page on one popular device), thus it is useful to know how much of the PSS is anon vs. file. All the information is already present in /proc/pid/smaps, but much more expensive to obtain because of the large size of that procfs entry. This patch also removes a small code duplication in smaps_account, which would have gotten worse otherwise. Also updated Documentation/filesystems/proc.txt (the smaps section was a bit stale, and I added a smaps_rollup section) and Documentation/ABI/testing/procfs-smaps_rollup. [semenzato@chromium.org: v5] Link: http://lkml.kernel.org/r/20190626234333.44608-1-semenzato@chromium.org Link: http://lkml.kernel.org/r/20190626180429.174569-1-semenzato@chromium.org Signed-off-by: Luigi Semenzato <semenzato@chromium.org> Acked-by: Yu Zhao <yuzhao@chromium.org> Cc: Sonny Rao <sonnyrao@chromium.org> Cc: Yu Zhao <yuzhao@chromium.org> Cc: Brian Geffon <bgeffon@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12mm: use down_read_killable for locking mmap_sem in access_remote_vmKonstantin Khlebnikov2-2/+5
This function is used by ptrace and proc files like /proc/pid/cmdline and /proc/pid/environ. Access_remote_vm never returns error codes, all errors are ignored and only size of successfully read data is returned. So, if current task was killed we'll simply return 0 (bytes read). Mmap_sem could be locked for a long time or forever if something goes wrong. Using a killable lock permits cleanup of stuck tasks and simplifies investigation. Link: http://lkml.kernel.org/r/156007494202.3335.16782303099589302087.stgit@buzz Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Reviewed-by: Michal Koutný <mkoutny@suse.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>