aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2018-01-28x86/ftrace: Add one more ENDPROC annotationJosh Poimboeuf1-1/+1
When ORC support was added for the ftrace_64.S code, an ENDPROC for function_hook() was missed. This results in the following warning: arch/x86/kernel/ftrace_64.o: warning: objtool: .entry.text+0x0: unreachable instruction Fixes: e2ac83d74a4d ("x86/ftrace: Fix ORC unwinding from ftrace handlers") Reported-by: Steven Rostedt <rostedt@goodmis.org> Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/20180128022150.dqierscqmt3uwwsr@treble
2018-01-27hwmon: (dell-smm) Disable fan support for Dell Vostro 3360Oleksandr Natalenko1-0/+7
Calling fan related SMM functions implemented by Dell BIOS firmware on Dell Vostro 3360 freeze kernel for about 500ms. Unfortunately, it is unlikely for Dell to fix this since the machine is pretty old, so this commit just disables fan support to make the system usable again. Via "force" module param fan support can be enabled. Link: https://bugzilla.kernel.org/show_bug.cgi?id=195751 Link: http://lkml.iu.edu/hypermail/linux/kernel/1711.2/06083.html Reviewed-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-01-27hwmon: (dell-smm) Disable fan support for Dell Inspiron 7720Pali Rohár1-1/+39
Calling fan related SMM functions implemented by Dell BIOS firmware on Dell Inspiron 7720 freeze kernel for about 500ms. Until Dell fixes it we need to disable fan support for Dell Inspiron 7720 as it makes system unusable. Via "force" module param fan support can be enabled. Reported-by: vova7890@mail.ru Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=195751 Cc: stable@vger.kernel.org # v4.0+, will need backport Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-01-27hwmon: (dell-smm) Enable broken functionality via "force" module paramPali Rohár1-2/+5
Some Dell machines are broken and some functionality is disabled. Show warning into dmesg about this fact and allow user via "force" module param to override brokenness and enable broken functionality. Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-01-27hrtimer: Reset hrtimer cpu base proper on CPU hotplugThomas Gleixner1-0/+3
The hrtimer interrupt code contains a hang detection and mitigation mechanism, which prevents that a long delayed hrtimer interrupt causes a continous retriggering of interrupts which prevent the system from making progress. If a hang is detected then the timer hardware is programmed with a certain delay into the future and a flag is set in the hrtimer cpu base which prevents newly enqueued timers from reprogramming the timer hardware prior to the chosen delay. The subsequent hrtimer interrupt after the delay clears the flag and resumes normal operation. If such a hang happens in the last hrtimer interrupt before a CPU is unplugged then the hang_detected flag is set and stays that way when the CPU is plugged in again. At that point the timer hardware is not armed and it cannot be armed because the hang_detected flag is still active, so nothing clears that flag. As a consequence the CPU does not receive hrtimer interrupts and no timers expire on that CPU which results in RCU stalls and other malfunctions. Clear the flag along with some other less critical members of the hrtimer cpu base to ensure starting from a clean state when a CPU is plugged in. Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the root cause of that hard to reproduce heisenbug. Once understood it's trivial and certainly justifies a brown paperbag. Fixes: 41d2e4949377 ("hrtimer: Tune hrtimer_interrupt hang logic") Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Sewior <bigeasy@linutronix.de> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801261447590.2067@nanos
2018-01-27x86: Mark hpa as a "Designated Reviewer" for the time beingH. Peter Anvin1-11/+1
Due to some unfortunate events, I have not been directly involved in the x86 kernel patch flow for a while now. I have also not been able to ramp back up by now like I had hoped to, and after reviewing what I will need to work on both internally at Intel and elsewhere in the near term, it is clear that I am not going to be able to ramp back up until late 2018 at the very earliest. It is not acceptable to not recognize that this load is currently taken by Ingo and Thomas without my direct participation, so I mark myself as R: (designated reviewer) rather than M: (maintainer) until further notice. This is in fact recognizing the de facto situation for the past few years. I have obviously no intention of going away, and I will do everything within my power to improve Linux on x86 and x86 for Linux. This, however, puts credit where it is due and reflects a change of focus. This patch also removes stale entries for portions of the x86 architecture which have not been maintained separately from arch/x86 for a long time. If there is a reason to re-introduce them then that can happen later. Signed-off-by: H. Peter Anvin <h.peter.anvin@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bruce Schlobohm <bruce.schlobohm@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180125195934.5253-1-hpa@zytor.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-26regulator: Fix build errorMark Brown1-1/+1
3d67fe950707 (regulator: core: Refactor regulator_list_voltage()) missed one user of regulator_list_voltage(), update for that. Signed-off-by: Mark Brown <broonie@kernel.org>
2018-01-26VSOCK: set POLLOUT | POLLWRNORM for TCP_CLOSINGStefan Hajnoczi1-1/+1
select(2) with wfds but no rfds must return when the socket is shut down by the peer. This way userspace notices socket activity and gets -EPIPE from the next write(2). Currently select(2) does not return for virtio-vsock when a SEND+RCV shutdown packet is received. This is because vsock_poll() only sets POLLOUT | POLLWRNORM for TCP_CLOSE, not the TCP_CLOSING state that the socket is in when the shutdown is received. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-26dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed stateAlexey Kodanev1-0/+3
ccid2_hc_tx_rto_expire() timer callback always restarts the timer again and can run indefinitely (unless it is stopped outside), and after commit 120e9dabaf55 ("dccp: defer ccid_hc_tx_delete() at dismantle time"), which moved ccid_hc_tx_delete() (also includes sk_stop_timer()) from dccp_destroy_sock() to sk_destruct(), this started to happen quite often. The timer prevents releasing the socket, as a result, sk_destruct() won't be called. Found with LTP/dccp_ipsec tests running on the bonding device, which later couldn't be unloaded after the tests were completed: unregister_netdevice: waiting for bond0 to become free. Usage count = 148 Fixes: 2a91aa396739 ("[DCCP] CCID2: Initial CCID2 (TCP-Like) implementation") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-26Update the RISC-V MAINTAINERS filePalmer Dabbelt1-2/+2
Now that we're upstream in Linux we've been able to make some infrastructure changes so our port works a bit more like other ports. Specifically: * We now have a mailing list specific to the RISC-V Linux port, hosted at lists.infreadead.org. * We now have a kernel.org git tree where work on our port is coordinated. This patch changes the RISC-V maintainers entry to reflect these new bits of infrastructure. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-01-26regulator: core: Refactor regulator_list_voltage()Maciej Purski1-5/+5
Change _regulator_list_voltage() argument from regulator to regulator_dev in order to provide better separation of core layers. Allow calling _regulator_list_voltage() from functions, with regulator_dev argument. This refactoring is needed in order to implement setting voltage of coupled regulators. Signed-off-by: Maciej Purski <m.purski@samsung.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-01-26regulator: core: Move of_find_regulator_by_node() to of_regulator.cMaciej Purski3-22/+24
As of_find_regulator_by_node() is an of function it should be moved from core.c to of_regulator.c. It provides better separation of device tree functions from the core and allows other of_functions in of_regulator.c to resolve device_node to regulator_dev. This will be useful for implementation of parsing coupled regulators properties. Declare of_find_regulator_by_node() function in internal.h as well as regulator_class and dev_to_rdev(), as they are needed by of_find_regulator_by_node(). Signed-off-by: Maciej Purski <m.purski@samsung.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-01-26x86/mm/64: Tighten up vmalloc_fault() sanity checks on 5-level kernelsAndy Lutomirski1-13/+9
On a 5-level kernel, if a non-init mm has a top-level entry, it needs to match init_mm's, but the vmalloc_fault() code skipped over the BUG_ON() that would have checked it. While we're at it, get rid of the rather confusing 4-level folded "pgd" logic. Cleans-up: b50858ce3e2a ("x86/mm/vmalloc: Add 5-level paging support") Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Neil Berrington <neil.berrington@datacore.com> Link: https://lkml.kernel.org/r/2ae598f8c279b0a29baf75df207e6f2fdddc0a1b.1516914529.git.luto@kernel.org
2018-01-26x86/mm/64: Fix vmapped stack syncing on very-large-memory 4-level systemsAndy Lutomirski1-5/+29
Neil Berrington reported a double-fault on a VM with 768GB of RAM that uses large amounts of vmalloc space with PTI enabled. The cause is that load_new_mm_cr3() was never fixed to take the 5-level pgd folding code into account, so, on a 4-level kernel, the pgd synchronization logic compiles away to exactly nothing. Interestingly, the problem doesn't trigger with nopti. I assume this is because the kernel is mapped with global pages if we boot with nopti. The sequence of operations when we create a new task is that we first load its mm while still running on the old stack (which crashes if the old stack is unmapped in the new mm unless the TLB saves us), then we call prepare_switch_to(), and then we switch to the new stack. prepare_switch_to() pokes the new stack directly, which will populate the mapping through vmalloc_fault(). I assume that we're getting lucky on non-PTI systems -- the old stack's TLB entry stays alive long enough to make it all the way through prepare_switch_to() and switch_to() so that we make it to a valid stack. Fixes: b50858ce3e2a ("x86/mm/vmalloc: Add 5-level paging support") Reported-and-tested-by: Neil Berrington <neil.berrington@datacore.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: stable@vger.kernel.org Cc: Dave Hansen <dave.hansen@intel.com> Cc: Borislav Petkov <bp@alien8.de> Link: https://lkml.kernel.org/r/346541c56caed61abbe693d7d2742b4a380c5001.1516914529.git.luto@kernel.org
2018-01-26spi: dw: Remove unused members from struct chip_dataJarkko Nikula1-2/+0
Local struct chip_data has two members that are not used: - cs. Looks like was never used - enable_dma. Became unused by the commit f89a6d8f43eb ("spi: dw-mid: move to use core SPI DMA mappings"). Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-01-26regulator: add PM suspend and resume hooksChunyan Zhang4-33/+251
In this patch, consumers are allowed to set suspend voltage, and this actually just set the "uV" in constraint::regulator_state, when the regulator_suspend_late() was called by PM core through callback when the system is entering into suspend, the regulator device would act suspend activity then. And it assumes that if any consumer set suspend voltage, the regulator device should be enabled in the suspend state. And if the suspend voltage of a regulator device for all consumers was set zero, the regulator device would be off in the suspend state. This patch also provides a new function hook to regulator devices for resuming from suspend states. Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-01-26regulator: empty the old suspend functionsChunyan Zhang2-77/+2
Regualtor suspend/resume functions should only be called by PM suspend core via registering dev_pm_ops, and regulator devices should implement the callback functions. Thus, any regulator consumer shouldn't call the regulator suspend/resume functions directly. In order to avoid compile errors, two empty functions with the same name still be left for the time being. Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-01-26regulator: leave one item to record whether regulator is enabledChunyan Zhang3-14/+25
The items "disabled" and "enabled" are a little redundant, since only one of them would be set to record if the regulator device should keep on or be switched to off in suspend states. So in this patch, the "disabled" was removed, only leave the "enabled": - enabled == 1 for regulator-on-in-suspend - enabled == 0 for regulator-off-in-suspend - enabled == -1 means do nothing when entering suspend mode. Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-01-26regulator: make regulator voltage be an array to support more statesChunyan Zhang2-30/+51
Some regulator consumers would like to make the regulator device keeping a voltage range output when the system entering into suspend states. Making regulator voltage be an array can allow consumers to set voltage for normal state as well as for suspend states through the same code. Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-01-26regulator: added support for suspend statesChunyan Zhang1-2/+10
Some systems need to set regulators to specific states when they enter low power modes, especially around CPUs. There are many of these modes depending on the particular runtime state. Currently the regulator consumers are not granted permission to change suspend state of regulator devices, the constraints are configured at startup. In order to allow changes in a vlotage range, we need to add new properties for voltage range and a flag to give permission to change the suspend voltage and suspend on/off in suspend mode. Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-01-26spi: orion: Fix a resource leak if the optional "axi" clk is deferredChristophe Jaillet1-5/+8
If the optional "axi" clk is deferred, we still need to undo some initialisation. Especially 'master' must be released. It will be reallocated the next time 'orion_spi_probe()' is called. Add a new label to clean what needs to be cleaned and rename another label to improve the names used. Fixes: 92ae112e477a ("spi: orion: Fix clock resource by adding an optional bus clock") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2018-01-25net: vrf: Add support for sends to local broadcast addressDavid Ahern1-2/+3
Sukumar reported that sends to the local broadcast address (255.255.255.255) are broken. Check for the address in vrf driver and do not redirect to the VRF device - similar to multicast packets. With this change sockets can use SO_BINDTODEVICE to specify an egress interface and receive responses. Note: the egress interface can not be a VRF device but needs to be the enslaved device. https://bugzilla.kernel.org/show_bug.cgi?id=198521 Reported-by: Sukumar Gopalakrishnan <sukumarg1973@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25r8169: fix memory corruption on retrieval of hardware statistics.Francois Romieu1-7/+2
Hardware statistics retrieval hurts in tight invocation loops. Avoid extraneous write and enforce strict ordering of writes targeted to the tally counters dump area address registers. Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Tested-by: Oliver Freyermuth <o.freyermuth@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25orangefs: fix deadlock; do not write i_size in read_iterMartin Brandenburg2-16/+2
After do_readv_writev, the inode cache is invalidated anyway, so i_size will never be read. It will be fetched from the server which will also know about updates from other machines. Fixes deadlock on 32-bit SMP. See https://marc.info/?l=linux-fsdevel&m=151268557427760&w=2 Signed-off-by: Martin Brandenburg <martin@omnibond.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Mike Marshall <hubcap@omnibond.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-26drm/nouveau: Move irq setup/teardown to pci ctor/dtorLyude Paul1-15/+31
For a while we've been having issues with seemingly random interrupts coming from nvidia cards when resuming them. Originally the fix for this was thought to be just re-arming the MSI interrupt registers right after re-allocating our IRQs, however it seems a lot of what we do is both wrong and not even nessecary. This was made apparent by what appeared to be a regression in the mainline kernel that started introducing suspend/resume issues for nouveau: a0c9259dc4e1 (irq/matrix: Spread interrupts on allocation) After this commit was introduced, we started getting interrupts from the GPU before we actually re-allocated our own IRQ (see references below) and assigned the IRQ handler. Investigating this turned out that the problem was not with the commit, but the fact that nouveau even free/allocates it's irqs before and after suspend/resume. For starters: drivers in the linux kernel haven't had to handle freeing/re-allocating their IRQs during suspend/resume cycles for quite a while now. Nouveau seems to be one of the few drivers left that still does this, despite the fact there's no reason we actually need to since disabling interrupts from the device side should be enough, as the kernel is already smart enough to know to disable host-side interrupts for us before going into suspend. Since we were tearing down our IRQs by hand however, that means there was a short period during resume where interrupts could be received before we re-allocated our IRQ which would lead to us getting an unhandled IRQ. Since we never handle said IRQ and re-arm the interrupt registers, this would cause us to miss all of the interrupts from the GPU and cause our init process to start timing out on anything requiring interrupts. So, since this whole setup/teardown every suspend/resume cycle is useless anyway, move irq setup/teardown into the pci subdev's ctor/dtor functions instead so they're only called at driver load and driver unload. This should fix most of the issues with pending interrupts on resume, along with getting suspend/resume for nouveau to work again. As well, this probably means we can also just remove the msi rearm call inside nvkm_pci_init(). But since our main focus here is to fix suspend/resume before 4.15, we'll save that for a later patch. Signed-off-by: Lyude Paul <lyude@redhat.com> Cc: Karol Herbst <kherbst@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Galbraith <efault@gmx.de> Cc: stable@vger.kernel.org Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-01-25net: don't call update_pmtu unconditionallyNicolas Dichtel9-18/+20
Some dst_ops (e.g. md_dst_ops)) doesn't set this handler. It may result to: "BUG: unable to handle kernel NULL pointer dereference at (null)" Let's add a helper to check if update_pmtu is available before calling it. Fixes: 52a589d51f10 ("geneve: update skb dst pmtu on tx path") Fixes: a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path") CC: Roman Kapl <code@rkapl.cz> CC: Xin Long <lucien.xin@gmail.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25net: tcp: close sock if net namespace is exitingDan Streetman3-0/+28
When a tcp socket is closed, if it detects that its net namespace is exiting, close immediately and do not wait for FIN sequence. For normal sockets, a reference is taken to their net namespace, so it will never exit while the socket is open. However, kernel sockets do not take a reference to their net namespace, so it may begin exiting while the kernel socket is still open. In this case if the kernel socket is a tcp socket, it will stay open trying to complete its close sequence. The sock's dst(s) hold a reference to their interface, which are all transferred to the namespace's loopback interface when the real interfaces are taken down. When the namespace tries to take down its loopback interface, it hangs waiting for all references to the loopback interface to release, which results in messages like: unregister_netdevice: waiting for lo to become free. Usage count = 1 These messages continue until the socket finally times out and closes. Since the net namespace cleanup holds the net_mutex while calling its registered pernet callbacks, any new net namespace initialization is blocked until the current net namespace finishes exiting. After this change, the tcp socket notices the exiting net namespace, and closes immediately, releasing its dst(s) and their reference to the loopback interface, which lets the net namespace continue exiting. Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811 Signed-off-by: Dan Streetman <ddstreet@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25perf/x86: Fix perf,x86,cpuhp deadlockPeter Zijlstra1-15/+18
More lockdep gifts, a 5-way lockup race: perf_event_create_kernel_counter() perf_event_alloc() perf_try_init_event() x86_pmu_event_init() __x86_pmu_event_init() x86_reserve_hardware() #0 mutex_lock(&pmc_reserve_mutex); reserve_ds_buffer() #1 get_online_cpus() perf_event_release_kernel() _free_event() hw_perf_event_destroy() x86_release_hardware() #0 mutex_lock(&pmc_reserve_mutex) release_ds_buffer() #1 get_online_cpus() #1 do_cpu_up() perf_event_init_cpu() #2 mutex_lock(&pmus_lock) #3 mutex_lock(&ctx->mutex) sys_perf_event_open() mutex_lock_double() #3 mutex_lock(ctx->mutex) #4 mutex_lock_nested(ctx->mutex, 1); perf_try_init_event() #4 mutex_lock_nested(ctx->mutex, 1) x86_pmu_event_init() intel_pmu_hw_config() x86_add_exclusive() #0 mutex_lock(&pmc_reserve_mutex) Fix it by using ordering constructs instead of locking. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-25perf/core: Fix ctx::mutex deadlockPeter Zijlstra1-1/+7
Lockdep noticed the following 3-way lockup scenario: sys_perf_event_open() perf_event_alloc() perf_try_init_event() #0 ctx = perf_event_ctx_lock_nested(1) perf_swevent_init() swevent_hlist_get() #1 mutex_lock(&pmus_lock) perf_event_init_cpu() #1 mutex_lock(&pmus_lock) #2 mutex_lock(&ctx->mutex) sys_perf_event_open() mutex_lock_double() #2 mutex_lock() #0 mutex_lock_nested() And while we need that perf_event_ctx_lock_nested() for HW PMUs such that they can iterate the sibling list, trying to match it to the available counters, the software PMUs need do no such thing. Exclude them. In particular the swevent triggers the above invertion, while the tpevent PMU triggers a more elaborate one through their event_mutex. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-25perf/core: Fix another perf,trace,cpuhp lock inversionPeter Zijlstra1-2/+24
Lockdep noticed the following 3-way lockup race: perf_trace_init() #0 mutex_lock(&event_mutex) perf_trace_event_init() perf_trace_event_reg() tp_event->class->reg() := tracepoint_probe_register #1 mutex_lock(&tracepoints_mutex) trace_point_add_func() #2 static_key_enable() #2 do_cpu_up() perf_event_init_cpu() #3 mutex_lock(&pmus_lock) #4 mutex_lock(&ctx->mutex) perf_ioctl() #4 ctx = perf_event_ctx_lock() _perf_iotcl() ftrace_profile_set_filter() #0 mutex_lock(&event_mutex) Fudge it for now by noting that the tracepoint state does not depend on the event <-> context relation. Ugly though :/ Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-25perf/core: Fix lock inversion between perf,trace,cpuhpPeter Zijlstra1-2/+11
Lockdep gifted us with noticing the following 4-way lockup scenario: perf_trace_init() #0 mutex_lock(&event_mutex) perf_trace_event_init() perf_trace_event_reg() tp_event->class->reg() := tracepoint_probe_register #1 mutex_lock(&tracepoints_mutex) trace_point_add_func() #2 static_key_enable() #2 do_cpu_up() perf_event_init_cpu() #3 mutex_lock(&pmus_lock) #4 mutex_lock(&ctx->mutex) perf_event_task_disable() mutex_lock(&current->perf_event_mutex) #4 ctx = perf_event_ctx_lock() #5 perf_event_for_each_child() do_exit() task_work_run() __fput() perf_release() perf_event_release_kernel() #4 mutex_lock(&ctx->mutex) #5 mutex_lock(&event->child_mutex) free_event() _free_event() event->destroy() := perf_trace_destroy #0 mutex_lock(&event_mutex); Fix that by moving the free_event() out from under the locks. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-25mtd: nand: sunxi: Fix ECC strength choiceMiquel Raynal1-1/+7
When the requested ECC strength does not exactly match the strengths supported by the ECC engine, the driver is selecting the closest strength meeting the 'selected_strength > requested_strength' constraint. Fix the fact that, in this particular case, ecc->strength value was not updated to match the 'selected_strength'. For instance, one can encounter this issue when no ECC requirement is filled in the device tree while the NAND chip minimum requirement is not a strength/step_size combo natively supported by the ECC engine. Fixes: 1fef62c1423b ("mtd: nand: add sunxi NAND flash controller support") Cc: <stable@vger.kernel.org> Suggested-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Miquel Raynal <miquel.raynal@free-electrons.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2018-01-25mtd: nand: gpmi: Fix subpage readsBoris Brezillon1-5/+13
Commit 25f815f66a14 ("mtd: nand: force drivers to explicitly send READ/PROG commands") added a call to nand_read_page_op() in gpmi_ecc_read_page(), which means this function now sends a READ0 command and place the data pointer at the beginning of the page. This logic is breaking gpmi_ecc_read_subpage() which was calling gpmi_ecc_read_page() and expected it to only retrieve the data without sending the READ0 command. Create a gpmi_ecc_read_page_data() helper which only does the data retrieval and ECC correction steps and implement gpmi_ecc_read_page() as a wrapper that calls nand_read_page_op()+gpmi_ecc_read_page_data(). This way, gpmi_ecc_read_subpage() can call gpmi_ecc_read_page_data() which restores the logic we had before commit 25f815f66a14 ("mtd: nand: force drivers to explicitly send READ/PROG commands"). Fixes: 25f815f66a14 ("mtd: nand: force drivers to explicitly send READ/PROG commands") Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reviewed-by: Miquel Raynal <miquel.raynal@free-electrons.com> Acked-by: Han Xu <han.xu@nxp.com>
2018-01-24net/ibm/emac: wrong bit is used for STA control register writeIvan Mikhaylov1-1/+1
STA control register has areas of mode and opcodes for opeations. 18 bit is using for mode selection, where 0 is old MIO/MDIO access method and 1 is indirect access mode. 19-20 bits are using for setting up read/write operation(STA opcodes). In current state 'read' is set into old MIO/MDIO mode with 19 bit and write operation is set into 18 bit which is mode selection, not a write operation. To correlate write with read we set it into 20 bit. All those bit operations are MSB 0 based. Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24net/ibm/emac: add 8192 rx/tx fifo sizeIvan Mikhaylov2-0/+8
emac4syn chips has availability to use 8192 rx/tx fifo buffer sizes, in current state if we set it up in dts 8192 as example, we will get only 2048 which may impact on network speed. Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24Revert "Input: synaptics_rmi4 - use devm_device_add_group() for attributes in F01"Nick Dyer1-3/+9
Since the sysfs attribute hangs off the RMI bus, which doesn't go away during firmware flash, it needs to be explicitly removed, otherwise we would try and register the same attribute twice. This reverts commit 36a44af5c176d619552d99697433261141dd1296. Signed-off-by: Nick Dyer <nick@shmanahar.org> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2018-01-24vhost: do not try to access device IOTLB when not initializedJason Wang1-0/+4
The code will try to access dev->iotlb when processing VHOST_IOTLB_INVALIDATE even if it was not initialized which may lead to NULL pointer dereference. Fixes this by check dev->iotlb before. Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24vhost: use mutex_lock_nested() in vhost_dev_lock_vqs()Jason Wang1-1/+1
We used to call mutex_lock() in vhost_dev_lock_vqs() which tries to hold mutexes of all virtqueues. This may confuse lockdep to report a possible deadlock because of trying to hold locks belong to same class. Switch to use mutex_lock_nested() to avoid false positive. Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API") Reported-by: syzbot+dbb7c1161485e61b0241@syzkaller.appspotmail.com Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24i40e: flower: check if TC offload is enabled on a netdevJakub Kicinski1-0/+2
Since TC block changes drivers are required to check if the TC hw offload flag is set on the interface themselves. Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower") Fixes: 44ae12a768b7 ("net: sched: move the can_offload check from binding phase to rule insertion phase") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Amritha Nambiar <amritha.nambiar@intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24sparc64: fix typo in CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64Corentin Labbe1-1/+1
This patch fixes the typo CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64 Fixes: 81658ad0d923 ("sparc64: Add CAMELLIA driver making use of the new camellia opcodes.") Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24qed: Free reserved MR tidMichal Kalderon1-11/+17
A tid was allocated for reserved MR during initialization but not freed. This lead to an annoying output message during rdma unload flow. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24qed: Remove reserveration of dpi for kernelMichal Kalderon1-3/+0
Double reservation for kernel dedicated dpi was performed. Once in the core module and once in qedr. Remove the reservation from core. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24kcm: Check if sk_user_data already set in kcm_attachTom Herbert1-2/+14
This is needed to prevent sk_user_data being overwritten. The check is done under the callback lock. This should prevent a socket from being attached twice to a KCM mux. It also prevents a socket from being attached for other use cases of sk_user_data as long as the other cases set sk_user_data under the lock. Followup work is needed to unify all the use cases of sk_user_data to use the same locking. Reported-by: syzbot+114b15f2be420a8886c3@syzkaller.appspotmail.com Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Signed-off-by: Tom Herbert <tom@quantonium.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24kcm: Only allow TCP sockets to be attached to a KCM muxTom Herbert1-2/+7
TCP sockets for IPv4 and IPv6 that are not listeners or in closed stated are allowed to be attached to a KCM mux. Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Reported-by: syzbot+8865eaff7f9acd593945@syzkaller.appspotmail.com Signed-off-by: Tom Herbert <tom@quantonium.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25MAINTAINERS: update email address for James MorrisJames Morris1-1/+1
Update my email address. Signed-off-by: James Morris <jmorris@namei.org>
2018-01-24net: sched: fix TCF_LAYER_LINK case in tcf_get_base_ptrWolfgang Bumiller1-1/+1
TCF_LAYER_LINK and TCF_LAYER_NETWORK returned the same pointer as skb->data points to the network header. Use skb_mac_header instead. Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24net: sched: em_nbyte: don't add the data offset twiceWolfgang Bumiller1-1/+1
'ptr' is shifted by the offset and then validated, the memcmp should not add it a second time. Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24Btrfs: fix stale entries in readdirJosef Bacik1-18/+8
In fixing the readdir+pagefault deadlock I accidentally introduced a stale entry regression in readdir. If we get close to full for the temporary buffer, and then skip a few delayed deletions, and then try to add another entry that won't fit, we will emit the entries we found and retry. Unfortunately we delete entries from our del_list as we find them, assuming we won't need them. However our pos will be with whatever our last entry was, which could be before the delayed deletions we skipped, so the next search will add the deleted entries back into our readdir buffer. So instead don't delete entries we find in our del_list so we can make sure we always find our delayed deletions. This is a slight perf hit for readdir with lots of pending deletions, but hopefully this isn't a common occurrence. If it is we can revist this and optimize it. cc: stable@vger.kernel.org Fixes: 23b5ec74943f ("btrfs: fix readdir deadlock with pagefault") Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-24MAINTAINERS: clarify that only verified bugs should be submitted to security@Willy Tarreau1-1/+9
We're seeing a raise of automated reports from testing tools and reports about address leaks that are not really exploitable as-is, many of which do not represent an immediate risk justifying to work in closed places. Signed-off-by: Willy Tarreau <w@1wt.eu> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-24regulator: qcom_spmi: Use regmap helpers for enable/disable/is_enabled callbackAxel Lin1-53/+31
Setup .enable_reg/.enable_mask/.enable_val fields, then we can use the regmap helpers for enable/disable/is_enabled callback implementation. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Mark Brown <broonie@kernel.org>