aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2019-06-05irqchip/mips-gic: Use the correct local interrupt map registersPaul Burton2-2/+32
The MIPS GIC contains a block of registers used to map local interrupts to a particular CPU interrupt pin. Since these registers are found at a consecutive range of addresses we access them using an index, via the (read|write)_gic_v[lo]_map accessor functions. We currently use values from enum mips_gic_local_interrupt as those indices. Unfortunately whilst enum mips_gic_local_interrupt provides the correct offsets for bits in the pending & mask registers, the ordering of the map registers is subtly different... Compared with the ordering of pending & mask bits, the map registers move the FDC from the end of the list to index 3 after the timer interrupt. As a result the performance counter & software interrupts are therefore at indices 4-6 rather than indices 3-5. Notably this causes problems with performance counter interrupts being incorrectly mapped on some systems, and presumably will also cause problems for FDC interrupts. Introduce a function to map from enum mips_gic_local_interrupt to the index of the corresponding map register, and use it to ensure we access the map registers for the correct interrupts. Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: a0dc5cb5e31b ("irqchip: mips-gic: Simplify gic_local_irq_domain_map()") Fixes: da61fcf9d62a ("irqchip: mips-gic: Use irq_cpu_online to (un)mask all-VP(E) IRQs") Reported-and-tested-by: Archer Yan <ayan@wavecomp.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-06-05irqchip/ti-sci-inta: Fix kernel crash if irq_create_fwspec_mapping failPeter Ujfalusi1-2/+2
irq_create_fwspec_mapping() can fail, returning 0 as parent_virq. In this case vint_desc is going to be NULL in ti_sci_inta_alloc_irq() which will cause NULL pointer dereference. Also note that irq_create_fwspec_mapping() returns 'unsigned int' so the check '<=' was wrong. Use -EINVAL if irq_create_fwspec_mapping() returned with 0. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-06-05irqchip/irq-csky-mpintc: Support auto irq deliver to all cpusGuo Ren1-2/+13
The csky,mpintc could deliver a external irq to one cpu or all cpus, but it couldn't deliver a external irq to a group of cpus with cpu_mask. So we only use auto deliver mode when affinity mask_val is equal to cpu_present_mask. There is no limitation for only two cpus in SMP system. Signed-off-by: Guo Ren <ren_guo@c-sky.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-03iommu/dma-iommu: Remove iommu_dma_map_msi_msg()Julien Grall2-25/+0
A recent change split iommu_dma_map_msi_msg() in two new functions. The function was still implemented to avoid modifying all the callers at once. Now that all the callers have been reworked, iommu_dma_map_msi_msg() can be removed. Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Acked-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-03irqchip/gic-v3-mbi: Don't map the MSI page in mbi_compose_m{b, s}i_msg()Julien Grall1-2/+8
The functions mbi_compose_m{b, s}i_msg may be called from non-preemptible context. However, on RT, iommu_dma_map_msi_msg() requires to be called from a preemptible context. A recent patch split iommu_dma_map_msi_msg in two new functions: one that should be called in preemptible context, the other does not have any requirement. The GICv3 MSI driver is reworked to avoid executing preemptible code in non-preemptible context. This can be achieved by preparing the MSI mapping when allocating the MSI interrupt. Signed-off-by: Julien Grall <julien.grall@arm.com> [maz: only call iommu_dma_prepare_msi once, fix commit log accordingly] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-03irqchip/ls-scfg-msi: Don't map the MSI page in ls_scfg_msi_compose_msg()Julien Grall1-1/+6
ls_scfg_msi_compose_msg() may be called from non-preemptible context. However, on RT, iommu_dma_map_msi_msg() requires to be called from a preemptible context. A recent patch split iommu_dma_map_msi_msg() in two new functions: one that should be called in preemptible context, the other does not have any requirement. The FreeScale SCFG MSI driver is reworked to avoid executing preemptible code in non-preemptible context. This can be achieved by preparing the MSI maping when allocating the MSI interrupt. Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-03irqchip/gic-v3-its: Don't map the MSI page in its_irq_compose_msi_msg()Julien Grall1-1/+6
its_irq_compose_msi_msg() may be called from non-preemptible context. However, on RT, iommu_dma_map_msi_msg requires to be called from a preemptible context. A recent change split iommu_dma_map_msi_msg() in two new functions: one that should be called in preemptible context, the other does not have any requirement. The GICv3 ITS driver is reworked to avoid executing preemptible code in non-preemptible context. This can be achieved by preparing the MSI mapping when allocating the MSI interrupt. Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-03irqchip/gicv2m: Don't map the MSI page in gicv2m_compose_msi_msg()Julien Grall1-1/+7
gicv2m_compose_msi_msg() may be called from non-preemptible context. However, on RT, iommu_dma_map_msi_msg() requires to be called from a preemptible context. A recent change split iommu_dma_map_msi_msg() in two new functions: one that should be called in preemptible context, the other does not have any requirement. The GICv2m driver is reworked to avoid executing preemptible code in non-preemptible context. This can be achieved by preparing the MSI mapping when allocating the MSI interrupt. Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-03iommu/dma-iommu: Split iommu_dma_map_msi_msg() in two partsJulien Grall3-9/+63
On RT, iommu_dma_map_msi_msg() may be called from non-preemptible context. This will lead to a splat with CONFIG_DEBUG_ATOMIC_SLEEP as the function is using spin_lock (they can sleep on RT). iommu_dma_map_msi_msg() is used to map the MSI page in the IOMMU PT and update the MSI message with the IOVA. Only the part to lookup for the MSI page requires to be called in preemptible context. As the MSI page cannot change over the lifecycle of the MSI interrupt, the lookup can be cached and re-used later on. iomma_dma_map_msi_msg() is now split in two functions: - iommu_dma_prepare_msi(): This function will prepare the mapping in the IOMMU and store the cookie in the structure msi_desc. This function should be called in preemptible context. - iommu_dma_compose_msi_msg(): This function will update the MSI message with the IOVA when the device is behind an IOMMU. Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Acked-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-03genirq/msi: Add a new field in msi_desc to store an IOMMU cookieJulien Grall2-0/+29
When an MSI doorbell is located downstream of an IOMMU, it is required to swizzle the physical address with an appropriately-mapped IOVA for any device attached to one of our DMA ops domain. At the moment, the allocation of the mapping may be done when composing the message. However, the composing may be done in non-preemtible context while the allocation requires to be called from preemptible context. A follow-up change will split the current logic in two functions requiring to keep an IOMMU cookie per MSI. A new field is introduced in msi_desc to store an IOMMU cookie. As the cookie may not be required in some configuration, the field is protected under a new config CONFIG_IRQ_MSI_IOMMU. A pair of helpers has also been introduced to access the field. Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-01arm64: arch_k3: Enable interrupt controller driversLokesh Vutla1-0/+5
Select the TISCI Interrupt Router, Aggregator drivers and all its dependencies for TI's SoCs based on K3 architecture. Suggested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-01irqchip/ti-sci-inta: Add msi domain supportLokesh Vutla2-1/+40
Add a msi domain that is child to the INTA domain. Clients uses the INTA MSI bus layer to allocate irqs in this MSI domain. Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-01soc: ti: Add MSI domain bus support for Interrupt AggregatorLokesh Vutla7-0/+189
With the system coprocessor managing the range allocation of the inputs to Interrupt Aggregator, it is difficult to represent the device IRQs from DT. The suggestion is to use MSI in such cases where devices wants to allocate and group interrupts dynamically. Create a MSI domain bus layer that allocates and frees MSIs for a device. APIs that are implemented: - ti_sci_inta_msi_create_irq_domain() that creates a MSI domain - ti_sci_inta_msi_domain_alloc_irqs() that creates MSIs for the specified device and resource. - ti_sci_inta_msi_domain_free_irqs() frees the irqs attached to the device. - ti_sci_inta_msi_get_virq() for getting the virq attached to a specific event. Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-01irqchip/ti-sci-inta: Add support for Interrupt Aggregator driverLokesh Vutla4-0/+589
Texas Instruments' K3 generation SoCs has an IP Interrupt Aggregator which is an interrupt controller that does the following: - Converts events to interrupts that can be understood by an interrupt router. - Allows for multiplexing of events to interrupts. Configuration of the interrupt aggregator registers can only be done by a system co-processor and the driver needs to send a message to this co processor over TISCI protocol. Add the required infrastructure to allow the allocation and routing of these events. Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-01dt-bindings: irqchip: Introduce TISCI Interrupt Aggregator bindingsLokesh Vutla2-0/+67
Add the DT binding documentation for Interrupt Aggregator driver. Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-01irqchip/ti-sci-intr: Add support for Interrupt Router driverLokesh Vutla4-0/+287
Texas Instruments' K3 generation SoCs has an IP Interrupt Router that does allows for redirection of input interrupts to host interrupt controller. Interrupt Router inputs are either from a peripheral or from an Interrupt Aggregator which is another interrupt controller. Configuration of the interrupt router registers can only be done by a system co-processor and the driver needs to send a message to this co processor over TISCI protocol. Add support for Interrupt Router driver over TISCI protocol. Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-01dt-bindings: irqchip: Introduce TISCI Interrupt router bindingsLokesh Vutla2-0/+83
Add the DT binding documentation for Interrupt router driver. Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-01gpio: thunderx: Use the default parent apis for {request,release}_resourcesLokesh Vutla1-12/+4
thunderx_gpio_irq_{request,release}_resources apis are trying to {request,release} resources on parent interrupt. There are default apis doing the same. Use the default parent apis instead of writing the same code snippet. Cc: linux-gpio@vger.kernel.org Cc: Linus Walleij <linus.walleij@linaro.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-01genirq: Introduce irq_chip_{request,release}_resource_parent() apisLokesh Vutla2-0/+29
Introduce irq_chip_{request,release}_resource_parent() apis so that these can be used in hierarchical irqchips. Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-01firmware: ti_sci: Add helper apis to manage resourcesLokesh Vutla2-0/+184
Each resource with in the device can be uniquely identified as defined by TISCI. Since this is generic across the devices, resource allocation also can be made generic instead of each client driver handling the resource. So add helper apis to manage the resource. Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Acked-by: Nishanth Menon <nm@ti.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-01firmware: ti_sci: Add RM mapping table for am654Peter Ujfalusi2-1/+25
Add the resource mapping table for AM654 SoC as defined in http://downloads.ti.com/tisci/esd/latest/5_soc_doc/am6x/resasg_types.html Introduce a new compatible for AM654 "ti,am654-sci" for using this resource map table. Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Acked-by: Nishanth Menon <nm@ti.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-01firmware: ti_sci: Add support for IRQ managementLokesh Vutla3-0/+331
TISCI abstracts the handling of IRQ routes where interrupt sources are not directly connected to host interrupt controller. Add support for the set of TISCI commands for requesting and releasing IRQs. Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Acked-by: Nishanth Menon <nm@ti.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-01firmware: ti_sci: Add support for RM core opsLokesh Vutla3-0/+239
TISCI provides support for getting the resources(IRQ, RING etc..) assigned to a specific device. These resources can be handled by the client and in turn sends TISCI cmd to configure the resources. It is very important that client should keep track on usage of these resources. Add support for TISCI commands to get resource ranges. Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Acked-by: Nishanth Menon <nm@ti.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-05-01firmware: ti_sci: Add support to get TISCI handle using of_phandleGrygorii Strashko2-0/+100
TISCI has been updated to have support for Resource management(like interrupts etc..). And there can be multiple device instances of a resource type in a SoC. So every driver corresponding to a resource type should get a TISCI handle so that it can make TISCI calls. And each DT node corresponding to a device should exist under its corresponding bus node as per the SoC architecture. But existing apis in TISCI library assumes that all TISCI users are child nodes of TISCI. Which is not true in the above case. So introduce (devm_)ti_sci_get_by_phandle() apis that can be used by TISCI users to get TISCI handle using of phandle property. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Acked-by: Nishanth Menon <nm@ti.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-29irqchip/renesas-intc-irqpin: Remove devm_kzalloc() error printingGeert Uytterhoeven1-3/+1
There is no need to print a message if devm_kzalloc() fails, as the memory allocation core already takes care of that. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-29irqchip: Remove unneeded select IRQ_DOMAINGeert Uytterhoeven1-6/+0
IRQ_DOMAIN_HIERARCHY selects IRQ_DOMAIN, hence there is no need for drivers to select both. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-29irqchip/gic-v3-its: Make free_lpi_range a little cheaperRasmus Villemoes1-30/+31
Using list_add + list_sort to insert an element and keeping the list sorted is a somewhat blunt instrument; one can find the right place to insert in fewer lines of code than the cmp callback uses. Moreover, walking the entire list afterwards to merge adjacent ranges is overkill, since we know that only the just-inserted element may be merged with its neighbours. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-29irqchip/gic-v3-its: Drop redundant initialization in mk_lpi_rangeRasmus Villemoes1-2/+1
There's no reason to ask kmalloc() to zero the allocation, since all the fields get initialized immediately afterwards. Except that there's also not any reason to initialize the ->entry member, since the element gets added to the lpi_range_list immediately. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-29irqchip/gic-v3-its: Move allocation outside mutexRasmus Villemoes1-9/+6
There's no reason to do the allocation of the new lpi_range inside the lpi_range_lock. One could change the code to avoid the allocation altogether in case the freed range can be merged with one or two existing ranges (in which case the allocation would naturally be done under the lock), but it's probably not worth complicating the code for that. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-29irqchip/stm32: Use a platform driver for stm32mp1-exti deviceFabien Dessenne1-93/+140
This irqchip driver uses the hwspinlock framework (coprocessor HW regs access concurrency) for the stm32mp1-exti device. Hence, this driver needs to handle the hwspinlock driver dependency using the deferred probe mechanism which requires to move this driver into a platform one with a probe() ops. This applies only for the device which is "st,stm32mp1-exti" compatible, the management of the other devices (st,stm32h7-exti / st,stm32-exti) is kept unchanged (use IRQCHIP_DECLARE) Signed-off-by: Fabien Dessenne <fabien.dessenne@st.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-29irqchip/gic-pm: Fix suspend handlingSameer Pujar1-0/+2
If interrupts are enabled for a non-root GIC device that uses the gic-pm driver, when system suspend occurs, the current interrupt state is not saved and restored correctly and so interrupts do not work again on resuming the system. Add a late suspend handler to save and restore the state for these devices. Suggested-by: Jonathan Hunter <jonathanh@nvidia.com> Signed-off-by: Sameer Pujar <spujar@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-29irqchip/gic-pm: Update driver to use clk_bulk APIsSameer Pujar1-37/+37
gic-pm driver is using pm-clk framework to manage clock resources, where clocks remain always ON. This happens on Tegra devices which use BPMP co-processor to manage the clocks. Calls to BPMP are always blocking and hence it is necessary to enable/disable clocks during prepare/unprepare phase respectively. When pm-clk is used, prepare count of clock is not balanced until pm_clk_remove() happens. Clock is prepared in the driver probe() and thus prepare count of clock remains non-zero, which in turn keeps clock ON always. Please note that above mentioned behavior is specific to Tegra devices using BPMP for clock management and this should not be seen on other devices. Though this patch uses clk_bulk APIs to address the mentioned behavior, this works fine for all devices. To simplify gic_get_clocks() API is removed and instead probe can do necessary setup. Suggested-by: Mohan Kumar D <mkumard@nvidia.com> Signed-off-by: Sameer Pujar <spujar@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-29irq/irqdomain: Fix typo in the comment on top of __irq_domain_alloc_irqs()Julien Grall1-1/+1
The word 'number' has been misspelt in the comment on top of _irq_domain_alloc_irqs(). Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-29irqchip/imx-irqsteer: Use devm_platform_ioremap_resource() to simplify codeAnson Huang1-3/+1
Use the new helper devm_platform_ioremap_resource() which wraps the platform_get_resource() and devm_ioremap_resource() together, to simplify the code. Signed-off-by: Anson Huang <Anson.Huang@nxp.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-29irqchip/gic-v3-its: Fix typo in a comment in its_msi_prepare()Julien Grall1-1/+1
The word 'entirely' has been misspelt in a comment in its_msi_prepare(). Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-29irqchip/gic-v3-its: fix some definitions of inner cacheability attributesHongbo Yao1-6/+6
Some definitions of Inner Cacheability attibutes need to be corrected. Fixes: 8c828a535e29f ("irqchip/gicv3-its: Restore all cacheability attributes") Signed-off-by: Hongbo Yao <yaohongbo@huawei.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-29irqchip/bcm: Restore registration print with %pOFFlorian Fainelli3-0/+8
It is useful to print which interrupt controllers are registered in the system and which parent IRQ they use, especially given that L2 interrupt controllers do not call request_irq() on their parent interrupt and do not appear under /proc/interrupts for that reason. We used to print the base register address virtual address which had little value, use %pOF to print the path to the Device Tree node which maps to the physical address more easily and is what people need to troubleshoot systems. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2019-04-14Linux 5.1-rc5Linus Torvalds1-1/+1
2019-04-14fs: prevent page refcount overflow in pipe_buf_getMatthew Wilcox5-15/+29
Change pipe_buf_get() to return a bool indicating whether it succeeded in raising the refcount of the page (if the thing in the pipe is a page). This removes another mechanism for overflowing the page refcount. All callers converted to handle a failure. Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Matthew Wilcox <willy@infradead.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-14mm: prevent get_user_pages() from overflowing page refcountLinus Torvalds2-12/+49
If the page refcount wraps around past zero, it will be freed while there are still four billion references to it. One of the possible avenues for an attacker to try to make this happen is by doing direct IO on a page multiple times. This patch makes get_user_pages() refuse to take a new page reference if there are already more than two billion references to the page. Reported-by: Jann Horn <jannh@google.com> Acked-by: Matthew Wilcox <willy@infradead.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-14mm: add 'try_get_page()' helper functionLinus Torvalds1-0/+9
This is the same as the traditional 'get_page()' function, but instead of unconditionally incrementing the reference count of the page, it only does so if the count was "safe". It returns whether the reference count was incremented (and is marked __must_check, since the caller obviously has to be aware of it). Also like 'get_page()', you can't use this function unless you already had a reference to the page. The intent is that you can use this exactly like get_page(), but in situations where you want to limit the maximum reference count. The code currently does an unconditional WARN_ON_ONCE() if we ever hit the reference count issues (either zero or negative), as a notification that the conditional non-increment actually happened. NOTE! The count access for the "safety" check is inherently racy, but that doesn't matter since the buffer we use is basically half the range of the reference count (ie we look at the sign of the count). Acked-by: Matthew Wilcox <willy@infradead.org> Cc: Jann Horn <jannh@google.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-14mm: make page ref count overflow check tighter and more explicitLinus Torvalds1-1/+5
We have a VM_BUG_ON() to check that the page reference count doesn't underflow (or get close to overflow) by checking the sign of the count. That's all fine, but we actually want to allow people to use a "get page ref unless it's already very high" helper function, and we want that one to use the sign of the page ref (without triggering this VM_BUG_ON). Change the VM_BUG_ON to only check for small underflows (or _very_ close to overflowing), and ignore overflows which have strayed into negative territory. Acked-by: Matthew Wilcox <willy@infradead.org> Cc: Jann Horn <jannh@google.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-04-12clk: imx: Fix PLL_1416X not rounding ratesLeonard Crestez1-1/+1
Code which initializes the "clk_init_data.ops" checks pll->rate_table before that field is ever assigned to so it always picks "clk_pll1416x_min_ops". This breaks dynamic rate rounding for features such as cpufreq. Fix by checking pll_clk->rate_table instead, here pll_clk refers to the constant initialization data coming from per-soc clk driver. Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> Fixes: 8646d4dcc7fb ("clk: imx: Add PLLs driver for imx8mm soc") Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-04-12clk: mediatek: fix clk-gate flag settingWeiyi Lu1-2/+1
CLK_SET_RATE_PARENT would be dropped. Merge two flag setting together to correct the error. Fixes: 5a1cc4c27ad2 ("clk: mediatek: Add flags to mtk_gate") Cc: <stable@vger.kernel.org> Signed-off-by: Weiyi Lu <weiyi.lu@mediatek.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-04-12arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result valueWill Deacon1-8/+8
Rather embarrassingly, our futex() FUTEX_WAKE_OP implementation doesn't explicitly set the return value on the non-faulting path and instead leaves it holding the result of the underlying atomic operation. This means that any FUTEX_WAKE_OP atomic operation which computes a non-zero value will be reported as having failed. Regrettably, I wrote the buggy code back in 2011 and it was upstreamed as part of the initial arm64 support in 2012. The reasons we appear to get away with this are: 1. FUTEX_WAKE_OP is rarely used and therefore doesn't appear to get exercised by futex() test applications 2. If the result of the atomic operation is zero, the system call behaves correctly 3. Prior to version 2.25, the only operation used by GLIBC set the futex to zero, and therefore worked as expected. From 2.25 onwards, FUTEX_WAKE_OP is not used by GLIBC at all. Fix the implementation by ensuring that the return value is either 0 to indicate that the atomic operation completed successfully, or -EFAULT if we encountered a fault when accessing the user mapping. Cc: <stable@kernel.org> Fixes: 6170a97460db ("arm64: Atomic operations") Signed-off-by: Will Deacon <will.deacon@arm.com>
2019-04-12iommu/amd: Set exclusion range correctlyJoerg Roedel1-1/+1
The exlcusion range limit register needs to contain the base-address of the last page that is part of the range, as bits 0-11 of this register are treated as 0xfff by the hardware for comparisons. So correctly set the exclusion range in the hardware to the last page which is _in_ the range. Fixes: b2026aa2dce44 ('x86, AMD IOMMU: add functions for programming IOMMU MMIO space') Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-04-12clang-format: Update with the latest for_each macro listMiguel Ojeda1-0/+24
Re-run the shell fragment that generated the original list now that there are two dozens of new entries after v5.1's merge window. Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
2019-04-12perf/core: Fix perf_event_disable_inatomic() racePeter Zijlstra2-11/+45
Thomas-Mich Richter reported he triggered a WARN()ing from event_function_local() on his s390. The problem boils down to: CPU-A CPU-B perf_event_overflow() perf_event_disable_inatomic() @pending_disable = 1 irq_work_queue(); sched-out event_sched_out() @pending_disable = 0 sched-in perf_event_overflow() perf_event_disable_inatomic() @pending_disable = 1; irq_work_queue(); // FAILS irq_work_run() perf_pending_event() if (@pending_disable) perf_event_disable_local(); // WHOOPS The problem exists in generic, but s390 is particularly sensitive because it doesn't implement arch_irq_work_raise(), nor does it call irq_work_run() from it's PMU interrupt handler (nor would that be sufficient in this case, because s390 also generates perf_event_overflow() from pmu::stop). Add to that the fact that s390 is a virtual architecture and (virtual) CPU-A can stall long enough for the above race to happen, even if it would self-IPI. Adding a irq_work_sync() to event_sched_in() would work for all hardare PMUs that properly use irq_work_run() but fails for software PMUs. Instead encode the CPU number in @pending_disable, such that we can tell which CPU requested the disable. This then allows us to detect the above scenario and even redirect the IPI to make up for the failed queue. Reported-by: Thomas-Mich Richter <tmricht@linux.ibm.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-04-11block: fix the return errno for direct IOJason Yan1-4/+4
If the last bio returned is not dio->bio, the status of the bio will not assigned to dio->bio if it is error. This will cause the whole IO status wrong. ksoftirqd/21-117 [021] ..s. 4017.966090: 8,0 C N 4883648 [0] <idle>-0 [018] ..s. 4017.970888: 8,0 C WS 4924800 + 1024 [0] <idle>-0 [018] ..s. 4017.970909: 8,0 D WS 4935424 + 1024 [<idle>] <idle>-0 [018] ..s. 4017.970924: 8,0 D WS 4936448 + 321 [<idle>] ksoftirqd/21-117 [021] ..s. 4017.995033: 8,0 C R 4883648 + 336 [65475] ksoftirqd/21-117 [021] d.s. 4018.001988: myprobe1: (blkdev_bio_end_io+0x0/0x168) bi_status=7 ksoftirqd/21-117 [021] d.s. 4018.001992: myprobe: (aio_complete_rw+0x0/0x148) x0=0xffff802f2595ad80 res=0x12a000 res2=0x0 We always have to assign bio->bi_status to dio->bio.bi_status because we will only check dio->bio.bi_status when we return the whole IO to the upper layer. Fixes: 542ff7bf18c6 ("block: new direct I/O implementation") Cc: stable@vger.kernel.org Cc: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-11Revert "SUNRPC: Micro-optimise when the task is known not to be sleeping"Trond Myklebust2-45/+8
This reverts commit 009a82f6437490c262584d65a14094a818bcb747. The ability to optimise here relies on compiler being able to optimise away tail calls to avoid stack overflows. Unfortunately, we are seeing reports of problems, so let's just revert. Reported-by: Daniel Mack <daniel@zonque.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>