aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/tools/perf/scripts/python/stackcollapse.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2018-10-17tools/testing/nvdimm: Populate dirty shutdown dataDan Williams4-3/+16
Allow the unit tests to verify the retrieval of the dirty shutdown count via smart commands, and allow the driver-load-time retrieval of the smart health payload to be simulated by nfit_test. Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-17acpi, nfit: Collect shutdown statusDan Williams5-25/+118
Some NVDIMMs, in addition to providing an indication of whether the previous shutdown was clean, also provide a running count of lifetime dirty-shutdown events for the device. In anticipation of this functionality appearing on more devices arrange for the nfit driver to retrieve / cache this data at DIMM discovery time, and export it via sysfs. Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-16acpi, nfit: Introduce nfit_mem flagsDan Williams2-13/+22
In preparation for adding a flag to indicate whether a DIMM publishes a dirty-shutdown count, convert the existing flags to a bit field. Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-12libnvdimm, label: Fix sparse warningDan Williams1-1/+3
The kbuild robot reports: drivers/nvdimm/label.c:500:32: warning: restricted __le32 degrades to integer ...read 'nslot' into a local u32. Reported-by: kbuild test robot <lkp@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-12nvdimm: Use namespace index data to reduce number of label reads neededAlexander Duyck3-12/+88
This patch adds logic that is meant to make use of the namespace index data to reduce the number of reads that are needed to initialize a given namespace. The general idea is that once we have enough data to validate the namespace index we do so and then proceed to fetch only those labels that are not listed as being "free". By doing this I am seeing a total time reduction from about 4-5 seconds to 2-3 seconds for 24 NVDIMM modules each with 128K of label config area. Reviewed-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-12nvdimm: Split label init out from the logic for getting config dataAlexander Duyck5-31/+61
This patch splits the initialization of the label data into two functions. One for doing the init, and another for reading the actual configuration data. The idea behind this is that by doing this we create a symmetry between the getting and setting of config data in that we have a function for both. In addition it will make it easier for us to identify the bits that are related to init versus the pieces that are a wrapper for reading data from the ACPI interface. So for example by splitting things out like this it becomes much more obvious that we were performing checks that weren't necessarily related to the set/get operations such as relying on ndd->data being present when the set and get ops should not care about a locally cached copy of the label area. Reviewed-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-12nvdimm: Remove empty if statementAlexander Duyck1-3/+2
This patch removes an empty statement from an if expression and promotes the else statement to the if expression with the expression logic reversed. I feel this is more readable as the empty statement can lead to issues if any additional logic was ever added. Reviewed-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-12nvdimm: Clarify comment in sizeof_namespace_indexAlexander Duyck1-1/+2
When working on the label code I found it rather confusing to see several spots that reference a minimum label size of 256 while working with labels that are 128 bytes in size. This patch is meant to provide a clarification on one of the comments that was at the heart of the issue. Specifically for version 1.2 and later of the namespace specification the minimum label size is 256, prior to that the minimum label size was 128. So we should state that as such to avoid confusion. Reviewed-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-12nvdimm: Sanity check labeloffAlexander Duyck1-0/+7
This patch adds validation for the labeloff field in the indexes. Reviewed-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-10libnvdimm, dimm: Maximize label transfer sizeDan Williams1-7/+6
Use kvzalloc() to bypass the arbitrary PAGE_SIZE limit of label transfer operations. Given the expense of calling into firmware, maximize the amount of label data we transfer per call to be up to the total label space if allowed by the firmware. Instead of limiting based on PAGE_SIZE we can instead simply limit the maximum size based on either the config_size int he case of the get operation, or the length of the write based on the set operation. On a system with 24 NVDIMM modules each with a config_size of 128K and a maximum transfer size of 64K - 4, this patch reduces the init time for the label data from around 24 seconds down to between 4-5 seconds. Reviewed-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-09libnvdimm, pmem: Fix badblocks population for 'raw' namespacesDan Williams1-1/+3
The driver is only initializing bb_res in the devm_memremap_pages() paths, but the raw namespace case is passing an uninitialized bb_res to nvdimm_badblocks_populate(). Fixes: e8d513483300 ("memremap: change devm_memremap_pages interface...") Cc: <stable@vger.kernel.org> Cc: Christoph Hellwig <hch@lst.de> Reported-by: Jacek Zloch <jacek.zloch@intel.com> Reported-by: Krzysztof Rusocki <krzysztof.rusocki@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-10-01libnvdimm, namespace: Drop the repeat assignment for variable dev->parentGuangZhe Fu1-1/+0
The variable dev-parent is assigned twice with the same &nd_region->dev. I think we could drop the second one. Signed-off-by: GuangZhe Fu <fugz1@lenovo.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-09-28libnvdimm, region: Fail badblocks listing for inactive regionsDan Williams1-2/+9
While experimenting with region driver loading the following backtrace was triggered: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. [..] Call Trace: dump_stack+0x85/0xcb register_lock_class+0x571/0x580 ? __lock_acquire+0x2ba/0x1310 ? kernfs_seq_start+0x2a/0x80 __lock_acquire+0xd4/0x1310 ? dev_attr_show+0x1c/0x50 ? __lock_acquire+0x2ba/0x1310 ? kernfs_seq_start+0x2a/0x80 ? lock_acquire+0x9e/0x1a0 lock_acquire+0x9e/0x1a0 ? dev_attr_show+0x1c/0x50 badblocks_show+0x70/0x190 ? dev_attr_show+0x1c/0x50 dev_attr_show+0x1c/0x50 This results from a missing successful call to devm_init_badblocks() from nd_region_probe(). Block attempts to show badblocks while the region is not enabled. Fixes: 6a6bef90425e ("libnvdimm: add mechanism to publish badblocks...") Cc: <stable@vger.kernel.org> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-09-28libnvdimm, pfn: during init, clear errors in the metadata areaVishal Verma1-1/+60
If there are badblocks present in the 'struct page' area for pfn namespaces, until now, the only way to clear them has been to force the namespace into raw mode, clear the errors, and re-enable the fsdax mode. This is clunky, given that it should be easy enough for the pfn driver to do the same. Add a new helper that uses the most recently available badblocks list to check whether there are any badblocks that lie in the volatile struct page area. If so, before initializing the struct pages, send down targeted writes via nvdimm_write_bytes to write zeroes to the affected blocks, and thus clear errors. Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-09-26libnvdimm: Set device node in nd_device_registerAlexander Duyck1-6/+10
This change makes it so that we don't repeatedly overwrite the device node for nvdimm regions. The earliest we can set the node is immediately after calling device init, so I have moved the code there so we can avoid rewriting the node with each uevent. Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-09-26libnvdimm: Hold reference on parent while scheduling async initAlexander Duyck1-0/+4
Unlike asynchronous initialization in the core we have not yet associated the device with the parent, and as such the device doesn't hold a reference to the parent. In order to resolve that we should be holding a reference on the parent until the asynchronous initialization has completed. Cc: <stable@vger.kernel.org> Fixes: 4d88a97aa9e8 ("libnvdimm: ...base ... infrastructure") Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-09-26libnvdimm: remove duplicate includePankaj Gupta1-1/+0
Removed duplicate include. Signed-off-by: Pankaj Gupta <pagupta@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-09-16Linux 4.19-rc4Linus Torvalds1-1/+1
2018-09-16Code of Conduct: Let's revamp it.Greg Kroah-Hartman3-29/+82
The Code of Conflict is not achieving its implicit goal of fostering civility and the spirit of 'be excellent to each other'. Explicit guidelines have demonstrated success in other projects and other areas of the kernel. Here is a Code of Conduct statement for the wider kernel. It is based on the Contributor Covenant as described at www.contributor-covenant.org From this point forward, we should abide by these rules in order to help make the kernel community a welcoming environment to participate in. Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Olof Johansson <olof@lxom.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-09-15x86/APM: Fix build warning when PROC_FS is not enabledRandy Dunlap1-0/+2
Fix build warning in apm_32.c when CONFIG_PROC_FS is not enabled: ../arch/x86/kernel/apm_32.c:1643:12: warning: 'proc_apm_show' defined but not used [-Wunused-function] static int proc_apm_show(struct seq_file *m, void *v) Fixes: 3f3942aca6da ("proc: introduce proc_create_single{,_data}") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Jiri Kosina <jikos@kernel.org> Link: https://lkml.kernel.org/r/be39ac12-44c2-4715-247f-4dcc3c525b8b@infradead.org
2018-09-14NFS: Don't open code clearing of delegation stateTrond Myklebust1-9/+12
Add a helper for the case when the nfs4 open state has been set to use a delegation stateid, and we want to revert to using the open stateid. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-09-14NFSv4.1 fix infinite loop on I/O.Trond Myklebust2-3/+9
The previous fix broke recovery of delegated stateids because it assumes that if we did not mark the delegation as suspect, then the delegation has effectively been revoked, and so it removes that delegation irrespectively of whether or not it is valid and still in use. While this is "mostly harmless" for ordinary I/O, we've seen pNFS fail with LAYOUTGET spinning in an infinite loop while complaining that we're using an invalid stateid (in this case the all-zero stateid). What we rather want to do here is ensure that the delegation is always correctly marked as needing testing when that is the case. So we want to close the loophole offered by nfs4_schedule_stateid_recovery(), which marks the state as needing to be reclaimed, but not the delegation that may be backing it. Fixes: 0e3d3e5df07dc ("NFSv4.1 fix infinite loop on IO BAD_STATEID error") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-09-14NFSv4: Fix a tracepoint Oops in initiate_file_draining()Trond Myklebust1-1/+1
Now that the value of 'ino' can be NULL or an ERR_PTR(), we need to change the test in the tracepoint. Fixes: ce5624f7e6675 ("NFSv4: Return NFS4ERR_DELAY when a layout fails...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.17+ Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-09-14pNFS: Ensure we return the error if someone kills a waiting layoutgetTrond Myklebust1-10/+16
If someone interrupts a wait on one or more outstanding layoutgets in pnfs_update_layout() then return the ERESTARTSYS/EINTR error. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-09-14NFSv4: Fix a tracepoint Oops in initiate_file_draining()Trond Myklebust1-1/+1
Now that the value of 'ino' can be NULL or an ERR_PTR(), we need to change the test in the tracepoint. Fixes: ce5624f7e6675 ("NFSv4: Return NFS4ERR_DELAY when a layout fails...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.17+ Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-09-14Revert "x86/mm/legacy: Populate the user page-table with user pgd's"Joerg Roedel2-10/+1
This reverts commit 1f40a46cf47c12d93a5ad9dccd82bd36ff8f956a. It turned out that this patch is not sufficient to enable PTI on 32 bit systems with legacy 2-level page-tables. In this paging mode the huge-page PTEs are in the top-level page-table directory, where also the mirroring to the user-space page-table happens. So every huge PTE exits twice, in the kernel and in the user page-table. That means that accessed/dirty bits need to be fetched from two PTEs in this mode to be safe, but this is not trivial to implement because it needs changes to generic code just for the sake of enabling PTI with 32-bit legacy paging. As all systems that need PTI should support PAE anyway, remove support for PTI when 32-bit legacy paging is used. Fixes: 7757d607c6b3 ('x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32') Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: hpa@zytor.com Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Link: https://lkml.kernel.org/r/1536922754-31379-1-git-send-email-joro@8bytes.org
2018-09-14xen/gntdev: fix up blockable calls to mn_invl_range_startMichal Hocko1-11/+15
Patch series "mmu_notifiers follow ups". Tetsuo has noticed some fallouts from 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers"). One of them has been fixed and picked up by AMD/DRM maintainer [1]. XEN issue is fixed by patch 1. I have also clarified expectations about blockable semantic of invalidate_range_end. Finally the last patch removes MMU_INVALIDATE_DOES_NOT_BLOCK which is no longer used nor needed. [1] http://lkml.kernel.org/r/20180824135257.GU29735@dhcp22.suse.cz This patch (of 3): 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers") has introduced blockable parameter to all mmu_notifiers and the notifier has to back off when called in !blockable case and it could block down the road. The above commit implemented that for mn_invl_range_start but both in_range checks are done unconditionally regardless of the blockable mode and as such they would fail all the time for regular calls. Fix this by checking blockable parameter as well. Once we are there we can remove the stale TODO. The lock has to be sleepable because we wait for completion down in gnttab_unmap_refs_sync. Link: http://lkml.kernel.org/r/20180827112623.8992-2-mhocko@kernel.org Fixes: 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers") Signed-off-by: Michal Hocko <mhocko@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: David Rientjes <rientjes@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-09-14xen: fix GCC warning and remove duplicate EVTCHN_ROW/EVTCHN_COL usageJosh Abraham1-1/+1
This patch removes duplicate macro useage in events_base.c. It also fixes gcc warning: variable ‘col’ set but not used [-Wunused-but-set-variable] Signed-off-by: Joshua Abraham <j.abraham1776@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-09-14xen: avoid crash in disable_hotplug_cpuOlaf Hering1-7/+8
The command 'xl vcpu-set 0 0', issued in dom0, will crash dom0: BUG: unable to handle kernel NULL pointer dereference at 00000000000002d8 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 7 PID: 65 Comm: xenwatch Not tainted 4.19.0-rc2-1.ga9462db-default #1 openSUSE Tumbleweed (unreleased) Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010 RIP: e030:device_offline+0x9/0xb0 Code: 77 24 00 e9 ce fe ff ff 48 8b 13 e9 68 ff ff ff 48 8b 13 e9 29 ff ff ff 48 8b 13 e9 ea fe ff ff 90 66 66 66 66 90 41 54 55 53 <f6> 87 d8 02 00 00 01 0f 85 88 00 00 00 48 c7 c2 20 09 60 81 31 f6 RSP: e02b:ffffc90040f27e80 EFLAGS: 00010203 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff8801f3800000 RSI: ffffc90040f27e70 RDI: 0000000000000000 RBP: 0000000000000000 R08: ffffffff820e47b3 R09: 0000000000000000 R10: 0000000000007ff0 R11: 0000000000000000 R12: ffffffff822e6d30 R13: dead000000000200 R14: dead000000000100 R15: ffffffff8158b4e0 FS: 00007ffa595158c0(0000) GS:ffff8801f39c0000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000002d8 CR3: 00000001d9602000 CR4: 0000000000002660 Call Trace: handle_vcpu_hotplug_event+0xb5/0xc0 xenwatch_thread+0x80/0x140 ? wait_woken+0x80/0x80 kthread+0x112/0x130 ? kthread_create_worker_on_cpu+0x40/0x40 ret_from_fork+0x3a/0x50 This happens because handle_vcpu_hotplug_event is called twice. In the first iteration cpu_present is still true, in the second iteration cpu_present is false which causes get_cpu_device to return NULL. In case of cpu#0, cpu_online is apparently always true. Fix this crash by checking if the cpu can be hotplugged, which is false for a cpu that was just removed. Also check if the cpu was actually offlined by device_remove, otherwise leave the cpu_present state as it is. Rearrange to code to do all work with device_hotplug_lock held. Signed-off-by: Olaf Hering <olaf@aepfle.de> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-09-14xen/balloon: add runtime control for scrubbing ballooned out pagesMarek Marczykowski-Górecki6-6/+33
Scrubbing pages on initial balloon down can take some time, especially in nested virtualization case (nested EPT is slow). When HVM/PVH guest is started with memory= significantly lower than maxmem=, all the extra pages will be scrubbed before returning to Xen. But since most of them weren't used at all at that point, Xen needs to populate them first (from populate-on-demand pool). In nested virt case (Xen inside KVM) this slows down the guest boot by 15-30s with just 1.5GB needed to be returned to Xen. Add runtime parameter to enable/disable it, to allow initially disabling scrubbing, then enable it back during boot (for example in initramfs). Such usage relies on assumption that a) most pages ballooned out during initial boot weren't used at all, and b) even if they were, very few secrets are in the guest at that time (before any serious userspace kicks in). Convert CONFIG_XEN_SCRUB_PAGES to CONFIG_XEN_SCRUB_PAGES_DEFAULT (also enabled by default), controlling default value for the new runtime switch. Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-09-14xen/manage: don't complain about an empty value in control/sysrq nodeVitaly Kuznetsov1-2/+4
When guest receives a sysrq request from the host it acknowledges it by writing '\0' to control/sysrq xenstore node. This, however, make xenstore watch fire again but xenbus_scanf() fails to parse empty value with "%c" format string: sysrq: SysRq : Emergency Sync Emergency Sync complete xen:manage: Error -34 reading sysrq code in control/sysrq Ignore -ERANGE the same way we already ignore -ENOENT, empty value in control/sysrq is totally legal. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2018-09-14asm-generic: io: Fix ioport_map() for !CONFIG_GENERIC_IOMAP && CONFIG_INDIRECT_PIOAndrew Murray1-1/+2
The !CONFIG_GENERIC_IOMAP version of ioport_map uses MMIO_UPPER_LIMIT to prevent users from making I/O accesses outside the expected I/O range - however it erroneously treats MMIO_UPPER_LIMIT as a mask which is contradictory to its other users. The introduction of CONFIG_INDIRECT_PIO, which subtracts an arbitrary amount from IO_SPACE_LIMIT to form MMIO_UPPER_LIMIT, results in ioport_map mangling the given port rather than capping it. We address this by aligning more closely with the CONFIG_GENERIC_IOMAP implementation of ioport_map by using the comparison operator and returning NULL where the port exceeds MMIO_UPPER_LIMIT. Though note that we preserve the existing behavior of masking with IO_SPACE_LIMIT such that we don't break existing buggy drivers that somehow rely on this masking. Fixes: 5745392e0c2b ("PCI: Apply the new generic I/O management on PCI IO hosts") Reported-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Murray <andrew.murray@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-09-13mm: get rid of vmacache_flush_all() entirelyLinus Torvalds6-48/+4
Jann Horn points out that the vmacache_flush_all() function is not only potentially expensive, it's buggy too. It also happens to be entirely unnecessary, because the sequence number overflow case can be avoided by simply making the sequence number be 64-bit. That doesn't even grow the data structures in question, because the other adjacent fields are already 64-bit. So simplify the whole thing by just making the sequence number overflow case go away entirely, which gets rid of all the complications and makes the code faster too. Win-win. [ Oleg Nesterov points out that the VMACACHE_FULL_FLUSHES statistics also just goes away entirely with this ] Reported-by: Jann Horn <jannh@google.com> Suggested-by: Will Deacon <will.deacon@arm.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Cc: Oleg Nesterov <oleg@redhat.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-09-13MAINTAINERS: Make Dennis the percpu tree maintainerTejun Heo1-2/+2
Dennis rewrote a significant portion of the percpu allocator and has shown that he can respond in a timely and helpful manner when issues are reported against percpu allocator. Let's make Dennis the percpu tree maintainer. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Dennis Zhou <dennis@kernel.org> Cc: Christoph Lameter <cl@linux.com>
2018-09-13pstore: Fix incorrect persistent ram buffer mappingBin Yang1-3/+14
persistent_ram_vmap() returns the page start vaddr. persistent_ram_iomap() supports non-page-aligned mapping. persistent_ram_buffer_map() always adds offset-in-page to the vaddr returned from these two functions, which causes incorrect mapping of non-page-aligned persistent ram buffer. By default ftrace_size is 4096 and max_ftrace_cnt is nr_cpu_ids. Without this patch, the zone_sz in ramoops_init_przs() is 4096/nr_cpu_ids which might not be page aligned. If the offset-in-page > 2048, the vaddr will be in next page. If the next page is not mapped, it will cause kernel panic: [ 0.074231] BUG: unable to handle kernel paging request at ffffa19e0081b000 ... [ 0.075000] RIP: 0010:persistent_ram_new+0x1f8/0x39f ... [ 0.075000] Call Trace: [ 0.075000] ramoops_init_przs.part.10.constprop.15+0x105/0x260 [ 0.075000] ramoops_probe+0x232/0x3a0 [ 0.075000] platform_drv_probe+0x3e/0xa0 [ 0.075000] driver_probe_device+0x2cd/0x400 [ 0.075000] __driver_attach+0xe4/0x110 [ 0.075000] ? driver_probe_device+0x400/0x400 [ 0.075000] bus_for_each_dev+0x70/0xa0 [ 0.075000] driver_attach+0x1e/0x20 [ 0.075000] bus_add_driver+0x159/0x230 [ 0.075000] ? do_early_param+0x95/0x95 [ 0.075000] driver_register+0x70/0xc0 [ 0.075000] ? init_pstore_fs+0x4d/0x4d [ 0.075000] __platform_driver_register+0x36/0x40 [ 0.075000] ramoops_init+0x12f/0x131 [ 0.075000] do_one_initcall+0x4d/0x12c [ 0.075000] ? do_early_param+0x95/0x95 [ 0.075000] kernel_init_freeable+0x19b/0x222 [ 0.075000] ? rest_init+0xbb/0xbb [ 0.075000] kernel_init+0xe/0xfc [ 0.075000] ret_from_fork+0x3a/0x50 Signed-off-by: Bin Yang <bin.yang@intel.com> [kees: add comments describing the mapping differences, updated commit log] Fixes: 24c3d2f342ed ("staging: android: persistent_ram: Make it possible to use memory outside of bootmem") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2018-09-13drm/nouveau/devinit: fix warning when PMU/PRE_OS is missingBen Skeggs1-10/+11
Messed up when sending pull request and sent an outdated version of previous patch, this fixes it up to remove warnings. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-09-12null_blk: fix zoned support for non-rq based operationJens Axboe3-34/+62
The supported added for zones in null_blk seem to assume that only rq based operation is possible. But this depends on the queue_mode setting, if this is set to 0, then cmd->bio is what we need to be operating on. Right now any attempt to load null_blk with queue_mode=0 will insta-crash, since cmd->rq is NULL and null_handle_cmd() assumes it to always be set. Make the zoned code deal with bio's instead, or pass in the appropriate sector/nr_sectors instead. Fixes: ca4b2a011948 ("null_blk: add zone support") Tested-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-12cifs: read overflow in is_valid_oplock_break()Dan Carpenter1-0/+8
We need to verify that the "data_offset" is within bounds. Reported-by: Dr Silvio Cesare of InfoSect <silvio.cesare@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com>
2018-09-12nfp: flower: reject tunnel encap with ipv6 outer headers for offloadingLouis Peens1-0/+6
This fixes a bug where ipv6 tunnels would report that it is getting offloaded to hardware but would actually be rejected by hardware. Fixes: b27d6a95a70d ("nfp: compile flower vxlan tunnel set actions") Signed-off-by: Louis Peens <louis.peens@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-12nfp: flower: fix vlan match by checking both vlan id and vlan pcpPieter Jansen van Vuuren3-1/+13
Previously we only checked if the vlan id field is present when trying to match a vlan tag. The vlan id and vlan pcp field should be treated independently. Fixes: 5571e8c9f241 ("nfp: extend flower matching capabilities") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-12tipc: check return value of __tipc_dump_start()Cong Wang1-1/+4
When __tipc_dump_start() fails with running out of memory, we have no reason to continue, especially we should avoid calling tipc_dump_done(). Fixes: 8f5c5fcf3533 ("tipc: call start and done ops directly in __tipc_nl_compat_dumpit()") Reported-and-tested-by: syzbot+3f8324abccfbf8c74a9f@syzkaller.appspotmail.com Cc: Jon Maloy <jon.maloy@ericsson.com> Cc: Ying Xue <ying.xue@windriver.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-12s390/qeth: don't dump past end of unknown HW headerJulian Wiedmann2-2/+2
For inbound data with an unsupported HW header format, only dump the actual HW header. We have no idea how much payload follows it, and what it contains. Worst case, we dump past the end of the Inbound Buffer and access whatever is located next in memory. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-12s390/qeth: use vzalloc for QUERY OAT bufferWenjia Zhang1-2/+3
qeth_query_oat_command() currently allocates the kernel buffer for the SIOC_QETH_QUERY_OAT ioctl with kzalloc. So on systems with fragmented memory, large allocations may fail (eg. the qethqoat tool by default uses 132KB). Solve this issue by using vzalloc, backing the allocation with non-contiguous memory. Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-12s390/qeth: switch on SG by default for IQD devicesJulian Wiedmann1-0/+2
Scatter-gather transmit brings a nice performance boost. Considering the rather large MTU sizes at play, it's also totally the Right Thing To Do. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-12s390/qeth: indicate error when netdev allocation failsJulian Wiedmann1-1/+3
Bailing out on allocation error is nice, but we also need to tell the ccwgroup core that creating the qeth groupdev failed. Fixes: d3d1b205e89f ("s390/qeth: allocate netdevice early") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-12x86/efi: Load fixmap GDT in efi_call_phys_epilog() before setting %cr3Guenter Roeck1-2/+1
Commit eeb89e2bb1ac ("x86/efi: Load fixmap GDT in efi_call_phys_epilog()") moved loading the fixmap in efi_call_phys_epilog() after load_cr3() since it was assumed to be more logical. Turns out this is incorrect: In efi_call_phys_prolog(), the gdt with its physical address is loaded first, and when the %cr3 is reloaded in _epilog from initial_page_table to swapper_pg_dir again the gdt is no longer mapped. This results in a triple fault if an interrupt occurs after load_cr3() and before load_fixmap_gdt(0). Calling load_fixmap_gdt(0) first restores the execution order prior to commit eeb89e2bb1ac and fixes the problem. Fixes: eeb89e2bb1ac ("x86/efi: Load fixmap GDT in efi_call_phys_epilog()") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: linux-efi@vger.kernel.org Cc: Andy Lutomirski <luto@amacapital.net> Cc: Joerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/1536689892-21538-1-git-send-email-linux@roeck-us.net
2018-09-12x86/xen: Disable CPU0 hotplug for Xen PVJuergen Gross1-1/+3
Xen PV guests don't allow CPU0 hotplug, so disable it. Signed-off-by: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boris.ostrovsky@oracle.com Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20180912174122.24282-1-jgross@suse.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-09-12tracing/Makefile: Fix handling redefinition of CC_FLAGS_FTRACEPaulo Zanoni1-3/+5
As a Kernel developer, I make heavy use of "make targz-pkg" in order to locally compile and remotely install my development Kernels. The nice feature I rely on is that after a normal "make", "make targz-pkg" only generates the tarball without having to recompile everything. That was true until commit f28bc3c32c05 ("tracing: Handle CC_FLAGS_FTRACE more accurately"). After it, running "make targz-pkg" after "make" will recompile the whole Kernel tree, making my development workflow much slower. The Kernel is choosing to recompile everything because it claims the command line has changed. A diff of the .cmd files show a repeated -mfentry in one of the files. That is because "make targz-pkg" calls "make modules_install" and the environment is already populated with the exported variables, CC_FLAGS_FTRACE being one of them. Then, -mfentry gets duplicated because it is not protected behind an ifndef block, like -pg. To complicate the problem a little bit more, architectures can define their own version CC_FLAGS_FTRACE, so our code not only has to consider recursive Makefiles, but also architecture overrides. So in this patch we move CC_FLAGS_FTRACE up and unconditionally define it to -pg. Then we let the architecture Makefiles possibly override it, and finally append the extra options later. This ensures the variable is always fully redefined at each invocation so recursive Makefiles don't keep appending, and hopefully it maintains the intended behavior on how architectures can override the defaults.. Thanks Steven Rostedt and Vasily Gorbik for the help on this regression. Cc: Michal Marek <michal.lkml@markovi.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: linux-kbuild@vger.kernel.org Fixes: commit f28bc3c32c05 ("tracing: Handle CC_FLAGS_FTRACE more accurately") Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-09-12cifs: integer overflow in in SMB2_ioctl()Dan Carpenter1-2/+2
The "le32_to_cpu(rsp->OutputOffset) + *plen" addition can overflow and wrap around to a smaller value which looks like it would lead to an information leak. Fixes: 4a72dafa19ba ("SMB2 FSCTL and IOCTL worker function") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> CC: Stable <stable@vger.kernel.org>
2018-09-12CIFS: fix wrapping bugs in num_entries()Dan Carpenter1-10/+15
The problem is that "entryptr + next_offset" and "entryptr + len + size" can wrap. I ended up changing the type of "entryptr" because it makes the math easier when we don't have to do so much casting. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> CC: Stable <stable@vger.kernel.org>