path: root/tools/perf/scripts/python/call-graph-from-postgresql.py
Age | Commit message | Author | Files | Lines
2017-02-03 | xfs: reset b_first_retry_time when clear the retry status of xfs_buf_t | Hou Tao | 1 | -0/+1
After successful IO or permanent error, b_first_retry_time also needs to be cleared, else the invalid first retry time will be used by the next retry check. Signed-off-by: Hou Tao <houtao1@huawei.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-02-02 | xfs: mark speculative prealloc CoW fork extents unwritten | Darrick J. Wong | 5 | -11/+123
Christoph Hellwig pointed out that there's a potentially nasty race when performing simultaneous nearby directio cow writes:

"Thread 1 writes a range from B to C

        B --------- C
                    p

"a little later thread 2 writes from A to B

        A --------- B
                    p

[editor's note: the 'p' denote cowextsize boundaries, which I added to make this more clear]

"but the code preallocates beyond B into the range where thread
"1 has just written, but ->end_io hasn't been called yet.
"But once ->end_io is called thread 2 has already allocated
"up to the extent size hint into the write range of thread 1,
"so the end_io handler will splice the uninitialized blocks from
"that preallocation back into the file right after B."

We can avoid this race by ensuring that thread 1 cannot accidentally remap the blocks that thread 2 allocated (as part of speculative preallocation) as part of t2's write preparation in t1's end_io handler.

The way we make this happen is by taking advantage of the unwritten extent flag as an intermediate step. Recall that when we begin the process of writing data to shared blocks, we create a delayed allocation extent in the CoW fork:

    D: --RRRRRRSSSRRRRRRRR---
    C: ------DDDDDDD---------

When a thread prepares to CoW some dirty data out to disk, it will now convert the delalloc reservation into an /unwritten/ allocated extent in the cow fork. The da conversion code tries to opportunistically allocate as much of a (speculatively prealloc'd) extent as possible, so we may end up allocating a larger extent than we're actually writing out:

    D: --RRRRRRSSSRRRRRRRR---
    U: ------UUUUUUU---------

Next, we convert only the part of the extent that we're actively planning to write to normal (i.e. not unwritten) status:

    D: --RRRRRRSSSRRRRRRRR---
    U: ------UURRUUU---------

If the write succeeds, the end_cow function will now scan the relevant range of the CoW fork for real extents and remap only the real extents into the data fork:

    D: --RRRRRRRRSRRRRRRRR---
    U: ------UU--UUU---------

This ensures that we never obliterate valid data fork extents with unwritten blocks from the CoW fork.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-02-02 | xfs: allow unwritten extents in the CoW fork | Darrick J. Wong | 1 | -30/+50
In the data fork, we only allow extents to perform the following state transitions:

    delay -> real <-> unwritten

There's no way to move directly from a delalloc reservation to an /unwritten/ allocated extent. However, for the CoW fork we want to be able to do the following to each extent:

    delalloc -> unwritten -> written -> remapped to data fork

This will help us to avoid a race in the speculative CoW preallocation code between a first thread that is allocating a CoW extent and a second thread that is remapping part of a file after a write. In order to do this, however, we need two things: first, we have to be able to transition from da to unwritten, and second the function that converts between real and unwritten has to be made aware of the cow fork. Do both of those things.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-02-02 | xfs: verify free block header fields | Darrick J. Wong | 1 | -2/+49
Perform basic sanity checking of the directory free block header fields so that we avoid hanging the system on invalid data. (Granted that just means that now we shutdown on directory write, but that seems better than hanging...) Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-02-02 | xfs: check for obviously bad level values in the bmbt root | Darrick J. Wong | 1 | -1/+5
We can't handle a bmbt that's taller than BTREE_MAXLEVELS, and there's no such thing as a zero-level bmbt (for that we have extents format), so if we see this, send back an error code. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
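A minimal sketch of the kind of check described above, reconstructed from the commit text rather than taken from the patch; the field, constant, and error code names are assumptions:

    /* Reject a bmbt root whose level field is obviously bogus:
     * a zero-level bmbt should be extents format, and anything
     * taller than the maximum btree height cannot be walked.
     */
    level = be16_to_cpu(dfp->bb_level);
    if (level == 0 || level > XFS_BTREE_MAXLEVELS)
            return -EFSCORRUPTED;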
2017-02-02 | xfs: filter out obviously bad btree pointers | Darrick J. Wong | 3 | -6/+4
Don't let anybody load an obviously bad btree pointer. Since the values come from disk, we must return an error, not just ASSERT. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2017-02-02 | xfs: fail _dir_open when readahead fails | Darrick J. Wong | 3 | -7/+5
When we open a directory, we try to readahead block 0 of the directory on the assumption that we're going to need it soon. If the bmbt is corrupt, the directory will never be usable and the readahead fails immediately, so we might as well prevent the directory from being opened at all. This prevents a subsequent read or modify operation from hitting it and taking the fs offline. NOTE: We're only checking for early failures in the block mapping, not the readahead directory block itself. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-02-02 | xfs: fix toctou race when locking an inode to access the data map | Darrick J. Wong | 1 | -2/+1
We use di_format and if_flags to decide whether we're grabbing the ilock in btree mode (btree extents not loaded) or shared mode (anything else), but the state of those fields can be changed by other threads that are also trying to load the btree extents -- IFEXTENTS gets set before the _bmap_read_extents call and cleared if it fails. We don't actually need to have IFEXTENTS set until after the bmbt records are successfully loaded and validated, which will fix the race between multiple threads trying to read the same directory. The next patch strengthens directory bmbt validation by refusing to open the directory if reading the bmbt to start directory readahead fails. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-01-30 | xfs: remove unused full argument from bmap | Eric Sandeen | 3 | -7/+5
The "full" argument was used only by the fiemap formatter, which is now gone with the iomap updates. Remove the unused arg. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-01-30 | xfs: fix eofblocks race with file extending async dio writes | Brian Foster | 1 | -0/+3
It's possible for post-eof blocks to end up being used for direct I/O writes. dio write performs an upfront unwritten extent allocation, sends the dio and then updates the inode size (if necessary) on write completion. If a file release occurs while a file extending dio write is in flight, it is possible to mistake the post-eof blocks for speculative preallocation and incorrectly truncate them from the inode. This means that the resulting dio write completion can discover a hole and allocate new blocks rather than perform unwritten extent conversion.

This requires a strange mix of I/O and is thus not likely to reproduce in real world workloads. It is intermittently reproduced by generic/299. The error manifests as an assert failure due to transaction overrun because the aforementioned write completion transaction has only reserved enough blocks for btree operations:

    XFS: Assertion failed: tp->t_blk_res_used <= tp->t_blk_res, \
         file: fs/xfs//xfs_trans.c, line: 309

The root cause is that xfs_free_eofblocks() uses i_size to truncate post-eof blocks from the inode, but async, file extending direct writes do not update i_size until write completion, long after inode locks are dropped. Therefore, xfs_free_eofblocks() effectively truncates the inode to the incorrect size.

Update xfs_free_eofblocks() to serialize against dio similar to how extending writes are serialized against i_size updates before post-eof block zeroing. Specifically, wait on dio while under the iolock. This ensures that dio write completions have updated i_size before post-eof blocks are processed.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
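A hedged sketch of the serialization being described, assuming the simplified calling convention introduced by the iolock pull-up patch further down this list; illustrative only, not the literal hunk:

    /* Hold the iolock, then let in-flight dio drain so that a
     * size-extending dio completion has updated i_size before we
     * decide which blocks are really post-EOF.
     */
    xfs_ilock(ip, XFS_IOLOCK_EXCL);
    inode_dio_wait(VFS_I(ip));
    error = xfs_free_eofblocks(ip);
    xfs_iunlock(ip, XFS_IOLOCK_EXCL);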
2017-01-30 | xfs: sync eofblocks scans under iolock are livelock prone | Brian Foster | 3 | -44/+16
The xfs_eofblocks.eof_scan_owner field is an internal field to facilitate invoking eofb scans from the kernel while under the iolock. This is necessary because the eofb scan acquires the iolock of each inode. Synchronous scans are invoked on certain buffered write failures while under iolock. In such cases, the scan owner indicates that the context for the scan already owns the particular iolock and prevents a double lock deadlock. eofblocks scans while under iolock are still livelock prone in the event of multiple parallel scans, however. If multiple buffered writes to different inodes fail and invoke eofblocks scans at the same time, each scan avoids a deadlock with its own inode by virtue of the eof_scan_owner field, but will never be able to acquire the iolock of the inode from the parallel scan. Because the low free space scans are invoked with SYNC_WAIT, the scan will not return until it has processed every tagged inode and thus both scans will spin indefinitely on the iolock being held across the opposite scan. This problem can be reproduced reliably by generic/224 on systems with higher cpu counts (x16). To avoid this problem, simplify the semantics of eofblocks scans to never invoke a scan while under iolock. This means that the buffered write context must drop the iolock before the scan. It must reacquire the lock before the write retry and also repeat the initial write checks, as the original state might no longer be valid once the iolock was dropped. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-01-30 | xfs: pull up iolock from xfs_free_eofblocks() | Brian Foster | 4 | -59/+60
xfs_free_eofblocks() requires the IOLOCK_EXCL lock, but is called from different contexts where the lock may or may not be held. The need_iolock parameter exists for this reason, to indicate whether xfs_free_eofblocks() must acquire the iolock itself before it can proceed. This is ugly and confusing. Simplify the semantics of xfs_free_eofblocks() to require the caller to acquire the iolock appropriately and kill the need_iolock parameter. While here, the mp param can be removed as well as the xfs_mount is accessible from the xfs_inode structure. This patch does not change behavior. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-01-30 | xfs: remove unused struct declarations | Eric Sandeen | 4 | -4/+0
After scratching my head looking for "xfs_busy_extent" I realized it's not used; it's xfs_extent_busy, and the declaration for the other name is bogus. Remove that and a few others as well. (struct xfs_log_callback is used, but the 2nd declaration is unnecessary). Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-01-30 | iomap: constify struct iomap_ops | Christoph Hellwig | 11 | -33/+33
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-01-30 | xfs: remove boilerplate around xfs_btree_init_block | Eric Sandeen | 5 | -56/+19
Now that xfs_btree_init_block_int is able to determine crc status from the passed-in mp, we can determine the proper magic as well if we are given a btree number, rather than an explicit magic value. Change xfs_btree_init_block[_int] callers to pass in the btree number, and let xfs_btree_init_block_int use the xfs_magics array via the xfs_btree_magic macro to determine which magic value is needed. This makes all of the if (crc) / else stanzas identical, and the if/else can be removed, leading to a single, common init_block call. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-01-30 | xfs: make xfs_btree_magic more generic | Eric Sandeen | 2 | -8/+28
Right now the xfs_btree_magic() define takes only a cursor; change this to take crc and btnum args to make it more generically useful, and move to a function. This will allow xfs_btree_init_block_int callers which don't have a cursor to make use of the xfs_magics array, which will happen in the next patch. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-01-30 | xfs: glean crc status from mp not flags in xfs_btree_init_block_int | Eric Sandeen | 4 | -13/+13
xfs_btree_init_block_int() can determine whether crcs are in effect without the passed-in XFS_BTREE_CRC_BLOCKS flag; the mp argument allows us to determine this from the superblock. Remove the flag from callers, and use xfs_sb_version_hascrc(&mp->m_sb) internally instead. This removes one difference between the if & else cases in the callers. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-01-29 | Linux 4.10-rc6 | Linus Torvalds | 1 | -2/+2
2017-01-29 | drm/i915: Check for NULL i915_vma in intel_unpin_fb_obj() | Linus Torvalds | 1 | -0/+3
I've seen this trigger twice now, where the i915_gem_object_to_ggtt() call in intel_unpin_fb_obj() returns NULL, resulting in an oops immediately afterwards as the (inlined) call to i915_vma_unpin_fence() tries to dereference it.

It seems to be some race condition where the object is going away at shutdown time, since both times happened when shutting down the X server. The call chains were different:

 - VT ioctl(KDSETMODE, KD_TEXT):

    intel_cleanup_plane_fb+0x5b/0xa0 [i915]
    drm_atomic_helper_cleanup_planes+0x6f/0x90 [drm_kms_helper]
    intel_atomic_commit_tail+0x749/0xfe0 [i915]
    intel_atomic_commit+0x3cb/0x4f0 [i915]
    drm_atomic_commit+0x4b/0x50 [drm]
    restore_fbdev_mode+0x14c/0x2a0 [drm_kms_helper]
    drm_fb_helper_restore_fbdev_mode_unlocked+0x34/0x80 [drm_kms_helper]
    drm_fb_helper_set_par+0x2d/0x60 [drm_kms_helper]
    intel_fbdev_set_par+0x18/0x70 [i915]
    fb_set_var+0x236/0x460
    fbcon_blank+0x30f/0x350
    do_unblank_screen+0xd2/0x1a0
    vt_ioctl+0x507/0x12a0
    tty_ioctl+0x355/0xc30
    do_vfs_ioctl+0xa3/0x5e0
    SyS_ioctl+0x79/0x90
    entry_SYSCALL_64_fastpath+0x13/0x94

 - i915 unpin_work workqueue:

    intel_unpin_work_fn+0x58/0x140 [i915]
    process_one_work+0x1f1/0x480
    worker_thread+0x48/0x4d0
    kthread+0x101/0x140

and this patch purely papers over the issue by adding a NULL pointer check and a WARN_ON_ONCE() to avoid the oops that would then generally make the machine unresponsive. Other callers of i915_gem_object_to_ggtt() seem to also check for the returned pointer being NULL and warn about it, so this clearly has happened before in other places.

[ Reported it originally to the i915 developers on Jan 8, applying the ugly workaround on my own now after triggering the problem for the second time with no feedback. This is likely to be the same bug reported as

  https://bugs.freedesktop.org/show_bug.cgi?id=98829
  https://bugs.freedesktop.org/show_bug.cgi?id=99134

  which has a patch for the underlying problem, but it hasn't gotten to me, so I'm applying the workaround. ]

Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
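A rough approximation of the workaround the message describes; the helper names match the i915 driver of that era, but the fragment is reconstructed from the text (the view argument is simplified), not copied from the patch:

    struct i915_vma *vma;

    vma = i915_gem_object_to_ggtt(obj, NULL);
    if (WARN_ON_ONCE(!vma))     /* paper over the shutdown race */
            return;

    i915_vma_unpin_fence(vma);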
2017-01-28 | parisc: Don't use BITS_PER_LONG in userspace-exported swab.h header | Helge Deller | 3 | -5/+10
In swab.h the "#if BITS_PER_LONG > 32" breaks compiling userspace programs if BITS_PER_LONG is #defined by userspace with the sizeof() compiler builtin. Solve this problem by using __BITS_PER_LONG instead. Since we now #include asm/bitsperlong.h avoid further potential userspace pollution by moving the #define of SHIFT_PER_LONG to bitops.h which is not exported to userspace. This patch unbreaks compiling qemu on hppa/parisc. Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org>
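A small sketch of the pattern the fix moves to, shown against a generic uapi header rather than quoting the real swab.h; userspace cannot rely on the kernel-internal BITS_PER_LONG, but __BITS_PER_LONG from asm/bitsperlong.h is exported:

    #include <asm/bitsperlong.h>

    #if __BITS_PER_LONG > 32
    /* 64-bit byte-swap helpers are only meaningful here */
    #endif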
2017-01-28 | parisc, parport_gsc: Fixes for printk continuation lines | Helge Deller | 1 | -4/+4
Signed-off-by: Helge Deller <deller@gmx.de>
2017-01-27 | RDMA/cma: Fix unknown symbol when CONFIG_IPV6 is not enabled | Jack Morgenstein | 1 | -1/+2
If IPV6 has not been enabled in the underlying kernel, we must avoid calling IPV6 procedures in rdma_cm.ko. This requires using "IS_ENABLED(CONFIG_IPV6)" in "if" statements surrounding any code which calls external IPV6 procedures. In the instance fixed here, procedure cma_bind_addr() called ipv6_addr_type() -- which resulted in calling external procedure __ipv6_addr_type(). Fixes: 6c26a77124ff ("RDMA/cma: fix IPv6 address resolution") Cc: <stable@vger.kernel.org> # v4.2+ Cc: Spencer Baugh <sbaugh@catern.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
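A hedged illustration of the guard pattern the commit describes; only the IS_ENABLED(CONFIG_IPV6) idiom and the ipv6_addr_type() call are taken from the text, the surrounding fragment is invented for the example:

    if (IS_ENABLED(CONFIG_IPV6) && addr->sa_family == AF_INET6) {
            /* The ipv6_addr_type() reference is compiled out entirely
             * when CONFIG_IPV6 is not enabled, so rdma_cm.ko no longer
             * pulls in the unresolved __ipv6_addr_type symbol.
             */
            type = ipv6_addr_type(&((struct sockaddr_in6 *)addr)->sin6_addr);
    }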
2017-01-27 | ARC: [arcompact] handle unaligned access delay slot corner case | Vineet Gupta | 1 | -1/+2
After emulating an unaligned access in the delay slot of a branch, we pretend as if the delay slot never happened - so we return to the actual branch target (or the next PC if the branch was not taken).

Currently we do this by handling STATUS32.DE, but we also need to clear the BTA.T bit, which is disregarded when returning from the original misaligned exception, but could cause weirdness if it took the interrupt return path (in case an interrupt was active too).

One ARC700 customer ran into this when enabling unaligned access fixup for kernel mode accesses as well.

Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-01-27 | xfs: prevent quotacheck from overloading inode lru | Brian Foster | 1 | -1/+2
Quotacheck runs at mount time in situations where quota accounting must be recalculated. In doing so, it uses bulkstat to visit every inode in the filesystem. Historically, every inode processed during quotacheck was released and immediately tagged for reclaim because quotacheck runs before the superblock is marked active by the VFS. In other words, the final iput() led to an immediate ->destroy_inode() call, which allowed the XFS background reclaim worker to start reclaiming inodes.

Commit 17c12bcd3 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped") marks the XFS superblock active sooner as part of the mount process to support caching inodes processed during log recovery. This occurs before quotacheck and thus means all inodes processed by quotacheck are inserted into the LRU on release. The s_umount lock is held until the mount has completed and thus prevents the shrinkers from operating on the sb. This means that quotacheck can excessively populate the inode LRU and lead to OOM conditions on systems without sufficient RAM.

Update the quotacheck bulkstat handler to set XFS_IGET_DONTCACHE on inodes processed by quotacheck. This causes ->drop_inode() to return 1 and in turn causes iput_final() to evict the inode. This preserves the original quotacheck behavior and prevents it from overloading the LRU and running out of memory.

CC: stable@vger.kernel.org # v4.9
Reported-by: Martin Svec <martin.svec@zoner.cz>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
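A minimal sketch of the change described, assuming the standard xfs_iget() calling convention; this is not the literal quotacheck hunk:

    /* Tell the inode cache not to keep this inode around once the
     * bulkstat reference is dropped, so quotacheck cannot flood the LRU.
     */
    error = xfs_iget(mp, NULL, ino, XFS_IGET_DONTCACHE, 0, &ip);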
2017-01-27 | ISDN: eicon: silence misleading array-bounds warning | Arnd Bergmann | 1 | -1/+2
With some gcc versions, we get a warning about the eicon driver, and that currently shows up as the only remaining warning in one of the build bots:

    In file included from ../drivers/isdn/hardware/eicon/message.c:30:0:
    eicon/message.c: In function 'mixer_notify_update':
    eicon/platform.h:333:18: warning: array subscript is above array bounds [-Warray-bounds]

The code is easily changed to open-code the unusual PUT_WORD() line causing this to avoid the warning.

Cc: stable@vger.kernel.org
Link: http://arm-soc.lixom.net/buildlogs/stable-rc/v4.4.45/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-27 | net: phy: micrel: add support for KSZ8795 | Sean Nyekjaer | 2 | -0/+16
This adds support for the PHYs in the KSZ8795 5-port managed switch. It allows detection of the link between the switch and the SoC, and uses the same read_status functions as the KSZ8873MLL switch.

Signed-off-by: Sean Nyekjaer <sean.nyekjaer@prevas.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-27 | gtp: fix cross netns recv on gtp socket | Andreas Schultz | 1 | -6/+4
The use of the passed through netlink src_net to check for a cross netns operation was wrong. Using the GTP socket and the GTP netdevice is always correct (even if the netdev has been moved to new netns after link creation). Remove the now obsolete net field from gtp_dev. Signed-off-by: Andreas Schultz <aschultz@tpip.net> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-27 | gtp: clear DF bit on GTP packet tx | Andreas Schultz | 1 | -1/+1
3GPP TS 29.281 and 3GPP TS 29.060 imply that GTP-U packets should be sent with the DF bit cleared. For example 3GPP TS 29.060, Release 8, Section 13.2.2:

> Backbone router: Any router in the backbone may fragment the GTP
> packet if needed, according to IPv4.

Signed-off-by: Andreas Schultz <aschultz@tpip.net>
Acked-by: Harald Welte <laforge@netfilter.org>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-27 | gtp: add genl family modules alias | Andreas Schultz | 1 | -0/+1
Auto-load the module when userspace asks for the gtp netlink family. Signed-off-by: Andreas Schultz <aschultz@tpip.net> Acked-by: Harald Welte <laforge@netfilter.org> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
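This is typically done with the generic netlink module-alias macro; a sketch under the assumption that the family is registered under the name "gtp":

    MODULE_ALIAS_GENL_FAMILY("gtp");    /* modprobe on genl family lookup */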
2017-01-27 | tcp: don't annotate mark on control socket from tcp_v6_send_response() | Pablo Neira | 6 | -10/+10
Unlike ipv4, this control socket is shared by all cpus so we cannot use it as a scratchpad area to annotate the mark that we pass to ip6_xmit().

Add a new parameter to ip6_xmit() to indicate the mark. The SCTP socket family caches the flowi6 structure in the sctp_transport structure, so we cannot use it to carry the mark unless we later reset it back, which I discarded since it looks ugly to me.

Fixes: bf99b4ded5f8 ("tcp: fix mark propagation with fwmark_reflect enabled")
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-27 | arm64: skip register_cpufreq_notifier on ACPI-based systems | Prashanth Prakash | 1 | -1/+7
On ACPI based systems where the topology is set up using the API store_cpu_topology(), at the moment we do not have the necessary code to parse cpu capacity and handle the cpufreq notifier, thus resulting in a kernel panic.

Stack:
    init_cpu_capacity_callback+0xb4/0x1c8
    notifier_call_chain+0x5c/0xa0
    __blocking_notifier_call_chain+0x58/0xa0
    blocking_notifier_call_chain+0x3c/0x50
    cpufreq_set_policy+0xe4/0x328
    cpufreq_init_policy+0x80/0x100
    cpufreq_online+0x418/0x710
    cpufreq_add_dev+0x118/0x180
    subsys_interface_register+0xa4/0xf8
    cpufreq_register_driver+0x1c0/0x298
    cppc_cpufreq_init+0xdc/0x1000 [cppc_cpufreq]
    do_one_initcall+0x5c/0x168
    do_init_module+0x64/0x1e4
    load_module+0x130c/0x14d0
    SyS_finit_module+0x108/0x120
    el0_svc_naked+0x24/0x28

Fixes: 7202bde8b7ae ("arm64: parse cpu capacity-dmips-mhz from DT")
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Prashanth Prakash <pprakash@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
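A hedged sketch of the early-out the commit adds; the notifier name is inferred from the backtrace above and the rest is assumed, not quoted from the patch:

    static int __init register_cpufreq_notifier(void)
    {
            /* ACPI boot: capacity-dmips-mhz parsing from DT is not
             * available, so do not hook the cpufreq policy notifier.
             */
            if (!acpi_disabled)
                    return -EINVAL;

            return cpufreq_register_notifier(&init_cpu_capacity_notifier,
                                             CPUFREQ_POLICY_NOTIFIER);
    }
    core_initcall(register_cpufreq_notifier);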
2017-01-27 | drm/nouveau: Handle fbcon suspend/resume in seperate worker | Lyude Paul | 2 | -9/+36
Resuming from RPM can happen while already holding dev->mode_config.mutex. This means we can't actually handle fbcon in any RPM resume workers, since restoring fbcon requires grabbing dev->mode_config.mutex again. So move the fbcon suspend/resume code into its own worker, and rely on that instead to avoid deadlocking.

This fixes more deadlocks for runtime suspending the GPU on the ThinkPad W541. Reproduction recipe:

 - Get a machine with both optimus and a nvidia card with connectors attached to it
 - Wait for the nvidia GPU to suspend
 - Attempt to manually reprobe any of the connectors on the nvidia GPU using sysfs
 - *deadlock*

[airlied: use READ_ONCE to address Hans's comment]

Signed-off-by: Lyude <lyude@redhat.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Kilian Singer <kilian.singer@quantumtechnology.info>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: David Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-27 | drm/nouveau: Don't enabling polling twice on runtime resume | Lyude Paul | 2 | -2/+6
As it turns out, on cards that actually have CRTCs on them we're already calling drm_kms_helper_poll_enable(drm_dev) from nouveau_display_resume() before we call it in nouveau_pmops_runtime_resume(). This leads us to accidentally trying to enable polling twice, which results in a potential deadlock between the RPM locks and drm_dev->mode_config.mutex if we end up trying to enable polling the second time while output_poll_execute is running and holding the mode_config lock. As such, make sure we only enable polling in nouveau_pmops_runtime_resume() if we need to. This fixes hangs observed on the ThinkPad W541 Signed-off-by: Lyude <lyude@redhat.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Kilian Singer <kilian.singer@quantumtechnology.info> Cc: Lukas Wunner <lukas@wunner.de> Cc: David Airlie <airlied@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-27 | drm/ast: Fixed system hanged if disable P2A | Y.C. Chen | 3 | -79/+97
The original ast driver accesses some BMC configuration through the P2A bridge, which can be disabled on AST2300 and later. This causes the system to hang if the P2A bridge is disabled. Here is the update to fix it.

Signed-off-by: Y.C. Chen <yc_chen@aspeedtech.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-26 | Btrfs: remove ->{get, set}_acl() from btrfs_dir_ro_inode_operations | Omar Sandoval | 1 | -2/+0
Subvolume directory inodes can't have ACLs. Cc: <stable@vger.kernel.org> # 4.9.x Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2017-01-26 | Btrfs: disable xattr operations on subvolume directories | Omar Sandoval | 1 | -0/+1
When you snapshot a subvolume containing a subvolume, you get a placeholder directory where the subvolume would be. These directory inodes have ->i_ops set to btrfs_dir_ro_inode_operations. Previously, these i_ops didn't include the xattr operation callbacks. The conversion to xattr_handlers missed this case, leading to bogus attempts to set xattrs on these inodes. This manifested itself as failures when running delayed inodes. To fix this, clear IOP_XATTR in ->i_opflags on these inodes. Fixes: 6c6ef9f26e59 ("xattr: Stop calling {get,set,remove}xattr inode operations") Cc: Andreas Gruenbacher <agruenba@redhat.com> Reported-by: Chris Murphy <lists@colorremedies.com> Tested-by: Chris Murphy <lists@colorremedies.com> Cc: <stable@vger.kernel.org> # 4.9.x Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
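A sketch of the fix in the form the message describes; the surrounding i_op assignment is assumed context for the one added line:

    inode->i_op = &btrfs_dir_ro_inode_operations;
    /* Subvolume placeholder directories must reject xattrs outright. */
    inode->i_opflags &= ~IOP_XATTR;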
2017-01-26 | Btrfs: remove old tree_root case in btrfs_read_locked_inode() | Omar Sandoval | 1 | -4/+1
As Jeff explained in c2951f32d36c ("btrfs: remove old tree_root dirent processing in btrfs_real_readdir()"), supporting this old format is no longer necessary since the Btrfs magic number has been updated since we changed to the current format. There are other places where we still handle this old format, but since this is part of a fix that is going to stable, I'm only removing this one for now. Cc: <stable@vger.kernel.org> # 4.9.x Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2017-01-26 | ravb: unmap descriptors when freeing rings | Kazuya Mizuguchi | 1 | -48/+64
"swiotlb buffer is full" errors occur after repeated initialisation of a device - f.e. suspend/resume or ip link set up/down. This is because memory mapped using dma_map_single() in ravb_ring_format() and ravb_start_xmit() is not released. Resolve this problem by unmapping descriptors when freeing rings. Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com> [simon: reworked] Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-26 | pNFS: Fix a reference leak in _pnfs_return_layout | Trond Myklebust | 1 | -1/+1
If NFS_LAYOUT_RETURN_REQUESTED is not set, then we currently exit without freeing the list of invalidated layout segments, leading to a reference leak.

Reported-by: Olga Kornievskaia <aglo@umich.edu>
Fixes: 24408f5282 ("pNFS: Fix bugs in _pnfs_return_layout")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-01-26 | nfs: Fix "Don't increment lock sequence ID after NFS4ERR_MOVED" | Chuck Lever | 1 | -0/+1
Lock sequence IDs are bumped in decode_lock by calling nfs_increment_seqid(). nfs_increment_seqid() does not use the seqid_mutating_err() function fixed in commit 059aa7348241 ("Don't increment lock sequence ID after NFS4ERR_MOVED").

Fixes: 059aa7348241 ("Don't increment lock sequence ID after ...")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Xuan Qi <xuan.qi@oracle.com>
Cc: stable@vger.kernel.org # v3.7+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2017-01-26 | xfs: fix bmv_count confusion w/ shared extents | Darrick J. Wong | 1 | -10/+18
In a bmapx call, bmv_count is the total size of the array, including the zeroth element that userspace uses to supply the search key. The output array starts at offset 1 so that we can set up the user for the next invocation. Since we now can split an extent into multiple bmap records due to shared/unshared status, we have to be careful that we don't overflow the output array. In the original patch f86f403794b ("xfs: teach get_bmapx about shared extents and the CoW fork") I used cur_ext (the output index) to check for overflows, albeit with an off-by-one error. Since nexleft no longer describes the number of unfilled slots in the output, we can rip all that out and use cur_ext for the overflow check directly. Failure to do this causes heap corruption in bmapx callers such as xfs_io and xfs_scrub. xfs/328 can reproduce this problem. Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
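A sketch of the overflow guard being described, with names following the struct getbmapx convention; reconstructed from the text, not quoted from the patch:

    /* Slot 0 of the user array holds the search key, so only
     * bmv_count - 1 output records fit; stop before overrunning it.
     */
    if (cur_ext >= bmv->bmv_count - 1)
            break;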
2017-01-26 | sysctl: fix proc_doulongvec_ms_jiffies_minmax() | Eric Dumazet | 1 | -0/+1
We perform the conversion between kernel jiffies and ms only when exporting kernel value to user space. We need to do the opposite operation when value is written by user. Only matters when HZ != 1000 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
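A minimal illustration of the symmetry being restored, not the actual sysctl.c hunk; the helpers are the generic kernel conversion routines and the variable names are assumed:

    if (write)
            *valp = msecs_to_jiffies(user_val);   /* previously missing */
    else
            user_val = jiffies_to_msecs(*valp);   /* already done for reads */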
2017-01-26 | Revert "sd: remove __data_len hack for WRITE SAME" | Bart Van Assche | 1 | -1/+16
This patch reverts commit f80de881d8df and avoids that sending a WRITE SAME command to the iSCSI initiator triggers the following:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000014
    TARGET_CORE[iSCSI]: Expected Transfer Length: 260096 does not match SCSI CDB Length: 512 for SAM Opcode: 0x41
    IP: iscsi_tcp_segment_done+0x20b/0x310 [libiscsi_tcp]
    Oops: 0000 [#1] SMP
    Modules linked in: target_core_user uio target_core_iblock target_core_file iscsi_target_mod target_core_mod netconsole configfs crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper virtio_console virtio_rng virtio_balloon serio_raw i2c_piix4 acpi_cpufreq button iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ext4 jbd2 mbcache virtio_blk virtio_net psmouse floppy drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_pci
    CPU: 2 PID: 5 Comm: kworker/u8:0 Not tainted 4.10.0-rc5-debug+ #3
    Workqueue: iscsi_q_0 iscsi_xmitworker [libiscsi]
    RIP: 0010:iscsi_tcp_segment_done+0x20b/0x310 [libiscsi_tcp]
    Call Trace:
     iscsi_sw_tcp_xmit_segment+0x84/0x120 [iscsi_tcp]
     iscsi_sw_tcp_pdu_xmit+0x51/0x180 [iscsi_tcp]
     iscsi_tcp_task_xmit+0xb3/0x290 [libiscsi_tcp]
     iscsi_xmit_task+0x4e/0xc0 [libiscsi]
     iscsi_xmitworker+0x243/0x330 [libiscsi]
     process_one_work+0x1d8/0x4b0
     worker_thread+0x49/0x4a0
     kthread+0x102/0x140

Fixes: f80de881d8df ("sd: remove __data_len hack for WRITE SAME")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Jens Axboe <axboe@fb.com>
Cc: Lee Duncan <lduncan@suse.com>
Cc: Chris Leech <cleech@redhat.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-26 | nvme-fc: use blk_rq_nr_phys_segments | Christoph Hellwig | 1 | -3/+3
Without this deallocate won't work properly due to the mismatch of the bio/request size and the actual payload size. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2017-01-26 | nvmet-rdma: Fix missing dma sync to nvme data structures | Parav Pandit | 1 | -0/+17
This patch performs dma sync operations on nvme_command and nvme_completion. nvme_command is synced (a) on receiving of the recv queue completion for cpu access. (b) before posting recv wqe back to rdma adapter for device access. nvme_completion is synced (a) on receiving of the recv queue completion of associated nvme_command for cpu access. (b) before posting send wqe to rdma adapter for device access. This patch is generated for git://git.infradead.org/nvme-fabrics.git Branch: nvmf-4.10 Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
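A sketch of the sync pairs described in (a) and (b), using the standard RDMA DMA helpers; the device pointer and buffer address names are assumed for illustration:

    /* recv completion fired: make the command visible to the CPU */
    ib_dma_sync_single_for_cpu(ndev->device, cmd_dma_addr,
                               sizeof(struct nvme_command), DMA_FROM_DEVICE);

    /* ... process the command ... */

    /* hand the recv buffer back to the adapter */
    ib_dma_sync_single_for_device(ndev->device, cmd_dma_addr,
                                  sizeof(struct nvme_command), DMA_FROM_DEVICE);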
2017-01-26 | nvmet: Call fatal_error from keep-alive timout expiration | Sagi Grimberg | 1 | -1/+1
We only need to call delete_ctrl once, so given that both keep-alive timeout and any other fatal error can trigger it, just make sure we only call delete_ctrl once. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-01-26 | nvmet: cancel fatal error and flush async work before free controller | Sagi Grimberg | 1 | -0/+3
Make sure they are not running and we can free the controller safely. Signed-off-by: Roy Shterman <roys@lightbitslabs.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-01-26 | nvmet: delete controllers deletion upon subsystem release | Sagi Grimberg | 3 | -0/+12
No reason for them to be kept around if we are deleting the subsystem, so instead of passively wait for the host to disconnect, actively delete the controllers. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-01-26 | nvmet_fc: correct logic in disconnect queue LS handling | James Smart | 1 | -14/+22
Correct logic in disconnect queue LS handling. Rework so that queue searching and error reporting is above the section to send back a ls rjt Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2017-01-25 | xfs: clear _XBF_PAGES from buffers when readahead page | Darrick J. Wong | 1 | -0/+1
If we try to allocate memory pages to back an xfs_buf that we're trying to read, it's possible that we'll be so short on memory that the page allocation fails. For a blocking read we'll just wait, but for readahead we simply dump all the pages we've collected so far.

Unfortunately, after dumping the pages we neglect to clear the _XBF_PAGES state, which means that the subsequent call to xfs_buf_free thinks that b_pages still points to pages we own. It then double-frees the b_pages pages. This results in screaming about negative page refcounts from the memory manager, which xfs oughtn't be triggering.

To reproduce this case, mount a filesystem where the size of the inodes far outweighs the available memory (a ~500M inode filesystem on a VM with 300MB memory did the trick here) and run bulkstat in parallel with other memory eating processes to put a huge load on the system. The "check summary" phase of xfs_scrub also works for this purpose.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
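A sketch of the one-line nature of the fix in its readahead-failure context; the loop variables are assumed to come from the surrounding page-allocation code and are not quoted from the patch:

    if (bp->b_flags & XBF_READ_AHEAD) {
            /* Drop what we gathered, and forget we ever owned pages so
             * xfs_buf_free() cannot walk b_pages and double-free them.
             */
            for (j = 0; j < i; j++)
                    __free_page(bp->b_pages[j]);
            bp->b_flags &= ~_XBF_PAGES;
            return -ENOMEM;
    }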