linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2015-01-21	Btrfs: cleanup unused run_most	Liu Bo	1	-4/+1
	"run_most" is not used anymore. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Rename all ref_count to refs in struct	Zhao Lei	1	-13/+13
	refs is better than ref_count to record a struct's ref count. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Suggested-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: Introduce BTRFS_BLOCK_GROUP_RAID56_MASK to check raid56 simply	Zhao Lei	4	-28/+19
	So we can check raid56 with: (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) instead of long: (map->type & (BTRFS_BLOCK_GROUP_RAID5 \| BTRFS_BLOCK_GROUP_RAID6)) Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: Include map_type in raid_bio	Zhao Lei	4	-18/+22
	Corrent code use many kinds of "clever" way to determine operation target's raid type, as: raid_map != NULL or raid_map[MAX_NR] == RAID[56]_Q_STRIPE To make code easy to maintenance, this patch put raid type into bbio, and we can always get raid type from bbio with a "stupid" way. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: Simplify scrub_setup_recheck_block()'s argument	Zhao Lei	1	-16/+9
	scrub_setup_recheck_block() have many arguments but most of them can be get from one of them, we can remove them to make code clean. Some other cleanup for that function also included in this patch. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: Combine per-page recover in dev-replace and scrub	Zhao Lei	1	-72/+48
	The code are similar, combine them to make code clean and easy to maintenance. Some lost condition are also completed with benefit of this combination. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: Separate finding-right-mirror and writing-to-target's process in scrub_handle_errored_block()	Zhao Lei	1	-27/+17
	In corrent code, code of finding-right-mirror and writing-to-target are mixed in logic, if we find a right mirror but failed in writing to target, it will treat as "hadn't found right block", and fill the target with sblock_bad. Actually, "failed in writing to target" does not mean "source block is wrong", this patch separate above two condition in logic, and do some cleanup to make code clean. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: Break loop when reach BTRFS_MAX_MIRRORS in scrub_setup_recheck_block()	Zhao Lei	1	-1/+1
	Use break instead of useless loop should be more suitable in this case. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: btrfs_rm_dev_replace_blocked(): Use wait_event()	Zhao Lei	1	-11/+2
	Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: Cleanup btrfs_bio_counter_inc_blocked()	Zhao Lei	1	-6/+6
	1: Remove no-need DEFINE_WAIT(wait) 2: Add likely() for BTRFS_FS_STATE_DEV_REPLACING condition 3: Use while loop instead of goto Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: Remove noneed force_write in scrub_write_block_to_dev_replace	Zhao Lei	1	-12/+7
	It is always 1 in this place, because !1 case was already jumped out in previous code. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: Fix a jump typo of nodatasum_case to avoid wrong WARN_ON()	Zhao Lei	1	-1/+2
	if (sctx->is_dev_replace && !is_metadata && !have_csum) { ... goto nodatasum_case; } ... nodatasum_case: WARN_ON(sctx->is_dev_replace); In above code, nodatasum_case marker should be moved after WARN_ON(). Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: add ref_count and free function for btrfs_bio	Zhao Lei	7	-54/+57
	1: ref_count is simple than current RBIO_HOLD_BBIO_MAP_BIT flag to keep btrfs_bio's memory in raid56 recovery implement. 2: free function for bbio will make code clean and flexible, plus forced data type checking in compile. Changelog v1->v2: Rename following by David Sterba's suggestion: put_btrfs_bio() -> btrfs_put_bio() get_btrfs_bio() -> btrfs_get_bio() bbio->ref_count -> bbio->refs Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: Make raid_map array be inlined in btrfs_bio structure	Zhao Lei	5	-125/+105
	It can make code more simple and clear, we need not care about free bbio and raid_map together. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: sort raid_map before adding tgtdev stripes	Zhao Lei	1	-14/+8
	It can avoid complex calculation of real stripes in sort, moreover, we can clean up code of sorting tgtdev_map because it will be in order initially. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: fix a out-of-bound access of raid_map	Zhao Lei	1	-2/+5
	We add the number of stripes on target devices into bbio->num_stripes if we are under device replacement, and we just sort the raid_map of those stripes that not on the target devices, so if when we need real raid_map, we need skip the stripes on the target devices. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: fix fsync log replay for inodes with a mix of regular refs and extrefs	Filipe Manana	2	-5/+43
	If we have an inode with a large number of hard links, some of which may be extrefs, turn a regular ref into an extref, fsync the inode and then replay the fsync log (after a crash/reboot), we can endup with an fsync log that makes the replay code always fail with -EOVERFLOW when processing the inode's references. This is easy to reproduce with the test case I made for xfstests. Its steps are the following: _scratch_mkfs "-O extref" >> $seqres.full 2>&1 _init_flakey _mount_flakey # Create a test file with 3001 hard links. This number is large enough to # make btrfs start using extrefs at some point even if the fs has the maximum # possible leaf/node size (64Kb). echo "hello world" > $SCRATCH_MNT/foo for i in `seq 1 3000`; do ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_`printf "%04d" $i` done # Make sure all metadata and data are durably persisted. sync # Now remove one link, add a new one with a new name, add another new one with # the same name as the one we just removed and fsync the inode. rm -f $SCRATCH_MNT/foo_link_0001 ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_3001 ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_0001 rm -f $SCRATCH_MNT/foo_link_0002 ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_3002 ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_3003 $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo # Simulate a crash/power loss. This makes sure the next mount # will see an fsync log and will replay that log. _load_flakey_table $FLAKEY_DROP_WRITES _unmount_flakey _load_flakey_table $FLAKEY_ALLOW_WRITES _mount_flakey # Check that the number of hard links is correct, we are able to remove all # the hard links and read the file's data. This is just to verify we don't # get stale file handle errors (due to dangling directory index entries that # point to inodes that no longer exist). echo "Link count: $(stat --format=%h $SCRATCH_MNT/foo)" [ -f $SCRATCH_MNT/foo ] \|\| echo "Link foo is missing" for ((i = 1; i <= 3003; i++)); do name=foo_link_`printf "%04d" $i` if [ $i -eq 2 ]; then [ -f $SCRATCH_MNT/$name ] && echo "Link $name found" else [ -f $SCRATCH_MNT/$name ] \|\| echo "Link $name is missing" fi done rm -f $SCRATCH_MNT/foo_link_* cat $SCRATCH_MNT/foo rm -f $SCRATCH_MNT/foo status=0 exit The fix is simply to correct the overflow condition when overwriting a reference item because it was wrong, trying to increase the item in the fs/subvol tree by an impossible amount. Also ensure that we don't insert one normal ref and one ext ref for the same dentry - this happened because processing a dir index entry from the parent in the log happened when the normal ref item was full, which made the logic insert an extref and later when the normal ref had enough room, it would be inserted again when processing the ref item from the child inode in the log. This issue has been present since the introduction of the extrefs feature (2012). A test case for xfstests follows soon. This test only passes if the previous patch titled "Btrfs: fix fsync when extend references are added to an inode" is applied too. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: fix fsync when extend references are added to an inode	Filipe Manana	1	-11/+16
	If we added an extended reference to an inode and fsync'ed it, the log replay code would make our inode have an incorrect link count, which was lower then the expected/correct count. This resulted in stale directory index entries after deleting some of the hard links, and any access to the dangling directory entries resulted in -ESTALE errors because the entries pointed to inode items that don't exist anymore. This is easy to reproduce with the test case I made for xfstests, and the bulk of that test is: _scratch_mkfs "-O extref" >> $seqres.full 2>&1 _init_flakey _mount_flakey # Create a test file with 3001 hard links. This number is large enough to # make btrfs start using extrefs at some point even if the fs has the maximum # possible leaf/node size (64Kb). echo "hello world" > $SCRATCH_MNT/foo for i in `seq 1 3000`; do ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_`printf "%04d" $i` done # Make sure all metadata and data are durably persisted. sync # Add one more link to the inode that ends up being a btrfs extref and fsync # the inode. ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_3001 $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo # Simulate a crash/power loss. This makes sure the next mount # will see an fsync log and will replay that log. _load_flakey_table $FLAKEY_DROP_WRITES _unmount_flakey _load_flakey_table $FLAKEY_ALLOW_WRITES _mount_flakey # Now after the fsync log replay btrfs left our inode with a wrong link count N, # which was smaller than the correct link count M (N < M). # So after removing N hard links, the remaining M - N directory entries were # still visible to user space but it was impossible to do anything with them # because they pointed to an inode that didn't exist anymore. This resulted in # stale file handle errors (-ESTALE) when accessing those dentries for example. # # So remove all hard links except the first one and then attempt to read the # file, to verify we don't get an -ESTALE error when accessing the inodel # # The btrfs fsck tool also detected the incorrect inode link count and it # reported an error message like the following: # # root 5 inode 257 errors 2001, no inode item, link count wrong # unresolved ref dir 256 index 2978 namelen 13 name foo_link_2976 filetype 1 errors 4, no inode ref # # The fstests framework automatically calls fsck after a test is run, so we # don't need to call fsck explicitly here. rm -f $SCRATCH_MNT/foo_link_* cat $SCRATCH_MNT/foo status=0 exit So make sure an fsync always flushes the delayed inode item, so that the fsync log contains it (needed in order to trigger the link count fixup code) and fix the extref counting function, which always return -ENOENT to its caller (and made it assume there were always 0 extrefs). This issue has been present since the introduction of the extrefs feature (2012). A test case for xfstests follows soon. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: fix directory inconsistency after fsync log replay	Filipe Manana	1	-2/+18
	If we have an inode (file) with a link count greater than 1, remove one of its hard links, fsync the inode, power fail/crash and then replay the fsync log on the next mount, we end up getting the parent directory's metadata inconsistent - its i_size still reflects the deleted hard link and has dangling index entries (with no matching inode reference entries). This prevents the directory from ever being deletable, as its i_size can never decrease to BTRFS_EMPTY_DIR_SIZE even if all of its children inodes are deleted, and the dangling index entries can never be removed (as they point to an inode that does not exist anymore). This is easy to reproduce with the following excerpt from the test case for xfstests that I just made: _scratch_mkfs >> $seqres.full 2>&1 _init_flakey _mount_flakey # Create a test file with 2 hard links in the same directory. mkdir -p $SCRATCH_MNT/a/b echo "hello world" > $SCRATCH_MNT/a/b/foo ln $SCRATCH_MNT/a/b/foo $SCRATCH_MNT/a/b/bar # Make sure all metadata and data are durably persisted. sync # Now remove one of the hard links and fsync the inode. rm -f $SCRATCH_MNT/a/b/bar $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a/b/foo # Simulate a crash/power loss. This makes sure the next mount # will see an fsync log and will replay that log. _load_flakey_table $FLAKEY_DROP_WRITES _unmount_flakey _load_flakey_table $FLAKEY_ALLOW_WRITES _mount_flakey # Remove the last hard link of the file and attempt to remove its parent # directory - this failed in btrfs because the fsync log and replay code # didn't decrement the parent directory's i_size and left dangling directory # index entries - this made the btrfs rmdir implementation always fail with # the error -ENOTEMPTY. # # The dangling directory index entries were visible to user space, but it was # impossible to do anything on them (unlink, open, read, write, stat, etc) # because the inode they pointed to did not exist anymore. # # The parent directory's metadata inconsistency (stale index entries) was # also detected by btrfs' fsck tool, which is run automatically by the fstests # framework when the test finishes. The error message reported by fsck was: # # root 5 inode 259 errors 2001, no inode item, link count wrong # unresolved ref dir 258 index 3 namelen 3 name bar filetype 1 errors 4, no inode ref # rm -f $SCRATCH_MNT/a/b/* rmdir $SCRATCH_MNT/a/b rmdir $SCRATCH_MNT/a To fix this just make sure that after an unlink, if the inode is fsync'ed, he parent inode is fully logged in the fsync log. A test case for xfstests follows soon. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: lookup for block group only if needed when freeing a tree block	Filipe Manana	1	-4/+6
	Very often our extent buffer's header generation doesn't match the current transaction's id or it is also referenced by other trees (snapshots), so we don't need the corresponding block group cache object. Therefore only search for it if we are going to use it, so we avoid an unnecessary search in the block groups rbtree (and acquiring and releasing its spinlock). Freeing a tree block is performed when COWing or deleting a node/leaf, which implies we are holding the node/leaf's parent node lock, therefore reducing the amount of time spent when freeing a tree block helps reducing the amount of time we are holding the parent node's lock. For example, for a run of xfstests/generic/083, the block group cache object was needed only 682 times for a total of 226691 calls to free a tree block. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	btrfs: remove a no-op unfreeze superbock callback	David Sterba	1	-6/+0
	Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	btrfs: switch extent_state state to unsigned	David Sterba	4	-55/+55
	Currently there's a 4B hole in the structure between refs and state and there are only 16 bits used so we can make it unsigned. This will get a better packing and may save some stack space for local variables. The size of extent_state gets reduced by 8B and there are usually a lot of slab objects. struct extent_state { u64 start; /* 0 8 / u64 end; / 8 8 / struct rb_node rb_node; / 16 24 / wait_queue_head_t wq; / 40 24 / / --- cacheline 1 boundary (64 bytes) --- / atomic_t refs; / 64 4 / / XXX 4 bytes hole, try to pack / long unsigned int state; / 72 8 / u64 private; / 80 8 / / size: 88, cachelines: 2, members: 7 / / sum members: 84, holes: 1, sum holes: 4 / / last cacheline: 24 bytes */ }; Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	btrfs: set proper message level for skinny metadata	David Sterba	1	-1/+1
	This has been confusing people for too long, the message is really just informative. CC: <stable@vger.kernel.org> # 3.10+ Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	btrfs: update message levels after checksum errors	David Sterba	2	-2/+2
	The errors are worth noting and might get missed with INFO level. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	btrfs: update message levels during failed mount	David Sterba	1	-8/+8
	All error conditions from open_ctree shall be ERR. Warning would suggest that something's wrong and we can continue. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	btrfs: update message levels for errors	David Sterba	2	-5/+6
	Several messages that point to some internal problem, level INFO is wrong here. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: fix setup_leaf_for_split() to avoid leaf corruption	Filipe Manana	1	-2/+4
	We were incorrectly detecting when the target key didn't exist anymore after releasing the path and re-searching the tree. This could make us split or duplicate (btrfs_split_item() and btrfs_duplicate_item() are its only callers at the moment) an item when we should not. For the case of duplicating an item, we currently only duplicate checksum items (csum tree) and file extent items (fs/subvol trees). For the checksum items we end up overriding the item completely, but for file extent items we update only some of their fields in the copy (done in __btrfs_drop_extents), which means we can end up having a logical corruption for some values. Also for the case where we duplicate a file extent item it will make us produce a leaf with a wrong key order, as btrfs_duplicate_item() advances us to the next slot and then its caller sets a smaller key on the new item at that slot (like in __btrfs_drop_extents() e.g.). Alternatively if the tree search in setup_leaf_for_split() leaves with path->slots[0] == btrfs_header_nritems(path->nodes[0]), we end up accessing beyond the leaf's end (when we check if the item's size has changed) and make our caller insert an item at the invalid slot btrfs_header_nritems(path->nodes[0]) + 1, causing an invalid memory access if the leaf is full or nearly full. This issue has been present since the introduction of this function in 2009: Btrfs: Add btrfs_duplicate_item commit ad48fd754676bfae4139be1a897b1ea58f9aaf21 Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: track dirty block groups on their own list	Josef Bacik	5	-124/+72
	Currently any time we try to update the block groups on disk we will walk _all_ block groups and check for the ->dirty flag to see if it is set. This function can get called several times during a commit. So if you have several terabytes of data you will be a very sad panda as we will loop through _all_ of the block groups several times, which makes the commit take a while which slows down the rest of the file system operations. This patch introduces a dirty list for the block groups that we get added to when we dirty the block group for the first time. Then we simply update any block groups that have been dirtied since the last time we called btrfs_write_dirty_block_groups. This allows us to clean up how we write the free space cache out so it is much cleaner. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-21	Btrfs: change how we track dirty roots	Josef Bacik	3	-5/+21
	I've been overloading root->dirty_list to keep track of dirty roots and which roots need to have their commit roots switched at transaction commit time. This could cause us to lose an update to the root which could corrupt the file system. To fix this use a state bit to know if the root is dirty, and if it isn't set we go ahead and move the root to the dirty list. This way if we re-dirty the root after adding it to the switch_commit list we make sure to update it. This also makes it so that the extent root is always the last root on the dirty list to try and keep the amount of churn down at this point in the commit. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
2015-01-18	Linux 3.19-rc5	Linus Torvalds	1	-1/+1

2015-01-17	clk: fix possible null pointer dereference	Stanimir Varbanov	1	-1/+1
	The commit 646cafc6 (clk: Change clk_ops->determine_rate to return a clk_hw as the best parent) opens a possibility for null pointer dereference, fix this. Signed-off-by: Stanimir Varbanov <svarbanov@mm-sol.com> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Michael Turquette <mturquette@linaro.org>
2015-01-17	Revert "clk: ppc-corenet: Fix Section mismatch warning"	Kevin Hao	1	-1/+1
	This reverts commit da788acb28386aa896224e784954bb73c99ff26c. That commit tried to fix the section mismatch warning by moving the ppc_corenet_clk_driver struct to init section. This is definitely wrong because the kernel would free the memories occupied by this struct after boot while this driver is still registered in the driver core. The kernel would panic when accessing this driver struct. Cc: stable@vger.kernel.org # 3.17 Signed-off-by: Kevin Hao <haokexin@gmail.com> Acked-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Michael Turquette <mturquette@linaro.org>
2015-01-17	clk: rockchip: fix deadlock possibility in cpuclk	Heiko Stübner	1	-4/+6
	Lockdep reported a possible deadlock between the cpuclk lock and for example the i2c driver. CPU0 CPU1 ---- ---- lock(clk_lock); local_irq_disable(); lock(&(&i2c->lock)->rlock); lock(clk_lock); <Interrupt> lock(&(&i2c->lock)->rlock); * DEADLOCK * The generic clock-types of the core ccf already use spin_lock_irqsave when touching clock registers, so do the same for the cpuclk. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Doug Anderson <dianders@chromium.org> Signed-off-by: Michael Turquette <mturquette@linaro.org> [mturquette@linaro.org: removed initialization of "flags"]
2015-01-16	reset: sunxi: fix spinlock initialization	Tyler Baker	1	-0/+4
	Call spin_lock_init() before the spinlocks are used, both in early init and probe functions preventing a lockdep splat. I have been observing lockdep complaining [1] during boot on my a80 optimus [2] when CONFIG_PROVE_LOCKING has been enabled. This patch resolves the splat, and has been tested on a few other sunxi platforms without issue. [1] http://storage.kernelci.org/next/next-20150107/arm-multi_v7_defconfig+CONFIG_PROVE_LOCKING=y/lab-tbaker/boot-sun9i-a80-optimus.html [2] http://kernelci.org/boot/?a80-optimus Signed-off-by: Tyler Baker <tyler.baker@linaro.org> Cc: <stable@vger.kernel.org> Acked-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Kevin Hilman <khilman@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
2015-01-16	ARM: dts: disable CCI on exynos5420 based arndale-octa	Abhilash Kesavan	2	-1/+5
	The arndale-octa board was giving "imprecise external aborts" during boot-up with MCPM enabled. CCI enablement of the boot cluster was found to be the cause of these aborts (possibly because the secure f/w was not allowing it). Hence, disable CCI for the arndale-octa board. Signed-off-by: Abhilash Kesavan <a.kesavan@samsung.com> Tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Tested-by: Kevin Hilman <khilman@linaro.org> Tested-by: Tyler Baker <tyler.baker@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
2015-01-16	drivers: bus: check cci device tree node status	Abhilash Kesavan	1	-0/+3
	The arm-cci driver completes the probe sequence even if the cci node is marked as disabled. Add a check in the driver to honour the cci status in the device tree. Signed-off-by: Abhilash Kesavan <a.kesavan@samsung.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Nicolas Pitre <nico@linaro.org> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Kevin Hilman <khilman@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
2015-01-16	ARM: rockchip: disable jtag/sdmmc autoswitching on rk3288	Heiko Stübner	1	-0/+27
	rk3288 SoCs have a function to automatically switch between jtag/sdmmc pinmux settings depending on the card state. This collides with a lot of assumptions. It only works when using the internal card-detect mechanism and breaks horribly when using either the normal card-detect via the slot-gpio function or via any other pin. Also there is of course no link between the mmc and jtag on the software-side, so the jtag clocks may very well be disabled when the card is ejected and the soc switches back to the jtag pinmux. Leaving the switching function enabled did result in mmc timeouts and rcu stalls thus hanging the system on 3.19-rc1. Therefore disable it in all cases, as we expect the devicetree to explicitly select either mmc or jtag pinmuxes anyway. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Olof Johansson <olof@lixom.net>
2015-01-16	ARM: nomadik: fix up leftover device tree pins	Linus Walleij	1	-4/+4
	We altered the device tree bindings for the Nomadik family of pin controllers to be standard, this file was merged out-of-order so we missed fixing this. Fix it up. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
2015-01-17	kernel: avoid overflow in cmp_range	Louis Langholtz	1	-5/+5
	Avoid overflow possibility. [ The overflow is purely theoretical, since this is used for memory ranges that aren't even close to using the full 64 bits, but this is the right thing to do regardless. - Linus ] Signed-off-by: Louis Langholtz <lou_langholtz@me.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Peter Anvin <hpa@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-01-16	perf tools powerpc: Use dwfl_report_elf() instead of offline.	Sukadev Bhattiprolu	1	-8/+11
	dwfl_report_offline() works only when libraries are prelinked. Replace dwfl_report_offline() with dwfl_report_elf() so we correctly extract debug info even from libraries that are not prelinked. Reported-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: http://lkml.kernel.org/r/20150114221045.GA17703@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-01-16	perf tools: Fix segfault for symbol annotation on TUI	Namhyung Kim	1	-7/+1
	Currently the symbol structure is allocated with symbol_conf.priv_size to carry sideband information like annotation, map browser on TUI and sort-by-name tree node. So retrieving these information from symbol needs to care about the details of such placement. However the annotation code just assumes that the symbol is placed after the struct annotation. But actually there's other info between them. So accessing those struct will lead to an undefined behavior (usually a crash) after they write their info to the same location. To reproduce the problem, please follow the steps below: 1. run perf report (TUI of course) with -v option 2. open map browser (by pressing right arrow key for any entry) 3. search any function (by pressing '/' key and input whatever..) 4. return to the hist browser (by pressing 'q' or left arrow key) 5. open annotation window for the same entry (by pressing 'a' key) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1421234288-22758-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-01-16	perf test: Fix dwarf unwind using libunwind.	Wang Nan	2	-3/+61
	Perf tool fails to unwind user stack if the event raises in a shared object. This patch improves tests/dwarf-unwind.c to demonstrate the problem by utilizing commonly used glibc function "bsearch". If perf is not statically linked, the testcase will try to unwind a mixed call trace. By debugging libunwind I found that there is a bug in unwind-libunwind: it always passes 0 as segbase to libunwind, cause libunwind unable to locate debug_frame entry fir first level ip address (I add some more debugging output into libunwind to make things clear): >_Uarm_dwarf_find_debug_frame: start_ip = 10be98, end_ip = 10c2a4 >_Uarm_dwarf_find_debug_frame: found debug_frame table `/lib/libc-2.18.so': segbase=0x0, len=7, gp=0x0, table_data=0x449388 >_Uarm_dwarf_search_unwind_table: call lookup:ip = b6cd3bcc, segbase = 0, rel_ip = b6cd3bcc >lookup: e->start_ip_offset = bcf18 (rel_ip = b6cd3bcc) >lookup: e->start_ip_offset = 6d314 (rel_ip = b6cd3bcc) >lookup: e->start_ip_offset = 33d0c (rel_ip = b6cd3bcc) ... >lookup: e->start_ip_offset = 15d0c (rel_ip = b6cd3bcc) >lookup: e->start_ip_offset = 15c40 (rel_ip = b6cd3bcc) >_Uarm_dwarf_search_unwind_table: IP b6cd3bcc inside range b6c12000-b6d4c000, but no explicit unwind info found >put_rs_cache: unmasking signals/interrupts and releasing lock >_Uarm_dwarf_step: returning -10 >_Uarm_step: dwarf_step()=-10 This patch passes map->start as segbase to dwarf_find_debug_frame(), so di will be initialized correctly. In addition, dso and executable are different when setting segbase. This patch first check whether the elf is executable, and pass segbase only for shared object. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1421203007-75799-1-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-01-16	perf tools: Avoid build splat for syscall numbers with uclibc	Vineet Gupta	3	-3/+1
	This is due to duplicated unistd inclusion (via uClibc headers + kernel headers) Also seen on ARM uClibc based tools ------- ARC build ---------->8------------- CC util/evlist.o In file included from ~/arc/k.org/arch/arc/include/uapi/asm/unistd.h:25:0, from util/../perf-sys.h:10, from util/../perf.h:15, from util/event.h:7, from util/event.c:3: ~/arc/k.org/include/uapi/asm-generic/unistd.h:906:0: warning: "__NR_fcntl64" redefined [enabled by default] #define __NR_fcntl64 __NR3264_fcntl ^ In file included from ~/arc/gnu/INSTALL_1412-arc-2014.12-rc1/arc-snps-linux-uclibc/sysroot/usr/include/sys/syscall.h:24:0, from util/../perf-sys.h:6, ----------------->8------------------- ------- ARM build ---------->8------------- CC FPIC plugin_scsi.o In file included from util/../perf-sys.h:9:0, from util/../perf.h:15, from util/cache.h:7, from perf.c:12: ~/arc/k.org/arch/arm/include/uapi/asm/unistd.h:28:0: warning: "__NR_restart_syscall" redefined [enabled by default] In file included from ~/buildroot/host/usr/arm-buildroot-linux-uclibcgnueabi/sysroot/usr/include/sys/syscall.h:25:0, from util/../perf-sys.h:6, from util/../perf.h:15, from util/cache.h:7, from perf.c:12: ~/buildroot/host/usr/arm-buildroot-linux-uclibcgnueabi/sysroot/usr/include/bits/sysnum.h:17:0: note: this is the location of the previous definition ----------------->8------------------- Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: Alexey Brodkin <Alexey.Brodkin@synopsys.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1421156604-30603-4-git-send-email-vgupta@synopsys.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-01-16	perf tools: Elide strlcpy warning with uclibc	Vineet Gupta	1	-0/+2
	----------------->8------------------ CC bench/sched-pipe.o In file included from builtin-annotate.c:13:0: util/cache.h:76:15: warning: redundant redeclaration of 'strlcpy' [-Wredundant-decls] extern size_t strlcpy(char dest, const char src, size_t size); ^ In file included from util/util.h:55:0, from builtin.h:4, from builtin-annotate.c:8: ~/vineetg/arc/gnu/INSTALL_1412-arc-2014.12-rc1/arc-snps-linux-uclibc/sysroot/usr/include/string.h:396:15: note: previous declaration of 'strlcpy' was here extern size_t strlcpy(char __restrict dst, const char __restrict src, ----------------->8------------------ Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: Alexey Brodkin <Alexey.Brodkin@synopsys.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1421156604-30603-3-git-send-email-vgupta@synopsys.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-01-16	perf tools: Fix statfs.f_type data type mismatch build error with uclibc	Alexey Brodkin	2	-2/+2
	ARC Linux uses the no legacy syscalls abi and corresponding uClibc headers statfs defines f_type to be U32 which causes perf build breakage http://git.uclibc.org/uClibc/tree/libc/sysdeps/linux/common-generic/bits/statfs.h ----------->8--------------- CC fs/fs.o fs/fs.c: In function 'fs__valid_mount': fs/fs.c:82:24: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare] else if (st_fs.f_type != magic) ^ cc1: all warnings being treated as errors ----------->8--------------- Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Cody P Schafer <dev@codyps.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Link: http://lkml.kernel.org/r/1420888254-17504-2-git-send-email-vgupta@synopsys.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-01-16	tools: Remove bitops/hweight usage of bits in tools/perf	Arnaldo Carvalho de Melo	10	-42/+30
	We need to use lib/hweight.c for that, just like we do for lib/rbtree.c, so tools need to link hweight.o. For now do it directly, but we need to have a tools/lib/lk.a or .so that collects these goodies... Reported-by: Jan Beulich <JBeulich@suse.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-a1e91dx3apzqw5kbdt7ut21s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-01-16	perf machine: Fix __machine__findnew_thread() error path	Namhyung Kim	1	-1/+3
	When thread__init_map_groups() fails, a new thread should be removed from the rbtree since it's gonna be freed. Also update last match cache only if the function succeeded. Reported-by: David Ahern <dsahern@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1420763892-15535-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-01-16	perf tools: Fix building error in x86_64 when dwarf unwind is on	Namhyung Kim	3	-15/+17
	When build with 'make ARCH=x86' and dwarf unwind is on, there is a compiling error: CC /home/wn/perf/arch/x86/util/unwind-libdw.o CC /home/wn/perf/arch/x86/tests/regs_load.o arch/x86/tests/regs_load.S: Assembler messages: arch/x86/tests/regs_load.S:65: Error: operand type mismatch for `push' arch/x86/tests/regs_load.S:72: Error: operand type mismatch for `pop' make[1]: * [/home/wn/perf/arch/x86/tests/regs_load.o] Error 1 make[1]: INTERNAL: Exiting with 25 jobserver tokens available; should be 24! make: * [all] Error 2 ... Which is caused by incorrectly undefine macro HAVE_ARCH_X86_64_SUPPORT. 'config/Makefile.arch' tests __x86_64__ only when 'ARCH=x86_64'. However, when building x86_64 kernel, ARCH=x86 is valid and commonly used. Build systems, such as yocto, uses x86_64 compiler with 'ARCH=x86' to build x86_64 perf, which causes mismatching. As __LP64__ is defined for x86_64 as well, we can consolidate the __x86_64__ check to the __LP64__ check and get rid of the IS_X86_64 IMHO. (This patch is made by Namhyung Kim when replying my v1 patch: https://lkml.org/lkml/2015/1/7/17 I modified the code to remove dependency on RAW_ARCH: https://lkml.org/lkml/2015/1/7/865 Namhyung Kim didn't provide his SOB in his original email. I add mine only for my modification.) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1421029255-23039-1-git-send-email-wangnan0@huawei.com [ Namhyung provided his S-o-B on a followup to this patch thread on lkml ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-01-16	perf probe: Propagate error code when write(2) failed	Namhyung Kim	1	-1/+3
	When it failed to write probe commands to the probe_event file in debugfs, it needs to propagate the error code properly. Current code blindly uses the return value of the write(2) so it always uses -1 (-EPERM) and it might confuse users. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1420886028-15135-4-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-01-16	arm64: partially revert "ARM: 8167/1: extend the reserved memory for initrd to be page aligned"	Catalin Marinas	1	-7/+1
	This patch partially reverts commit 421520ba98290a73b35b7644e877a48f18e06004 (only the arm64 part). There is no guarantee that the boot-loader places other images like dtb in a different page than initrd start/end, especially when the kernel is built with 64KB pages. When this happens, such pages must not be freed. The free_reserved_area() already takes care of rounding up "start" and rounding down "end" to avoid freeing partially used pages. Cc: <stable@vger.kernel.org> # 3.17+ Reported-by: Peter Maydell <Peter.Maydell@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>