aboutsummaryrefslogtreecommitdiffstats
path: root/fs (follow)
AgeCommit message (Collapse)AuthorFilesLines
2016-12-14radix-tree: improve multiorder iteratorsMatthew Wilcox1-1/+1
This fixes several interlinked problems with the iterators in the presence of multiorder entries. 1. radix_tree_iter_next() would only advance by one slot, which would result in the iterators returning the same entry more than once if there were sibling entries. 2. radix_tree_next_slot() could return an internal pointer instead of a user pointer if a tagged multiorder entry was immediately followed by an entry of lower order. 3. radix_tree_next_slot() expanded to a lot more code than it used to when multiorder support was compiled in. And I wasn't comfortable with entry_to_node() being in a header file. Fixing radix_tree_iter_next() for the presence of sibling entries necessarily involves examining the contents of the radix tree, so we now need to pass 'slot' to radix_tree_iter_next(), and we need to change the calling convention so it is called *before* dropping the lock which protects the tree. Also rename it to radix_tree_iter_resume(), as some people thought it was necessary to call radix_tree_iter_next() each time around the loop. radix_tree_next_slot() becomes closer to how it looked before multiorder support was introduced. It only checks to see if the next entry in the chunk is a sibling entry or a pointer to a node; this should be rare enough that handling this case out of line is not a performance impact (and such impact is amortised by the fact that the entry we just processed was a multiorder entry). Also, radix_tree_next_slot() used to force a new chunk lookup for untagged entries, which is more expensive than the out of line sibling entry skipping. Link: http://lkml.kernel.org/r/1480369871-5271-55-git-send-email-mawilcox@linuxonhyperv.com Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14btrfs: fix race in btrfs_free_dummy_fs_info()Matthew Wilcox1-0/+1
We drop the lock which protects the radix tree, so we must call radix_tree_iter_next() in order to avoid a modification to the tree invalidating the iterator state. Link: http://lkml.kernel.org/r/1480369871-5271-54-git-send-email-mawilcox@linuxonhyperv.com Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14dax: clear dirty entry tags on cache flushJan Kara1-0/+64
Currently we never clear dirty tags in DAX mappings and thus address ranges to flush accumulate. Now that we have locking of radix tree entries, we have all the locking necessary to reliably clear the radix tree dirty tag when flushing caches for corresponding address range. Similarly to page_mkclean() we also have to write-protect pages to get a page fault when the page is next written to so that we can mark the entry dirty again. Link: http://lkml.kernel.org/r/1479460644-25076-21-git-send-email-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14dax: protect PTE modification on WP fault by radix tree entry lockJan Kara1-6/+16
Currently PTE gets updated in wp_pfn_shared() after dax_pfn_mkwrite() has released corresponding radix tree entry lock. When we want to writeprotect PTE on cache flush, we need PTE modification to happen under radix tree entry lock to ensure consistent updates of PTE and radix tree (standard faults use page lock to ensure this consistency). So move update of PTE bit into dax_pfn_mkwrite(). Link: http://lkml.kernel.org/r/1479460644-25076-20-git-send-email-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14dax: make cache flushing protected by entry lockJan Kara1-22/+39
Currently, flushing of caches for DAX mappings was ignoring entry lock. So far this was ok (modulo a bug that a difference in entry lock could cause cache flushing to be mistakenly skipped) but in the following patches we will write-protect PTEs on cache flushing and clear dirty tags. For that we will need more exclusion. So do cache flushing under an entry lock. This allows us to remove one lock-unlock pair of mapping->tree_lock as a bonus. Link: http://lkml.kernel.org/r/1479460644-25076-19-git-send-email-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14mm: move handling of COW faults into DAX codeJan Kara1-32/+29
Move final handling of COW faults from generic code into DAX fault handler. That way generic code doesn't have to be aware of peculiarities of DAX locking so remove that knowledge and make locking functions private to fs/dax.c. Link: http://lkml.kernel.org/r/1479460644-25076-11-git-send-email-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14mm: use vmf->address instead of of vmf->virtual_addressJan Kara1-2/+2
Every single user of vmf->virtual_address typed that entry to unsigned long before doing anything with it so the type of virtual_address does not really provide us any additional safety. Just use masked vmf->address which already has the appropriate type. Link: http://lkml.kernel.org/r/1479460644-25076-3-git-send-email-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14mm: join struct fault_env and vm_faultJan Kara1-10/+12
Currently we have two different structures for passing fault information around - struct vm_fault and struct fault_env. DAX will need more information in struct vm_fault to handle its faults so the content of that structure would become event closer to fault_env. Furthermore it would need to generate struct fault_env to be able to call some of the generic functions. So at this point I don't think there's much use in keeping these two structures separate. Just embed into struct vm_fault all that is needed to use it for both purposes. Link: http://lkml.kernel.org/r/1479460644-25076-2-git-send-email-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14mm: add locked parameter to get_user_pages_remote()Lorenzo Stoakes1-1/+1
Patch series "mm: unexport __get_user_pages_unlocked()". This patch series continues the cleanup of get_user_pages*() functions taking advantage of the fact we can now pass gup_flags as we please. It firstly adds an additional 'locked' parameter to get_user_pages_remote() to allow for its callers to utilise VM_FAULT_RETRY functionality. This is necessary as the invocation of __get_user_pages_unlocked() in process_vm_rw_single_vec() makes use of this and no other existing higher level function would allow it to do so. Secondly existing callers of __get_user_pages_unlocked() are replaced with the appropriate higher-level replacement - get_user_pages_unlocked() if the current task and memory descriptor are referenced, or get_user_pages_remote() if other task/memory descriptors are referenced (having acquiring mmap_sem.) This patch (of 2): Add a int *locked parameter to get_user_pages_remote() to allow VM_FAULT_RETRY faulting behaviour similar to get_user_pages_[un]locked(). Taking into account the previous adjustments to get_user_pages*() functions allowing for the passing of gup_flags, we are now in a position where __get_user_pages_unlocked() need only be exported for his ability to allow VM_FAULT_RETRY behaviour, this adjustment allows us to subsequently unexport __get_user_pages_unlocked() as well as allowing for future flexibility in the use of get_user_pages_remote(). [sfr@canb.auug.org.au: merge fix for get_user_pages_remote API change] Link: http://lkml.kernel.org/r/20161122210511.024ec341@canb.auug.org.au Link: http://lkml.kernel.org/r/20161027095141.2569-2-lstoakes@gmail.com Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Jan Kara <jack@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14btrfs: better handle btrfs_printk() defaultsPetr Mladek1-9/+3
Commit 262c5e86fec7 ("printk/btrfs: handle more message headers") triggers: warning: `ratelimit' may be used uninitialized in this function with gcc (4.1.2) and probably many other versions. The code actually is correct but a bit twisted. Let's make it more straightforward and set the default values at the beginning. Link: http://lkml.kernel.org/r/20161213135246.GQ3506@pathway.suse.cz Signed-off-by: Petr Mladek <pmladek@suse.com> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: David Sterba <dsterba@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4Linus Torvalds33-1166/+1336
Pull ext4 updates from Ted Ts'o: "This merge request includes the dax-4.0-iomap-pmd branch which is needed for both ext4 and xfs dax changes to use iomap for DAX. It also includes the fscrypt branch which is needed for ubifs encryption work as well as ext4 encryption and fscrypt cleanups. Lots of cleanups and bug fixes, especially making sure ext4 is robust against maliciously corrupted file systems --- especially maliciously corrupted xattr blocks and a maliciously corrupted superblock. Also fix ext4 support for 64k block sizes so it works well on ppcle. Fixed mbcache so we don't miss some common xattr blocks that can be merged" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (86 commits) dax: Fix sleep in atomic contex in grab_mapping_entry() fscrypt: Rename FS_WRITE_PATH_FL to FS_CTX_HAS_BOUNCE_BUFFER_FL fscrypt: Delay bounce page pool allocation until needed fscrypt: Cleanup page locking requirements for fscrypt_{decrypt,encrypt}_page() fscrypt: Cleanup fscrypt_{decrypt,encrypt}_page() fscrypt: Never allocate fscrypt_ctx on in-place encryption fscrypt: Use correct index in decrypt path. fscrypt: move the policy flags and encryption mode definitions to uapi header fscrypt: move non-public structures and constants to fscrypt_private.h fscrypt: unexport fscrypt_initialize() fscrypt: rename get_crypt_info() to fscrypt_get_crypt_info() fscrypto: move ioctl processing more fully into common code fscrypto: remove unneeded Kconfig dependencies MAINTAINERS: fscrypto: recommend linux-fsdevel for fscrypto patches ext4: do not perform data journaling when data is encrypted ext4: return -ENOMEM instead of success ext4: reject inodes with negative size ext4: remove another test in ext4_alloc_file_blocks() Documentation: fix description of ext4's block_validity mount option ext4: fix checks for data=ordered and journal_async_commit options ...
2016-12-14Merge tag 'for-f2fs-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fsLinus Torvalds20-475/+1018
Pull f2fs updates from Jaegeuk Kim: "This patch series contains several performance tuning patches regarding to the IO submission flow, in addition to supporting new features such as a ZBC-base drive and multiple devices. It also includes some major bug fixes such as: - checkpoint version control - fdatasync-related roll-forward recovery routine - memory boundary or null-pointer access in corner cases - missing error cases It has various minor clean-up patches as well" * tag 'for-f2fs-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (66 commits) f2fs: fix a missing size change in f2fs_setattr f2fs: fix to access nullified flush_cmd_control pointer f2fs: free meta pages if sanity check for ckpt is failed f2fs: detect wrong layout f2fs: call sync_fs when f2fs is idle Revert "f2fs: use percpu_counter for # of dirty pages in inode" f2fs: return AOP_WRITEPAGE_ACTIVATE for writepage f2fs: do not activate auto_recovery for fallocated i_size f2fs: fix to determine start_cp_addr by sbi->cur_cp_pack f2fs: fix 32-bit build f2fs: set ->owner for debugfs status file's file_operations f2fs: fix incorrect free inode count in ->statfs f2fs: drop duplicate header timer.h f2fs: fix wrong AUTO_RECOVER condition f2fs: do not recover i_size if it's valid f2fs: fix fdatasync f2fs: fix to account total free nid correctly f2fs: fix an infinite loop when flush nodes in cp f2fs: don't wait writeback for datas during checkpoint f2fs: fix wrong written_valid_blocks counting ...
2016-12-14Merge tag 'dlm-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlmLinus Torvalds9-20/+22
Pull dlm fixes from David Teigland: "This set fixes error reporting for dlm sockets, removes the unbound property on the dlm callback workqueue to improve performance, and includes a couple trivial changes" * tag 'dlm-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: fix error return code in sctp_accept_from_sock() dlm: don't specify WQ_UNBOUND for the ast callback workqueue dlm: remove lock_sock to avoid scheduling while atomic dlm: don't save callbacks after accept dlm: audit and remove any unnecessary uses of module.h dlm: make genl_ops const
2016-12-14Merge tag 'jfs-4.10' of git://github.com/kleikamp/linux-shaggyLinus Torvalds1-1/+1
Pull jfs update from David Kleikamp: "The jfs piece of the current_time() series" * tag 'jfs-4.10' of git://github.com/kleikamp/linux-shaggy: fs: jfs: Replace CURRENT_TIME_SEC by current_time()
2016-12-13Merge tag 'for-linus-4.10-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tipLinus Torvalds2-1/+1
Pull xen updates from Juergen Gross: "Xen features and fixes for 4.10 These are some fixes, a move of some arm related headers to share them between arm and arm64 and a series introducing a helper to make code more readable. The most notable change is David stepping down as maintainer of the Xen hypervisor interface. This results in me sending you the pull requests for Xen related code from now on" * tag 'for-linus-4.10-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (29 commits) xen/balloon: Only mark a page as managed when it is released xenbus: fix deadlock on writes to /proc/xen/xenbus xen/scsifront: don't request a slot on the ring until request is ready xen/x86: Increase xen_e820_map to E820_X_MAX possible entries x86: Make E820_X_MAX unconditionally larger than E820MAX xen/pci: Bubble up error and fix description. xen: xenbus: set error code on failure xen: set error code on failures arm/xen: Use alloc_percpu rather than __alloc_percpu arm/arm64: xen: Move shared architecture headers to include/xen/arm xen/events: use xen_vcpu_id mapping for EVTCHNOP_status xen/gntdev: Use VM_MIXEDMAP instead of VM_IO to avoid NUMA balancing xen-scsifront: Add a missing call to kfree MAINTAINERS: update XEN HYPERVISOR INTERFACE xenfs: Use proc_create_mount_point() to create /proc/xen xen-platform: use builtin_pci_driver xen-netback: fix error handling output xen: make use of xenbus_read_unsigned() in xenbus xen: make use of xenbus_read_unsigned() in xen-pciback xen: make use of xenbus_read_unsigned() in xen-fbfront ...
2016-12-13Merge tag 'driver-core-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-coreLinus Torvalds1-2/+2
Pull driver core updates from Greg KH: "Here's the new driver core patches for 4.10-rc1. Big thing here is the nice addition of "functional dependencies" to the driver core. The idea has been talked about for a very long time, great job to Rafael for stepping up and implementing it. It's been tested for longer than the 4.9-rc1 date, we held off on merging it earlier in order to feel more comfortable about it. Other than that, it's just a handful of small other patches, some good cleanups to the mess that is the firmware class code, and we have a test driver for the deferred probe logic. All of these have been in linux-next for a while with no reported issues" * tag 'driver-core-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (30 commits) firmware: Correct handling of fw_state_wait() return value driver core: Silence device links sphinx warning firmware: remove warning at documentation generation time drivers: base: dma-mapping: Fix typo in dmam_alloc_non_coherent comments driver core: test_async: fix up typo found by 0-day firmware: move fw_state_is_done() into UHM section firmware: do not use fw_lock for fw_state protection firmware: drop bit ops in favor of simple state machine firmware: refactor loading status firmware: fix usermode helper fallback loading driver core: firmware_class: convert to use class_groups driver core: devcoredump: convert to use class_groups driver core: class: add class_groups support kernfs: Declare two local data structures static driver-core: fix platform_no_drv_owner.cocci warnings drivers/base/memory.c: Remove unused 'first_page' variable driver core: add CLASS_ATTR_WO() drivers: base: cacheinfo: support DT overrides for cache properties drivers: base: cacheinfo: add pr_fmt logging drivers: base: cacheinfo: fix boot error message when acpi is enabled ...
2016-12-13Merge branch 'for-4.10/block' of git://git.kernel.dk/linux-blockLinus Torvalds55-170/+413
Pull block layer updates from Jens Axboe: "This is the main block pull request this series. Contrary to previous release, I've kept the core and driver changes in the same branch. We always ended up having dependencies between the two for obvious reasons, so makes more sense to keep them together. That said, I'll probably try and keep more topical branches going forward, especially for cycles that end up being as busy as this one. The major parts of this pull request is: - Improved support for O_DIRECT on block devices, with a small private implementation instead of using the pig that is fs/direct-io.c. From Christoph. - Request completion tracking in a scalable fashion. This is utilized by two components in this pull, the new hybrid polling and the writeback queue throttling code. - Improved support for polling with O_DIRECT, adding a hybrid mode that combines pure polling with an initial sleep. From me. - Support for automatic throttling of writeback queues on the block side. This uses feedback from the device completion latencies to scale the queue on the block side up or down. From me. - Support from SMR drives in the block layer and for SD. From Hannes and Shaun. - Multi-connection support for nbd. From Josef. - Cleanup of request and bio flags, so we have a clear split between which are bio (or rq) private, and which ones are shared. From Christoph. - A set of patches from Bart, that improve how we handle queue stopping and starting in blk-mq. - Support for WRITE_ZEROES from Chaitanya. - Lightnvm updates from Javier/Matias. - Supoort for FC for the nvme-over-fabrics code. From James Smart. - A bunch of fixes from a whole slew of people, too many to name here" * 'for-4.10/block' of git://git.kernel.dk/linux-block: (182 commits) blk-stat: fix a few cases of missing batch flushing blk-flush: run the queue when inserting blk-mq flush elevator: make the rqhash helpers exported blk-mq: abstract out blk_mq_dispatch_rq_list() helper blk-mq: add blk_mq_start_stopped_hw_queue() block: improve handling of the magic discard payload blk-wbt: don't throttle discard or write zeroes nbd: use dev_err_ratelimited in io path nbd: reset the setup task for NBD_CLEAR_SOCK nvme-fabrics: Add FC LLDD loopback driver to test FC-NVME nvme-fabrics: Add target support for FC transport nvme-fabrics: Add host support for FC transport nvme-fabrics: Add FC transport LLDD api definitions nvme-fabrics: Add FC transport FC-NVME definitions nvme-fabrics: Add FC transport error codes to nvme.h Add type 0x28 NVME type code to scsi fc headers nvme-fabrics: patch target code in prep for FC transport support nvme-fabrics: set sqe.command_id in core not transports parser: add u64 number parser nvme-rdma: align to generic ib_event logging helper ...
2016-12-13Merge tag 'pstore-v4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linuxLinus Torvalds6-126/+293
Pull pstore updates from Kees Cook: "Improvements and fixes to pstore subsystem: - add additional checks for bad platform data - remove bounce buffer in console writer - protect read/unlink race with a mutex - correctly give up during dump locking failures - increase ftrace bandwidth by splitting ftrace buffers per CPU" * tag 'pstore-v4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: ramoops: add pdata NULL check to ramoops_probe pstore: Convert console write to use ->write_buf pstore: Protect unlink with read_mutex pstore: Use global ftrace filters for function trace filtering ftrace: Provide API to use global filtering for ftrace ops pstore: Clarify context field przs as dprzs pstore: improve error report for failed setup pstore: Merge per-CPU ftrace records into one pstore: Add ftrace timestamp counter ramoops: Split ftrace buffer space into per-CPU zones pstore: Make ramoops_init_przs generic for other prz arrays pstore: Allow prz to control need for locking pstore: Warn on PSTORE_TYPE_PMSG using deprecated function pstore: Make spinlock per zone instead of global pstore: Actually give up during locking failure
2016-12-12Merge tag 'docs-4.10' of git://git.lwn.net/linuxLinus Torvalds2-3/+3
Pull documentation update from Jonathan Corbet: "These are the documentation changes for 4.10. It's another busy cycle for the docs tree, as the sphinx conversion continues. Highlights include: - Further work on PDF output, which remains a bit of a pain but should be more solid now. - Five more DocBook template files converted to Sphinx. Only 27 to go... Lots of plain-text files have also been converted and integrated. - Images in binary formats have been replaced with more source-friendly versions. - Various bits of organizational work, including the renaming of various files discussed at the kernel summit. - New documentation for the device_link mechanism. ... and, of course, lots of typo fixes and small updates" * tag 'docs-4.10' of git://git.lwn.net/linux: (193 commits) dma-buf: Extract dma-buf.rst Update Documentation/00-INDEX docs: 00-INDEX: document directories/files with no docs docs: 00-INDEX: remove non-existing entries docs: 00-INDEX: add missing entries for documentation files/dirs docs: 00-INDEX: consolidate process/ and admin-guide/ description scripts: add a script to check if Documentation/00-INDEX is sane Docs: change sh -> awk in REPORTING-BUGS Documentation/core-api/device_link: Add initial documentation core-api: remove an unexpected unident ppc/idle: Add documentation for powersave=off Doc: Correct typo, "Introdution" => "Introduction" Documentation/atomic_ops.txt: convert to ReST markup Documentation/local_ops.txt: convert to ReST markup Documentation/assoc_array.txt: convert to ReST markup docs-rst: parse-headers.pl: cleanup the documentation docs-rst: fix media cleandocs target docs-rst: media/Makefile: reorganize the rules docs-rst: media: build SVG from graphviz files docs-rst: replace bayer.png by a SVG image ...
2016-12-12Merge branch 'akpm' (patches from Andrew)Linus Torvalds22-82/+101
Merge updates from Andrew Morton: - various misc bits - most of MM (quite a lot of MM material is awaiting the merge of linux-next dependencies) - kasan - printk updates - procfs updates - MAINTAINERS - /lib updates - checkpatch updates * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (123 commits) init: reduce rootwait polling interval time to 5ms binfmt_elf: use vmalloc() for allocation of vma_filesz checkpatch: don't emit unified-diff error for rename-only patches checkpatch: don't check c99 types like uint8_t under tools checkpatch: avoid multiple line dereferences checkpatch: don't check .pl files, improve absolute path commit log test scripts/checkpatch.pl: fix spelling checkpatch: don't try to get maintained status when --no-tree is given lib/ida: document locking requirements a bit better lib/rbtree.c: fix typo in comment of ____rb_erase_color lib/Kconfig.debug: make CONFIG_STRICT_DEVMEM depend on CONFIG_DEVMEM MAINTAINERS: add drm and drm/i915 irc channels MAINTAINERS: add "C:" for URI for chat where developers hang out MAINTAINERS: add drm and drm/i915 bug filing info MAINTAINERS: add "B:" for URI where to file bugs get_maintainer: look for arbitrary letter prefixes in sections printk: add Kconfig option to set default console loglevel printk/sound: handle more message headers printk/btrfs: handle more message headers printk/kdb: handle more message headers ...
2016-12-12Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-0/+2
Pull timer updates from Thomas Gleixner: "The time/timekeeping/timer folks deliver with this update: - Fix a reintroduced signed/unsigned issue and cleanup the whole signed/unsigned mess in the timekeeping core so this wont happen accidentaly again. - Add a new trace clock based on boot time - Prevent injection of random sleep times when PM tracing abuses the RTC for storage - Make posix timers configurable for real tiny systems - Add tracepoints for the alarm timer subsystem so timer based suspend wakeups can be instrumented - The usual pile of fixes and updates to core and drivers" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) timekeeping: Use mul_u64_u32_shr() instead of open coding it timekeeping: Get rid of pointless typecasts timekeeping: Make the conversion call chain consistently unsigned timekeeping_Force_unsigned_clocksource_to_nanoseconds_conversion alarmtimer: Add tracepoints for alarm timers trace: Update documentation for mono, mono_raw and boot clock trace: Add an option for boot clock as trace clock timekeeping: Add a fast and NMI safe boot clock timekeeping/clocksource_cyc2ns: Document intended range limitation timekeeping: Ignore the bogus sleep time if pm_trace is enabled selftests/timers: Fix spelling mistake "Asyncrhonous" -> "Asynchronous" clocksource/drivers/bcm2835_timer: Unmap region obtained by of_iomap clocksource/drivers/arm_arch_timer: Map frame with of_io_request_and_map() arm64: dts: rockchip: Arch counter doesn't tick in system suspend clocksource/drivers/arm_arch_timer: Don't assume clock runs in suspend posix-timers: Make them configurable posix_cpu_timers: Move the add_device_randomness() call to a proper place timer: Move sys_alarm from timer.c to itimer.c ptp_clock: Allow for it to be optional Kconfig: Regenerate *.c_shipped files after previous changes ...
2016-12-12Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-10/+6
Pull smp hotplug updates from Thomas Gleixner: "This is the final round of converting the notifier mess to the state machine. The removal of the notifiers and the related infrastructure will happen around rc1, as there are conversions outstanding in other trees. The whole exercise removed about 2000 lines of code in total and in course of the conversion several dozen bugs got fixed. The new mechanism allows to test almost every hotplug step standalone, so usage sites can exercise all transitions extensively. There is more room for improvement, like integrating all the pointlessly different architecture mechanisms of synchronizing, setting cpus online etc into the core code" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) tracing/rb: Init the CPU mask on allocation soc/fsl/qbman: Convert to hotplug state machine soc/fsl/qbman: Convert to hotplug state machine zram: Convert to hotplug state machine KVM/PPC/Book3S HV: Convert to hotplug state machine arm64/cpuinfo: Convert to hotplug state machine arm64/cpuinfo: Make hotplug notifier symmetric mm/compaction: Convert to hotplug state machine iommu/vt-d: Convert to hotplug state machine mm/zswap: Convert pool to hotplug state machine mm/zswap: Convert dst-mem to hotplug state machine mm/zsmalloc: Convert to hotplug state machine mm/vmstat: Convert to hotplug state machine mm/vmstat: Avoid on each online CPU loops mm/vmstat: Drop get_online_cpus() from init_cpu_node_state/vmstat_cpu_dead() tracing/rb: Convert to hotplug state machine oprofile/nmi timer: Convert to hotplug state machine net/iucv: Use explicit clean up labels in iucv_init() x86/pci/amd-bus: Convert to hotplug state machine x86/oprofile/nmi: Convert to hotplug state machine ...
2016-12-12binfmt_elf: use vmalloc() for allocation of vma_fileszJason Baron1-2/+4
We have observed page allocations failures of order 4 during core dump while trying to allocate vma_filesz. This results in a useless core file of size 0. To improve reliability use vmalloc(). Note that the vmalloc() allocation is bounded by sysctl_max_map_count, which is 65,530 by default. So with a 4k page size, and 8 bytes per seg, this is a max of 128 pages or an order 7 allocation. Other parts of the core dump path, such as fill_files_note() are already using vmalloc() for presumably similar reasons. Link: http://lkml.kernel.org/r/1479745791-17611-1-git-send-email-jbaron@akamai.com Signed-off-by: Jason Baron <jbaron@akamai.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12printk/btrfs: handle more message headersPetr Mladek1-11/+15
Commit 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing continuation lines") allows to define more message headers for a single message. The motivation is that continuous lines might get mixed. Therefore it make sense to define the right log level for every piece of a cont line. The current btrfs_printk() macros do not support continuous lines at the moment. But better be prepared for a custom messages and avoid potential "lvl" buffer overflow. This patch iterates over the entire message header. It is interested only into the message level like the original code. This patch also introduces PRINTK_MAX_SINGLE_HEADER_LEN. Three bytes are enough for the message level header at the moment. But it used to be three, see the commit 04d2c8c83d0e ("printk: convert the format for KERN_<LEVEL> to a 2 byte pattern"). Also I fixed the default ratelimit level. It looked very strange when it was different from the default log level. [pmladek@suse.com: Fix a check of the valid message level] Link: http://lkml.kernel.org/r/20161111183236.GD2145@dhcp128.suse.cz Link: http://lkml.kernel.org/r/1478695291-12169-4-git-send-email-pmladek@suse.com Signed-off-by: Petr Mladek <pmladek@suse.com> Acked-by: David Sterba <dsterba@suse.com> Cc: Joe Perches <joe@perches.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <jbacik@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12fs/proc: calculate /proc/* and /proc/*/task/* nlink at init timeAlexey Dobriyan3-6/+15
Runtime nlink calculation works but meh. I don't know how to do it at compile time, but I know how to do it at init time. Shift "2+" part into init time as a bonus. Link: http://lkml.kernel.org/r/20161122195549.GB29812@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12fs/proc/base.c: save decrement during lookup/readdir in /proc/$PIDAlexey Dobriyan1-4/+4
Comparison for "<" works equally well as comparison for "<=" but one SUB/LEA is saved (no, it is not optimised away, at least here). Link: http://lkml.kernel.org/r/20161122195143.GA29812@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12fs/proc/array.c: slightly improve render_sigset_tRasmus Villemoes1-1/+1
format_decode and vsnprintf occasionally show up in perf top, so I went looking for places that might not need the full printf power. With the help of kprobes, I gathered some statistics on which format strings we mostly pass to vsnprintf. On a trivial desktop workload, I hit "%x" 25% of the time, so something apparently reads /proc/pid/status (which does 5*16 printf("%x") calls) a lot. With this patch, reading /proc/pid/status is 30% faster according to this microbenchmark: char buf[4096]; int i, fd; for (i = 0; i < 10000; ++i) { fd = open("/proc/self/status", O_RDONLY); read(fd, buf, sizeof(buf)); close(fd); } Link: http://lkml.kernel.org/r/1474410485-1305-1-git-send-email-linux@rasmusvillemoes.dk Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Andrei Vagin <avagin@openvz.org> Acked-by: Kees Cook <keescook@chromium.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12proc: tweak comments about 2 stage open and everythingAlexey Dobriyan1-8/+21
Some comments were obsoleted since commit 05c0ae21c034 ("try a saner locking for pde_opener..."). Some new comments added. Some confusing comments replaced with equally confusing ones. Link: http://lkml.kernel.org/r/20161029160231.GD1246@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12proc: kmalloc struct pde_openerAlexey Dobriyan1-1/+3
kzalloc is too much, half of the fields will be reinitialized anyway. If proc file doesn't have ->release hook (some still do not), clearing is unnecessary because it will be freed immediately. Link: http://lkml.kernel.org/r/20161029155747.GC1246@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12proc: fix type of struct pde_opener::closing fieldAlexey Dobriyan2-2/+2
struct pde_opener::closing is boolean. Link: http://lkml.kernel.org/r/20161029155439.GB1246@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12proc: just list_del() struct pde_openerAlexey Dobriyan1-1/+1
list_del_init() is too much, structure will be freed in three lines anyway. Link: http://lkml.kernel.org/r/20161029155313.GA1246@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12proc: make struct struct map_files_info::len unsigned intAlexey Dobriyan1-1/+1
Linux doesn't support 4GB+ filenames in /proc, so unsigned long is too much. MOV r64, r/m64 is larger than MOV r32, r/m32. Link: http://lkml.kernel.org/r/20161029161123.GG1246@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12proc: make struct pid_entry::len unsignedAlexey Dobriyan1-1/+1
"unsigned int" is better on x86_64 because it most of the time it autoexpands to 64-bit value while "int" requires MOVSX instruction. Link: http://lkml.kernel.org/r/20161029160810.GF1246@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12proc: report no_new_privs stateKees Cook1-2/+3
Similar to being able to examine if a process has been correctly confined with seccomp, the state of no_new_privs is equally interesting, so this adds it to /proc/$pid/status. Link: http://lkml.kernel.org/r/20161103214041.GA58566@beast Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Jann Horn <jann@thejh.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@suse.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Rodrigo Freire <rfreire@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Robert Ho <robert.hu@intel.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Richard W.M. Jones" <rjones@redhat.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12mm: add cond_resched() in gather_pte_stats()Hugh Dickins1-0/+1
The other pagetable walks in task_mmu.c have a cond_resched() after walking their ptes: add a cond_resched() in gather_pte_stats() too, for reading /proc/<id>/numa_maps. Only pagemap_pmd_range() has a cond_resched() in its (unusually expensive) pmd_trans_huge case: more should probably be added, but leave them unchanged for now. Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1612052157400.13021@eggly.anvils Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: David Rientjes <rientjes@google.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12lib: radix-tree: update callback for changing leaf nodesJohannes Weiner1-1/+2
Support handing __radix_tree_replace() a callback that gets invoked for all leaf nodes that change or get freed as a result of the slot replacement, to assist users tracking nodes with node->private_list. This prepares for putting page cache shadow entries into the radix tree root again and drastically simplifying the shadow tracking. Link: http://lkml.kernel.org/r/20161117193134.GD23430@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox <mawilcox@linuxonhyperv.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12lib: radix-tree: check accounting of existing slot replacement usersJohannes Weiner1-2/+2
The bug in khugepaged fixed earlier in this series shows that radix tree slot replacement is fragile; and it will become more so when not only NULL<->!NULL transitions need to be caught but transitions from and to exceptional entries as well. We need checks. Re-implement radix_tree_replace_slot() on top of the sanity-checked __radix_tree_replace(). This requires existing callers to also pass the radix tree root, but it'll warn us when somebody replaces slots with contents that need proper accounting (transitions between NULL entries, real entries, exceptional entries) and where a replacement through the slot pointer would corrupt the radix tree node counts. Link: http://lkml.kernel.org/r/20161117193021.GB23430@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox <mawilcox@linuxonhyperv.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12lib: radix-tree: native accounting of exceptional entriesJohannes Weiner1-2/+3
The way the page cache is sneaking shadow entries of evicted pages into the radix tree past the node entry accounting and tracking them manually in the upper bits of node->count is fraught with problems. These shadow entries are marked in the tree as exceptional entries, which are a native concept to the radix tree. Maintain an explicit counter of exceptional entries in the radix tree node. Subsequent patches will switch shadow entry tracking over to that counter. DAX and shmem are the other users of exceptional entries. Since slot replacements that change the entry type from regular to exceptional must now be accounted, introduce a __radix_tree_replace() function that does replacement and accounting, and switch DAX and shmem over. The increase in radix tree node size is temporary. A followup patch switches the shadow tracking to this new scheme and we'll no longer need the upper bits in node->count and shrink that back to one byte. Link: http://lkml.kernel.org/r/20161117192945.GA23430@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox <mawilcox@linuxonhyperv.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12fs/fs-writeback.c: remove redundant if checkTahsin Erdogan1-9/+7
b_more_io non-empty check is already preceded by an opposite check. Link: http://lkml.kernel.org/r/1478591249-30641-1-git-send-email-tahsin@google.com Signed-off-by: Tahsin Erdogan <tahsin@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12ocfs2: replace CURRENT_TIME macroDeepa Dinamani3-4/+6
CURRENT_TIME is not y2038 safe. Use y2038 safe ktime_get_real_seconds() here for timestamps. struct heartbeat_block's hb_seq and deletetion time are already 64 bits wide and accommodate times beyond y2038. Also use y2038 safe ktime_get_real_ts64() for on disk inode timestamps. These are also wide enough to accommodate time64_t. Link: http://lkml.kernel.org/r/1475365298-29236-1-git-send-email-deepa.kernel@gmail.com Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12ocfs2: use time64_t to represent orphan scan timesDeepa Dinamani3-4/+4
struct timespec is not y2038 safe. Use time64_t which is y2038 safe to represent orphan scan times. time64_t is sufficient here as only the seconds delta times are relevant. Also use appropriate time functions that return time in time64_t format. Time functions now return monotonic time instead of real time as only delta scan times are relevant and these values are not persistent across reboots. The format string for the debug print is still using long as this is only the time elapsed since the last scan and long is sufficient to represent this value. Link: http://lkml.kernel.org/r/1475365138-20567-1-git-send-email-deepa.kernel@gmail.com Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12ocfs2: fix double put of recount tree in ocfs2_lock_refcount_tree()Ashish Samant1-1/+0
In ocfs2_lock_refcount_tree, if ocfs2_read_refcount_block() returns an error, we do ocfs2_refcount_tree_put twice (once in ocfs2_unlock_refcount_tree and once outside it), thereby reducing the refcount of the refcount tree twice, but we dont delete the tree in this case. This will make refcnt of the tree = 0 and the ocfs2_refcount_tree_put will eventually call ocfs2_mark_lockres_freeing, setting OCFS2_LOCK_FREEING for the refcount_tree->rf_lockres. The error returned by ocfs2_read_refcount_block is propagated all the way back and for next iteration of write, ocfs2_lock_refcount_tree gets the same tree back from ocfs2_get_refcount_tree because we havent deleted the tree. Now we have the same tree, but OCFS2_LOCK_FREEING is set for rf_lockres and eventually, when _ocfs2_lock_refcount_tree is called in this iteration, BUG_ON( __ocfs2_cluster_lock:1395 ERROR: Cluster lock called on freeing lockres T00000000000000000386019775b08d! flags 0x81) is triggerred. Call stack: (loop16,11155,0):ocfs2_lock_refcount_tree:482 ERROR: status = -5 (loop16,11155,0):ocfs2_refcount_cow_hunk:3497 ERROR: status = -5 (loop16,11155,0):ocfs2_refcount_cow:3560 ERROR: status = -5 (loop16,11155,0):ocfs2_prepare_inode_for_refcount:2111 ERROR: status = -5 (loop16,11155,0):ocfs2_prepare_inode_for_write:2190 ERROR: status = -5 (loop16,11155,0):ocfs2_file_write_iter:2331 ERROR: status = -5 (loop16,11155,0):__ocfs2_cluster_lock:1395 ERROR: bug expression: lockres->l_flags & OCFS2_LOCK_FREEING (loop16,11155,0):__ocfs2_cluster_lock:1395 ERROR: Cluster lock called on freeing lockres T00000000000000000386019775b08d! flags 0x81 kernel BUG at fs/ocfs2/dlmglue.c:1395! invalid opcode: 0000 [#1] SMP CPU 0 Modules linked in: tun ocfs2 jbd2 xen_blkback xen_netback xen_gntdev .. sd_mod crc_t10dif ext3 jbd mbcache RIP: __ocfs2_cluster_lock+0x31c/0x740 [ocfs2] RSP: e02b:ffff88017c0138a0 EFLAGS: 00010086 Process loop16 (pid: 11155, threadinfo ffff88017c010000, task ffff8801b5374300) Call Trace: ocfs2_refcount_lock+0xae/0x130 [ocfs2] __ocfs2_lock_refcount_tree+0x29/0xe0 [ocfs2] ocfs2_lock_refcount_tree+0xdd/0x320 [ocfs2] ocfs2_refcount_cow_hunk+0x1cb/0x440 [ocfs2] ocfs2_refcount_cow+0xa9/0x1d0 [ocfs2] ocfs2_prepare_inode_for_refcount+0x115/0x200 [ocfs2] ocfs2_prepare_inode_for_write+0x33b/0x470 [ocfs2] ocfs2_file_write_iter+0x220/0x8c0 [ocfs2] aio_write_iter+0x2e/0x30 Fix this by avoiding the second call to ocfs2_refcount_tree_put() Link: http://lkml.kernel.org/r/1473984404-32011-1-git-send-email-ashish.samant@oracle.com Signed-off-by: Ashish Samant <ashish.samant@oracle.com> Reviewed-by: Eric Ren <zren@suse.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12ocfs2: clean up unused 'page' parameter in ocfs2_write_end_nolock()piaojun3-8/+5
'page' parameter in ocfs2_write_end_nolock() is never used. Link: http://lkml.kernel.org/r/582FD91A.5000902@huawei.com Signed-off-by: Jun Piao <piaojun@huawei.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12ocfs2/dlm: clean up deadcode in dlm_master_request_handler()piaojun1-6/+0
When 'dispatch_assert' is set, 'response' must be DLM_MASTER_RESP_YES, and 'res' won't be null, so execution can't reach these two branch. Link: http://lkml.kernel.org/r/58174C91.3040004@huawei.com Signed-off-by: Jun Piao <piaojun@huawei.com> Reviewed-by: Joseph Qi Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12ocfs2: delete redundant code and set the node bit into maybe_map directlyGuozhonghua1-4/+1
The variable `set_maybe' is redundant when the mle has been found in the map. So it is ok to set the node_idx into mle's maybe_map directly. Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA4A3D490DD@H3CMLB12-EX.srv.huawei-3com.com Signed-off-by: Guozhonghua <guozhonghua@h3c.com> Reviewed-by: Mark Fasheh <mfasheh@versity.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12ocfs2/dlm: clean up useless BUG_ON default case in dlm_finalize_reco_handler()piaojun1-2/+0
The value of 'stage' must be between 1 and 2, so the switch can't reach the default case. Link: http://lkml.kernel.org/r/57FB5EB2.7050002@huawei.com Signed-off-by: Jun Piao <piaojun@huawei.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-12Merge branch 'fscrypt' into devTheodore Ts'o12-128/+214
2016-12-12dax: Fix sleep in atomic contex in grab_mapping_entry()Jan Kara1-8/+7
Commit 642261ac995e: "dax: add struct iomap based DAX PMD support" has introduced unmapping of page tables if huge page needs to be split in grab_mapping_entry(). However the unmapping happens after radix_tree_preload() call which disables preemption and thus unmap_mapping_range() tries to acquire i_mmap_lock in atomic context which is a bug. Fix the problem by moving unmapping before radix_tree_preload() call. Fixes: 642261ac995e01d7837db1f4b90181496f7e6835 Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-12-12f2fs: fix a missing size change in f2fs_setattrYunlei He1-2/+5
This patch fix a missing size change in f2fs_setattr Signed-off-by: Yunlei He <heyunlei@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-12-11fscrypt: Rename FS_WRITE_PATH_FL to FS_CTX_HAS_BOUNCE_BUFFER_FLDavid Gstir2-4/+4
... to better explain its purpose after introducing in-place encryption without bounce buffer. Signed-off-by: David Gstir <david@sigma-star.at> Signed-off-by: Theodore Ts'o <tytso@mit.edu>