aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2015-06-12ext4: use swap() in memswap()Fabian Frederick1-4/+1
Use kernel.h macro definition. Thanks to Julia Lawall for Coccinelle scripting support. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-06-12ext4: fix race between truncate and __ext4_journalled_writepage()Theodore Ts'o1-4/+19
The commit cf108bca465d: "ext4: Invert the locking order of page_lock and transaction start" caused __ext4_journalled_writepage() to drop the page lock before the page was written back, as part of changing the locking order to jbd2_journal_start -> page_lock. However, this introduced a potential race if there was a truncate racing with the data=journalled writeback mode. Fix this by grabbing the page lock after starting the journal handle, and then checking to see if page had gotten truncated out from under us. This fixes a number of different warnings or BUG_ON's when running xfstests generic/086 in data=journalled mode, including: jbd2_journal_dirty_metadata: vdc-8: bad jh for block 115643: transaction (ee3fe7 c0, 164), jh->b_transaction ( (null), 0), jh->b_next_transaction ( (null), 0), jlist 0 - and - kernel BUG at /usr/projects/linux/ext4/fs/jbd2/transaction.c:2200! ... Call Trace: [<c02b2ded>] ? __ext4_journalled_invalidatepage+0x117/0x117 [<c02b2de5>] __ext4_journalled_invalidatepage+0x10f/0x117 [<c02b2ded>] ? __ext4_journalled_invalidatepage+0x117/0x117 [<c027d883>] ? lock_buffer+0x36/0x36 [<c02b2dfa>] ext4_journalled_invalidatepage+0xd/0x22 [<c0229139>] do_invalidatepage+0x22/0x26 [<c0229198>] truncate_inode_page+0x5b/0x85 [<c022934b>] truncate_inode_pages_range+0x156/0x38c [<c0229592>] truncate_inode_pages+0x11/0x15 [<c022962d>] truncate_pagecache+0x55/0x71 [<c02b913b>] ext4_setattr+0x4a9/0x560 [<c01ca542>] ? current_kernel_time+0x10/0x44 [<c026c4d8>] notify_change+0x1c7/0x2be [<c0256a00>] do_truncate+0x65/0x85 [<c0226f31>] ? file_ra_state_init+0x12/0x29 - and - WARNING: CPU: 1 PID: 1331 at /usr/projects/linux/ext4/fs/jbd2/transaction.c:1396 irty_metadata+0x14a/0x1ae() ... Call Trace: [<c01b879f>] ? console_unlock+0x3a1/0x3ce [<c082cbb4>] dump_stack+0x48/0x60 [<c0178b65>] warn_slowpath_common+0x89/0xa0 [<c02ef2cf>] ? jbd2_journal_dirty_metadata+0x14a/0x1ae [<c0178bef>] warn_slowpath_null+0x14/0x18 [<c02ef2cf>] jbd2_journal_dirty_metadata+0x14a/0x1ae [<c02d8615>] __ext4_handle_dirty_metadata+0xd4/0x19d [<c02b2f44>] write_end_fn+0x40/0x53 [<c02b4a16>] ext4_walk_page_buffers+0x4e/0x6a [<c02b59e7>] ext4_writepage+0x354/0x3b8 [<c02b2f04>] ? mpage_release_unused_pages+0xd4/0xd4 [<c02b1b21>] ? wait_on_buffer+0x2c/0x2c [<c02b5a4b>] ? ext4_writepage+0x3b8/0x3b8 [<c02b5a5b>] __writepage+0x10/0x2e [<c0225956>] write_cache_pages+0x22d/0x32c [<c02b5a4b>] ? ext4_writepage+0x3b8/0x3b8 [<c02b6ee8>] ext4_writepages+0x102/0x607 [<c019adfe>] ? sched_clock_local+0x10/0x10e [<c01a8a7c>] ? __lock_is_held+0x2e/0x44 [<c01a8ad5>] ? lock_is_held+0x43/0x51 [<c0226dff>] do_writepages+0x1c/0x29 [<c0276bed>] __writeback_single_inode+0xc3/0x545 [<c0277c07>] writeback_sb_inodes+0x21f/0x36d ... Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
2015-06-12ext4 crypto: fail the mount if blocksize != pagesizeTheodore Ts'o1-1/+9
We currently don't correctly handle the case where blocksize != pagesize, so disallow the mount in those cases. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-06-09ext4: Add support FALLOC_FL_INSERT_RANGE for fallocateNamjae Jeon3-52/+292
This patch implements fallocate's FALLOC_FL_INSERT_RANGE for Ext4. 1) Make sure that both offset and len are block size aligned. 2) Update the i_size of inode by len bytes. 3) Compute the file's logical block number against offset. If the computed block number is not the starting block of the extent, split the extent such that the block number is the starting block of the extent. 4) Shift all the extents which are lying between [offset, last allocated extent] towards right by len bytes. This step will make a hole of len bytes at offset. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
2015-06-08jbd2: speedup jbd2_journal_get_[write|undo]_access()Jan Kara2-5/+73
jbd2_journal_get_write_access() and jbd2_journal_get_create_access() are frequently called for buffers that are already part of the running transaction - most frequently it is the case for bitmaps, inode table blocks, and superblock. Since in such cases we have nothing to do, it is unfortunate we still grab reference to journal head, lock the bh, lock bh_state only to find out there's nothing to do. Improving this is a bit subtle though since until we find out journal head is attached to the running transaction, it can disappear from under us because checkpointing / commit decided it's no longer needed. We deal with this by protecting journal_head slab with RCU. We still have to be careful about journal head being freed & reallocated within slab and about exposing journal head in consistent state (in particular b_modified and b_frozen_data must be in correct state before we allow user to touch the buffer). Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-06-08jbd2: more simplifications in do_get_write_access()Jan Kara1-71/+59
Check for the simple case of unjournaled buffer first, handle it and bail out. This allows us to remove one if and unindent the difficult case by one tab. The result is easier to read. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-06-08jbd2: simplify error path on allocation failure in do_get_write_access()Jan Kara1-2/+1
We were acquiring bh_state_lock when allocation of buffer failed in do_get_write_access() only to be able to jump to a label that releases the lock and does all other checks that don't make sense for this error path. Just jump into the right label instead. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-06-08jbd2: simplify code flow in do_get_write_access()Jan Kara1-24/+25
needs_copy is set only in one place in do_get_write_access(), just move the frozen buffer copying into that place and factor it out to a separate function to make do_get_write_access() slightly more readable. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-06-08ext4 crypto: fix sparse warnings in fs/ext4/ioctl.cFabian Frederick1-3/+3
[ Added another sparse fix for EXT4_IOC_GET_ENCRYPTION_POLICY while we're at it. --tytso ] Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-06-08ext4: BUG_ON assertion repeated for inode1, not done for inode2David Moore1-1/+1
During a source code review of fs/ext4/extents.c I noted identical consecutive lines. An assertion is repeated for inode1 and never done for inode2. This is not in keeping with the rest of the code in the ext4_swap_extents function and appears to be a bug. Assert that the inode2 mutex is not locked. Signed-off-by: David Moore <dmoorefo@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2015-06-08ext4 crypto: fix ext4_get_crypto_ctx()'s calling convention in ext4_decrypt_oneTheodore Ts'o1-2/+2
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-06-08ext4: return error code from ext4_mb_good_group()Lukas Czerner1-5/+20
Currently ext4_mb_good_group() only returns 0 or 1 depending on whether the allocation group is suitable for use or not. However we might get various errors and fail while initializing new group including -EIO which would never get propagated up the call chain. This might lead to an endless loop at writeback when we're trying to find a good group to allocate from and we fail to initialize new group (read error for example). Fix this by returning proper error code from ext4_mb_good_group() and using it in ext4_mb_regular_allocator(). In ext4_mb_regular_allocator() we will always return only the first occurred error from ext4_mb_good_group() and we only propagate it back to the caller if we do not get any other errors and we fail to allocate any blocks. Note that with other modes than errors=continue, we will fail immediately in ext4_mb_good_group() in case of error, however with errors=continue we should try to continue using the file system, that's why we're not going to fail immediately when we see an error from ext4_mb_good_group(), but rather when we fail to find a suitable block group to allocate from due to an problem in group initialization. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
2015-06-08ext4: try to initialize all groups we can in case of failure on ppc64Lukas Czerner1-3/+6
Currently on the machines with page size > block size when initializing block group buddy cache we initialize it for all the block group bitmaps in the page. However in the case of read error, checksum error, or if a single bitmap is in any way corrupted we would fail to initialize all of the bitmaps. This is problematic because we will not have access to the other allocation groups even though those might be perfectly fine and usable. Fix this by reading all the bitmaps instead of error out on the first problem and simply skip the bitmaps which were either not read properly, or are not valid. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-06-08ext4: verify block bitmap even after fresh initializationLukas Czerner1-2/+2
If we want to rely on the buffer_verified() flag of the block bitmap buffer, we have to set it consistently. However currently if we're initializing uninitialized block bitmap in ext4_read_block_bitmap_nowait() we're not going to set buffer verified at all. We can do this by simply setting the flag on the buffer, but I think it's actually better to run ext4_validate_block_bitmap() to make sure that what we did in the ext4_init_block_bitmap() is right. So run ext4_validate_block_bitmap() even after the block bitmap initialization. Also bail out early from ext4_validate_block_bitmap() if we see corrupt bitmap, since we already know it's corrupt and we do not need to verify that. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-06-08jbd2: revert must-not-fail allocation loops back to GFP_NOFAILMichal Hocko2-23/+8
This basically reverts 47def82672b3 (jbd2: Remove __GFP_NOFAIL from jbd2 layer). The deprecation of __GFP_NOFAIL was a bad choice because it led to open coding the endless loop around the allocator rather than removing the dependency on the non failing allocation. So the deprecation was a clear failure and the reality tells us that __GFP_NOFAIL is not even close to go away. It is still true that __GFP_NOFAIL allocations are generally discouraged and new uses should be evaluated and an alternative (pre-allocations or reservations) should be considered but it doesn't make any sense to lie the allocator about the requirements. Allocator can take steps to help making a progress if it knows the requirements. Signed-off-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Acked-by: David Rientjes <rientjes@google.com>
2015-06-03ext4 crypto: allocate bounce pages using GFP_NOWAITTheodore Ts'o2-23/+7
Previously we allocated bounce pages using a combination of alloc_page() and mempool_alloc() with the __GFP_WAIT bit set. Instead, use mempool_alloc() with GFP_NOWAIT. The mempool_alloc() function will try using alloc_pages() initially, and then only use the mempool reserve of pages if alloc_pages() is unable to fulfill the request. This minimizes the the impact on the mm layer when we need to do a large amount of writeback of encrypted files, as Jaeguk Kim had reported that under a heavy fio workload on a system with restricted amounts memory (which unfortunately, includes many mobile handsets), he had observed the the OOM killer getting triggered several times. Using GFP_NOWAIT If the mempool_alloc() function fails, we will retry the page writeback at a later time; the function of the mempool is to ensure that we can writeback at least 32 pages at a time, so we can more efficiently dispatch I/O under high memory pressure situations. In the future we should make this be a tunable so we can determine the best tradeoff between permanently sequestering memory and the ability to quickly launder pages so we can free up memory quickly when necessary. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31ext4 crypto: release crypto resource on module exitChao Yu1-0/+1
Crypto resource should be released when ext4 module exits, otherwise it will cause memory leak. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31ext4 crypto: handle unexpected lack of encryption keysTheodore Ts'o3-9/+14
Fix up attempts by users to try to write to a file when they don't have access to the encryption key. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31ext4 crypto: allocate the right amount of memory for the on-disk symlinkTheodore Ts'o3-21/+37
Previously we were taking the required padding when allocating space for the on-disk symlink. This caused a buffer overrun which could trigger a krenel crash when running fsstress. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31ext4 crypto: clean up error handling in ext4_fname_setup_filenameTheodore Ts'o1-19/+16
Fix a potential memory leak where fname->crypto_buf.name wouldn't get freed in some error paths, and also make the error handling easier to understand/audit. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31ext4 crypto: policies may only be set on directoriesTheodore Ts'o1-0/+2
Thanks to Chao Yu <chao2.yu@samsung.com> for pointing out we were missing this check. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31ext4 crypto: enforce crypto policy restrictions on cross-renamesTheodore Ts'o1-0/+9
Thanks to Chao Yu <chao2.yu@samsung.com> for pointing out the need for this check. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31ext4 crypto: encrypt tmpfile located in encryption protected directoryTheodore Ts'o3-34/+30
Factor out calls to ext4_inherit_context() and move them to __ext4_new_inode(); this fixes a problem where ext4_tmpfile() wasn't calling calling ext4_inherit_context(), so the temporary file wasn't getting protected. Since the blocks for the tmpfile could end up on disk, they really should be protected if the tmpfile is created within the context of an encrypted directory. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31ext4 crypto: make sure the encryption info is initialized on opendir(2)Theodore Ts'o1-0/+8
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31ext4 crypto: set up encryption info for new inodes in ext4_inherit_context()Theodore Ts'o1-0/+1
Set up the encryption information for newly created inodes immediately after they inherit their encryption context from their parent directories. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31ext4 crypto: fix memory leaks in ext4_encrypted_zerooutTheodore Ts'o1-31/+31
ext4_encrypted_zeroout() could end up leaking a bio and bounce page. Fortunately it's not used much. While we're fixing things up, refactor out common code into the static function alloc_bounce_page() and fix up error handling if mempool_alloc() fails. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31ext4 crypto: use per-inode tfm structureTheodore Ts'o9-156/+96
As suggested by Herbert Xu, we shouldn't allocate a new tfm each time we read or write a page. Instead we can use a single tfm hanging off the inode's crypt_info structure for all of our encryption needs for that inode, since the tfm can be used by multiple crypto requests in parallel. Also use cmpxchg() to avoid races that could result in crypt_info structure getting doubly allocated or doubly freed. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31ext4 crypto: require CONFIG_CRYPTO_CTR if ext4 encryption is enabledTheodore Ts'o1-0/+1
On arm64 this is apparently needed for CTS mode to function correctly. Otherwise attempts to use CTS return ENOENT. Change-Id: I732ea9a5157acc76de5b89edec195d0365f4ca63 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-31ext4 crypto: shrink size of the ext4_crypto_ctx structureTheodore Ts'o4-34/+30
Some fields are only used when the crypto_ctx is being used on the read path, some are only used on the write path, and some are only used when the structure is on free list. Optimize memory use by using a union. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-18ext4 crypto: get rid of ci_mode from struct ext4_crypt_infoTheodore Ts'o4-15/+12
The ci_mode field was superfluous, and getting rid of it gets rid of an unused hole in the structure. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-18ext4 crypto: use slab cachesTheodore Ts'o3-34/+39
Use slab caches the ext4_crypto_ctx and ext4_crypt_info structures for slighly better memory efficiency and debuggability. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-18ext4: clean up superblock encryption mode fieldsTheodore Ts'o4-32/+7
The superblock fields s_file_encryption_mode and s_dir_encryption_mode are vestigal, so remove them as a cleanup. While we're at it, allow file systems with both encryption and inline_data enabled at the same time to work correctly. We can't have encrypted inodes with inline data, but there's no reason to prohibit unencrypted inodes from using the inline data feature. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-18ext4 crypto: reorganize how we store keys in the inodeTheodore Ts'o11-346/+246
This is a pretty massive patch which does a number of different things: 1) The per-inode encryption information is now stored in an allocated data structure, ext4_crypt_info, instead of directly in the node. This reduces the size usage of an in-memory inode when it is not using encryption. 2) We drop the ext4_fname_crypto_ctx entirely, and use the per-inode encryption structure instead. This remove an unnecessary memory allocation and free for the fname_crypto_ctx as well as allowing us to reuse the ctfm in a directory for multiple lookups and file creations. 3) We also cache the inode's policy information in the ext4_crypt_info structure so we don't have to continually read it out of the extended attributes. 4) We now keep the keyring key in the inode's encryption structure instead of releasing it after we are done using it to derive the per-inode key. This allows us to test to see if the key has been revoked; if it has, we prevent the use of the derived key and free it. 5) When an inode is released (or when the derived key is freed), we will use memset_explicit() to zero out the derived key, so it's not left hanging around in memory. This implies that when a user logs out, it is important to first revoke the key, and then unlink it, and then finally, to use "echo 3 > /proc/sys/vm/drop_caches" to release any decrypted pages and dcache entries from the system caches. 6) All this, and we also shrink the number of lines of code by around 100. :-) Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-18ext4 crypto: separate kernel and userspace structure for the keyTheodore Ts'o6-48/+43
Use struct ext4_encryption_key only for the master key passed via the kernel keyring. For internal kernel space users, we now use struct ext4_crypt_info. This will allow us to put information from the policy structure so we can cache it and avoid needing to constantly looking up the extended attribute. We will do this in a spearate patch. This patch is mostly mechnical to make it easier for patch review. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-18ext4 crypto: don't allocate a page when encrypting/decrypting file namesTheodore Ts'o5-54/+28
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-18ext4 crypto: optimize filename encryptionTheodore Ts'o4-313/+230
Encrypt the filename as soon it is passed in by the user. This avoids our needing to encrypt the filename 2 or 3 times while in the process of creating a filename. Similarly, when looking up a directory entry, encrypt the filename early, or if the encryption key is not available, base-64 decode the file syystem so that the hash value and the last 16 bytes of the encrypted filename is available in the new struct ext4_filename data structure. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-18Linux 4.1-rc4Linus Torvalds1-1/+1
2015-05-18watchdog: Fix merge 'conflict'Peter Zijlstra1-5/+15
Two watchdog changes that came through different trees had a non conflicting conflict, that is, one changed the semantics of a variable but no actual code conflict happened. So the merge appeared fine, but the resulting code did not behave as expected. Commit 195daf665a62 ("watchdog: enable the new user interface of the watchdog mechanism") changes the semantics of watchdog_user_enabled, which thereafter is only used by the functions introduced by b3738d293233 ("watchdog: Add watchdog enable/disable all functions"). There further appears to be a distinct lack of serialization between setting and using watchdog_enabled, so perhaps we should wrap the {en,dis}able_all() things in watchdog_proc_mutex. This patch fixes a s2r failure reported by Michal; which I cannot readily explain. But this does make the code internally consistent again. Reported-and-tested-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-05-15Documentation: dt: mtd: replace "nor-jedec" binding with "jedec, spi-nor"Brian Norris2-6/+6
In commit 8ff16cf77ce3 ("Documentation: devicetree: m25p80: add "nor-jedec" binding"), we added a generic "nor-jedec" binding to catch all mostly-compatible SPI NOR flash which can be detected via the READ ID opcode (0x9F). This was discussed and reviewed at the time, however objections have come up since then as part of this discussion: http://lkml.kernel.org/g/20150511224646.GJ32500@ld-irv-0074 It seems the parties involved agree that "jedec,spi-nor" does a better job of capturing the fact that this is SPI-specific, not just any NOR flash. This binding was only merged for v4.1-rc1, so it's still OK to change the naming. At the same time, let's move the documentation to a better name. Next up: stop referring to code (drivers/mtd/devices/m25p80.c) from the documentation. Signed-off-by: Brian Norris <computersforpeace@gmail.com> Cc: Marek Vasut <marex@denx.de> Cc: Rafał Miłecki <zajec5@gmail.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Ian Campbell <ijc+devicetree@hellion.org.uk> Cc: Kumar Gala <galak@codeaurora.org> Cc: devicetree@vger.kernel.org Acked-by: Stephen Warren <swarren@nvidia.com> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Mark Rutland <mark.rutland@arm.com>
2015-05-15MIPS: tlb-r4k: Fix PG_ELPA commentJames Hogan1-1/+1
The ELPA bit in PageGrain is all about large *physical* addresses, so correct the reference to "large virtual address" in the comment above where it is set for MIPS64. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: David Daney <ddaney@caviumnetworks.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/10038/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-05-15MIPS: Fix up obsolete cpu_set usageEzequiel Garcia1-1/+1
cpu_set was removed (along with a bunch of cpumask helpers) by commit 2f0f267ea072 ("cpumask: remove deprecated functions."). Fix this by replacing cpu_set with cpumask_set_cpu. Without this fix the following error is triggered when CONFIG_MIPS_MT_FPAFF=y. arch/mips/kernel/smp-cps.c: In function 'cps_smp_setup': arch/mips/kernel/smp-cps.c:95:3: error: implicit declaration of function 'cpu_set' [-Werror=implicit-function-declaration] Fixes: 90db024f140d ("MIPS: smp-cps: cpu_set FPU mask if FPU present") Signed-off-by: Ezequiel Garcia <ezequiel.garcia@imgtec.com> Acked-by: Niklas Cassel <niklass@axis.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/9912/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-05-15MAINTAINERS: Add dts entries for some of the Marvell SoCsGregory CLEMENT1-1/+9
Since many releases, the modifications of the mvebu and berlin device tree files are merged through the mvebu subsystem. This patch makes it official in order to help the contributors using the get_maintainer.pl to find the accurate peoples. In the same time, updated the mvebu description which now includes the kirkwood SoCs and new Armada SoCs. Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Acked-by: Jason Cooper <jason@lakedaemon.net> Acked-by: Andrew Lunn <andrew@lunn.ch>
2015-05-15ext4: fix an ext3 collapse range regression in xfstestsTheodore Ts'o1-0/+8
The xfstests test suite assumes that an attempt to collapse range on the range (0, 1) will return EOPNOTSUPP if the file system does not support collapse range. Commit 280227a75b56: "ext4: move check under lock scope to close a race" broke this, and this caused xfstests to fail when run when testing file systems that did not have the extents feature enabled. Reported-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2015-05-14mm, numa: really disable NUMA balancing by default on single node machinesMel Gorman1-1/+1
NUMA balancing is meant to be disabled by default on UMA machines but the check is using nr_node_ids (highest node) instead of num_online_nodes (online nodes). The consequences are that a UMA machine with a node ID of 1 or higher will enable NUMA balancing. This will incur useless overhead due to minor faults with the impact depending on the workload. These are the impact on the stats when running a kernel build on a single node machine whose node ID happened to be 1: vanilla patched NUMA base PTE updates 5113158 0 NUMA huge PMD updates 643 0 NUMA page range updates 5442374 0 NUMA hint faults 2109622 0 NUMA hint local faults 2109622 0 NUMA hint local percent 100 100 NUMA pages migrated 0 0 Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: <stable@vger.kernel.org> [3.8+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-05-14MAINTAINERS: update Jingoo Han's email addressJingoo Han1-5/+5
Change my private email address. Signed-off-by: Jingoo Han <jingoohan1@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-05-14CMA: page_isolation: check buddy before accessing itHui Zhu1-1/+2
I had an issue: Unable to handle kernel NULL pointer dereference at virtual address 0000082a pgd = cc970000 [0000082a] *pgd=00000000 Internal error: Oops: 5 [#1] PREEMPT SMP ARM PC is at get_pageblock_flags_group+0x5c/0xb0 LR is at unset_migratetype_isolate+0x148/0x1b0 pc : [<c00cc9a0>] lr : [<c0109874>] psr: 80000093 sp : c7029d00 ip : 00000105 fp : c7029d1c r10: 00000001 r9 : 0000000a r8 : 00000004 r7 : 60000013 r6 : 000000a4 r5 : c0a357e4 r4 : 00000000 r3 : 00000826 r2 : 00000002 r1 : 00000000 r0 : 0000003f Flags: Nzcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5387d Table: 2cb7006a DAC: 00000015 Backtrace: get_pageblock_flags_group+0x0/0xb0 unset_migratetype_isolate+0x0/0x1b0 undo_isolate_page_range+0x0/0xdc __alloc_contig_range+0x0/0x34c alloc_contig_range+0x0/0x18 This issue is because when calling unset_migratetype_isolate() to unset a part of CMA memory, it try to access the buddy page to get its status: if (order >= pageblock_order) { page_idx = page_to_pfn(page) & ((1 << MAX_ORDER) - 1); buddy_idx = __find_buddy_index(page_idx, order); buddy = page + (buddy_idx - page_idx); if (!is_migrate_isolate_page(buddy)) { But the begin addr of this part of CMA memory is very close to a part of memory that is reserved at boot time (not in buddy system). So add a check before accessing it. [akpm@linux-foundation.org: use conventional code layout] Signed-off-by: Hui Zhu <zhuhui@xiaomi.com> Suggested-by: Laura Abbott <labbott@redhat.com> Suggested-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-05-14uidgid: make uid_valid and gid_valid work with !CONFIG_MULTIUSERJosh Triplett1-2/+2
{u,g}id_valid call {u,g}id_eq, which calls __k{u,g}id_val on both arguments and compares. With !CONFIG_MULTIUSER, __k{u,g}id_val return a constant 0, which makes {u,g}id_valid always return false. Change {u,g}id_valid to compare their argument against -1 instead. That produces identical results in the normal CONFIG_MULTIUSER=y case, but with !CONFIG_MULTIUSER will make {u,g}id_valid constant-fold into "return true;" rather than "return false;". This fixes uses of devpts without CONFIG_MULTIUSER. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Reported-by: Fengguang Wu <fengguang.wu@intel.com>, Cc: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-05-14kernfs: do not account ino_ida allocations to memcgVladimir Davydov1-1/+8
root->ino_ida is used for kernfs inode number allocations. Since IDA has a layered structure, different IDs can reside on the same layer, which is currently accounted to some memory cgroup. The problem is that each kmem cache of a memory cgroup has its own directory on sysfs (under /sys/fs/kernel/<cache-name>/cgroup). If the inode number of such a directory or any file in it gets allocated from a layer accounted to the cgroup which the cache is created for, the cgroup will get pinned for good, because one has to free all kmem allocations accounted to a cgroup in order to release it and destroy all its kmem caches. That said we must not account layers of ino_ida to any memory cgroup. Since per net init operations may create new sysfs entries directly (e.g. lo device) or indirectly (nf_conntrack creates a new kmem cache per each namespace, which, in turn, creates new sysfs entries), an easy way to reproduce this issue is by creating network namespace(s) from inside a kmem-active memory cgroup. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Greg Thelen <gthelen@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> [4.0.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-05-14gfp: add __GFP_NOACCOUNTVladimir Davydov3-1/+8
Not all kmem allocations should be accounted to memcg. The following patch gives an example when accounting of a certain type of allocations to memcg can effectively result in a memory leak. This patch adds the __GFP_NOACCOUNT flag which if passed to kmalloc and friends will force the allocation to go through the root cgroup. It will be used by the next patch. Note, since in case of kmemleak enabled each kmalloc implies yet another allocation from the kmemleak_object cache, we add __GFP_NOACCOUNT to gfp_kmemleak_mask. Alternatively, we could introduce a per kmem cache flag disabling accounting for all allocations of a particular kind, but (a) we would not be able to bypass accounting for kmalloc then and (b) a kmem cache with this flag set could not be merged with a kmem cache without this flag, which would increase the number of global caches and therefore fragmentation even if the memory cgroup controller is not used. Despite its generic name, currently __GFP_NOACCOUNT disables accounting only for kmem allocations while user page allocations are always charged. To catch abusing of this flag, a warning is issued on an attempt of passing it to mem_cgroup_try_charge. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Tejun Heo <tj@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Greg Thelen <gthelen@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> [4.0.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-05-14tools/vm: fix page-flags buildAndi Kleen1-1/+1
libabikfs.a doesn't exist anymore, so we now need to link with libapi.a. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>