linux-dev - Linux kernel development work

Age	Commit message	Author	Files	Lines
2015-08-19	xfs: fix btree cursor error cleanups	Brian Foster	2	-2/+3
	The btree cursor cleanup function takes an error parameter that affects how buffers are released from the cursor. All buffers are released in the event of error. Several callers do not specify the XFS_BTREE_ERROR flag in the event of error, however. This can cause buffers to hang around locked or with an elevated hold count and thus lead to umount hangs in the event of errors. Fix up the xfs_btree_del_cursor() callers to pass XFS_BTREE_ERROR if the cursor is being torn down due to error. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
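The caller-side pattern this commit describes can be sketched in user space as follows. This is a minimal mock, not the XFS code: `mock_cursor`, `mock_btree_del_cursor()` and `finish_lookup()` are hypothetical names, and the two flag values are stand-ins for the definitions in the XFS btree headers.

```c
/* Stand-in values; the real flags live in the XFS btree headers. */
#define XFS_BTREE_NOERROR 0
#define XFS_BTREE_ERROR   1

/* Hypothetical cursor: counts buffers still held and remembers the last flag. */
struct mock_cursor {
	int held_buffers;
	int last_flag;
};

/* Mock teardown: on error, every buffer still attached to the cursor is released. */
static void mock_btree_del_cursor(struct mock_cursor *cur, int error_flag)
{
	cur->last_flag = error_flag;
	if (error_flag == XFS_BTREE_ERROR)
		cur->held_buffers = 0;
}

/* Caller pattern: choose the teardown flag from the error state. */
static int finish_lookup(struct mock_cursor *cur, int error)
{
	mock_btree_del_cursor(cur, error ? XFS_BTREE_ERROR : XFS_BTREE_NOERROR);
	return error;
}
```

Passing the no-error flag on a failed path is the mistake the commit fixes: in this mock it would leave `held_buffers` non-zero after teardown.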
2015-08-19	xfs: clean up root inode properly on mount failure	Brian Foster	1	-0/+2
	The root inode is read as part of the xfs_mountfs() sequence and the reference is dropped in the event of failure after we grab the inode. The reference drop doesn't necessarily free the inode, however. It marks it for reclaim and potentially kicks off the reclaim workqueue. The workqueue is destroyed further up the error path, which means we are subject to crash if the workqueue job runs after this point or a memory leak which is identified if the xfs_inode_zone is destroyed (e.g., on module removal). Both of these outcomes are reproducible via manual instrumentation of a mount error after the root inode xfs_iget() call in xfs_mountfs(). Update the xfs_mountfs() error path to cancel any potential reclaim work items and to run a synchronous inode reclaim if the root inode is marked for reclaim. This ensures that no jobs remain on the queue before it is destroyed and that the root inode is freed before the reclaim mechanism is torn down. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-08-19	xfs: checksum log record ext headers based on record size	Brian Foster	1	-1/+6
	The first 4 bytes of every basic block in the physical log is stamped with the current lsn. To support this mechanism, the log record header (first block of each new log record) contains space for the original first byte of each log record block before it is replaced with the lsn. The log record header has space for 32k worth of blocks. The version 2 log adds new extended record headers for each additional 32k worth of blocks beyond what is supported by the record header. The log record checksum incorporates the log record header, the extended headers and the record payload. xlog_cksum() checksums the extended headers based on log->l_iclog_heads, which specifies the number of extended headers in a log record based on the log buffer size mount option. The log buffer size is variable, however, and thus means the checksum can be calculated differently based on how a filesystem is mounted. This is problematic if a filesystem crashes and recovery occurs on a subsequent mount using a different log buffer size. For example, crash an active filesystem that is mounted with the default (32k) logbsize, attempt remount/recovery using '-o logbsize=64k' and the mount fails on or warns about log checksum failures. To avoid this problem, update xlog_cksum() to calculate the checksum based on the size of the log buffer according to the log record. The size is already included in the h_size field of the log record header and thus is available at log recovery time. Extended log record headers are also only written when the log record is large enough to require them. This makes checksum calculation of log records consistent with the extended record header mechanism as well as how on-disk records are checksummed with various log buffer size mount options. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
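As a rough illustration of deriving the header count from the record itself rather than from a mount option, the sketch below computes how many extended headers a record of a given size would need. The function names and the `HEADER_CYCLE_SIZE` constant are hypothetical; only the 32k-per-header figure comes from the commit text.

```c
#include <stdint.h>

/* Bytes of log blocks covered by one record header (32k, per the commit text). */
#define HEADER_CYCLE_SIZE (32u * 1024u)

/* Total headers (record header plus extended headers) for a record of h_size bytes. */
static uint32_t headers_for_record(uint32_t h_size)
{
	uint32_t heads = h_size / HEADER_CYCLE_SIZE;

	if (h_size % HEADER_CYCLE_SIZE)
		heads++;
	return heads ? heads : 1;	/* the record header itself always exists */
}

/* Extended headers only: everything beyond the first 32k. */
static uint32_t ext_headers_for_record(uint32_t h_size)
{
	return headers_for_record(h_size) - 1;
}
```

Because the input is the size stored in the record header, the result does not depend on the logbsize in effect at mount time, which is the consistency property the commit is after.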
2015-08-19	xfs: fix broken icreate log item cancellation	Brian Foster	1	-12/+37
	Inode cluster buffers are invalidated and cancelled when inode chunks are freed to notify log recovery that previous logged updates to the metadata buffer should be skipped. This ensures that log recovery does not overwrite buffers that might have already been reused. On v4 filesystems, inode chunk allocation and inode updates are logged via the cluster buffers and thus cancellation is easily detected via buffer cancellation items. v5 filesystems use the new icreate transaction, which uses logical logging and ordered buffers to log a full inode chunk allocation at once. The resulting icreate item often spans multiple inode cluster buffers. Log recovery checks for cancelled buffers when processing icreate log items, but it has a couple problems. First, it uses the full length of the inode chunk rather than the cluster size. Second, it uses the length in FSB units rather than BB units. Either of these problems prevent icreate recovery from identifying cancelled buffers and thus inode initialization proceeds unconditionally. Update xlog_recover_do_icreate_pass2() to iterate the icreate range in cluster sized increments and check each increment for cancellation. Since icreate is currently only used for the minimum atomic inode chunk allocation, we expect that either all or none of the buffers will be cancelled. Cancel the icreate if at least one buffer is cancelled to avoid making a bad situation worse by initializing a partial inode chunk, but detect such anomalies and warn the user. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
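The cluster-sized iteration described above might look roughly like this in a user-space mock. All names here (`cancel_table`, `buffer_is_cancelled()`, `icreate_should_skip()`) are hypothetical, every length is assumed to already be in the same unit (basic blocks), and `cluster_bb` is assumed to be non-zero.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical table of cancelled buffer start addresses, in basic blocks. */
struct cancel_table {
	const unsigned long long *starts;
	size_t count;
};

static bool buffer_is_cancelled(const struct cancel_table *t,
				unsigned long long daddr)
{
	for (size_t i = 0; i < t->count; i++)
		if (t->starts[i] == daddr)
			return true;
	return false;
}

/*
 * Walk the icreate range one cluster at a time. Skip the icreate if any
 * cluster buffer was cancelled, and warn when only some of them were.
 */
static bool icreate_should_skip(const struct cancel_table *t,
				unsigned long long start_bb,
				unsigned int chunk_bb, unsigned int cluster_bb)
{
	unsigned int nbufs = chunk_bb / cluster_bb;
	unsigned int cancelled = 0;

	for (unsigned int i = 0; i < nbufs; i++)
		if (buffer_is_cancelled(t, start_bb +
					(unsigned long long)i * cluster_bb))
			cancelled++;

	if (cancelled && cancelled != nbufs)
		fprintf(stderr, "warning: partially cancelled icreate\n");
	return cancelled > 0;
}
```

Checking the whole chunk length as a single buffer, or mixing block units, would never match an entry in the table, which is how the original code missed cancellations.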
2015-08-19	xfs: icreate log item recovery and cancellation tracepoints	Brian Foster	2	-1/+38
	Various log items have recovery tracepoints to identify whether a particular log item is recovered or cancelled. Add the equivalent tracepoints for the icreate transaction. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-08-19	xfs: don't leave EFIs on AIL on mount failure	Brian Foster	5	-22/+100
	Log recovery occurs in two phases at mount time. In the first phase, EFIs and EFDs are processed and potentially cancelled out. EFIs without EFD objects are inserted into the AIL for processing and recovery in the second phase. xfs_mountfs() runs various other operations between the phases and is thus subject to failure. If failure occurs after the first phase but before the second, pending EFIs sit on the AIL, pin it and cause the mount to hang. Update the mount sequence to ensure that pending EFIs are cancelled in the event of failure. Add a recovery cancellation mechanism to iterate the AIL and cancel all EFI items when requested. Plumb cancellation support through the log mount finish helper and update xfs_mountfs() to invoke cancellation in the event of failure after recovery has started. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-08-19	xfs: use EFI refcount consistently in log recovery	Brian Foster	2	-57/+43
	The EFI is initialized with a reference count of 2. One for the EFI to ensure the item makes it to the AIL and one for the subsequently created EFD to release the EFI once the EFD is committed. Log recovery uses the EFI in a similar manner, but implements a hack to remove both references in one call once the EFD is handled. Update log recovery to use EFI reference counting in a manner consistent with the log. When an EFI is encountered during recovery, an EFI item is allocated and inserted to the AIL directly. Since the EFI reference is typically dropped when the EFI is unpinned and this is analogous with AIL insertion, drop the EFI reference at this point. When a corresponding EFD is encountered in the log, this indicates that the extents were freed, no processing is required and the EFI can be dropped. Update xlog_recover_efd_pass2() to simply drop the EFD reference at this point rather than open code the AIL removal and EFI free. Remaining EFIs (i.e., with no corresponding EFD) are processed in xlog_recover_finish(). An EFD transaction is allocated and the extents are freed, which transfers ownership of the EFI reference to the EFD item in the log. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-08-19	xfs: ensure EFD trans aborts on log recovery extent free failure	Brian Foster	4	-30/+36
	Log recovery attempts to free extents with leftover EFIs in the AIL after initial processing. If the extent free fails (e.g., due to unrelated fs corruption), the transaction is cancelled, though it might not be dirtied at the time. If this is the case, the EFD does not abort and thus does not release the EFI. This can lead to hangs as the EFI pins the AIL. Update xlog_recover_process_efi() to log the EFD in the transaction before xfs_free_extent() errors are handled to ensure the transaction is dirty, aborts the EFD and releases the EFI on error. Since this is a requirement for EFD processing (and consistent with xfs_bmap_finish()), update the EFD logging helper to do the extent free and unconditionally log the EFD. This encodes the required EFD logging behavior into the helper and reduces the likelihood of errors down the road. [dchinner: re-add xfs_alloc.h to xfs_log_recover.c to fix build failure.] Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-08-19	xfs: fix efi/efd error handling to avoid fs shutdown hangs	Brian Foster	3	-67/+111
	Freeing an extent in XFS involves logging an EFI (extent free intention), freeing the actual extent, and logging an EFD (extent free done). The EFI object is created with a reference count of 2: one for the current transaction and one for the subsequently created EFD. Under normal circumstances, the first reference is dropped when the EFI is unpinned and the second reference is dropped when the EFD is committed to the on-disk log. In event of errors or filesystem shutdown, there are various potential cleanup scenarios depending on the state of the EFI/EFD. The cleanup scenarios are confusing and racy, as demonstrated by the following test sequence: # mount $dev $mnt # fsstress -d $mnt -n 99999 -p 16 -z -f fallocate=1 \ -f punch=1 -f creat=1 -f unlink=1 & # sleep 5 # killall -9 fsstress; wait # godown -f $mnt # umount ... in which the final umount can hang due to the AIL being pinned indefinitely by one or more EFI items. This can occur due to several conditions. For example, if the shutdown occurs after the EFI is committed to the on-disk log and the EFD committed to the CIL, but before the EFD committed to the log, the EFD iop_committed() abort handler does not drop its reference to the EFI. Alternatively, manual error injection in the xfs_bmap_finish() codepath shows that if an error occurs after the EFI transaction is committed but before the EFD is constructed and logged, the EFI is never released from the AIL. Update the EFI/EFD item handling code to use a more straightforward and reliable approach to error handling. If an error occurs after the EFI transaction is committed and before the EFD is constructed, release the EFI explicitly from xfs_bmap_finish(). If the EFI transaction is cancelled, release the EFI in the unlock handler. Once the EFD is constructed, it is responsible for releasing the EFI under any circumstances (including whether the EFI item aborts due to log I/O error). 
Update the EFD item handlers to release the EFI if the transaction is cancelled or aborts due to log I/O error. Finally, update xfs_bmap_finish() to log at least one EFD extent to the transaction before xfs_free_extent() errors are handled to ensure the transaction is dirty and EFD item error handling is triggered. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
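The two-reference lifetime of the EFI can be pictured with a small mock. This is a sketch only: `mock_efi` and its helpers are hypothetical, a plain int stands in for whatever atomic counter the kernel uses, and the item is treated as already sitting on the AIL.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an EFI log item. */
struct mock_efi {
	int refcount;
	bool on_ail;
};

/* One reference for the EFI's own transaction, one for the future EFD. */
static void mock_efi_init(struct mock_efi *efi)
{
	efi->refcount = 2;
	efi->on_ail = true;	/* simplification: pretend it is already inserted */
}

/* Drop one reference; the last drop removes the item from the AIL. */
static bool mock_efi_release(struct mock_efi *efi)
{
	if (--efi->refcount > 0)
		return false;
	efi->on_ail = false;
	return true;
}
```

In this model a hang corresponds to an error path where one of the two releases never happens, so `on_ail` stays true forever.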
2015-08-19	xfs: return committed status from xfs_trans_roll()	Brian Foster	2	-2/+14
	Some callers need to make error handling decisions based on whether the current transaction successfully committed or not. Rename xfs_trans_roll(), add a new parameter and provide a wrapper to preserve existing callers. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
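The shape of such a change, an extended function that reports commit status plus a thin wrapper for existing callers, can be sketched as below. The names (`txn`, `txn_roll_committed()`, `txn_roll()`) and the error values are hypothetical and do not reflect the actual XFS signatures.

```c
/* Hypothetical transaction with injectable failures for illustration. */
struct txn {
	int fail_commit;	/* commit step fails */
	int fail_reserve;	/* reservation for the next transaction fails */
};

/* Extended variant: also reports whether the commit step succeeded. */
static int txn_roll_committed(struct txn *t, int *committed)
{
	*committed = 0;
	if (t->fail_commit)
		return -5;	/* illustrative error value */
	*committed = 1;
	if (t->fail_reserve)
		return -28;	/* error after a successful commit */
	return 0;
}

/* Wrapper preserving the old calling convention for existing callers. */
static int txn_roll(struct txn *t)
{
	int committed;

	return txn_roll_committed(t, &committed);
}
```

A caller that must clean up differently depending on whether the commit went through can inspect `committed` even when the return value is an error.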
2015-08-19	xfs: disentangle EFI release from the extent count	Brian Foster	4	-14/+11
	Release of the EFI either occurs based on the reference count or the extent count. The extent count used is either the count tracked in the EFI or EFD, depending on the particular situation. In either case, the count is initialized to the final value and thus always matches the current efi_next_extent value once the EFI is completely constructed. For example, the EFI extent count is increased as the extents are logged in xfs_bmap_finish() and the full free list is always completely processed. Therefore, the count is guaranteed to be complete once the EFI transaction is committed. The EFD uses the efd_nextents counter to release the EFI. This counter is initialized to the count of the EFI when the EFD is created. Thus the EFD, as currently used, has no concept of partial EFI release based on extent count. Given that the EFI extent count is always released in whole, use of the extent count for reference counting is unnecessary. Remove this level of the API and release the EFI based on the core reference count. The efi_next_extent counter remains because it is still used to track the slot to log the next extent to free. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-07-12	Linux 4.2-rc2	Linus Torvalds	1	-1/+1

2015-07-12	Revert "drm/i915: Use crtc_state->active in primary check_plane func"	Linus Torvalds	1	-1/+1
	This reverts commit dec4f799d0a4c9edae20512fa60b0a36f3299ca2. Jörg Otte reports a NULL pointer dereference due to this commit, as 'crtc_state' very much can be NULL: crtc_state = state->base.state ? intel_atomic_get_crtc_state(state->base.state, intel_crtc) : NULL; So the change to test 'crtc_state->base.active' cannot possibly be correct as-is. There may be some other minimal fix (like just checking crtc_state for NULL), but I'm just reverting it now for the rc2 release, and people like Daniel Vetter who actually know this code will figure out what the right solution is in the longer term. Reported-and-bisected-by: Jörg Otte <jrg.otte@gmail.com> Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> CC: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-12	freeing unlinked file indefinitely delayed	Al Viro	1	-2/+5
	Normally opening a file, unlinking it and then closing will have the inode freed upon close() (provided that it's not otherwise busy and has no remaining links, of course). However, there's one case where that does not happen. Namely, if you open it by fhandle with cold dcache, then unlink() and close(). In normal case you get d_delete() in unlink(2) notice that dentry is busy and unhash it; on the final dput() it will be forcibly evicted from dcache, triggering iput() and inode removal. In this case, though, we end up with two dentries - disconnected (created by open-by-fhandle) and regular one (used by unlink()). The latter will have its reference to inode dropped just fine, but the former will not - it's considered hashed (it is on the ->s_anon list), so it will stay around until the memory pressure will finally do it in. As the result, we have the final iput() delayed indefinitely. It's trivial to reproduce - void flush_dcache(void) { system("mount -o remount,rw /"); } static char buf[20 * 1024 * 1024]; main() { int fd; union { struct file_handle f; char buf[MAX_HANDLE_SZ]; } x; int m; x.f.handle_bytes = sizeof(x); chdir("/root"); mkdir("foo", 0700); fd = open("foo/bar", O_CREAT | O_RDWR, 0600); close(fd); name_to_handle_at(AT_FDCWD, "foo/bar", &x.f, &m, 0); flush_dcache(); fd = open_by_handle_at(AT_FDCWD, &x.f, O_RDWR); unlink("foo/bar"); write(fd, buf, sizeof(buf)); system("df ."); /* 20Mb eaten */ close(fd); system("df ."); /* should've freed those 20Mb */ flush_dcache(); system("df ."); /* should be the same as #2 */ } will spit out something like Filesystem 1K-blocks Used Available Use% Mounted on /dev/root 322023 303843 1131 100% / Filesystem 1K-blocks Used Available Use% Mounted on /dev/root 322023 303843 1131 100% / Filesystem 1K-blocks Used Available Use% Mounted on /dev/root 322023 283282 21692 93% / - inode gets freed only when dentry is finally evicted (here we trigger that by remount; normally it would've happened in response to memory pressure hell knows when). Cc: stable@vger.kernel.org # v2.6.38+; earlier ones need s/kill_it/unhash_it/ Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-12	fix a braino in ovl_d_select_inode()	Al Viro	1	-0/+3
	when opening a directory we want the overlayfs inode, not one from the topmost layer. Reported-By: Andrey Jr. Melnikov <temnota.am@gmail.com> Tested-By: Andrey Jr. Melnikov <temnota.am@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-12	9p: don't leave a half-initialized inode sitting around	Al Viro	2	-4/+2
	Cc: stable@vger.kernel.org # all branches Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-07-11	tick/broadcast: Prevent NULL pointer dereference	Thomas Gleixner	1	-8/+10
	Dan reported that the recent changes to the broadcast code introduced a potential NULL dereference. Add the proper check. Fixes: e0454311903d "tick/broadcast: Sanity check the shutdown of the local clock_event" Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-07-10	selinux: fix mprotect PROT_EXEC regression caused by mm change	Stephen Smalley	1	-1/+2
	commit 66fc13039422ba7df2d01a8ee0873e4ef965b50b ("mm: shmem_zero_setup skip security check and lockdep conflict with XFS") caused a regression for SELinux by disabling any SELinux checking of mprotect PROT_EXEC on shared anonymous mappings. However, even before that regression, the checking on such mprotect PROT_EXEC calls was inconsistent with the checking on a mmap PROT_EXEC call for a shared anonymous mapping. On a mmap, the security hook is passed a NULL file and knows it is dealing with an anonymous mapping and therefore applies an execmem check and no file checks. On a mprotect, the security hook is passed a vma with a non-NULL vm_file (as this was set from the internally-created shmem file during mmap) and therefore applies the file-based execute check and no execmem check. Since the aforementioned commit now marks the shmem zero inode with the S_PRIVATE flag, the file checks are disabled and we have no checking at all on mprotect PROT_EXEC. Add a test to the mprotect hook logic for such private inodes, and apply an execmem check in that case. This makes the mmap and mprotect checking consistent for shared anonymous mappings, as well as for /dev/zero and ashmem. Cc: <stable@vger.kernel.org> # 4.1.x Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-07-10	parisc: Fix some PTE/TLB race conditions and optimize __flush_tlb_range based on timing results	John David Anglin	5	-168/+212
	The increased use of pdtlb/pitlb instructions seemed to increase the frequency of random segmentation faults building packages. Further, we had a number of cases where TLB inserts would repeatedly fail and all forward progress would stop. The Haskell ghc package caused a lot of trouble in this area. The final indication of a race in pte handling was this syslog entry on sibaris (C8000): swap_free: Unused swap offset entry 00000004 BUG: Bad page map in process mysqld pte:00000100 pmd:019bbec5 addr:00000000ec464000 vm_flags:00100073 anon_vma:0000000221023828 mapping: (null) index:ec464 CPU: 1 PID: 9176 Comm: mysqld Not tainted 4.0.0-2-parisc64-smp #1 Debian 4.0.5-1 Backtrace: [<0000000040173eb0>] show_stack+0x20/0x38 [<0000000040444424>] dump_stack+0x9c/0x110 [<00000000402a0d38>] print_bad_pte+0x1a8/0x278 [<00000000402a28b8>] unmap_single_vma+0x3d8/0x770 [<00000000402a4090>] zap_page_range+0xf0/0x198 [<00000000402ba2a4>] SyS_madvise+0x404/0x8c0 Note that the pte value is 0 except for the accessed bit 0x100. This bit shouldn't be set without the present bit. It should be noted that the madvise system call is probably a trigger for many of the random segmentation faults. In looking at the kernel code, I found the following problems: 1) The pte_clear define didn't take TLB lock when clearing a pte. 2) We didn't test pte present bit inside lock in exception support. 3) The pte and tlb locks needed to merged in order to ensure consistency between page table and TLB. This also has the effect of serializing TLB broadcasts on SMP systems. The attached change implements the above and a few other tweaks to try to improve performance. Based on the timing code, TLB purges are very slow (e.g., ~ 209 cycles per page on rp3440). Thus, I think it beneficial to test the split_tlb variable to avoid duplicate purges. Probably, all PA 2.0 machines have combined TLBs. 
I dropped using __flush_tlb_range in flush_tlb_mm as I realized all applications and most threads have a stack size that is too large to make this useful. I added some comments to this effect. Since implementing 1 through 3, I haven't had any random segmentation faults on mx3210 (rp3440) in about one week of building code and running as a Debian buildd. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # v3.18+ Signed-off-by: Helge Deller <deller@gmx.de>
2015-07-10	stifb: Implement hardware accelerated copyarea	Alex Ivanov	1	-2/+38
	This patch adds hardware assisted scrolling. The code is based upon the following investigation: https://parisc.wiki.kernel.org/index.php/NGLE#Blitter A simple 'time ls -la /usr/bin' test shows 1.6x speed increase over soft copy and 2.3x increase over FBINFO_READS_FAST (prefer soft copy over screen redraw) on Artist framebuffer. Signed-off-by: Alex Ivanov <lausgans@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de>
2015-07-10	nfit: add support for NVDIMM "latch" flag	Ross Zwisler	2	-1/+37
	Add support in the NFIT BLK I/O path for the "latch" flag defined in the "Get Block NVDIMM Flags" _DSM function: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf This flag requires the driver to read back the command register after it is written in the block I/O path. This ensures that the hardware has fully processed the new command and moved the aperture appropriately. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-07-10	nfit: update block I/O path to use PMEM API	Ross Zwisler	2	-12/+100
	Update the nfit block I/O path to use the new PMEM API and to adhere to the read/write flows outlined in the "NVDIMM Block Window Driver Writer's Guide": http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf This includes adding support for targeted NVDIMM flushes called "flush hints" in the ACPI 6.0 specification: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf For performance and media durability the mapping for a BLK aperture is moved to a write-combining mapping which is consistent with memcpy_to_pmem() and wmb_blk(). Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-07-10	tools/testing/nvdimm: add mock acpi_nfit_flush_address entries to nfit_test	Dan Williams	3	-2/+71
	In preparation for fixing the BLK path to properly use "directed pcommit" enable the unit test infrastructure to emit mock "flush" tables. Writes to these flush addresses trigger a memory controller to flush its internal buffers to persistent media, similar to the x86 "pcommit" instruction. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-07-10	tools/testing/nvdimm: fix return code for unimplemented commands	Dan Williams	1	-1/+1
	The implementation for the new "DIMM Flags" DSM relies on the -ENOTTY return code to indicate that the flags are unimplemented and to fall back to a safe default. As is, the -ENXIO error code erroneously indicates to fail enabling a BLK region. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-07-10	tools/testing/nvdimm: mock ioremap_wt	Dan Williams	2	-0/+7
	In the 4.2-rc1 merge the default_memremap_pmem() implementation switched from ioremap_nocache() to ioremap_wt(). Add it to the list of mocked routines to restore the ability to run the unit tests. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-07-10	pmem: add maintainer for include/linux/pmem.h	Ross Zwisler	1	-0/+1
	The file include/linux/pmem.h was recently created to hold the PMEM API, and is logically part of the PMEM driver. Add an entry for this file to MAINTAINERS. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-07-10	Revert "Input: synaptics - allocate 3 slots to keep stability in image sensors"	Dmitry Torokhov	1	-1/+1
	This reverts commit 63c4fda3c0bb841b1aad1298fc7fe94058fc79f8 as it causes issues with detecting 3-finger taps. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=100481 Cc: stable@vger.kernel.org Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
2015-07-10	arm64: entry32: remove pointless register assignment	Mark Rutland	1	-2/+0
	We currently set x27 in compat_sys_sigreturn_wrapper and compat_sys_rt_sigreturn_wrapper, similarly to what we do with r8/why on 32-bit ARM, in an attempt to prevent sigreturns from being restarted. However, on arm64 we have always used pt_regs::syscallno for syscall restarting (for both native and compat tasks), and x27 is never inspected again before being overwritten in kernel_exit. This patch removes the pointless register assignments. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-07-10	MIPS: O32: Use compat_sys_getsockopt.	Ralf Baechle	1	-1/+1
	We were using the native syscall and that results in subtle breakage. This is the same issue as fixed in 077d0e65618f27b2199d622e12ada6d8f3dbd862 (MIPS: N32: Use compat getsockopt syscall) but that commit did fix it only for N32. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Link: https://bugzilla.kernel.org/show_bug.cgi?id=100291
2015-07-10	MIPS: c-r4k: Extend way_string array	Paul Burton	1	-1/+3
	The L2 cache in the I6400 core has 16 ways, so extend the way_string array to take such caches into account. [ralf@linux-mips.org: Other already supported CPUs are free to support more than 8 ways of cache as well.] Signed-off-by: Paul Burton <paul.burton@imgtec.com> Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/10640/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-07-10	MIPS: Pistachio: Support CDMM & Fast Debug Channel	James Hogan	2	-1/+12
	Implement the mips_cdmm_phys_base() platform callback to provide a default Common Device Memory Map (CDMM) physical base address for the Pistachio SoC. This allows the CDMM in each VPE to be configured and probed for devices, such as the Fast Debug Channel (FDC). The physical address chosen is just below the default CPC address, which appears to also be unallocated. The FDC IRQ is also usable on Pistachio, and is routed through the GIC, so implement the get_c0_fdc_int() platform callback using gic_get_c0_fdc_int(), so the FDC driver doesn't have to fall back to polling. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Andrew Bresticker <abrestic@chromium.org> Cc: James Hartley <james.hartley@imgtec.com> Cc: linux-mips@linux-mips.org Reviewed-by: Andrew Bresticker <abrestic@chromium.org> Patchwork: http://patchwork.linux-mips.org/patch/9749/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-07-10	MIPS: Malta: Make GIC FDC IRQ workaround Malta specific	James Hogan	2	-17/+13
	Wider testing reveals that the Fast Debug Channel (FDC) interrupt is routed through the GIC just fine on Pistachio SoC, even though it contains interAptiv cores. Clearly the FDC interrupt routing problems previously observed on interAptiv and proAptiv cores are specific to the Malta FPGA bitstreams. Move the workaround for interAptiv and proAptiv out of gic_get_c0_fdc_int() in the GIC irqchip driver into Malta's get_c0_fdc_int() platform callback, to allow the Pistachio SoC to use the FDC interrupt. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Andrew Bresticker <abrestic@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: linux-mips@linux-mips.org Reviewed-by: Andrew Bresticker <abrestic@chromium.org> Cc: James Hartley <james.hartley@imgtec.com> Patchwork: http://patchwork.linux-mips.org/patch/9748/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2015-07-10	MIPS: c-r4k: Fix cache flushing for MT cores	Markos Chandras	3	-4/+55
	MT_SMP is not the only SMP option for MT cores. The MT_SMP option allows more than one VPE per core to appear as a secondary CPU in the system. Because of how CM works, it propagates the address-based cache ops to the secondary cores but not the index-based ones. Because of that, the code does not use IPIs to flush the L1 caches on secondary cores because the CM would have done that already. However, the CM functionality is independent of the type of SMP kernel so even in non-MT kernels, IPIs are not necessary. As a result of which, we change the conditional to depend on the CM presence. Moreover, since VPEs on the same core share the same L1 caches, there is no need to send an IPI on all of them so we calculate a suitable cpumask with only one VPE per core. Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Cc: <stable@vger.kernel.org> # 3.15+ Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/10654/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
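Building a mask with one VPE per core, as the last paragraph describes, can be sketched like this. It is a user-space approximation, not the kernel's cpumask code: `one_cpu_per_core()` is a hypothetical helper, CPUs and cores are assumed to be numbered 0 to 31, and a plain 32-bit word stands in for a cpumask.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * core_of_cpu[i] is the core that CPU (VPE) i belongs to. Return a bitmask
 * containing the first CPU seen on each core, so that one IPI per core is
 * enough to reach every shared L1 cache.
 */
static uint32_t one_cpu_per_core(const int *core_of_cpu, size_t ncpus)
{
	uint32_t mask = 0;
	uint32_t seen_cores = 0;

	for (size_t cpu = 0; cpu < ncpus && cpu < 32; cpu++) {
		uint32_t core_bit = UINT32_C(1) << core_of_cpu[cpu];

		if (seen_cores & core_bit)
			continue;	/* a sibling VPE already represents this core */
		seen_cores |= core_bit;
		mask |= UINT32_C(1) << cpu;
	}
	return mask;
}
```

With two cores of two VPEs each, only CPUs 0 and 2 end up in the mask.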
2015-07-10	cxl: Check if afu is not null in cxl_slbia	Daniel Axtens	1	-1/+1
	The pointer to an AFU in the adapter's list of AFUs can be null if we're in the process of removing AFUs. The afu_list_lock doesn't guard against this. Say we have 2 slices, and we're in the process of removing cxl. - We remove the AFUs in order (see cxl_remove). In cxl_remove_afu for AFU 0, we take the lock, set adapter->afu[0] = NULL, and release the lock. - Then we get an slbia. In cxl_slbia we take the lock, and set afu = adapter->afu[0], which is NULL. - Therefore our attempt to check afu->enabled will blow up. Therefore, check if afu is a null pointer before dereferencing it. Cc: stable@vger.kernel.org Signed-off-by: Daniel Axtens <dja@axtens.net> Acked-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
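The guard this commit adds amounts to testing the slot pointer before reading through it. A minimal mock, with hypothetical `mock_afu`, `mock_adapter` and `mock_slbia()` names and with the locking left out, could read:

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLICES 4

/* Hypothetical stand-ins for the adapter and its AFU slices. */
struct mock_afu {
	bool enabled;
	int slbia_count;
};

struct mock_adapter {
	int slices;
	struct mock_afu *afu[MAX_SLICES];
};

/* Visit every slice; a slot may already be NULL while AFUs are being removed. */
static int mock_slbia(struct mock_adapter *adapter)
{
	int flushed = 0;

	for (int slice = 0; slice < adapter->slices; slice++) {
		struct mock_afu *afu = adapter->afu[slice];

		if (!afu || !afu->enabled)	/* NULL check comes first */
			continue;
		afu->slbia_count++;
		flushed++;
	}
	return flushed;
}
```

Without the `!afu` test, the first slot in the scenario above (already set to NULL by removal) would be dereferenced.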
2015-07-09	hpfs: hpfs_error: Remove static buffer, use vsprintf extension %pV instead	Joe Perches	1	-4/+7
	Removing unnecessary static buffers is good. Use the vsprintf %pV extension instead. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Mikulas Patocka <mikulas@twibright.com> Cc: stable@vger.kernel.org # v2.6.36+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-09	hpfs: kstrdup() out of memory handling	Sanidhya Kashyap	1	-2/+5
	kstrdup() may fail to allocate new_opts under memory pressure; return ENOMEM in that case. Signed-off-by: Sanidhya Kashyap <sanidhya.gatech@gmail.com> Signed-off-by: Mikulas Patocka <mikulas@twibright.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
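The pattern is the usual "duplicate, then check" idiom. The sketch below is a user-space analogue rather than the hpfs code: `dup_string` stands in for kstrdup(), and `remount_sketch`, `failing_dup`, and the `saved` parameter are hypothetical names introduced so the failure path can be exercised.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

typedef char *(*dup_fn)(const char *);

/* User-space stand-in for kstrdup(): returns NULL on allocation failure. */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Always fails, to simulate memory pressure. */
static char *failing_dup(const char *s)
{
	(void)s;
	return NULL;
}

/*
 * Copy the option string and bail out with -ENOMEM if the copy could not
 * be allocated, leaving the previously saved options untouched.
 */
static int remount_sketch(const char *data, dup_fn dup, char **saved)
{
	char *new_opts = NULL;

	if (data) {
		new_opts = dup(data);
		if (!new_opts)
			return -ENOMEM;
	}

	free(*saved);
	*saved = new_opts;
	return 0;
}
```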
2015-07-09	hpfs: Remove unessary cast	Firo Yang	1	-1/+1
	Avoid a pointless kmem_cache_alloc() return value cast in fs/hpfs/super.c::hpfs_alloc_inode() Signed-off-by: Firo Yang <firogm@gmail.com> Signed-off-by: Mikulas Patocka <mikulas@twibright.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-09	hpfs: add fstrim support	Mikulas Patocka	5	-0/+128
	This patch adds support for fstrim to the HPFS filesystem. Signed-off-by: Mikulas Patocka <mikulas@twibright.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-09	i2c: Mark instantiated device nodes with OF_POPULATE	Pantelis Antoniou	1	-1/+15
	Mark (and unmark) device nodes with the POPULATE flag as appropriate. This is required to avoid multi probing when using I2C and device overlays containing a mux. This patch is also more careful with the release of the adapter device which caused a deadlock with muxes, and does not break the build on !OF since the node flag accessors are not defined then. Signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2015-07-09	i2c: jz4780: Fix return value if probe fails	Axel Lin	1	-7/+8
	The current code returns 0 if it fails to read the clock-frequency DT property; fix it. Also check the return value of clk_prepare_enable and propagate the return value of devm_request_irq. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2015-07-09	i2c: xgene-slimpro: Fix missing mbox_free_channel call in probe error path	Axel Lin	1	-0/+1
	Free the requested mailbox channel before returning an error. Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2015-07-09	i2c: I2C_MT65XX should depend on HAS_DMA	Geert Uytterhoeven	1	-0/+1
	If NO_DMA=y: ERROR: "dma_unmap_single" [drivers/i2c/busses/i2c-mt65xx.ko] undefined! ERROR: "dma_mapping_error" [drivers/i2c/busses/i2c-mt65xx.ko] undefined! ERROR: "dma_map_single" [drivers/i2c/busses/i2c-mt65xx.ko] undefined! Add a dependency on HAS_DMA to fix this. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2015-07-09	ioctl_compat: handle FITRIM	Mikulas Patocka	6	-7/+1
	The FITRIM ioctl has the same arguments on 32-bit and 64-bit architectures, so we can add it to the list of compatible ioctls and drop it from compat_ioctl method of various filesystems. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ted Ts'o <tytso@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-09	selinux: don't waste ebitmap space when importing NetLabel categories	Paul Moore	1	-0/+6
	At present we don't create efficient ebitmaps when importing NetLabel category bitmaps. This can present a problem when comparing ebitmaps since ebitmap_cmp() is very strict about these things and considers these wasteful ebitmaps not equal when compared to their more efficient counterparts, even if their values are the same. This isn't likely to cause problems on 64-bit systems due to a bit of luck on how NetLabel/CIPSO works and the default ebitmap size, but it can be a problem on 32-bit systems. This patch fixes this problem by being a bit more intelligent when importing NetLabel category bitmaps by skipping over empty sections which should result in a nice, efficient ebitmap. Cc: stable@vger.kernel.org # 3.17 Signed-off-by: Paul Moore <pmoore@redhat.com>
2015-07-09	Fix firmware loader uevent buffer NULL pointer dereference	Linus Torvalds	1	-3/+13
	The firmware class uevent function accessed the "fw_priv->buf" buffer without the proper locking and testing for NULL. This is an old bug (looks like it goes back to 2012 and commit 1244691c73b2: "firmware loader: introduce firmware_buf"), but for some reason it's triggering only now in 4.2-rc1. Shuah Khan is trying to bisect what it is that causes this to trigger more easily, but in the meantime let's just fix the bug since others are hitting it too (at least Ingo reports having seen it as well). Reported-and-tested-by: Shuah Khan <shuahkh@osg.samsung.com> Acked-by: Ming Lei <ming.lei@canonical.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-07-09	mm: avoid setting up anonymous pages into file mapping	Kirill A. Shutemov	1	-7/+13
	Reading the page fault handler code, I've noticed that under the right circumstances the kernel would map anonymous pages into file mappings: if the VMA doesn't have vm_ops->fault() and the VMA wasn't fully populated on ->mmap(), the kernel would handle a page fault on a not-populated pte with do_anonymous_page(). Let's change the page fault handler to use do_anonymous_page() only on an anonymous VMA (->vm_ops == NULL) and make sure that the VMA is not shared. For file mappings without vm_ops->fault() or a shared VMA without vm_ops, a page fault on a pte_none() entry would lead to SIGBUS. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Willy Tarreau <w@1wt.eu> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
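The resulting decision for a fault on an empty pte can be written as a small table-like function. This is an illustrative sketch only: `vma_sketch`, its boolean fields, and the `fault_action` enum are assumptions that stand in for the real `struct vm_area_struct` and the handlers named in the message.

```c
#include <stdbool.h>

enum fault_action {
	FAULT_ANON,	/* handle with do_anonymous_page() */
	FAULT_FILE,	/* hand off to vm_ops->fault() */
	FAULT_SIGBUS	/* refuse: deliver SIGBUS */
};

/* Hypothetical, flattened view of the VMA properties that matter here. */
struct vma_sketch {
	bool has_vm_ops;	/* vma->vm_ops != NULL */
	bool has_fault;		/* vma->vm_ops->fault != NULL */
	bool shared;		/* mapping is shared */
};

/* Decide what to do with a fault on a pte_none() entry. */
static enum fault_action pte_none_action(const struct vma_sketch *vma)
{
	if (vma->has_vm_ops)
		return vma->has_fault ? FAULT_FILE : FAULT_SIGBUS;

	/* No vm_ops: anonymous, but only private mappings qualify. */
	return vma->shared ? FAULT_SIGBUS : FAULT_ANON;
}
```

Only a private VMA without vm_ops reaches the anonymous path; the two cases singled out in the message both end in SIGBUS.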
2015-07-09	MAINTAINERS: add secondary tree for ceph modules	Sage Weil	1	-0/+3
	The Ceph kernel code is primarily developed in the github tree, and only pushed to the korg tree before going to Linus. If Sage is unavailable and another maintainer needs to push something upstream, pull requests may originate from the github tree instead of Sage's korg tree. Signed-off-by: Sage Weil <sage@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-07-09	MAINTAINERS: update ceph entries	Sage Weil	1	-4/+15
	- The Ceph common code is used by both fs/ceph and drivers/block/rbd. Add a separate maintainers entry. - Add Ilya as libceph maintainer and cephfs submaintainer. - Attribute Documentation/ABI/testing/sysfs-bus-rbd to rbd. - ceph-devel@vger.kernel.org should be L, not M in rbd entry. Signed-off-by: Sage Weil <sage@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-07-09	libceph: treat sockaddr_storage with uninitialized family as blank	Ilya Dryomov	1	-7/+7
	addr_is_blank() should return true if family is neither AF_INET nor AF_INET6. This is what its counterpart entity_addr_t::is_blank_ip() is doing and it is the right thing to do: in process_banner() we check if our address is blank and if it is "learn" it from our peer. As it is, we never learn our address and always send out a blank one. This goes way back to ceph.git commit dd732cbfc1c9 ("use sockaddr_storage; and some ipv6 support groundwork") from 2009. While at it, do not open-code ipv6_addr_any() and use INADDR_ANY constant instead of 0. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
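A user-space approximation of the intended behaviour is sketched below. It is not the libceph function: the name `addr_is_blank_sketch` is made up, and the POSIX macro `IN6_IS_ADDR_UNSPECIFIED` is used here in place of the kernel's ipv6_addr_any().

```c
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * An address is blank if it is the IPv4 or IPv6 wildcard address, or if
 * the family is anything other than AF_INET / AF_INET6 (for example an
 * uninitialized, zeroed sockaddr_storage).
 */
static bool addr_is_blank_sketch(const struct sockaddr_storage *ss)
{
	switch (ss->ss_family) {
	case AF_INET: {
		const struct sockaddr_in *in4 = (const struct sockaddr_in *)ss;

		return in4->sin_addr.s_addr == htonl(INADDR_ANY);
	}
	case AF_INET6: {
		const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)ss;

		return IN6_IS_ADDR_UNSPECIFIED(&in6->sin6_addr);
	}
	default:
		return true;	/* unknown or unset family counts as blank */
	}
}
```

The `default` branch is the point of the fix: treating an unset family as blank is what lets the address be learned from the peer.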
2015-07-09	libceph: enable ceph in a non-default network namespace	Ilya Dryomov	3	-7/+22
	Grab a reference on a network namespace of the 'rbd map' (in case of rbd) or 'mount' (in case of ceph) process and use that to open sockets instead of always using init_net and bailing if network namespace is anything but init_net. Be careful to not share struct ceph_client instances between different namespaces and don't add any code in the !CONFIG_NET_NS case. This is based on a patch from Hong Zhiguo <zhiguohong@tencent.com>. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>