linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2014-09-26	tty: serial: 8250_core: remove UART_IER_RDI in serial8250_stop_rx()	Sebastian Andrzej Siewior	1	-1/+1
	serial8250_do_startup() adds UART_IER_RDI and UART_IER_RLSI to ier. serial8250_stop_rx() should remove both. This is what the serial-omap driver has been doing and is now moved to the 8250-core since it does no look to be that omap specific. Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Tony Lindgren <tony@atomide.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-26	tty: serial: 8250_core: use the ->line argument as a hint in serial8250_find_match_or_unused()	Sebastian Andrzej Siewior	1	-0/+5
	Tony noticed that the old omap-serial driver picked the uart "number" based on the hint given from device tree or platform device's id. The 8250 based omap driver doesn't do this because the core code does not honour the ->line argument which is passed by the driver. This patch aims to keep the same behaviour as with omap-serial. The function will first try to use the line suggested ->line argument and then fallback to the old strategy in case the port is taken. That means the the third uart will always be ttyS2 even if the previous two have not been enabled in DT. Reviewed-by: Tony Lindgren <tony@atomide.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-26	tty: serial: 8250_core: read only RX if there is something in the FIFO	Sebastian Andrzej Siewior	1	-5/+6
	The serial8250_do_startup() function unconditionally clears the interrupts and for that it reads from the RX-FIFO without checking if there is a byte in the FIFO or not. This works fine on OMAP4+ HW like AM335x or DRA7. OMAP3630 ES1.1 (which means probably all OMAP3 and earlier) does not like this: \|Unhandled fault: external abort on non-linefetch (0x1028) at 0xfb020000 \|Internal error: : 1028 [#1] ARM \|Modules linked in: \|CPU: 0 PID: 1 Comm: swapper Not tainted 3.16.0-00022-g7edcb57-dirty #1213 \|task: de0572c0 ti: de058000 task.ti: de058000 \|PC is at mem32_serial_in+0xc/0x1c \|LR is at serial8250_do_startup+0x220/0x85c \|Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel \|Control: 10c5387d Table: 80004019 DAC: 00000015 \|[<c03051d4>] (mem32_serial_in) from [<c0307fe8>] (serial8250_do_startup+0x220/0x85c) \|[<c0307fe8>] (serial8250_do_startup) from [<c0309e00>] (omap_8250_startup+0x5c/0xe0) \|[<c0309e00>] (omap_8250_startup) from [<c030863c>] (serial8250_startup+0x18/0x2c) \|[<c030863c>] (serial8250_startup) from [<c030394c>] (uart_startup+0x78/0x1d8) \|[<c030394c>] (uart_startup) from [<c0304678>] (uart_open+0xe8/0x114) \|[<c0304678>] (uart_open) from [<c02e9e10>] (tty_open+0x1a8/0x5a4) Reviewed-by: Tony Lindgren <tony@atomide.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-26	tty: serial: 8250_core: add run time pm	Sebastian Andrzej Siewior	3	-13/+122
	While comparing the OMAP-serial and the 8250 part of this I noticed that the latter does not use run time-pm. Here are the pieces. It is basically a get before first register access and a last_busy + put after last access. This has to be enabled from userland _and_ UART_CAP_RPM is required for this. The runtime PM can usually work transparently in the background however there is one exception to this: After serial8250_tx_chars() completes there still may be unsent bytes in the FIFO (depending on CPU speed vs baud rate + flow control). Even if the TTY-buffer is empty we do not want RPM to disable the device because it won't send the remaining bytes. Instead we leave serial8250_tx_chars() with RPM enabled and wait for the FIFO empty interrupt. Once we enter serial8250_tx_chars() with an empty buffer we know that the FIFO is empty and since we are not going to send anything, we can disable the device. That xchg() is to ensure that serial8250_tx_chars() can be called multiple times and only the first invocation will actually invoke the runtime PM function. So that the last invocation of __stop_tx() will disable runtime pm. NOTE: do not enable RPM on the device unless you know what you do! If the device goes idle, it won't be woken up by incomming RX data _unless_ there is a wakeup irq configured which is usually the RX pin configure for wakeup via the reset module. The RX activity will then wake up the device from idle. However the first character is garbage and lost. The following bytes will be received once the device is up in time. On the beagle board xm (omap3) it takes approx 13ms from the first wakeup byte until the first byte that is received properly if the device was in core-off. v5…v8: - drop RPM from serial8250_set_mctrl() it will be used in restore path which already has RPM active and holds dev->power.lock v4…v5: - add a wrapper around rpm function and introduce UART_CAP_RPM to ensure RPM put is invoked after the TX FIFO is empty. v3…v4: - added runtime to the console code - removed device_may_wakeup() from serial8250_set_sleep() Cc: mika.westerberg@linux.intel.com Reviewed-by: Tony Lindgren <tony@atomide.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-26	tty: serial: 8250_core: allow to set ->throttle / ->unthrottle callbacks	Sebastian Andrzej Siewior	2	-0/+16
	The OMAP UART provides support for HW assisted flow control. What is missing is the support to throttle / unthrottle callbacks which are used by the omap-serial driver at the moment. This patch adds the callbacks. It should be safe to add them since they are only invoked from the serial_core (uart_throttle()) if the feature flags are set. Reviewed-by: Tony Lindgren <tony@atomide.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-25	tty: Fix width of unsigned long bitfield padding	Peter Hurley	1	-2/+2
	Commit c545b66c6922b002b5fe224a6eaec58c913650b5, 'tty: Serialize tcflow() with other tty flow control changes' and commit 99416322dd16b810ba74098cc50ef2a844091d35, 'tty: Workaround Alpha non-atomic byte storage in tty_struct' work around compiler bugs and non-atomic storage on multiple arches by padding bitfields out to the declared type which is unsigned long. However, the width varies by arch. Pad bitfields to actual width of unsigned long (which is BITS_PER_LONG). Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23	tty: Workaround Alpha non-atomic byte storage in tty_struct	Peter Hurley	1	-2/+3
	The Alpha EV4/EV5 cpus can corrupt adjacent byte and short data because those cpus use RMW to store byte and short data. Thus, concurrent adjacent byte stores could become corrupted, if serialized by a different lock. tty_struct uses different locks to protect certain fields within the structure, and thus is vulnerable to byte stores which are not atomic. Merge the ->ctrl_status byte and packet mode bit, both protected by the ->ctrl_lock, into an unsigned long. The padding bits are necessary to force the compiler to allocate the type specified; otherwise, gcc will ignore the type specifier and allocate the minimum number of bytes required to store the bitfield. In turn, this would allow Alpha EV4/EV5 cpus to corrupt adjacent byte or short storage (because those cpus use RMW to store byte and short data). gcc versions < 4.7.2 will also corrupt storage adjacent to bitfields smaller than unsigned long on ia64, ppc64, hppa64, and sparc64, thus requiring more than unsigned int storage (which would otherwise be sufficient to fix the Alpha non-atomic storage problem). Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23	tty: Hold termios_rwsem for tcflow(TCIxxx)	Peter Hurley	1	-3/+7
	While transmitting a START/STOP char for tcflow(TCION/TCIOFF), prevent a termios change. Otherwise, a garbage in-band flow control char may be sent, if the termios change overlaps the transmission setup. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23	tty: Move and rename send_prio_char() as tty_send_xchar()	Peter Hurley	3	-35/+34
	Relocate the file-scope function, send_prio_char(), as a global helper tty_send_xchar(). Remove the global declarations for tty_write_lock()/tty_write_unlock(), as these are file-scope only now. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23	tty: Serialize tcflow() with other tty flow control changes	Peter Hurley	2	-4/+9
	Use newly-introduced tty->flow_lock to serialize updates to tty->flow_stopped (via tcflow()) and with concurrent tty flow control changes from other sources. Merge the storage for ->stopped and ->flow_stopped, now that both flags are serialized by ->flow_lock. The padding bits are necessary to force the compiler to allocate the type specified; otherwise, gcc will ignore the type specifier and allocate the minimum number of bytes necessary to store the bitfield. In turn, this would allow Alpha EV4 and EV5 cpus to corrupt adjacent byte storage because those cpus use RMW to store byte and short data. gcc versions < 4.7.2 will also corrupt storage adjacent to bitfields smaller than unsigned long on ia64, ppc64, hppa64 and sparc64, thus requiring more than unsigned int storage (which would otherwise be sufficient to workaround the Alpha non-atomic byte/short storage problem). Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23	tty: Move packet mode flow control notifications to pty driver	Peter Hurley	2	-27/+45
	When a master pty is set to packet mode, flow control changes to the slave pty cause notifications to the master pty via reads and polls. However, these tests are occurring for all ttys, not just ptys. Implement flow control packet mode notifications in the pty driver. Only the slave side implements the flow control handlers since packet mode is asymmetric; the master pty receives notifications for slave-side changes, but not vice versa. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23	tty: Serialize tty flow control changes with flow_lock	Peter Hurley	3	-12/+36
	Without serialization, the flow control state can become inverted wrt. the actual hardware state. For example, CPU 0 \| CPU 1 stop_tty() \| lock ctrl_lock \| tty->stopped = 1 \| unlock ctrl_lock \| \| start_tty() \| lock ctrl_lock \| tty->stopped = 0 \| unlock ctrl_lock \| driver->start() driver->stop() \| In this case, the flow control state now indicates the tty has been started, but the actual hardware state has actually been stopped. Introduce tty->flow_lock spinlock to serialize tty flow control changes. Split out unlocked __start_tty()/__stop_tty() flavors for use by ioctl(TCXONC) in follow-on patch. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23	tty: Convert tty_struct bitfield to ints	Peter Hurley	1	-1/+4
	The stopped, hw_stopped, flow_stopped and packet bits are smp-unsafe and interrupt-unsafe. For example, CPU 0 \| CPU 1 \| tty->flow_stopped = 1 \| tty->hw_stopped = 0 One of these updates will be corrupted, as the bitwise operation on the bitfield is non-atomic. Ensure each flag has a separate memory location, so concurrent updates do not corrupt orthogonal states. Because DEC Alpha EV4 and EV5 cpus (from 1995) perform RMW on smaller-than-machine-word storage, "separate memory location" must be int instead of byte. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23	serial: core: Use spin_lock_irq() in uart_set_termios()	Peter Hurley	1	-5/+4
	uart_set_termios() is called with interrupts enabled; no need to save and restore the interrupt state when taking the uart port lock. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23	serial: bfin-uart: Fix auto CTS	Peter Hurley	1	-1/+2
	Commit 64851636d568ae9f167cd5d1dcdbfe17e6eef73c, serial: bfin-uart: Remove ASYNC_CTS_FLOW flag for hardware automatic CTS, open-codes uart_handle_cts_change() when CONFIG_SERIAL_BFIN_HARD_CTSRTS to skip start and stop tx. But the CTS interrupt handler _still_ calls uart_handle_cts_change(); only call uart_handle_cts_change() if !CONFIG_SERIAL_BFIN_HARD_CTSRTS. cc: Sonic Zhang <sonic.zhang@analog.com> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23	usb: serial: Remove unused tty->hw_stopped	Peter Hurley	3	-18/+3
	The tty core does not test tty->hw_stopped; remove from drivers which don't test it themselves. Acked-by: Johan Hovold <johan@kernel.org> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23	serial: core: Privatize tty->hw_stopped	Peter Hurley	3	-23/+22
	tty->hw_stopped is not used by the tty core and is thread-unsafe; hw_stopped is a member of a bitfield whose fields are updated non-atomically and no lock is suitable for serializing updates. Replace serial core usage of tty->hw_stopped with uport->hw_stopped. Use int storage which works around Alpha EV4/5 non-atomic byte storage, since uart_port uses different locks to protect certain fields within the structure. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23	isdn: i4l: Remove ASYNC_CTS_FLOW	Peter Hurley	1	-5/+0
	ISDN4Linux always enables CTS flow control and does not use the tty_port_cts_enabled() helper function; remove ASYNC_CTS_FLOW state enable/disable. cc: Karsten Keil <isdn@linux-pingi.de> cc: <netdev@vger.kernel.org> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23	serial: core: Privatize modem status enable flags	Peter Hurley	3	-13/+30
	The serial core uses the tty port flags, ASYNC_CTS_FLOW and ASYNC_CD_CHECK, to track whether CTS and DCD changes should be ignored or handled. However, the tty port flags are not safe for atomic bit operations and no lock provides serialized updates. Introduce the struct uart_port status field to track CTS and DCD enable states, and serialize access with uart port lock. Substitute uart_cts_enabled() helper for tty_port_cts_enabled(). Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23	serial: core: Document and assert lock requirements for irq helpers	Peter Hurley	1	-0/+8
	The serial core provides two helper functions, uart_handle_dcd_change() and uart_handle_cts_change(), for UART drivers to use at interrupt time. The serial core expects the UART driver to hold the uart port lock when calling these helpers to prevent state corruption. If lockdep enabled, trigger a warning if the uart port lock is not held when calling these helper functions. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-23	locking: Add WARN_ON_ONCE lock assertion	Peter Hurley	1	-0/+5
	An interface may need to assert a lock invariant and not flood the system logs; add a lockdep helper macro equivalent to lockdep_assert_held() which only WARNs once. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-20	tty: serial_mctrl_gpio: Fix COMPILE_TEST build for architectures with custom termios.h	Alexander Shiyan	1	-1/+1
	This patch fixes COMPILE_TEST build of serial_mctrl_gpio module for architectures with custom termios.h header. sparc64:allmodconfig: In file included from drivers/tty/serial/serial_mctrl_gpio.c:21:0: include/uapi/asm-generic/termios.h:22:8: error: redefinition of 'struct termio' ./arch/sparc/include/uapi/asm/termbits.h:16:8: note: originally defined here make[3]: *** [drivers/tty/serial/serial_mctrl_gpio.o] Error 1 Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Alexander Shiyan <shc_work@mail.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-14	Linux 3.17-rc5	Linus Torvalds	1	-1/+1

2014-09-14	vfs: avoid non-forwarding large load after small store in path lookup	Linus Torvalds	2	-9/+11
	The performance regression that Josef Bacik reported in the pathname lookup (see commit 99d263d4c5b2 "vfs: fix bad hashing of dentries") made me look at performance stability of the dcache code, just to verify that the problem was actually fixed. That turned up a few other problems in this area. There are a few cases where we exit RCU lookup mode and go to the slow serializing case when we shouldn't, Al has fixed those and they'll come in with the next VFS pull. But my performance verification also shows that link_path_walk() turns out to have a very unfortunate 32-bit store of the length and hash of the name we look up, followed by a 64-bit read of the combined hash_len field. That screws up the processor store to load forwarding, causing an unnecessary hickup in this critical routine. It's caused by the ugly calling convention for the "hash_name()" function, and easily fixed by just making hash_name() fill in the whole 'struct qstr' rather than passing it a pointer to just the hash value. With that, the profile for this function looks much smoother. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-09-14	be careful with nd->inode in path_init() and follow_dotdot_rcu()	Al Viro	1	-2/+13
	in the former we simply check if dentry is still valid after picking its ->d_inode; in the latter we fetch ->d_inode in the same places where we fetch dentry and its ->d_seq, under the same checks. Cc: stable@vger.kernel.org # 2.6.38+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-09-14	don't bugger nd->seq on set_root_rcu() from follow_dotdot_rcu()	Al Viro	1	-16/+17
	return the value instead, and have path_init() do the assignment. Broken by "vfs: Fix absolute RCU path walk failures due to uninitialized seq number", which was Cc-stable with 2.6.38+ as destination. This one should go where it went. To avoid dummy value returned in case when root is already set (it would do no harm, actually, since the only caller that doesn't ignore the return value is guaranteed to have nd->root not set, but it's more obvious that way), lift the check into callers. And do the same to set_root(), to keep them in sync. Cc: stable@vger.kernel.org # 2.6.38+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-09-14	ntb: Add alignment check to meet hardware requirement	Dave Jiang	1	-0/+13
	The NTB translate register must have the value to be BAR size aligned. This alignment check make sure that the DMA memory allocated has the proper alignment. Another requirement for NTB to function properly with memory window BAR size greater or equal to 4M is to use the CMA feature in 3.16 kernel with the appropriate CONFIG_CMA_ALIGNMENT and CONFIG_CMA_SIZE_MBYTES set. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
2014-09-14	MAINTAINERS: update NTB info	Jon Mason	1	-1/+2
	Update my contact info to my personal email address and add Dave Jiang. Signed-off-by: Jon Mason <jon.mason@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2014-09-14	NTB: correct the spread of queues over mw's	Jon Mason	1	-2/+2
	The detection of an uneven number of queues on the given memory windows was not correct. The mw_num is zero based and the mod should be division to spread them evenly over the mw's. Signed-off-by: Jon Mason <jon.mason@intel.com>
2014-09-13	fix bogus read_seqretry() checks introduced in b37199e	Al Viro	1	-2/+2
	read_seqretry() returns true on mismatch, not on match... Cc: stable@vger.kernel.org # 3.15+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-09-13	move the call of __d_drop(anon) into __d_materialise_unique(dentry, anon)	Al Viro	1	-2/+6
	and lock the right list there Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-09-13	[fix] lustre: d_make_root() does iput() on dentry allocation failure	Al Viro	1	-1/+1
	double-free is a bad thing Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-09-13	parisc: Implement new LWS CAS supporting 64 bit operations.	Guy Martin	1	-4/+229
	The current LWS cas only works correctly for 32bit. The new LWS allows for CAS operations of variable size. Signed-off-by: Guy Martin <gmsoft@tuxicoman.be> Cc: <stable@vger.kernel.org> # 3.13+ Signed-off-by: Helge Deller <deller@gmx.de>
2014-09-13	vfs: fix bad hashing of dentries	Linus Torvalds	2	-4/+3
	Josef Bacik found a performance regression between 3.2 and 3.10 and narrowed it down to commit bfcfaa77bdf0 ("vfs: use 'unsigned long' accesses for dcache name comparison and hashing"). He reports: "The test case is essentially for (i = 0; i < 1000000; i++) mkdir("a$i"); On xfs on a fio card this goes at about 20k dir/sec with 3.2, and 12k dir/sec with 3.10. This is because we spend waaaaay more time in __d_lookup on 3.10 than in 3.2. The new hashing function for strings is suboptimal for < sizeof(unsigned long) string names (and hell even > sizeof(unsigned long) string names that I've tested). I broke out the old hashing function and the new one into a userspace helper to get real numbers and this is what I'm getting: Old hash table had 1000000 entries, 0 dupes, 0 max dupes New hash table had 12628 entries, 987372 dupes, 900 max dupes We had 11400 buckets with a p50 of 30 dupes, p90 of 240 dupes, p99 of 567 dupes for the new hash My test does the hash, and then does the d_hash into a integer pointer array the same size as the dentry hash table on my system, and then just increments the value at the address we got to see how many entries we overlap with. As you can see the old hash function ended up with all 1 million entries in their own bucket, whereas the new one they are only distributed among ~12.5k buckets, which is why we're using so much more CPU in __d_lookup". The reason for this hash regression is two-fold: - On 64-bit architectures the down-mixing of the original 64-bit word-at-a-time hash into the final 32-bit hash value is very simplistic and suboptimal, and just adds the two 32-bit parts together. In particular, because there is no bit shuffling and the mixing boundary is also a byte boundary, similar character patterns in the low and high word easily end up just canceling each other out. - the old byte-at-a-time hash mixed each byte into the final hash as it hashed the path component name, resulting in the low bits of the hash generally being a good source of hash data. That is not true for the word-at-a-time case, and the hash data is distributed among all the bits. The fix is the same in both cases: do a better job of mixing the bits up and using as much of the hash data as possible. We already have the "hash_32\|64()" functions to do that. Reported-by: Josef Bacik <jbacik@fb.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-09-13	Make hash_64() use a 64-bit multiply when appropriate	Linus Torvalds	1	-0/+4
	The hash_64() function historically does the multiply by the GOLDEN_RATIO_PRIME_64 number with explicit shifts and adds, because unlike the 32-bit case, gcc seems unable to turn the constant multiply into the more appropriate shift and adds when required. However, that means that we generate those shifts and adds even when the architecture has a fast multiplier, and could just do it better in hardware. Use the now-cleaned-up CONFIG_ARCH_HAS_FAST_MULTIPLIER (together with "is it a 64-bit architecture") to decide whether to use an integer multiply or the explicit sequence of shift/add instructions. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-09-13	Make ARCH_HAS_FAST_MULTIPLIER a real config variable	Linus Torvalds	5	-6/+8
	It used to be an ad-hoc hack defined by the x86 version of <asm/bitops.h> that enabled a couple of library routines to know whether an integer multiply is faster than repeated shifts and additions. This just makes it use the real Kconfig system instead, and makes x86 (which was the only architecture that did this) select the option. NOTE! Even for x86, this really is kind of wrong. If we cared, we would probably not enable this for builds optimized for netburst (P4), where shifts-and-adds are generally faster than multiplies. This patch does not change that kind of logic, though, it is purely a syntactic change with no code changes. This was triggered by the fact that we have other places that really want to know "do I want to expand multiples by constants by hand or not", particularly the hash generation code. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-09-12	alarmtimer: Lock k_itimer during timer callback	Richard Larocque	1	-2/+8
	Locks the k_itimer's it_lock member when handling the alarm timer's expiry callback. The regular posix timers defined in posix-timers.c have this lock held during timout processing because their callbacks are routed through posix_timer_fn(). The alarm timers follow a different path, so they ought to grab the lock somewhere else. Cc: stable@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Sharvil Nanavati <sharvil@google.com> Signed-off-by: Richard Larocque <rlarocque@google.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2014-09-12	alarmtimer: Do not signal SIGEV_NONE timers	Richard Larocque	1	-2/+4
	Avoids sending a signal to alarm timers created with sigev_notify set to SIGEV_NONE by checking for that special case in the timeout callback. The regular posix timers avoid sending signals to SIGEV_NONE timers by not scheduling any callbacks for them in the first place. Although it would be possible to do something similar for alarm timers, it's simpler to handle this as a special case in the timeout. Prior to this patch, the alarm timer would ignore the sigev_notify value and try to deliver signals to the process anyway. Even worse, the sanity check for the value of sigev_signo is skipped when SIGEV_NONE was specified, so the signal number could be bogus. If sigev_signo was an unitialized value (as it often would be if SIGEV_NONE is used), then it's hard to predict which signal will be sent. Cc: stable@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Sharvil Nanavati <sharvil@google.com> Signed-off-by: Richard Larocque <rlarocque@google.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2014-09-12	alarmtimer: Return relative times in timer_gettime	Richard Larocque	1	-7/+11
	Returns the time remaining for an alarm timer, rather than the time at which it is scheduled to expire. If the timer has already expired or it is not currently scheduled, the it_value's members are set to zero. This new behavior matches that of the other posix-timers and the POSIX specifications. This is a change in user-visible behavior, and may break existing applications. Hopefully, few users rely on the old incorrect behavior. Cc: stable@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Sharvil Nanavati <sharvil@google.com> Signed-off-by: Richard Larocque <rlarocque@google.com> [jstultz: minor style tweak] Signed-off-by: John Stultz <john.stultz@linaro.org>
2014-09-12	jiffies: Fix timeval conversion to jiffies	Andrew Hunter	2	-37/+31
	timeval_to_jiffies tried to round a timeval up to an integral number of jiffies, but the logic for doing so was incorrect: intervals corresponding to exactly N jiffies would become N+1. This manifested itself particularly repeatedly stopping/starting an itimer: setitimer(ITIMER_PROF, &val, NULL); setitimer(ITIMER_PROF, NULL, &val); would add a full tick to val, _even if it was exactly representable in terms of jiffies_ (say, the result of a previous rounding.) Doing this repeatedly would cause unbounded growth in val. So fix the math. Here's what was wrong with the conversion: we essentially computed (eliding seconds) jiffies = usec * (NSEC_PER_USEC/TICK_NSEC) by using scaling arithmetic, which took the best approximation of NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC = x/(2^USEC_JIFFIE_SC), and computed: jiffies = (usec * x) >> USEC_JIFFIE_SC and rounded this calculation up in the intermediate form (since we can't necessarily exactly represent TICK_NSEC in usec.) But the scaling arithmetic is a (very slight) overapproximation of the true value; that is, instead of dividing by (1 usec/ 1 jiffie), we effectively divided by (1 usec/1 jiffie)-epsilon (rounding down). This would normally be fine, but we want to round timeouts up, and we did so by adding 2^USEC_JIFFIE_SC - 1 before the shift; this would be fine if our division was exact, but dividing this by the slightly smaller factor was equivalent to adding just _over_ 1 to the final result (instead of just _under_ 1, as desired.) In particular, with HZ=1000, we consistently computed that 10000 usec was 11 jiffies; the same was true for any exact multiple of TICK_NSEC. We could possibly still round in the intermediate form, adding something less than 2^USEC_JIFFIE_SC - 1, but easier still is to convert usec->nsec, round in nanoseconds, and then convert using timespec_to_jiffies. This adds one constant multiplication, and is not observably slower in microbenchmarks on recent x86 hardware. Tested: the following program: int main() { struct itimerval zero = {{0, 0}, {0, 0}}; /* Initially set to 10 ms. / struct itimerval initial = zero; initial.it_interval.tv_usec = 10000; setitimer(ITIMER_PROF, &initial, NULL); / Save and restore several times. / for (size_t i = 0; i < 10; ++i) { struct itimerval prev; setitimer(ITIMER_PROF, &zero, &prev); / on old kernels, this goes up by TICK_USEC every iteration */ printf("previous value: %ld %ld %ld %ld\n", prev.it_interval.tv_sec, prev.it_interval.tv_usec, prev.it_value.tv_sec, prev.it_value.tv_usec); setitimer(ITIMER_PROF, &prev, NULL); } return 0; } Cc: stable@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Paul Turner <pjt@google.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Reviewed-by: Paul Turner <pjt@google.com> Reported-by: Aaron Jacobs <jacobsa@google.com> Signed-off-by: Andrew Hunter <ahh@google.com> [jstultz: Tweaked to apply to 3.17-rc] Signed-off-by: John Stultz <john.stultz@linaro.org>
2014-09-12	futex: Unlock hb->lock in futex_wait_requeue_pi() error path	Thomas Gleixner	1	-0/+1
	futex_wait_requeue_pi() calls futex_wait_setup(). If futex_wait_setup() succeeds it returns with hb->lock held and preemption disabled. Now the sanity check after this does: if (match_futex(&q.key, &key2)) { ret = -EINVAL; goto out_put_keys; } which releases the keys but does not release hb->lock. So we happily return to user space with hb->lock held and therefor preemption disabled. Unlock hb->lock before taking the exit route. Reported-by: Dave "Trinity" Jones <davej@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Darren Hart <dvhart@linux.intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1409112318500.4178@nanos Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-09-12	KEYS: Fix termination condition in assoc array garbage collection	David Howells	1	-1/+3
	This fixes CVE-2014-3631. It is possible for an associative array to end up with a shortcut node at the root of the tree if there are more than fan-out leaves in the tree, but they all crowd into the same slot in the lowest level (ie. they all have the same first nibble of their index keys). When assoc_array_gc() returns back up the tree after scanning some leaves, it can fall off of the root and crash because it assumes that the back pointer from a shortcut (after label ascend_old_tree) must point to a normal node - which isn't true of a shortcut node at the root. Should we find we're ascending rootwards over a shortcut, we should check to see if the backpointer is zero - and if it is, we have completed the scan. This particular bug cannot occur if the root node is not a shortcut - ie. if you have fewer than 17 keys in a keyring or if you have at least two keys that sit into separate slots (eg. a keyring and a non keyring). This can be reproduced by: ring=`keyctl newring bar @s` for ((i=1; i<=18; i++)); do last_key=`keyctl newring foo$i $ring`; done keyctl timeout $last_key 2 Doing this: echo 3 >/proc/sys/kernel/keys/gc_delay first will speed things up. If we do fall off of the top of the tree, we get the following oops: BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 IP: [<ffffffff8136cea7>] assoc_array_gc+0x2f7/0x540 PGD dae15067 PUD cfc24067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: xt_nat xt_mark nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_ni CPU: 0 PID: 26011 Comm: kworker/0:1 Not tainted 3.14.9-200.fc20.x86_64 #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: events key_garbage_collector task: ffff8800918bd580 ti: ffff8800aac14000 task.ti: ffff8800aac14000 RIP: 0010:[<ffffffff8136cea7>] [<ffffffff8136cea7>] assoc_array_gc+0x2f7/0x540 RSP: 0018:ffff8800aac15d40 EFLAGS: 00010206 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8800aaecacc0 RDX: ffff8800daecf440 RSI: 0000000000000001 RDI: ffff8800aadc2bc0 RBP: ffff8800aac15da8 R08: 0000000000000001 R09: 0000000000000003 R10: ffffffff8136ccc7 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000070 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000018 CR3: 00000000db10d000 CR4: 00000000000006f0 Stack: ffff8800aac15d50 0000000000000011 ffff8800aac15db8 ffffffff812e2a70 ffff880091a00600 0000000000000000 ffff8800aadc2bc3 00000000cd42c987 ffff88003702df20 ffff88003702dfa0 0000000053b65c09 ffff8800aac15fd8 Call Trace: [<ffffffff812e2a70>] ? keyring_detect_cycle_iterator+0x30/0x30 [<ffffffff812e3e75>] keyring_gc+0x75/0x80 [<ffffffff812e1424>] key_garbage_collector+0x154/0x3c0 [<ffffffff810a67b6>] process_one_work+0x176/0x430 [<ffffffff810a744b>] worker_thread+0x11b/0x3a0 [<ffffffff810a7330>] ? rescuer_thread+0x3b0/0x3b0 [<ffffffff810ae1a8>] kthread+0xd8/0xf0 [<ffffffff810ae0d0>] ? insert_kthread_work+0x40/0x40 [<ffffffff816ffb7c>] ret_from_fork+0x7c/0xb0 [<ffffffff810ae0d0>] ? insert_kthread_work+0x40/0x40 Code: 08 4c 8b 22 0f 84 bf 00 00 00 41 83 c7 01 49 83 e4 fc 41 83 ff 0f 4c 89 65 c0 0f 8f 5a fe ff ff 48 8b 45 c0 4d 63 cf 49 83 c1 02 <4e> 8b 34 c8 4d 85 f6 0f 84 be 00 00 00 41 f6 c6 01 0f 84 92 RIP [<ffffffff8136cea7>] assoc_array_gc+0x2f7/0x540 RSP <ffff8800aac15d40> CR2: 0000000000000018 ---[ end trace 1129028a088c0cbd ]--- Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Don Zickus <dzickus@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-09-12	video: ARM CLCD: Fix color model capabilities for DT platforms	Pawel Moll	1	-3/+1
	The DT-based panel capabilities selection was picking up a subset of available modes based on hardware configuration. This was wrong, as the capabilities describe available memory models and adapt the display controller to them that the RGB output is wired up correctly (as in: R and B components are not swapped). This patch fixes it by removing the unnecessary limitation. Signed-off-by: Pawel Moll <pawel.moll@arm.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-09-12	drm/ast: AST2000 cannot be detected correctly	Y.C. Chen	1	-1/+1
	Type error and cause AST2000 cannot be detected correctly Signed-off-by: Y.C. Chen <yc_chen@aspeedtech.com> Reviewed-by: Egbert Eich <eich@suse.de> Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-12	drm/ast: open key before detect chips	Y.C. Chen	1	-0/+1
	Some config settings like 3rd TX chips will not get correctly if the extended reg is protected Signed-off-by: Y.C. Chen <yc_chen@aspeedtech.com> Reviewed-by: Egbert Eich <eich@suse.de> Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-09-11	xhci: fix oops when xhci resumes from hibernate with hw lpm capable devices	Mathias Nyman	1	-2/+10
	Resuming from hibernate (S4) will restart and re-initialize xHC. The device contexts are freed and will be re-allocated later during device reset. Usb core will disable link pm in device resume before device reset, which will try to change the max exit latency, accessing the device contexts before they are re-allocated. There is no need to zero (disable) the max exit latency when disabling hw lpm for a freshly re-initialized xHC. So check that device context exists before doing anything. The max exit latency will be set again after device reset when usb core enables the link pm. Reported-by: Imre Deak <imre.deak@intel.com> Tested-by: Imre Deak <imre.deak@intel.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-11	usb: xhci: Fix OOPS in xhci error handling code	Al Cooper	1	-0/+1
	The xhci driver will OOPS on resume from S2/S3 if dma_alloc_coherent() is out of memory. This is a result of two things: 1. xhci_mem_cleanup() in xhci-mem.c free's xhci->lpm_command if it's not NULL, but doesn't set it to NULL after the free. 2. xhci_mem_cleanup() is called twice on resume, once for normal restart and once from xhci_mem_init() if dma_alloc_coherent() fails, resulting in a free of xhci->lpm_command that has already been freed. The fix is to set xhci->lpm_command to NULL after freeing it. Signed-off-by: Al Cooper <alcooperx@gmail.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-11	xhci: Fix null pointer dereference if xhci initialization fails	Mathias Nyman	1	-1/+1
	If xhci initialization fails before the roothub bandwidth domains (xhci->rh_bw[i]) are allocated it will oops when trying to access rh_bw members in xhci_mem_cleanup(). Reported-by: Manuel Reimer <manuel.reimer@gmx.de> Cc: stable <stable@vger.kernel.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-11	storage: Add single-LUN quirk for Jaz USB Adapter	Mark	1	-0/+6
	The Iomega Jaz USB Adapter is a SCSI-USB converter cable. The hardware seems to be identical to e.g. the Microtech XpressSCSI, using a Shuttle/ SCM chip set. However its firmware restricts it to only work with Jaz drives. On connecting the cable a message like this appears four times in the log: reset full speed USB device number 4 using uhci_hcd That's non-fatal but the US_FL_SINGLE_LUN quirk fixes it. Signed-off-by: Mark Knibbs <markk@clara.co.uk> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-09-11	uas: Add missing le16_to_cpu calls to asm1051 / asm1053 usb-id check	Hans de Goede	1	-2/+2
	Reported-by: kbuild test robot <fengguang.wu@intel.com> Cc: stable@vger.kernel.org # 3.16 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>