linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2014-12-18	netlink: Don't reorder loads/stores before marking mmap netlink frame as available	Thomas Graf	1	-1/+1
	Each mmap Netlink frame contains a status field which indicates whether the frame is unused, reserved, contains data or needs to be skipped. Both loads and stores may not be reordeded and must complete before the status field is changed and another CPU might pick up the frame for use. Use an smp_mb() to cover needs of both types of callers to netlink_set_status(), callers which have been reading data frame from the frame, and callers which have been filling or releasing and thus writing to the frame. - Example code path requiring a smp_rmb(): memcpy(skb->data, (void *)hdr + NL_MMAP_HDRLEN, hdr->nm_len); netlink_set_status(hdr, NL_MMAP_STATUS_UNUSED); - Example code path requiring a smp_wmb(): hdr->nm_uid = from_kuid(sk_user_ns(sk), NETLINK_CB(skb).creds.uid); hdr->nm_gid = from_kgid(sk_user_ns(sk), NETLINK_CB(skb).creds.gid); netlink_frame_flush_dcache(hdr); netlink_set_status(hdr, NL_MMAP_STATUS_VALID); Fixes: f9c228 ("netlink: implement memory mapped recvmsg()") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-18	netlink: Always copy on mmap TX.	David Miller	1	-36/+16
	Checking the file f_count and the nlk->mapped count is not completely sufficient to prevent the mmap'd area contents from changing from under us during netlink mmap sendmsg() operations. Be careful to sample the header's length field only once, because this could change from under us as well. Fixes: 5fd96123ee19 ("netlink: implement memory mapped sendmsg()") Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Thomas Graf <tgraf@suug.ch>
2014-12-18	ALSA: usb-audio: extend KEF X300A FU 10 tweak to Arcam rPAC	Jiri Jaburek	1	-3/+12
	The Arcam rPAC seems to have the same problem - whenever anything (alsamixer, udevd, 3.9+ kernel from 60af3d037eb8c, ..) attempts to access mixer / control interface of the card, the firmware "locks up" the entire device, resulting in SNDRV_PCM_IOCTL_HW_PARAMS failed (-5): Input/output error from alsa-lib. Other operating systems can somehow read the mixer (there seems to be playback volume/mute), but any manipulation is ignored by the device (which has hardware volume controls). Cc: <stable@vger.kernel.org> Signed-off-by: Jiri Jaburek <jjaburek@redhat.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-12-18	ALSA: hda/realtek - New codec support for ALC298	Kailang Yang	1	-0/+7
	Add new support for ALC298 codec. Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-12-18	watchdog: imx2_wdt: Fix the argument of watchdog_active()	Fabio Estevam	1	-1/+1
	Fix the following build warning by passing the expected argument type to watchdog_active(): drivers/watchdog/imx2_wdt.c: In function 'imx2_wdt_suspend': drivers/watchdog/imx2_wdt.c:340:2: warning: passing argument 1 of 'watchdog_active' from incompatible pointer type [enabled by default] In file included from drivers/watchdog/imx2_wdt.c:38:0: include/linux/watchdog.h:104:20: note: expected 'struct watchdog_device ' but argument is of type 'struct watchdog_device *' Reported-by: Olof's autobuilder <build@lixom.net> Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2014-12-18	watchdog: imx2_wdt: Add power management support.	Xiubo Li	1	-0/+47
	Add power management operations(suspend and resume) as part of dev_pm_ops for IMX2 watchdog driver. Signed-off-by: Xiubo Li <Li.Xiubo@freescale.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2014-12-18	x86/tls: Don't validate lm in set_thread_area() after all	Andy Lutomirski	2	-6/+7
	It turns out that there's a lurking ABI issue. GCC, when compiling this in a 32-bit program: struct user_desc desc = { .entry_number = idx, .base_addr = base, .limit = 0xfffff, .seg_32bit = 1, .contents = 0, /* Data, grow-up */ .read_exec_only = 0, .limit_in_pages = 1, .seg_not_present = 0, .useable = 0, }; will leave .lm uninitialized. This means that anything in the kernel that reads user_desc.lm for 32-bit tasks is unreliable. Revert the .lm check in set_thread_area(). The value never did anything in the first place. Fixes: 0e58af4e1d21 ("x86/tls: Disallow unusual TLS segments") Signed-off-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org # Only if 0e58af4e1d21 is backported Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/d7875b60e28c512f6a6fc0baf5714d58e7eaadbb.1418856405.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-12-18	powerpc/powernv: Ignore smt-enabled on Power8 and later	Greg Kurz	1	-1/+15
	Starting with POWER8, the subcore logic relies on all threads of a core being booted so that they can participate in split mode switches. So on those machines we ignore the smt_enabled_at_boot setting (smt-enabled on the kernel command line). Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> [mpe: Update comment and change log to be more precise] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-18	s390/kvm: REPLACE barrier fixup with READ_ONCE	Christian Borntraeger	1	-12/+6
	ACCESS_ONCE does not work reliably on non-scalar types. For example gcc 4.6 and 4.7 might remove the volatile tag for such accesses during the SRA (scalar replacement of aggregates) step (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145) Commit 1365039d0cb3 ("KVM: s390: Fix ipte locking") replace ACCESS_ONCE with barriers. Lets use READ_ONCE instead. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-12-18	arm/spinlock: Replace ACCESS_ONCE with READ_ONCE	Christian Borntraeger	1	-2/+2
	ACCESS_ONCE does not work reliably on non-scalar types. For example gcc 4.6 and 4.7 might remove the volatile tag for such accesses during the SRA (scalar replacement of aggregates) step (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145) Change the spinlock code to replace ACCESS_ONCE with READ_ONCE. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-12-18	arm64/spinlock: Replace ACCESS_ONCE READ_ONCE	Christian Borntraeger	1	-2/+2
	ACCESS_ONCE does not work reliably on non-scalar types. For example gcc 4.6 and 4.7 might remove the volatile tag for such accesses during the SRA (scalar replacement of aggregates) step (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145) Change the spinlock code to replace ACCESS_ONCE with READ_ONCE. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-12-18	mips/gup: Replace ACCESS_ONCE with READ_ONCE	Christian Borntraeger	1	-1/+1
	ACCESS_ONCE does not work reliably on non-scalar types. For example gcc 4.6 and 4.7 might remove the volatile tag for such accesses during the SRA (scalar replacement of aggregates) step https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145) Change the gup code to replace ACCESS_ONCE with READ_ONCE. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-12-18	x86/gup: Replace ACCESS_ONCE with READ_ONCE	Christian Borntraeger	1	-1/+1
	ACCESS_ONCE does not work reliably on non-scalar types. For example gcc 4.6 and 4.7 might remove the volatile tag for such accesses during the SRA (scalar replacement of aggregates) step (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145) Change the gup code to replace ACCESS_ONCE with READ_ONCE. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-12-18	x86/spinlock: Replace ACCESS_ONCE with READ_ONCE	Christian Borntraeger	1	-4/+4
	ACCESS_ONCE does not work reliably on non-scalar types. For example gcc 4.6 and 4.7 might remove the volatile tag for such accesses during the SRA (scalar replacement of aggregates) step (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145) Change the spinlock code to replace ACCESS_ONCE with READ_ONCE. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-12-18	mm: replace ACCESS_ONCE with READ_ONCE or barriers	Christian Borntraeger	3	-3/+13
	ACCESS_ONCE does not work reliably on non-scalar types. For example gcc 4.6 and 4.7 might remove the volatile tag for such accesses during the SRA (scalar replacement of aggregates) step (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145) Let's change the code to access the page table elements with READ_ONCE that does implicit scalar accesses for the gup code. mm_find_pmd is tricky, because m68k and sparc(32bit) define pmd_t as array of longs. This code requires just that the pmd_present and pmd_trans_huge check are done on the same value, so a barrier is sufficent. A similar case is in handle_pte_fault. On ppc44x the word size is 32 bit, but a pte is 64 bit. A barrier is ok as well. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: linux-mm@kvack.org Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-12-18	kernel: Provide READ_ONCE and ASSIGN_ONCE	Christian Borntraeger	1	-0/+74
	ACCESS_ONCE does not work reliably on non-scalar types. For example gcc 4.6 and 4.7 might remove the volatile tag for such accesses during the SRA (scalar replacement of aggregates) step https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145) Let's provide READ_ONCE/ASSIGN_ONCE that will do all accesses via scalar types as suggested by Linus Torvalds. Accesses larger than the machines word size cannot be guaranteed to be atomic. These macros will use memcpy and emit a build warning. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-12-18	KVM: move APIC types to arch/x86/	Paolo Bonzini	3	-27/+27
	They are not used anymore by IA64, move them away. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-18	powerpc/uaccess: Allow get_user() with bitwise types	Michael S. Tsirkin	1	-3/+3
	At the moment, if p and x are both of the same bitwise type (eg. __le32), get_user(x, p) produces a sparse warning. This is because p is loaded into a long then cast back to typeof(p). When typeof(p) is a bitwise type (which is uncommon), such a cast needs __force, otherwise sparse produces a warning. For non-bitwise types __force should have no effect, and should not hide any legitimate errors. Note that we are casting to typeof(p) not typeof(x). Even with the cast, if x and *p are of different types we should get the warning, so I think we are not loosing the ability to detect any actual errors. virtio would like to use bitwise types with get_user() so fix these spurious warnings by adding __force. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> [mpe: Fill in changelog with more details] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-18	ALSA: asihpi: update to HPI version 4.14	Eliot Blennerhassett	1	-3/+3
	This corresponds with updated asihpi firmware in alsa-firmware repo Signed-off-by: Eliot Blennerhassett <eliot@blennerhassett.gen.nz> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-12-18	ALSA: asihpi: increase tuner pad cache size	Eliot Blennerhassett	1	-3/+3
	Increase size allocated for PAD (programme associated data) control. This is used by newer tuner products. Signed-off-by: Eliot Blennerhassett <eliot@blennerhassett.gen.nz> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-12-18	ALSA: asihpi: relax firmware version check	Eliot Blennerhassett	1	-14/+12
	Some products firmware is no longer being updated e.g. dsp5000, dsp8700 but it should continue to work with updated HPI versions. Avoid regression by allowing this firmware to be loaded as long as major version is the same. Warn about mismatching versions, as matching versions are preferred. Signed-off-by: Eliot Blennerhassett <eliot@blennerhassett.gen.nz> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-12-18	ALSA: usb-audio: Fix Scarlett 6i6 initialization typo	Chris J Arges	1	-1/+1
	The num_controls field was incorrectly set to 0 causing 6i6 to not be initialized. Set this to 9. Reported-and-tested-by: Mark Roberts <sunifiram@gmail.com> Signed-off-by: Chris J Arges <chris.j.arges@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-12-17	Ceph: remove left-over reject file	Linus Torvalds	1	-10/+0
	Neither Sage nor I noticed that Zheng Yan had mistakenly committed fs/ceph/super.h.rej as part of commit 31c542a199d7 ("ceph: add inline data to pagecache"). Remove it. Requested-by: Yan, Zheng <ukernel@gmail.com> Cc: Sage Weil <sweil@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-18	param: do not set store func without write perm	Kees Cook	1	-1/+3
	When a module_param is defined without DAC write permissions, it can still be changed at runtime and updated. Drivers using a 0444 permission may be surprised that these values can still be changed. For drivers that want to allow updates, any S_IW* flag will set the "store" function as before. Drivers without S_IW* flags will have the "store" function unset, unforcing a read-only value. Drivers that wish neither "store" nor "get" can continue to use "0" for perms to stay out of sysfs entirely. Old behavior: # cd /sys/module/snd/parameters # ls -l total 0 -r--r--r-- 1 root root 4096 Dec 11 13:55 cards_limit -r--r--r-- 1 root root 4096 Dec 11 13:55 major -r--r--r-- 1 root root 4096 Dec 11 13:55 slots # cat major 116 # echo -1 > major -bash: major: Permission denied # chmod u+w major # echo -1 > major # cat major -1 New behavior: ... # chmod u+w major # echo -1 > major -bash: echo: write error: Input/output error Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-12-17	KVM: PPC: Book3S: Enable in-kernel XICS emulation by default	Anton Blanchard	1	-0/+1
	The in-kernel XICS emulation is faster than doing it all in QEMU and it has got a lot of testing, so enable it by default. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2014-12-17	Bluetooth: Fix bug with filter in service discovery optimization	Marcel Holtmann	1	-5/+9
	The optimization for filtering out extended inquiry results, advertising reports or scan response data based on provided UUID list has a logic bug. In case no match is found in the advertising data, the scan response is ignored and not checked against the filter. This will lead to events being filtered wrongly. Change the code to actually only drop the events when the scan response data is not present. If it is present, it needs to be checked against the provided filter. The patch is a bit more complex than it needs to be. That is because it also fixes this compiler warning that some gcc versions produce. CC net/bluetooth/mgmt.o net/bluetooth/mgmt.c: In function ‘mgmt_device_found’: net/bluetooth/mgmt.c:7028:7: warning: ‘match’ may be used uninitialized in this function [-Wmaybe-uninitialized] bool match; ^ It seems that gcc can not clearly figure out the context of the match variable. So just change the branches for the extended inquiry response and advertising data around so that it is clear. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-12-17	mmu_gather: fix over-eager tlb_flush_mmu_free() calling	Linus Torvalds	1	-3/+3
	Dave Hansen reports that commit fb7332a9fedf ("mmu_gather: move minimal range calculations into generic code") caused a performance problem: "tlb_finish_mmu() goes up about 9x in the profiles (~0.4%->3.6%) and tlb_flush_mmu_free() takes about 3.1% of CPU time with the patch applied, but does not show up at all on the commit before" and the reason is that Will moved the test for whether we need to flush from tlb_flush_mmu() into tlb_flush_mmu_tlbonly(). But that meant that tlb_flush_mmu_free() basically lost that check. Move it back into tlb_flush_mmu() where it belongs, so that it covers both tlb_flush_mmu_tlbonly() _and_ tlb_flush_mmu_free(). Reported-and-tested-by: Dave Hansen <dave@sr71.net> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-17	x86: mm: fix VM_FAULT_RETRY handling	Linus Torvalds	1	-1/+1
	My commit 26178ec11ef3 ("x86: mm: consolidate VM_FAULT_RETRY handling") had a really stupid typo: the FAULT_FLAG_USER bit is in the 'flags' variable, not the 'fault' variable. Duh, The one silver lining in this is that Dave finding this at least confirms that trinity actually triggers this special path easily, in a way normal use does not. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-17	i2c: sh_mobile: I2C_SH_MOBILE should depend on HAS_DMA	Geert Uytterhoeven	1	-0/+1
	If NO_DMA=y: drivers/built-in.o: In function `sh_mobile_i2c_dma_unmap': i2c-sh_mobile.c:(.text+0x60de42): undefined reference to `dma_unmap_single' drivers/built-in.o: In function `sh_mobile_i2c_xfer_dma': i2c-sh_mobile.c:(.text+0x60df22): undefined reference to `dma_map_single' i2c-sh_mobile.c:(.text+0x60df2e): undefined reference to `dma_mapping_error' Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-12-17	i2c: sh_mobile: rework deferred probing	Wolfram Sang	1	-50/+48
	DMA is opt-in for this driver. So, we can't use deferred probing for requesting DMA channels in probe, because our driver would get endlessly deferred if DMA support is compiled in AND the DMA driver is missing. Because we can't know when the DMA driver might show up, we always try again when a DMA transfer would be possible. The downside is that there is more overhead for setting up PIO transfers under the above scenario. But well, having DMA enabled and the proper DMA driver missing looks like a broken or test config anyhow. Reported-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-12-17	i2c: sh_mobile: refactor DMA setup	Wolfram Sang	1	-19/+16
	Refactor DMA setup to keep the errno so we can implement better deferred probe support in the next step. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-12-17	i2c: mv64xxx: rework offload support to fix several problems	Thomas Petazzoni	1	-120/+186
	Originally, the I2C controller supported by the i2c-mv64xxx driver requires a lot of software support: an interrupt is generated at each step of an I2C transaction (after the start bit, after sending the address, etc.) and the driver is in charge of re-programming the I2C controller to do the next step of the I2C transaction. This explains the fairly complex state machine that the driver has. On Marvell Armada XP and later processors (Armada 375, 38x, etc.), the I2C controller was extended with a part called the "I2C Bridge", which allows to offload the I2C transaction completely to the hardware. Initial support for this mechanism was added in commit 930ab3d403a ("i2c: mv64xxx: Add I2C Transaction Generator support"). However, the implementation done in this commit has two related issues, which this commit fixes by completely changing how the offload implementation is done: * SMBus read transfers, where there is one write to select the register immediately followed in the same transaction by one read, were making the processor hang. This was easier visible on the Marvell Armada XP WRT1900AC platform using a driver for an I2C LED controller, or on other Armada XP platforms by using a simple 'i2cget' command to read an I2C EEPROM. * The implementation was based on the fact that the offload engine was re-programmed to transfer each message of an I2C xfer: this meant that each message sent with the offload engine was starting with a normal I2C start sequence. However, the I2C subsystem assumes that all messages belonging to the same xfer will use the so-called "repeated start" so that the entire I2C xfer is seen as one transfer by the I2C devices and cannot be interrupt by other I2C masters on the same bus. In fact, the "I2C Bridge" allows to offload three types of xfer: - xfer of one write message - xfer of one read message - xfer of one write message followed by one read message For all other situations, we have to fallback to not using the "I2C Bridge" in order to get proper I2C semantics. Therefore, this commit reworks the offload implementation to put it not at the message level, but at the xfer level: in the mv64xxx_i2c_xfer() function, we decide if the transaction can be offloaded (in which case it is handled by the mv64xxx_i2c_offload_xfer() function), or otherwise it is handled by the slow path (implemented in the existing mv64xxx_i2c_execute_msg()). This allows to simplify the state machine, which no longer needs to have any state related to the offload implementation: the offload implementation is now completely separated from the slow path (with the exception of the interrupt handler, of course). In summary: - mv64xxx_i2c_can_offload() will analyze an I2C xfer and decided of the "I2C Bridge" can be used to offload it or not. - mv64xxx_i2c_offload_xfer() will actually program the "I2C Bridge" to offload one xfer (of either one or two messages), and block using mv64xxx_i2c_wait_for_completion() until the xfer completes. - The interrupt handler mv64xxx_i2c_intr() is modified to push the offload related code to a separate function, mv64xxx_i2c_intr_offload(). It will take care of reading the received data if needed. This commit was tested on: - Armada XP OpenBlocks AX3-4 (EEPROM on I2C and RTC on I2C) - Armada XP WRT1900AC (LED controller on I2C) - Armada XP GP (EEPROM on I2C) Fixes: 930ab3d403ae ("i2c: mv64xxx: Add I2C Transaction Generator support") Cc: <stable@vger.kernel.org> # v3.12+ Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> [wsa: fixed checkpatch warnings] Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-12-17	i2c: mv64xxx: use BIT() macro for register value definitions	Thomas Petazzoni	1	-11/+11
	Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-12-17	ceph: fix setting empty extended attribute	Yan, Zheng	1	-2/+5
	make sure 'value' is not null. otherwise __ceph_setxattr will remove the extended attribute. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
2014-12-17	ceph: fix mksnap crash	Yan, Zheng	1	-1/+3
	mksnap reply only contain 'target', does not contain 'dentry'. So it's wrong to use req->r_reply_info.head->is_dentry to detect traceless reply. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Sage Weil <sage@redhat.com>
2014-12-17	ceph: do_sync is never initialized	Dan Carpenter	1	-1/+1
	Probably this code was syncing a lot more often then intended because the do_sync variable wasn't set to zero. Cc: stable@vger.kernel.org # v3.11+ Fixes: c62988ec0910 ('ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL.') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17	libceph: fixup includes in pagelist.h	Ilya Dryomov	1	-1/+3
	pagelist.h needs to include linux/types.h and asm/byteorder.h and not rely on other headers pulling yet another set of headers. Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17	ceph: support inline data feature	Yan, Zheng	1	-1/+2
	Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-12-17	ceph: flush inline version	Yan, Zheng	3	-4/+23
	After converting inline data to normal data, client need to flush the new i_inline_version (CEPH_INLINE_NONE) to MDS. This commit makes cap messages (sent to MDS) contain inline_version and inline_data. Client always converts inline data to normal data before data write, so the inline data length part is always zero. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-12-17	ceph: convert inline data to normal data before data write	Yan, Zheng	3	-3/+161
	Before any data write, convert inline data to normal data and set i_inline_version to CEPH_INLINE_NONE. The OSD request that saves inline data to object contains 3 operations (CMPXATTR, WRITE and SETXATTR). It compares a xattr named 'inline_version' to prevent old data overwrites newer data. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-12-17	ceph: sync read inline data	Yan, Zheng	2	-13/+116
	we can't use getattr to fetch inline data while holding Fr cap, because it can cause deadlock. If we need to sync read inline data, drop cap refs first, then use getattr to fetch inline data. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-12-17	ceph: fetch inline data when getting Fcr cap refs	Yan, Zheng	3	-18/+63
	we can't use getattr to fetch inline data after getting Fcr caps, because it can cause deadlock. The solution is try bringing inline data to page cache when not holding any cap, and hope the inline data page is still there after getting the Fcr caps. If the page is still there, pin it in page cache for later IO. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-12-17	ceph: use getattr request to fetch inline data	Yan, Zheng	4	-10/+34
	Add a new parameter 'locked_page' to ceph_do_getattr(). If inline data in getattr reply will be copied to the page. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-12-17	ceph: add inline data to pagecache	Yan, Zheng	6	-1/+85
	Request reply and cap message can contain inline data. add inline data to the page cache if there is Fc cap. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-12-17	ceph: parse inline data in MClientReply and MClientCaps	Yan, Zheng	3	-11/+36
	Signed-off-by: Yan, Zheng <zyan@redhat.com>
2014-12-17	libceph: specify position of extent operation	Yan, Zheng	4	-20/+19
	allow specifying position of extent operation in multi-operations osd request. This is required for cephfs to convert inline data to normal data (compare xattr, then write object). Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17	libceph: add CREATE osd operation support	Yan, Zheng	1	-20/+22
	Add CEPH_OSD_OP_CREATE support. Also change libceph to not treat CEPH_OSD_OP_DELETE as an extent op and add an assert to that end. Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17	libceph: add SETXATTR/CMPXATTR osd operations support	Yan, Zheng	2	-0/+57
	Signed-off-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@redhat.com>
2014-12-17	rbd: don't treat CEPH_OSD_OP_DELETE as extent op	Ilya Dryomov	1	-2/+6
	CEPH_OSD_OP_DELETE is not an extent op, stop treating it as such. This sneaked in with discard patches - it's one of the three osd ops (the other two are CEPH_OSD_OP_TRUNCATE and CEPH_OSD_OP_ZERO) that discard is implemented with. Signed-off-by: Ilya Dryomov <idryomov@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org>
2014-12-17	ceph: remove unused stringification macros	Ilya Dryomov	1	-3/+0
	These were used to report git versions a long time ago. Signed-off-by: Ilya Dryomov <idryomov@redhat.com>