linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2015-02-20	IB/qib: Add blank line after declaration	Mike Marciniszyn	17	-6/+67
	Upstream checkpatch now requires this. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2015-02-20	IB/qib: Fix checkpatch warnings	Mike Marciniszyn	15	-44/+42
	Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2015-02-20	i2c: ocores: rework clk code to handle NULL cookie	Wolfram Sang	1	-14/+21
	For, !HAVE_CLK the clk API returns a NULL cookie. Rework the initialization code to handle that. If clk_get_rate() delivers 0, we use the fallback mechanisms. The patch is pretty easy when ignoring white space issues (git diff -b). Suggested-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Tested-by: Max Filippov <jcmvbkbc@gmail.com>
2015-02-20	thermal: exynos: fix: Check if data->tmu_read callback is present before read	Lukasz Majewski	1	-1/+1
	The exynos_tmu_data() function should on entrance test not only for valid data pointer, but also for data->tmu_read one. It is important, since afterwards it is dereferenced to get temperature code. Signed-off-by: Lukasz Majewski <l.majewski@samsung.com> Tested-by: Abhilash Kesavan <a.kesavan@samsung.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2015-02-20	x86/mm/ASLR: Avoid PAGE_SIZE redefinition for UML subarch	Jiri Kosina	2	-2/+0
	Commit f47233c2d34 ("x86/mm/ASLR: Propagate base load address calculation") causes PAGE_SIZE redefinition warnings for UML subarch builds. This is caused by added includes that were leftovers from previous patch versions are are not actually needed (especially page_types.h inlcude in module.c). Drop those stray includes. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Cc: Borislav Petkov <bp@suse.de> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1502201017240.28769@pobox.suse.cz Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-19	clk: Only recalculate the rate if needed	Tomeu Vizoso	1	-1/+4
	We don't really need to recalculate the effective rate of a clock when a per-user clock is removed, if the constraints of the later aren't limiting the requested rate. This was causing problems with clocks that never had a rate set before, as rate_req would be zero. Though this could be considered a bug in the implementation of those clocks, this should be checked somewhere else. Fixes: 1c8e600440c7 ("clk: Add rate constraints to clocks") Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Peter De Schrijver <pdeschrijver@nvidia.com> Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Signed-off-by: Michael Turquette <mturquette@linaro.org>
2015-02-19	ipmi: Fix a memory ordering issue	Corey Minyard	1	-4/+10
	From a locking point of view it is safe to check waiting_msg without a lock, but there is a memory ordering issue that causes it to possibly not be set right when viewed from another processor. We are already claiming a lock right after that, move the check to inside the lock to enforce the memory ordering. Signed-off-by: Corey Minyard <cminyard@mvista.com>
2015-02-19	ipmi: Remove uses of return value of seq_printf	Joe Perches	3	-16/+26
	The seq_printf like functions will soon be changed to return void. Convert these uses to check seq_has_overflowed instead. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Corey Minyard <cminyard@mvista.com>
2015-02-19	ipmi: Use is_visible callback for conditional sysfs entries	Takashi Iwai	1	-43/+17
	Instead of manual calls of device_create_file() and device_remove_file(), implement the condition in is_visible callback for the attribute group and put these entries to the group, too. This simplifies the code and avoids the possible races. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Corey Minyard <cminyard@mvista.com>
2015-02-19	ipmi: Free ipmi_recv_msg messages from the linked list on close	Nicholas Krause	1	-1/+5
	This adds a loop through the elements in the linked list, recv_msgs using list_for_entry_safe in order to free messages in this list. In addition we are using the safe version of this marco in order to prevent use after bugs related to deleting the element we are on currently by holding a pointer to the next element after the current one we are on and freeing with the function, ipmi_free_recv_msg internally in this loop. Signed-off-by: Nicholas Krause <xerofoify@gmail.com> Signed-off-by: Corey Minyard <cminyard@mvista.com>
2015-02-19	ipmi: avoid gcc warning	Arnd Bergmann	1	-8/+21
	A new harmless warning has come up on ARM builds with gcc-4.9: drivers/char/ipmi/ipmi_msghandler.c: In function 'smi_send.isra.11': include/linux/spinlock.h:372:95: warning: 'flags' may be used uninitialized in this function [-Wmaybe-uninitialized] raw_spin_unlock_irqrestore(&lock->rlock, flags); ^ drivers/char/ipmi/ipmi_msghandler.c:1490:16: note: 'flags' was declared here unsigned long flags; ^ This could be worked around by initializing the 'flags' variable, but it seems better to rework the code to avoid this. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 7ea0ed2b5be81 ("ipmi: Make the message handler easier to use for SMI interfaces") Signed-off-by: Corey Minyard <cminyard@mvista.com>
2015-02-19	ipmi: Update timespec usage to timespec64	John Stultz	1	-12/+13
	As part of the internal y2038 cleanup, this patch removes timespec usage in the ipmi driver, replacing it timespec64 Cc: openipmi-developer@lists.sourceforge.net Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Corey Minyard <minyard@mvista.com>
2015-02-19	ipmi: Cleanup DEBUG_TIMING ifdef usage	John Stultz	1	-40/+21
	The driver uses #ifdef DEBUG_TIMING in order to conditionally print out timestamped debug messages. Unfortunately it adds the ifdefs all over the usage sites. This patch cleans it up by adding a debug_timestamp() function which is compiled out if DEBUG_TIMING isn't present. This cleans up all the ugly ifdefs in the function logic. Cc: openipmi-developer@lists.sourceforge.net Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Corey Minyard <minyard@mvista.com>
2015-02-19	drivers:char:ipmi: Remove unneeded FIXME comment in the file,ipmi_si_intf.c	Nicholas Krause	1	-1/+0
	Removes a no longer needed FIXME comment in the function,acpi_gpe_irq_setup for the file,ipmi_si_intf.c. This comment is no longer needed as clearly we are passing the correct level of ACPI_GPE_LEVEL_TRIGGERED to the installer function,acpi_install_gpe_handler due to no breakage after years of using this ACPI level in the function,acpi_install_gpe_handler. Signed-off-by: Nicholas Krause <xerofoify@gmail.com> Signed-off-by: Corey Minyard <cminyard@mvista.com>
2015-02-19	char: ipmi: Remove obsolete cleanup for clientdata	Wolfram Sang	1	-2/+0
	A few new i2c-drivers came into the kernel which clear the clientdata-pointer on exit or error. This is obsolete meanwhile, the core will do it. Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Corey Minyard <cminyard@mvista.com>
2015-02-19	ipmi: Remove a FIXME for slab conversion	Corey Minyard	1	-1/+0
	There can't be more than a few IPMI messages allocated at any one time, so converting the messages to slabs would be a waste. So just remove the FIXME. Suggested-by: Nicholas Krause <xerofoify@gmail.com> Signed-off-by: Corey Minyard <cminyard@mvista.com>
2015-02-19	x86: pte_protnone() and pmd_protnone() must check entry is not present	David Vrabel	1	-2/+4
	Since _PAGE_PROTNONE aliases _PAGE_GLOBAL it is only valid if _PAGE_PRESENT is clear. Make pte_protnone() and pmd_protnone() check for this. This fixes a 64-bit Xen PV guest regression introduced by 8a0516ed8b90 ("mm: convert p[te\|md]_numa users to p[te\|md]_protnone_numa"). Any userspace process would endlessly fault. In a 64-bit PV guest, userspace page table entries have _PAGE_GLOBAL set by the hypervisor. This meant that any fault on a present userspace entry (e.g., a write to a read-only mapping) would be misinterpreted as a NUMA hinting fault and the fault would not be correctly handled, resulting in the access endlessly faulting. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-19	kgdb, docs: Fix <para> pdfdocs build errors	Rajaneesh Acharya	1	-3/+3
	kgdb.pdf failed to build from 'make pdfdocs' giving errors such as: jade:... Documentation/DocBook/kgdb.xml:200:8:E: document type does not allow element "para" here; missing one of "footnote", "caution", "important", "note", "tip", "warning", "blockquote", "informalexample" start-tag Fixing minor <para> and <sect> issues allows kgdb.pdf to be generated under Fedora20. Originally submitted by rajaneesh.acharya@yahoo.com in 2011, discussed here: http://permalink.gmane.org/gmane.linux.documentation/3954 as patch: The following are the enhancements that removed the errors while issuing "make pdfdocs" [graham.whaley@intel.com: Improved commit message and ported to 3.18.1] Signed-off-by: Graham Whaley <graham.whaley@intel.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2015-02-19	debug: prevent entering debug mode on panic/exception.	Colin Cross	1	-0/+17
	On non-developer devices, kgdb prevents the device from rebooting after a panic. Incase of panics and exceptions, to allow the device to reboot, prevent entering debug mode to avoid getting stuck waiting for the user to interact with debugger. To avoid entering the debugger on panic/exception without any extra configuration, panic_timeout is being used which can be set via /proc/sys/kernel/panic at run time and CONFIG_PANIC_TIMEOUT sets the default value. Setting panic_timeout indicates that the user requested machine to perform unattended reboot after panic. We dont want to get stuck waiting for the user input incase of panic. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: kgdb-bugreport@lists.sourceforge.net Cc: linux-kernel@vger.kernel.org Cc: Android Kernel Team <kernel-team@android.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Signed-off-by: Colin Cross <ccross@android.com> [Kiran: Added context to commit message. panic_timeout is used instead of break_on_panic and break_on_exception to honor CONFIG_PANIC_TIMEOUT Modified the commit as per community feedback] Signed-off-by: Kiran Raparthy <kiran.kumar@linaro.org> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2015-02-19	kdb: Const qualifier for kdb_getstr's prompt argument	Daniel Thompson	2	-2/+2
	All current callers of kdb_getstr() can pass constant pointers via the prompt argument. This patch adds a const qualification to make explicit the fact that this is safe. Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2015-02-19	kdb: Provide forward search at more prompt	Daniel Thompson	3	-5/+26
	Currently kdb allows the output of comamnds to be filtered using the \| grep feature. This is useful but does not permit the output emitted shortly after a string match to be examined without wading through the entire unfiltered output of the command. Such a feature is particularly useful to navigate function traces because these traces often have a useful trigger string before the point of interest. This patch reuses the existing filtering logic to introduce a simple forward search to kdb that can be triggered from the more prompt. Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2015-02-19	kdb: Fix a prompt management bug when using \| grep	Daniel Thompson	1	-2/+2
	Currently when the "\| grep" feature is used to filter the output of a command then the prompt is not displayed for the subsequent command. Likewise any characters typed by the user are also not echoed to the display. This rather disconcerting problem eventually corrects itself when the user presses Enter and the kdb_grepping_flag is cleared as kdb_parse() tries to make sense of whatever they typed. This patch resolves the problem by moving the clearing of this flag from the middle of command processing to the beginning. Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2015-02-19	kdb: Remove stack dump when entering kgdb due to NMI	Daniel Thompson	1	-1/+0
	Issuing a stack dump feels ergonomically wrong when entering due to NMI. Entering due to NMI is normally a reaction to a user request, either the NMI button on a server or a "magic knock" on a UART. Therefore the backtrace behaviour on entry due to NMI should be like SysRq-g (no stack dump) rather than like oops. Note also that the stack dump does not offer any information that cannot be trivial retrieved using the 'bt' command. Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2015-02-19	kdb: Avoid printing KERN_ levels to consoles	Daniel Thompson	3	-11/+21
	Currently when kdb traps printk messages then the raw log level prefix (consisting of '\001' followed by a numeral) does not get stripped off before the message is issued to the various I/O handlers supported by kdb. This causes annoying visual noise as well as causing problems grepping for ^. It is also a change of behaviour compared to normal usage of printk() usage. For example <SysRq>-h ends up with different output to that of kdb's "sr h". This patch addresses the problem by stripping log levels from messages before they are issued to the I/O handlers. printk() which can also act as an i/o handler in some cases is special cased; if the caller provided a log level then the prefix will be preserved when sent to printk(). The addition of non-printable characters to the output of kdb commands is a regression, albeit and extremely elderly one, introduced by commit 04d2c8c83d0e ("printk: convert the format for KERN_<LEVEL> to a 2 byte pattern"). Note also that this patch does not restore the original behaviour from v3.5. Instead it makes printk() from within a kdb command display the message without any prefix (i.e. like printk() normally does). Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Cc: Joe Perches <joe@perches.com> Cc: stable@vger.kernel.org Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2015-02-19	kdb: Fix off by one error in kdb_cpu()	Jason Wessel	2	-2/+2
	There was a follow on replacement patch against the prior "kgdb: Timeout if secondary CPUs ignore the roundup". See: https://lkml.org/lkml/2015/1/7/442 This patch is the delta vs the patch that was committed upstream: * Fix an off-by-one error in kdb_cpu(). * Replace NR_CPUS with CONFIG_NR_CPUS to tell checkpatch that we really want a static limit. * Removed the "KGDB: " prefix from the pr_crit() in debug_core.c (kgdb-next contains a patch which introduced pr_fmt() to this file to the tag will now be applied automatically). Cc: Daniel Thompson <daniel.thompson@linaro.org> Cc: <stable@vger.kernel.org> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2015-02-19	kdb: fix incorrect counts in KDB summary command output	Jay Lan	1	-1/+1
	The output of KDB 'summary' command should report MemTotal, MemFree and Buffers output in kB. Current codes report in unit of pages. A define of K(x) as is defined in the code, but not used. This patch would apply the define to convert the values to kB. Please include me on Cc on replies. I do not subscribe to linux-kernel. Signed-off-by: Jay Lan <jlan@sgi.com> Cc: <stable@vger.kernel.org> Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2015-02-19	MAINTAINERS: update Ceph and RBD maintainers	Sage Weil	1	-3/+4
	- add Ilya, drop Yehuda as an RBD maintainer - add Zheng as a Ceph maintainer - update Yehuda and Sage's emails Signed-off-by: Sage Weil <sage@redhat.com>
2015-02-19	s390/spinlock: disabled compare-and-delay by default	Martin Schwidefsky	1	-5/+7
	Until we have hard performance data about the effects of CAD in the spinlock loop disable the instruction by default. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-02-19	i2c: designware-baytrail: another fixup for proper Kconfig dependencies	Wolfram Sang	1	-1/+1
	IOSF_MBI is tristate. Baytrail driver isn't. Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2015-02-19	i2c: fix reference to functionality constants definition	Baruch Siach	1	-1/+1
	Since commit 607ca46e97 ('UAPI: (Scripted) Disintegrate include/linux') the list of functionality constants moved to include/uapi/linux/i2c.h. Update the reference accordingly. Fixes: 607ca46e97 ('UAPI: (Scripted) Disintegrate include/linux') Signed-off-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2015-02-19	x86/microcode/intel: Handle truncated microcode images more robustly	Quentin Casasnovas	2	-0/+9
	We do not check the input data bounds containing the microcode before copying a struct microcode_intel_header from it. A specially crafted microcode could cause the kernel to read invalid memory and lead to a denial-of-service. Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1422964824-22056-3-git-send-email-quentin.casasnovas@oracle.com [ Made error message differ from the next one and flipped comparison. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2015-02-19	x86/microcode/intel: Guard against stack overflow in the loader	Quentin Casasnovas	1	-1/+1
	mc_saved_tmp is a static array allocated on the stack, we need to make sure mc_saved_count stays within its bounds, otherwise we're overflowing the stack in _save_mc(). A specially crafted microcode header could lead to a kernel crash or potentially kernel execution. Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1422964824-22056-1-git-send-email-quentin.casasnovas@oracle.com Signed-off-by: Borislav Petkov <bp@suse.de>
2015-02-19	libceph: kfree() in put_osd() shouldn't depend on authorizer	Ilya Dryomov	1	-2/+3
	a255651d4cad ("ceph: ensure auth ops are defined before use") made kfree() in put_osd() conditional on the authorizer. A mechanical mistake most likely - fix it. Cc: Alex Elder <elder@linaro.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org>
2015-02-19	libceph: fix double __remove_osd() problem	Ilya Dryomov	1	-8/+18
	It turns out it's possible to get __remove_osd() called twice on the same OSD. That doesn't sit well with rb_erase() - depending on the shape of the tree we can get a NULL dereference, a soft lockup or a random crash at some point in the future as we end up touching freed memory. One scenario that I was able to reproduce is as follows: <osd3 is idle, on the osd lru list> <con reset - osd3> con_fault_finish() osd_reset() <osdmap - osd3 down> ceph_osdc_handle_map() <takes map_sem> kick_requests() <takes request_mutex> reset_changed_osds() __reset_osd() __remove_osd() <releases request_mutex> <releases map_sem> <takes map_sem> <takes request_mutex> __kick_osd_requests() __reset_osd() __remove_osd() <-- !!! A case can be made that osd refcounting is imperfect and reworking it would be a proper resolution, but for now Sage and I decided to fix this by adding a safe guard around __remove_osd(). Fixes: http://tracker.ceph.com/issues/8087 Cc: Sage Weil <sage@redhat.com> Cc: stable@vger.kernel.org # 3.9+: 7c6e6fc53e73: libceph: assert both regular and lingering lists in __remove_osd() Cc: stable@vger.kernel.org # 3.9+: cc9f1f518cec: libceph: change from BUG to WARN for __remove_osd() asserts Cc: stable@vger.kernel.org # 3.9+ Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org>
2015-02-19	rbd: convert to blk-mq	Christoph Hellwig	1	-54/+68
	This converts the rbd driver to use the blk-mq infrastructure. Except for switching to a per-request work item this is almost mechanical. This was tested by Alexandre DERUMIER in November, and found to give him 120000 iops, although the only comparism available was an old 3.10 kernel which gave 80000iops. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <elder@linaro.org> [idryomov@gmail.com: context, blk_mq_init_queue() EH] Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-02-19	x86, mm/ASLR: Fix stack randomization on 64-bit systems	Hector Marco-Gisbert	2	-5/+6
	The issue is that the stack for processes is not properly randomized on 64 bit architectures due to an integer overflow. The affected function is randomize_stack_top() in file "fs/binfmt_elf.c": static unsigned long randomize_stack_top(unsigned long stack_top) { unsigned int random_variable = 0; if ((current->flags & PF_RANDOMIZE) && !(current->personality & ADDR_NO_RANDOMIZE)) { random_variable = get_random_int() & STACK_RND_MASK; random_variable <<= PAGE_SHIFT; } return PAGE_ALIGN(stack_top) + random_variable; return PAGE_ALIGN(stack_top) - random_variable; } Note that, it declares the "random_variable" variable as "unsigned int". Since the result of the shifting operation between STACK_RND_MASK (which is 0x3fffff on x86_64, 22 bits) and PAGE_SHIFT (which is 12 on x86_64): random_variable <<= PAGE_SHIFT; then the two leftmost bits are dropped when storing the result in the "random_variable". This variable shall be at least 34 bits long to hold the (22+12) result. These two dropped bits have an impact on the entropy of process stack. Concretely, the total stack entropy is reduced by four: from 2^28 to 2^30 (One fourth of expected entropy). This patch restores back the entropy by correcting the types involved in the operations in the functions randomize_stack_top() and stack_maxrandom_size(). The successful fix can be tested with: $ for i in `seq 1 10`; do cat /proc/self/maps \| grep stack; done 7ffeda566000-7ffeda587000 rw-p 00000000 00:00 0 [stack] 7fff5a332000-7fff5a353000 rw-p 00000000 00:00 0 [stack] 7ffcdb7a1000-7ffcdb7c2000 rw-p 00000000 00:00 0 [stack] 7ffd5e2c4000-7ffd5e2e5000 rw-p 00000000 00:00 0 [stack] ... Once corrected, the leading bytes should be between 7ffc and 7fff, rather than always being 7fff. Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es> Signed-off-by: Ismael Ripoll <iripoll@upv.es> [ Rebased, fixed 80 char bugs, cleaned up commit message, added test example and CVE ] Signed-off-by: Kees Cook <keescook@chromium.org> Cc: <stable@vger.kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Fixes: CVE-2015-1593 Link: http://lkml.kernel.org/r/20150214173350.GA18393@www.outflux.net Signed-off-by: Borislav Petkov <bp@suse.de>
2015-02-19	x86/mm/init: Fix incorrect page size in init_memory_mapping() printks	Dave Hansen	1	-2/+26
	With 32-bit non-PAE kernels, we have 2 page sizes available (at most): 4k and 4M. Enabling PAE replaces that 4M size with a 2M one (which 64-bit systems use too). But, when booting a 32-bit non-PAE kernel, in one of our early-boot printouts, we say: init_memory_mapping: [mem 0x00000000-0x000fffff] [mem 0x00000000-0x000fffff] page 4k init_memory_mapping: [mem 0x37000000-0x373fffff] [mem 0x37000000-0x373fffff] page 2M init_memory_mapping: [mem 0x00100000-0x36ffffff] [mem 0x00100000-0x003fffff] page 4k [mem 0x00400000-0x36ffffff] page 2M init_memory_mapping: [mem 0x37400000-0x377fdfff] [mem 0x37400000-0x377fdfff] page 4k Which is obviously wrong. There is no 2M page available. This is probably because of a badly-named variable: in the map_range code: PG_LEVEL_2M. Instead of renaming all the PG_LEVEL_2M's. This patch just fixes the printout: init_memory_mapping: [mem 0x00000000-0x000fffff] [mem 0x00000000-0x000fffff] page 4k init_memory_mapping: [mem 0x37000000-0x373fffff] [mem 0x37000000-0x373fffff] page 4M init_memory_mapping: [mem 0x00100000-0x36ffffff] [mem 0x00100000-0x003fffff] page 4k [mem 0x00400000-0x36ffffff] page 4M init_memory_mapping: [mem 0x37400000-0x377fdfff] [mem 0x37400000-0x377fdfff] page 4k BRK [0x03206000, 0x03206fff] PGTABLE Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/20150210212030.665EC267@viggo.jf.intel.com Signed-off-by: Borislav Petkov <bp@suse.de>
2015-02-19	x86/mm/ASLR: Propagate base load address calculation	Jiri Kosina	7	-17/+63
	Commit: e2b32e678513 ("x86, kaslr: randomize module base load address") makes the base address for module to be unconditionally randomized in case when CONFIG_RANDOMIZE_BASE is defined and "nokaslr" option isn't present on the commandline. This is not consistent with how choose_kernel_location() decides whether it will randomize kernel load base. Namely, CONFIG_HIBERNATION disables kASLR (unless "kaslr" option is explicitly specified on kernel commandline), which makes the state space larger than what module loader is looking at. IOW CONFIG_HIBERNATION && CONFIG_RANDOMIZE_BASE is a valid config option, kASLR wouldn't be applied by default in that case, but module loader is not aware of that. Instead of fixing the logic in module.c, this patch takes more generic aproach. It introduces a new bootparam setup data_type SETUP_KASLR and uses that to pass the information whether kaslr has been applied during kernel decompression, and sets a global 'kaslr_enabled' variable accordingly, so that any kernel code (module loading, livepatching, ...) can make decisions based on its value. x86 module loader is converted to make use of this flag. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Kees Cook <keescook@chromium.org> Cc: "H. Peter Anvin" <hpa@linux.intel.com> Link: https://lkml.kernel.org/r/alpine.LNX.2.00.1502101411280.10719@pobox.suse.cz [ Always dump correct kaslr status when panicking ] Signed-off-by: Borislav Petkov <bp@suse.de>
2015-02-19	ceph: return error for traceless reply race	Yan, Zheng	1	-6/+9
	When we receives traceless reply for request that created new inode, we re-send a lookup request to MDS get information of the newly created inode. (VFS expects FS' callback return an inode in create case) This breaks one request into two requests. Other client may modify or move to the new inode in the middle. When the race happens, ceph_handle_notrace_create() unconditionally links the dentry for 'create' operation to the inode returned by lookup. This may confuse VFS when the inode is a directory (VFS does not allow multiple linkages for directory inode). This patch makes ceph_handle_notrace_create() when it detect a race. This event should be rare and it happens only when we talk to old MDS. Recent MDS does not send traceless reply for request that creates new inode. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-19	ceph: fix dentry leaks	Yan, Zheng	2	-3/+6
	Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-19	ceph: re-send requests when MDS enters reconnecting stage	Yan, Zheng	1	-3/+26
	So that MDS can check if any request is already completed and process completed requests in clientreplay stage. When completed requests are processed in clientreplay stage, MDS can avoid sending traceless replies. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-19	ceph: show nocephx_require_signatures and notcp_nodelay options	Ilya Dryomov	1	-0/+4
	Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
2015-02-19	libceph: tcp_nodelay support	Chaitanya Huilgol	4	-4/+33
	TCP_NODELAY socket option set on connection sockets, disables Nagle’s algorithm and improves latency characteristics. tcp_nodelay(default)/notcp_nodelay option flags provided to enable/disable setting the socket option. Signed-off-by: Chaitanya Huilgol <chaitanya.huilgol@sandisk.com> [idryomov@redhat.com: NO_TCP_NODELAY -> TCP_NODELAY, minor adjustments] Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
2015-02-19	rbd: do not treat standalone as flatten	Ilya Dryomov	1	-20/+10
	If the clone is resized down to 0, it becomes standalone. If such resize is carried over while an image is mapped we would detect this and call rbd_dev_parent_put() which means "let go of all parent state, including the spec(s) of parent images(s)". This leads to a mismatch between "rbd info" and sysfs parent fields, so a fix is in order. # rbd create --image-format 2 --size 1 foo # rbd snap create foo@snap # rbd snap protect foo@snap # rbd clone foo@snap bar # DEV=$(rbd map bar) # rbd resize --allow-shrink --size 0 bar # rbd resize --size 1 bar # rbd info bar \| grep parent parent: rbd/foo@snap Before: # cat /sys/bus/rbd/devices/0/parent (no parent image) After: # cat /sys/bus/rbd/devices/0/parent pool_id 0 pool_name rbd image_id 10056b8b4567 image_name foo snap_id 2 snap_name snap overlap 0 Signed-off-by: Ilya Dryomov <idryomov@redhat.com> Reviewed-by: Josh Durgin <jdurgin@redhat.com> Reviewed-by: Alex Elder <elder@linaro.org>
2015-02-19	ceph: fix atomic_open snapdir	Yan, Zheng	1	-1/+1
	ceph_handle_snapdir() checks ceph_mdsc_do_request()'s return value and creates snapdir inode if it's -ENOENT Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-19	ceph: properly mark empty directory as complete	Yan, Zheng	1	-14/+15
	ceph_add_cap() calls __check_cap_issue(), which clears directory inode' complete flag. so we should set the complete flag for empty directory should be set after calling ceph_add_cap(). Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-19	client: include kernel version in client metadata	Yan, Zheng	1	-1/+2
	Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-19	ceph: provide seperate {inode,file}_operations for snapdir	Yan, Zheng	3	-4/+19
	remove all unsupported operations from {inode,file}_operations. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-19	ceph: fix request time stamp encoding	Yan, Zheng	1	-2/+10
	struct timespec uses 'long' to present second and nanosecond. 'long' is 64 bits on 64bits machine. ceph MDS expects time stamp to be encoded as struct ceph_timespec, which uses 'u32' to present second and nanosecond. Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-02-19	ceph: fix reading inline data when i_size > PAGE_SIZE	Yan, Zheng	2	-15/+26
	when inode has inline data but its size > PAGE_SIZE (it was truncated to larger size), previous direct read code return -EIO. This patch adds code to return zeros for data whose offset > PAGE_SIZE. Signed-off-by: Yan, Zheng <zyan@redhat.com>