aboutsummaryrefslogtreecommitdiffstats
path: root/kernel (follow)
AgeCommit message (Collapse)AuthorFilesLines
2014-06-05sched/fair: Fix tg_set_cfs_bandwidth() deadlock on rq->lockRoman Gushchin3-7/+6
tg_set_cfs_bandwidth() sets cfs_b->timer_active to 0 to force the period timer restart. It's not safe, because can lead to deadlock, described in commit 927b54fccbf0: "__start_cfs_bandwidth calls hrtimer_cancel while holding rq->lock, waiting for the hrtimer to finish. However, if sched_cfs_period_timer runs for another loop iteration, the hrtimer can attempt to take rq->lock, resulting in deadlock." Three CPUs must be involved: CPU0 CPU1 CPU2 take rq->lock period timer fired ... take cfs_b lock ... ... tg_set_cfs_bandwidth() throttle_cfs_rq() release cfs_b lock take cfs_b lock ... distribute_cfs_runtime() timer_active = 0 take cfs_b->lock wait for rq->lock ... __start_cfs_bandwidth() {wait for timer callback break if timer_active == 1} So, CPU0 and CPU1 are deadlocked. Instead of resetting cfs_b->timer_active, tg_set_cfs_bandwidth can wait for period timer callbacks (ignoring cfs_b->timer_active) and restart the timer explicitly. Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> Reviewed-by: Ben Segall <bsegall@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/87wqdi9g8e.wl\%klamm@yandex-team.ru Cc: pjt@google.com Cc: chris.j.arges@canonical.com Cc: gregkh@linuxfoundation.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05sched/dl: Fix race in dl_task_timer()Kirill Tkhai1-1/+9
Throttled task is still on rq, and it may be moved to other cpu if user is playing with sched_setaffinity(). Therefore, unlocked task_rq() access makes the race. Juri Lelli reports he got this race when dl_bandwidth_enabled() was not set. Other thing, pointed by Peter Zijlstra: "Now I suppose the problem can still actually happen when you change the root domain and trigger a effective affinity change that way". To fix that we do the same as made in __task_rq_lock(). We do not use __task_rq_lock() itself, because it has a useful lockdep check, which is not correct in case of dl_task_timer(). We do not need pi_lock locked here. This case is an exception (PeterZ): "The only reason we don't strictly need ->pi_lock now is because we're guaranteed to have p->state == TASK_RUNNING here and are thus free of ttwu races". Signed-off-by: Kirill Tkhai <tkhai@yandex.ru> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> # v3.14+ Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/3056991400578422@web14g.yandex.ru Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05sched: Fix sched_policy < 0 comparisonRichard Weinberger1-1/+1
attr.sched_policy is u32, therefore a comparison against < 0 is never true. Fix this by casting sched_policy to int. This issue was reported by coverity CID 1219934. Fixes: dbdb22754fde ("sched: Disallow sched_attr::sched_policy < 0") Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1401741514-7045-1-git-send-email-richard@nod.at Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05sched/numa: Fix use of spin_{un}lock_irq() when interrupts are disabledSteven Rostedt1-3/+4
As Peter Zijlstra told me, we have the following path: do_exit() exit_itimers() itimer_delete() spin_lock_irqsave(&timer->it_lock, &flags); timer_delete_hook(timer); kc->timer_del(timer) := posix_cpu_timer_del() put_task_struct() __put_task_struct() task_numa_free() spin_lock(&grp->lock); Which means that task_numa_free() can be called with interrupts disabled, which means that we should not be using spin_lock_irq() but spin_lock_irqsave() instead. Otherwise we are enabling interrupts while holding an interrupt unsafe lock! Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner<tglx@linutronix.de> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140527182541.GH11096@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-04cgroup: don't destroy the default rootLi Zefan1-1/+4
The default root is allocated and initialized at boot phase, so we shouldn't destroy the default root when it's umounted, otherwise it will lead to disaster. Just try mount and then umount the default root, and the kernel will crash immediately. v2: - No need to check for CSS_NO_REF in cgroup_get/put(). (Tejun) - Better call cgroup_put() for the default root in kill_sb(). (Tejun) - Add a comment. Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2014-06-04Merge branch 'akpm' (patchbomb from Andrew) into nextLinus Torvalds28-229/+337
Merge misc updates from Andrew Morton: - a few fixes for 3.16. Cc'ed to stable so they'll get there somehow. - various misc fixes and cleanups - most of the ocfs2 queue. Review is slow... - most of MM. The MM queue is pretty huge this time, but not much in the way of feature work. - some tweaks under kernel/ - printk maintenance work - updates to lib/ - checkpatch updates - tweaks to init/ * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (276 commits) fs/autofs4/dev-ioctl.c: add __init to autofs_dev_ioctl_init fs/ncpfs/getopt.c: replace simple_strtoul by kstrtoul init/main.c: remove an ifdef kthreads: kill CLONE_KERNEL, change kernel_thread(kernel_init) to avoid CLONE_SIGHAND init/main.c: add initcall_blacklist kernel parameter init/main.c: don't use pr_debug() fs/binfmt_flat.c: make old_reloc() static fs/binfmt_elf.c: fix bool assignements fs/efs: convert printk(KERN_DEBUG to pr_debug fs/efs: add pr_fmt / use __func__ fs/efs: convert printk to pr_foo() scripts/checkpatch.pl: device_initcall is not the only __initcall substitute checkpatch: check stable email address checkpatch: warn on unnecessary void function return statements checkpatch: prefer kstrto<foo> to sscanf(buf, "%<lhuidx>", &bar); checkpatch: add warning for kmalloc/kzalloc with multiply checkpatch: warn on #defines ending in semicolon checkpatch: make --strict a default for files in drivers/net and net/ checkpatch: always warn on missing blank line after variable declaration block checkpatch: fix wildcard DT compatible string checking ...
2014-06-04kernel/compat.c: use sizeof() instead of sizeofFabian Frederick1-4/+4
Fix 4 checkpatch warnings WARNING: sizeof *tv should be sizeof(*tv) Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04kernel/printk: use symbolic defines for console loglevelsBorislav Petkov4-13/+6
... instead of naked numbers. Stuff in sysrq.c used to set it to 8 which is supposed to mean above default level so set it to DEBUG instead as we're terminating/killing all tasks and we want to be verbose there. Also, correct the check in x86_64_start_kernel which should be >= as we're clearly issuing the string there for all debug levels, not only the magical 10. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Cc: Joe Perches <joe@perches.com> Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04printk: report dropping of messages from logbufWill Deacon1-2/+7
If the log ring buffer becomes full, we silently overwrite old messages with new data. console_unlock will detect this case and fast-forward the console_* pointers to skip over the corrupted data, but nothing will be reported to the user. This patch hijacks the first valid log message after detecting that we dropped messages and prefixes it with a note detailing how many messages were dropped. For long (~1000 char) messages, this will result in some truncation of the real message, but given that we're dropping things anyway, that doesn't seem to be the end of the world. Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Kay Sievers <kay@vrfy.org> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04timekeeping: use printk_deferred when holding timekeeping seqlockJohn Stultz2-9/+13
Jiri Bohac pointed out that there are rare but potential deadlock possibilities when calling printk while holding the timekeeping seqlock. This is due to printk() triggering console sem wakeup, which can cause scheduling code to trigger hrtimers which may try to read the time. Specifically, as Jiri pointed out, that path is: printk vprintk_emit console_unlock up(&console_sem) __up wake_up_process try_to_wake_up ttwu_do_activate ttwu_activate activate_task enqueue_task enqueue_task_fair hrtick_update hrtick_start_fair hrtick_start_fair get_time ktime_get --> endless loop on read_seqcount_retry(&timekeeper_seq, ...) This patch tries to avoid this issue by using printk_deferred (previously named printk_sched) which should defer printing via a irq_work_queue. Signed-off-by: John Stultz <john.stultz@linaro.org> Reported-by: Jiri Bohac <jbohac@suse.cz> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Jan Kara <jack@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04printk: Add printk_deferred_onceJohn Stultz2-13/+2
Two of the three prink_deferred uses are really printk_once style uses, so add a printk_deferred_once macro to simplify those call sites. Signed-off-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Bohac <jbohac@suse.cz> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04printk: rename printk_sched to printk_deferredJohn Stultz4-4/+4
After learning we'll need some sort of deferred printk functionality in the timekeeping core, Peter suggested we rename the printk_sched function so it can be reused by needed subsystems. This only changes the function name. No logic changes. Signed-off-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Jan Kara <jack@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Bohac <jbohac@suse.cz> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04printk: disable preemption for printk_schedJohn Stultz1-0/+2
An earlier change in -mm (printk: remove separate printk_sched buffers...), removed the printk_sched irqsave/restore lines since it was safe for current users. Since we may be expanding usage of printk_sched(), disable preepmtion for this function to make it more generally safe to call. Signed-off-by: John Stultz <john.stultz@linaro.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Bohac <jbohac@suse.cz> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04printk: remove separate printk_sched buffers and use printk buf insteadSteven Rostedt1-18/+29
To prevent deadlocks with doing a printk inside the scheduler, printk_sched() was created. The issue is that printk has a console_sem that it can grab and release. The release does a wake up if there's a task pending on the sem, and this wake up grabs the rq locks that is held in the scheduler. This leads to a possible deadlock if the wake up uses the same rq as the one with the rq lock held already. What printk_sched() does is to save the printk write in a per cpu buffer and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is set, the printk() is done against the buffer. There's a couple of issues with this approach. 1) If two printk_sched()s are called before the tick, the second one will overwrite the first one. 2) The temporary buffer is 512 bytes and is per cpu. This is a quite a bit of space wasted for something that is seldom used. In order to remove this, the printk_sched() can use the printk buffer instead, and delay the console_trylock()/console_unlock() to the queued work. Because printk_sched() would then be taking the logbuf_lock, the logbuf_lock must not be held while doing anything that may call into the scheduler functions, which includes wake ups. Unfortunately, printk() also has a console_sem that it uses, and on release, the up(&console_sem) may do a wake up of any pending waiters. This must be avoided while holding the logbuf_lock. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04printk: enable interrupts before calling console_trylock_for_printk()Jan Kara1-11/+18
We need interrupts disabled when calling console_trylock_for_printk() only so that cpu id we pass to can_use_console() remains valid (for other things console_sem provides all the exclusion we need and deadlocks on console_sem due to interrupts are impossible because we use down_trylock()). However if we are rescheduled, we are guaranteed to run on an online cpu so we can easily just get the cpu id in can_use_console(). We can lose a bit of performance when we enable interrupts in vprintk_emit() and then disable them again in console_unlock() but OTOH it can somewhat reduce interrupt latency caused by console_unlock() especially since later in the patch series we will want to spin on console_sem in console_trylock_for_printk(). Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04printk: fix lockdep instrumentation of console_semJan Kara1-14/+32
Printk calls mutex_acquire() / mutex_release() by hand to instrument lockdep about console_sem. However in some corner cases the instrumentation is missing. Fix the problem by creating helper functions for locking / unlocking console_sem which take care of lockdep instrumentation as well. Signed-off-by: Jan Kara <jack@suse.cz> Reported-by: Fabio Estevam <festevam@gmail.com> Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Fabio Estevam <fabio.estevam@freescale.com> Tested-By: Valdis Kletnieks <valdis.kletnieks@vt.edu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04printk: release lockbuf_lock before calling console_trylock_for_printk()Jan Kara1-33/+21
There's no reason to hold lockbuf_lock when entering console_trylock_for_printk(). The first thing this function does is to call down_trylock(console_sem) and if that fails it immediately unlocks lockbuf_lock. So lockbuf_lock isn't needed for that branch. When down_trylock() succeeds, the rest of console_trylock() is OK without lockbuf_lock (it is called without it from other places), and the only remaining thing in console_trylock_for_printk() is can_use_console() call. For that call console_sem is enough (it iterates all consoles and checks CON_ANYTIME flag). So we drop logbuf_lock before entering console_trylock_for_printk() which simplifies the code. [akpm@linux-foundation.org: fix have_callable_console() comment] Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04printk: remove outdated commentJan Kara1-2/+1
Comment about interesting interlocking between lockbuf_lock and console_sem is outdated. It was added in 2002 by commit a880f45a48be during conversion of console_lock to console_sem + lockbuf_lock. At that time release_console_sem() (today's equivalent is console_unlock()) was indeed using lockbuf_lock to avoid races between trylock on console_sem in printk() and unlock of console_sem. However these days the interlocking is gone and the races are avoided by rechecking logbuf state after releasing console_sem. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04printk: return really stored message lengthPetr Mladek1-15/+21
I wonder if anyone uses printk return value but it is there and should be counted correctly. This patch modifies log_store() to return the number of really stored bytes from the 'text' part. Also it handles the return value in vprintk_emit(). Note that log_store() is used also in cont_flush() but we could ignore the return value there. The function works with characters that were already counted earlier. In addition, the store could newer fail here because the length of the printed text is limited by the "cont" buffer and "dict" is NULL. Signed-off-by: Petr Mladek <pmladek@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Kay Sievers <kay@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04printk: shrink too long messagesPetr Mladek1-3/+39
We might want to print at least part of too long messages and add some warning for debugging purpose. The question is how long the shrunken message should be. If we use the whole buffer, it might get rotated too soon. Let's try to use only 1/4 of the buffer for now. Also shrink the whole dictionary. We do not want to parse it or break it in the middle of some pair of values. It would not cause any real harm but still. Signed-off-by: Petr Mladek <pmladek@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Kay Sievers <kay@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04printk: split message size computationPetr Mladek1-3/+13
We will want to recompute the message size when shrinking too long messages. Let's put the code into separate function. The side effect of setting "pad_len" is not nice but it is worth removing the code duplication. Note that I will probably have one more usage for this function when handling messages safe way in NMI context. This patch does not change the existing behavior. Signed-off-by: Petr Mladek <pmladek@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Kay Sievers <kay@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04printk: ignore too long messagesPetr Mladek1-7/+23
There was no check for too long messages. The check for free space always passed when first_seq and next_seq were equal. Enough free space was not guaranteed, though. log_store() might be called to store messages up to 64kB + 64kB + 16B. This is sum of maximal text_len, dict_len values, and the size of the structure printk_log. On the other hand, the minimal size for the main log buffer currently is 4kB and it is enforced only by Kconfig. The good news is that the usage looks safe right now. log_store() is called only from vprintk_emit() and cont_flush(). Here the "text" part is always passed via a static buffer and the length is limited to LOG_LINE_MAX which is 1024. The "dict" part is NULL in most cases. The only exceptions is when vprintk_emit() is called from printk_emit() and dev_vprintk_emit(). But printk_emit() is currently used only in devkmsg_writev() and here "dict" is NULL as well. In dev_vprintk_emit(), "dict" is limited by the static buffer "hdr" of the size 128 bytes. It meas that the current maximal printed text is 1024B + 128B + 16B and it always fit the log buffer. But it is only matter of time when someone calls printk_emit() with unsafe parameters, especially the "dict" one. This patch adds a check for the free space when the buffer is empty. It reuses the already existing log_has_space() function but it has to add an extra parameter. It defines whether the buffer is empty. Note that the same values of "first_idx" and "next_idx" might also mean that the buffer is full. If the buffer is empty, we must respect the current position of the indexes. We cannot reset them to the beginning of the buffer. Otherwise, the functions reading the buffer would get crazy. The question is what to do when the message is too long. This patch uses the easiest solution and just ignores the problematic message. Let's do something better in a followup patch. Signed-off-by: Petr Mladek <pmladek@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Kay Sievers <kay@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04printk: split code for making free space in the log bufferPetr Mladek1-15/+29
The check for free space in the log buffer always passes when "first_seq" and "next_seq" are equal. In theory, it might cause writing outside of the log buffer. Fortunately, the current usage looks safe because the used "text" and "dict" buffers are quite limited. See the second patch for more details. Anyway, it is better to be on the safe side and add a check. An easy solution is done in the 2nd patch and it is improved in the 4th patch. 5th patch fixes the computation of the printed message length. 1st and 3rd patches just do some code refactoring to make the other patches easier. This patch (of 5): There will be needed some fixes in the check for free space. They will be easier if the code is moved outside of the quite long log_store() function. This patch does not change the existing behavior. Signed-off-by: Petr Mladek <pmladek@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Kay Sievers <kay@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04kernel/user.c: drop unused field 'files' from user_structKirill A. Shutemov1-1/+0
Nobody seems uses it for a long time. Let's drop it. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04kernel/hung_task.c: convert simple_strtoul to kstrtouintFabian Frederick1-1/+3
sysctl_hung_task_panic has been changed to unsigned int. use kstrtouint instead of obsolete simple_strtoul Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04kernel/utsname_sysctl.c: replace obsolete __initcall by device_initcallFabian Frederick1-2/+2
Also fixes checkpatch warnings on proc_dostring function parameters Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04kernel/reboot.c: convert simple_strtoul to kstrtointFabian Frederick1-7/+14
Replace obsolete function. kstrtoint is used as reboot_cpu is an integer. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04kernel/res_counter.c: replace simple_strtoull by kstrtoullFabian Frederick1-2/+5
[akpm@linux-foundation.org: don't overwrite kstrtoull()'s errno] Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Michal Hocko <mhocko@suse.cz> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04kernel/tracepoint.c: kernel-doc fixesFabian Frederick1-0/+2
Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04kernel/stop_machine.c: kernel-doc warning fixFabian Frederick1-0/+1
Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04kernel/latencytop.c: convert seq_printf to seq_putsFabian Frederick1-2/+3
This patch also fixes one function declaration over 80 characters. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04kernel/exec_domain.c: code clean-upFabian Frederick1-8/+6
Fix checkpatch warnings about EXPORT_SYMBOL and return() Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04kernel/capability.c: code clean-upFabian Frederick1-3/+3
- EXPORT_SYMBOL - typo: unexpectidly->unexpectedly - function prototype over 80 characters Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04kernel/backtracetest.c: replace no level printk by pr_info()Fabian Frederick1-9/+9
Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04kernel/cpu.c: convert printk to pr_foo()Fabian Frederick1-17/+14
no level printk converted to pr_warn (if err) no level printk converted to pr_info (disabling non-boot cpus) Other printk converted to respective level. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04sys_sgetmask/sys_ssetmask: add CONFIG_SGETMASK_SYSCALLFabian Frederick2-2/+4
sys_sgetmask and sys_ssetmask are obsolete system calls no longer supported in libc. This patch replaces architecture related __ARCH_WANT_SYS_SGETMAX by expert mode configuration.That option is enabled by default for those architectures. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Steven Miao <realmz6@gmail.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04mm: page_alloc: use jump labels to avoid checking number_of_cpusetsMel Gorman1-10/+4
If cpusets are not in use then we still check a global variable on every page allocation. Use jump labels to avoid the overhead. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Jan Kara <jack@suse.cz> Cc: Michal Hocko <mhocko@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04memcg: optimize the "Search everything else" loop in mm_update_next_owner()Oleg Nesterov1-3/+9
for_each_process_thread() is sub-optimal. All threads share the same ->mm, we can swicth to the next process once we found a thread with ->mm != NULL and ->mm != mm. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Peter Chiang <pchiang@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04memcg: mm_update_next_owner() should skip kthreadsOleg Nesterov1-6/+4
"Search through everything else" in mm_update_next_owner() can hit a kthread which adopted this "mm" via use_mm(), it should not be used as mm->owner. Add the PF_KTHREAD check. While at it, change this code to use for_each_process_thread() instead of deprecated do_each_thread/while_each_thread. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Peter Chiang <pchiang@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04memcg: kill CONFIG_MM_OWNEROleg Nesterov2-4/+4
CONFIG_MM_OWNER makes no sense. It is not user-selectable, it is only selected by CONFIG_MEMCG automatically. So we can kill this option in init/Kconfig and do s/CONFIG_MM_OWNER/CONFIG_MEMCG/ globally. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04mm: get rid of __GFP_KMEMCGVladimir Davydov1-3/+3
Currently to allocate a page that should be charged to kmemcg (e.g. threadinfo), we pass __GFP_KMEMCG flag to the page allocator. The page allocated is then to be freed by free_memcg_kmem_pages. Apart from looking asymmetrical, this also requires intrusion to the general allocation path. So let's introduce separate functions that will alloc/free pages charged to kmemcg. The new functions are called alloc_kmem_pages and free_kmem_pages. They should be used when the caller actually would like to use kmalloc, but has to fall back to the page allocator for the allocation is large. They only differ from alloc_pages and free_pages in that besides allocating or freeing pages they also charge them to the kmem resource counter of the current memory cgroup. [sfr@canb.auug.org.au: export kmalloc_order() to modules] Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Greg Thelen <gthelen@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04kthread: fix return value of kthread_create() upon SIGKILL.Tetsuo Handa1-2/+2
Commit 786235eeba0e ("kthread: make kthread_create() killable") meant for allowing kthread_create() to abort as soon as killed by the OOM-killer. But returning -ENOMEM is wrong if killed by SIGKILL from userspace. Change kthread_create() to return -EINTR upon SIGKILL. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Oleg Nesterov <oleg@redhat.com> Acked-by: David Rientjes <rientjes@google.com> Cc: <stable@vger.kernel.org> [3.13+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into nextLinus Torvalds7-45/+188
Pull core irq updates from Thomas Gleixner: "The irq department delivers: - Another tree wide update to get rid of the horrible create_irq interface along with its even more horrible variants. That also gets rid of the last leftovers of the initial sparse irq hackery. arch/driver specific changes have been either acked or ignored. - A fix for the spurious interrupt detection logic with threaded interrupts. - A new ARM SoC interrupt controller - The usual pile of fixes and improvements all over the place" * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) Documentation: brcmstb-l2: Add Broadcom STB Level-2 interrupt controller binding irqchip: brcmstb-l2: Add Broadcom Set Top Box Level-2 interrupt controller genirq: Improve documentation to match current implementation ARM: iop13xx: fix msi support with sparse IRQ genirq: Provide !SMP stub for irq_set_affinity_notifier() irqchip: armada-370-xp: Move the devicetree binding documentation irqchip: gic: Use mask field in GICC_IAR genirq: Remove dynamic_irq mess ia64: Use irq_init_desc genirq: Replace dynamic_irq_init/cleanup genirq: Remove irq_reserve_irq[s] genirq: Replace reserve_irqs in core code s390: Avoid call to irq_reserve_irqs() s390: Remove pointless arch_show_interrupts() s390: pci: Check return value of alloc_irq_desc() proper sh: intc: Remove pointless irq_reserve_irqs() invocation x86, irq: Remove pointless irq_reserve_irqs() call genirq: Make create/destroy_irq() ia64 private tile: Use SPARSE_IRQ tile: pci: Use irq_alloc/free_hwirq() ...
2014-06-04Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into nextLinus Torvalds2-20/+10
Pull timer core updates from Thomas Gleixner: "This time you get nothing really exciting: - A huge update to the sh* clocksource drivers - Support for two more ARM SoCs - Removal of the deprecated setup_sched_clock() API - The usual pile of fixlets all over the place" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) clocksource: Add Freescale FlexTimer Module (FTM) timer support ARM: dts: vf610: Add Freescale FlexTimer Module timer node. clocksource: ftm: Add FlexTimer Module (FTM) Timer devicetree Documentation clocksource: sh_tmu: Remove unnecessary OOM messages clocksource: sh_mtu2: Remove unnecessary OOM messages clocksource: sh_cmt: Remove unnecessary OOM messages clocksource: em_sti: Remove unnecessary OOM messages clocksource: dw_apb_timer_of: Do not trace read_sched_clock clocksource: Fix clocksource_mmio_readX_down clocksource: Fix type confusion for clocksource_mmio_readX_Y clocksource: sh_tmu: Fix channel IRQ retrieval in legacy case clocksource: qcom: Implement read_current_timer for udelay ntp: Make is_error_status() use its argument ntp: Convert simple_strtol to kstrtol timer_stats/doc: Fix /proc/timer_stats documentation sched_clock: Remove deprecated setup_sched_clock() API ARM: sun6i: a31: Add support for the High Speed Timers clocksource: sun5i: Add support for reset controller clocksource: efm32: use $vendor,$device scheme for compatible string KConfig: Vexpress: build the ARM_GLOBAL_TIMER with vexpress platform ...
2014-06-04tracing: Remove unused variable in trace_benchmarkSteven Rostedt (Red Hat)1-1/+0
Somehow this unused variable warning sneaked past my warnings check (probably due to it depending on a new config). kernel/trace/trace_benchmark.c: In function 'trace_do_benchmark': kernel/trace/trace_benchmark.c:38:6: warning: unused variable 'seedsq' [-Wunused-variable] u64 seedsq; ^ Link: http://lkml.kernel.org/r/20140604160921.4f4e69c4@canb.auug.org.au Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-04Merge tag 'pm+acpi-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm into nextLinus Torvalds7-77/+130
Pull ACPI and power management updates from Rafael Wysocki: "ACPICA is the leader this time (63 commits), followed by cpufreq (28 commits), devfreq (15 commits), system suspend/hibernation (12 commits), ACPI video and ACPI device enumeration (10 commits each). We have no major new features this time, but there are a few significant changes of how things work. The most visible one will probably be that we are now going to create platform devices rather than PNP devices by default for ACPI device objects with _HID. That was long overdue and will be really necessary to be able to use the same drivers for the same hardware blocks on ACPI and DT-based systems going forward. We're not expecting fallout from this one (as usual), but it's something to watch nevertheless. The second change having a chance to be visible is that ACPI video will now default to using native backlight rather than the ACPI backlight interface which should generally help systems with broken Win8 BIOSes. We're hoping that all problems with the native backlight handling that we had previously have been addressed and we are in a good enough shape to flip the default, but this change should be easy enough to revert if need be. In addition to that, the system suspend core has a new mechanism to allow runtime-suspended devices to stay suspended throughout system suspend/resume transitions if some extra conditions are met (generally, they are related to coordination within device hierarchy). However, enabling this feature requires cooperation from the bus type layer and for now it has only been implemented for the ACPI PM domain (used by ACPI-enumerated platform devices mostly today). Also, the acpidump utility that was previously shipped as a separate tool will now be provided by the upstream ACPICA along with the rest of ACPICA code, which will allow it to be more up to date and better supported, and we have one new cpuidle driver (ARM clps711x). The rest is improvements related to certain specific use cases, cleanups and fixes all over the place. Specifics: - ACPICA update to upstream version 20140424. That includes a number of fixes and improvements related to things like GPE handling, table loading, headers, memory mapping and unmapping, DSDT/SSDT overriding, and the Unload() operator. The acpidump utility from upstream ACPICA is included too. From Bob Moore, Lv Zheng, David Box, David Binderman, and Colin Ian King. - Fixes and cleanups related to ACPI video and backlight interfaces from Hans de Goede. That includes blacklist entries for some new machines and using native backlight by default. - ACPI device enumeration changes to create platform devices rather than PNP devices for ACPI device objects with _HID by default. PNP devices will still be created for the ACPI device object with device IDs corresponding to real PNP devices, so that change should not break things left and right, and we're expecting to see more and more ACPI-enumerated platform devices in the future. From Zhang Rui and Rafael J Wysocki. - Updates for the ACPI LPSS (Low-Power Subsystem) driver allowing it to handle system suspend/resume on Asus T100 correctly. From Heikki Krogerus and Rafael J Wysocki. - PM core update introducing a mechanism to allow runtime-suspended devices to stay suspended over system suspend/resume transitions if certain additional conditions related to coordination within device hierarchy are met. Related PM documentation update and ACPI PM domain support for the new feature. From Rafael J Wysocki. - Fixes and improvements related to the "freeze" sleep state. They affect several places including cpuidle, PM core, ACPI core, and the ACPI battery driver. From Rafael J Wysocki and Zhang Rui. - Miscellaneous fixes and updates of the ACPI core from Aaron Lu, Bjørn Mork, Hanjun Guo, Lan Tianyu, and Rafael J Wysocki. - Fixes and cleanups for the ACPI processor and ACPI PAD (Processor Aggregator Device) drivers from Baoquan He, Manuel Schölling, Tony Camuso, and Toshi Kani. - System suspend/resume optimization in the ACPI battery driver from Lan Tianyu. - OPP (Operating Performance Points) subsystem updates from Chander Kashyap, Mark Brown, and Nishanth Menon. - cpufreq core fixes, updates and cleanups from Srivatsa S Bhat, Stratos Karafotis, and Viresh Kumar. - Updates, fixes and cleanups for the Tegra, powernow-k8, imx6q, s5pv210, nforce2, and powernv cpufreq drivers from Brian Norris, Jingoo Han, Paul Bolle, Philipp Zabel, Stratos Karafotis, and Viresh Kumar. - intel_pstate driver fixes and cleanups from Dirk Brandewie, Doug Smythies, and Stratos Karafotis. - Enabling the big.LITTLE cpufreq driver on arm64 from Mark Brown. - Fix for the cpuidle menu governor from Chander Kashyap. - New ARM clps711x cpuidle driver from Alexander Shiyan. - Hibernate core fixes and cleanups from Chen Gang, Dan Carpenter, Fabian Frederick, Pali Rohár, and Sebastian Capella. - Intel RAPL (Running Average Power Limit) driver updates from Jacob Pan. - PNP subsystem updates from Bjorn Helgaas and Fabian Frederick. - devfreq core updates from Chanwoo Choi and Paul Bolle. - devfreq updates for exynos4 and exynos5 from Chanwoo Choi and Bartlomiej Zolnierkiewicz. - turbostat tool fix from Jean Delvare. - cpupower tool updates from Prarit Bhargava, Ramkumar Ramachandra and Thomas Renninger. - New ACPI ec_access.c tool for poking at the EC in a safe way from Thomas Renninger" * tag 'pm+acpi-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (187 commits) ACPICA: Namespace: Remove _PRP method support. intel_pstate: Improve initial busy calculation intel_pstate: add sample time scaling intel_pstate: Correct rounding in busy calculation intel_pstate: Remove C0 tracking PM / hibernate: fixed typo in comment ACPI: Fix x86 regression related to early mapping size limitation ACPICA: Tables: Add mechanism to control early table checksum verification. ACPI / scan: use platform bus type by default for _HID enumeration ACPI / scan: always register ACPI LPSS scan handler ACPI / scan: always register memory hotplug scan handler ACPI / scan: always register container scan handler ACPI / scan: Change the meaning of missing .attach() in scan handlers ACPI / scan: introduce platform_id device PNP type flag ACPI / scan: drop unsupported serial IDs from PNP ACPI scan handler ID list ACPI / scan: drop IDs that do not comply with the ACPI PNP ID rule ACPI / PNP: use device ID list for PNPACPI device enumeration ACPI / scan: .match() callback for ACPI scan handlers ACPI / battery: wakeup the system only when necessary power_supply: allow power supply devices registered w/o wakeup source ...
2014-06-03tracing: Eliminate double free on failure of allocation on boot upYoshihiro YUNOMAE1-4/+0
If allocation of the max_buffer fails on boot up, the error path will free both per_cpu data structures from the buffers. With the new redesign of the code, those structures are freed if allocations failed. That is, the helper function that allocates the buffers will free the per cpu data on failure. No need to do it again. In fact, the second free will cause a bug as the code can not handle a double free. Link: http://lkml.kernel.org/p/20140603042803.27308.30956.stgit@yunodevel Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-03Merge branches 'pnp', 'powercap', 'pm-runtime' and 'pm-opp'Rafael J. Wysocki1-2/+1
* pnp: MAINTAINERS: Remove Bjorn Helgaas as PNP maintainer PNP / resources: remove positive test on unsigned values * powercap: powercap / RAPL: add new CPU IDs powercap / RAPL: further relax energy counter checks * pm-runtime: PM / runtime: Update documentation to reflect the current code flow * pm-opp: PM / OPP: discard duplicate OPPs PM / OPP: Make OPP invisible to users in Kconfig PM / OPP: fix incorrect OPP count handling in of_init_opp_table
2014-06-03Merge branch 'acpi-pm'Rafael J. Wysocki1-0/+15
* acpi-pm: ACPI / PM: Export rest of the subsys PM callbacks ACPI / PM: Avoid resuming devices in ACPI PM domain during system suspend ACPI / PM: Hold ACPI scan lock over the "freeze" sleep state ACPI / PM: Export acpi_target_system_state() to modules
2014-06-03Merge branch 'pm-sleep'Rafael J. Wysocki6-75/+114
* pm-sleep: PM / hibernate: fixed typo in comment PM / sleep: unregister wakeup source when disabling device wakeup PM / sleep: Introduce command line argument for sleep state enumeration PM / sleep: Use valid_state() for platform-dependent sleep states only PM / sleep: Add state field to pm_states[] entries PM / sleep: Update device PM documentation to cover direct_complete PM / sleep: Mechanism to avoid resuming runtime-suspended devices unnecessarily PM / hibernate: Fix memory corruption in resumedelay_setup() PM / hibernate: convert simple_strtoul to kstrtoul PM / hibernate: Documentation: Fix script for unswapping PM / hibernate: no kernel_power_off when pm_power_off NULL PM / hibernate: use unsigned local variables in swsusp_show_speed()