aboutsummaryrefslogtreecommitdiffstats
path: root/kernel (follow)
AgeCommit message (Collapse)AuthorFilesLines
2011-12-06Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds4-5/+11
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ftrace: Fix hash record accounting bug perf: Fix parsing of __print_flags() in TP_printk() jump_label: jump_label_inc may return before the code is patched ftrace: Remove force undef config value left for testing tracing: Restore system filter behavior tracing: fix event_subsystem ref counting
2011-12-05Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds4-6/+95
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix loss of notification with multi-event perf, x86: Force IBS LVT offset assignment for family 10h perf, x86: Disable PEBS on SandyBridge chips trace_events_filter: Use rcu_assign_pointer() when setting ftrace_event_call->filter perf session: Fix crash with invalid CPU list perf python: Fix undefined symbol problem perf/x86: Enable raw event access to Intel offcore events perf: Don't use -ENOSPC for out of PMU resources perf: Do not set task_ctx pointer in cpuctx if there are no events in the context perf/x86: Fix PEBS instruction unwind oprofile, x86: Fix crash when unloading module (nmi timer mode) oprofile: Fix crash when unloading module (hr timer mode)
2011-12-05Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds3-3/+4
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clockevents: Set noop handler in clockevents_exchange_device() tick-broadcast: Stop active broadcast device when replacing it clocksource: Fix bug with max_deferment margin calculation rtc: Fix some bugs that allowed accumulating time drift in suspend/resume rtc: Disable the alarm in the hardware
2011-12-05Merge branches 'core-urgent-for-linus' and 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+4
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: slab, lockdep: Fix silly bug * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Fix race condition when stopping the irq thread
2011-12-05Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds4-34/+146
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched, x86: Avoid unnecessary overflow in sched_clock sched: Fix buglet in return_cfs_rq_runtime() sched: Avoid SMT siblings in select_idle_sibling() if possible sched: Set the command name of the idle tasks in SMP kernels sched, rt: Provide means of disabling cross-cpu bandwidth sharing sched: Document wait_for_completion_*() return values sched_fair: Fix a typo in the comment describing update_sd_lb_stats sched: Add a comment to effective_load() since it's a pain
2011-12-05ftrace: Fix hash record accounting bugSteven Rostedt1-1/+3
If the set_ftrace_filter is cleared by writing just whitespace to it, then the filter hash refcounts will be decremented but not updated. This causes two bugs: 1) No functions will be enabled for tracing when they all should be 2) If the users clears the set_ftrace_filter twice, it will crash ftrace: ------------[ cut here ]------------ WARNING: at /home/rostedt/work/git/linux-trace.git/kernel/trace/ftrace.c:1384 __ftrace_hash_rec_update.part.27+0x157/0x1a7() Modules linked in: Pid: 2330, comm: bash Not tainted 3.1.0-test+ #32 Call Trace: [<ffffffff81051828>] warn_slowpath_common+0x83/0x9b [<ffffffff8105185a>] warn_slowpath_null+0x1a/0x1c [<ffffffff810ba362>] __ftrace_hash_rec_update.part.27+0x157/0x1a7 [<ffffffff810ba6e8>] ? ftrace_regex_release+0xa7/0x10f [<ffffffff8111bdfe>] ? kfree+0xe5/0x115 [<ffffffff810ba51e>] ftrace_hash_move+0x2e/0x151 [<ffffffff810ba6fb>] ftrace_regex_release+0xba/0x10f [<ffffffff8112e49a>] fput+0xfd/0x1c2 [<ffffffff8112b54c>] filp_close+0x6d/0x78 [<ffffffff8113a92d>] sys_dup3+0x197/0x1c1 [<ffffffff8113a9a6>] sys_dup2+0x4f/0x54 [<ffffffff8150cac2>] system_call_fastpath+0x16/0x1b ---[ end trace 77a3a7ee73794a02 ]--- Link: http://lkml.kernel.org/r/20111101141420.GA4918@debian Reported-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-12-05jump_label: jump_label_inc may return before the code is patchedGleb Natapov1-1/+2
If cpu A calls jump_label_inc() just after atomic_add_return() is called by cpu B, atomic_inc_not_zero() will return value greater then zero and jump_label_inc() will return to a caller before jump_label_update() finishes its job on cpu B. Link: http://lkml.kernel.org/r/20111018175551.GH17571@redhat.com Cc: stable@vger.kernel.org Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-12-05ftrace: Remove force undef config value left for testingSteven Rostedt1-1/+0
A forced undef of a config value was used for testing and was accidently left in during the final commit. This causes x86 to run slower than needed while running function tracing as well as causes the function graph selftest to fail when DYNMAIC_FTRACE is not set. This is because the code in MCOUNT expects the ftrace code to be processed with the config value set that happened to be forced not set. The forced config option was left in by: commit 6331c28c962561aee59e5a493b7556a4bb585957 ftrace: Fix dynamic selftest failure on some archs Link: http://lkml.kernel.org/r/20111102150255.GA6973@debian Cc: stable@vger.kernel.org Reported-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-12-05tracing: Restore system filter behaviorLi Zefan1-1/+6
Though not all events have field 'prev_pid', it was allowed to do this: # echo 'prev_pid == 100' > events/sched/filter but commit 75b8e98263fdb0bfbdeba60d4db463259f1fe8a2 (tracing/filter: Swap entire filter of events) broke it without any reason. Link: http://lkml.kernel.org/r/4EAF46CF.8040408@cn.fujitsu.com Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-12-05tracing: fix event_subsystem ref countingIlya Dryomov1-1/+0
Fix a bug introduced by e9dbfae5, which prevents event_subsystem from ever being released. Ref_count was added to keep track of subsystem users, not for counting events. Subsystem is created with ref_count = 1, so there is no need to increment it for every event, we have nr_events for that. Fix this by touching ref_count only when we actually have a new user - subsystem_open(). Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Link: http://lkml.kernel.org/r/1320052062-7846-1-git-send-email-idryomov@gmail.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-12-05Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgentIngo Molnar1-3/+3
2011-12-05perf: Fix loss of notification with multi-eventPeter Zijlstra3-2/+90
When you do: $ perf record -e cycles,cycles,cycles noploop 10 You expect about 10,000 samples for each event, i.e., 10s at 1000samples/sec. However, this is not what's happening. You get much fewer samples, maybe 3700 samples/event: $ perf report -D | tail -15 Aggregated stats: TOTAL events: 10998 MMAP events: 66 COMM events: 2 SAMPLE events: 10930 cycles stats: TOTAL events: 3644 SAMPLE events: 3644 cycles stats: TOTAL events: 3642 SAMPLE events: 3642 cycles stats: TOTAL events: 3644 SAMPLE events: 3644 On a Intel Nehalem or even AMD64, there are 4 counters capable of measuring cycles, so there is plenty of space to measure those events without multiplexing (even with the NMI watchdog active). And even with multiplexing, we'd expect roughly the same number of samples per event. The root of the problem was that when the event that caused the buffer to become full was not the first event passed on the cmdline, the user notification would get lost. The notification was sent to the file descriptor of the overflowed event but the perf tool was not polling on it. The perf tool aggregates all samples into a single buffer, i.e., the buffer of the first event. Consequently, it assumes notifications for any event will come via that descriptor. The seemingly straight forward solution of moving the waitq into the ringbuffer object doesn't work because of life-time issues. One could perf_event_set_output() on a fd that you're also blocking on and cause the old rb object to be freed while its waitq would still be referenced by the blocked thread -> FAIL. Therefore link all events to the ringbuffer and broadcast the wakeup from the ringbuffer object to all possible events that could be waited upon. This is rather ugly, and we're open to better solutions but it works for now. Reported-by: Stephane Eranian <eranian@google.com> Finished-by: Stephane Eranian <eranian@google.com> Reviewed-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20111126014731.GA7030@quad Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-02clockevents: Set noop handler in clockevents_exchange_device()Thomas Gleixner1-0/+1
If a device is shutdown, then there might be a pending interrupt, which will be processed after we reenable interrupts, which causes the original handler to be run. If the old handler is the (broadcast) periodic handler the shutdown state might hang the kernel completely. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org
2011-12-02tick-broadcast: Stop active broadcast device when replacing itThomas Gleixner1-1/+1
When a better rated broadcast device is installed, then the current active device is not disabled, which results in two running broadcast devices. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org
2011-12-02genirq: Fix race condition when stopping the irq threadIdo Yariv1-1/+4
In irq_wait_for_interrupt(), the should_stop member is verified before setting the task's state to TASK_INTERRUPTIBLE and calling schedule(). In case kthread_stop sets should_stop and wakes up the process after should_stop is checked by the irq thread but before the task's state is changed, the irq thread might never exit: kthread_stop irq_wait_for_interrupt ------------ ---------------------- ... ... while (!kthread_should_stop()) { kthread->should_stop = 1; wake_up_process(k); wait_for_completion(&kthread->exited); ... set_current_state(TASK_INTERRUPTIBLE); ... schedule(); } Fix this by checking if the thread should stop after modifying the task's state. [ tglx: Simplified it a bit ] Signed-off-by: Ido Yariv <ido@wizery.com> Link: http://lkml.kernel.org/r/1322740508-22640-1-git-send-email-ido@wizery.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@kernel.org
2011-12-01trace_events_filter: Use rcu_assign_pointer() when setting ftrace_event_call->filterTejun Heo1-3/+3
ftrace_event_call->filter is sched RCU protected but didn't use rcu_assign_pointer(). Use it. TODO: Add proper __rcu annotation to call->filter and all its users. -v2: Use RCU_INIT_POINTER() for %NULL clearing as suggested by Eric. Link: http://lkml.kernel.org/r/20111123164949.GA29639@google.com Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@kernel.org # (2.6.39+) Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-12-01clocksource: Fix bug with max_deferment margin calculationYang Honggang (Joseph)1-2/+2
In order to leave a margin of 12.5% we should >> 3 not >> 5. CC: stable@kernel.org Signed-off-by: Yang Honggang (Joseph) <eagle.rtlinux@gmail.com> [jstultz: Modified commit subject] Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-11-29Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds1-6/+10
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: Update comments describing device power management callbacks PM / Sleep: Update documentation related to system wakeup PM / Runtime: Make documentation follow the new behavior of irq_safe PM / Sleep: Correct inaccurate information in devices.txt PM / Domains: Document how PM domains are used by the PM core PM / Hibernate: Do not leak memory in error/test code paths
2011-11-28Merge branch 'for-3.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroupLinus Torvalds1-2/+9
* 'for-3.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup_freezer: fix freezing groups with stopped tasks
2011-11-28Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds3-13/+143
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: hrtimer: Fix extra wakeups from __remove_hrtimer() timekeeping: add arch_offset hook to ktime_get functions clocksource: Avoid selecting mult values that might overflow when adjusted time: Improve documentation of timekeeeping_adjust()
2011-11-28Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+1
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Don't allow per cpu interrupts to be suspended
2011-11-28genirq: fix regression in irqfixup, irqpollEdward Donovan1-1/+3
Commit fa27271bc8d2("genirq: Fixup poll handling") introduced a regression that broke irqfixup/irqpoll for some hardware configurations. Amidst reorganizing 'try_one_irq', that patch removed a test that checked for 'action->handler' returning IRQ_HANDLED, before acting on the interrupt. Restoring this test back returns the functionality lost since 2.6.39. In the current set of tests, after 'action' is set, it must precede '!action->next' to take effect. With this and my previous patch to irq/spurious.c, c75d720fca8a, all IRQ regressions that I have encountered are fixed. Signed-off-by: Edward Donovan <edward.donovan@numble.net> Reported-and-tested-by: Rogério Brito <rbrito@ime.usp.br> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@kernel.org (2.6.39+) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-24cgroup_freezer: fix freezing groups with stopped tasksMichal Hocko1-2/+9
2d3cbf8b (cgroup_freezer: update_freezer_state() does incorrect state transitions) removed is_task_frozen_enough and replaced it with a simple frozen call. This, however, breaks freezing for a group with stopped tasks because those cannot be frozen and so the group remains in CGROUP_FREEZING state (update_if_frozen doesn't count stopped tasks) and never reaches CGROUP_FROZEN. Let's add is_task_frozen_enough back and use it at the original locations (update_if_frozen and try_to_freeze_cgroup). Semantically we consider stopped tasks as frozen enough so we should consider both cases when testing frozen tasks. Testcase: mkdir /dev/freezer mount -t cgroup -o freezer none /dev/freezer mkdir /dev/freezer/foo sleep 1h & pid=$! kill -STOP $pid echo $pid > /dev/freezer/foo/tasks echo FROZEN > /dev/freezer/foo/freezer.state while true do cat /dev/freezer/foo/freezer.state [ "`cat /dev/freezer/foo/freezer.state`" = "FROZEN" ] && break sleep 1 done echo OK Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Tomasz Buchert <tomasz.buchert@inria.fr> Cc: Paul Menage <paul@paulmenage.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@kernel.org Signed-off-by: Tejun Heo <htejun@gmail.com>
2011-11-23PM / Hibernate: Do not leak memory in error/test code pathsRafael J. Wysocki1-6/+10
The hibernation core code forgets to release memory preallocated for hibernation if there's an error in its early stages or if test modes causing hibernation_snapshot() to return early are used. This causes the system to be hardly usable, because the amount of preallocated memory is usually huge. Fix this problem. Reported-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
2011-11-22Merge branch 'writeback-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linuxLinus Torvalds1-5/+0
* 'writeback-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux: writeback: remove vm_dirties and task->dirties writeback: hard throttle 1000+ dd on a slow USB stick mm: Make task in balance_dirty_pages() killable
2011-11-20Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds2-7/+19
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / Suspend: Fix bug in suspend statistics update PM / Hibernate: Fix the early termination of test modes PM / shmobile: Fix build of sh7372_pm_init() for CONFIG_PM unset PM Sleep: Do not extend wakeup paths to devices with ignore_children set PM / driver core: disable device's runtime PM during shutdown PM / devfreq: correct Kconfig dependency PM / devfreq: fix use after free in devfreq_remove_device PM / shmobile: Avoid restoring the INTCS state during initialization PM / devfreq: Remove compiler error after irq.h update PM / QoS: Properly use the WARN() macro in dev_pm_qos_add_request() PM / Clocks: Only disable enabled clocks in pm_clk_suspend() ARM: mach-shmobile: sh7372 A3SP no_suspend_console fix PM / shmobile: Don't skip debugging output in pd_power_up()
2011-11-19PM / Suspend: Fix bug in suspend statistics updateSrivatsa S. Bhat1-1/+2
After commit 2a77c46de1e3dace73745015635ebbc648eca69c (PM / Suspend: Add statistics debugfs file for suspend to RAM) a missing pair of braces inside the state_store() function causes even invalid arguments to suspend to be wrongly treated as failed suspend attempts. Fix this. [rjw: Put the hash/subject of the buggy commit into the changelog.] Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-11-19hrtimer: Fix extra wakeups from __remove_hrtimer()Jeff Ohlstein1-2/+4
__remove_hrtimer() attempts to reprogram the clockevent device when the timer being removed is the next to expire. However, __remove_hrtimer() reprograms the clockevent *before* removing the timer from the timerqueue and thus when hrtimer_force_reprogram() finds the next timer to expire it finds the timer we're trying to remove. This is especially noticeable when the system switches to NOHz mode and the system tick is removed. The timer tick is removed from the system but the clockevent is programmed to wakeup in another HZ anyway. Silence the extra wakeup by removing the timer from the timerqueue before calling hrtimer_force_reprogram() so that we actually program the clockevent for the next timer to expire. This was broken by 998adc3 "hrtimers: Convert hrtimers to use timerlist infrastructure". Signed-off-by: Jeff Ohlstein <johlstei@codeaurora.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1321660030-8520-1-git-send-email-johlstei@codeaurora.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-11-18PM / Hibernate: Fix the early termination of test modesSrivatsa S. Bhat1-6/+17
Commit 2aede851ddf08666f68ffc17be446420e9d2a056 (PM / Hibernate: Freeze kernel threads after preallocating memory) postponed the freezing of kernel threads to after preallocating memory for hibernation. But while doing that, the hibernation test TEST_FREEZER and the test mode HIBERNATION_TESTPROC were not moved accordingly. As a result, when using these test modes, it only goes upto the freezing of userspace and exits, when in fact it should go till the complete end of task freezing stage, namely the freezing of kernel threads as well. So, move these points of exit to appropriate places so that freezing of kernel threads is also tested while using these test harnesses. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-11-17timekeeping: add arch_offset hook to ktime_get functionsHector Palacios1-0/+4
ktime_get and ktime_get_ts were calling timekeeping_get_ns() but later they were not calling arch_gettimeoffset() so architectures using this mechanism returned 0 ns when calling these functions. This happened for example when running Busybox's ping which calls syscall(__NR_clock_gettime, CLOCK_MONOTONIC, ts) which eventually calls ktime_get. As a result the returned ping travel time was zero. CC: stable@kernel.org Signed-off-by: Hector Palacios <hector.palacios@digi.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-11-17genirq: Don't allow per cpu interrupts to be suspendedMarc Zyngier1-1/+1
The power management functions related to interrupts do not know (yet) about per-cpu interrupts and end up calling the wrong low-level methods to enable/disable interrupts. This leads to all kind of interesting issues (action taken on one CPU only, updating a refcount which is not used otherwise...). The workaround for the time being is simply to flag these interrupts with IRQF_NO_SUSPEND. At least on ARM, these interrupts are actually dealt with at the architecture level. Reported-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1321446459-31409-1-git-send-email-marc.zyngier@arm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-11-17Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+1
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Fix irqfixup, irqpoll regression
2011-11-17writeback: remove vm_dirties and task->dirtiesWu Fengguang1-5/+0
They are not used any more. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-11-16sched: Fix buglet in return_cfs_rq_runtime()Paul Turner1-1/+1
In return_cfs_rq_runtime() we want to return bandwidth when there are no remaining tasks, not "return" when this is the case. Signed-off-by: Paul Turner <pjt@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20111108042736.623812423@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-11-16sched: Avoid SMT siblings in select_idle_sibling() if possiblePeter Zijlstra1-14/+28
Avoid select_idle_sibling() from picking a sibling thread if there's an idle core that shares cache. This fixes SMT balancing in the increasingly common case where there's a shared cache core available to balance to. Tested-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1321350377.1421.55.camel@twins Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-11-14perf: Do not set task_ctx pointer in cpuctx if there are no events in the contextGleb Natapov1-1/+2
Do not set task_ctx pointer during sched_in if there are no events associated with the context. Otherwise if during task execution total number of events in the system will become zero perf_event_context_sched_out() will not be called and cpuctx->task_ctx will be left with a stale value. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20111023171033.GI17571@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-11-14sched: Set the command name of the idle tasks in SMP kernelsCarsten Emde1-0/+4
In UP systems, the idle task is initialized using the init_task structure from which the command name is taken (currently "swapper"). In SMP systems, one idle task per CPU is forked by the worker thread from which the task structure is copied. The command name is, therefore, "kworker/0:0" or "kworker/0:1", if not updated. Since such update was lacking, all idle tasks in SMP systems were incorrectly named. This longtime bug was not discovered immediately, because there is no /proc/0 entry - the bug only becomes apparent when tracing is enabled. This patch sets the command name of the idle tasks in SMP systems to the name that is used in the INIT_TASK structure suffixed by a slash and the number of the CPU. Signed-off-by: Carsten Emde <C.Emde@osadl.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20111026211708.768925506@osadl.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-11-14sched, rt: Provide means of disabling cross-cpu bandwidth sharingPeter Zijlstra2-0/+4
Normally the RT bandwidth scheme will share bandwidth across the entire root_domain. However sometimes its convenient to disable this sharing for debug purposes. Provide a simple feature switch to this end. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-11-14sched: Document wait_for_completion_*() return valuesJ. Bruce Fields1-0/+13
The return-value convention for these functions varies depending on whether they're interruptible or can timeout. It can be a little confusing--document it. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20111006192246.GB28026@fieldses.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-11-14sched_fair: Fix a typo in the comment describing update_sd_lb_statsHui Kang1-1/+1
Signed-off-by: Hui Kang <hkang.sunysb@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1318388459-4427-1-git-send-email-hkang.sunysb@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-11-14sched: Add a comment to effective_load() since it's a painPeter Zijlstra1-18/+95
Every time I have to stare at this function I need to completely reverse engineer its workings, about time I write a comment explaining the thing. Collected bits and pieces from previous changelogs, mostly: 4be9daaa1b33701f011f4117f22dc1e45a3e6e34 83378269a5fad98f562ebc0f09c349575e6cbfe1 Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1318518057.27731.2.camel@twins Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-11-11Merge branch 'formingo/3.2/tip/timers/core' of git://git.linaro.org/people/jstultz/linux into timers/coreIngo Molnar2-10/+55
Conflicts: kernel/time/timekeeping.c
2011-11-10clocksource: Avoid selecting mult values that might overflow when adjustedJohn Stultz2-10/+55
For some frequencies, the clocks_calc_mult_shift() function will unfortunately select mult values very close to 0xffffffff. This has the potential to overflow when NTP adjusts the clock, adding to the mult value. This patch adds a clocksource.maxadj value, which provides an approximation of an 11% adjustment(NTP limits adjustments to 500ppm and the tick adjustment is limited to 10%), which could be made to the clocksource.mult value. This is then used to both check that the current mult value won't overflow/underflow, as well as warning us if the timekeeping_adjust() code pushes over that 11% boundary. v2: Fix max_adjustment calculation, and improve WARN_ONCE messages. v3: Don't warn before maxadj has actually been set CC: Yong Zhang <yong.zhang0@gmail.com> CC: David Daney <ddaney.cavm@gmail.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Chen Jie <chenj@lemote.com> CC: zhangfx <zhangfx@lemote.com> CC: stable@kernel.org Reported-by: Chen Jie <chenj@lemote.com> Reported-by: zhangfx <zhangfx@lemote.com> Tested-by: Yong Zhang <yong.zhang0@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-11-07PM / QoS: Set cpu_dma_pm_qos->nameDominik Brodowski1-0/+1
Since commit 4a31a334, the name of this misc device is not initialized, which leads to a funny device named /dev/(null) being created and /proc/misc containing an entry with just a number but no name. The latter leads to complaints by cryptsetup, which caused me to investigate this matter. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-11-06Merge branch 'upstream/jump-label-noearly' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xenLinus Torvalds1-16/+21
* 'upstream/jump-label-noearly' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: jump-label: initialize jump-label subsystem much earlier x86/jump_label: add arch_jump_label_transform_static() s390/jump-label: add arch_jump_label_transform_static() jump_label: add arch_jump_label_transform_static() to optimise non-live code updates sparc/jump_label: drop arch_jump_label_text_poke_early() x86/jump_label: drop arch_jump_label_text_poke_early() jump_label: if a key has already been initialized, don't nop it out stop_machine: make stop_machine safe and efficient to call early jump_label: use proper atomic_t initializer Conflicts: - arch/x86/kernel/jump_label.c Added __init_or_module to arch_jump_label_text_poke_early vs removal of that function entirely - kernel/stop_machine.c same patch ("stop_machine: make stop_machine safe and efficient to call early") merged twice, with whitespace fix in one version
2011-11-06Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linuxLinus Torvalds78-65/+83
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h
2011-11-06Merge branch 'writeback-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linuxLinus Torvalds1-0/+3
* 'writeback-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux: writeback: Add a 'reason' to wb_writeback_work writeback: send work item to queue_io, move_expired_inodes writeback: trace event balance_dirty_pages writeback: trace event bdi_dirty_ratelimit writeback: fix ppc compile warnings on do_div(long long, unsigned long) writeback: per-bdi background threshold writeback: dirty position control - bdi reserve area writeback: control dirty pause time writeback: limit max dirty pause time writeback: IO-less balance_dirty_pages() writeback: per task dirty rate limit writeback: stabilize bdi->dirty_ratelimit writeback: dirty rate control writeback: add bg_threshold parameter to __bdi_update_bandwidth() writeback: dirty position control writeback: account per-bdi accumulated dirtied pages
2011-11-06Merge git://github.com/rustyrussell/linuxLinus Torvalds2-4/+9
* git://github.com/rustyrussell/linux: module,bug: Add TAINT_OOT_MODULE flag for modules not built in-tree module: Enable dynamic debugging regardless of taint
2011-11-06Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpcLinus Torvalds1-2/+0
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (106 commits) powerpc/p3060qds: Add support for P3060QDS board powerpc/83xx: Add shutdown request support to MCU handling on MPC8349 MITX powerpc/85xx: Make kexec to interate over online cpus powerpc/fsl_booke: Fix comment in head_fsl_booke.S powerpc/85xx: issue 15 EOI after core reset for FSL CoreNet devices powerpc/8xxx: Fix interrupt handling in MPC8xxx GPIO driver powerpc/85xx: Add 'fsl,pq3-gpio' compatiable for GPIO driver powerpc/86xx: Correct Gianfar support for GE boards powerpc/cpm: Clear muram before it is in use. drivers/virt: add ioctl for 32-bit compat on 64-bit to fsl-hv-manager powerpc/fsl_msi: add support for "msi-address-64" property powerpc/85xx: Setup secondary cores PIR with hard SMP id powerpc/fsl-booke: Fix settlbcam for 64-bit powerpc/85xx: Adding DCSR node to dtsi device trees powerpc/85xx: clean up FPGA device tree nodes for Freecsale QorIQ boards powerpc/85xx: fix PHYS_64BIT selection for P1022DS powerpc/fsl-booke: Fix setup_initial_memory_limit to not blindly map powerpc: respect mem= setting for early memory limit setup powerpc: Update corenet64_smp_defconfig powerpc: Update mpc85xx/corenet 32-bit defconfigs ... Fix up trivial conflicts in: - arch/powerpc/configs/40x/hcu4_defconfig removed stale file, edited elsewhere - arch/powerpc/include/asm/udbg.h, arch/powerpc/kernel/udbg.c: added opal and gelic drivers vs added ePAPR driver - drivers/tty/serial/8250.c moved UPIO_TSI to powerpc vs removed UPIO_DWAPB support
2011-11-07module,bug: Add TAINT_OOT_MODULE flag for modules not built in-treeBen Hutchings2-0/+7
Use of the GPL or a compatible licence doesn't necessarily make the code any good. We already consider staging modules to be suspect, and this should also be true for out-of-tree modules which may receive very little review. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Reviewed-by: Dave Jones <davej@redhat.com> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (patched oops-tracing.txt)