linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2010-09-09	perf, trace: Fix module leak	Li Zefan	1	-0/+3
	Commit 1c024eca (perf, trace: Optimize tracepoints by using per-tracepoint-per-cpu hlist to track events) caused a module refcount leak. Reported-And-Tested-by: Avi Kivity <avi@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4C7E1F12.8030304@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-08	tracing/kprobe: Fix handling of C-unlike argument names	Masami Hiramatsu	1	-8/+12
	Check the argument name whether it is invalid (not C-like symbol name). This makes event format simple. Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> LKML-Reference: <20100827113912.22882.62313.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-09-08	tracing/kprobes: Fix handling of argument names	Masami Hiramatsu	1	-7/+10
	Set "argN" name for each argument automatically if it has no specified name. Since dynamic trace event(kprobe_events) accepts special characters for its argument, its format can show those special characters (e.g. '$', '%', '+'). However, perf can't parse those format because of the character (especially '%') mess up the format. This sets "argX" name for those arguments if user omitted the argument names. E.g. # echo 'p do_fork %ax IP=%ip $stack' > tracing/kprobe_events # cat tracing/kprobe_events p:kprobes/p_do_fork_0 do_fork arg1=%ax IP=%ip arg3=$stack Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> LKML-Reference: <20100827113906.22882.59312.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-09-08	perf probe: Fix handling of arguments names	Masami Hiramatsu	1	-13/+19
	Don't make argument names from raw parameters (means the parameters are written in kprobe-tracer syntax), because the argument syntax may include special characters. Just leave it, then kprobe-tracer gives a new name. Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100827113859.22882.75598.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-09-08	perf probe: Fix return probe support	Masami Hiramatsu	2	-0/+11
	Fix a bug to support %return probe syntax again. Previous commit 4235b04 has a bug which disables the %return syntax on perf probe. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100827113852.22882.87447.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-09-08	tracing/kprobe: Fix a memory leak in error case	Masami Hiramatsu	1	-3/+3
	Fix a memory leak which happens when a field name conflicts with others. In error case, free_trace_probe() will free all arguments until nr_args, so this increments nr_args the begining of the loop instead of the end. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> LKML-Reference: <20100827113846.22882.12670.stgit@ltc236.sdl.hitachi.co.jp> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-09-08	tracing: Do not allow llseek to set_ftrace_filter	Steven Rostedt	1	-1/+1
	Reading the file set_ftrace_filter does three things. 1) shows whether or not filters are set for the function tracer 2) shows what functions are set for the function tracer 3) shows what triggers are set on any functions 3 is independent from 1 and 2. The way this file currently works is that it is a state machine, and as you read it, it may change state. But this assumption breaks when you use lseek() on the file. The state machine gets out of sync and the t_show() may use the wrong pointer and cause a kernel oops. Luckily, this will only kill the app that does the lseek, but the app dies while holding a mutex. This prevents anyone else from using the set_ftrace_filter file (or any other function tracing file for that matter). A real fix for this is to rewrite the code, but that is too much for a -rc release or stable. This patch simply disables llseek on the set_ftrace_filter() file for now, and we can do the proper fix for the next major release. Reported-by: Robert Swiecki <swiecki@google.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Tavis Ormandy <taviso@google.com> Cc: Eugene Teo <eugene@redhat.com> Cc: vendor-sec@lst.de Cc: <stable@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-09-07	perf: Add a script to show packets processing	Koki Sanagi	3	-0/+477
	Add a perf script which shows packets processing and processed time. It helps us to investigate networking or network devices. If you want to use it, install perf and record perf.data like following. If you set script, perf gathers records until it ends. If not, you must Ctrl-C to stop recording. And if you want a report from record, If you use some options, you can limit the output. Option is below. tx: show only tx packets processing rx: show only rx packets processing dev=: show processing on this device debug: work with debug mode. It shows buffer status. For example, if you want to show received packets processing associated with eth4, 106133.171439sec cpu=0 irq_entry(+0.000msec irq=24:eth4) \| softirq_entry(+0.006msec) \| \|---netif_receive_skb(+0.010msec skb=f2d15900 len=100) \| \| \| skb_copy_datagram_iovec(+0.039msec 10291::10291) \| napi_poll_exit(+0.022msec eth4) This perf script helps us to analyze the processing time of a transmit/receive sequence. Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com> Acked-by: David S. Miller <davem@davemloft.net> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Kaneshige Kenji <kaneshige.kenji@jp.fujitsu.com> Cc: Izumo Taku <izumi.taku@jp.fujitsu.com> Cc: Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Scott Mcmillan <scott.a.mcmillan@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <4C72439D.3040001@jp.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-09-07	skb: Add tracepoints to freeing skb	Koki Sanagi	4	-0/+21
	This patch adds tracepoint to consume_skb and add trace_kfree_skb before __kfree_skb in skb_free_datagram_locked and net_tx_action. Combinating with tracepoint on dev_hard_start_xmit, we can check how long it takes to free transmitted packets. And using it, we can calculate how many packets driver had at that time. It is useful when a drop of transmitted packet is a problem. sshd-6828 [000] 112689.258154: consume_skb: skbaddr=f2d99bb8 Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Neil Horman <nhorman@tuxdriver.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Kaneshige Kenji <kaneshige.kenji@jp.fujitsu.com> Cc: Izumo Taku <izumi.taku@jp.fujitsu.com> Cc: Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Scott Mcmillan <scott.a.mcmillan@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> LKML-Reference: <4C724364.50903@jp.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-09-07	netdev: Add tracepoints to netdev layer	Koki Sanagi	3	-0/+89
	This patch adds tracepoint to dev_queue_xmit, dev_hard_start_xmit, netif_rx and netif_receive_skb. These tracepoints help you to monitor network driver's input/output. <idle>-0 [001] 112447.902030: netif_rx: dev=eth1 skbaddr=f3ef0900 len=84 <idle>-0 [001] 112447.902039: netif_receive_skb: dev=eth1 skbaddr=f3ef0900 len=84 sshd-6828 [000] 112447.903257: net_dev_queue: dev=eth4 skbaddr=f3fca538 len=226 sshd-6828 [000] 112447.903260: net_dev_xmit: dev=eth4 skbaddr=f3fca538 len=226 rc=0 Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Neil Horman <nhorman@tuxdriver.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Kaneshige Kenji <kaneshige.kenji@jp.fujitsu.com> Cc: Izumo Taku <izumi.taku@jp.fujitsu.com> Cc: Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Scott Mcmillan <scott.a.mcmillan@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> LKML-Reference: <4C72431E.3000901@jp.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-09-07	napi: Convert trace_napi_poll to TRACE_EVENT	Neil Horman	1	-2/+23
	This patch converts trace_napi_poll from DECLARE_EVENT to TRACE_EVENT to improve the usability of napi_poll tracepoint. <idle>-0 [001] 241302.750777: napi_poll: napi poll on napi struct f6acc480 for device eth3 <idle>-0 [000] 241302.852389: napi_poll: napi poll on napi struct f5d0d70c for device eth1 The original patch is below: http://marc.info/?l=linux-kernel&m=126021713809450&w=2 [ sanagi.koki@jp.fujitsu.com: And add a fix by Steven Rostedt: http://marc.info/?l=linux-kernel&m=126150506519173&w=2 ] Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Neil Horman <nhorman@tuxdriver.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Kaneshige Kenji <kaneshige.kenji@jp.fujitsu.com> Cc: Izumo Taku <izumi.taku@jp.fujitsu.com> Cc: Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Scott Mcmillan <scott.a.mcmillan@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> LKML-Reference: <4C7242D7.4050009@jp.fujitsu.com> Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-09-07	irq: Add tracepoint to softirq_raise	Lai Jiangshan	2	-3/+31
	Add a tracepoint for tracing when softirq action is raised. This and the existing tracepoints complete softirq's tracepoints: softirq_raise, softirq_entry and softirq_exit. And when this tracepoint is used in combination with the softirq_entry tracepoint we can determine the softirq raise latency. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Cc: David Miller <davem@davemloft.net> Cc: Kaneshige Kenji <kaneshige.kenji@jp.fujitsu.com> Cc: Izumo Taku <izumi.taku@jp.fujitsu.com> Cc: Kosaki Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Scott Mcmillan <scott.a.mcmillan@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> LKML-Reference: <4C724298.4050509@jp.fujitsu.com> [ factorize softirq events with DECLARE_EVENT_CLASS ] Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-09-03	perf, x86: Try to handle unknown nmis with an enabled PMU	Robert Richter	1	-13/+46
	When the PMU is enabled it is valid to have unhandled nmis, two events could trigger 'simultaneously' raising two back-to-back NMIs. If the first NMI handles both, the latter will be empty and daze the CPU. The solution to avoid an 'unknown nmi' massage in this case was simply to stop the nmi handler chain when the PMU is enabled by stating the nmi was handled. This has the drawback that a) we can not detect unknown nmis anymore, and b) subsequent nmi handlers are not called. This patch addresses this. Now, we check this unknown NMI if it could be a PMU back-to-back NMI. Otherwise we pass it and let the kernel handle the unknown nmi. This is a debug log: cpu #6, nmi #32333, skip_nmi #32330, handled = 1, time = 1934364430 cpu #6, nmi #32334, skip_nmi #32330, handled = 1, time = 1934704616 cpu #6, nmi #32335, skip_nmi #32336, handled = 2, time = 1936032320 cpu #6, nmi #32336, skip_nmi #32336, handled = 0, time = 1936034139 cpu #6, nmi #32337, skip_nmi #32336, handled = 1, time = 1936120100 cpu #6, nmi #32338, skip_nmi #32336, handled = 1, time = 1936404607 cpu #6, nmi #32339, skip_nmi #32336, handled = 1, time = 1937983416 cpu #6, nmi #32340, skip_nmi #32341, handled = 2, time = 1938201032 cpu #6, nmi #32341, skip_nmi #32341, handled = 0, time = 1938202830 cpu #6, nmi #32342, skip_nmi #32341, handled = 1, time = 1938443743 cpu #6, nmi #32343, skip_nmi #32341, handled = 1, time = 1939956552 cpu #6, nmi #32344, skip_nmi #32341, handled = 1, time = 1940073224 cpu #6, nmi #32345, skip_nmi #32341, handled = 1, time = 1940485677 cpu #6, nmi #32346, skip_nmi #32347, handled = 2, time = 1941947772 cpu #6, nmi #32347, skip_nmi #32347, handled = 1, time = 1941949818 cpu #6, nmi #32348, skip_nmi #32347, handled = 0, time = 1941951591 Uhhuh. NMI received for unknown reason 00 on CPU 6. Do you have a strange power saving mode enabled? Dazed and confused, but trying to continue Deltas: nmi #32334 340186 nmi #32335 1327704 nmi #32336 1819 <<<< back-to-back nmi [1] nmi #32337 85961 nmi #32338 284507 nmi #32339 1578809 nmi #32340 217616 nmi #32341 1798 <<<< back-to-back nmi [2] nmi #32342 240913 nmi #32343 1512809 nmi #32344 116672 nmi #32345 412453 nmi #32346 1462095 <<<< 1st nmi (standard) handling 2 counters nmi #32347 2046 <<<< 2nd nmi (back-to-back) handling one counter nmi #32348 1773 <<<< 3rd nmi (back-to-back) handling no counter! [3] For back-to-back nmi detection there are the following rules: The PMU nmi handler was handling more than one counter and no counter was handled in the subsequent nmi (see [1] and [2] above). There is another case if there are two subsequent back-to-back nmis [3]. The 2nd is detected as back-to-back because the first handled more than one counter. If the second handles one counter and the 3rd handles nothing, we drop the 3rd nmi because it could be a back-to-back nmi. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> [ renamed nmi variable to pmu_nmi to avoid clash with .nmi in entry.S ] Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: peterz@infradead.org Cc: gorcunov@gmail.com Cc: fweisbec@gmail.com Cc: ying.huang@intel.com Cc: ming.m.lin@intel.com Cc: eranian@google.com LKML-Reference: <1283454469-1909-3-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-03	perf, x86: Fix handle_irq return values	Peter Zijlstra	2	-3/+8
	Now that we rely on the number of handled overflows, ensure all handle_irq implementations actually return the right number. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: peterz@infradead.org Cc: robert.richter@amd.com Cc: gorcunov@gmail.com Cc: fweisbec@gmail.com Cc: ying.huang@intel.com Cc: ming.m.lin@intel.com Cc: eranian@google.com LKML-Reference: <1283454469-1909-4-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-03	perf, x86: Fix accidentally ack'ing a second event on intel perf counter	Don Zickus	1	-4/+2
	During testing of a patch to stop having the perf subsytem swallow nmis, it was uncovered that Nehalem boxes were randomly getting unknown nmis when using the perf tool. Moving the ack'ing of the PMI closer to when we get the status allows the hardware to properly re-set the PMU bit signaling another PMI was triggered during the processing of the first PMI. This allows the new logic for dealing with the shortcomings of multiple PMIs to handle the extra NMI by 'eat'ing it later. Now one can wonder why are we getting a second PMI when we disable all the PMUs in the begining of the NMI handler to prevent such a case, for that I do not know. But I know the fix below helps deal with this quirk. Tested on multiple Nehalems where the problem was occuring. With the patch, the code now loops a second time to handle the second PMI (whereas before it was not). Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: peterz@infradead.org Cc: robert.richter@amd.com Cc: gorcunov@gmail.com Cc: fweisbec@gmail.com Cc: ying.huang@intel.com Cc: ming.m.lin@intel.com Cc: eranian@google.com LKML-Reference: <1283454469-1909-2-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-01	oprofile, x86: fix init_sysfs() function stub	Robert Richter	1	-2/+4
	The use of the return value of init_sysfs() with commit 10f0412 oprofile, x86: fix init_sysfs error handling discovered the following build error for !CONFIG_PM: .../linux/arch/x86/oprofile/nmi_int.c: In function ‘op_nmi_init’: .../linux/arch/x86/oprofile/nmi_int.c:784: error: expected expression before ‘do’ make[2]: * [arch/x86/oprofile/nmi_int.o] Error 1 make[1]: * [arch/x86/oprofile] Error 2 This patch fixes this. Reported-by: Ingo Molnar <mingo@elte.hu> Cc: stable@kernel.org Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-09-01	ring-buffer: Place duplicate expression into a single function	Steven Rostedt	1	-6/+15
	While discussing the strictness of the 80 character limit on the Kernel Summit Discussion mailing list, I showed examples that I broke that limit slightly with some algorithms. In discussing with John Linville, what looked better, I realized that two of the 80 char breaking culprits were an identical expression. As a clean up, this patch moves the identical expression into its own helper function and that is used instead. As a side effect, the offending code is now under the 80 character limit. :-) This clean up code also changes the expression from (A - B) - C to A - (B + C) This makes the code look a little nicer too. Cc: John W. Linville <linville@tuxdriver.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-09-01	lockup_detector: Sync touch_*_watchdog back to old semantics	Don Zickus	1	-5/+12
	During my rewrite, the semantics of touch_nmi_watchdog and touch_softlockup_watchdog changed enough to break some drivers (mostly over preemptable regions). These are cases where long delays on one CPU (due to print_delay for example) can cause long delays on other CPUs - so we must 'touch' the nmi_watchdog flag of those other CPUs as well. This change brings those touch_*_watchdog() functions back in line with to how they used to work. Signed-off-by: Don Zickus <dzickus@redhat.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: peterz@infradead.org Cc: fweisbec@gmail.com LKML-Reference: <1283310009-22168-2-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-01	perf, x86, Pentium4: Add RAW events verification	Cyrill Gorcunov	2	-54/+290
	Implements verification of - Bits of ESCR EventMask field (meaningful bits in field are hardware predefined and others bits should be set to zero) - INSTR_COMPLETED event (it is available on predefined cpu model only) - Thread shared events (they should be guarded by "perf_event_paranoid" sysctl due to security reason). The side effect of this action is that PERF_COUNT_HW_BUS_CYCLES become a "paranoid" general event. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Tested-by: Lin Ming <ming.m.lin@intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20100825182334.GB14874@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-01	lockup_detector: Remove unused panic_notifier	Akinobu Mita	1	-15/+0
	The panic notifer in lockup_detector just set did_panic to 1. But did_panic is not used anywhere so we can just remove it. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: peterz@infradead.org Cc: gorcunov@gmail.com Cc: fweisbec@gmail.com LKML-Reference: <1283310009-22168-4-git-send-email-dzickus@redhat.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-01	lockup_detector: Convert cpu notifier to return encapsulate errno value	Akinobu Mita	1	-10/+11
	By the commit e6bde73b07edeb703d4c89c1daabc09c303de11f ("cpu-hotplug: return better errno on cpu hotplug failure"), the cpu notifier can return encapsulate errno value, resulting in more meaningful error codes for CPU hotplug failures. This converts the cpu notifier to return encapsulate errno value for the lockup_detector as well. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: peterz@infradead.org Cc: gorcunov@gmail.com Cc: fweisbec@gmail.com LKML-Reference: <1283310009-22168-3-git-send-email-dzickus@redhat.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-31	tracing: Fix a race in function profile	Li Zefan	1	-4/+11
	While we are reading trace_stat/functionX and someone just disabled function_profile at that time, we can trigger this: divide error: 0000 [#1] PREEMPT SMP ... EIP is at function_stat_show+0x90/0x230 ... This fix just takes the ftrace_profile_lock and checks if rec->counter is 0. If it's 0, we know the profile buffer has been reset. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: stable@kernel.org LKML-Reference: <4C723644.4040708@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-08-31	tracing/lockdep: Fix dependency of TRACE_IRQFLAGS	Steven Rostedt	1	-4/+4
	When CONFIG_IRQSOFF_TRACER is set and CONFIG_PROVE_LOCKING is not, we get the following error: $ make oldconfig scripts/kconfig/conf --oldconfig arch/x86/Kconfig warning: (IRQSOFF_TRACER && TRACING_SUPPORT && FTRACE && TRACE_IRQFLAGS_SUPPORT && !ARCH_USES_GETTIMEOFFSET) selects TRACE_IRQFLAGS which has unmet direct dependencies (DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && PROVE_LOCKING) warning: (IRQSOFF_TRACER && TRACING_SUPPORT && FTRACE && TRACE_IRQFLAGS_SUPPORT && !ARCH_USES_GETTIMEOFFSET) selects TRACE_IRQFLAGS which has unmet direct dependencies (DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && PROVE_LOCKING) This is because IRQSOFF_TRACER selects TRACE_IRQFLAGS but TRACE_IRQFLAGS has PROVE_LOCKING as a dependency. This code is incorrect, and this patch changes the TRACE_IRQFLAGS to be just a simple bool that does not depend or select anything. Instead both IRQSOFF_TRACER and PROVE_LOCKING select it. Reported-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-08-31	oprofile, x86: fix init_sysfs error handling	Robert Richter	1	-3/+13
	On failure init_sysfs() might not properly free resources. The error code of the function is not checked. And, when reinitializing the exit function might be called twice. This patch fixes all this. Cc: stable@kernel.org Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-08-30	perf_events: Fix time tracking for events with pid != -1 and cpu != -1	Stephane Eranian	1	-4/+22
	Per-thread events with a cpu filter, i.e., cpu != -1, were not reporting correct timings when the thread never ran on the monitored cpu. The time enabled was reported as a negative value. This patch fixes the problem by updating tstamp_stopped, tstamp_running in event_sched_out() for events with filters and which are marked as INACTIVE. The function group_sched_out() is modified to systematically call into event_sched_out() to avoid duplicating the timing adjustment code twice. With the patch, I now get: $ task_cpu -i -e unhalted_core_cycles,unhalted_core_cycles noploop 2 noploop for 2 seconds CPU0 0 unhalted_core_cycles (ena=1,991,136,594, run=0) CPU0 0 unhalted_core_cycles (ena=1,991,136,594, run=0) CPU1 0 unhalted_core_cycles (ena=1,991,136,594, run=0) CPU1 0 unhalted_core_cycles (ena=1,991,136,594, run=0) CPU2 0 unhalted_core_cycles (ena=1,991,136,594, run=0) CPU2 0 unhalted_core_cycles (ena=1,991,136,594, run=0) CPU3 4,747,990,931 unhalted_core_cycles (ena=1,991,136,594, run=1,991,136,594) CPU3 4,747,990,931 unhalted_core_cycles (ena=1,991,136,594, run=1,991,136,594) Signed-off-by: Stephane Eranian <eranian@gmail.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus@samba.org Cc: davem@davemloft.net Cc: fweisbec@gmail.com Cc: perfmon2-devel@lists.sf.net Cc: eranian@google.com LKML-Reference: <4c76802d.aae9d80a.115d.70fe@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-27	perf: Initialize callchains roots's childen hits	Frederic Weisbecker	1	-0/+1
	Each histogram entry has a callchain root that stores the callchain samples. However we forgot to initialize the tracking of children hits of these roots, which then got random values on their creation. The root children hits is multiplied by the minimum percentage of hits provided by the user, and the result becomes the minimum hits expected from children branches. If the random value due to the uninitialization is big enough, then this minimum number of hits can be huge and eventually filter every children branches. The end result was invisible callchains. All we need to fix this is to initialize the children hits of the root. Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: 2.6.32.x-2.6.35.y <stable@kernel.org>
2010-08-25	perf tools: Fix linking errors with --as-needed flag	Tom Zanussi	1	-6/+15
	External shared libraries should never be appended to the LDFLAGS as this messes the linking order. As EXTLIBS collects those libraries, it seems that perl and python libraries should also be appended to EXTLIBS. Also fix the broken linking order. This is a refresh of a patch by Ozan Çağlayan and improved by both Tom Zanussi and Kirill A. Shutemov. Cc: Ozan Çağlayan <ozan@pardus.org.tr> Tested-by: Kirill A. Shutemov <kirill@shutemov.name> Tested-by: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <1282627430.28324.8.camel@tropicana> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-08-25	perf symbols: List symbols in a dso in ascending name order	Srikar Dronamraju	2	-0/+15
	Given a dso, list the symbols in ascending name order. Needed for listing available symbols from perf probe. Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: "Frank Ch. Eigler" <fche@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com> Cc: Mark Wielaard <mjw@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Naren A Devaiah <naren.devaiah@in.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <20100825134329.5447.92261.sendpatchset@localhost6.localdomain6> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-08-25	perf hists browser: Introduce "expand/collapse all callchains" action	Arnaldo Carvalho de Melo	1	-6/+106
	When looking at a callchains enabled perf data file one can find it tiresome to start with all callchains collapsed and then to have to go one by one expanding them. So associate 'E' with "Expand all callchains" and 'C' with "Collapse all callchains". This way now one can have the top level view and then switch to/from having all callchains expanded. More work is needed to allow expanding just from one branch down to its leaves. Reported-by: Christoph Hellwig <hch@infradead.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-08-25	perf hists browser: Init the has_children fields just once	Arnaldo Carvalho de Melo	1	-14/+4
	Not everytime we show the callchains, removing duplicated initialization of this field. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-08-25	perf hists browser: replace rb_first() != NULL by !RB_EMPTY_ROOT()	Arnaldo Carvalho de Melo	1	-7/+7
	Its way too stupid to use rb_first() for just caching if there are children, use the cheaper RB_EMPTY_ROOT() instead. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-08-25	perf hists: Fix hist_entry__init_have_children	Arnaldo Carvalho de Melo	1	-0/+1
	It wasn't setting the ms.has_children for the hist_entry itself, just for the callchain Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-08-25	perf, x86, Pentium4: Clear the P4_CCCR_FORCE_OVF flag	Lin Ming	1	-0/+2
	If on Pentium4 CPUs the FORCE_OVF flag is set then an NMI happens on every event, which can generate a flood of NMIs. Clear it. Reported-by: Vince Weaver <vweaver1@eecs.utk.edu> Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-25	perf: Remove unused variable	Lin Ming	1	-3/+0
	This fixes the following build warning introduced by the callchain rework: arch/x86/kernel/cpu/perf_event.c:1574: warning: ‘perf_callchain_entry_nmi’ defined but not used Signed-off-by: Lin Ming <ming.m.lin@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1282718949.16443.75.camel@minggr.sh.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-25	tracing/trace_stack: Fix stack trace on ppc64	Anton Blanchard	1	-1/+1
	save_stack_trace() stores the instruction pointer, not the function descriptor. On ppc64 the trace stack code currently dereferences the instruction pointer and shows 8 bytes of instructions in our backtraces: # cat /sys/kernel/debug/tracing/stack_trace Depth Size Location (26 entries) ----- ---- -------- 0) 5424 112 0x6000000048000004 1) 5312 160 0x60000000ebad01b0 2) 5152 160 0x2c23000041c20030 3) 4992 240 0x600000007c781b79 4) 4752 160 0xe84100284800000c 5) 4592 192 0x600000002fa30000 6) 4400 256 0x7f1800347b7407e0 7) 4144 208 0xe89f0108f87f0070 8) 3936 272 0xe84100282fa30000 Since we aren't dealing with function descriptors, use %pS instead of %pF to fix it: # cat /sys/kernel/debug/tracing/stack_trace Depth Size Location (26 entries) ----- ---- -------- 0) 5424 112 ftrace_call+0x4/0x8 1) 5312 160 .current_io_context+0x28/0x74 2) 5152 160 .get_io_context+0x48/0xa0 3) 4992 240 .cfq_set_request+0x94/0x4c4 4) 4752 160 .elv_set_request+0x60/0x84 5) 4592 192 .get_request+0x2d4/0x468 6) 4400 256 .get_request_wait+0x7c/0x258 7) 4144 208 .__make_request+0x49c/0x610 8) 3936 272 .generic_make_request+0x390/0x434 Signed-off-by: Anton Blanchard <anton@samba.org> Cc: rostedt@goodmis.org Cc: fweisbec@gmail.com LKML-Reference: <20100825013238.GE28360@kryten> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-25	oprofile: fix crash when accessing freed task structs	Robert Richter	2	-15/+14
	This patch fixes a crash during shutdown reported below. The crash is caused by accessing already freed task structs. The fix changes the order for registering and unregistering notifier callbacks. All notifiers must be initialized before buffers start working. To stop buffer synchronization we cancel all workqueues, unregister the notifier callback and then flush all buffers. After all of this we finally can free all tasks listed. This should avoid accessing freed tasks. On 22.07.10 01:14:40, Benjamin Herrenschmidt wrote: > So the initial observation is a spinlock bad magic followed by a crash > in the spinlock debug code: > > [ 1541.586531] BUG: spinlock bad magic on CPU#5, events/5/136 > [ 1541.597564] Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6d03 > > Backtrace looks like: > > spin_bug+0x74/0xd4 > ._raw_spin_lock+0x48/0x184 > ._spin_lock+0x10/0x24 > .get_task_mm+0x28/0x8c > .sync_buffer+0x1b4/0x598 > .wq_sync_buffer+0xa0/0xdc > .worker_thread+0x1d8/0x2a8 > .kthread+0xa8/0xb4 > .kernel_thread+0x54/0x70 > > So we are accessing a freed task struct in the work queue when > processing the samples. Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: stable@kernel.org Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-08-24	guard page for stacks that grow upwards	Luck, Tony	3	-8/+18
	pa-risc and ia64 have stacks that grow upwards. Check that they do not run into other mappings. By making VM_GROWSUP 0x0 on architectures that do not ever use it, we can avoid some unpleasant #ifdefs in check_stack_guard_page(). Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-24	drm/i915: fix vblank wait test condition	Jesse Barnes	1	-1/+1
	When converting this to the new wait_for macro I inverted the wait condition, which causes all sorts of problems. So correct it to fix several failures caused by the bad wait (flickering, bad output detection, tearing, etc.). Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-24	intel_scu_ipc: fix IPC i2c write bug	Jianwei Yang	1	-1/+1
	We should pass the data to the data register. Signed-off-by: Jianwei Yang <jianwei.yang@intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-24	rar: Fix off by one error	Ossama Othman	1	-1/+1
	It looks like there is an off-by-one error in one of your changes to drivers/staging/rar_register/rar_register.c: Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-24	V4L/DVB: mantis: Fix IR_CORE dependency	Ingo Molnar	1	-1/+1
	This build bug triggers: drivers/built-in.o: In function `mantis_exit': (.text+0x377413): undefined reference to `ir_input_unregister' drivers/built-in.o: In function `mantis_input_init': (.text+0x3774ff): undefined reference to `__ir_input_register' If MANTIS_CORE is enabled but IR_CORE is not. Add the correct dependency. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-24	[S390] fix tlb flushing vs. concurrent /proc accesses	Martin Schwidefsky	8	-6/+28
	The tlb flushing code uses the mm_users field of the mm_struct to decide if each page table entry needs to be flushed individually with IPTE or if a global flush for the mm_struct is sufficient after all page table updates have been done. The comment for mm_users says "How many users with user space?" but the /proc code increases mm_users after it found the process structure by pid without creating a new user process. Which makes mm_users useless for the decision between the two tlb flusing methods. The current code can be confused to not flush tlb entries by a concurrent access to /proc files if e.g. a fork is in progres. The solution for this problem is to make the tlb flushing logic independent from the mm_users field. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-08-24	[S390] s390: fix build error (sys_execve)	Sebastian Ott	1	-2/+2
	fix this build error: arch/s390/kernel/process.c:272: error: conflicting types for 'sys_execve' arch/s390/kernel/entry.h:45: error: previous declaration of 'sys_execve' was here make[1]: * [arch/s390/kernel/process.o] Error 1 make: * [arch/s390/kernel] Error 2 introduced by d7627467b7a8dd6944885290a03a07ceb28c10eb Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-08-23	sparc64: Get rid of indirect p1275 PROM call buffer.	David S. Miller	7	-297/+456
	This is based upon a report by Meelis Roos showing that it's possible that we'll try to fetch a property that is 32K in size with some devices. With the current fixed 3K buffer we use for moving data in and out of the firmware during PROM calls, that simply won't work. In fact, it will scramble random kernel data during bootup. The reasoning behind the temporary buffer is entirely historical. It used to be the case that we had problems referencing dynamic kernel memory (including the stack) early in the boot process before we explicitly told the firwmare to switch us over to the kernel trap table. So what we did was always give the firmware buffers that were locked into the main kernel image. But we no longer have problems like that, so get rid of all of this indirect bounce buffering. Besides fixing Meelis's bug, this also makes the kernel data about 3K smaller. It was also discovered during these conversions that the implementation of prom_retain() was completely wrong, so that was fixed here as well. Currently that interface is not in use. Reported-by: Meelis Roos <mroos@linux.ee> Tested-by: Meelis Roos <mroos@linux.ee> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-08-24	powerpc: Fix config dependency problem with MPIC_U3_HT_IRQS	Andreas Schwab	1	-2/+1
	MPIC_U3_HT_IRQS is selected both by PPC_PMAC64 and PPC_MAPLE, but depends on PPC_MAPLE, so a PPC_PMAC64-only config gets this warning: warning: (PPC_PMAC64 && PPC_PMAC && POWER4 \|\| PPC_MAPLE && PPC64 && PPC_BOOK3S) selects MPIC_U3_HT_IRQS which has unmet direct dependencies (PPC_MAPLE) Fix that by removing the dependency on PPC_MAPLE. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24	via-pmu: Add compat_pmu_ioctl	Andreas Schwab	1	-0/+42
	The ioctls are actually compatible, but due to historical mistake the numbers differ between 32bit and 64bit. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24	powerpc: Wire up fanotify_init, fanotify_mark, prlimit64 syscalls	Andreas Schwab	3	-1/+15
	Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24	powerpc/pci: Fix checking for child bridges in PCI code.	Grant Likely	1	-1/+2
	pci_device_to_OF_node() can return null, and list_for_each_entry will never enter the loop when dev is NULL, so it looks like this test is a typo. Reported-by: Julia Lawall <julia@diku.dk> Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24	powerpc: Fix typo in uImage target	Anatolij Gustschin	1	-1/+1
	Commit e32e78c5ee8aadef020fbaecbe6fb741ed9029fd (powerpc: fix build with make 3.82) introduced a typo in uImage target and broke building uImage: make: *** No rule to make target `uImage'. Stop. Signed-off-by: Anatolij Gustschin <agust@denx.de> Cc: stable <stable@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24	powerpc: Initialise paca->kstack before early_setup_secondary	Matt Evans	1	-3/+3
	As early setup calls down to slb_initialize(), we must have kstack initialised before checking "should we add a bolted SLB entry for our kstack?" Failing to do so means stack access requires an SLB miss exception to refill an entry dynamically, if the stack isn't accessible via SLB(0) (kernel text & static data). It's not always allowable to take such a miss, and intermittent crashes will result. Primary CPUs don't have this issue; an SLB entry is not bolted for their stack anyway (as that lives within SLB(0)). This patch therefore only affects the init of secondaries. Signed-off-by: Matt Evans <matt@ozlabs.org> Cc: stable <stable@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>