linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2014-08-26	alpha: Replace __get_cpu_var	Christoph Lameter	1	-8/+8
	__get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset. Other use cases are for storing and retrieving data from the current processors percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment. __get_cpu_var() is defined as : #define __get_cpu_var(var) (this_cpu_ptr(&(var))) __get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation. this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables. This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and less registers are used when code is generated. At the end of the patch set all uses of __get_cpu_var have been removed so the macro is removed too. The patch set includes passes over all arches as well. Once these operations are used throughout then specialized macros can be defined in non -x86 arches as well in order to optimize per cpu access by f.e. using a global register that may be set to the per cpu base. Transformations done to __get_cpu_var() 1. Determine the address of the percpu instance of the current processor. DEFINE_PER_CPU(int, y); int x = &__get_cpu_var(y); Converts to int x = this_cpu_ptr(&y); 2. Same as #1 but this time an array structure is involved. DEFINE_PER_CPU(int, y[20]); int x = __get_cpu_var(y); Converts to int *x = this_cpu_ptr(y); 3. Retrieve the content of the current processors instance of a per cpu variable. DEFINE_PER_CPU(int, y); int x = __get_cpu_var(y) Converts to int x = __this_cpu_read(y); 4. Retrieve the content of a percpu struct DEFINE_PER_CPU(struct mystruct, y); struct mystruct x = __get_cpu_var(y); Converts to memcpy(&x, this_cpu_ptr(&y), sizeof(x)); 5. Assignment to a per cpu variable DEFINE_PER_CPU(int, y) __get_cpu_var(y) = x; Converts to __this_cpu_write(y, x); 6. Increment/Decrement etc of a per cpu variable DEFINE_PER_CPU(int, y); __get_cpu_var(y)++ Converts to __this_cpu_inc(y) CC: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Acked-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2013-11-16	alpha: perf: fix out-of-bounds array access triggered from raw event	Will Deacon	1	-2/+13
	Vince's perf fuzzer uncovered the following issue on Alpha: Unable to handle kernel paging request at virtual address fffffbfe4e46a0e8 CPU 0 perf_fuzzer(1278): Oops 0 pc = [<fffffc000031fbc0>] ra = [<fffffc000031ff54>] ps = 0007 Not tainted pc is at alpha_perf_event_set_period+0x60/0xf0 ra is at alpha_pmu_enable+0x1a4/0x1c0 v0 = 0000000000000000 t0 = 00000000000fffff t1 = fffffc007b3f5800 t2 = fffffbff275faa94 t3 = ffffffffc9b9bd89 t4 = fffffbfe4e46a098 t5 = 0000000000000020 t6 = fffffbfe4e46a0b8 t7 = fffffc007f4c8000 s0 = 0000000000000000 s1 = fffffc0001b0c018 s2 = fffffc0001b0c020 s3 = fffffc007b3f5800 s4 = 0000000000000001 s5 = ffffffffc9b9bd85 s6 = 0000000000000001 a0 = 0000000000000006 a1 = fffffc007b3f5908 a2 = fffffbfe4e46a098 a3 = 00000005000108c0 a4 = 0000000000000000 a5 = 0000000000000000 t8 = 0000000000000001 t9 = 0000000000000001 t10= 0000000027829f6f t11= 0000000000000020 pv = fffffc000031fb60 at = fffffc0000950900 gp = fffffc0000940900 sp = fffffc007f4cbca8 Disabling lock debugging due to kernel taint Trace: [<fffffc000031ff54>] alpha_pmu_enable+0x1a4/0x1c0 [<fffffc000039f4e8>] perf_pmu_enable+0x48/0x60 [<fffffc00003a0d6c>] __perf_install_in_context+0x15c/0x230 [<fffffc000039d1f0>] remote_function+0x80/0xa0 [<fffffc00003a0c10>] __perf_install_in_context+0x0/0x230 [<fffffc000037b7e4>] smp_call_function_single+0x1b4/0x1d0 [<fffffc000039bb70>] task_function_call+0x60/0x80 [<fffffc00003a0c10>] __perf_install_in_context+0x0/0x230 [<fffffc000039bb44>] task_function_call+0x34/0x80 [<fffffc000039d3fc>] perf_install_in_context+0x9c/0x150 [<fffffc00003a0c10>] __perf_install_in_context+0x0/0x230 [<fffffc00003a5100>] SYSC_perf_event_open+0x360/0xac0 [<fffffc00003110c4>] entSys+0xa4/0xc0 This is due to the raw event encoding being used as an index directly into the ev67_mapping array, rather than being validated against the ev67_pmc_event_type enumeration instead. Unlike other architectures, which allow raw events to propagate into the hardware counters with little interference, the limited number of events on Alpha and the strict event <-> counter relationships mean that raw events actually correspond to the Linux-specific Alpha events, rather than anything defined by the architecture. This patch adds a new callback to alpha_pmu_t for validating the raw event encoding with the Linux event types for the PMU, preventing the out-of-bounds array access. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Michael Cree <mcree@orcon.net.nz> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2012-05-09	perf: Pass last sampling period to perf_sample_data_init()	Robert Richter	1	-2/+1
	We always need to pass the last sample period to perf_sample_data_init(), otherwise the event distribution will be wrong. Thus, modifiyng the function interface with the required period as argument. So basically a pattern like this: perf_sample_data_init(&data, ~0ULL); data.period = event->hw.last_period; will now be like that: perf_sample_data_init(&data, ~0ULL, event->hw.last_period); Avoids unininitialized data.period and simplifies code. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-3-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-03-05	perf: Disable PERF_SAMPLE_BRANCH_* when not supported	Stephane Eranian	1	-0/+4
	PERF_SAMPLE_BRANCH_* is disabled for: - SW events (sw counters, tracepoints) - HW breakpoints - ALL but Intel x86 architecture - AMD64 processors Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1328826068-11713-10-git-send-email-eranian@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-26	atomic: use <linux/atomic.h>	Arun Sharma	1	-1/+1
	This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: Arun Sharma <asharma@fb.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-01	perf: Remove the nmi parameter from the swevent and overflow interface	Peter Zijlstra	1	-1/+1
	The nmi parameter indicated if we could do wakeups from the current context, if not, we would set some state and self-IPI and let the resulting interrupt do the wakeup. For the various event classes: - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from the PMI-tail (ARM etc.) - tracepoint: nmi=0; since tracepoint could be from NMI context. - software: nmi=[0,1]; some, like the schedule thing cannot perform wakeups, and hence need 0. As one can see, there is very little nmi=1 usage, and the down-side of not using it is that on some platforms some software events can have a jiffy delay in wakeup (when arch_irq_work_raise isn't implemented). The up-side however is that we can remove the nmi parameter and save a bunch of conditionals in fast paths. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Michael Cree <mcree@orcon.net.nz> Cc: Will Deacon <will.deacon@arm.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: Anton Blanchard <anton@samba.org> Cc: Eric B Munson <emunson@mgebm.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-16	perf: Dynamic pmu types	Peter Zijlstra	1	-1/+1
	Extend the perf_pmu_register() interface to allow for named and dynamic pmu types. Because we need to support the existing static types we cannot use dynamic types for everything, hence provide a type argument. If we want to enumerate the PMUs they need a name, provide one. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20101117222056.259707703@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-26	perf, arch: Cleanup perf-pmu init vs lockup-detector	Peter Zijlstra	1	-3/+6
	The perf hardware pmu got initialized at various points in the boot, some before early_initcall() some after (notably arch_initcall). The problem is that the NMI lockup detector is ran from early_initcall() and expects the hardware pmu to be present. Sanitize this by moving all architecture hardware pmu implementations to initialize at early_initcall() and move the lockup detector to an explicit initcall right after that. Cc: paulus <paulus@samba.org> Cc: davem <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1290707759.2145.119.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-15	alpha: Fix HW performance counters to be stopped properly	Michael Cree	1	-9/+10
	Also fix a few compile errors due to undefined and duplicated variables. Signed-off-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1284269844-23251-1-git-send-email-mcree@orcon.net.nz> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-15	Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core	Ingo Molnar	1	-9/+9

2010-09-09	perf: Remove the sysfs bits	Peter Zijlstra	1	-2/+1
	Neither the overcommit nor the reservation sysfs parameter were actually working, remove them as they'll only get in the way. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09	perf: Rework the PMU methods	Peter Zijlstra	1	-18/+53
	Replace pmu::{enable,disable,start,stop,unthrottle} with pmu::{add,del,start,stop}, all of which take a flags argument. The new interface extends the capability to stop a counter while keeping it scheduled on the PMU. We replace the throttled state with the generic stopped state. This also allows us to efficiently stop/start counters over certain code paths (like IRQ handlers). It also allows scheduling a counter without it starting, allowing for a generic frozen state (useful for rotating stopped counters). The stopped state is implemented in two different ways, depending on how the architecture implemented the throttled state: 1) We disable the counter: a) the pmu has per-counter enable bits, we flip that b) we program a NOP event, preserving the counter state 2) We store the counter state and ignore all read/overflow events Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09	perf: Per PMU disable	Peter Zijlstra	1	-14/+16
	Changes perf_disable() into perf_pmu_disable(). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09	perf: Register PMU implementations	Peter Zijlstra	1	-15/+22
	Simple registration interface for struct pmu, this provides the infrastructure for removing all the weak functions. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09	perf: Deconstify struct pmu	Peter Zijlstra	1	-2/+2
	sed -ie 's/const struct pmu\>/struct pmu/g' `git grep -l "const struct pmu\>"` Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-31	alpha: convert perf_event to use local_t	Michael Cree	1	-9/+9
	Updates the Alpha perf_event code to match the changes recently made to the core perf_event code in commit e78505958cf123048fb48cb56b79cebb8edd15fb. Signed-off-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Matt Turner <mattst88@gmail.com>
2010-08-09	alpha: implement HW performance events on the EV67 and later CPUs	Michael Cree	1	-0/+842
	This implements hardware performance events for the EV67 and later CPUs within the Linux performance events subsystem. Only using the performance monitoring unit in HP/Compaq's so called "Aggregrate mode" is supported. The code has been implemented in a manner that makes extension to other older Alpha CPUs relatively straightforward should some mug wish to indulge themselves. Signed-off-by: Michael Cree <mcree@orcon.net.nz> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jay Estabrook <jay.estabrook@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>