linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2009-05-07	tracing/filters: support for operator reserved characters in strings	Frederic Weisbecker	1	-0/+10
	When we set a filter for an event, such as: echo "name == my_lock_name" > \ /debug/tracing/events/lockdep/lock_acquired/filter then the following order of token type is parsed: - space - operator - parentheses - operand Because the operators and parentheses have a higher precedence than the operand characters, which is normal, then we can't use any string containing such special characters: ()=<>!&\| To get this support and also avoid ambiguous intepretation from the parser or the human, we can do it using double quotes so that we keep the usual languages habits. Then after this patch you can still declare string condition like before: echo name == myname But if you want to compare against a string containing an operator character, you can use double quotes: echo 'name == "&myname"' Don't forget to include the whole expression into single quotes or the double ones will be eaten by echo. [ Impact: support strings with special characters for tracing filters ] Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Zhaolei <zhaolei@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-05-07	tracing/filters: support for filters of dynamic sized arrays	Frederic Weisbecker	1	-3/+41
	Currently the filtering infrastructure supports well the numeric types and fixed sized array types. But the recently added __string() field uses a specific indirect offset mechanism which requires a specific predicate. Until now it wasn't supported. This patch adds this support and implies very few changes, only a new predicate is needed, the management of this specific field can be done through the usual string helpers in the filtering infrastructure. [ Impact: support all kinds of strings in the tracing filters ] Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Zhaolei <zhaolei@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-05-06	tracing: add hierarchical enabling of events	Steven Rostedt	1	-0/+140
	With the current event directory, you can only enable individual events. The file debugfs/tracing/set_event is used to be able to enable or disable several events at once. But that can still be awkward. This patch adds hierarchical enabling of events. That is, each directory in debugfs/tracing/events has an "enable" file. This file can enable or disable all events within the directory and below. # echo 1 > /debugfs/tracing/events/enable will enable all events. # echo 1 > /debugfs/tracing/events/sched/enable will enable all events in the sched subsystem. # echo 1 > /debugfs/tracing/events/enable # echo 0 > /debugfs/tracing/events/irq/enable will enable all events, but then disable just the irq subsystem events. When reading one of these enable files, there are four results: 0 - all events this file affects are disabled 1 - all events this file affects are enabled X - there is a mixture of events enabled and disabled ? - this file does not affect any event Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-06	tracing: reset ring buffer when removing modules with events	Steven Rostedt	3	-0/+21
	Li Zefan found that there's a race using the event ids of events and modules. When a module is loaded, an event id is incremented. We only have 16 bits for event ids (65536) and there is a possible (but highly unlikely) race that we could load and unload a module that registers events so many times that the event id counter overflows. When it overflows, it then restarts and goes looking for available ids. An id is available if it was added by a module and released. The race is if you have one module add an id, and then is removed. Another module loaded can use that same event id. But if the old module still had events in the ring buffer, the new module's call back would get bogus data. At best (and most likely) the output would just be garbage. But if the module for some reason used pointers (not recommended) then this could potentially crash. The safest thing to do is just reset the ring buffer if a module that registered events is removed. [ Impact: prevent unpredictable results of event id overflows ] Reported-by: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <49FEAFD0.30106@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-06	tracing: update sample with TRACE_INCLUDE_FILE	Steven Rostedt	1	-1/+6
	When creating trace events for ftrace, the header file with the TRACE_EVENT macros must also have a macro called TRACE_SYSTEM. This macro describes the name of the system the TRACE_EVENTS are defined for. It also doubles as a way for the define_trace.h file to include the file that included it. For example: in irq.h #define TRACE_SYSTEM irq [...] #include <trace/define_trace.h> The define_trace will use TRACE_SYSTEM to include irq.h. But if the name of the trace system does not match the name of the trace header file, one can override it with: Which will change define_trace.h to inclued foo_trace.h instead of foo.h The sample comments this, but people that use the sample code will more likely use the code and not read the comments. This patch changes the sample code to use the TRACE_INCLUDE_FILE to better show developers how to use it. [ Impact: make sample less confusing to developers ] Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-06	ring-buffer: change test to be more latency friendly	Steven Rostedt	1	-0/+31
	The ring buffer benchmark/test runs a producer for 10 seconds. This is done with preemption and interrupts enabled. But if the kernel is not compiled with CONFIG_PREEMPT, it basically stops everything but interrupts for 10 seconds. Although this is just a test and is not for production, this attribute can be quite annoying. It can also spawn badness elsewhere. This patch solves the issues by calling "cond_resched" when the system is not compiled with CONFIG_PREEMPT. It also keeps track of the time spent to call cond_resched such that it does not go against the time calculations. That is, if the task schedules away, the time scheduled out is removed from the test data. Note, this only works for non PREEMPT because we do not know when the task is scheduled out if we have PREEMPT enabled. [ Impact: prevent test from stopping the world for 10 seconds ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-06	ring-buffer: make moving the tail page a separate function	Steven Rostedt	1	-40/+49
	Ingo Molnar thought the code would be cleaner if we used a function call instead of a goto for moving the tail page. After implementing this, it seems that gcc still inlines the result and the output is pretty much the same. Since this is considered a cleaner approach, might as well implement it. [ Impact: code clean up ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-06	ring-buffer: check for failed allocation in ring buffer benchmark	Steven Rostedt	1	-0/+3
	The result of the allocation of the ring buffer read page in the ring buffer bench mark does not check the return to see if a page was actually allocated. This patch fixes that. [ Impact: avoid NULL dereference ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-06	ring-buffer: remove unneeded conditional in rb_reserve_next	Steven Rostedt	1	-5/+3
	The code in __rb_reserve_next checks on page overflow if it is the original commiter and then resets the page back to the original setting. Although this is fine, and the code is correct, it is a bit fragil. Some experimental work I did breaks it easily. The better and more robust solution is to have all commiters that overflow the page, simply subtract what they added. [ Impact: more robust ring buffer account management ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-06	tracing: small trave_events sample Makefile cleanup	Christoph Hellwig	1	-3/+1
	Use -I$(src) to add the current directory the include path. [ Impact: cleanup ] Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-06	tracing: trace_output.c, fix false positive compiler warning	Jaswinder Singh Rajput	1	-1/+1
	This compiler warning: CC kernel/trace/trace_output.o kernel/trace/trace_output.c: In function ‘register_ftrace_event’: kernel/trace/trace_output.c:544: warning: ‘list’ may be used uninitialized in this function Is wrong as 'list' is always initialized - but GCC (4.3.2) does not recognize this relationship properly. Work around the warning by initializing the variable to NULL. [ Impact: fix false positive compiler warning ] Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-06	blktrace: from-sector redundant in trace_block_remap	Alan D. Brunelle	4	-11/+9
	Remove redundant from-sector parameter: it's /always/ the bio's sector passed in. [ Impact: cleanup ] Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <49FF517C.7000503@hp.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-06	blktrace: correct remap names	Alan D. Brunelle	3	-16/+16
	This attempts to clarify names utilized during block I/O remap operations (partition, volume manager). It correctly matches up the /from/ information for both device & sector. This takes in the concept from Kosaki Motohiro and extends it to include better naming for the "device_from" field. [ Impact: cleanup ] Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <49FF4FAE.3000301@hp.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-06	tracepoint: trace_sched_migrate_task(): remove parameter	Mathieu Desnoyers	2	-4/+4
	The orig_cpu parameter in trace_sched_migrate_task() is not necessary, it can be got by using task_cpu(p) in the probe. [ Impact: micro-optimization ] Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> [ modified from Mathieu's patch. The original patch is at: http://marc.info/?l=linux-kernel&m=123791201716239&w=2 ] Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Cc: fweisbec@gmail.com Cc: rostedt@goodmis.org Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: zhaolei@cn.fujitsu.com Cc: laijs@cn.fujitsu.com LKML-Reference: <49FFFDB7.1050402@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-06	tracing/events: fix concurrent access to ftrace_events list	Li Zefan	4	-17/+33
	A module will add/remove its trace events when it gets loaded/unloaded, so the ftrace_events list is not "const", and concurrent access needs to be protected. This patch thus fixes races between loading/unloding modules and read 'available_events' or read/write 'set_event', etc. Below shows how to reproduce the race: # for ((; ;)) { cat /mnt/tracing/available_events; } > /dev/null & # for ((; ;)) { insmod trace-events-sample.ko; rmmod sample; } & After a while: BUG: unable to handle kernel paging request at 0010011c IP: [<c1080f27>] t_next+0x1b/0x2d ... Call Trace: [<c10c90e6>] ? seq_read+0x217/0x30d [<c10c8ecf>] ? seq_read+0x0/0x30d [<c10b4c19>] ? vfs_read+0x8f/0x136 [<c10b4fc3>] ? sys_read+0x40/0x65 [<c1002a68>] ? sysenter_do_call+0x12/0x36 [ Impact: fix races when concurrent accessing ftrace_events list ] Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4A00F709.3080800@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-06	tracing/events: fix memory leak when unloading module	Li Zefan	3	-7/+34
	When unloading a module, memory allocated by init_preds() and trace_define_field() is not freed. [ Impact: fix memory leak ] Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <4A00F6E0.3040503@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-06	tracing/events: make SAMPLE_TRACE_EVENTS default to n	Li Zefan	1	-3/+2
	Normally a config should be default to n. This patch also makes the sample module-only, like SAMPLE_MARKERS and SAMPLE_TRACEPOINTS. [ Impact: don't build trace event sample by default ] Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4A00F6C0.8090803@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-06	tracing/events: don't say hi when loading the trace event sample	Li Zefan	1	-4/+0
	The sample is useful for testing, and I'm using it. But after loading the module, it keeps saying hi every 10 seconds, this may be disturbing. Also Steven said commenting out the "hi" helped in causing races. :) [ Impact: make testing a bit easier ] Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4A00F6AD.2070008@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-06	ring-buffer: add benchmark and tester	Steven Rostedt	3	-0/+396
	This patch adds code that can benchmark the ring buffer as well as test it. This code can be compiled into the kernel (not recommended) or as a module. A separate ring buffer is used to not interfer with other users, like ftrace. It creates a producer and a consumer (option to disable creation of the consumer) and will run for 10 seconds, then sleep for 10 seconds and then repeat. While running, the producer will write 10 byte loads into the ring buffer with just putting in the current CPU number. The reader will continually try to read the buffer. The reader will alternate from reading the buffer via event by event, or by full pages. The output is a pr_info, thus it will fill up the syslogs. Starting ring buffer hammer End ring buffer hammer Time: 9000349 (usecs) Overruns: 12578640 Read: 5358440 (by events) Entries: 0 Total: 17937080 Missed: 0 Hit: 17937080 Entries per millisec: 1993 501 ns per entry Sleeping for 10 secs Starting ring buffer hammer End ring buffer hammer Time: 9936350 (usecs) Overruns: 0 Read: 28146644 (by pages) Entries: 74 Total: 28146718 Missed: 0 Hit: 28146718 Entries per millisec: 2832 353 ns per entry Sleeping for 10 secs Time: is the time the test ran Overruns: the number of events that were overwritten and not read Read: the number of events read (either by pages or events) Entries: the number of entries left in the buffer (the by pages will only read full pages) Total: Entries + Read + Overruns Missed: the number of entries that failed to write Hit: the number of entries that were written The above example shows that it takes ~353 nanosecs per entry when there is a reader, reading by pages (and no overruns) The event by event reader slowed the producer down to 501 nanosecs. [ Impact: see how changes to the ring buffer affect stability and performance ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-05	ring-buffer: move big if statement down	Steven Rostedt	1	-107/+111
	In the hot path of the ring buffer "__rb_reserve_next" there's a big if statement that does not even return back to the work flow. code; if (cross to next page) { [ lots of code ] return; } more code; The condition is even the unlikely path, although we do not denote it with an unlikely because gcc is fine with it. The condition is true when the write crosses a page boundary, and we need to start at a new page. Having this if statement makes it hard to read, but calling another function to do the work is also not appropriate, because we are using a lot of variables that were set before the if statement, and we do not want to send them as parameters. This patch changes it to a goto: code; if (cross to next page) goto next_page; more code; return; next_page: [ lots of code] This makes the code easier to understand, and a bit more obvious. The output from gcc is practically identical. For some reason, gcc decided to use different registers when I switched it to a goto. But other than that, the logic is the same. [ Impact: easier to read code ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-05	tracing: use proper export symbol for tracing api	Steven Rostedt	1	-3/+3
	When adding the EXPORT_SYMBOL to some of the tracing API, I accidently used EXPORT_SYMBOL instead of EXPORT_SYMBOL_GPL. This patch fixes that mistake. [ Impact: export the tracing code only for GPL modules ] Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-05	ftrace: use .sched.text, not .text.sched in recordmcount.pl	Tim Abbott	1	-3/+3
	The only references in the kernel to the .text.sched section are in recordmcount.pl. Since the code it has is intended to be example code it should refer to real kernel sections. So change it to .sched.text instead. [ Impact: consistency in comments ] Signed-off-by: Tim Abbott <tabbott@mit.edu> LKML-Reference: <1241136371-10768-1-git-send-email-tabbott@mit.edu> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-06	drm/r128: fix r128 ioremaps to use ioremap_wc.	Dave Airlie	1	-3/+3
	This should allow r128 to start working again since PAT changes. taken from F-11 kernel. Signed-off-by: Dave Airlie <airlied@redhat.com>
2009-05-05	Ignore madvise(MADV_WILLNEED) for hugetlbfs-backed regions	Mel Gorman	1	-0/+8
	madvise(MADV_WILLNEED) forces page cache readahead on a range of memory backed by a file. The assumption is made that the page required is order-0 and "normal" page cache. On hugetlbfs, this assumption is not true and order-0 pages are allocated and inserted into the hugetlbfs page cache. This leaks hugetlbfs page reservations and can cause BUGs to trigger related to corrupted page tables. This patch causes MADV_WILLNEED to be ignored for hugetlbfs-backed regions. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-05	ring-buffer: disable writers when resetting buffers	Steven Rostedt	1	-0/+4
	As a precaution, it is best to disable writing to the ring buffers when reseting them. [ Impact: prevent weird things if write happens during reset ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-05	ring-buffer: have read page swap increment counter with page entries	Steven Rostedt	1	-25/+3
	In the swap page ring buffer code that is used by the ftrace splice code, we scan the page to increment the counter of entries read. With the number of entries already in the page we simply need to add it. [ Impact: speed up reading page from ring buffer ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-05	[IA64] xen_domu_defconfig: fix build issues/warnings	Jan Beulich	4	-9/+10
	- drivers/xen/events.c did not compile - xen_setup_hook caused a modpost section warning - the use of u64 (instead of unsigned long long) together with a %llu in drivers/xen/balloon.c caused a compiler warning Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2009-05-05	ring-buffer: record page entries in buffer page descriptor	Steven Rostedt	1	-26/+13
	Currently, when the ring buffer writer overflows the buffer and must write over non consumed data, we increment the overrun counter by reading the entries on the page we are about to overwrite. This reads the entries one by one. This is not very effecient. This patch adds another entry counter into each buffer page descriptor that keeps track of the number of entries on the page. Now on overwrite, the overrun counter simply needs to add the number of entries that is on the page it is about to overwrite. [ Impact: speed up of ring buffer in overwrite mode ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-05	ring-buffer: convert cpu buffer entries to local_t	Steven Rostedt	1	-9/+12
	The entries counter in cpu buffer is not atomic. It can be updated by other interrupts or from another CPU (readers). But making entries into "atomic_t" causes an atomic operation that can hurt performance. Instead we convert it to a local_t that will increment a counter with a local CPU atomic operation (if the arch supports it). Instead of fighting with readers and overwrites that decrement the counter, I added a "read" counter. Every time a reader reads an entry it is incremented. We already have a overrun counter and with that, the entries counter and the read counter, we can calculate the total number of entries in the buffer with: (entries - overrun) - read As long as the total number of entries in the ring buffer is less than the word size, this will work. But since the entries counter was previously a long, this is no different than what we had before. Thanks to Andrew Morton for pointing out in the first version that atomic_t does not replace unsigned long. I switched to atomic_long_t even though it is signed. A negative count is most likely a bug. [ Impact: keep accurate count of cpu buffer entries ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-05	tracing: export stats of ring buffers to userspace	Steven Rostedt	1	-0/+42
	This patch adds stats to the ftrace ring buffers: # cat /debugfs/tracing/per_cpu/cpu0/stats entries: 42360 overrun: 30509326 commit overrun: 0 nmi dropped: 0 Where entries are the total number of data entries in the buffer. overrun is the number of entries not consumed and were overwritten by the writer. commit overrun is the number of entries dropped due to nested writers wrapping the buffer before the initial writer finished the commit. nmi dropped is the number of entries dropped due to the ring buffer lock being held when an nmi was going to write to the ring buffer. Note, this field will be meaningless and will go away when the ring buffer becomes lockless. [ Impact: let userspace know what is happening in the ring buffers ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-05	ring-buffer: add counters for commit overrun and nmi dropped entries	Steven Rostedt	2	-3/+51
	The WARN_ON in the ring buffer when a commit is preempted and the buffer is filled by preceding writes can happen in normal operations. The WARN_ON makes it look like a bug, not to mention, because it does not stop tracing and calls printk which can also recurse, this is prone to deadlock (the WARN_ON is not in a position to recurse). This patch removes the WARN_ON and replaces it with a counter that can be retrieved by a tracer. This counter is called commit_overrun. While at it, I added a nmi_dropped counter to count any time an NMI entry is dropped because the NMI could not take the spinlock. [ Impact: prevent deadlock by printing normal case warning ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-05	ring-buffer: export symbols	Steven Rostedt	1	-0/+3
	I'm adding a module to do a series of tests on the ring buffer as well as benchmarks. This module needs to have more of the ring buffer API exported. There's nothing wrong with reading the ring buffer from a module. [ Impact: allow modules to read pages from the ring buffer ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-05-05	i2c-algo-pca: Let PCA9564 recover from unacked data byte (state 0x30)	Enrik Berkhan	1	-0/+11
	Currently, the i2c-algo-pca driver does nothing if the chip enters state 0x30 (Data byte in I2CDAT has been transmitted; NOT ACK has been received). Thus, the i2c bus connected to the controller gets stuck afterwards. I have seen this kind of error on a custom board in certain load situations most probably caused by interference or noise. A possible reaction is to let the controller generate a STOP condition. This is documented in the PCA9564 data sheet (2006-09-01) and the same is done for other NACK states as well. Further, state 0x38 isn't handled completely, either. Try to do another START in this case like the data sheet says. As this couldn't be tested, I've added a comment to try to reset the chip if the START doesn't help as suggested by Wolfram Sang. Signed-off-by: Enrik Berkhan <Enrik.Berkhan@ge.com> Reviewed-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2009-05-05	i2c-algo-bit: Fix timeout test	Dave Airlie	1	-1/+1
	When fetching DDC using i2c algo bit, we were often seeing timeouts before getting valid EDID on a retry. The VESA spec states 2ms is the DDC timeout, so when this translates into 1 jiffie and we are close to the end of the time period, it could return with a timeout less than 2ms. Change this code to use time_after instead of time_after_eq. Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2009-05-05	i2c: Timeouts off by 1	Roel Kluin	9	-13/+13
	with while (timeout++ < MAX_TIMEOUT); timeout reaches MAX_TIMEOUT + 1 after the loop, so the tests below are off by one. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2009-05-04	e1000: fix virtualization bug	Jesse Brandeburg	1	-1/+1
	a recent fix to e1000 (commit 15b2bee2) caused KVM/QEMU/VMware based virtualized e1000 interfaces to begin failing when resetting. This is because the driver in a virtual environment doesn't get to run instructions AT ALL when an interrupt is asserted. The interrupt code runs immediately and this recent bug fix allows an interrupt to be possible when the interrupt handler will reject it (due to the new code), when being called from any path in the driver that holds the E1000_RESETTING flag. the driver should use the __E1000_DOWN flag instead of the __E1000_RESETTING flag to prevent interrupt execution while reconfiguring the hardware. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-04	bonding: fix alb mode locking regression	Jay Vosburgh	1	-9/+3
	Fix locking issue in alb MAC address management; removed incorrect locking and replaced with correct locking. This bug was introduced in commit 059fe7a578fba5bbb0fdc0365bfcf6218fa25eb0 ("bonding: Convert locks to _bh, rework alb locking for new locking") Bug reported by Paul Smith <paul@mad-scientist.net>, who also tested the fix. Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-05	selinux: Fix send_sigiotask hook	Stephen Smalley	1	-1/+1
	The CRED patch incorrectly converted the SELinux send_sigiotask hook to use the current task SID rather than the target task SID in its permission check, yielding the wrong permission check. This fixes the hook function. Detected by the ltp selinux testsuite and confirmed to correct the test failure. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
2009-05-04	proc: avoid information leaks to non-privileged processes	Jake Edge	2	-5/+13
	By using the same test as is used for /proc/pid/maps and /proc/pid/smaps, only allow processes that can ptrace() a given process to see information that might be used to bypass address space layout randomization (ASLR). These include eip, esp, wchan, and start_stack in /proc/pid/stat as well as the non-symbolic output from /proc/pid/wchan. ASLR can be bypassed by sampling eip as shown by the proof-of-concept code at http://code.google.com/p/fuzzyaslr/ As part of a presentation (http://www.cr0.org/paper/to-jt-linux-alsr-leak.pdf) esp and wchan were also noted as possibly usable information leaks as well. The start_stack address also leaks potentially useful information. Cc: Stable Team <stable@kernel.org> Signed-off-by: Jake Edge <jake@lwn.net> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-04	Bluetooth: Fix issue with sysfs handling for connections	Marcel Holtmann	3	-34/+43
	Due to a semantic changes in flush_workqueue() the current approach of synchronizing the sysfs handling for connections doesn't work anymore. The whole approach is actually fully broken and based on assumptions that are no longer valid. With the introduction of Simple Pairing support, the creation of low-level ACL links got changed. This change invalidates the reason why in the past two independent work queues have been used for adding/removing sysfs devices. The adding of the actual sysfs device is now postponed until the host controller successfully assigns an unique handle to that link. So the real synchronization happens inside the controller and not the host. The only left-over problem is that some internals of the sysfs device handling are not initialized ahead of time. This leaves potential access to invalid data and can cause various NULL pointer dereferences. To fix this a new function makes sure that all sysfs details are initialized when an connection attempt is made. The actual sysfs device is only registered when the connection has been successfully established. To avoid a race condition with the registration, the check if a device is registered has been moved into the removal work. As an extra protection two flush_work() calls are left in place to make sure a previous add/del work has been completed first. Based on a report by Marc Pignat <marc.pignat@hevs.ch> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Tested-by: Justin P. Mattock <justinmattock@gmail.com> Tested-by: Roger Quadros <ext-roger.quadros@nokia.com> Tested-by: Marc Pignat <marc.pignat@hevs.ch>
2009-05-04	usbnet: CDC EEM support (v5)	Omar Laazimani	4	-0/+399
	This introduces a CDC Ethernet Emulation Model (EEM) host side driver to support USB EEM devices. EEM is different from the Ethernet Control Model (ECM) currently supported by the "CDC Ethernet" driver. One key difference is that it doesn't require of USB interface alternate settings to manage interface state; some maldesigned hardware can't handle that part of USB. It also avoids a separate USB interface for control and status updates. [ dbrownell@users.sourceforge.net: fix skb leaks, add rx packet checks, improve fault handling, EEM conformance updates, cleanup ] Signed-off-by: Omar Laazimani <omar.oberthur@gmail.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-04	x86: show number of core_siblings instead of thread_siblings in /proc/cpuinfo	Andreas Herrmann	1	-1/+1
	Commit 7ad728f98162cb1af06a85b2a5fc422dddd4fb78 (cpumask: x86: convert cpu_sibling_map/cpu_core_map to cpumask_var_t) changed the output of /proc/cpuinfo for siblings: Example on an AMD Phenom: physical id : 0 siblings : 1 core id : 3 cpu cores : 4 Before that commit it was: physical id : 0 siblings : 4 core id : 3 cpu cores : 4 Instead of cpu_core_mask it now uses cpu_sibling_mask to count siblings. This is due to the following hunk of above commit: \| --- a/arch/x86/kernel/cpu/proc.c \| +++ b/arch/x86/kernel/cpu/proc.c \| @@ -14,7 +14,7 @@ static void show_cpuinfo_core(struct seq_file m, struct cpuinf \| if (c->x86_max_cores smp_num_siblings > 1) { \| seq_printf(m, "physical id\t: %d\n", c->phys_proc_id); \| seq_printf(m, "siblings\t: %d\n", \| - cpus_weight(per_cpu(cpu_core_map, cpu))); \| + cpumask_weight(cpu_sibling_mask(cpu))); \| seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id); \| seq_printf(m, "cpu cores\t: %d\n", c->booted_cores); \| seq_printf(m, "apicid\t\t: %d\n", c->apicid); This was a mistake, because the impact line shows that this side-effect was not anticipated: Impact: reduce per-cpu size for CONFIG_CPUMASK_OFFSTACK=y So revert the respective hunk to restore the old behavior. [ Impact: fix sibling-info regression in /proc/cpuinfo ] Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Rusty Russell <rusty@rustcorp.com.au> LKML-Reference: <20090504182859.GA29045@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-04	tcp: Fix tcp_prequeue() to get correct rto_min value	Satoru SATOH	2	-11/+13
	tcp_prequeue() refers to the constant value (TCP_RTO_MIN) regardless of the actual value might be tuned. The following patches fix this and make tcp_prequeue get the actual value returns from tcp_rto_min(). Signed-off-by: Satoru SATOH <satoru.satoh@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-04	ehea: fix invalid pointer access	Hannes Hering	2	-14/+19
	This patch fixes an invalid pointer access in case the receive queue holds no pointer to the next skb when the queue is empty. Signed-off-by: Hannes Hering <hering2@de.ibm.com> Signed-off-by: Jan-Bernd Themann <themann@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-04	ASoC: Remove BROKEN from mpc5200 kconfig	Takashi Iwai	1	-1/+1
	The regression was fixed by commit 3e5b50165fd0be080044586f43fcdd460ed27610, so no need to mark this driver as BROKEN. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2009-05-04	amd-iommu: fix iommu flag masks	Joerg Roedel	1	-8/+8
	The feature bits should be set via bitmasks, not via feature IDs. [ Impact: fix feature enabling in newer IOMMU versions ] Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> LKML-Reference: <20090504102028.GA30307@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-04	kbuild, modpost: fix unexpected non-allocatable warning with mips	Sam Ravnborg	1	-1/+11
	mips emit the following debug sections: .mdebug* and .pdr They were included in the check for non-allocatable section and caused modpost to warn. Manuel Lauss suggested to fix this by adding the relevant sections to the list of sections we do not check. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Reported-by: Manuel Lauss <mano@roarinelk.homelinux.net>
2009-05-04	kbuild, modpost: fix "unexpected non-allocatable" warning with SUSE gcc	Sam Ravnborg	1	-1/+1
	Jean reported that he saw one warning for each module like the one below: WARNING: arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.o (.comment.SUSE.OPTs): unexpected non-allocatable section. The warning appeared with the improved version of the check of the flags in the sections. That check already ignored sections named ".comment" - but SUSE store additional info in the comment section and has named it in a SUSE specific way. Therefore modpost failed to ignore the section. The fix is to extend the pattern so we ignore all sections that start with the name ".comment.". Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Reported-by: Jean Delvare <khali@linux-fr.org> Tested-by: Jean Delvare <khali@linux-fr.org>
2009-05-04	kbuild, modpost: fix unexpected non-allocatable section when cross compiling	Anders Kaseorg	1	-12/+23
	The missing TO_NATIVE(sechdrs[i].sh_flags) was causing many unexpected non-allocatable section warnings when cross-compiling for an architecture with a different endianness. Fix endianness of all the fields in the ELF header and section headers, not just some of them so we are not hit by this anohter time. Signed-off-by: Anders Kaseorg <andersk@mit.edu> Reported-by: Sean MacLennan <smaclennan@pikatech.com> Tested-by: Sean MacLennan <smaclennan@pikatech.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2009-05-03	mvsdio: fix CONFIG_PM=y build	Rabin Vincent	1	-6/+5
	Fix usage of obsolete parameters and functions in the driver's PM callbacks. Signed-off-by: Rabin Vincent <rabin@rab.in> Acked-by: Nicolas Pitre <nico@marvell.com> Signed-off-by: Pierre Ossman <pierre@ossman.eu>