Age | Commit message (Collapse) | Author | Files | Lines |
|
As rcu_dereference_raw() under RCU debug config options can add quite a
bit of checks, and that tracing uses rcu_dereference_raw(), these checks
happen with the function tracer. The function tracer also happens to trace
these debug checks too. This added overhead can livelock the system.
Have the function tracer use the new RCU _notrace equivalents that do
not do the debug checks for RCU.
Link: http://lkml.kernel.org/r/20130528184209.467603904@goodmis.org
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
As rcu_dereference_raw() under RCU debug config options can add quite a
bit of checks, and that tracing uses rcu_dereference_raw(), these checks
happen with the function tracer. The function tracer also happens to trace
these debug checks too. This added overhead can livelock the system.
Add a new interface to RCU for both rcu_dereference_raw_notrace() as well
as hlist_for_each_entry_rcu_notrace() as the hlist iterator uses the
rcu_dereference_raw() as well, and is used a bit with the function tracer.
Link: http://lkml.kernel.org/r/20130528184209.304356745@goodmis.org
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
The tracing infrastructure sets up for possible CPUs, but it uses
the ring buffer polling, it is possible to call the ring buffer
polling code with a CPU that hasn't been allocated. This will cause
a kernel oops when it access a ring buffer cpu buffer that is part
of the possible cpus but hasn't been allocated yet as the CPU has never
been online.
Reported-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Tested-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
If ftrace=<tracer> is on the kernel command line, when that tracer is
registered, it will be initiated by tracing_set_tracer() to execute that
tracer.
The nop tracer is just a stub tracer that is used to have no tracer
enabled. It is assigned at early bootup as it is the default tracer.
But if ftrace=nop is on the kernel command line, the registering of the
nop tracer will call tracing_set_tracer() which will try to execute
the nop tracer. But it expects tr->current_trace to be assigned something
as it usually is assigned to the nop tracer. As it hasn't been assigned
to anything yet, it causes the system to crash.
The simple fix is to move the tr->current_trace = nop before registering
the nop tracer. The functionality is still the same as the nop tracer
doesn't do anything anyway.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
Since try_module_get() returns false( = 0) when it fails to
pindown a module, event_enable_func() returns 0 which means
"succeed". This can cause a kernel panic when the entry
is removed, because the event is already released.
This fixes the bug by returning -EBUSY, because the reason
why it fails is that the module is being removed at that time.
Link: http://lkml.kernel.org/r/20130516114848.13508.97899.stgit@mhiramat-M0-7522
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tom Zanussi <tom.zanussi@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
According to sparse warning, print_*probe_event static because
those functions are not directly called from outside.
Link: http://lkml.kernel.org/r/20130513115839.6545.83067.stgit@mhiramat-M0-7522
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tom Zanussi <tom.zanussi@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
Fix a sparse warning about the rcu operated pointer is
defined without __rcu address space.
Link: http://lkml.kernel.org/r/20130513115837.6545.23322.stgit@mhiramat-M0-7522
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tom Zanussi <tom.zanussi@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
Use rcu_dereference_raw() for accessing tp->files. Because the
write-side uses rcu_assign_pointer() for memory barrier,
the read-side also has to use rcu_dereference_raw() with
read memory barrier.
Link: http://lkml.kernel.org/r/20130513115834.6545.17022.stgit@mhiramat-M0-7522
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tom Zanussi <tom.zanussi@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
Special preds are created when folding a series of preds that
can be done in serial. These are allocated in an ops field of
the pred structure. But they were never freed, causing memory
leaks.
This was discovered using the kmemleak checker:
unreferenced object 0xffff8800797fd5e0 (size 32):
comm "swapper/0", pid 1, jiffies 4294690605 (age 104.608s)
hex dump (first 32 bytes):
00 00 01 00 03 00 05 00 07 00 09 00 0b 00 0d 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff814b52af>] kmemleak_alloc+0x73/0x98
[<ffffffff8111ff84>] kmemleak_alloc_recursive.constprop.42+0x16/0x18
[<ffffffff81120e68>] __kmalloc+0xd7/0x125
[<ffffffff810d47eb>] kcalloc.constprop.24+0x2d/0x2f
[<ffffffff810d4896>] fold_pred_tree_cb+0xa9/0xf4
[<ffffffff810d3781>] walk_pred_tree+0x47/0xcc
[<ffffffff810d5030>] replace_preds.isra.20+0x6f8/0x72f
[<ffffffff810d50b5>] create_filter+0x4e/0x8b
[<ffffffff81b1c30d>] ftrace_test_event_filter+0x5a/0x155
[<ffffffff8100028d>] do_one_initcall+0xa0/0x137
[<ffffffff81afbedf>] kernel_init_freeable+0x14d/0x1dc
[<ffffffff814b24b7>] kernel_init+0xe/0xdb
[<ffffffff814d539c>] ret_from_fork+0x7c/0xb0
[<ffffffffffffffff>] 0xffffffffffffffff
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: stable@vger.kernel.org # 2.6.39+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
|
|
|
|
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
This fixes a bug where the iscsi class/driver did not do a put_device
when a sess/conn device was found. This also simplifies the interface
by not having to pass in some arguments that were duplicated and did
not need to be exported.
Reported-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
These enums have been separate since the dawn of SAS, mainly because the
latter is a procotol only enum and the former includes additional state
for libsas. The dichotomy causes endless confusion about which one you
should use where and leads to pointless warnings like this:
drivers/scsi/mvsas/mv_sas.c: In function 'mvs_update_phyinfo':
drivers/scsi/mvsas/mv_sas.c:1162:34: warning: comparison between 'enum sas_device_type' and 'enum sas_dev_type' [-Wenum-compare]
Fix by eliminating one of them. The one kept is effectively the sas.h
one, but call it sas_device_type and make sure the enums are all
properly namespaced with the SAS_ prefix.
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
Modified thermal configuration to happen after interrupt registration
Added SAS controller configuration during initialization
Added error handling logic to handle I_T_Nexus errors and variants
[jejb: fix up tabs and spaces issues]
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
Handled NCQ errors in the low level driver as the FW
is not providing the faulty tag for NCQ errors for libsas
to recover.
[jejb: fix checkpatch issues]
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
Individual WWN read operations based on controller.
PM8081 - Read WWN from Flash VPD.
PM8088/89 - Read WWN from EEPROM.
PM8001 - Read WWN from NVM.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
Changed name in driver to pm80xx. Updated debug messages.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
Performing pci_free_consistent in tasklet had result in a core dump. So
allocated a new memory region for it. Fix for passing proper address
and operation in firmware flash update.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
Additional bar shift for new SPC firmware, applicable to device
id 0x8081 only.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
Implementation of SPCv/ve specific hardware functionality and
macros. Changing common functionalities wrt SPCv/ve operations.
Conditional checks for SPC specific operations.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
Implementation of interrupt handlers and tasklets to support
upto 64 interrupt for the device.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
Update of function prototype for common function to SPC and SPCv/ve.
Multiple queues implementation for IO.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
Memory allocation and configuration of multiple inbound and
outbound queues.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
Updated pci id table with device, vendor, subdevice and subvendor ids
for 8081, 8088, 8089 SAS/SATA controllers. Added SPCv/ve related macros.
Updated macros, hba info structure and other structures for SPCv/ve.
Update of structure and variable names for SPC hardware functionalities.
Signed-off-by: Sakthivel K <Sakthivel.SaravananKamalRaju@pmcs.com>
Signed-off-by: Anand Kumar S <AnandKumar.Santhanam@pmcs.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
lpfc uses the generic checksum as well as the T10DIF one from the lib/
directory, so make sure they're selected.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
scsi_send_eh_cmnd() is calling queuecommand() directly, so
it needs to check the return value here.
The only valid return codes for queuecommand() are 'busy'
states, so we need to wait for a bit to allow the LLDD
to recover.
Based on an earlier patch from Wen Xiong.
[jejb: fix confusion between msec and jiffies values and other issues]
[bvanassche: correct stall_for interval]
Cc: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
|
|
Share configuration option processing code between the dm cache
ctr and message functions.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Move process_config_option() in dm-cache-target.c to make the
next patch more readable.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Generate a dm event when the amount of remaining thin pool metadata
space falls below a certain level.
The threshold is taken to be a quarter of the size of the metadata
device with a minimum threshold of 4MB.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Add a threshold callback to dm persistent data space maps.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Add a threshold callback function to the persistent data space map
interface for a subsequent patch to use.
dm-thin and dm-cache are interested in knowing when they're getting
low on metadata or data blocks. This patch introduces a new method
for registering a callback against a threshold.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Allow the dm thin pool metadata device to be extended.
Whenever a pool is resumed, detect whether the size of the metadata
device has increased, and if so, extend the metadata to use the new
space.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Support extending a dm persistent data metadata space map.
The extend itself is implemented by switching back to the boostrap
allocator and pointing to the new space. The extra bitmap indexes are
then allocated from the new space, and finally we switch back to the
proper space map ops and tweak the reference counts.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
If a thin pool is created in read-only-metadata mode then only open the
metadata device read-only.
Previously it was always opened with FMODE_READ | FMODE_WRITE.
(Note that dm_get_device() still allows read-only dm devices to be used
read-write at the moment: If I create a read-only linear device for the
metadata, via dmsetup load --readonly, then I can still create a rw pool
out of it.)
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Refactor device size functions in preparation for similar metadata
device resizing functions.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Use struct assignment rather than memcpy in dm cache.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Fix up some typos in dm-cache comments.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Correct the documented requirement on the return code from dm cache policy
lookup functions stated in the policy module header file.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Document iterate_devices in device-mapper.h.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Fix some typos in dm-space-map-metadata.c error messages.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Tune the dm cache migration throttling.
i) Issue a tick every second, just in case there's no i/o going through.
ii) Drop the migration threshold right down to something suitable for
background work.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Enable WRITE SAME support in dm multipath. As far as multipath is
concerned it is just another write request.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Tested-by: Bharata B Rao <bharata.rao@gmail.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
If device_not_write_same_capable() returns true then the iterate_devices
loop in dm_table_supports_write_same() should return false.
Reported-by: Bharata B Rao <bharata.rao@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # v3.8+
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
This patch uses memalloc_noio_save to avoid a possible deadlock in
dm-bufio. (it could happen only with large block size, at most
PAGE_SIZE << MAX_ORDER (typically 8MiB).
__vmalloc doesn't fully respect gfp flags. The specified gfp flags are
used for allocation of requested pages, structures vmap_area, vmap_block
and vm_struct and the radix tree nodes.
However, the kernel pagetables are allocated always with GFP_KERNEL.
Thus the allocation of pagetables can recurse back to the I/O layer and
cause a deadlock.
This patch uses the function memalloc_noio_save to set per-process
PF_MEMALLOC_NOIO flag and the function memalloc_noio_restore to restore
it. When this flag is set, all allocations in the process are done with
implied GFP_NOIO flag, thus the deadlock can't happen.
This should be backported to stable kernels, but they don't have the
PF_MEMALLOC_NOIO flag and memalloc_noio_save/memalloc_noio_restore
functions. So, PF_MEMALLOC should be set and restored instead.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Return -ENOMEM instead of success if unable to allocate pending
exception mempool in snapshot_ctr.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: stable@vger.kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Return -ENOMEM if memory allocation fails in cache_create
instead of 0 (to avoid NULL pointer dereference).
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: stable@vger.kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
Fix a regression in the calculation of the stripe_width in the
dm stripe target which led to incorrect processing of device limits.
The stripe_width is the stripe device length divided by the number of
stripes. The group of commits in the range f14fa69 ("dm stripe: fix
size test") to eb850de ("dm stripe: support for non power of 2
chunksize") interfered with each other (a merging error) and led to the
stripe_width being set incorrectly to the stripe device length divided by
chunk_size * stripe_count.
For example, a stripe device's table with: 0 33553920 striped 3 512 ...
should result in a stripe_width of 11184640 (33553920 / 3), but due to
the bug it was getting set to 21845 (33553920 / (512 * 3)).
The impact of this bug is that device topologies that previously worked
fine with the stripe target are no longer considered valid. In
particular, there is a higher risk of seeing this issue if one of the
stripe devices has a 4K logical block size. Resulting in an error
message like this:
"device-mapper: table: 253:4: len=21845 not aligned to h/w logical block size 4096 of dm-1"
The fix is to swap the order of the divisions and to use a temporary
variable for the second one, so that width retains the intended
value.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # 3.6+
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
We now cache the MSI-X capability offset in the struct pci_dev, so no
need to find the capability again.
Acked-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
PCI_MSIX_FLAGS_BIRMASK is mis-named because the BIR mask is in the
Table Offset register, not the flags ("Message Control" per spec)
register.
Acked-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
This reverts commit affdb62b815b38261f09f9d4ec210a35c7ffb1f3.
The commit introduced a regression with AD codecs where the stream is
always clean up. Since the patch is just a minor optimization and
reverting the commit fixes the issue, let's just revert it.
Reported-and-tested-by: Michael Burian <michael.burian@sbg.at>
Cc: <stable@vger.kernel.org> [v3.9+]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|