Age | Commit message (Collapse) | Author | Files | Lines |
|
Per the recommendation by Sagi on:
http://lists.infradead.org/pipermail/linux-nvme/2017-April/009261.html
Rather than waiting for reset work thread to stop queues and abort the ios,
immediately stop the queues on error detection. Reset thread will restop
the queues (as it's called on other paths), but it does not appear to have
a side effect.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
In order to create an association, the remoteport must be
serving either a target role or a discovery role.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
FC Port roles is a bit mask, not individual values.
Correct nvme definitions to unique bits.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
CMB doesn't get unmapped until removal while getting remapped on every
reset. Add the unmapping and sysfs file removal to the reset path in
nvme_pci_disable to match the mapping path in nvme_pci_enable.
Fixes: 202021c1a ("nvme : Add sysfs entry for NVMe CMBs when appropriate")
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Reviewed-By: Stephen Bates <sbates@raithlin.com>
Cc: <stable@vger.kernel.org> # 4.9+
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
sscanf is a very poor way to parse integer. For example, I input
"discard" for act_mask, it gets 0xd and completely messes up. Using
correct API to do integer parse.
This patch also makes attributes accept any base of integer.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Commit 5f7f7543f52e "fuse: Convert to separately allocated bdi" didn't
properly handle fuseblk filesystem. When fuse_bdi_init() is called for
that filesystem type, sb->s_bdi is already initialized (by
set_bdev_super()) to point to block device's bdi and consequently
super_setup_bdi_name() complains about this fact when reseting bdi to
the private one.
Fix the problem by properly dropping bdi reference in fuse_bdi_init()
before creating a private bdi in super_setup_bdi_name().
Fixes: 5f7f7543f52e ("fuse: Convert to separately allocated bdi")
Reported-by: Rakesh Pandit <rakesh@tuxera.com>
Tested-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Add null check before calling xen_blkif_put() to avoid potential
null pointer dereference.
Addresses-Coverity-ID: 1350942
Cc: Juergen Gross <jgross@suse.com>
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
|
|
When killing kref_sub(), the unconditional additional kref_get()
was not properly paired with the necessary kref_put(), causing
a leak of struct drbd_requests (~ 224 Bytes) per submitted bio,
and breaking DRBD in general, as the destructor of those "drbd_requests"
does more than just the mempoll_free().
Fixes: bdfafc4ffdd2 ("locking/atomic, kref: Kill kref_sub()")
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: stable@vger.kernel.org # v4.11
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
SCSI devices can return short writes on Write Same just like for normal
writes, so we need to handle this case for our special payload requests
as well.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
When formatting NVMe to 512B/4K + T10 DIf/DIX, dd with split op returns
"Input/output error". Looks block layer split the bio after calling
bio_integrity_prep(bio). This patch fixes the issue.
Below is how we debug this issue:
(1)format nvme to 4K block # size with type 2 DIF
(2)dd with block size bigger than 1024k.
oflag=direct
dd: error writing '/dev/nvme0n1': Input/output error
We added some debug code in nvme device driver. It showed us the first
op and the second op have the same bi and pi address. This is not
correct.
1st op: nvme0n1 Op:Wr slba 0x505 length 0x100, PI ctrl=0x1400,
dsmgmt=0x0, AT=0x0 & RT=0x505
Guard 0x00b1, AT 0x0000, RT physical 0x00000505 RT virtual 0x00002828
2nd op: nvme0n1 Op:Wr slba 0x605 length 0x1, PI ctrl=0x1400, dsmgmt=0x0,
AT=0x0 & RT=0x605 ==> This op fails and subsequent 5 retires..
Guard 0x00b1, AT 0x0000, RT physical 0x00000605 RT virtual 0x00002828
With the fix, It showed us both of the first op and the second op have
correct bi and pi address.
1st op: nvme2n1 Op:Wr slba 0x505 length 0x100, PI ctrl=0x1400,
dsmgmt=0x0, AT=0x0 & RT=0x505
Guard 0x5ccb, AT 0x0000, RT physical 0x00000505 RT virtual
0x00002828
2nd op: nvme2n1 Op:Wr slba 0x605 length 0x1, PI ctrl=0x1400, dsmgmt=0x0,
AT=0x0 & RT=0x605
Guard 0xab4c, AT 0x0000, RT physical 0x00000605 RT virtual
0x00003028
Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
If PREEMPT_RCU is enabled, rcu_read_lock() isn't strong enough
for us to use this_cpu_ptr() in that section. Use the safer
get/put_cpu_ptr() variants instead.
Reported-by: Mike Galbraith <efault@gmx.de>
Fixes: 34dbad5d26e2 ("blk-stat: convert to callback-based statistics reporting")
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
We warn twice for switching to a scheduler, if that switch fails.
As we also report the failure in the return value to the
sysfs write, remove the dmesg induced failures.
Keep the failure print for warning to switch to the kconfig
selected IO scheduler, as we can't report errors for that in
any other way.
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
The introduction of the BFQ and Kyber I/O schedulers has triggered a
new wave of I/O benchmarks. Unfortunately, comments and discussions on
these benchmarks confirm that there is still little awareness that it
is very hard to achieve, at the same time, a low latency and a high
throughput. In particular, virtually all benchmarks measure
throughput, or throughput-related figures of merit, but, for BFQ, they
use the scheduler in its default configuration. This configuration is
geared, instead, toward a low latency. This is evidently a sign that
BFQ documentation is still too unclear on this important aspect. This
commit addresses this issue by stressing how BFQ configuration must be
(easily) changed if the only goal is maximum throughput.
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
In the function __bfq_deactivate_entity, the pointer
entity->sched_data could happen to be used before being properly
initialized. This led to a NULL pointer dereference. This commit fixes
this bug by just using this pointer only where it is safe to do so.
Reported-by: Tom Harrison <l12436.tw@gmail.com>
Tested-by: Tom Harrison <l12436.tw@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Free up kmalloc allocated memory if failure happens while handling L2P
table transfer in nvme_nvm_get_l2p_tbl.
Fixes: 8e79b5cb ("lightnvm: move block provisioning to targets")
Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Making __blk_mq_stop_hw_queues static fixes sparse warning:
block/blk-mq.c:6: warning: symbol '__blk_mq_stop_hw_queues' was not
declared. Should it be static?
Fixes: 2719aa217e0d0 ("blk-mq: don't use sync workqueue flushing from drivers")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
With gcc 4.1.2:
drivers/nvme/host/lightnvm.c: In function ‘nvme_nvm_submit_io’:
drivers/nvme/host/lightnvm.c:498: warning: ‘rq’ is used uninitialized in this function
Indeed, since commit 2e13f33a2464fc3a ("lightnvm: create cmd before
allocating request"), the request is passed to nvme_nvm_rqtocmd() before
it is allocated.
Fortunately, as of commit 91276162de9476b8 ("lightnvm: refactor end_io
functions for sync"), nvme_nvm_rqtocmd () no longer uses the passed
request, so this warning is a false positive.
Drop the unused parameter to clean up the code and kill the warning.
Fixes: 2e13f33a2464fc3a ("lightnvm: create cmd before allocating request")
Fixes: 91276162de9476b8 ("lightnvm: refactor end_io functions for sync")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
This can be triggered by hot-unplug one cpu.
======================================================
[ INFO: possible circular locking dependency detected ]
4.11.0+ #17 Not tainted
-------------------------------------------------------
step_after_susp/2640 is trying to acquire lock:
(all_q_mutex){+.+...}, at: [<ffffffffb33f95b8>] blk_mq_queue_reinit_work+0x18/0x110
but task is already holding lock:
(cpu_hotplug.lock){+.+.+.}, at: [<ffffffffb306d04f>] cpu_hotplug_begin+0x7f/0xe0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (cpu_hotplug.lock){+.+.+.}:
lock_acquire+0x11c/0x230
__mutex_lock+0x92/0x990
mutex_lock_nested+0x1b/0x20
get_online_cpus+0x64/0x80
blk_mq_init_allocated_queue+0x3a0/0x4e0
blk_mq_init_queue+0x3a/0x60
loop_add+0xe5/0x280
loop_init+0x124/0x177
do_one_initcall+0x53/0x1c0
kernel_init_freeable+0x1e3/0x27f
kernel_init+0xe/0x100
ret_from_fork+0x31/0x40
-> #0 (all_q_mutex){+.+...}:
__lock_acquire+0x189a/0x18a0
lock_acquire+0x11c/0x230
__mutex_lock+0x92/0x990
mutex_lock_nested+0x1b/0x20
blk_mq_queue_reinit_work+0x18/0x110
blk_mq_queue_reinit_dead+0x1c/0x20
cpuhp_invoke_callback+0x1f2/0x810
cpuhp_down_callbacks+0x42/0x80
_cpu_down+0xb2/0xe0
freeze_secondary_cpus+0xb6/0x390
suspend_devices_and_enter+0x3b3/0xa40
pm_suspend+0x129/0x490
state_store+0x82/0xf0
kobj_attr_store+0xf/0x20
sysfs_kf_write+0x45/0x60
kernfs_fop_write+0x135/0x1c0
__vfs_write+0x37/0x160
vfs_write+0xcd/0x1d0
SyS_write+0x58/0xc0
do_syscall_64+0x8f/0x710
return_from_SYSCALL_64+0x0/0x7a
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(cpu_hotplug.lock);
lock(all_q_mutex);
lock(cpu_hotplug.lock);
lock(all_q_mutex);
*** DEADLOCK ***
8 locks held by step_after_susp/2640:
#0: (sb_writers#6){.+.+.+}, at: [<ffffffffb3244aed>] vfs_write+0x1ad/0x1d0
#1: (&of->mutex){+.+.+.}, at: [<ffffffffb32d3a51>] kernfs_fop_write+0x101/0x1c0
#2: (s_active#166){.+.+.+}, at: [<ffffffffb32d3a59>] kernfs_fop_write+0x109/0x1c0
#3: (pm_mutex){+.+...}, at: [<ffffffffb30d2ecd>] pm_suspend+0x21d/0x490
#4: (acpi_scan_lock){+.+.+.}, at: [<ffffffffb34dc3d7>] acpi_scan_lock_acquire+0x17/0x20
#5: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffffb306d6d7>] freeze_secondary_cpus+0x27/0x390
#6: (cpu_hotplug.dep_map){++++++}, at: [<ffffffffb306cfd5>] cpu_hotplug_begin+0x5/0xe0
#7: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffffb306d04f>] cpu_hotplug_begin+0x7f/0xe0
stack backtrace:
CPU: 3 PID: 2640 Comm: step_after_susp Not tainted 4.11.0+ #17
Hardware name: Dell Inc. OptiPlex 7040/0JCTF8, BIOS 1.4.9 09/12/2016
Call Trace:
dump_stack+0x99/0xce
print_circular_bug+0x1fa/0x270
__lock_acquire+0x189a/0x18a0
lock_acquire+0x11c/0x230
? lock_acquire+0x11c/0x230
? blk_mq_queue_reinit_work+0x18/0x110
? blk_mq_queue_reinit_work+0x18/0x110
__mutex_lock+0x92/0x990
? blk_mq_queue_reinit_work+0x18/0x110
? kmem_cache_free+0x2cb/0x330
? anon_transport_class_unregister+0x20/0x20
? blk_mq_queue_reinit_work+0x110/0x110
mutex_lock_nested+0x1b/0x20
? mutex_lock_nested+0x1b/0x20
blk_mq_queue_reinit_work+0x18/0x110
blk_mq_queue_reinit_dead+0x1c/0x20
cpuhp_invoke_callback+0x1f2/0x810
? __flow_cache_shrink+0x160/0x160
cpuhp_down_callbacks+0x42/0x80
_cpu_down+0xb2/0xe0
freeze_secondary_cpus+0xb6/0x390
suspend_devices_and_enter+0x3b3/0xa40
? rcu_read_lock_sched_held+0x79/0x80
pm_suspend+0x129/0x490
state_store+0x82/0xf0
kobj_attr_store+0xf/0x20
sysfs_kf_write+0x45/0x60
kernfs_fop_write+0x135/0x1c0
__vfs_write+0x37/0x160
? rcu_read_lock_sched_held+0x79/0x80
? rcu_sync_lockdep_assert+0x2f/0x60
? __sb_start_write+0xd9/0x1c0
? vfs_write+0x1ad/0x1d0
vfs_write+0xcd/0x1d0
SyS_write+0x58/0xc0
? rcu_read_lock_sched_held+0x79/0x80
do_syscall_64+0x8f/0x710
? trace_hardirqs_on_thunk+0x1a/0x1c
entry_SYSCALL64_slow_path+0x25/0x25
The cpu hotplug path will hold cpu_hotplug.lock and then reinit all exiting
queues for blk mq w/ all_q_mutex, however, blk_mq_init_allocated_queue() will
contend these two locks in the inversion order. This is due to commit eabe06595d62
(blk/mq: Cure cpu hotplug lock inversion), it fixes a cpu hotplug lock inversion
issue because of hotplug rework, however the hotplug rework is still work-in-progress
and lives in a -tip branch and mainline cannot yet trigger that splat. The commit
breaks the linus's tree in the merge window, so this patch reverts the lock order
and avoids to splat linus's tree.
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Commit 37d69ee30808 ("docs: bump minimal GNU Make version to 3.81")
changes one entry of GNU make version in the changes.rst, there's still
one more entry saying that one need version 3.80. Fix that.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Now that kref is using the refcount apis, the _GPL markings are getting
exported to places that it previously wasn't. Now kref.h is GPLv2
licensed, so any non-GPL code using it better be talking to some
lawyers, but changing api markings isn't considered "nice", so let's fix
this up.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Since 2014, you can't successfully build kernels with GNU Make version
3.80. Example errors:
$ git describe
v4.11
$ make --version | head -1
GNU Make 3.80
$ make defconfig
HOSTCC scripts/basic/fixdep
scripts/Makefile.host:135: *** missing separator. Stop.
make: *** [defconfig] Error 2
$ make ARCH=arm64 help
arch/arm64/Makefile:43: *** unterminated call to function `warning': missing `)'. Stop.
$ make help >/dev/null
./Documentation/Makefile.sphinx:25: Extraneous text after `else' directive
./Documentation/Makefile.sphinx:31: *** only one `else' per conditional. Stop.
make: *** [help] Error 2
The first breakage was introduced by commit c8589d1e9e01 ("kbuild:
handle multi-objs dependency appropriately"). Since then (i.e. v3.18),
GNU Make 3.80 has not been able to compile the kernel, but nobody has
ever complained aboutt (or noticed) it.
Even GNU Make 3.81 is more than 10 years old. It would not hurt to
match the documentation with reality instead of fixing makefiles.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 17a9be317475 ("initramfs: Always do fput() and load modules after
rootfs populate") introduced an error for the
CONFIG_BLK_DEV_RAM=y
case, because even though the code looks fine, the compiler really wants
a statement after a label, or you'll get complaints:
init/initramfs.c: In function 'populate_rootfs':
init/initramfs.c:644:2: error: label at end of compound statement
That commit moved the subsequent statements to outside the compound
statement, leaving the label without any associated statements.
Reported-by: Jörg Otte <jrg.otte@gmail.com>
Fixes: 17a9be317475 ("initramfs: Always do fput() and load modules after rootfs populate")
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: stable@vger.kernel.org # if 17a9be317475 gets backported
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This bug fixes a regression introduced by patch 0d1c7ae9d8.
The intent of the patch was to stop promoting glocks after a
file system is withdrawn due to a variety of errors, because doing
so results in a BUG(). (You should be able to unmount after a
withdraw rather than having the kernel panic.)
Unfortunately, it also stopped demotions, so glocks could not be
unlocked after withdraw, which means the unmount would hang.
This patch allows function do_xmote to demote locks to an
unlocked state after a withdraw, but not promote them.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
Commit 28b783e47ad7 ("xfs: bufferhead chains are invalid after
end_page_writeback") fixed one use-after-free issue by
pre-calculating the loop conditionals before calling bh->b_end_io()
in the end_io processing loop, but it assigned 'next' pointer before
checking end offset boundary & breaking the loop, at which point the
bh might be freed already, and caused use-after-free.
This is caught by KASAN when running fstests generic/127 on sub-page
block size XFS.
[ 2517.244502] run fstests generic/127 at 2017-04-27 07:30:50
[ 2747.868840] ==================================================================
[ 2747.876949] BUG: KASAN: use-after-free in xfs_destroy_ioend+0x3d3/0x4e0 [xfs] at addr ffff8801395ae698
...
[ 2747.918245] Call Trace:
[ 2747.920975] dump_stack+0x63/0x84
[ 2747.924673] kasan_object_err+0x21/0x70
[ 2747.928950] kasan_report+0x271/0x530
[ 2747.933064] ? xfs_destroy_ioend+0x3d3/0x4e0 [xfs]
[ 2747.938409] ? end_page_writeback+0xce/0x110
[ 2747.943171] __asan_report_load8_noabort+0x19/0x20
[ 2747.948545] xfs_destroy_ioend+0x3d3/0x4e0 [xfs]
[ 2747.953724] xfs_end_io+0x1af/0x2b0 [xfs]
[ 2747.958197] process_one_work+0x5ff/0x1000
[ 2747.962766] worker_thread+0xe4/0x10e0
[ 2747.966946] kthread+0x2d3/0x3d0
[ 2747.970546] ? process_one_work+0x1000/0x1000
[ 2747.975405] ? kthread_create_on_node+0xc0/0xc0
[ 2747.980457] ? syscall_return_slowpath+0xe6/0x140
[ 2747.985706] ? do_page_fault+0x30/0x80
[ 2747.989887] ret_from_fork+0x2c/0x40
[ 2747.993874] Object at ffff8801395ae690, in cache buffer_head size: 104
[ 2748.001155] Allocated:
[ 2748.003782] PID = 8327
[ 2748.006411] save_stack_trace+0x1b/0x20
[ 2748.010688] save_stack+0x46/0xd0
[ 2748.014383] kasan_kmalloc+0xad/0xe0
[ 2748.018370] kasan_slab_alloc+0x12/0x20
[ 2748.022648] kmem_cache_alloc+0xb8/0x1b0
[ 2748.027024] alloc_buffer_head+0x22/0xc0
[ 2748.031399] alloc_page_buffers+0xd1/0x250
[ 2748.035968] create_empty_buffers+0x30/0x410
[ 2748.040730] create_page_buffers+0x120/0x1b0
[ 2748.045493] __block_write_begin_int+0x17a/0x1800
[ 2748.050740] iomap_write_begin+0x100/0x2f0
[ 2748.055308] iomap_zero_range_actor+0x253/0x5c0
[ 2748.060362] iomap_apply+0x157/0x270
[ 2748.064347] iomap_zero_range+0x5a/0x80
[ 2748.068624] iomap_truncate_page+0x6b/0xa0
[ 2748.073227] xfs_setattr_size+0x1f7/0xa10 [xfs]
[ 2748.078312] xfs_vn_setattr_size+0x68/0x140 [xfs]
[ 2748.083589] xfs_file_fallocate+0x4ac/0x820 [xfs]
[ 2748.088838] vfs_fallocate+0x2cf/0x780
[ 2748.093021] SyS_fallocate+0x48/0x80
[ 2748.097006] do_syscall_64+0x18a/0x430
[ 2748.101186] return_from_SYSCALL_64+0x0/0x6a
[ 2748.105948] Freed:
[ 2748.108189] PID = 8327
[ 2748.110816] save_stack_trace+0x1b/0x20
[ 2748.115093] save_stack+0x46/0xd0
[ 2748.118788] kasan_slab_free+0x73/0xc0
[ 2748.122969] kmem_cache_free+0x7a/0x200
[ 2748.127247] free_buffer_head+0x41/0x80
[ 2748.131524] try_to_free_buffers+0x178/0x250
[ 2748.136316] xfs_vm_releasepage+0x2e9/0x3d0 [xfs]
[ 2748.141563] try_to_release_page+0x100/0x180
[ 2748.146325] invalidate_inode_pages2_range+0x7da/0xcf0
[ 2748.152087] xfs_shift_file_space+0x37d/0x6e0 [xfs]
[ 2748.157557] xfs_collapse_file_space+0x49/0x120 [xfs]
[ 2748.163223] xfs_file_fallocate+0x2a7/0x820 [xfs]
[ 2748.168462] vfs_fallocate+0x2cf/0x780
[ 2748.172642] SyS_fallocate+0x48/0x80
[ 2748.176629] do_syscall_64+0x18a/0x430
[ 2748.180810] return_from_SYSCALL_64+0x0/0x6a
Fixed it by checking on offset against end & breaking out first,
dereference bh only if there're still bufferheads to process.
Signed-off-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
Otherwise it is possible to trigger crashes due to the metadata being
inaccessible yet these methods don't safely account for that possibility
without these checks.
Cc: stable@vger.kernel.org
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
|
SFM is mapping doublequote to 0xF020
Without this patch creating files with doublequote fails to Windows/Mac
Signed-off-by: Bjoern Jacke <bjacke@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
CC: stable <stable@vger.kernel.org>
|
|
While honouring the DMA_ATTR_FORCE_CONTIGUOUS on arm64 (commit
44176bb38fa4: "arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to
IOMMU"), the existing uses of dma_mmap_attrs() and dma_get_sgtable()
have been broken by passing a physically contiguous vm_struct with an
invalid pages pointer through the common iommu API.
Since the coherent allocation with DMA_ATTR_FORCE_CONTIGUOUS uses CMA,
this patch simply reuses the existing swiotlb logic for mmap and
get_sgtable.
Note that the current implementation of get_sgtable (both swiotlb and
iommu) is broken if dma_declare_coherent_memory() is used since such
memory does not have a corresponding struct page. To be addressed in a
subsequent patch.
Fixes: 44176bb38fa4 ("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU")
Reported-by: Andrzej Hajda <a.hajda@samsung.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
based on commit b3b42c0deaa1
("fs/affs: make export work with cold dcache")
This adds get_parent function so that nfs client can still work after
cache drop (Tested on NFS v4 with echo 3 > /proc/sys/vm/drop_caches)
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
|
|
In OpenRISC we do not have a bootloader passed initrd, but the built in
initramfs does contain the /init and other binaries, including modules.
The previous commit 08865514805d2 ("initramfs: finish fput() before
accessing any binary from initramfs") made a change to only call fput()
if the bootloader initrd was available, this caused intermittent crashes
for OpenRISC.
This patch changes the fput() to happen unconditionally if any rootfs is
loaded. Also, I added some comments to make it a bit more clear why we
call unpack_to_rootfs() multiple times.
Fixes: 08865514805d2 ("initramfs: finish fput() before accessing any binary from initramfs")
Cc: stable@vger.kernel.org
Cc: Lokesh Vutla <lokeshvutla@ti.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Stafford Horne <shorne@gmail.com>
|
|
Fix failures to create namespaces due to the vmem_altmap not advertising
enough free space to store the memmap.
WARNING: CPU: 15 PID: 8022 at arch/x86/mm/init_64.c:656 arch_add_memory+0xde/0xf0
[..]
Call Trace:
dump_stack+0x63/0x83
__warn+0xcb/0xf0
warn_slowpath_null+0x1d/0x20
arch_add_memory+0xde/0xf0
devm_memremap_pages+0x244/0x440
pmem_attach_disk+0x37e/0x490 [nd_pmem]
nd_pmem_probe+0x7e/0xa0 [nd_pmem]
nvdimm_bus_probe+0x71/0x120 [libnvdimm]
driver_probe_device+0x2bb/0x460
bind_store+0x114/0x160
drv_attr_store+0x25/0x30
In commit 658922e57b84 "libnvdimm, pfn: fix memmap reservation sizing"
we arranged for the capacity to be allocated, but failed to also update
the 'npfns' parameter. This leads to cases where there is enough
capacity reserved to hold all the allocated sections, but
vmemmap_populate_hugepages() still encounters -ENOMEM from
altmap_alloc_block_buf().
This fix is a stop-gap until we can teach the core memory hotplug
implementation to permit sub-section hotplug.
Cc: <stable@vger.kernel.org>
Fixes: 658922e57b84 ("libnvdimm, pfn: fix memmap reservation sizing")
Reported-by: Anisha Allada <anisha.allada@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
freedesktop.org has adopted a formal&enforced code of conduct:
https://www.fooishbar.org/blog/fdo-contributor-covenant/
https://www.freedesktop.org/wiki/CodeOfConduct/
Besides formalizing things a bit more I don't think this changes
anything for us, we've already peer-enforced respectful and
constructive interactions since a long time. But it's good to document
things properly.
v2: Drop confusing note from commit message and clarify the grammer
(Chris, Alex and others).
Cc: Daniel Stone <daniels@collabora.com>
Cc: Keith Packard <keithp@keithp.com>
Cc: tfheen@err.no
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Archit Taneja <architt@codeaurora.org>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Acked-by: Thierry Reding <treding@nvidia.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Vincent Abriou <vincent.abriou@st.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Brian Starkey <brian.starkey@arm.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Keith Packard <keithp@keithp.com>
Acked-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Acked-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Per the latest version of the "NVDIMM DSM Interface Example" [1], the
label data retrieval routine can report a "locked" status. In this case
all regions associated with that DIMM are disabled until the label area
is unlocked. Provide generic libnvdimm enabling for NVDIMMs with label
data area locking capabilities.
[1]: http://pmem.io/documents/
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
This is a preparation patch for handling locked nvdimm label regions, a
new concept as introduced by the latest DSM document on pmem.io [1]. A
future patch will leverage nvdimm_set_locked() at DIMM probe time to
flag regions that can not be enabled. There should be no functional
difference resulting from this change.
[1]: http://pmem.io/documents/NVDIMM_DSM_Interface_Example-V1.3.pdf
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
sparse generates the following warnings in drivers/of/:
../drivers/of/fdt.c:63:36: warning: cast to restricted __be32
../drivers/of/fdt.c:68:33: warning: cast to restricted __be32
../drivers/of/irq.c:105:88: warning: incorrect type in initializer (different base types)
../drivers/of/irq.c:105:88: expected restricted __be32
../drivers/of/irq.c:105:88: got int
../drivers/of/irq.c:526:35: warning: incorrect type in assignment (different modifiers)
../drivers/of/irq.c:526:35: expected int ( *const [usertype] irq_init_cb )( ... )
../drivers/of/irq.c:526:35: got void const *const data
../drivers/of/of_reserved_mem.c:200:50: warning: incorrect type in initializer (different modifiers)
../drivers/of/of_reserved_mem.c:200:50: expected int ( *[usertype] initfn )( ... )
../drivers/of/of_reserved_mem.c:200:50: got void const *const data
../drivers/of/resolver.c:95:42: warning: incorrect type in assignment (different base types)
../drivers/of/resolver.c:95:42: expected unsigned int [unsigned] [usertype] <noident>
../drivers/of/resolver.c:95:42: got restricted __be32 [usertype] <noident>
All these are harmless type mismatches fixed by adjusting the types.
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
A large directory full of differently sized file names triggered this.
Most directories, even very large directories with shorter names, would
be lucky enough to fit in one server response.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
|
|
If an application seeks to a position before the point which has been
read, it must want updates which have been made to the directory. So
delete the copy stored in the kernel so it will be fetched again.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
|
|
If userspace seeks to a position in the stream which is not correct, it
would have returned EIO because the data in the buffer at that offset
would be incorrect. This and the userspace daemon returning a corrupt
directory are indistinguishable.
Now if the data does not look right, skip forward to the next chunk and
try again. The motivation is that if the directory changes, an
application may seek to a position that was valid and no longer is valid.
It is not yet possible for a directory to change.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
|
|
sparse gives the following warning for 'pci_space':
../drivers/of/address.c:266:26: warning: incorrect type in assignment (different base types)
../drivers/of/address.c:266:26: expected unsigned int [unsigned] [usertype] pci_space
../drivers/of/address.c:266:26: got restricted __be32 const [usertype] <noident>
It appears that pci_space is only ever accessed on powerpc, so the endian
swap is often not needed.
Cc: stable@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
sparse gives a warning that 'handle' is not a __be32:
../drivers/of/base.c:2261:61: warning: incorrect type in argument 1 (different base types)
../drivers/of/base.c:2261:61: expected restricted __be32 const [usertype] *p
../drivers/of/base.c:2261:61: got unsigned int const [usertype] *[assigned] handle
We could just change the type, but the code can be improved by using
of_parse_phandle instead of open coding it with of_get_property and
of_find_node_by_phandle.
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Due to the way I did the RX bitrate conversions in mac80211 with
spatch, going setting flags to setting the value, many drivers now
don't set the bandwidth value for 20 MHz, since with the flags it
wasn't necessary to (there was no 20 MHz flag, only the others.)
Rather than go through and try to fix up all the drivers, instead
renumber the enum so that 20 MHz, which is the typical bandwidth,
actually has the value 0, making those drivers all work again.
If VHT was hit used with a driver not reporting it, e.g. iwlmvm,
this manifested in hitting the bandwidth warning in
cfg80211_calculate_bitrate_vht().
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Andrey reported a crash on init_net.ipv6.ip6_null_entry->rt6i_idev
since it is always NULL.
This is clearly wrong, we have code to initialize it to loopback_dev,
unfortunately the order is still not correct.
loopback_dev is registered very early during boot, we lose a chance
to re-initialize it in notifier. addrconf_init() is called after
ip6_route_init(), which means we have no chance to correct it.
Fix it by moving this initialization explicitly after
ipv6_add_dev(init_net.loopback_dev) in addrconf_init().
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fail the configuration of advertised speed-autoneg value if the config
update is not supported.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Driver currently uses advertised-autoneg value to populate the
supported-autoneg field. When advertised field is updated, user gets
the same value for supported field. Supported-autoneg value need to be
populated from the link capabilities value returned by the MFW.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Value for status block id could be more than 256 in 100G mode, need to
update its data type from u8 to u16.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Static checkers complain that we should unlock before returning. Which
is true.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Frank Rowand <frank.rowand@sony.com>
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
IFLA_PHYS_PORT_NAME is a string attribute, so terminate it with \0.
Otherwise libnl3 fails to validate netlink messages with this attribute.
"ip -detail a" assumes too that the attribute is NUL-terminated when
printing it. It often was, due to padding.
I noticed this as libvirtd failing to start on a system with sfc driver
after upgrading it to Linux 4.11, i.e. when sfc added support for
phys_port_name.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This fixes a race where vmbus callback for new packet arriving
could occur before NAPI is initialized.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
V2: using "aquantia" subsystem tag.
The command "ethtool -i ethX" should display driver name (driver: atlantic)
instead vendor name (driver: aquantia).
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
raw_send_hdrinc() and rawv6_send_hdrinc() expect that the buffer copied
from the userspace contains the IPv4/IPv6 header, so if too few bytes are
copied, parts of the header may remain uninitialized.
This bug has been detected with KMSAN.
For the record, the KMSAN report:
==================================================================
BUG: KMSAN: use of unitialized memory in nf_ct_frag6_gather+0xf5a/0x44a0
inter: 0
CPU: 0 PID: 1036 Comm: probe Not tainted 4.11.0-rc5+ #2455
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16
dump_stack+0x143/0x1b0 lib/dump_stack.c:52
kmsan_report+0x16b/0x1e0 mm/kmsan/kmsan.c:1078
__kmsan_warning_32+0x5c/0xa0 mm/kmsan/kmsan_instr.c:510
nf_ct_frag6_gather+0xf5a/0x44a0 net/ipv6/netfilter/nf_conntrack_reasm.c:577
ipv6_defrag+0x1d9/0x280 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
nf_hook_entry_hookfn ./include/linux/netfilter.h:102
nf_hook_slow+0x13f/0x3c0 net/netfilter/core.c:310
nf_hook ./include/linux/netfilter.h:212
NF_HOOK ./include/linux/netfilter.h:255
rawv6_send_hdrinc net/ipv6/raw.c:673
rawv6_sendmsg+0x2fcb/0x41a0 net/ipv6/raw.c:919
inet_sendmsg+0x3f8/0x6d0 net/ipv4/af_inet.c:762
sock_sendmsg_nosec net/socket.c:633
sock_sendmsg net/socket.c:643
SYSC_sendto+0x6a5/0x7c0 net/socket.c:1696
SyS_sendto+0xbc/0xe0 net/socket.c:1664
do_syscall_64+0x72/0xa0 arch/x86/entry/common.c:285
entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:246
RIP: 0033:0x436e03
RSP: 002b:00007ffce48baf38 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00000000004002b0 RCX: 0000000000436e03
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 00007ffce48baf90 R08: 00007ffce48baf50 R09: 000000000000001c
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000401790 R14: 0000000000401820 R15: 0000000000000000
origin: 00000000d9400053
save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:362
kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:257
kmsan_poison_shadow+0x6d/0xc0 mm/kmsan/kmsan.c:270
slab_alloc_node mm/slub.c:2735
__kmalloc_node_track_caller+0x1f4/0x390 mm/slub.c:4341
__kmalloc_reserve net/core/skbuff.c:138
__alloc_skb+0x2cd/0x740 net/core/skbuff.c:231
alloc_skb ./include/linux/skbuff.h:933
alloc_skb_with_frags+0x209/0xbc0 net/core/skbuff.c:4678
sock_alloc_send_pskb+0x9ff/0xe00 net/core/sock.c:1903
sock_alloc_send_skb+0xe4/0x100 net/core/sock.c:1920
rawv6_send_hdrinc net/ipv6/raw.c:638
rawv6_sendmsg+0x2918/0x41a0 net/ipv6/raw.c:919
inet_sendmsg+0x3f8/0x6d0 net/ipv4/af_inet.c:762
sock_sendmsg_nosec net/socket.c:633
sock_sendmsg net/socket.c:643
SYSC_sendto+0x6a5/0x7c0 net/socket.c:1696
SyS_sendto+0xbc/0xe0 net/socket.c:1664
do_syscall_64+0x72/0xa0 arch/x86/entry/common.c:285
return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:246
==================================================================
, triggered by the following syscalls:
socket(PF_INET6, SOCK_RAW, IPPROTO_RAW) = 3
sendto(3, NULL, 0, 0, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "ff00::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EPERM
A similar report is triggered in net/ipv4/raw.c if we use a PF_INET socket
instead of a PF_INET6 one.
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
head is previously null checked and so the 2nd null check on head
is redundant and therefore can be removed.
Detected by CoverityScan, CID#1399505 ("Logically dead code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|