Age | Commit message (Collapse) | Author | Files | Lines |
|
A test case was reported where two linked reads with registered buffers
failed the second link always. This is because we set the expected value
of a request in req->result, and if we don't get this result, then we
fail the dependent links. For some reason the registered buffer import
returned -ERROR/0, while the normal import returns -ERROR/length. This
broke linked commands with registered buffers.
Fix this by making io_import_fixed() correctly return the mapped length.
Cc: stable@vger.kernel.org # v5.3
Reported-by: 李通洲 <carter.li@eoitek.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For timeout requests io_uring tries to grab a file with specified fd,
which is usually stdin/fd=0.
Update io_op_needs_file()
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
__bio_try_merge_page() may merge a page to bio without bio_full() check
and cause bi_size overflow.
The overflow typically ends up with sd_init_command() warning on zero
segment request with call trace like this:
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1986 at drivers/scsi/scsi_lib.c:1025 scsi_init_io+0x156/0x180
CPU: 2 PID: 1986 Comm: kworker/2:1H Kdump: loaded Not tainted 5.4.0-rc7 #1
Workqueue: kblockd blk_mq_run_work_fn
RIP: 0010:scsi_init_io+0x156/0x180
RSP: 0018:ffffa11487663bf0 EFLAGS: 00010246
RAX: 00000000002be0a0 RBX: ffff8e6e9ff30118 RCX: 0000000000000000
RDX: 00000000ffffffe1 RSI: 0000000000000000 RDI: ffff8e6e9ff30118
RBP: ffffa11487663c18 R08: ffffa11487663d28 R09: ffff8e6e9ff30150
R10: 0000000000000001 R11: 0000000000000000 R12: ffff8e6e9ff30000
R13: 0000000000000001 R14: ffff8e74a1cf1800 R15: ffff8e6e9ff30000
FS: 0000000000000000(0000) GS:ffff8e6ea7680000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fff18cf0fe8 CR3: 0000000659f0a001 CR4: 00000000001606e0
Call Trace:
sd_init_command+0x326/0xb40 [sd_mod]
scsi_queue_rq+0x502/0xaa0
? blk_mq_get_driver_tag+0xe7/0x120
blk_mq_dispatch_rq_list+0x256/0x5a0
? elv_rb_del+0x24/0x30
? deadline_remove_request+0x7b/0xc0
blk_mq_do_dispatch_sched+0xa3/0x140
blk_mq_sched_dispatch_requests+0xfb/0x170
__blk_mq_run_hw_queue+0x81/0x130
blk_mq_run_work_fn+0x1b/0x20
process_one_work+0x179/0x390
worker_thread+0x4f/0x3e0
kthread+0x105/0x140
? max_active_store+0x80/0x80
? kthread_bind+0x20/0x20
ret_from_fork+0x35/0x40
---[ end trace f9036abf5af4a4d3 ]---
blk_update_request: I/O error, dev sdd, sector 2875552 op 0x1:(WRITE) flags 0x0 phys_seg 0 prio class 0
XFS (sdd1): writeback error on sector 2875552
__bio_try_merge_page() should check the overflow before actually doing
merge.
Fixes: 07173c3ec276c ("block: enable multipage bvecs")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently we make sequence == 0 be the same as sequence == 1, but that's
not super useful if the intent is really to have a timeout that's just
a pure timeout.
If the user passes in sqe->off == 0, then don't apply any sequence logic
to the request, let it purely be driven by the timeout specified.
Reported-by: 李通洲 <carter.li@eoitek.com>
Reviewed-by: 李通洲 <carter.li@eoitek.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
cgroup writeback tries to refresh the associated wb immediately if the
current wb is dead. This is to avoid keeping issuing IOs on the stale
wb after memcg - blkcg association has changed (ie. when blkcg got
disabled / enabled higher up in the hierarchy).
Unfortunately, the logic gets triggered spuriously on inodes which are
associated with dead cgroups. When the logic is triggered on dead
cgroups, the attempt fails only after doing quite a bit of work
allocating and initializing a new wb.
While c3aab9a0bd91 ("mm/filemap.c: don't initiate writeback if mapping
has no dirty pages") alleviated the issue significantly as it now only
triggers when the inode has dirty pages. However, the condition can
still be triggered before the inode is switched to a different cgroup
and the logic simply doesn't make sense.
Skip the immediate switching if the associated memcg is dying.
This is a simplified version of the following two patches:
* https://lore.kernel.org/linux-mm/20190513183053.GA73423@dennisz-mbp/
* http://lkml.kernel.org/r/156355839560.2063.5265687291430814589.stgit@buzz
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Fixes: e8a7abf5a5bd ("writeback: disassociate inodes from dying bdi_writebacks")
Acked-by: Dennis Zhou <dennis@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There are two callers of this function and they both unlock the mutex so
this ends up being a double unlock.
Fixes: 44ed167da748 ("drbd: rcu_read_lock() and rcu_dereference() for tconn->net_conf")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
blkcg_print_stat() iterates blkgs under RCU and doesn't test whether
the blkg is online. This can call into pd_stat_fn() on a pd which is
still being initialized leading to an oops.
The heaviest operation - recursively summing up rwstat counters - is
already done while holding the queue_lock. Expand queue_lock to cover
the other operations and skip the blkg if it isn't online yet. The
online state is protected by both blkcg and queue locks, so this
guarantees that only online blkgs are processed.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Roman Gushchin <guro@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Fixes: 903d23f0a354 ("blk-cgroup: allow controllers to output their own stats")
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Changing nvme_passthru_cmd64 to add a field: rsvd2. This field is an explicit
marker for the padding space added on certain platforms as a result of the
enlargement of the result field from 32 bit to 64 bits in size, and
fixes differences in struct size when using compat ioctl for 32-bit
binaries on 64-bit architecture.
Fixes: 65e68edce0db ("nvme: allow 64-bit results in passthru commands")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Charles Machalow <csm10495@gmail.com>
[changelog]
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
nvme_mpath_clear_ctrl_paths() iterates through
the ctrl->namespaces list while holding ctrl->scan_lock.
This does not seem to be the correct way of protecting
from concurrent list modification.
Specifically, nvme_scan_work() sorts ctrl->namespaces
AFTER unlocking scan_lock.
This may result in the following (rare) crash in ctrl disconnect
during scan_work:
BUG: kernel NULL pointer dereference, address: 0000000000000050
Oops: 0000 [#1] SMP PTI
CPU: 0 PID: 3995 Comm: nvme 5.3.5-050305-generic
RIP: 0010:nvme_mpath_clear_current_path+0xe/0x90 [nvme_core]
...
Call Trace:
nvme_mpath_clear_ctrl_paths+0x3c/0x70 [nvme_core]
nvme_remove_namespaces+0x35/0xe0 [nvme_core]
nvme_do_delete_ctrl+0x47/0x90 [nvme_core]
nvme_sysfs_delete+0x49/0x60 [nvme_core]
dev_attr_store+0x17/0x30
sysfs_kf_write+0x3e/0x50
kernfs_fop_write+0x11e/0x1a0
__vfs_write+0x1b/0x40
vfs_write+0xb9/0x1a0
ksys_write+0x67/0xe0
__x64_sys_write+0x1a/0x20
do_syscall_64+0x5a/0x130
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f8d02bfb154
Fix:
After taking scan_lock in nvme_mpath_clear_ctrl_paths()
down_read(&ctrl->namespaces_rwsem) as well to make list traversal safe.
This will not cause deadlocks because taking scan_lock never happens
while holding the namespaces_rwsem.
Moreover, scan work downs namespaces_rwsem in the same order.
Alternative: sort ctrl->namespaces in nvme_scan_work()
while still holding the scan_lock.
This would leave nvme_mpath_clear_ctrl_paths() without correct protection
against ctrl->namespaces modification by anyone other than scan_work.
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
In case there are controllers that are not associated with any RDMA
device (e.g. during unsuccessful reconnection) and the user will unload
the module, these controllers will not be freed and will access already
freed memory. The same logic appears in other fabric drivers as well.
Fixes: 87fd125344d6 ("nvme-rdma: remove redundant reference between ib_device and tagset")
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
This code causes a static analysis warning:
block/blk-iocost.c:2113 ioc_weight_write() error: double lock 'irq'
We disable IRQs in blkg_conf_prep() and re-enable them in
blkg_conf_finish(). IRQ disable/enable should not be nested because
that means the IRQs will be enabled at the first unlock instead of the
second one.
Fixes: 7caa47151ab2 ("blkcg: implement blk-iocost")
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We use io_kiocb->result == -EAGAIN as a way to know if we need to
re-submit a polled request, as -EAGAIN reporting happens out-of-line
for IO submission failures. This field is cleared when we originally
allocate the request, but it isn't reset when we retry the submission
from async context. This can cause issues where we think something
needs a re-issue, but we're really just reading stale data.
Reset ->result whenever we re-prep a request for polled submission.
Cc: stable@vger.kernel.org
Fixes: 9e645e1105ca ("io_uring: add support for sqe links")
Reported-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Fixes crashes due to ubd requeue logic conflicting with the block-mq
logic. Crash is reproducible in 5.0 - 5.3.
Fixes: 53766defb8c8 ("um: Clean-up command processing in UML UBD driver")
Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
groups_only mode in nvme_read_ana_log() is no longer used: remove it.
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The following scenario results in an IO hang:
1) ctrl completes a request with NVME_SC_ANA_TRANSITION.
NVME_NS_ANA_PENDING bit in ns->flags is set and ana_work is triggered.
2) ana_work: nvme_read_ana_log() tries to get the ANA log page from the ctrl.
This fails because ctrl disconnects.
Therefore nvme_update_ns_ana_state() is not called
and NVME_NS_ANA_PENDING bit in ns->flags is not cleared.
3) ctrl reconnects: nvme_mpath_init(ctrl,...) calls
nvme_read_ana_log(ctrl, groups_only=true).
However, nvme_update_ana_state() does not update namespaces
because nr_nsids = 0 (due to groups_only mode).
4) scan_work calls nvme_validate_ns() finds the ns and re-validates OK.
Result:
The ctrl is now live but NVME_NS_ANA_PENDING bit in ns->flags is still set.
Consequently ctrl will never be considered a viable path by __nvme_find_path().
IO will hang if ctrl is the only or the last path to the namespace.
More generally, while ctrl is reconnecting, its ANA state may change.
And because nvme_mpath_init() requests ANA log in groups_only mode,
these changes are not propagated to the existing ctrl namespaces.
This may result in a mal-function or an IO hang.
Solution:
nvme_mpath_init() will nvme_read_ana_log() with groups_only set to false.
This will not harm the new ctrl case (no namespaces present),
and will make sure the ANA state of namespaces gets updated after reconnect.
Note: Another option would be for nvme_mpath_init() to invoke
nvme_parse_ana_log(..., nvme_set_ns_ana_state) for each existing namespace.
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
syzkaller reported an issue where it looks like a malicious app can
trigger a use-after-free of reading the ctx ->sq_array and ->rings
value right after having installed the ring fd in the process file
table.
Defer ring fd installation until after we're done reading those
values.
Fixes: 75b28affdd6a ("io_uring: allocate the two rings together")
Reported-by: syzbot+6f03d895a6cd0d06187f@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_queue_link_head() owns shadow_req after taking it as an argument.
By not freeing it in case of an error, it can leak the request along
with taken ctx->refs.
Reviewed-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
|
|
If we always compile the get_break_insn_length inline function we can
remove the ifdefs and let dead code elimination take care of the warn
branch that is now unreadable because the report_bug stub always
returns BUG_TRAP_TYPE_BUG.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
|
|
nbd requires socket families to support the shutdown method so the nbd
recv workqueue can be woken up from its sock_recvmsg call. If the socket
does not support the callout we will leave recv works running or get hangs
later when the device or module is removed.
This adds a check during socket connection/reconnection to make sure the
socket being passed in supports the needed callout.
Reported-by: syzbot+24c12fa8d218ed26011a@syzkaller.appspotmail.com
Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This driver is using regulator_get_optional() to handle all the supplies
that it handles, and only ever enables and disables all supplies en masse
without ever doing any other configuration of the device to handle missing
power. These are clear signs that the API is being misused - it should only
be used for supplies that may be physically absent from the system and in
these cases the hardware usually needs different configuration if the
supply is missing. Instead use normal regualtor_get(), if the supply is
not described in DT then the framework will substitute a dummy regulator in
so no special handling is needed by the consumer driver.
In the case of the PHY regulator the handling in the driver is a hack to
deal with integrated PHYs; the supplies are only optional in the sense
that that there's some confusion in the code about where they're bound to.
From a code point of view they function exactly as normal supplies so can
be treated as such. It'd probably be better to model this by instantiating
a PHY object for integrated PHYs.
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We hit the following warning in production
print_req_error: I/O error, dev nbd0, sector 7213934408 flags 80700
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 25 PID: 32407 at lib/refcount.c:190 refcount_sub_and_test_checked+0x53/0x60
Workqueue: knbd-recv recv_work [nbd]
RIP: 0010:refcount_sub_and_test_checked+0x53/0x60
Call Trace:
blk_mq_free_request+0xb7/0xf0
blk_mq_complete_request+0x62/0xf0
recv_work+0x29/0xa1 [nbd]
process_one_work+0x1f5/0x3f0
worker_thread+0x2d/0x3d0
? rescuer_thread+0x340/0x340
kthread+0x111/0x130
? kthread_create_on_node+0x60/0x60
ret_from_fork+0x1f/0x30
---[ end trace b079c3c67f98bb7c ]---
This was preceded by us timing out everything and shutting down the
sockets for the device. The problem is we had a request in the queue at
the same time, so we completed the request twice. This can actually
happen in a lot of cases, we fail to get a ref on our config, we only
have one connection and just error out the command, etc.
Fix this by checking cmd->status in nbd_read_stat. We only change this
under the cmd->lock, so we are safe to check this here and see if we've
already error'ed this command out, which would indicate that we've
completed it as well.
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We already do this for the most part, except in timeout and clear_req.
For the timeout case we take the lock after we grab a ref on the config,
but that isn't really necessary because we're safe to touch the cmd at
this point, so just move the order around.
For the clear_req cause this is initiated by the user, so again is safe.
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We currently assume that submissions from the sqthread are successful,
and if IO polling is enabled, we use that value for knowing how many
completions to look for. But if we overflowed the CQ ring or some
requests simply got errored and already completed, they won't be
available for polling.
For the case of IO polling and SQTHREAD usage, look at the pending
poll list. If it ever hits empty then we know that we don't have
anymore pollable requests inflight. For that case, simply reset
the inflight count to zero.
Reported-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We currently use the ring values directly, but that can lead to issues
if the application is malicious and changes these values on our behalf.
Created in-kernel cached versions of them, and just overwrite the user
side when we update them. This is similar to how we treat the sq/cq
ring tail/head updates.
Reported-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Relax qspi pins slew-rate to minimize peak currents.
Fixes: 844030057339 ("ARM: dts: stm32: add flash nor support on stm32mp157c eval board")
Link: https://lore.kernel.org/r/20191025130122.11407-1-alexandre.torgue@st.com
Signed-off-by: Patrice Chotard <patrice.chotard@st.com>
Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
io_ring_submit() finalises with
1. io_commit_sqring(), which releases sqes to the userspace
2. Then calls to io_queue_link_head(), accessing released head's sqe
Reorder them.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
io_sq_thread() processes sqes by 8 without considering links. As a
result, links will be randomely subdivided.
The easiest way to fix it is to call io_get_sqring() inside
io_submit_sqes() as do io_ring_submit().
Downsides:
1. This removes optimisation of not grabbing mm_struct for fixed files
2. It submitting all sqes in one go, without finer-grained sheduling
with cq processing.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There is a bug, where failed linked requests are returned not with
specified @user_data, but with garbage from a kernel stack.
The reason is that io_fail_links() uses req->user_data, which is
uninitialised when called from io_queue_sqe() on fail path.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Support for the kernel as Xen 32-bit PV guest will soon be removed.
Issue a warning when booted as such.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
|
|
Modify plic_init() to skip .dts interrupt contexts other
than supervisor external interrupt.
The .dts entry for plic may specify multiple interrupt contexts.
For example, it may assign two entries IRQ_M_EXT and IRQ_S_EXT,
in that order, to the same interrupt controller. This patch
modifies plic_init() to skip the IRQ_M_EXT context since
IRQ_S_EXT is currently the only supported context.
If IRQ_M_EXT is not skipped, plic_init() will report "handler
already present for context" when it comes across the IRQ_S_EXT
context in the next iteration of its loop.
Without this patch, .dts would have to be edited to replace the
value of IRQ_M_EXT with -1 for it to be skipped.
Signed-off-by: Alan Mikhak <alan.mikhak@sifive.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Paul Walmsley <paul.walmsley@sifive.com> # arch/riscv
Link: https://lkml.kernel.org/r/1571933503-21504-1-git-send-email-alan.mikhak@sifive.com
|
|
There's a deadlock that is possible and can easily be seen with
a test where multiple readers open/read/close of the same file
and a disruption occurs causing reconnect. The deadlock is due
a reader thread inside cifs_strict_readv calling down_read and
obtaining lock_sem, and then after reconnect inside
cifs_reopen_file calling down_read a second time. If in
between the two down_read calls, a down_write comes from
another process, deadlock occurs.
CPU0 CPU1
---- ----
cifs_strict_readv()
down_read(&cifsi->lock_sem);
_cifsFileInfo_put
OR
cifs_new_fileinfo
down_write(&cifsi->lock_sem);
cifs_reopen_file()
down_read(&cifsi->lock_sem);
Fix the above by changing all down_write(lock_sem) calls to
down_write_trylock(lock_sem)/msleep() loop, which in turn
makes the second down_read call benign since it will never
block behind the writer while holding lock_sem.
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Suggested-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed--by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
|
|
Currently the code assumes that if a file info entry belongs
to lists of open file handles of an inode and a tcon then
it has non-zero reference. The recent changes broke that
assumption when putting the last reference of the file info.
There may be a situation when a file is being deleted but
nothing prevents another thread to reference it again
and start using it. This happens because we do not hold
the inode list lock while checking the number of references
of the file info structure. Fix this by doing the proper
locking when doing the check.
Fixes: 487317c99477d ("cifs: add spinlock for the openFileList to cifsInodeInfo")
Fixes: cb248819d209d ("cifs: use cifsInodeInfo->open_file_lock while iterating to avoid a panic")
Cc: Stable <stable@vger.kernel.org>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
When the client hits reconnect it iterates over the mid
pending queue marking entries for retry and moving them
to a temporary list to issue callbacks later without holding
GlobalMid_Lock. In the same time there is no guarantee that
mids can't be removed from the temporary list or even
freed completely by another thread. It may cause a temporary
list corruption:
[ 430.454897] list_del corruption. prev->next should be ffff98d3a8f316c0, but was 2e885cb266355469
[ 430.464668] ------------[ cut here ]------------
[ 430.466569] kernel BUG at lib/list_debug.c:51!
[ 430.468476] invalid opcode: 0000 [#1] SMP PTI
[ 430.470286] CPU: 0 PID: 13267 Comm: cifsd Kdump: loaded Not tainted 5.4.0-rc3+ #19
[ 430.473472] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 430.475872] RIP: 0010:__list_del_entry_valid.cold+0x31/0x55
...
[ 430.510426] Call Trace:
[ 430.511500] cifs_reconnect+0x25e/0x610 [cifs]
[ 430.513350] cifs_readv_from_socket+0x220/0x250 [cifs]
[ 430.515464] cifs_read_from_socket+0x4a/0x70 [cifs]
[ 430.517452] ? try_to_wake_up+0x212/0x650
[ 430.519122] ? cifs_small_buf_get+0x16/0x30 [cifs]
[ 430.521086] ? allocate_buffers+0x66/0x120 [cifs]
[ 430.523019] cifs_demultiplex_thread+0xdc/0xc30 [cifs]
[ 430.525116] kthread+0xfb/0x130
[ 430.526421] ? cifs_handle_standard+0x190/0x190 [cifs]
[ 430.528514] ? kthread_park+0x90/0x90
[ 430.530019] ret_from_fork+0x35/0x40
Fix this by obtaining extra references for mids being retried
and marking them as MID_DELETED which indicates that such a mid
has been dequeued from the pending list.
Also move mid cleanup logic from DeleteMidQEntry to
_cifs_mid_q_entry_release which is called when the last reference
to a particular mid is put. This allows to avoid any use-after-free
of response buffers.
The patch needs to be backported to stable kernels. A stable tag
is not mentioned below because the patch doesn't apply cleanly
to any actively maintained stable kernel.
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-and-tested-by: David Wysochanski <dwysocha@redhat.com>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Remove the following warning:
drivers/i2c/busses/i2c-stm32f7.c:315:
warning: cannot understand function prototype:
'struct stm32f7_i2c_spec i2c_specs[] =
Replace a comment starting with /** by simply /* to avoid having
it interpreted as a kernel-doc comment.
Fixes: aeb068c57214 ("i2c: i2c-stm32f7: add driver")
Signed-off-by: Alain Volmat <alain.volmat@st.com>
Reviewed-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
When in slave mode, an arbitration loss (ARLO) may be detected before the
slave had a chance to detect the stop condition (STOPF in ISR).
This is seen when two master + slave adapters switch their roles. It
provokes the i2c bus to be stuck, busy as SCL line is stretched.
- the I2C_SLAVE_STOP event is never generated due to STOPF flag is set but
don't generate an irq (race with ARLO irq, STOPIE is masked). STOPF flag
remains set until next master xfer (e.g. when STOPIE irq get unmasked).
In this case, completion is generated too early: immediately upon new
transfer request (then it doesn't send all data).
- Some data get stuck in TXDR register. As a consequence, the controller
stretches the SCL line: the bus gets busy until a future master transfer
triggers the bus busy / recovery mechanism (this can take time... and
may never happen at all)
So choice is to let the STOPF being detected by the slave isr handler,
to properly handle this stop condition. E.g. don't mask IRQs in error
handler, when the slave is running.
Fixes: 60d609f30de2 ("i2c: i2c-stm32f7: Add slave support")
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Reviewed-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
The slave-interface documentation [1] states "the bus driver should
transmit the first byte" upon I2C_SLAVE_READ_REQUESTED slave event:
- 'val': backend returns first byte to be sent
The driver currently ignores the 1st byte to send on this event.
[1] https://www.kernel.org/doc/Documentation/i2c/slave-interface
Fixes: 60d609f30de2 ("i2c: i2c-stm32f7: Add slave support")
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Reviewed-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Since commit abf4923e97c3 ("i2c: mediatek: disable zero-length transfers
for mt8183"), there is a NULL pointer dereference for all the SoCs
that don't have any quirk. mtk_i2c_functionality is not checking that
the quirks pointer is not NULL before starting to use it.
This commit add a call to i2c_check_quirks which will check whether
the quirks pointer is set, and if so will check if the IP has the
NO_ZERO_LEN quirk.
Fixes: abf4923e97c3 ("i2c: mediatek: disable zero-length transfers for mt8183")
Signed-off-by: Fabien Parent <fparent@baylibre.com>
Reviewed-by: Cengiz Can <cengiz@kernel.wtf>
Reviewed-by: Hsin-Yi Wang <hsinyi@chromium.org>
Tested-by: Ulrich Hecht <uli@fpond.eu>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
On a system without Single VMOVP support (say GITS_TYPER.VMOVP == 0),
we will map vPEs only on ITSs that will actually control interrupts
for the given VM. And when moving a vPE, the VMOVP command will be
issued only for those ITSs.
But when issuing VMOVPs we seemed fail to present the exact ITSList
to ITSs who are actually included in the synchronization operation.
The its_list_map we're currently using includes all ITSs in the system,
even though some of them don't have the corresponding vPE mapping at all.
Introduce get_its_list() to get the per-VM its_list_map, to indicate
which ITSs have vPE mappings for the given VM, and use this map as
the expected ITSList when building VMOVP. This is hopefully a performance
gain not to do some synchronization with those unsuspecting ITSs.
And initialize the whole command descriptor to zero at beginning, since
the seq_num and its_list should be RES0 when GITS_TYPER.VMOVP == 1.
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1571802386-2680-1-git-send-email-yuzenghui@huawei.com
|
|
gfs2 and gfs2meta share an ->init_fs_context function which allocates an
args structure stored in fc->fs_private. gfs2 registers a ->free
function to free this memory when the fs_context is cleaned up, but
there was not one registered for gfs2meta, causing a leak.
Register a ->free function for gfs2meta. The existing gfs2_fc_free
function does what we need.
Reported-by: syzbot+c2fdfd2b783754878fb6@syzkaller.appspotmail.com
Fixes: 1f52aa08d12f ("gfs2: Convert gfs2 to fs_context")
Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Part 3 from this series [1] was not merged due to wrong splitting
and breaks mt6323 pmic on bananapi-r2
dmesg prints this line and at least switch is not initialized on bananapi-r2
mt6397 1000d000.pwrap:mt6323: unsupported chip: 0x0
this patch contains only the probe-changes and chip_data structs
from original part 3 by Hsin-Hsiung Wang
[1] https://patchwork.kernel.org/project/linux-mediatek/list/?series=164155
Fixes: a4872e80ce7d ("mfd: mt6397: Extract IRQ related code from core driver")
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
|
|
build_restore_pagemask() will restore the value of register $1/$at when
its restore_scratch argument is non-zero, and aims to do so by filling a
branch delay slot. Commit 0b24cae4d535 ("MIPS: Add missing EHB in mtc0
-> mfc0 sequence.") added an EHB instruction (Execution Hazard Barrier)
prior to restoring $1 from a KScratch register, in order to resolve a
hazard that can result in stale values of the KScratch register being
observed. In particular, P-class CPUs from MIPS with out of order
execution pipelines such as the P5600 & P6600 are affected.
Unfortunately this EHB instruction was inserted in the branch delay slot
causing the MFC0 instruction which performs the restoration to no longer
execute along with the branch. The result is that the $1 register isn't
actually restored, ie. the TLB refill exception handler clobbers it -
which is exactly the problem the EHB is meant to avoid for the P-class
CPUs.
Similarly build_get_pgd_vmalloc() will restore the value of $1/$at when
its mode argument equals refill_scratch, and suffers from the same
problem.
Fix this by in both cases moving the EHB earlier in the emitted code.
There's no reason it needs to immediately precede the MFC0 - it simply
needs to be between the MTC0 & MFC0.
This bug only affects Cavium Octeon systems which use
build_fast_tlb_refill_handler().
Signed-off-by: Paul Burton <paulburton@kernel.org>
Fixes: 0b24cae4d535 ("MIPS: Add missing EHB in mtc0 -> mfc0 sequence.")
Cc: Dmitry Korotin <dkorotin@wavecomp.com>
Cc: stable@vger.kernel.org # v3.15+
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
|
|
The sequence number of the timeout req (req->sequence) indicate the
expected completion request. Because of each timeout req consume a
sequence number, so the sequence of each timeout req on the timeout
list shouldn't be the same. But now, we may get the same number (also
incorrect) if we insert a new entry before the last one, such as submit
such two timeout reqs on a new ring instance below.
req->sequence
req_1 (count = 2): 2
req_2 (count = 1): 2
Then, if we submit a nop req, req_2 will still timeout even the nop req
finished. This patch fix this problem by adjust the sequence number of
each reordered reqs when inserting a new entry.
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The sequence number of reqs on the timeout_list before the timeout req
should be adjusted in io_timeout_fn(), because the current timeout req
will consumes a slot in the cq_ring and cq_tail pointer will be
increased, otherwise other timeout reqs may return in advance without
waiting for enough wait_nr.
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There are cases where it isn't always safe to block for submission,
even if the caller asked to wait for events as well. Revert the
previous optimization of doing that.
This reverts two commits:
bf7ec93c644cb
c576666863b78
Fixes: c576666863b78 ("io_uring: optimize submit_and_wait API")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The vectors span more than one byte, so mark them as arrays.
Fixes the following build error when building when using GCC 8.3:
In file included from ./include/linux/string.h:19,
from ./include/linux/bitmap.h:9,
from ./include/linux/cpumask.h:12,
from ./arch/mips/include/asm/processor.h:15,
from ./arch/mips/include/asm/thread_info.h:16,
from ./include/linux/thread_info.h:38,
from ./include/asm-generic/preempt.h:5,
from ./arch/mips/include/generated/asm/preempt.h:1,
from ./include/linux/preempt.h:81,
from ./include/linux/spinlock.h:51,
from ./include/linux/mmzone.h:8,
from ./include/linux/bootmem.h:8,
from arch/mips/bcm63xx/prom.c:10:
arch/mips/bcm63xx/prom.c: In function 'prom_init':
./arch/mips/include/asm/string.h:162:11: error: '__builtin_memcpy' forming offset [2, 32] is out of the bounds [0, 1] of object 'bmips_smp_movevec' with type 'char' [-Werror=array-bounds]
__ret = __builtin_memcpy((dst), (src), __len); \
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/mips/bcm63xx/prom.c:97:3: note: in expansion of macro 'memcpy'
memcpy((void *)0xa0000200, &bmips_smp_movevec, 0x20);
^~~~~~
In file included from arch/mips/bcm63xx/prom.c:14:
./arch/mips/include/asm/bmips.h:80:13: note: 'bmips_smp_movevec' declared here
extern char bmips_smp_movevec;
Fixes: 18a1eef92dcd ("MIPS: BMIPS: Introduce bmips.h")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paul Burton <paulburton@kernel.org>
Cc: linux-mips@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
|
|
Change my email address, and add more Spreadtrum SC27xx series PMIC
drivers to maintain.
Link: https://lore.kernel.org/r/a48483d13243450ecf3b777d49e741b6367f2c6b.1571881956.git.baolin.wang@linaro.org
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
|
|
Remove various not required ifdefs and externs.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
|
|
Using CONFIG_SPARSEMEM_VMEMMAP instead of CONFIG_SPARSEMEM to fix
following build issue.
riscv64-linux-ld: arch/riscv/mm/init.o: in function 'vmemmap_populate':
init.c:(.meminit.text+0x8): undefined reference to 'vmemmap_populate_basepages'
Cc: Logan Gunthorpe <logang@deltatee.com>
Fixes: d95f1a542c3d ("RISC-V: Implement sparsemem")
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
|
|
With CONFIG_SPARSEMEM and !CONFIG_SPARSEMEM_VMEMMAP,
arch/riscv/include/asm/pgtable.h: In function ‘mk_pte’:
include/asm-generic/memory_model.h:64:14: error: implicit declaration of function ‘page_to_section’; did you mean ‘present_section’? [-Werror=implicit-function-declaration]
int __sec = page_to_section(__pg); \
^~~~~~~~~~~~~~~
Fixed by changing mk_pte() from inline function to macro.
Cc: Logan Gunthorpe <logang@deltatee.com>
Fixes: d95f1a542c3d ("RISC-V: Implement sparsemem")
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
[paul.walmsley@sifive.com: fixed checkpatch errors]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
|