Age | Commit message (Collapse) | Author | Files | Lines |
|
We recently changed this from a single list to an rbtree, but for some
real life workloads, the rbtree slows down the submission/insertion
case enough so that it's the top cycle consumer on the io_uring side.
In testing, using a hash table is a more well rounded compromise. It
is fast for insertion, and as long as it's sized appropriately, it
works well for the cancellation case as well. Running TAO with a lot
of network sockets, this removes io_poll_req_insert() from spending
2% of the CPU cycles.
Reported-by: Dan Melnic <dmm@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If someone removes a node from a list, and then later adds it back to
a list, we can have invalid data in ->next. This can cause all sorts
of issues. One such use case is the IORING_OP_POLL_ADD command, which
will do just that if we race and get woken twice without any pending
events. This is a pretty rare case, but can happen under extreme loads.
Dan reports that he saw the following crash:
BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD d283ce067 P4D d283ce067 PUD e5ca04067 PMD 0
Oops: 0002 [#1] SMP
CPU: 17 PID: 10726 Comm: tao:fast-fiber Kdump: loaded Not tainted 5.2.9-02851-gac7bc042d2d1 #116
Hardware name: Quanta Twin Lakes MP/Twin Lakes Passive MP, BIOS F09_3A17 05/03/2019
RIP: 0010:io_wqe_enqueue+0x3e/0xd0
Code: 34 24 74 55 8b 47 58 48 8d 6f 50 85 c0 74 50 48 89 df e8 35 7c 75 00 48 83 7b 08 00 48 8b 14 24 0f 84 84 00 00 00 48 8b 4b 10 <48> 89 11 48 89 53 10 83 63 20 fe 48 89 c6 48 89 df e8 0c 7a 75 00
RSP: 0000:ffffc90006858a08 EFLAGS: 00010082
RAX: 0000000000000002 RBX: ffff889037492fc0 RCX: 0000000000000000
RDX: ffff888e40cc11a8 RSI: ffff888e40cc11a8 RDI: ffff889037492fc0
RBP: ffff889037493010 R08: 00000000000000c3 R09: ffffc90006858ab8
R10: 0000000000000000 R11: 0000000000000000 R12: ffff888e40cc11a8
R13: 0000000000000000 R14: 00000000000000c3 R15: ffff888e40cc1100
FS: 00007fcddc9db700(0000) GS:ffff88903fa40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000e479f5003 CR4: 00000000007606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<IRQ>
io_poll_wake+0x12f/0x2a0
__wake_up_common+0x86/0x120
__wake_up_common_lock+0x7a/0xc0
sock_def_readable+0x3c/0x70
tcp_rcv_established+0x557/0x630
tcp_v6_do_rcv+0x118/0x3c0
tcp_v6_rcv+0x97e/0x9d0
ip6_protocol_deliver_rcu+0xe3/0x440
ip6_input+0x3d/0xc0
? ip6_protocol_deliver_rcu+0x440/0x440
ipv6_rcv+0x56/0xd0
? ip6_rcv_finish_core.isra.18+0x80/0x80
__netif_receive_skb_one_core+0x50/0x70
netif_receive_skb_internal+0x2f/0xa0
napi_gro_receive+0x125/0x150
mlx5e_handle_rx_cqe+0x1d9/0x5a0
? mlx5e_poll_tx_cq+0x305/0x560
mlx5e_poll_rx_cq+0x49f/0x9c5
mlx5e_napi_poll+0xee/0x640
? smp_reschedule_interrupt+0x16/0xd0
? reschedule_interrupt+0xf/0x20
net_rx_action+0x286/0x3d0
__do_softirq+0xca/0x297
irq_exit+0x96/0xa0
do_IRQ+0x54/0xe0
common_interrupt+0xf/0xf
</IRQ>
RIP: 0033:0x7fdc627a2e3a
Code: 31 c0 85 d2 0f 88 f6 00 00 00 55 48 89 e5 41 57 41 56 4c 63 f2 41 55 41 54 53 48 83 ec 18 48 85 ff 0f 84 c7 00 00 00 48 8b 07 <41> 89 d4 49 89 f5 48 89 fb 48 85 c0 0f 84 64 01 00 00 48 83 78 10
when running a networked workload with about 5000 sockets being polled
for. Fix this by clearing node->next when the node is being removed from
the list.
Fixes: 6206f0e180d4 ("io-wq: shrink io_wq_work a bit")
Reported-by: Dan Melnic <dmm@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If we defer a timeout, we should ensure that we copy the timespec
when we have consumed the sqe. This is similar to commit f67676d160c6
for read/write requests. We already did this correctly for timeouts
deferred as links, but do it generally and use the infrastructure added
by commit 1a6b74fc8702 instead of having the timeout deferral use its
own.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There's really no reason why we forbid things like link/drain etc on
regular timeout commands. Enable the usual SQE flags on timeouts.
Reported-by: 李通洲 <carter.li@eoitek.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If BLK_DEV_ZONED isn't set, 'ret' isn't used. This makes gcc complain,
rightfully. Move ret where it is used.
Fixes: 979d54475e0b ("null_blk: cleanup null_gendisk_register")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Queue dma alignment limit requires users(fs, target, ...) of block layer
to pass aligned buffer.
So far brd doesn't support un-aligned buffer, even though it is easy
to support it.
However, given brd is often used for debug purpose, and there are other
drivers which can't support un-aligned buffer too.
So add warning so that brd users know what to fix.
Reported-by: Stephen Rust <srust@blockbridge.com>
Cc: Stephen Rust <srust@blockbridge.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Now we depend on blk_queue_split() to respect most of queue limit
(the only one exception could be dma alignment), however
blk_queue_split() isn't used for brd, so this limit isn't respected
since v4.3.
Also max_hw_sectors limit doesn't play a big role for brd, which is
added since brd is added to tree for unknown reason.
So remove it.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For each I/O request, blkback first maps the foreign pages for the
request to its local pages. If an allocation of a local page for the
mapping fails, it should unmap every mapping already made for the
request.
However, blkback's handling mechanism for the allocation failure does
not mark the remaining foreign pages as unmapped. Therefore, the unmap
function merely tries to unmap every valid grant page for the request,
including the pages not mapped due to the allocation failure. On a
system that fails the allocation frequently, this problem leads to
following kernel crash.
[ 372.012538] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
[ 372.012546] IP: [<ffffffff814071ac>] gnttab_unmap_refs.part.7+0x1c/0x40
[ 372.012557] PGD 16f3e9067 PUD 16426e067 PMD 0
[ 372.012562] Oops: 0002 [#1] SMP
[ 372.012566] Modules linked in: act_police sch_ingress cls_u32
...
[ 372.012746] Call Trace:
[ 372.012752] [<ffffffff81407204>] gnttab_unmap_refs+0x34/0x40
[ 372.012759] [<ffffffffa0335ae3>] xen_blkbk_unmap+0x83/0x150 [xen_blkback]
...
[ 372.012802] [<ffffffffa0336c50>] dispatch_rw_block_io+0x970/0x980 [xen_blkback]
...
Decompressing Linux... Parsing ELF... done.
Booting the kernel.
[ 0.000000] Initializing cgroup subsys cpuset
This commit fixes this problem by marking the grant pages of the given
request that didn't mapped due to the allocation failure as invalid.
Fixes: c6cc142dac52 ("xen-blkback: use balloon pages for all mappings")
Reviewed-by: David Woodhouse <dwmw@amazon.de>
Reviewed-by: Maximilian Heyne <mheyne@amazon.de>
Reviewed-by: Paul Durrant <pdurrant@amazon.co.uk>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Right now we return it to userspace, which means the application has
to poll for the socket to be writeable. Let's just treat it like
-EAGAIN and have io_uring handle it internally, this makes it much
easier to use.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The current zone revalidation code has a major problem in that it
doesn't update the zone size and q->nr_zones atomically, leading
to a short window where an out of bounds access to the zone arrays
is possible.
To fix this move the setting of the zone size into the crticial
sections blk_revalidate_disk_zones so that it gets updated together
with the zone bitmaps and q->nr_zones. This also slightly simplifies
the caller as it deducts the zone size from the report_zones.
This change also allows to check for a power of two zone size in generic
code.
Reported-by: Hans Holmberg <hans@owltronix.com>
Reviewed-by: Javier González <javier@javigon.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
bio based drivers only need to update q->nr_zones. Do that manually
instead of overloading blk_revalidate_disk_zones to keep that function
simpler for the next round of changes that will rely even more on the
request based functionality.
Reviewed-by: Javier González <javier@javigon.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Allocate the conventional zone bitmap and the sequential zone locking
bitmap only when we find a zone of the respective type. This avoids
wasting memory on the conventional zone bitmap for devices that only
have sequential zones, and will also prepare for other future changes.
Reviewed-by: Javier González <javier@javigon.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Invert the meaning of seq_zones_bitmap by keeping a bitmap of
conventional zones. This allows not having a bitmap for devices
that do not have conventional zones.
Reviewed-by: Javier González <javier@javigon.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Simplify the arguments to blkdev_nr_zones by passing a gendisk instead
of the block_device and capacity. This also removes the need for
__blkdev_nr_zones as all callers are outside the fast path and can
deal with the additional branch.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Use a saner size calculation, and do a trivial cleanup on the zone
revalidation to prepare to future changes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For zoned=1 mode, the zone size must be a power of 2. Check this not
only when the zone size is specified during modprobe, but also when
creating a zoned null_blk device using configfs.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
bvec_iter_advance() is quite popular, but compilers fail to do proper
alias analysis and optimise it good enough. The assembly is checked
for gcc 9.2, x86-64.
- remove @iter->bi_size from min(...), as it's always less than @bytes.
Modify at the beginning and forget about it.
- the compiler isn't able to collapse memory dependencies and remove
writes in the loop. Help it by explicitely using local vars.
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Since commit b18fdf71e01f ("io_uring: simplify io_req_link_next()"),
the io_wq_current_is_worker function is no longer needed, clean it
up.
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Parameter ctx we have never used, clean it up.
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If this flag is set, applications can be certain that any data for
async offload has been consumed when the kernel has consumed the
SQE.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Just like commit f67676d160c6 for read/write requests, this one ensures
that the sockaddr data has been copied for IORING_OP_CONNECT if we need
to punt the request to async context.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Just like commit f67676d160c6 for read/write requests, this one ensures
that the msghdr data is fully copied if we need to punt a recvmsg or
sendmsg system call to async context.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently we don't copy the iovecs when we punt to async context. This
can be problematic for applications that store the iovec on the stack,
as they often assume that it's safe to let the iovec go out of scope
as soon as IO submission has been called. This isn't always safe, as we
will re-copy the iovec once we're in async context.
Make this 100% safe by copying the iovec just once. With this change,
applications may safely store the iovec on the stack for all cases.
Reported-by: 李通洲 <carter.li@eoitek.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Right now we just copy the sqe for async offload, but we want to store
more context across an async punt. In preparation for doing so, put the
sqe copy inside a structure that we can expand. With this pointer added,
we can get rid of REQ_F_FREE_SQE, as that is now indicated by whether
req->io is NULL or not.
No functional changes in this patch.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Commit 6917d0689993 ("block: merge invalidate_partitions into
rescan_partitions") caused a regression where systemd-udevd spins
forever using max CPU starting at boot time.
It's caused by a behavior change where a KOBJ_CHANGE uevent is now sent
in a case where previously it wasn't.
Restore the old behavior.
Fixes: 6917d0689993 ("block: merge invalidate_partitions into rescan_partitions")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We should never return -ERESTARTSYS to userspace, transform it into
-EINTR.
Cc: stable@vger.kernel.org # v5.3+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
syzbot reports:
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 9217 Comm: io_uring-sq Not tainted 5.4.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
RIP: 0010:creds_are_invalid kernel/cred.c:792 [inline]
RIP: 0010:__validate_creds include/linux/cred.h:187 [inline]
RIP: 0010:override_creds+0x9f/0x170 kernel/cred.c:550
Code: ac 25 00 81 fb 64 65 73 43 0f 85 a3 37 00 00 e8 17 ab 25 00 49 8d 7c
24 10 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84
c0 74 08 3c 03 0f 8e 96 00 00 00 41 8b 5c 24 10 bf
RSP: 0018:ffff88809c45fda0 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 0000000043736564 RCX: ffffffff814f3318
RDX: 0000000000000002 RSI: ffffffff814f3329 RDI: 0000000000000010
RBP: ffff88809c45fdb8 R08: ffff8880a3aac240 R09: ffffed1014755849
R10: ffffed1014755848 R11: ffff8880a3aac247 R12: 0000000000000000
R13: ffff888098ab1600 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffd51c40664 CR3: 0000000092641000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
io_sq_thread+0x1c7/0xa20 fs/io_uring.c:3274
kthread+0x361/0x430 kernel/kthread.c:255
ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Modules linked in:
---[ end trace f2e1a4307fbe2245 ]---
RIP: 0010:creds_are_invalid kernel/cred.c:792 [inline]
RIP: 0010:__validate_creds include/linux/cred.h:187 [inline]
RIP: 0010:override_creds+0x9f/0x170 kernel/cred.c:550
Code: ac 25 00 81 fb 64 65 73 43 0f 85 a3 37 00 00 e8 17 ab 25 00 49 8d 7c
24 10 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84
c0 74 08 3c 03 0f 8e 96 00 00 00 41 8b 5c 24 10 bf
RSP: 0018:ffff88809c45fda0 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 0000000043736564 RCX: ffffffff814f3318
RDX: 0000000000000002 RSI: ffffffff814f3329 RDI: 0000000000000010
RBP: ffff88809c45fdb8 R08: ffff8880a3aac240 R09: ffffed1014755849
R10: ffffed1014755848 R11: ffff8880a3aac247 R12: 0000000000000000
R13: ffff888098ab1600 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffd51c40664 CR3: 0000000092641000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
which is caused by slab fault injection triggering a failure in
prepare_creds(). We don't actually need to create a copy of the creds
as we're not modifying it, we just need a reference on the current task
creds. This avoids the failure case as well, and propagates the const
throughout the stack.
Fixes: 181e448d8709 ("io_uring: async workers should inherit the user creds")
Reported-by: syzbot+5320383e16029ba057ff@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There is a version 1.0 MU on i.MX7ULP platform.
One new version ID register is added, and it's offset is 0.
TRn registers are defined at the offset 0x20 ~ 0x2C.
RRn registers are defined at the offset 0x40 ~ 0x4C.
SR/CR registers are defined at 0x60/0x64.
Extend this driver to support it.
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Suggested-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
There is a version 1.0 MU on imx7ulp, use "fsl,imx7ulp-mu" compatible
to support it.
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
Make sure to only clear enabled interrupts keeping count
of the connection type.
Suggested-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Daniel Baluta <daniel.baluta@nxp.com>
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
Tx doorbell is handled by txdb_tasklet and doesn't
have an associated IRQ.
Anyhow, imx_mu_shutdown ignores this and tries to
free an IRQ that wasn't requested for Tx DB resulting
in the following warning:
[ 1.967644] Trying to free already-free IRQ 26
[ 1.972108] WARNING: CPU: 2 PID: 157 at kernel/irq/manage.c:1708 __free_irq+0xc0/0x358
[ 1.980024] Modules linked in:
[ 1.983088] CPU: 2 PID: 157 Comm: kworker/2:1 Tainted: G
[ 1.993524] Hardware name: Freescale i.MX8QXP MEK (DT)
[ 1.998668] Workqueue: events deferred_probe_work_func
[ 2.003812] pstate: 60000085 (nZCv daIf -PAN -UAO)
[ 2.008607] pc : __free_irq+0xc0/0x358
[ 2.012364] lr : __free_irq+0xc0/0x358
[ 2.016111] sp : ffff00001179b7e0
[ 2.019422] x29: ffff00001179b7e0 x28: 0000000000000018
[ 2.024736] x27: ffff000011233000 x26: 0000000000000004
[ 2.030053] x25: 000000000000001a x24: ffff80083bec74d4
[ 2.035369] x23: 0000000000000000 x22: ffff80083bec7588
[ 2.040686] x21: ffff80083b1fe8d8 x20: ffff80083bec7400
[ 2.046003] x19: 0000000000000000 x18: ffffffffffffffff
[ 2.051320] x17: 0000000000000000 x16: 0000000000000000
[ 2.056637] x15: ffff0000111296c8 x14: ffff00009179b517
[ 2.061953] x13: ffff00001179b525 x12: ffff000011142000
[ 2.067270] x11: ffff000011129f20 x10: ffff0000105da970
[ 2.072587] x9 : 00000000ffffffd0 x8 : 0000000000000194
[ 2.077903] x7 : 612065657266206f x6 : ffff0000111e7b09
[ 2.083220] x5 : 0000000000000003 x4 : 0000000000000000
[ 2.088537] x3 : 0000000000000000 x2 : 00000000ffffffff
[ 2.093854] x1 : 28b70f0a2b60a500 x0 : 0000000000000000
[ 2.099173] Call trace:
[ 2.101618] __free_irq+0xc0/0x358
[ 2.105021] free_irq+0x38/0x98
[ 2.108170] imx_mu_shutdown+0x90/0xb0
[ 2.111921] mbox_free_channel.part.2+0x24/0xb8
[ 2.116453] mbox_free_channel+0x18/0x28
This bug is present from the beginning of times.
Cc: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Daniel Baluta <daniel.baluta@nxp.com>
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
The wakeup specific IRQ management is no more needed to wake up the
stm32 platform. A relationship has been established between the EXTI and
the RX IRQ, just need to declare the EXTI interrupt instead of the
IPCC RX IRQ.
Signed-off-by: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: Fabien Dessenne <fabien.dessenne@st.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
|
|
The DeviceTree binding document was converted to YAML in a patch that
is being merged through the devicetree tree, as such this patch needs to
be rewritten and is currently cause for a merge conflict.
This reverts commit 14ea1d04ed0f7bae60951bdb8eeaa55cdbb26c73.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
|
|
Christophe reports that current master fails building on powerpc with
this error:
CC fs/io_uring.o
fs/io_uring.c: In function ‘loop_rw_iter’:
fs/io_uring.c:1628:21: error: implicit declaration of function ‘kmap’
[-Werror=implicit-function-declaration]
iovec.iov_base = kmap(iter->bvec->bv_page)
^
fs/io_uring.c:1628:19: warning: assignment makes pointer from integer
without a cast [-Wint-conversion]
iovec.iov_base = kmap(iter->bvec->bv_page)
^
fs/io_uring.c:1643:4: error: implicit declaration of function ‘kunmap’
[-Werror=implicit-function-declaration]
kunmap(iter->bvec->bv_page);
^
which is caused by a missing highmem.h include. Fix it by including
it.
Fixes: 311ae9e159d8 ("io_uring: fix dead-hung for non-iter fixed rw")
Reported-by: Christophe Leroy <christophe.leroy@c-s.fr>
Tested-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Commit f2bb86937d86 ("powerpc/fixmap: don't clear fixmap area in
paging_init()") removed the clearing of fixmap area in order to
avoid clearing fixmapped areas set earlier.
However unlike all other users of fixmap which use __set_fixmap(),
HIGHMEM functions directly use __set_pte_at(). This means
the page table must pre-exist, otherwise the following crash
can be encoutered due to the lack of entry in the PGD.
Oops: Kernel access of bad area, sig: 11 [#1]
BE PAGE_SIZE=4K MMU=Hash PowerMac
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted 5.4.0+ #2528
NIP: c0144ce8 LR: c0144ccc CTR: 00000080
REGS: ef0b5aa0 TRAP: 0300 Not tainted (5.4.0+)
MSR: 00009032 <EE,ME,IR,DR,RI> CR: 44282842 XER: 00000000
DAR: fffdf000 DSISR: 42000000
GPR00: c0144ccc ef0b5b58 ef0b0000 fffdf000 fffdf000 00000000 c0000f7c 00000000
GPR08: c0833000 fffdf000 00000000 ef1c53c9 24042842 00000000 00000000 00000000
GPR16: 00000000 00000000 ef7e7358 effe8160 00000000 c08a9660 c0851644 00000004
GPR24: c08c70a8 00002dc2 00000000 00000001 00000201 effe8160 effe8160 00000000
NIP [c0144ce8] prep_new_page+0x138/0x178
LR [c0144ccc] prep_new_page+0x11c/0x178
Call Trace:
[ef0b5b58] [c0144ccc] prep_new_page+0x11c/0x178 (unreliable)
[ef0b5b88] [c0147218] get_page_from_freelist+0x1fc/0xd88
[ef0b5c38] [c0148328] __alloc_pages_nodemask+0xd4/0xbb4
[ef0b5cf8] [c0142ba8] __vmalloc_node_range+0x1b4/0x2e0
[ef0b5d38] [c0142dd0] vzalloc+0x48/0x58
[ef0b5d58] [c0301c8c] check_partition+0x58/0x244
[ef0b5d78] [c02ffe80] blk_add_partitions+0x44/0x2cc
[ef0b5db8] [c01a32d8] bdev_disk_changed+0x68/0xfc
[ef0b5de8] [c01a4494] __blkdev_get+0x290/0x460
[ef0b5e28] [c02fdd40] __device_add_disk+0x480/0x4d8
[ef0b5e68] [c0810688] brd_init+0xc0/0x188
[ef0b5e88] [c0005194] do_one_initcall+0x40/0x19c
[ef0b5ee8] [c07dd4dc] kernel_init_freeable+0x164/0x230
[ef0b5f28] [c0005408] kernel_init+0x18/0x10c
[ef0b5f38] [c0014274] ret_from_kernel_thread+0x14/0x1c
Partially revert that commit to still clear the fixmap area dedicated
to HIGHMEM.
Fixes: f2bb86937d86 ("powerpc/fixmap: don't clear fixmap area in paging_init()")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d42fa9747df5afa41e67b08e374c98d3b40529c9.1574927918.git.christophe.leroy@c-s.fr
|
|
commit ad723674d675 ("x86/efi: move common keyring handler functions
to new file") leave this unused.
Fixes: ad723674d675 ("x86/efi: move common keyring handler functions to new file")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20191115130830.13320-1-yuehaibing@huawei.com
|
|
While writing to MSR IA32_THERM_STATUS/IA32_PKG_THERM_STATUS, avoid
writing 1 to read only and reserved fields because updating some fields
generates exception.
[ bp: Vertically align for better readability. ]
Fixes: f6656208f04e ("x86/mce/therm_throt: Optimize notifications of thermal throttle")
Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20191128150824.22413-1-srinivas.pandruvada@linux.intel.com
|
|
Fixes coccicheck warning:
drivers/block/ataflop.c:860:53-54: Unneeded semicolon
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Fixes coccicheck warning:
drivers/block/sunvdc.c:637:2-3: Unneeded semicolon
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Fixes coccicheck warning:
drivers/block/drbd/drbd_req.c:887:2-3: Unneeded semicolon
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
That is a bit weird scenario but I find it interesting to run fio loads
using LKL linux, where MMU is disabled. Probably other real archs which
run uClinux can also benefit from this patch.
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Move from the deprecated i2c_new_probed_device() to the new
i2c_new_scanned_device(). Make use of the new ERRPTR if suitable.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Move from the deprecated i2c_new_probed_device() to the new
i2c_new_scanned_device(). Make use of the new ERRPTR if suitable.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Max Staudt <max@enpas.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
In the general move to have i2c_new_*_device functions which return
ERR_PTR instead of NULL, this patch converts i2c_new_probed_device().
There are only few users, so this patch converts the I2C core and all
users in one go. The function gets renamed to i2c_new_scanned_device()
so out-of-tree users will get a build failure to understand they need to
adapt their error checking code.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Max Staudt <max@enpas.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
|
|
Fix below build warning
WARNING: modpost: missing MODULE_LICENSE() in
drivers/pinctrl/pinctrl-equilibrium.o
Introduced by commit
1948d5c51dba ("pinctrl: Add pinmux & GPIO controller driver for a new SoC")
by adding missing MODULE_LICENSE.
Signed-off-by: Rahul Tanwar <rahul.tanwar@linux.intel.com>
Link: https://lore.kernel.org/r/20191128080832.13529-2-rahul.tanwar@linux.intel.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
This patch corrects the SPDX License Identifier style in
header files related to drivers for USB Network devices.
This patch gives an explicit block comment to the
SPDX License Identifier.
Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch corrects the SPDX License Identifier style in
header files related to PHY Layer for Ethernet drivers.
For C header files Documentation/process/license-rules.rst
mandates C-like comments (opposed to C source files where
C++ style should be used). This patch also gives an explicit
block comment to the SPDX License Identifier.
Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 6570bc79c0df ("net: core: use listified Rx for GRO_NORMAL in
napi_gro_receive()") has applied batched GRO_NORMAL packets processing
to all napi_gro_receive() users, including mac80211-based drivers.
However, this change has led to a regression in iwlwifi driver [1][2] as
it is required for NAPI users to call napi_complete_done() or
napi_complete() and the end of every polling iteration, whilst iwlwifi
doesn't use NAPI scheduling at all and just calls napi_gro_flush().
In that particular case, packets which have not been already flushed
from napi->rx_list stall in it until at least next Rx cycle.
Fix this by adding a manual flushing of the list to iwlwifi driver right
before napi_gro_flush() call to mimic napi_complete() logics.
I prefer to open-code gro_normal_list() rather than exporting it for 2
reasons:
* to prevent from using it and napi_gro_flush() in any new drivers,
as it is the *really* bad way to use NAPI that should be avoided;
* to keep gro_normal_list() static and don't lose any CC optimizations.
I also don't add the "Fixes:" tag as the mentioned commit was only a
trigger that only exposed an improper usage of NAPI in this particular
driver.
[1] https://lore.kernel.org/netdev/PSXP216MB04388962C411CD0B17A86F47804A0@PSXP216MB0438.KORP216.PROD.OUTLOOK.COM
[2] https://bugzilla.kernel.org/show_bug.cgi?id=205647
Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
Acked-by: Luca Coelho <luciano.coelho@intel.com>
Reported-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Tested-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Reviewed-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Convert to use skb queue instead of the list of skbs.
The skb queue could provide protection with lock.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|