Age | Commit message (Collapse) | Author | Files | Lines |
|
unmap_grant_pages() currently waits for the pages to no longer be used.
In https://github.com/QubesOS/qubes-issues/issues/7481, this lead to a
deadlock against i915: i915 was waiting for gntdev's MMU notifier to
finish, while gntdev was waiting for i915 to free its pages. I also
believe this is responsible for various deadlocks I have experienced in
the past.
Avoid these problems by making unmap_grant_pages async. This requires
making it return void, as any errors will not be available when the
function returns. Fortunately, the only use of the return value is a
WARN_ON(), which can be replaced by a WARN_ON when the error is
detected. Additionally, a failed call will not prevent further calls
from being made, but this is harmless.
Because unmap_grant_pages is now async, the grant handle will be sent to
INVALID_GRANT_HANDLE too late to prevent multiple unmaps of the same
handle. Instead, a separate bool array is allocated for this purpose.
This wastes memory, but stuffing this information in padding bytes is
too fragile. Furthermore, it is necessary to grab a reference to the
map before making the asynchronous call, and release the reference when
the call returns.
It is also necessary to guard against reentrancy in gntdev_map_put(),
and to handle the case where userspace tries to map a mapping whose
contents have not all been freed yet.
Fixes: 745282256c75 ("xen/gntdev: safely unmap grants in case they are still in use")
Cc: stable@vger.kernel.org
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220622022726.2538-1-demi@invisiblethingslab.com
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
With Xen PV Display driver in use the "expected" VM_DONTEXPAND flag
is not set (neither explicitly nor implicitly), so the driver hits
the code path in drm_gem_mmap_obj() which triggers the WARNING.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Link: https://lore.kernel.org/r/1652104303-5098-1-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
1 << 31 is undefined. So switch to 1U << 31.
Fixes: 5ead97c84fa7 ("xen: Core Xen implementation")
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220617103037.57828-1-julien@xen.org
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
When a VBD is not fully created and then closed, the kernel can have a
NULL pointer dereference:
The reproducer is trivial:
[user@dom0 ~]$ sudo xl block-attach work backend=sys-usb vdev=xvdi target=/dev/sdz
[user@dom0 ~]$ xl block-list work
Vdev BE handle state evt-ch ring-ref BE-path
51712 0 241 4 -1 -1 /local/domain/0/backend/vbd/241/51712
51728 0 241 4 -1 -1 /local/domain/0/backend/vbd/241/51728
51744 0 241 4 -1 -1 /local/domain/0/backend/vbd/241/51744
51760 0 241 4 -1 -1 /local/domain/0/backend/vbd/241/51760
51840 3 241 3 -1 -1 /local/domain/3/backend/vbd/241/51840
^ note state, the /dev/sdz doesn't exist in the backend
[user@dom0 ~]$ sudo xl block-detach work xvdi
[user@dom0 ~]$ xl block-list work
Vdev BE handle state evt-ch ring-ref BE-path
work is an invalid domain identifier
And its console has:
BUG: kernel NULL pointer dereference, address: 0000000000000050
PGD 80000000edebb067 P4D 80000000edebb067 PUD edec2067 PMD 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 1 PID: 52 Comm: xenwatch Not tainted 5.16.18-2.43.fc32.qubes.x86_64 #1
RIP: 0010:blk_mq_stop_hw_queues+0x5/0x40
Code: 00 48 83 e0 fd 83 c3 01 48 89 85 a8 00 00 00 41 39 5c 24 50 77 c0 5b 5d 41 5c 41 5d c3 c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 <8b> 47 50 85 c0 74 32 41 54 49 89 fc 55 53 31 db 49 8b 44 24 48 48
RSP: 0018:ffffc90000bcfe98 EFLAGS: 00010293
RAX: ffffffffc0008370 RBX: 0000000000000005 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000000
RBP: ffff88800775f000 R08: 0000000000000001 R09: ffff888006e620b8
R10: ffff888006e620b0 R11: f000000000000000 R12: ffff8880bff39000
R13: ffff8880bff39000 R14: 0000000000000000 R15: ffff88800604be00
FS: 0000000000000000(0000) GS:ffff8880f3300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000050 CR3: 00000000e932e002 CR4: 00000000003706e0
Call Trace:
<TASK>
blkback_changed+0x95/0x137 [xen_blkfront]
? read_reply+0x160/0x160
xenwatch_thread+0xc0/0x1a0
? do_wait_intr_irq+0xa0/0xa0
kthread+0x16b/0x190
? set_kthread_struct+0x40/0x40
ret_from_fork+0x22/0x30
</TASK>
Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore ipt_REJECT nf_reject_ipv4 xt_state xt_conntrack nft_counter nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables nfnetlink intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel xen_netfront pcspkr xen_scsiback target_core_mod xen_netback xen_privcmd xen_gntdev xen_gntalloc xen_blkback xen_evtchn ipmi_devintf ipmi_msghandler fuse bpf_preload ip_tables overlay xen_blkfront
CR2: 0000000000000050
---[ end trace 7bc9597fd06ae89d ]---
RIP: 0010:blk_mq_stop_hw_queues+0x5/0x40
Code: 00 48 83 e0 fd 83 c3 01 48 89 85 a8 00 00 00 41 39 5c 24 50 77 c0 5b 5d 41 5c 41 5d c3 c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 <8b> 47 50 85 c0 74 32 41 54 49 89 fc 55 53 31 db 49 8b 44 24 48 48
RSP: 0018:ffffc90000bcfe98 EFLAGS: 00010293
RAX: ffffffffc0008370 RBX: 0000000000000005 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000000
RBP: ffff88800775f000 R08: 0000000000000001 R09: ffff888006e620b8
R10: ffff888006e620b0 R11: f000000000000000 R12: ffff8880bff39000
R13: ffff8880bff39000 R14: 0000000000000000 R15: ffff88800604be00
FS: 0000000000000000(0000) GS:ffff8880f3300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000050 CR3: 00000000e932e002 CR4: 00000000003706e0
Kernel panic - not syncing: Fatal exception
Kernel Offset: disabled
info->rq and info->gd are only set in blkfront_connect(), which is
called for state 4 (XenbusStateConnected). Guard against using NULL
variables in blkfront_closing() to avoid the issue.
The rest of blkfront_closing looks okay. If info->nr_rings is 0, then
for_each_rinfo won't do anything.
blkfront_remove also needs to check for non-NULL pointers before
cleaning up the gendisk and request queue.
Fixes: 05d69d950d9d "xen-blkfront: sanitize the removal state machine"
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220601195341.28581-1-jandryuk@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
EXPORT_SYMBOL and __init is a bad combination because the .init.text
section is freed up after the initialization. Hence, modules cannot
use symbols annotated __init. The access to a freed symbol may end up
with kernel panic.
modpost used to detect it, but it has been broken for a decade.
Recently, I fixed modpost so it started to warn it again, then this
showed up in linux-next builds.
There are two ways to fix it:
- Remove __init
- Remove EXPORT_SYMBOL
I chose the latter for this case because none of the in-tree call-sites
(arch/arm/xen/enlighten.c, arch/x86/xen/grant-table.c) is compiled as
modular.
Fixes: 243848fc018c ("xen/grant-table: Move xlated_setup_gnttab_pages to common place")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/20220606045920.4161881-1-masahiroy@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
By assigning xen-grant DMA ops we will restrict memory access for
passed device using Xen grant mappings. This is needed for using any
virtualized device (e.g. virtio) in Xen guests in a safe manner.
Please note, for the virtio devices the XEN_VIRTIO config should
be enabled (it forces ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS).
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/1654197833-25362-9-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
Use the presence of "iommus" property pointed to the IOMMU node with
recently introduced "xen,grant-dma" compatible as a clear indicator
of enabling Xen grant mappings scheme for that device and read the ID
of Xen domain where the corresponding backend is running. The domid
(domain ID) is used as an argument to the Xen grant mapping APIs.
To avoid the deferred probe timeout which takes place after reusing
generic IOMMU device tree bindings (because the IOMMU device never
becomes available) enable recently introduced stub IOMMU driver by
selecting XEN_GRANT_DMA_IOMMU.
Also introduce xen_is_grant_dma_device() to check whether xen-grant
DMA ops need to be set for a passed device.
Remove the hardcoded domid 0 in xen_grant_setup_dma_ops().
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/1654197833-25362-8-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
|
|
In order to reuse generic IOMMU device tree bindings by Xen grant
DMA-mapping layer we need to add this stub driver from a fw_devlink
perspective (grant-dma-ops cannot be converted into the proper
IOMMU driver).
Otherwise, just reusing IOMMU bindings (without having a corresponding
driver) leads to the deferred probe timeout afterwards, because
the IOMMU device never becomes available.
This stub driver does nothing except registering empty iommu_ops,
the upper layer "of_iommu" will treat this as NO_IOMMU condition
and won't return -EPROBE_DEFER.
As this driver is quite different from the most hardware IOMMU
implementations and only needed in Xen guests, place it in drivers/xen
directory. The subsequent commit will make use of it.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/1654197833-25362-7-git-send-email-olekstysh@gmail.com
Signed-off-by: Juergen Gross <jgross@suse.com>
|