Age | Commit message (Collapse) | Author | Files | Lines |
|
This patch adds an option to instantiate guest virtio-mmio devices
basing on a kernel command line (or module) parameter, for example:
virtio_mmio.devices=0x100@0x100b0000:48
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Benchmark shows small performance improvement on fusion io device.
Before:
seq-read : io=1,024MB, bw=19,982KB/s, iops=39,964, runt= 52475msec
seq-write: io=1,024MB, bw=20,321KB/s, iops=40,641, runt= 51601msec
rnd-read : io=1,024MB, bw=15,404KB/s, iops=30,808, runt= 68070msec
rnd-write: io=1,024MB, bw=14,776KB/s, iops=29,552, runt= 70963msec
After:
seq-read : io=1,024MB, bw=20,343KB/s, iops=40,685, runt= 51546msec
seq-write: io=1,024MB, bw=20,803KB/s, iops=41,606, runt= 50404msec
rnd-read : io=1,024MB, bw=16,221KB/s, iops=32,442, runt= 64642msec
rnd-write: io=1,024MB, bw=15,199KB/s, iops=30,397, runt= 68991msec
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
If we reset the virtio-blk device before the requests already dispatched
to the virtio-blk driver from the block layer are finised, we will stuck
in blk_cleanup_queue() and the remove will fail.
blk_cleanup_queue() calls blk_drain_queue() to drain all requests queued
before DEAD marking. However it will never success if the device is
already stopped. We'll have q->in_flight[] > 0, so the drain will not
finish.
How to reproduce the race:
1. hot-plug a virtio-blk device
2. keep reading/writing the device in guest
3. hot-unplug while the device is busy serving I/O
Test:
~1000 rounds of hot-plug/hot-unplug test passed with this patch.
Changes in v3:
- Drop blk_abort_queue and blk_abort_request
- Use __blk_end_request_all to complete request dispatched to driver
Changes in v2:
- Drop req_in_flight
- Use virtqueue_detach_unused_buf to get request dispatched to driver
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Current index allocation in virtio is based on a monotonically
increasing variable "index". This means we'll run out of numbers
after a while. E.g. someone crazy doing this in host side.
while(1) {
hot-plug a virtio device
hot-unplug the virito devcie
}
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
The remove and freeze functions have a lot of shared code; put it into a
common function that gets called by both.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
restore_common() was used when there were different thaw and freeze PM
callbacks implemented. We removed thaw in commit
f38f8387cbdc4138a492ce9f2a5f04fd3cd3cf33.
restore_common() can be removed and virtballoon_restore() can itself do
the restore ops.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
When a virtio_9p pci device is being removed, we should close down any
active channels and free up resources, we're not supposed to BUG() if there's
still an open channel since it's a valid case when removing the PCI device.
Otherwise, removing the PCI device with an open channel would cause the
following BUG():
[ 1184.671416] ------------[ cut here ]------------
[ 1184.672057] kernel BUG at net/9p/trans_virtio.c:618!
[ 1184.672057] invalid opcode: 0000 [#1] PREEMPT SMP
[ 1184.672057] CPU 3
[ 1184.672057] Pid: 5, comm: kworker/u:0 Tainted: G W 3.4.0-rc2-next-20120413-sasha-dirty #76
[ 1184.672057] RIP: 0010:[<ffffffff825c9116>] [<ffffffff825c9116>] p9_virtio_remove+0x16/0x90
[ 1184.672057] RSP: 0018:ffff88000d653ac0 EFLAGS: 00010202
[ 1184.672057] RAX: ffffffff836bfb40 RBX: ffff88000c9b2148 RCX: ffff88000d658978
[ 1184.672057] RDX: 0000000000000006 RSI: 0000000000000000 RDI: ffff880028868000
[ 1184.672057] RBP: ffff88000d653ad0 R08: 0000000000000000 R09: 0000000000000000
[ 1184.672057] R10: 0000000000000000 R11: 0000000000000001 R12: ffff880028868000
[ 1184.672057] R13: ffffffff835aa7c0 R14: ffff880041630000 R15: ffff88000d653da0
[ 1184.672057] FS: 0000000000000000(0000) GS:ffff880035a00000(0000) knlGS:0000000000000000
[ 1184.672057] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1184.672057] CR2: 0000000001181000 CR3: 000000000eba1000 CR4: 00000000000406e0
[ 1184.672057] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
x000000000117a190 *[ 1184.672057] DR3: 00000000000000**
00 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1184.672057] Process kworker/u:0 (pid: 5, threadinfo ffff88000d652000, task ffff88000d658000)
[ 1184.672057] Stack:
[ 1184.672057] ffff880028868000 ffffffff836bfb40 ffff88000d653af0 ffffffff8193661b
[ 1184.672057] ffff880028868008 ffffffff836bfb40 ffff88000d653b10 ffffffff81af1c81
[ 1184.672057] ffff880028868068 ffff880028868008 ffff88000d653b30 ffffffff81af257a
[ 1184.795301] Call Trace:
[ 1184.795301] [<ffffffff8193661b>] virtio_dev_remove+0x1b/0x60
[ 1184.795301] [<ffffffff81af1c81>] __device_release_driver+0x81/0xd0
[ 1184.795301] [<ffffffff81af257a>] device_release_driver+0x2a/0x40
[ 1184.795301] [<ffffffff81af0d48>] bus_remove_device+0x138/0x150
[ 1184.795301] [<ffffffff81aef08d>] device_del+0x14d/0x1b0
[ 1184.795301] [<ffffffff81aef138>] device_unregister+0x48/0x60
[ 1184.795301] [<ffffffff8193694d>] unregister_virtio_device+0xd/0x10
[ 1184.795301] [<ffffffff8265fc74>] virtio_pci_remove+0x2a/0x6c
[ 1184.795301] [<ffffffff818a95ad>] pci_device_remove+0x4d/0x110
[ 1184.795301] [<ffffffff81af1c81>] __device_release_driver+0x81/0xd0
[ 1184.795301] [<ffffffff81af257a>] device_release_driver+0x2a/0x40
[ 1184.795301] [<ffffffff81af0d48>] bus_remove_device+0x138/0x150
[ 1184.795301] [<ffffffff81aef08d>] device_del+0x14d/0x1b0
[ 1184.795301] [<ffffffff81aef138>] device_unregister+0x48/0x60
[ 1184.795301] [<ffffffff818a36fa>] pci_stop_bus_device+0x6a/0x90
[ 1184.795301] [<ffffffff818a3791>] pci_stop_and_remove_bus_device+0x11/0x20
[ 1184.795301] [<ffffffff818c21d9>] remove_callback+0x9/0x10
[ 1184.795301] [<ffffffff81252d91>] sysfs_schedule_callback_work+0x21/0x60
[ 1184.795301] [<ffffffff810cb1a1>] process_one_work+0x281/0x430
[ 1184.795301] [<ffffffff810cb140>] ? process_one_work+0x220/0x430
[ 1184.795301] [<ffffffff81252d70>] ? sysfs_read_file+0x1c0/0x1c0
[ 1184.795301] [<ffffffff810cc613>] worker_thread+0x1f3/0x320
[ 1184.795301] [<ffffffff810cc420>] ? manage_workers.clone.13+0x130/0x130
[ 1184.795301] [<ffffffff810d30b2>] kthread+0xb2/0xc0
[ 1184.795301] [<ffffffff826783f4>] kernel_thread_helper+0x4/0x10
[ 1184.795301] [<ffffffff810deb18>] ? finish_task_switch+0x78/0xf0
[ 1184.795301] [<ffffffff82676574>] ? retint_restore_args+0x13/0x13
[ 1184.795301] [<ffffffff810d3000>] ? kthread_flush_work_fn+0x10/0x10
[ 1184.795301] [<ffffffff826783f0>] ? gs_change+0x13/0x13
[ 1184.795301] Code: c1 9e 0a 00 48 83 c4 08 5b c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 54 49 89 fc 53 48 8b 9f a8 04 00 00 80 3b 00 74 0a <0f> 0b 0f 1f 84 00 00 00 00 00 48 8b 87 88 04 00 00 ff 50 30 31
[ 1184.795301] RIP [<ffffffff825c9116>] p9_virtio_remove+0x16/0x90
[ 1184.795301] RSP <ffff88000d653ac0>
[ 1184.952618] ---[ end trace a307b3ed40206b4c ]---
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
This reverts commit 8c01a529b861ba97c7d78368e6a5d4d42e946f75.
It turns out the d_unhashed() check isn't unnecessary after all: while
it's true that unhashing will increment the sequence numbers, that does
not necessarily invalidate the RCU lookup, because it might have seen
the dentry pointer (before it got unhashed), but by the time it loaded
the sequence number, it could have seen the *new* sequence number (after
it got unhashed).
End result: we might look up an unhashed dentry that is about to be
freed, with the sequence number never indicating anything bad about it.
So checking that the dentry is still hashed (*after* reading the sequence
number) is indeed the proper fix, and was never unnecessary.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Miklos Szeredi points out that we need to also worry about memory
odering when doing the dentry name comparison asynchronously with RCU.
In particular, doing a rename can do a memcpy() of one dentry name over
another, and we want to make sure that any unlocked reader will always
see the proper terminating NUL character, so that it won't ever run off
the allocation.
Rather than having to be extra careful with the name copy or at lookup
time for each character, this resolves the issue by making sure that all
names that are inlined in the dentry always have a NUL character at the
end of the name allocation. If we do that at dentry allocation time, we
know that no future name copy will ever change that final NUL to
anything else, so there are no memory ordering issues.
So even if a concurrent rename ends up overwriting the NUL character
that terminates the original name, we always know that there is one
final NUL at the end, and there is no worry about the lockless RCU
lookup traversing the name too far.
The out-of-line allocations are never copied over, so we can just make
sure that we write the name (with terminating NULL) and do a write
barrier before we expose the name to anything else by setting it in the
dentry.
Reported-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nick Piggin <npiggin@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We had for some reason overlooked the AIO interface, and it didn't use
the proper rw_verify_area() helper function that checks (for example)
mandatory locking on the file, and that the size of the access doesn't
cause us to overflow the provided offset limits etc.
Instead, AIO did just the security_file_permission() thing (that
rw_verify_area() also does) directly.
This fixes it to do all the proper helper functions, which not only
means that now mandatory file locking works with AIO too, we can
actually remove lines of code.
Reported-by: Manish Honap <manish_honap_vit@yahoo.co.in>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
During early boot, when the scheduler hasn't really been fully set up,
we really can't do blocking allocations because with certain (dubious)
configurations the "might_resched()" calls can actually result in
scheduling events.
We could just make such users always use GFP_ATOMIC, but quite often the
code that does the allocation isn't really aware of the fact that the
scheduler isn't up yet, and forcing that kind of random knowledge on the
initialization code is just annoying and not good for anybody.
And we actually have a the 'gfp_allowed_mask' exactly for this reason:
it's just that the kernel init sequence happens to set it to allow
blocking allocations much too early.
So move the 'gfp_allowed_mask' initialization from 'start_kernel()'
(which is some of the earliest init code, and runs with preemption
disabled for good reasons) into 'kernel_init()'. kernel_init() is run
in the newly created thread that will become the 'init' process, as
opposed to the early startup code that runs within the context of what
will be the first idle thread.
So by the time we reach 'kernel_init()', we know that the scheduler must
be at least limping along, because we've already scheduled from the idle
thread into the init thread.
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Cc: David Rientjes <rientjes@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There is no point having the NET dependency on the select target, as it
forces all users to depend on NET to tell they support BPF_JIT. Move
the config option to the bottom of the file - this could be a nice place
also for future "selectable" config symbols.
Fix up all users to drop the dependency on NET now that it is not
required to supress warnings for non-NET builds.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
|
|
Use single_release() instead of seq_release() to free memory allocated
by single_open().
Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Steven Miao <realmz6@gmail.com>
Signed-off-by: Bob Liu <lliubbo@gmail.com>
|
|
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bob Liu <lliubbo@gmail.com>
|