path: root/usr.sbin/vmd
Commit message | Author | Age | Files | Lines
* vmd(8): fix ns8250 lockup due to race condition | pd | 2020-06-21 | 1 | -16/+16
| Inject a pending interrupt even if the rcv_pending flag is set to avoid the endless EV_READ loop where a byte lingers ready to be read but the vcpu never gets the interrupt to read it (e.g. the result of spamming RETURN via the serial console).
| Also, protect com ratelimit handler with mutexes to avoid corruption of the device state. These changes help prevent linux vm crashes when the return key is held on boot.
| Discovered by and patch from Dave Voutila <dave@sisu.io>
| ok tb@
* vmd(8): backout previous commit to ns8250.c as it reintroduced the bug where the | pd | 2020-06-16 | 1 | -17/+9
| vm would get stuck if disconnected from console and get unstuck once console is attached.
| Spotted by tb@
* vmd(8): fix ns8250 lockup due to race condition | pd | 2020-06-16 | 1 | -9/+17
| Inject pending interrupt if com has receive pending. This was previously accidentally checked in with an unrelated change by Mike Larkin and was backed out as it didn't fix the intended problem.
| Also, protect com ratelimit handler with mutexes to avoid corruption of the device state. These changes help prevent linux vm crashes when the return key is held on boot.
| Discovered by and patch from Dave Voutila <dave@sisu.io>
* vmd(8): correctly terminate vm processes after sending vm | pd | 2020-04-30 | 2 | -6/+6
| Instead of a roundabout way of sending a message to vmm that 'send is successful' and terminating by vm_remove from vmm, we can send the imsg and exit in the vm process. The sigchld handler in vmm will vm_remove it from its structures. This is how a normal vm is terminated as well.
| Previously, vm_remove was called in vmm_dispatch_vm (i.e. the event handler to receive messages from the vm process) when handling the IMSG_VMDOP_SEND_VM_RESPONSE (i.e. the vm process has written the vm state to the fd passed on by vmctl send). This is not how vm_remove was intended to be used as it does a free(vm). The vm struct holds the buffers for imsg and so after handling this IMSG_VMDOP_SEND_VM_RESPONSE message, vmm_dispatch_vm loops again to do imsg_get(ibuf, &imsg) to read the next message (and we had just freed this *ibuf when we freed the vm struct) causing it to segfault.
| reported by kn@
| ok kn@
* vmd: improve concurrency control in pause | pd | 2020-04-21 | 1 | -38/+25
| Previous implementation hit a deadlock sometimes as the pthread_cond_broadcast for the pause mutex could happen before pthread_cond_wait. This implementation uses a barrier which is hit when all vcpus are paused.
| ok mpi@
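The barrier idea in the entry above can be sketched as follows. This is a minimal illustration, not the code from the commit: every `ex_` name is hypothetical, and it assumes a POSIX system that provides `pthread_barrier_t`. Unlike a bare `pthread_cond_broadcast`, a barrier has no wakeup to lose, because the waiting thread is released only once every participant has arrived, in whatever order.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>

#define EX_MAX_VCPUS 8

static pthread_barrier_t ex_pause_barrier;
static int ex_paused[EX_MAX_VCPUS];	/* written before the barrier, read after */

/* Hypothetical vcpu thread body: record the pause, then meet at the barrier. */
static void *
ex_vcpu_pause(void *arg)
{
	int *slot = arg;

	*slot = 1;
	pthread_barrier_wait(&ex_pause_barrier);
	return NULL;
}

/* Returns the number of vcpus seen paused once the barrier releases, or -1. */
int
ex_pause_all(unsigned int nvcpus)
{
	pthread_t tids[EX_MAX_VCPUS];
	unsigned int i;
	int seen = 0;

	if (nvcpus == 0 || nvcpus > EX_MAX_VCPUS)
		return -1;
	/* nvcpus vcpu threads plus the thread that requested the pause. */
	if (pthread_barrier_init(&ex_pause_barrier, NULL, nvcpus + 1) != 0)
		return -1;
	for (i = 0; i < nvcpus; i++) {
		ex_paused[i] = 0;
		if (pthread_create(&tids[i], NULL, ex_vcpu_pause,
		    &ex_paused[i]) != 0)
			abort();	/* sketch only: no unwinding of earlier threads */
	}
	/* Blocks until every vcpu has arrived; arrival order does not matter. */
	pthread_barrier_wait(&ex_pause_barrier);
	for (i = 0; i < nvcpus; i++)
		seen += ex_paused[i];
	for (i = 0; i < nvcpus; i++)
		pthread_join(tids[i], NULL);
	pthread_barrier_destroy(&ex_pause_barrier);
	return seen;
}
```

Calling `ex_pause_all(4)` starts four stand-in vcpu threads and returns only after all four have reached the barrier.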
* vmm(4): add IOCTL handler to set the access protections of the ept | pd | 2020-04-08 | 1 | -2/+40
| This exposes VMM_IOC_MPROTECT_EPT which can be used by vmd to lock in physical pages. Currently, vmd just terminates the vm in case it gets a protection fault in the future.
| This feature is used by solo5 which uses vmm(4) as a backend hypervisor.
| ok mpi@
| Patch from Adam Steen <adam@adamsteen.com.au>
* Backout "DHCP is configured on the first interface only" | kn | 2020-02-16 | 1 | -3/+3
| I completely missed that part from vmctl.5's "LOCAL INTERFACES" section. Reading `-L's description itself and the fact that it functions as a boolean switch contrary to how `-i' expects a number, I made the wrong assumption that it can only work for the first interface.
| "vmctl -Li2" configures two interfaces, one with DHCP and one without. "vmctl -L -L" however configures two interfaces with DHCP IPs each.
| My second mistake was to imply analogue behaviour for the configuration. Now that you stated the obvious about `local' being per `interface' line, it makes absolutely no sense to have the above-mentioned behaviour for static VM definitions.
| Pointed out by tb
* DHCP is configured on the first interface only | kn | 2020-02-15 | 1 | -3/+3
| A VM can have multiple interfaces, but only the first one gets DHCP if "-L" (vmctl) or "local" (vm.conf) is specified.
| Positive feedback Mike Larkin
* briefly mention /etc/examples/ in the FILES section of all the | schwarze | 2020-02-10 | 1 | -2/+7
| manual pages that document the corresponding configuration files; OK jmc@, and general direction discussed with many
* Guest VMs require some resources that are managed outside of vmm(4), so | phessler | 2020-01-15 | 1 | -2/+11
| try to document and enumerate them.
| This is most helpful when you try to assign the 5th interface to a guest, and are confused why vmd(8) won't start the guest when only 4 tap devices exist.
| OK jmc@, kn@, pamela@
* "allow instance {...}" requires options | kn | 2019-12-17 | 1 | -3/+3
| The parameter block must not be omitted, so remove the Op markup.
* kn pointed out that the changes i made to "socket owner" can be | jmc | 2019-12-17 | 1 | -5/+11
| applied to "owner" too;
* combine "socket owner user[:group]" and "socket owner :group" | jmc | 2019-12-17 | 1 | -7/+13
| into one logical item;
| ok pd
* tweak previous; ok pd | jmc | 2019-12-13 | 1 | -12/+13
* Make owner value mandatory | kn | 2019-12-12 | 1 | -6/+2
| Omitting the owner value is not documented and ought to be rather invalid syntax, but it parses as "[socket] owner root:wheel" which is the same as simply omitting the owner line entirely.
| Require a value, that is treat "socket owner" and "owner" as invalid syntax and fail.
| OK denis
* vmd: start vms defined in vm.conf in a staggered fashion | pd | 2019-12-12 | 4 | -34/+87
| This addresses the 'thundering herd' problem when a lot of vms are configured in vm.conf. A lot of vms booting in parallel can overload the host and also mess up tsc calibration in openbsd guests as it uses PIT which doesn't fire reliably if the host is overloaded.
| We default to starting vms with parallelism of ncpuonline and a delay of 30 seconds between batches. This is configurable in vm.conf.
| ok mlarkin@ (also addressed comments from cheloha@)
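The batching described above can be illustrated with a small helper. The default parallelism (ncpuonline) and the 30 second delay come from the commit message; the formula and the function name `ex_start_delay` are assumptions made for this sketch, and the real scheduling in vmd may work differently.

```c
/*
 * Hypothetical helper: seconds to wait before starting the VM at
 * position idx when VMs are started in batches of `parallelism`
 * with `delay` seconds between batches.
 */
unsigned int
ex_start_delay(unsigned int idx, unsigned int parallelism, unsigned int delay)
{
	if (parallelism == 0)
		parallelism = 1;	/* avoid dividing by zero */
	return (idx / parallelism) * delay;
}
```

With a parallelism of 4 and a delay of 30, VMs 0 to 3 start immediately, VMs 4 to 7 after 30 seconds, and VMs 8 to 11 after 60 seconds.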
* vmd: proper concurrency control when pausing a vm | pd | 2019-12-11 | 6 | -42/+169
| Removes an XXX which slept for 1s waiting for the vcpu thread to reach HLT and pause. We now define a paused and unpaused condition so that a call to pause_vm() / vmctl pause blocks till the vm really reaches a paused state.
| Also, detach events for devices from event loop when pausing and add them back when unpausing. This is because some callbacks call pthread_mutex_lock and if the vm is paused, it would block also causing the libevent thread to block. This would mean that we would not be able to process any IMSGs received from vmm (parent process) including a message to unpause.
| ok mlarkin@
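Blocking until a vcpu has really paused, instead of sleeping for a fixed second, is a standard condition-variable pattern. The sketch below shows that pattern in isolation. All `ex_` names are hypothetical, and one vcpu thread stands in for the real set; it is not the code from the commit.

```c
#include <pthread.h>
#include <stdbool.h>

struct ex_vcpu {
	pthread_mutex_t	mtx;
	pthread_cond_t	cond;
	bool		pause_requested;
	bool		paused;
};

/* Stand-in vcpu thread: wait for a pause request, then report paused. */
static void *
ex_vcpu_loop(void *arg)
{
	struct ex_vcpu *v = arg;

	pthread_mutex_lock(&v->mtx);
	while (!v->pause_requested)	/* predicate guards against lost wakeups */
		pthread_cond_wait(&v->cond, &v->mtx);
	v->paused = true;
	pthread_cond_broadcast(&v->cond);	/* wake the thread in ex_pause() */
	pthread_mutex_unlock(&v->mtx);
	return NULL;
}

/* Returns 1 once the vcpu has reported itself paused, -1 on setup failure. */
int
ex_pause(void)
{
	struct ex_vcpu v = { .pause_requested = false, .paused = false };
	pthread_t tid;

	if (pthread_mutex_init(&v.mtx, NULL) != 0)
		return -1;
	if (pthread_cond_init(&v.cond, NULL) != 0)
		return -1;
	if (pthread_create(&tid, NULL, ex_vcpu_loop, &v) != 0)
		return -1;

	pthread_mutex_lock(&v.mtx);
	v.pause_requested = true;
	pthread_cond_broadcast(&v.cond);
	while (!v.paused)		/* block until the vcpu is really paused */
		pthread_cond_wait(&v.cond, &v.mtx);
	pthread_mutex_unlock(&v.mtx);

	pthread_join(tid, NULL);
	pthread_cond_destroy(&v.cond);
	pthread_mutex_destroy(&v.mtx);
	return 1;
}
```

Both waits re-check a flag under the mutex, so the result is the same whichever thread runs first.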
* Fully reinstate revision 1.21. Apparently, revision 1.22 (part of | tb | 2019-12-08 | 1 | -27/+15
| the "Fix at least one cause of VMs spinning at 100% host CPU" commit) accidentally included some pieces of a different WIP. These pieces remained in the tree after the revert and caused vmd to busy loop after attaching to and detaching from a VM's console.
| "please commit" mlarkin
* Revert previous - the stability was not as improved as we had thought and | mlarkin | 2019-11-30 | 5 | -31/+7
| we ended up accidentally breaking vmctl. This will need more thought.
| ok ori@
* Fix at least one cause of VMs spinning at 100% host CPU | mlarkin | 2019-11-29 | 5 | -21/+56
| After debugging with ori@, it looks like an event ends up on the wrong libevent queue, and we end up continually de-queueing and re-queueing the event. While it's unclear exactly why this happened, a clue on libevent's github issues page for the same problem pointed us to using a different event base for the device events. This seems to have unstuck ori@'s problematic VM, and I have also seen no more hangs after this.
| We have not completely separated the queues; ori@ will work on setting new libevent bases for those later. But those events are pretty frequency.
| with help from and ok ori@
* Consistently use _rcctl enable foo_ in examples, it's simpler and less | landry | 2019-11-10 | 1 | -5/+10
| error prone than manually editing rc.conf.local, and also works to enable ipsec and accounting.
| tweak from schwarze@ to use the \(dq\(dq syntax for quotes in '.Dl foo_flags="" lines' instead of \&"\&".
| while at it, fix a reference to a bogus /dev/dhclient.conf file that recently snuck in.
| ok jmc@ deraadt@ schwarze@
* ifname in opentap() is not optional | kn | 2019-10-25 | 1 | -5/+4
| The function argument is not checked at all and the only caller in config.c always passes a valid buffer.
| Defer the error case's default value to the end to avoid rewriting in case a node is opened.
| Feedback and OK reyk
* vmd(8): provide some additional info in a debug msg | mlarkin | 2019-10-16 | 1 | -3/+4
| Print the guest %rip when it tries to do I/O to a nonexistent port. Also convert the message to a DPRINTF so that it doesn't leak guest address information into any logging the host might be doing under normal non-debug conditions.
* use sizeof(struct) not sizeof(pointer) in calloc call | jsg | 2019-10-11 | 1 | -2/+2
| ok deraadt@
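The class of bug named in the entry above is easy to reproduce in isolation. In this sketch the struct and function names are hypothetical and unrelated to the file the commit touched. It uses the `sizeof(*ptr)` idiom, which evaluates to the size of the struct and stays correct if the pointer's type changes later.

```c
#include <stdlib.h>

struct ex_dev {			/* hypothetical device record */
	int	fd;
	char	mac[6];
	char	lladdr[64];
};

struct ex_dev *
ex_alloc_devs(size_t n)
{
	struct ex_dev *devs;

	/*
	 * Wrong: calloc(n, sizeof(devs)) sizes each element as a pointer,
	 * so later writes to devs[i] run past the allocation.
	 */
	devs = calloc(n, sizeof(*devs));	/* same as sizeof(struct ex_dev) */
	return devs;
}
```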
* vmd(8): fix memory leak in virtio network TX path. | mlarkin | 2019-09-24 | 1 | -1/+4
| ok reyk, mpi, benno, tb
* vmd(8): virtio.c whitespace removal | mlarkin | 2019-09-24 | 1 | -2/+2
* Remove unused VMD_DISK_INVALID message type and mark it obsolete. | tobhe | 2019-09-07 | 1 | -2/+2
| ok mlarkin@
* vmd(8): memory leak in an error path | mlarkin | 2019-09-04 | 1 | -1/+2
| Found by Hiltjo Posthuma, thanks!
* Improve the error message when supplying an invalid template to vmctl | anton | 2019-08-14 | 2 | -3/+9
| start. Favoring 'invalid template' over 'permission denied' should give the user a better hint on what went wrong.
| ok kn@ mlarkin@
* vmm/vmd: Fix migration with pvclock | pd | 2019-07-17 | 2 | -3/+45
| Implement VMM_IOC_READVMPARAMS and VMM_IOC_WRITEVMPARAMS ioctls to read and write pvclock state.
| reads ok mlarkin@
* When system calls indicate an error they return -1, not some arbitrary | deraadt | 2019-06-28 | 4 | -24/+24
| value < 0. errno is only updated in this case. Change all (most?) callers of syscalls to follow this better, and let's see if this strictness helps us in the future.
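A minimal sketch of the stricter check, using `open(2)` as the example system call. The wrapper name `ex_open_ro` is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 0 and stores the descriptor in *fdp, or returns the errno value. */
int
ex_open_ro(const char *path, int *fdp)
{
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1)		/* not "fd < 0": -1 is the documented error value */
		return errno;	/* errno is meaningful only on this path */
	*fdp = fd;
	return 0;
}
```

Comparing against -1 also documents intent: any other negative value would be a bug somewhere else, not an ordinary failure.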
* Make vmd(8)'s ns8250 emulation more correct | mlarkin | 2019-05-28 | 2 | -12/+31
| Remove the scratch register (8250s don't have this), and reorganize some constants to be able to more easily support more than one serial port in the future.
| ok deraadt
| Diff from Katherine Rohl, thanks!
* vmd: unset CR0_CD and CR0_NW in default flat64 register values | pd | 2019-05-28 | 1 | -2/+2
| These never got unset on AMD/SVM guests when booted via vmctl start -b, causing them to run very slowly.
| ok mlarkin@
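For orientation: in the x86 architecture, CR0.NW is bit 29 and CR0.CD is bit 30, and a set CD bit disables caching, which would explain the slowdown. The sketch below only shows the bit manipulation. The `EX_` macros and the function name are placeholders, not the definitions vmd uses.

```c
#include <stdint.h>

#define EX_CR0_NW	0x20000000ULL	/* bit 29: not write-through */
#define EX_CR0_CD	0x40000000ULL	/* bit 30: cache disable */

/* Clear both cache-control bits, leaving every other CR0 bit untouched. */
uint64_t
ex_enable_caching(uint64_t cr0)
{
	return cr0 & ~(EX_CR0_CD | EX_CR0_NW);
}
```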
* only reschedule the periodic interrupt after updating register A | jasper | 2019-05-27 | 1 | -2/+2
| if something changed in register A.
| when updating register A we were checking in register B if the PIE bit was set in order to decide if rtc_reschedule_per needed to be called. if that bit was changed then the timer rate would already have been adjusted by rtc_update_regb so the call from rtc_update_rega is not needed.
| this now matches what qemu and other emulators are doing too.
| ok mlarkin@
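The rule "reschedule only if register A changed" can be sketched as a compare-before-act update. The struct, the counter standing in for the real timer call, and the function name are all hypothetical. The actual code may compare only part of the register rather than the whole value.

```c
#include <stdint.h>

struct ex_rtc {
	uint8_t	rega;		/* current register A value */
	int	reschedules;	/* stands in for re-arming the periodic timer */
};

void
ex_rtc_update_rega(struct ex_rtc *rtc, uint8_t data)
{
	uint8_t old = rtc->rega;

	rtc->rega = data;
	if (old != data)	/* re-arm only when register A actually changed */
		rtc->reschedules++;
}
```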
* drop fatalx calls when claiming a new vm id; otherwise it's possible | jasper | 2019-05-20 | 1 | -15/+31
| to crash vmd and take all other vms with it. this required a little shuffling to get the error value reported back to the caller to handle the error properly.
| ok mlarkin@
* Unbreak vmctl start foo -b /bsd -d disk.img -cL | claudio | 2019-05-16 | 2 | -5/+6
| Define a local definition of LOADADDR() instead of pulling in machine/loadfile_machdep.h. vmd -b requires the addresses to be masked and the new bootloader no longer does that.
| OK pd@ kettenis@
* Delete some .Sx macros that were used in a wrong way. | schwarze | 2019-05-14 | 1 | -6/+2
| Part of a patch from Stephen Gregoratto <dev at sgregoratto dot me>.
* Add support for `boot device' to vm.conf grammar which is the `-B device' | anton | 2019-05-14 | 2 | -6/+48
| counterpart from vmctl.
| ok mlarkin@
* vmm: add a x86 page table walker | pd | 2019-05-12 | 1 | -1/+137
| Add a first cut of x86 page table walker to vmd(8) and vmm(4). This function is not used right now but is a building block for future features like HPET, OUTSB and INSB emulation, nested virtualisation support, etc.
| With help from Mike Larkin
| ok mlarkin@
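The first step of any x86 page table walk is splitting the virtual address into per-level table indexes. The sketch below does only that, and it assumes 4-level long-mode paging with 4 KiB pages. The commit message does not say which paging modes the walker handles, and the names here are hypothetical.

```c
#include <stdint.h>

struct ex_pt_index {
	unsigned int pml4, pdpt, pd, pt;	/* 9-bit table indexes */
	unsigned int offset;			/* 12-bit offset into a 4 KiB page */
};

struct ex_pt_index
ex_split_va(uint64_t va)
{
	struct ex_pt_index ix;

	ix.pml4   = (va >> 39) & 0x1ff;	/* bits 47:39 */
	ix.pdpt   = (va >> 30) & 0x1ff;	/* bits 38:30 */
	ix.pd     = (va >> 21) & 0x1ff;	/* bits 29:21 */
	ix.pt     = (va >> 12) & 0x1ff;	/* bits 20:12 */
	ix.offset = va & 0xfff;		/* bits 11:0 */
	return ix;
}
```

A full walker would use each index to read an entry from guest memory, check its present bit, and follow the physical address it contains to the next level.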
* report vm state through 'vmctl status'; whereas previously this would display the state of | jasper | 2019-05-11 | 2 | -9/+8
| the vcpu (which is why it got removed), it now actually reports the correct state (running, stopped, disabled, paused, etc)
| ok ccardenas@ mlarkin@
* vm_dump_header allocated space for a signature but it was never set; | jasper | 2019-05-11 | 2 | -2/+9
| set it to VMM_HV_SIGNATURE and check for it upon restoring a vm image
| ok mlarkin@ pd@
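The set-on-write, check-on-read pattern for a header signature can be sketched as follows. The struct layout, the placeholder string `EX_SIGNATURE`, and the function names are assumptions for illustration only. The real header and the value of VMM_HV_SIGNATURE are not reproduced here.

```c
#include <string.h>

#define EX_SIGNATURE	"EXAMPLE-SIG"	/* placeholder, not VMM_HV_SIGNATURE */

struct ex_dump_header {
	char	sig[12];	/* holds the placeholder and its NUL */
	int	version;
};

/* Fill in the header before writing an image. */
void
ex_header_init(struct ex_dump_header *h, int version)
{
	memset(h, 0, sizeof(*h));
	memcpy(h->sig, EX_SIGNATURE, sizeof(EX_SIGNATURE));
	h->version = version;
}

/* Returns 1 if the header read back from an image carries the signature. */
int
ex_header_valid(const struct ex_dump_header *h)
{
	return memcmp(h->sig, EX_SIGNATURE, sizeof(EX_SIGNATURE)) == 0;
}
```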
* add missing comment about VM_STATE_SHUTDOWN; as discussed with ccardenas@ | jasper | 2019-05-11 | 1 | -1/+2
* track the state of the vm (running, paused, etc) using a single bitfield instead of | jasper | 2019-05-11 | 6 | -60/+62
| a handful of separate variables. this will make it easier for vmd to report and check on the individual vm states
| no functional change intended
| ok ccardenas@ mlarkin@
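Tracking several states in one bitfield might look like the sketch below. The flag names and bit values are illustrative only. The real VM_STATE_* definitions may use different names and assignments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values; the real bit assignments may differ. */
#define EX_STATE_RUNNING	0x01u
#define EX_STATE_PAUSED		0x02u
#define EX_STATE_SHUTDOWN	0x04u

struct ex_vm {
	uint32_t state;		/* one field instead of several booleans */
};

void
ex_vm_pause(struct ex_vm *vm)
{
	vm->state |= EX_STATE_PAUSED;
}

void
ex_vm_unpause(struct ex_vm *vm)
{
	vm->state &= ~EX_STATE_PAUSED;
}

bool
ex_vm_is_paused(const struct ex_vm *vm)
{
	return (vm->state & EX_STATE_PAUSED) != 0;
}
```

Because each state is one bit, a VM can be running and paused at the same time, and reporting code can test any combination with a single mask.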
* sync the vm state in vmd too when (un)pausing a vm, otherwise the vm process | jasper | 2019-05-11 | 1 | -1/+3
| knows the vm is paused, but vmd does not.
| ok mlarkin@ pd@
* remove receive_vm prototype for the function does not exist (anymore) | jasper | 2019-05-10 | 1 | -2/+1
| ok pd@
* Do not unconditionally wait for read events on the pty associated with a | anton | 2019-03-11 | 2 | -4/+27
| vm console. Instead, wait for the controlling end of the pty to become writeable, which implies that the slave end is connected. A recent change to the kqueue pty implementation caused vmd to hammer the log due to constantly hitting EOF while reading from the pty since the slave end was disconnected.
| Issue found the hard way by mlarkin@ and tb@
| ok mlarkin@
* Clarify that VM names must start with a letter | kn | 2019-03-07 | 1 | -5/+7
| `start' requires an alphanumeric VM name, must not be a number and in fact must not start with a digit. Improve and simplify the current requirements as starting with a letter directly implies all of the above.
| OK mlarkin, feedback jmc
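The documented rule reduces to a check on the first character. The sketch below checks only that rule. The function name is hypothetical, and any further restrictions that vmctl places on the remaining characters are outside its scope.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if name is non-empty and starts with a letter, else 0. */
int
ex_vm_name_ok(const char *name)
{
	if (name == NULL || name[0] == '\0')
		return 0;
	/* The cast keeps isalpha() defined for chars with the high bit set. */
	return isalpha((unsigned char)name[0]) != 0;
}
```

A name that starts with a letter cannot be a plain number and cannot start with a digit, which is why the single check covers the older wording.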
* vmd(8): remove some i386 remnants that missed the original cleanup | mlarkin | 2019-03-01 | 2 | -47/+2
| ok pd, kn, deraadt
* vmd(8): initialize guest %drX registers to power-on defaults on launch | mlarkin | 2019-02-20 | 2 | -3/+15
| Initializes the %drX registers to power-on defaults, and bump the VM send/receive header to reflect same
| discussed with deraadt@
* (unsigned) means (unsigned int) which on ptrdiff_t or size_t or other | deraadt | 2019-02-13 | 1 | -3/+3
| larger types really is a range reduction... Almost any cast to (unsigned) is a bug.
| ok millert tb benno
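The range reduction is easy to demonstrate. This sketch assumes that `unsigned int` is narrower than 64 bits, which holds on the common ILP32 and LP64 platforms. The function name is hypothetical.

```c
#include <stdint.h>

/* Returns 1 when casting v to (unsigned) changes its value, else 0. */
int
ex_cast_loses_bits(uint64_t v)
{
	unsigned narrowed = (unsigned)v;	/* (unsigned) is (unsigned int) */

	return (uint64_t)narrowed != v;
}
```

With a 32-bit `unsigned int`, a length of 4 GiB (1 << 32) becomes 0 after the cast, which is exactly the kind of silent truncation the commit warns about.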