path: root/sys/kern
Commit message  Author  Age  Files  Lines
...
* copyright++;jsg2019-01-011-2/+2
|
* nanosleep: loop tsleep(9) to ensure coverage of the full timeout range.cheloha2018-12-311-10/+13
| | | | | | | | | | | | tsleep(9)'s maximum timeout shrinks as HZ grows, so this ensures we do not return early from longer timeouts on alpha or on custom kernels. POSIX says you cannot return early unless a signal is delivered, so this makes us more compliant with the standard. While here, remove the 100 million second upper bound. It is an artifact from itimerfix() and it serves no discernible purpose. ok tedu@ visa@
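The looping idea in the nanosleep change above can be sketched in user-space C. This is not the kernel code: `MAX_CHUNK_NS` and `chunk_sleep()` are hypothetical stand-ins for tsleep(9)'s HZ-dependent maximum timeout and for tsleep(9) itself, and signal delivery is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-call cap, standing in for tsleep(9)'s maximum timeout. */
#define MAX_CHUNK_NS UINT64_C(1000000000)

/* Hypothetical sleep primitive: pretends to sleep and reports the time slept. */
static uint64_t
chunk_sleep(uint64_t ns)
{
	assert(ns <= MAX_CHUNK_NS);
	return ns;
}

/*
 * Cover the whole request by looping over capped chunks, so a long
 * timeout never returns early just because one call cannot express it.
 * Returns the number of chunks that were needed.
 */
unsigned int
sleep_full(uint64_t request_ns)
{
	unsigned int chunks = 0;

	while (request_ns > 0) {
		uint64_t ns = request_ns < MAX_CHUNK_NS ? request_ns : MAX_CHUNK_NS;

		request_ns -= chunk_sleep(ns);
		chunks++;
	}
	return chunks;
}
```

With the assumed one-second cap, a 2.5 second request takes three iterations instead of being truncated to the cap.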
* sys_nanosleep: switch to descriptive, idiomatic variable names; ok tedu@cheloha2018-12-291-20/+19
|
* Rectify some issues with the noperm mount flag; the root vnode was notnatano2018-12-232-8/+15
| | | | | | | | protected properly and files without any x bit set were accidentally considered executable when checked with access(2). Issues found and reported by deraadt, halex, reyk, tb ok deraadt
* When using MSG_WAITALL, soreceive() can sleep while processing thebluhm2018-12-171-3/+11
| | | | | | | | receive buffer of a stream socket. Then a new pair of control and data mbuf can be appended to the mbuf queue. In this case, terminate the loop with a short read to prevent a panic. Userland should read the control message with the next system call. OK claudio@ deraadt@
* Remove unused function gsignal().visa2018-12-171-13/+1
| | | | OK deraadt@ anton@
* add task_pendingdlg2018-12-161-3/+1
| | | | | | | | | | jsg@ wants this for drm, and i've had a version of it in diffs since 2016, but obviously haven't needed to use it just yet. task_pending is modelled on timeout_pending, and tells you if the task is on a list waiting to execute. ok jsg@
* free(9) sizes for sysv shm.mpi2018-12-121-3/+6
| | | | ok bluhm@, visa@
* free(9) sizes for SVID semaphores.mpi2018-12-121-4/+5
| | | | ok bluhm@, visa@
* free(9) sizes for netcred.mpi2018-12-071-4/+6
| | | | ok visa@
* Core files with >65535 sections have to use PN_XNUM and a section headerguenther2018-12-061-8/+65
| | | | | | | | to pass the real count, with a minimal .shstrtab segment for consistency. Also, add support for PN_XNUM to readelf. problem reported and testing by claudio@ ok kettenis@
* free(9) sizes for softcs.mpi2018-12-051-6/+8
| | | | ok tedu@
* free(9) size for temporary buffer.mpi2018-12-051-7/+7
| | | | ok ratchov@
* Trivial MH_ALIGN/M_ALIGN to m_align conversions.claudio2018-11-302-7/+7
| | | | OK bluhm@
* EVFILT_TIMER: Remove extra tick from tvtohz(9) on timeout reload.cheloha2018-11-271-2/+6
| | | | | | | | | | | | | | | tvtohz(9) adds an extra tick to account for the present tick, but this tick needs to be removed when the timeout is reloaded thereafter. We already do this for periodic setitimer(2) timeouts. Prompted by Paul Herman's writeup on clock aliasing for DragonflyBSD: https://frenchfries.net/paul/dfly/nanosleep.html Also fixed in FreeBSD r238424. Style tweaks from visa. ok visa@, guenther@
* In unp_internalize() check the length more carefully preventing anclaudio2018-11-211-1/+3
| | | | | | | underflow in a later calculation. Using the same CMSG_LEN(0) check that other cmsghdr handlers implemented. Problem found by anton@ OK anton@, deraadt@, visa@
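A small illustration of the kind of length check described above, assuming the later calculation subtracts the header size from `cmsg_len`. The helper name `cmsg_payload_len()` is hypothetical and this is not the code from unp_internalize(); only the `CMSG_LEN()` macro is taken from the system headers.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Return the payload length of a control message, or -1 if cmsg_len is
 * too small to cover even the header. Checking first avoids the unsigned
 * underflow that `cmsg_len - CMSG_LEN(0)` would otherwise produce.
 */
long
cmsg_payload_len(size_t cmsg_len)
{
	if (cmsg_len < CMSG_LEN(0))
		return -1;
	return (long)(cmsg_len - CMSG_LEN(0));
}
```

Without the guard, a `cmsg_len` of 0 would wrap around to a very large unsigned value rather than being rejected.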
* When using MSG_PEEK to peek into packets skip control messages holdingclaudio2018-11-212-11/+20
| | | | | | SCM_RIGHTS from being sent to the userland since they hold kernel internal data and it does not make sense to externalize it. OK deraadt@, guenther@, visa@
* free(9) sizes for bread_cluster().mpi2018-11-211-5/+5
| | | | ok mikeb@, visa@
* delete the dns jackport experiment. it has no future.tedu2018-11-192-34/+2
|
* Utilize sigio with sockets.visa2018-11-193-16/+17
| | | | OK mpi@
* Add new KERN_CPUSTATS sysctl(2) so we can identify offline CPUs.cheloha2018-11-173-3/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because of hw.smt we need a way to determine whether a given CPU is "online" or "offline" from userspace. KERN_CPTIME2 is an array, and so cannot be cleanly extended for this purpose, so add a new sysctl(2) KERN_CPUSTATS with an extensible struct. At the moment it's just KERN_CPTIME2 with a flags member, but it can grow as needed. KERN_CPUSTATS appears to have been defined by BSDi long ago, but there are few (if any) packages in the wild still using the symbol so breakage in ports should be near zero. No other system inherited the symbol from BSDi, either. Then, use the new sysctl(2) in systat(1) and top(1): - systat(1) draws placeholder marks ('-') instead of percentages for offline CPUs in the cpu view. - systat(1) omits offline CPU ticks when drawing the "big bar" in the vmstat view. The upshot is that the bar isn't half idle when half your logical CPUs are disabled. - top(1) does not draw lines for offline CPUs; if CPUs toggle on or offline in interactive mode we redraw the display to expand/reduce space for the new/missing CPUs. This is consistent with what some top(1) implementations do on Linux. - top(1) omits offline CPUs from the totals when CPU totals are combined into a single line (the '-1' flag). Originally prompted by deraadt@. Discussed endlessly with deraadt@, kettenis@, and sthen@. Tested by jmc@ and jca@. Earlier versions also discussed with jca@. Earlier versions tested by jmc@, tb@, and many others. docs ok jmc@, kernel bits ok kettenis@, everything ok sthen@, "Is your stuff in yet?" deraadt@
* Avoid leaking kernel memory in struct kevent padding.millert2018-11-171-1/+2
| | | | From NetBSD (maxv). OK deraadt@ visa@
* Revert previous, it breaks regress.mpi2018-11-141-3/+3
|
* Userland malloc(3) & free(3) take only one argument.mpi2018-11-141-3/+3
|
* Fix fcntl(fd, F_GETOWN) with pipes. As a regressionvisa2018-11-131-2/+2
| | | | | | | of kern_descrip.c r1.177 and sys_pipe.c r1.82, the call always returned an error. OK jca@ anton@ mpi@
* Utilize sigio with pipes. This makes fcntl(fd, F_SETOWN, arg) correctlyvisa2018-11-121-8/+8
| | | | | | | | handle arg as a process ID if the value is positive and as a process group ID if the value is negative. In addition, now the signal sending checks privileges. OK mpi@
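The sign convention for the F_SETOWN argument described above can be sketched as a small decoder. The type and function names here are hypothetical and are not the kernel's sigio interface. Treating 0 as "no owner" is an assumption of this sketch, and privilege checking is left out.

```c
#include <assert.h>
#include <limits.h>
#include <sys/types.h>

enum owner_kind { OWNER_NONE, OWNER_PROCESS, OWNER_PGRP };

struct owner {
	enum owner_kind kind;
	pid_t id;	/* process ID or process group ID, always positive */
};

/*
 * Decode an F_SETOWN-style argument: positive means a process ID,
 * negative means a process group ID (stored as its absolute value).
 * Zero is treated as "no owner", which is an assumption of this sketch.
 */
struct owner
decode_owner(int arg)
{
	struct owner o = { OWNER_NONE, 0 };

	assert(arg != INT_MIN);	/* negating INT_MIN would overflow */
	if (arg > 0) {
		o.kind = OWNER_PROCESS;
		o.id = arg;
	} else if (arg < 0) {
		o.kind = OWNER_PGRP;
		o.id = -arg;
	}
	return o;
}
```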
* Add a mechanism for managing asynchronous IO signal registrations.visa2018-11-124-4/+279
| | | | | | | | | It centralizes IO signal privilege checking and makes it possible to revoke a registration when the target process or process group is deleted. Adapted from FreeBSD. OK kettenis@ mpi@ guenther@
* Introduce m_align() a function that works like M_ALIGN() but works withclaudio2018-11-121-15/+25
| | | | | | | | all types of mbufs. Also introduce some KASSERT in the m_*space() functions to ensure that no negative number is returned. This also introduces two internal macros M_SIZE() & M_DATABUF() which return the right size and start pointer of the mbuf data area. Use it in a few obvious places to simplify code. OK bluhm@
* use the LFPRINTF() debug macro consistently; ok mpi@anton2018-11-101-38/+11
|
* Conform to POSIX-2001 in which the behavior of passing a negative length usinganton2018-11-101-6/+11
| | | | | | | posix file locks is defined. Also, detect overflows when dealing with positive lengths. ok millert@ visa@
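For reference, POSIX defines a negative `l_len` as covering the bytes from `l_start + l_len` through `l_start - 1`, and a zero `l_len` as extending to the end of the file. The sketch below resolves such a request into an inclusive byte range with an overflow check. The struct and function names are hypothetical, `start` is assumed to be already resolved to an absolute offset, and the particular errno values are an assumption rather than a copy of the kernel's behavior.

```c
#include <errno.h>
#include <stdint.h>

struct lock_range {
	int error;	/* 0 on success, otherwise an errno value */
	int64_t first;	/* first locked byte */
	int64_t last;	/* last locked byte, or -1 meaning "to end of file" */
};

/* Resolve an absolute start offset and a signed length into a byte range. */
struct lock_range
resolve_lock_range(int64_t start, int64_t len)
{
	struct lock_range r = { 0, 0, 0 };

	if (start < 0) {
		r.error = EINVAL;
	} else if (len == 0) {
		r.first = start;
		r.last = -1;
	} else if (len > 0) {
		/* Reject ranges whose last byte would not fit in int64_t. */
		if (start > INT64_MAX - (len - 1)) {
			r.error = EOVERFLOW;
		} else {
			r.first = start;
			r.last = start + (len - 1);
		}
	} else {
		/* Negative length: the range ends just before start. */
		if (start + len < 0) {
			r.error = EINVAL;
		} else {
			r.first = start + len;
			r.last = start - 1;
		}
	}
	return r;
}
```

For example, a start of 10 with a length of -4 resolves to bytes 6 through 9.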
* M_LEADINGSPACE() and M_TRAILINGSPACE() are just wrappers forclaudio2018-11-094-21/+28
| | | | | | m_leadingspace() and m_trailingspace(). Convert all callers to call directly the functions and remove the defines. OK krw@, mpi@
* new sysctl for userland malloc flags, kernel part. ok millert@ deraadt@otto2018-11-061-1/+4
|
* trace struct flock; ok visa@anton2018-11-051-1/+9
|
* make debug flags continuousanton2018-11-021-2/+2
|
* If we execute a #!shell binary, the shell is an integral part of thederaadt2018-10-301-1/+3
| | | | | | | | binary so it should bypass unveil restrictions. This is similar (but different...) to how the ELF linker (ld.so) is loaded (after unveils get dropped). Discovered in doas, due to more accurate unveil semantics. ok guenther tedu beck
* irrelevant part snuck into previous commit; from semariederaadt2018-10-291-2/+1
|
* Now that most archs have better NMBCLUSTERS defaults it is possible to bringclaudio2018-10-291-3/+2
| | | | | | | | | | | | | back rev 1.90. ---- mbufs and mbuf clusters are now backed by large pools. Because of this we can relax the oversubscribe limit of socketbuffers a fair bit. Instead of maxing out as sb_max * 1.125 or 2 * sb_hiwat the maximum is increased to 8 * sb_hiwat -- which seems to be a good compromise between memory waste and better socket buffer usage. OK deraadt@ ---- ok benno@
* needs sys/lock.hderaadt2018-10-291-1/+2
|
* Correctly deal with upper level unveils by keeping track of the coveringbeck2018-10-284-99/+252
| | | | | | | unveil for each unveil in the process at unveil() time, and refactoring the handling of current directory and ISDOTDOT to be much more sensible. Worked out at ns2k18 with guenther@. ok deraadt@
* Add assertions for lockf list manipulation, hidden behind LOCKF_DIAGNOSTIC.anton2018-10-271-15/+61
| | | | | | | While here, improve existing lockf debug routines and sprinkle some more logging related to list manipulation. ok deraadt@ visa@ (as part of a larger diff)
* Rework previous lockf fix; bluhm@ noticed a regress failure during consecutiveanton2018-10-271-43/+73
| | | | | | | | runs. This is a second attempt in which the lockf structure is turned into a doubly linked list which makes it easier to ensure correctness during list insertion and deletion. ok deraadt@ visa@
* Fix a resource leak in doaccept().visa2018-10-251-3/+3
| | | | | | | | | | | | | | | If a connection that is being accepted gets aborted early, or if the user-supplied buffer is invalid, doaccept() leaks a socket. This is a regression caused by r1.153 of uipc_syscalls.c. Correct the issue by associating the socket with the file early enough. In case soaccept() or copyaddrout() fails, the socket will be freed as a result of the file closing. This logic was used by the pre-r1.153 code. closef() may block, so it is hoisted outside the fdp lock. OK bluhm@ mpi@
* Only the scheduler time statistics should be affected by spinning.bluhm2018-10-171-9/+8
| | | | | | | Change the process time accounting back to the original code before spinning time was added. No change for scheduler time. Spinning interrupts are no longer accounted to process system time. input and OK visa@
* User land time accounting has changed when kernel spinning time wasbluhm2018-10-101-3/+5
| | | | | | introduced. Account spinning time to the process system time again. time(1) has no spinning, it only shows real, user, sys. OK visa@ mpi@ deraadt@
* Fix a "copy-and-paste" error that Coverity picked up in the augment codedlg2018-10-091-2/+2
| | | | | | | | This brings it back in line with the macros. via Paco A. and the FRRouting project. ok deraadt@ visa@ guenther@ tb@
* When freeing a lockf struct that already is part of a linked list, make sure toanton2018-10-061-2/+9
| | | | | | update the next pointer for the preceding lock. Prevents a double free panic. ok millert@
* Revert KERN_CPTIME2 ENODEV changes in kernel and userspace.cheloha2018-10-052-10/+2
| | | | ok kettenis deraadt
* Call unveil_destroy() from exit1() instead of from the reaper. Fixes akettenis2018-10-041-3/+3
| | | | | | | race between the reaper and unveil_removevnode() that would trigger a KASSERT. At least as far as I can tell. Pointed out by semarie@ ok beck@, deraadt@
* Revert the inpcb table mutex commit. It triggers a witness panicbluhm2018-10-041-26/+11
| | | | | | | in raw IP delivery and UDP broadcast loops. There inpcbtable_mtx is held and sorwakeup() is called within the loop. As sowakeup() grabs the kernel lock, we have a lock ordering problem. found by Hrvoje Popovski; OK deraadt@ mpi@
* Use atomic operations to update vfc_refcount. Change the field's typevisa2018-09-291-4/+5
| | | | | | to unsigned int. OK deraadt@