path: root/sys/kern/kern_time.c
Commit message [author, date, files changed, lines removed/added]
* inittodr(9): introduce dedicated flag to enable writes from resettodr(9) [cheloha, 2020-06-22, 1 file, -2/+9]
| We don't want resettodr(9) to write the RTC until inittodr(9) has actually run. Until inittodr(9) calls tc_setclock() the system UTC clock will contain a meaningless value and there's no sense in overwriting a good value with a value we know is nonsense.
|
| This is not an uncommon problem if you're debugging a problem in early boot, e.g. a panic that occurs prior to inittodr(9).
|
| Currently we use the following logic in resettodr(9) to inhibit writes:
|
|         if (time_second == 1)
|                 return;
|
| ... this is too magical. A better way to accomplish the same thing is to introduce a dedicated flag set from inittodr(9). Hence, "inittodr_done".
|
| Suggested by visa@. ok kettenis@
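The gating described in this commit can be modeled in userland roughly as follows. This is only a sketch: the struct, its field names other than inittodr_done, and the model_* functions are hypothetical and are not the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the clock state. Only the name "inittodr_done"
 * comes from the commit message; everything else is invented here.
 */
struct clock_model {
	bool	inittodr_done;	/* set once the system clock is meaningful */
	int64_t	system_utc;	/* system UTC clock, in seconds */
	int64_t	rtc;		/* value stored in the RTC, in seconds */
};

/* Model of inittodr(9): seed the system clock, then allow write-back. */
void
model_inittodr(struct clock_model *c)
{
	c->system_utc = c->rtc;
	c->inittodr_done = true;
}

/* Model of resettodr(9): returns true only if the RTC was written. */
bool
model_resettodr(struct clock_model *c)
{
	if (!c->inittodr_done)
		return false;	/* system clock still holds nonsense */
	c->rtc = c->system_utc;
	return true;
}
```

Compared with testing for a magic clock value, the explicit flag makes the precondition visible at the point where the write is refused.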
* clock_gettime(2): use nanoruntime(9) to get value for CLOCK_UPTIME [cheloha, 2020-05-20, 1 file, -5/+2]
|
* Add function for attaching RTC drivers, to reduce direct use [visa, 2020-05-17, 1 file, -1/+7]
| of todr_handle.
|
| OK kettenis@
* Make inittodr() and resettodr() MI. [kettenis, 2020-05-16, 1 file, -1/+91]
| ok deraadt@, mpi@, visa@
| ok cheloha@ as well (would have preferred in new file for this code)
* nanosleep(2): tsleep(9) -> tsleep_nsec(9) [cheloha, 2020-03-20, 1 file, -4/+5]
| While here, rename the wait channel so the tsleep_nsec(9) call will fit onto a single line. It isn't a global channel so the name is arbitrary anyway.
|
| With input from visa@.
|
| ok visa@
* adjfreq(2): fix atomic swap [cheloha, 2019-11-07, 1 file, -4/+4]
| I broke adjfreq(2)'s atomic swap in kern_time.c,v1.112. By using the "f" variable to store both the new and old frequency adjustments, the new adjustment gets clobbered by the old adjustment if the caller asked for a swap.
|
| ok visa@ mpi@
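The bug class described here, one variable reused for both the old and the new value, can be avoided by capturing the old value in its own local first. The following is a userland sketch under that assumption; tc_adj_freq_model and model_adjfreq() are hypothetical names, and locking is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the timecounter's frequency adjustment. */
static int64_t tc_adj_freq_model;

/*
 * Swap-style update: optionally install *newfreq and optionally report
 * the previous value through *oldfreq. Keeping the previous value in a
 * separate local means the new value can never be overwritten by it.
 */
void
model_adjfreq(const int64_t *newfreq, int64_t *oldfreq)
{
	int64_t prev = tc_adj_freq_model;	/* capture the old value first */

	if (newfreq != NULL)
		tc_adj_freq_model = *newfreq;
	if (oldfreq != NULL)
		*oldfreq = prev;
}
```

Passing both pointers performs the swap; passing only oldfreq reads the current adjustment without changing it.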
* clock_getres(2): actually return the resolution of the given clock [cheloha, 2019-10-26, 1 file, -9/+18]
| Currently we return (1000000000 / hz) from clock_getres(2) as the resolution for every clock. This is often untrue.
|
| For CPUTIME clocks, if we have a separate statclock interrupt the resolution is (1000000000 / stathz). Otherwise it is as we currently claim: (1000000000 / hz).
|
| For the REALTIME/MONOTONIC/UPTIME/BOOTTIME clocks the resolution is that of the active timecounter. During tc_init() we can compute the precision of a timecounter by examining its tc_counter_mask and store it for lookup later in a new member, tc_precision. The resolution of a clock backed by a timecounter "tc" is then
|
|         tc.tc_precision * (2^64 / tc.tc_frequency)
|
| fractional seconds.
|
| While here we can clean up sys_clock_getres() a bit.
|
| Standards input from guenther@. Lots of input, feedback from kettenis@.
|
| ok kettenis@
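The two resolution rules in this commit can be worked through numerically. The sketch below is a simplification: it reports nanoseconds directly rather than 2^64-scaled fractional seconds, it assumes the precision is a small tick count so the multiplication does not overflow, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Resolution of a CPUTIME clock in nanoseconds. stathz == 0 models a
 * system without a separate statclock interrupt, which falls back to hz.
 */
uint64_t
cputime_res_ns(unsigned int hz, unsigned int stathz)
{
	assert(hz > 0);
	return 1000000000ULL / (stathz != 0 ? stathz : hz);
}

/*
 * Resolution of a timecounter-backed clock in nanoseconds: `precision`
 * counter ticks at `frequency` Hz last precision / frequency seconds.
 * The result is truncated toward zero.
 */
uint64_t
tc_res_ns(uint64_t precision, uint64_t frequency)
{
	assert(frequency > 0);
	return precision * 1000000000ULL / frequency;
}
```

For example, hz = 100 with no statclock gives 10000000 ns, while a counter running at 3579545 Hz with a precision of one tick gives 279 ns after truncation.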
* gettimeofday, settimeofday(2): limit timezone support [cheloha, 2019-09-04, 1 file, -4/+4]
| For gettimeofday(2), always copy out an empty timezone struct. For settimeofday(2), still copyin(9) the struct but ignore the contents.
|
| In gettimeofday(2)'s case we have not changed the original BSD semantics: the kernel only tracks UTC time without an offset for DST, so a zeroed timezone struct is the correct thing to return to the caller.
|
| Future work could move these out into libc as stubs for clock_gettime and clock_settime(2). But, definitely a "later" thing, given that we are in beta.
|
| Update the manpage to de-emphasize the timezone parameters for these syscalls.
|
| Discussed with tedu@, deraadt@, millert@, kettenis@, yasuoka@, jca@, and guenther@. Tested by job@. Ports input from jca@ and sthen@. Manpage input from jca@.
|
| ok jca@ deraadt@
* R.I.P. itimerround(); ok mpi@ [cheloha, 2019-08-03, 1 file, -12/+1]
|
* per-process itimers: itimerval -> itimerspec [cheloha, 2019-08-02, 1 file, -34/+35]
| Loongson runs at 128hz. 128 doesn't divide evenly into a million, but it does divide evenly into a billion. So if we do the per-process itimer bookkeeping with itimerspec structs we can have error-free virtual itimers on loongson just as we do on most other platforms.
|
| This change doesn't fix the virtual itimer error on alpha, as 1024 does not divide evenly into a billion. But this doesn't make the situation any worse, either.
|
| ok deraadt@
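The divisibility argument can be checked with a few lines of arithmetic. This is an illustrative helper only; tick_is_exact() is a hypothetical name, not a kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if one tick at `hz` is a whole number of time units, given
 * `units_per_sec` units per second (1000000 for microseconds,
 * 1000000000 for nanoseconds).
 */
bool
tick_is_exact(uint32_t hz, uint32_t units_per_sec)
{
	return hz != 0 && units_per_sec % hz == 0;
}
```

With hz = 128 a tick is 7812.5 microseconds but exactly 7812500 nanoseconds, so nanosecond bookkeeping is exact. With hz = 1024 a tick is 976562.5 nanoseconds, so the rounding error remains.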
* itimerdecr(): simplify logic with timer*(9) macros; ok millert@ [cheloha, 2019-07-25, 1 file, -32/+19]
|
* R.I.P. timespecfix(); ok visa@ mpi@ [cheloha, 2019-07-02, 1 file, -15/+1]
|
* Switch from bintime_add() et al. to bintimeadd(9). [cheloha, 2019-06-03, 1 file, -3/+3]
| Basically just make all the bintime routines look and behave more like the timeradd(3) macros.
|
| Switch to three-argument forms for structure math, introduce and use bintimecmp(9), and rename the structure conversion routines to resemble e.g. TIMEVAL_TO_TIMESPEC(3).
|
| Document all of this in a new bintimeadd.9 page.
|
| Code input from mpi@, manpage input from schwarze@.
|
| code ok mpi@, docs ok schwarze@, docs probably still ok jmc@
* Revert to using the SCHED_LOCK() to protect time accounting. [mpi, 2019-06-01, 1 file, -11/+3]
| It currently creates a lock ordering problem because SCHED_LOCK() is taken by hardclock(). That means the "priorities" of a thread should be moved out of the SCHED_LOCK() first in order to make progress.
|
| Reported-by: syzbot+8e4863b3dde88eb706dc@syzkaller.appspotmail.com via anton@ as well as by kettenis@
* Use a per-process mutex to protect time accounting instead of SCHED_LOCK(). [mpi, 2019-05-31, 1 file, -3/+11]
| Note that hardclock(9) still increments p_{u,s,i}ticks without holding a lock.
|
| ok visa@, cheloha@
* Fix uninitialized return code in adjfreq(2); CID 1480285 [stsp, 2019-05-21, 1 file, -2/+2]
| ok mlarkin, otto (who both had the same diff)
* Unlock adjfreq(2), adjtime(2), clock_settime(2), and settimeofday(2). [cheloha, 2019-05-09, 1 file, -1/+3]
| clock_settime(2)/settimeofday(2) still need KERNEL_LOCK for a moment when resetting the RTC, as that's done periodically from a task under KERNEL_LOCK. Not quite sure how to approach that one yet.
|
| ok visa@ mpi@, "good stuff" tedu@, "please wait until after [tree] unlock" deraadt@
* Tweak previous: include <sys/stdint.h> for INT64_MAX/INT64_MIN. [cheloha, 2019-03-26, 1 file, -1/+2]
|
* adjtime(2): set EINVAL if delta overflows 64 bits of microseconds. [cheloha, 2019-03-26, 1 file, -3/+13]
| No other (known) BSD-derived adjtime(2) implementation checks for overflow when converting delta into its final denomination of fractional seconds. This is peculiar, as the call originates in 4.3BSD.
|
| However, glibc, uclibc, and (to an extent) musl /do/ check the input and set EINVAL if it exceeds a certain bound, so we'll just use the errno that they use to be consistent with extant practice.
|
| Prompted by the comment kettenis@ left when we switched to storing the adjustment in an int64_t like ~5 years ago (kern_time.c,v 1.87).
|
| Positive feedback from deraadt@, manpage bits ok jmc@, no code complaints from otto@ or tedu@.
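One way to perform such an overflow check without ever overflowing is to compare against INT64_MAX and INT64_MIN before multiplying. The following is a sketch of that idea, not the code from this commit; delta_to_usec() is a hypothetical helper, and the range check on the microsecond field is an assumption of the sketch.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Convert seconds plus microseconds into a single 64-bit microsecond
 * count. Returns 0 on success, or EINVAL if the input is malformed or
 * the result would not fit in an int64_t.
 */
int
delta_to_usec(int64_t sec, int64_t usec, int64_t *out)
{
	/* Assumption of this sketch: usec is already normalized. */
	if (usec < 0 || usec >= 1000000)
		return EINVAL;
	/* Reject values whose product with 1000000 cannot fit. */
	if (sec > INT64_MAX / 1000000 || sec < INT64_MIN / 1000000)
		return EINVAL;
	/* The product fits; a non-negative usec can only overflow upward. */
	if (sec * 1000000 > INT64_MAX - usec)
		return EINVAL;
	*out = sec * 1000000 + usec;
	return 0;
}
```

Every comparison is done on values that are known to be representable, so the check itself never relies on signed overflow.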
* MP-safe timecounting: new rwlock: tc_lock [cheloha, 2019-03-25, 1 file, -26/+35]
| tc_lock allows adjfreq(2) and the kern.timecounter.hardware sysctl(2) to read/write the active timecounter pointer and the .tc_adj_freq member of the active timecounter safely. This eliminates any possibility of a torn read/write for the .tc_adj_freq member when we drop the KERNEL_LOCK from the timecounting layer. It also ensures the active timecounter does not change in the midst of an adjfreq(2) call.
|
| Because these are not high-traffic paths, we can get away with using tc_lock in write-mode to ensure combination read/write adjtime(2) calls are relatively atomic (a) to other writer adjtime(2) calls, and (b) to settimeofday(2)/clock_settime(2) calls, which cancel ongoing adjtime(2) adjustment.
|
| When the KERNEL_LOCK is dropped, an unprivileged user will be able to create some tc_lock contention via adjfreq(2); it is very unlikely to ever be a problem. If it ever is actually a problem a lockless read could be added to address it.
|
| While here, reorganize sys_adjfreq()/sys_adjtime() to minimize code under the lock. Also while here, make tc_adjfreq() void, as it cannot fail under any circumstance. Also also while here, annotate various globals/struct members with lock ordering details.
|
| With lots of input from mpi@ and visa@.
|
| ok visa@
* Move adjtimedelta from kern_time.c to kern_tc.c. [cheloha, 2019-03-10, 1 file, -13/+8]
| This will simplify upcoming MP-safety diffs for the timecounting layer.
|
| adjtimedelta is now accessed nowhere outside of kern_tc.c, so we can remove its extern declaration from kernel.h. Zeroing adjtimedelta within timecounter_mtx before we jump the real-time clock is also a bit safer than what we do now, as we are not racing a simultaneous tc_windup() call from hardclock(), which itself can modify adjtimedelta via ntp_update_second().
|
| Discussed with visa@ and mpi@.
|
| ok visa@
* matthew noticed that some clocks use tfind() which is not mpsafe. [tedu, 2019-01-31, 1 file, -10/+20]
| add locking in clock_gettime where needed.
|
| ok cheloha matthew
* Sprinkle a pinch of timerisvalid/timespecisvalid over the rest of sys/kern [cheloha, 2019-01-23, 1 file, -10/+8]
|
* no need to KERNEL_LOCK before calling ktrstruct() anymore; ok mpi@ visa@ [cheloha, 2019-01-18, 1 file, -21/+6]
|
* adjtime(2), settimeofday(2), clock_settime(2): validate input [cheloha, 2019-01-18, 1 file, -1/+8]
| Add documentation for the new EINVAL cases for adjtime(2) and settimeofday(2).
|
| adjtime.2 docs ok schwarze@, settimeofday(2)/clock_settime(2) stuff ok tedu@, "stop waiting" deraadt@
* settime: Don't cancel ongoing adjtime(2) until after full permission checks [cheloha, 2019-01-10, 1 file, -7/+6]
| ok jca@ visa@ guenther@ deraadt@
* nanosleep: loop tsleep(9) to ensure coverage of the full timeout range. [cheloha, 2018-12-31, 1 file, -10/+13]
| tsleep(9)'s maximum timeout shrinks as HZ grows, so this ensures we do not return early from longer timeouts on alpha or on custom kernels.
|
| POSIX says you cannot return early unless a signal is delivered, so this makes us more compliant with the standard.
|
| While here, remove the 100 million second upper bound. It is an artifact from itimerfix() and it serves no discernible purpose.
|
| ok tedu@ visa@
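The looping idea can be illustrated with a userland model in which a bounded sleep primitive is called repeatedly until the whole request is covered. All names and the cap value below are hypothetical, the model records time instead of blocking, and signal delivery is not modeled.

```c
#include <stdint.h>

/* Hypothetical cap on a single sleep call, in nanoseconds. */
#define MODEL_MAX_SLEEP_NS	5000000000ULL

/* Bookkeeping for the stand-in sleep primitive. */
struct model_sleeper {
	uint64_t	slept_ns;	/* total time "slept" so far */
	unsigned int	calls;		/* number of primitive invocations */
};

/* Stand-in for one bounded sleep: records the time instead of blocking. */
void
model_sleep_once(struct model_sleeper *s, uint64_t ns)
{
	s->slept_ns += ns;
	s->calls++;
}

/* Cover the full request by looping over chunks no larger than the cap. */
void
model_nanosleep(struct model_sleeper *s, uint64_t request_ns)
{
	while (request_ns > 0) {
		uint64_t chunk = request_ns < MODEL_MAX_SLEEP_NS ?
		    request_ns : MODEL_MAX_SLEEP_NS;

		model_sleep_once(s, chunk);
		request_ns -= chunk;
	}
}
```

With this cap, a 12 second request is served by three calls of 5, 5, and 2 seconds, so the caller never wakes before the full interval has elapsed.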
* sys_nanosleep: switch to descriptive, idiomatic variable names; ok tedu@ [cheloha, 2018-12-29, 1 file, -20/+19]
|
* Constipate a bunch of time functions [guenther, 2018-05-28, 1 file, -2/+2]
| ok tb@ kettenis@
* nanosleep: ensure tv_nsec input is on [0, 1000000000) [cheloha, 2018-05-22, 1 file, -5/+3]
| Instead of converting timespec -> timeval and truncating the input, check with timespecfix and use tstohz(9) for the tsleep.
|
| All other contemporary systems check this correctly.
|
| Also add a regression test for this case.
|
| ok tb@
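The half-open interval in the subject line translates into a two-sided comparison on tv_nsec. The helper below is a hypothetical userland sketch of that check and is not the kernel's own validation routine.

```c
#include <stdbool.h>
#include <time.h>

/*
 * True if tv_nsec lies on [0, 1000000000). The upper bound is
 * exclusive: a full second belongs in tv_sec.
 */
bool
model_nsec_in_range(const struct timespec *ts)
{
	return ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L;
}
```

A caller would typically reject the request with EINVAL when this returns false, rather than truncating the value.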
* Remove almost unused `flags' argument of suser(). [mpi, 2018-02-19, 1 file, -5/+5]
| The account flag `ASU' will no longer be set but that makes suser() mpsafe since it no longer messes with a per-process field.
|
| No objection from millert@, ok tedu@, bluhm@
* Add the CLOCK_BOOTTIME clockid for use with clock_gettime(2) [cheloha, 2017-12-18, 1 file, -1/+3]
| and put it to use in userspace in lieu of the kern.boottime sysctl. Its absolute value is the time that has elapsed since the system booted, i.e., the system uptime.
|
| Use in top(1), w(1), and snmpd(8) eliminates a race with settimeofday(2), adjtime(2), etc. inherent to deriving the system uptime via the kern.boottime sysctl.
|
| Product of a great deal of discussion/revision with jca@, tb@, and guenther@.
|
| ok tb@ jca@ guenther@ dlg@ mlarkin@ tom@
* Rename pfind(9) into tfind(9) to reflect that it deals with threads. [mpi, 2017-01-24, 1 file, -3/+3]
| While here document prfind(9).
|
| with and ok guenther@
* Write the system time back to the RTC every 30 minutes. [naddy, 2016-09-03, 1 file, -1/+37]
| This fixes the problem that long-running machines which were not shut down properly would reboot with a badly offset system time.
|
| hints and ok kettenis@
* careful study of the holy scrolls reveals that for pselect (and ppoll) [tedu, 2016-04-28, 1 file, -2/+4]
| oversized timespecs should be clamped, not rejected.
|
| ok millert
* remove stale lint annotations [tedu, 2015-12-05, 1 file, -10/+1]
|
* refactor pledge_*_check and pledge_fail functions [semarie, 2015-11-01, 1 file, -3/+4]
| - rename _check function without suffix: a "pledge" function called from anywhere is a "check" function.
| - make the pledge_fail call the responsibility of the _check function; remove it from the caller.
| - make proper use of (potential) returned error of _check() functions.
| - add pledge_kill() and pledge_protexec()
|
| with and OK deraadt@
* Rename tame() to pledge(). This interface has evolved to be more [deraadt, 2015-10-09, 1 file, -3/+3]
| strict than anticipated. It allows a programmer to pledge/promise/covenant that their program will operate within an easily defined subset of the Unix environment, or it pays the price.
* Only include <sys/tame.h> in the .c files that need it [guenther, 2015-09-11, 1 file, -1/+2]
| ok deraadt@ miod@
* Move to tame(int flags, char *paths[]) API/ABI. [deraadt, 2015-08-22, 1 file, -2/+1]
| The pathlist is a whitelist of dirs and files; anything else returns ENOENT. Recommendation is to use a narrowly defined list. Also add TAME_FATTR, which permits explicit change operations against "struct stat" fields. Some other TAME_ flags are refined slightly.
|
| Not cranking libc now, since nothing committed in base uses this and the timing is uncomfortable for others. Discussed with many; thanks for a few bug fixes from semarie, doug, guenther.
|
| ok guenther
* tame(2) is a subsystem which restricts programs into a "reduced feature [deraadt, 2015-07-19, 1 file, -1/+5]
| operating model". This is the kernel component; various changes should proceed in-tree for a while before userland programs start using it.
|
| ok miod, discussions and help from many
* Protect the per-process itimerval structs with a mutex. We update these [kettenis, 2015-04-28, 1 file, -10/+13]
| from hardclock() which runs without grabbing the kernel lock. This means that two threads could concurrently update the struct which could lead to corruption of the value which in turn could stop the timer. It could also result in getitimer(2) returning a non-normalized value.
|
| With help from guenther@.
|
| ok deraadt@, guenther@
* typo; fix from Kaspars Bankovskis [deraadt, 2014-12-07, 1 file, -2/+2]
|
* Prefer prsignal() to send process signals [guenther, 2014-05-15, 1 file, -2/+2]
|
* Simplify adjtime(2) by keeping track of the adjustment as a number of [kettenis, 2014-01-30, 1 file, -22/+19]
| microseconds in a 64-bit integer. Fixes the issue where ntpd loses sync because the struct timeval currently used to hold the adjustment is not properly normalized after the changes guenther@ made.
|
| ok guenther@, millert@
* timeval, timespec, and itimerval have padding on many archs. If we're [guenther, 2014-01-22, 1 file, -12/+24]
| going to copyout one, memset the structure and then set it member by member. sys_adjtime() does that on copyin instead, as it already has to munge the members as it goes.
|
| ok deraadt@
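The pattern described here, clearing the whole structure and then assigning each member, can be sketched in userland as follows. model_fill_timespec() is a hypothetical helper; in the kernel the filled structure would then be handed to copyout(9).

```c
#include <string.h>
#include <time.h>

/*
 * Prepare a timespec for copying to another domain: zero the entire
 * object first so stale bytes in any padding are cleared, then set the
 * members one by one.
 */
void
model_fill_timespec(struct timespec *out, time_t sec, long nsec)
{
	memset(out, 0, sizeof(*out));
	out->tv_sec = sec;
	out->tv_nsec = nsec;
}
```

Assigning one struct to another would not give the same guarantee, because the contents of padding bytes after a structure assignment are not specified by the C standard.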
* Move the declarations for dogetrusage(), itimerround(), and dowait4() [guenther, 2013-10-25, 1 file, -3/+1]
| to sys/*.h headers so that the compat/linux code can use them. Change dowait4() to not copyout() the status value, but rather leave that for its caller, as compat/linux has to translate it, with the side benefit of simplifying the native code.
|
| Originally written months ago as part of the time_t work; long memory, prodding, and ok from pirofti@
* Fix delivery of SIGPROF and SIGVTALRM to threaded processes by having [guenther, 2013-10-08, 1 file, -5/+1]
| hardclock() set a flag on the running thread and force AST processing, and then have the thread signal itself from userret().
|
| idea and flag names from FreeBSD
| ok jsing@
* Add CLOCK_UPTIME, a clock which measures time-running-not-suspended, so [guenther, 2013-10-06, 1 file, -1/+8]
| that mlarkin@ can fix programs that report rates-over-uptime.
|
| ok kettenis@
| manpage corrections jmc@ (which I've probably broken again)
* Snapshots for all archs have been built, so remove the T32 code [guenther, 2013-09-14, 1 file, -334/+1]
|