path: root/sys/kern/kern_tc.c
Commit message [Author, Age, Files, Lines]
* Switch from bintime_add() et al. to bintimeadd(9). [cheloha, 2019-06-03, 1 file, -27/+25]
  Basically just make all the bintime routines look and behave more like the timeradd(3) macros. Switch to three-argument forms for structure math, introduce and use bintimecmp(9), and rename the structure conversion routines to resemble e.g. TIMEVAL_TO_TIMESPEC(3). Document all of this in a new bintimeadd.9 page.
  Code input from mpi@, manpage input from schwarze@.
  code ok mpi@, docs ok schwarze@, docs probably still ok jmc@
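As an illustration of the three-argument style described above, here is a minimal userland sketch. The `ex_` names and the struct are hypothetical stand-ins modeled on the timeradd(3) calling convention; they are not the kernel's definitions, and the real bintimeadd(9) page should be consulted for the actual interfaces.

```c
#include <stdint.h>

/* Hypothetical stand-in for a binary time value: seconds plus a
 * 64-bit binary fraction of a second (units of 2^-64 s). */
struct ex_bintime {
	int64_t  sec;
	uint64_t frac;
};

/* dt = bt + ct. The result pointer may alias either operand. */
static inline void
ex_bintimeadd(const struct ex_bintime *bt, const struct ex_bintime *ct,
    struct ex_bintime *dt)
{
	uint64_t frac = bt->frac + ct->frac;	/* wraps modulo 2^64 */

	/* A wrapped sum is smaller than either addend: carry one second. */
	dt->sec = bt->sec + ct->sec + (frac < bt->frac);
	dt->frac = frac;
}

/* dt = bt - ct. The result pointer may alias either operand. */
static inline void
ex_bintimesub(const struct ex_bintime *bt, const struct ex_bintime *ct,
    struct ex_bintime *dt)
{
	uint64_t frac = bt->frac - ct->frac;	/* wraps modulo 2^64 */

	/* Borrow one second when the fraction underflows. */
	dt->sec = bt->sec - ct->sec - (bt->frac < ct->frac);
	dt->frac = frac;
}

/* Compare two values with an operator argument, timercmp(3)-style. */
#define ex_bintimecmp(a, b, cmp)				\
	(((a)->sec == (b)->sec) ?				\
	    ((a)->frac cmp (b)->frac) :				\
	    ((a)->sec cmp (b)->sec))
```

Passing the destination as a third argument lets a caller keep both inputs intact or overwrite one of them, which is the same choice the timeradd(3) macros offer.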
* SLIST-ify the timecounter list. [cheloha, 2019-05-22, 1 file, -8/+9]
  Call it "tc_list" instead of "timecounters", which is too similar to the variable "timecounter" for my taste.
  ok mpi@ visa@
* kern.timecounter.choices: Don't offer the dummy counter as an option. [cheloha, 2019-05-20, 1 file, -2/+5]
  The dummy counter is a stopgap during boot. It is not useful after a real timecounter is attached and started and there is no reason to return to using it. So don't even offer it to the admin.
  This is easy: never add it to the timecounter list. It will effectively cease to exist after the first real timecounter is activated in tc_init().
  In principle this means that we can have an empty timecounter list so we need to check for that case in sysctl_tc_choice().
  "I don't mind" mpi@, ok visa@
* Reduce number of timehands from ten to just two. [cheloha, 2019-05-10, 1 file, -21/+10]
  Reduces the worst-case error for time values retrieved via the microtime(9) functions from 10 ticks to 2 ticks. Being interrupted for over a tick is unlikely but possible.
  While here use C99 initializers.
  From FreeBSD r303383.
  ok mpi@
* tc_setclock: always call tc_windup() before leaving windup_mtx. [cheloha, 2019-04-30, 1 file, -9/+12]
  We ought to conform to the windup_mtx protocol and call tc_windup() even if we aren't changing the system uptime.
* MP-safe timecounting: new rwlock: tc_lock [cheloha, 2019-03-25, 1 file, -18/+38]
  tc_lock allows adjfreq(2) and the kern.timecounter.hardware sysctl(2) to read/write the active timecounter pointer and the .tc_adj_freq member of the active timecounter safely. This eliminates any possibility of a torn read/write for the .tc_adj_freq member when we drop the KERNEL_LOCK from the timecounting layer. It also ensures the active timecounter does not change in the midst of an adjfreq(2) call.
  Because these are not high-traffic paths, we can get away with using tc_lock in write-mode to ensure combination read/write adjtime(2) calls are relatively atomic (a) to other writer adjtime(2) calls, and (b) to settimeofday(2)/clock_settime(2) calls, which cancel ongoing adjtime(2) adjustment.
  When the KERNEL_LOCK is dropped, an unprivileged user will be able to create some tc_lock contention via adjfreq(2); it is very unlikely to ever be a problem. If it ever is actually a problem a lockless read could be added to address it.
  While here, reorganize sys_adjfreq()/sys_adjtime() to minimize code under the lock. Also while here, make tc_adjfreq() void, as it cannot fail under any circumstance. Also also while here, annotate various globals/struct members with lock ordering details.
  With lots of input from mpi@ and visa@.
  ok visa@
* Move adjtimedelta into the timehands. [cheloha, 2019-03-22, 1 file, -39/+58]
  adjtimedelta is 64-bit and thus can't be read/written atomically on all architectures. Because it can be modified from tc_windup() and ntp_update_second() we need a way to ensure safe reads/writes for adjtime(2) callers. One solution is to move it into the timehands and adopt the lockless read protocol we now use for the system boot time and uptime.
  So make new_adjtimedelta an argument to tc_windup() and add a lockless read loop to tc_adjtime().
  With adjtimedelta stored in the timehands we can now simply pass a timehands pointer to ntp_update_second(). This makes ntp_update_second() safer as we're using the timehands' timecounter pointer instead of the mutable global timecounter pointer.
  Lots of input from mpi@ and visa@.
  ok visa@
* Rename "timecounter_mtx" to "windup_mtx". [cheloha, 2019-03-22, 1 file, -11/+12]
  This will make upcoming MP-related diffs smaller and should make the code in kern_tc.c easier to read in general. "windup_mtx" is also a better mnemonic: always call tc_windup() before leaving windup_mtx.
* Change boot time/offset within tc_windup(). [cheloha, 2019-03-17, 1 file, -14/+25]
  We need to perform the actual modification of the boot offset and the time-of-boot within the "safe zone" in tc_windup() where the timehands' generation is zero to conform to the timehands lockless read protocol.
  Based on FreeBSD r303387.
  Discussed with mpi@ and visa@.
  ok visa@
* Move adjtimedelta from kern_time.c to kern_tc.c. [cheloha, 2019-03-10, 1 file, -1/+18]
  This will simplify upcoming MP-safety diffs for the timecounting layer.
  adjtimedelta is now accessed nowhere outside of kern_tc.c, so we can remove its extern declaration from kernel.h. Zeroing adjtimedelta within timecounter_mtx before we jump the real-time clock is also a bit safer than what we do now, as we are not racing a simultaneous tc_windup() call from hardclock(), which itself can modify adjtimedelta via ntp_update_second().
  Discussed with visa@ and mpi@.
  ok visa@
* tc_windup: read active timecounter once at function start. [cheloha, 2019-03-09, 1 file, -5/+8]
  tc_windup() is not necessarily called with KERNEL_LOCK, so it is possible for the timecounter pointer to change in the midst of the call via the kern.timecounter.hardware sysctl(2). Reading it once and using that local copy ensures we're referring to the same timecounter consistently.
  Apparently the compiler can optimize this out... somehow... so there may be room for improvement.
  Idea from visa@. With input from visa@, mpi@, cjeker@, and guenther@.
  ok visa@ mpi@
* tc_setclock: Don't rewind the system uptime during resume/unhibernate. [cheloha, 2019-01-31, 1 file, -1/+16]
  When we come back from suspend/hibernate the BIOS/firmware/whatever can hand us *any* TOD, so we need to check that the given TOD doesn't set our boot offset backwards, breaking the monotonicity of e.g. CLOCK_MONOTONIC.
  This is trivial to do from the BIOS on most PCs before unhibernating. There might be other ways it can happen, accidentally or otherwise.
  This is a bit messy but it can be made prettier later with a "bintimecmp" macro or something like that.
  Problem confirmed by jmatthew@.
  "you are very likely right" deraadt@
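The check described above can be sketched in a few lines. This is a deliberately simplified model that uses signed nanosecond counters instead of the kernel's binary time structures; `struct ex_clock` and `ex_resume_setclock()` are hypothetical names, not code from the commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified clock state in nanoseconds. */
struct ex_clock {
	int64_t boottime_ns;	/* UTC time at boot */
	int64_t uptime_ns;	/* monotonic offset since boot */
};

/*
 * Adopt the time of day handed over on resume only if the uptime it
 * implies does not run backwards. Returns true if the value was
 * adopted, false if it was rejected.
 */
static bool
ex_resume_setclock(struct ex_clock *c, int64_t utc_now_ns)
{
	int64_t new_uptime = utc_now_ns - c->boottime_ns;

	if (new_uptime < c->uptime_ns)
		return false;	/* would rewind the monotonic clock */
	c->uptime_ns = new_uptime;
	return true;
}
```

Rejecting the update outright is only one possible policy; the point of the sketch is that the comparison happens before any state is changed.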
* Serialize tc_windup() calls and modification of some timehands members. [cheloha, 2019-01-20, 1 file, -4/+20]
  If a user thread from e.g. clock_settime(2) is in the midst of changing the boottime or calling tc_windup() when it is interrupted by hardclock(9), the timehands could be left in a damaged state.
  So protect tc_windup() calls with a mutex, timecounter_mtx. hardclock(9) merely attempts to enter the mutex instead of spinning because it cannot afford to wait around. In practice hardclock(9) will skip tc_windup() very rarely, and when it does skip there aren't any negative effects because the skip indicates that a user thread is already calling, or about to call, tc_windup() anyway.
  Based on FreeBSD r303387 and NetBSD sys/kern/kern_tc.c,v1.30
  Discussed with mpi@ and visa@. Tons of nice technical detail about lockless reads from visa@.
  OK visa@
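The "attempt to enter, otherwise skip" behaviour can be modeled in userland with a C11 atomic flag acting as a spinning lock. Everything prefixed `ex_` is hypothetical and stands in for the kernel's mutex and tc_windup(); this is a sketch of the pattern, not the committed code.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag ex_windup_lock = ATOMIC_FLAG_INIT;
static unsigned long ex_windup_calls;	/* protected by ex_windup_lock */

/* Stand-in for the real update work. */
static void
ex_windup(void)
{
	ex_windup_calls++;
}

/* Try once to take the lock; never waits. */
static bool
ex_lock_try(void)
{
	return !atomic_flag_test_and_set_explicit(&ex_windup_lock,
	    memory_order_acquire);
}

/* Spin until the lock is taken. */
static void
ex_lock_enter(void)
{
	while (!ex_lock_try())
		continue;
}

static void
ex_lock_leave(void)
{
	atomic_flag_clear_explicit(&ex_windup_lock, memory_order_release);
}

/* Periodic tick path: cannot afford to wait, so skip if the lock is busy. */
static bool
ex_hardclock(void)
{
	if (!ex_lock_try())
		return false;	/* someone else is about to wind up anyway */
	ex_windup();
	ex_lock_leave();
	return true;
}

/* Clock-setting path: waits for the lock, winds up before leaving it. */
static void
ex_setclock(void)
{
	ex_lock_enter();
	/* ... modify shared clock state here ... */
	ex_windup();
	ex_lock_leave();
}
```

The skip is harmless in this model for the reason the commit gives: a busy lock means another caller will run the update before releasing it.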
* Move boottime into the timehands. [cheloha, 2019-01-19, 1 file, -22/+56]
  To protect the timehands we first need to protect the basis for all UTC time in the kernel: the boottime. Because the boottime can be changed at any time it needs to be versioned along with the other members of the timehands to enable safe lockless reads when using it for anything. So the global boottime timespec goes away and the static boottimebin becomes a member of the timehands. Instead of reading the global boottime you use one of two interfaces: binboottime(9) or microboottime(9). nanoboottime(9) can trivially be added later, though there are no consumers for it at the moment.
  This introduces one small change in behavior. We used to advance the reported boottime just before launching kernel threads from main(). This makes it look to userland like we "booted" moments before those threads were launched. Because there is no longer a boottime global we can no longer trivially do this from main(), so the boottime we report to userspace via e.g. kern.boottime will now reflect whatever the time was when we bootstrapped the timehands via inittodr(9). This is usually no more than a minute before the kernel threads are launched from main(). The prior behavior can be restored by adding a new interface to the timecounter layer in a future commit.
  Based on FreeBSD r303387.
  Discussed with mpi@ and visa@.
  ok visa@
* Updating time counters without memory barriers is wrong. [bluhm, 2018-09-18, 1 file, -1/+15]
  Put membar_producer() into tc_windup() and membar_consumer() into the uptime functions. They order the visibility of the time and generation number updates. This is a combination of what NetBSD and FreeBSD do.
  OK kettenis@
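To show why the barriers matter, here is a small generation-counter sketch in portable C11. It is an assumption-laden model: a single slot instead of the kernel's timehands array, C11 fences standing in for membar_producer()/membar_consumer(), and hypothetical `ex_` names throughout.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical single-slot stand-in for the timehands. */
struct ex_timehands {
	atomic_uint      gen;	/* 0 means "update in progress" */
	_Atomic uint64_t sec;
	_Atomic uint64_t frac;
};

/* Writer: publish a new (sec, frac) pair. Writers must be serialized. */
static void
ex_windup(struct ex_timehands *th, uint64_t sec, uint64_t frac)
{
	unsigned int ogen = atomic_load_explicit(&th->gen,
	    memory_order_relaxed);

	atomic_store_explicit(&th->gen, 0, memory_order_relaxed);
	/* Make "gen == 0" visible before any of the new data. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&th->sec, sec, memory_order_relaxed);
	atomic_store_explicit(&th->frac, frac, memory_order_relaxed);
	if (++ogen == 0)
		ogen = 1;	/* never publish the reserved value 0 */
	/* Make the new data visible before the new generation. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&th->gen, ogen, memory_order_relaxed);
}

/*
 * Reader: retry until a consistent snapshot is seen. Spins while the
 * generation is 0, so the slot must have been published at least once.
 */
static void
ex_read(struct ex_timehands *th, uint64_t *sec, uint64_t *frac)
{
	unsigned int gen;

	do {
		gen = atomic_load_explicit(&th->gen, memory_order_relaxed);
		/* Read the generation before the data. */
		atomic_thread_fence(memory_order_acquire);
		*sec = atomic_load_explicit(&th->sec, memory_order_relaxed);
		*frac = atomic_load_explicit(&th->frac, memory_order_relaxed);
		/* Read the data before re-checking the generation. */
		atomic_thread_fence(memory_order_acquire);
	} while (gen == 0 ||
	    gen != atomic_load_explicit(&th->gen, memory_order_relaxed));
}
```

Without the fences, a reader could observe an unchanged generation alongside a mix of old and new fields, which is the failure the commit describes.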
* Constipate a bunch of time functions [guenther, 2018-05-28, 1 file, -3/+3]
  ok tb@ kettenis@
* replace add_*_randomness with enqueue_randomness() [jasper, 2018-04-28, 1 file, -4/+4]
  this gets rid of the source annotation which doesn't really add anything other than adding complexity. randomness is generally good enough that the few extra bits that the source type would add are not worth it.
  ok mikeb@ deraadt@
* Drop unused variable from ntp_update_second(). [dhill, 2017-03-07, 1 file, -4/+4]
  ok jca@ deraadt@
* remove a dead variable; ok millert, guenther [mikeb, 2017-02-09, 1 file, -4/+1]
* fix several places where calculating ticks could overflow. [tedu, 2016-07-06, 1 file, -2/+2]
  it's not enough to assign to an unsigned type because if the arithmetic overflows the compiler may decide to do anything. so change all the long long casts to uint64_t so that we start with the right type.
  reported by Tim Newsham of NCC.
  ok deraadt
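The distinction drawn above is between converting the result and converting the operands. A hedged sketch, with a hypothetical `ex_to_ticks()` helper that is not the code touched by the commit: widening to uint64_t before multiplying means the arithmetic itself is carried out in an unsigned type, where overflow is well defined, and the result can then be clamped.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Convert a (sec, usec) interval to clock ticks at hz ticks per second,
 * clamping to INT_MAX. Assumes usec < 1000000, which keeps the 64-bit
 * sum below 2^64 for any 32-bit sec and hz.
 */
static int
ex_to_ticks(unsigned int sec, unsigned int usec, unsigned int hz)
{
	/* Cast first, then multiply: the product is computed in 64 bits. */
	uint64_t t = (uint64_t)sec * hz + ((uint64_t)usec * hz) / 1000000;

	return t > INT_MAX ? INT_MAX : (int)t;
}
```

Writing `uint64_t t = sec * hz;` with signed int operands would still perform the multiplication in int, so the overflow would have happened before the assignment ever took place.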
* convert bcopy to memcpy. ok millert [tedu, 2014-12-10, 1 file, -2/+2]
* add a few sizes to free [tedu, 2014-11-01, 1 file, -2/+2]
* remove unneeded proc.h includes [jsg, 2014-09-14, 1 file, -2/+2]
  ok mpi@ kspillner@
* add a size argument to free. will be used soon, but for now default to 0. [tedu, 2014-07-12, 1 file, -2/+2]
  after discussions with beck deraadt kettenis.
* fix $OpenBSD$, noticed by philip [beck, 2014-04-03, 1 file, -0/+2]
* I have discussed these licenses with Poul-Henning Kamp and he has agreed to this license change. [beck, 2014-04-03, 1 file, -9/+18]
  We will remember that we all still like beer.
* Simplify adjtime(2) by keeping track of the adjustment as a number of microseconds in a 64-bit integer. [kettenis, 2014-01-30, 1 file, -13/+8]
  Fixes the issue where ntpd loses sync because the struct timeval currently used to hold the adjustment is not properly normalized after the changes guenther@ made.
  ok guenther@, millert@
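A single signed 64-bit microsecond count has exactly one representation for each value, which is what removes the normalization problem mentioned above. The sketch below is illustrative only: `struct ex_timeval` and both helpers are hypothetical, and the normalization convention chosen here (microseconds always in the range 0 to 999999) is an assumption, not necessarily what the kernel reports.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct timeval. */
struct ex_timeval {
	int64_t tv_sec;
	long    tv_usec;
};

/* Collapse a (sec, usec) pair into one signed microsecond count. */
static int64_t
ex_tv_to_usec(const struct ex_timeval *tv)
{
	return tv->tv_sec * 1000000 + tv->tv_usec;
}

/* Expand a microsecond count into a pair with 0 <= tv_usec < 1000000. */
static void
ex_usec_to_tv(int64_t usec, struct ex_timeval *tv)
{
	tv->tv_sec = usec / 1000000;		/* truncates toward zero */
	tv->tv_usec = (long)(usec % 1000000);	/* may be negative */
	if (tv->tv_usec < 0) {
		tv->tv_sec--;
		tv->tv_usec += 1000000;
	}
}
```

With the adjustment held as one integer, applying part of it each tick is plain subtraction, and the pair form is only produced at the system call boundary.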
* Add CLOCK_UPTIME, a clock which measures time-running-not-suspended, so that mlarkin@ can fix programs that report rates-over-uptime. [guenther, 2013-10-06, 1 file, -5/+6]
  ok kettenis@
  manpage corrections jmc@ (which I've probably broken again)
* Convert some internal APIs to use timespecs instead of timevals [guenther, 2013-06-03, 1 file, -2/+2]
  ok matthew@ deraadt@
* Use long long and %lld for printing tv_sec values [guenther, 2013-06-02, 1 file, -4/+4]
  ok deraadt@
* unifdef -D __HAVE_TIMECOUNTER [miod, 2012-11-05, 1 file, -3/+1]
* On resume, run forward the monotonic and realtime clocks instead of jumping just the realtime clock, triggering and adjusting timeouts to reflect that. [guenther, 2012-05-24, 1 file, -5/+51]
  ok matthew@ deraadt@
* useless store [deraadt, 2010-09-24, 1 file, -2/+1]
* move DEBUG-only variable into #ifdef [deraadt, 2010-09-24, 1 file, -2/+4]
* remove proc.h include from uvm_map.h. [tedu, 2010-04-20, 1 file, -1/+2]
  This has far reaching effects, as sysctl.h was reliant on this particular include, and many drivers included sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
  ok deraadt
* fix typos in comments, no code changes; [schwarze, 2010-01-14, 1 file, -2/+2]
  from Brad Tilley <brad at 16systems dot com>;
  ok oga@
* queue tc randomness when we get it. [deraadt, 2008-11-24, 1 file, -2/+5]
  the tc_init() ones are (might be) submitted before randomattach, and thus will perturb the first arc4random() call, which is very good
  ok art djm
* don't declare th0 extern before declaring it as static; makes gcc4 happy [robert, 2008-11-21, 1 file, -2/+2]
  ok deraadt@
* allow for max 5000 usec/sec offset adjust, this makes it possible for clocks with drifts larger than 500ppm to be corrected. [otto, 2007-12-27, 1 file, -3/+3]
* unused apis, very dangerous: getbinuptime() getbintime(), ok art [deraadt, 2007-05-09, 1 file, -28/+1]
* Add missing bintime2timespec(). [kettenis, 2007-03-31, 1 file, -1/+2]
  ok art@
* typos; from bret lambert [jmc, 2006-11-15, 1 file, -2/+2]
* Timecounter based implementation of adjfreq(2). Largely from art@ [otto, 2006-10-30, 1 file, -1/+14]
  Tested by various using not (yet) committed amd64 timecounter code.
  ok deraadt@
* clean up some small fallout from initial freebsd import. [hshoexer, 2005-05-03, 1 file, -4/+3]
  ok grange@
* unused variable n; ok cloder [deraadt, 2005-04-21, 1 file, -3/+3]
* Some cleanup: [grange, 2004-09-17, 1 file, -5/+7]
  - don't mix unsigned and u_int across the code
  - un'static some funcs
  ok art@
* - Match time_second and time_uptime prototypes. [art, 2004-08-04, 1 file, -7/+7]
  - Less chatty.
* This touches only MI code, and adds new time keeping code. [tholo, 2004-07-28, 1 file, -0/+603]
  The code is all conditionalized on __HAVE_TIMECOUNTER, and not enabled on any platforms.
  adjtime(2) support exists, courtesy of nordin@, sysctl(2) support and a concept of quality for each time source attached exists.
  High quality time sources exist for PIIX4 ACPI timer as well as some AMD power management chips. This will have to be redone once we actually add ACPI support (at that time we need to use the ACPI interfaces to get at these clocks).
  ok art@ ken@ miod@ jmc@ and many more