aboutsummaryrefslogtreecommitdiffstats
path: root/kernel/time/hrtimer.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2020-01-22hrtimer: Add missing sparse annotation for __run_timer()Jules Irenge1-1/+1
Sparse reports a warning at __run_hrtimer() |warning: context imbalance in __run_hrtimer() - unexpected unlock Add the missing must_hold() annotation. Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200120224347.51843-1-jbi.octave@gmail.com
2020-01-14hrtimers: Prepare hrtimer_nanosleep() for time namespacesAndrei Vagin1-5/+7
clock_nanosleep() accepts absolute values of expiration time when TIMER_ABSTIME flag is set. This absolute value is inside the task's time namespace, and has to be converted to the host's time. There is timens_ktime_to_host() helper for converting time, but it accepts ktime argument. As a preparation, make hrtimer_nanosleep() accept a clock value in ktime instead of timespec64. Co-developed-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Andrei Vagin <avagin@openvz.org> Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20191112012724.250792-17-dima@arista.com
2019-12-03Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-4/+7
Pull timer updates from Ingo Molnar: "The main changes in the timer code in this cycle were: - Clockevent updates: - timer-of framework cleanups. (Geert Uytterhoeven) - Use timer-of for the renesas-ostm and the device name to prevent name collision in case of multiple timers. (Geert Uytterhoeven) - Check if there is an error after calling of_clk_get in asm9260 (Chuhong Yuan) - ABI fix: Zero out high order bits of nanoseconds on compat syscalls. This got broken a year ago, with apparently no side effects so far. Since the kernel would use random data otherwise I don't think we'd have other options but to fix the bug, even if there was a side effect to applications (Dmitry Safonov) - Optimize ns_to_timespec64() on 32-bit systems: move away from div_s64_rem() which can be slow, to div_u64_rem() which is faster (Arnd Bergmann) - Annotate KCSAN-reported false positive data races in hrtimer_is_queued() users by moving timer->state handling over to the READ_ONCE()/WRITE_ONCE() APIs. This documents these accesses (Eric Dumazet) - Misc cleanups and small fixes" [ I undid the "ABI fix" and updated the comments instead. The reason there were apparently no side effects is that the fix was a no-op. The updated comment is to say _why_ it was a no-op. - Linus ] * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: time: Zero the upper 32-bits in __kernel_timespec on 32-bit time: Rename tsk->real_start_time to ->start_boottime hrtimer: Remove the comment about not used HRTIMER_SOFTIRQ time: Fix spelling mistake in comment time: Optimize ns_to_timespec64() hrtimer: Annotate lockless access to timer->state clocksource/drivers/asm9260: Add a check for of_clk_get clocksource/drivers/renesas-ostm: Use unique device name instead of ostm clocksource/drivers/renesas-ostm: Convert to timer_of clocksource/drivers/timer-of: Use unique device name instead of timer clocksource/drivers/timer-of: Convert last full_name to %pOF
2019-11-15y2038: remove CONFIG_64BIT_TIMEArnd Bergmann1-1/+1
The CONFIG_64BIT_TIME option is defined on all architectures, and can be removed for simplicity now. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-11-06hrtimer: Annotate lockless access to timer->stateEric Dumazet1-4/+7
syzbot reported various data-race caused by hrtimer_is_queued() reading timer->state. A READ_ONCE() is required there to silence the warning. Also add the corresponding WRITE_ONCE() when timer->state is set. In remove_hrtimer() the hrtimer_is_queued() helper is open coded to avoid loading timer->state twice. KCSAN reported these cases: BUG: KCSAN: data-race in __remove_hrtimer / tcp_pacing_check write to 0xffff8880b2a7d388 of 1 bytes by interrupt on cpu 0: __remove_hrtimer+0x52/0x130 kernel/time/hrtimer.c:991 __run_hrtimer kernel/time/hrtimer.c:1496 [inline] __hrtimer_run_queues+0x250/0x600 kernel/time/hrtimer.c:1576 hrtimer_run_softirq+0x10e/0x150 kernel/time/hrtimer.c:1593 __do_softirq+0x115/0x33f kernel/softirq.c:292 run_ksoftirqd+0x46/0x60 kernel/softirq.c:603 smpboot_thread_fn+0x37d/0x4a0 kernel/smpboot.c:165 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352 read to 0xffff8880b2a7d388 of 1 bytes by task 24652 on cpu 1: tcp_pacing_check net/ipv4/tcp_output.c:2235 [inline] tcp_pacing_check+0xba/0x130 net/ipv4/tcp_output.c:2225 tcp_xmit_retransmit_queue+0x32c/0x5a0 net/ipv4/tcp_output.c:3044 tcp_xmit_recovery+0x7c/0x120 net/ipv4/tcp_input.c:3558 tcp_ack+0x17b6/0x3170 net/ipv4/tcp_input.c:3717 tcp_rcv_established+0x37e/0xf50 net/ipv4/tcp_input.c:5696 tcp_v4_do_rcv+0x381/0x4e0 net/ipv4/tcp_ipv4.c:1561 sk_backlog_rcv include/net/sock.h:945 [inline] __release_sock+0x135/0x1e0 net/core/sock.c:2435 release_sock+0x61/0x160 net/core/sock.c:2951 sk_stream_wait_memory+0x3d7/0x7c0 net/core/stream.c:145 tcp_sendmsg_locked+0xb47/0x1f30 net/ipv4/tcp.c:1393 tcp_sendmsg+0x39/0x60 net/ipv4/tcp.c:1434 inet_sendmsg+0x6d/0x90 net/ipv4/af_inet.c:807 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg+0x9f/0xc0 net/socket.c:657 BUG: KCSAN: data-race in __remove_hrtimer / __tcp_ack_snd_check write to 0xffff8880a3a65588 of 1 bytes by interrupt on cpu 0: __remove_hrtimer+0x52/0x130 kernel/time/hrtimer.c:991 __run_hrtimer kernel/time/hrtimer.c:1496 [inline] __hrtimer_run_queues+0x250/0x600 kernel/time/hrtimer.c:1576 hrtimer_run_softirq+0x10e/0x150 kernel/time/hrtimer.c:1593 __do_softirq+0x115/0x33f kernel/softirq.c:292 invoke_softirq kernel/softirq.c:373 [inline] irq_exit+0xbb/0xe0 kernel/softirq.c:413 exiting_irq arch/x86/include/asm/apic.h:536 [inline] smp_apic_timer_interrupt+0xe6/0x280 arch/x86/kernel/apic/apic.c:1137 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830 read to 0xffff8880a3a65588 of 1 bytes by task 22891 on cpu 1: __tcp_ack_snd_check+0x415/0x4f0 net/ipv4/tcp_input.c:5265 tcp_ack_snd_check net/ipv4/tcp_input.c:5287 [inline] tcp_rcv_established+0x750/0xf50 net/ipv4/tcp_input.c:5708 tcp_v4_do_rcv+0x381/0x4e0 net/ipv4/tcp_ipv4.c:1561 sk_backlog_rcv include/net/sock.h:945 [inline] __release_sock+0x135/0x1e0 net/core/sock.c:2435 release_sock+0x61/0x160 net/core/sock.c:2951 sk_stream_wait_memory+0x3d7/0x7c0 net/core/stream.c:145 tcp_sendmsg_locked+0xb47/0x1f30 net/ipv4/tcp.c:1393 tcp_sendmsg+0x39/0x60 net/ipv4/tcp.c:1434 inet_sendmsg+0x6d/0x90 net/ipv4/af_inet.c:807 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg+0x9f/0xc0 net/socket.c:657 __sys_sendto+0x21f/0x320 net/socket.c:1952 __do_sys_sendto net/socket.c:1964 [inline] __se_sys_sendto net/socket.c:1960 [inline] __x64_sys_sendto+0x89/0xb0 net/socket.c:1960 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 24652 Comm: syz-executor.3 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 [ tglx: Added comments ] Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191106174804.74723-1-edumazet@google.com
2019-10-14hrtimer: Annotate lockless access to timer->baseEric Dumazet1-4/+4
Followup to commit dd2261ed45aa ("hrtimer: Protect lockless access to timer->base") lock_hrtimer_base() fetches timer->base without lock exclusion. Compiler is allowed to read timer->base twice (even if considered dumb) which could end up trying to lock migration_base and return &migration_base. base = timer->base; if (likely(base != &migration_base)) { /* compiler reads timer->base again, and now (base == &migration_base) raw_spin_lock_irqsave(&base->cpu_base->lock, *flags); if (likely(base == timer->base)) return base; /* == &migration_base ! */ Similarly the write sides must use WRITE_ONCE() to avoid store tearing. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191008173204.180879-1-edumazet@google.com
2019-09-05hrtimer: Add a missing bracket and hide `migration_base' on !SMPSebastian Andrzej Siewior1-1/+11
The recent change to avoid taking the expiry lock when a timer is currently migrated missed to add a bracket at the end of the if statement leading to compile errors. Since that commit the variable `migration_base' is always used but it is only available on SMP configuration thus leading to another compile error. The changelog says "The timer base and base->cpu_base cannot be NULL in the code path", so it is safe to limit this check to SMP configurations only. Add the missing bracket to the if statement and hide `migration_base' behind CONFIG_SMP bars. [ tglx: Mark the functions inline ... ] Fixes: 68b2c8c1e4210 ("hrtimer: Don't take expiry_lock when timer is currently migrated") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190904145527.eah7z56ntwobqm6j@linutronix.de
2019-08-21hrtimer: Don't take expiry_lock when timer is currently migratedJulien Grall1-1/+5
migration_base is used as a placeholder when an hrtimer is migrated to a different CPU. In the case that hrtimer_cancel_wait_running() hits a timer which is currently migrated it would pointlessly acquire the expiry lock of the migration base, which is even not initialized. Surely it could be initialized, but there is absolutely no point in acquiring this lock because the timer is guaranteed not to run it's callback for which the caller waits to finish on that base. So it would just do the inc/lock/dec/unlock dance for nothing. As the base switch is short and non-preemptible, there is no issue when the wait function returns immediately. The timer base and base->cpu_base cannot be NULL in the code path which is invoking that, so just replace those checks with a check whether base is migration base. [ tglx: Updated from RT patch. Massaged changelog. Added comment. ] Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190821092409.13225-4-julien.grall@arm.com
2019-08-21hrtimer: Protect lockless access to timer->baseJulien Grall1-1/+2
The update to timer->base is protected by the base->cpu_base->lock(). However, hrtimer_cancel_wait_running() does access it lockless. So the compiler is allowed to refetch timer->base which can cause havoc when the timer base is changed concurrently. Use READ_ONCE() to prevent this. [ tglx: Adapted from a RT patch ] Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190821092409.13225-2-julien.grall@arm.com
2019-08-20hrtimer: Improve comments on handling priority inversion against softirq kthreadFrederic Weisbecker1-4/+10
The handling of a priority inversion between timer cancelling and a a not well defined possible preemption of softirq kthread is not very clear. Especially in the posix timers side it's unclear why there is a specific RT wait callback. All the nice explanations can be found in the initial changelog of f61eff83cec9 (hrtimer: Prepare support for PREEMPT_RT"). Extract the detailed informations from there and put it into comments. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190820132656.GC2093@lenoir
2019-08-01hrtimer: Prepare support for PREEMPT_RTAnna-Maria Gleixner1-6/+89
When PREEMPT_RT is enabled, the soft interrupt thread can be preempted. If the soft interrupt thread is preempted in the middle of a timer callback, then calling hrtimer_cancel() can lead to two issues: - If the caller is on a remote CPU then it has to spin wait for the timer handler to complete. This can result in unbound priority inversion. - If the caller originates from the task which preempted the timer handler on the same CPU, then spin waiting for the timer handler to complete is never going to end. To avoid these issues, add a new lock to the timer base which is held around the execution of the timer callbacks. If hrtimer_cancel() detects that the timer callback is currently running, it blocks on the expiry lock. When the callback is finished, the expiry lock is dropped by the softirq thread which wakes up the waiter and the system makes progress. This addresses both the priority inversion and the life lock issues. The same issue can happen in virtual machines when the vCPU which runs a timer callback is scheduled out. If a second vCPU of the same guest calls hrtimer_cancel() it will spin wait for the other vCPU to be scheduled back in. The expiry lock mechanism would avoid that. It'd be trivial to enable this when paravirt spinlocks are enabled in a guest, but it's not clear whether this is an actual problem in the wild, so for now it's an RT only mechanism. [ tglx: Refactored it for mainline ] Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190726185753.737767218@linutronix.de
2019-08-01hrtimer: Determine hard/soft expiry mode for hrtimer sleepers on RTSebastian Andrzej Siewior1-0/+34
On PREEMPT_RT enabled kernels hrtimers which are not explicitely marked for hard interrupt expiry mode are moved into soft interrupt context either for latency reasons or because the hrtimer callback takes regular spinlocks or invokes other functions which are not suitable for hard interrupt context on PREEMPT_RT. The hrtimer_sleeper callback is RT compatible in hard interrupt context, but there is a latency concern: Untrusted userspace can spawn many threads which arm timers for the same expiry time on the same CPU. On expiry that causes a latency spike due to the wakeup of a gazillion threads. OTOH, priviledged real-time user space applications rely on the low latency of hard interrupt wakeups. These syscall related wakeups are all based on hrtimer sleepers. If the current task is in a real-time scheduling class, mark the mode for hard interrupt expiry. [ tglx: Split out of a larger combo patch. Added changelog ] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190726185753.645792403@linutronix.de
2019-08-01hrtimer: Move unmarked hrtimers to soft interrupt expiry on RTSebastian Andrzej Siewior1-1/+11
On PREEMPT_RT not all hrtimers can be expired in hard interrupt context even if that is perfectly fine on a PREEMPT_RT=n kernel, e.g. because they take regular spinlocks. Also for latency reasons PREEMPT_RT tries to defer most hrtimers' expiry into softirq context. hrtimers marked with HRTIMER_MODE_HARD must be kept in hard interrupt context expiry mode. Add the required logic. No functional change for PREEMPT_RT=n kernels. [ tglx: Split out of a larger combo patch. Added changelog ] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190726185753.551967692@linutronix.de
2019-08-01hrtimer: Make enqueue mode check work on RTThomas Gleixner1-2/+7
hrtimer_start_range_ns() has a WARN_ONCE() which verifies that a timer which is marker for softirq expiry is not queued in the hard interrupt base and vice versa. When PREEMPT_RT is enabled, timers which are not explicitely marked to expire in hard interrupt context are deferrred to the soft interrupt. So the regular check would trigger. Change the check, so when PREEMPT_RT is enabled, it is verified that the timers marked for hard interrupt expiry are not tried to be queued for soft interrupt expiry or any of the unmarked and softirq marked is tried to be expired in hard interrupt context. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2019-08-01hrtimer: Provide hrtimer_sleeper_start_expires()Thomas Gleixner1-2/+17
hrtimer_sleepers will gain a scheduling class dependent treatment on PREEMPT_RT. Create a wrapper around hrtimer_start_expires() to make that possible. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2019-08-01hrtimer: Consolidate hrtimer_init() + hrtimer_init_sleeper() callsSebastian Andrzej Siewior1-11/+32
hrtimer_init_sleeper() calls require prior initialisation of the hrtimer object which is embedded into the hrtimer_sleeper. Combine the initialization and spare a function call. Fixup all call sites. This is also a preparatory change for PREEMPT_RT to do hrtimer sleeper specific initializations of the embedded hrtimer without modifying any of the call sites. No functional change. [ anna-maria: Minor cleanups ] [ tglx: Adopted to the removal of the task argument of hrtimer_init_sleeper() and trivial polishing. Folded a fix from Stephen Rothwell for the vsoc code ] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190726185752.887468908@linutronix.de
2019-07-30hrtimer: Remove task argument from hrtimer_init_sleeper()Thomas Gleixner1-4/+4
All callers hand in 'current' and that's the only task pointer which actually makes sense. Remove the task argument and set current in the function. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20190726185752.791885290@linutronix.de
2019-06-27hrtimer: Use a bullet for the returns bullet listMauro Carvalho Chehab1-3/+4
That gets rid of this warning: ./kernel/time/hrtimer.c:1119: WARNING: Block quote ends without a blank line; unexpected unindent. and displays nicely both at the source code and at the produced documentation. Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linux Doc Mailing List <linux-doc@vger.kernel.org> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Jonathan Corbet <corbet@lwn.net> Link: https://lkml.kernel.org/r/74ddad7dac331b4e5ce4a90e15c8a49e3a16d2ac.1561372382.git.mchehab+samsung@kernel.org
2019-06-12hrtimer: Remove unused header includeYangtao Li1-1/+0
seq_file.h does not need to be included, so remove it. Signed-off-by: Yangtao Li <tiny.windzz@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190607174253.27403-1-tiny.windzz@gmail.com
2019-03-05Merge branch 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-1/+1
Pull year 2038 updates from Thomas Gleixner: "Another round of changes to make the kernel ready for 2038. After lots of preparatory work this is the first set of syscalls which are 2038 safe: 403 clock_gettime64 404 clock_settime64 405 clock_adjtime64 406 clock_getres_time64 407 clock_nanosleep_time64 408 timer_gettime64 409 timer_settime64 410 timerfd_gettime64 411 timerfd_settime64 412 utimensat_time64 413 pselect6_time64 414 ppoll_time64 416 io_pgetevents_time64 417 recvmmsg_time64 418 mq_timedsend_time64 419 mq_timedreceiv_time64 420 semtimedop_time64 421 rt_sigtimedwait_time64 422 futex_time64 423 sched_rr_get_interval_time64 The syscall numbers are identical all over the architectures" * 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) riscv: Use latest system call ABI checksyscalls: fix up mq_timedreceive and stat exceptions unicore32: Fix __ARCH_WANT_STAT64 definition asm-generic: Make time32 syscall numbers optional asm-generic: Drop getrlimit and setrlimit syscalls from default list 32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option compat ABI: use non-compat openat and open_by_handle_at variants y2038: add 64-bit time_t syscalls to all 32-bit architectures y2038: rename old time and utime syscalls y2038: remove struct definition redirects y2038: use time32 syscall names on 32-bit syscalls: remove obsolete __IGNORE_ macros y2038: syscalls: rename y2038 compat syscalls x86/x32: use time64 versions of sigtimedwait and recvmmsg timex: change syscalls to use struct __kernel_timex timex: use __kernel_timex internally sparc64: add custom adjtimex/clock_adjtime functions time: fix sys_timer_settime prototype time: Add struct __kernel_timex time: make adjtime compat handling available for 32 bit ...
2019-02-07y2038: syscalls: rename y2038 compat syscallsArnd Bergmann1-1/+1
A lot of system calls that pass a time_t somewhere have an implementation using a COMPAT_SYSCALL_DEFINEx() on 64-bit architectures, and have been reworked so that this implementation can now be used on 32-bit architectures as well. The missing step is to redefine them using the regular SYSCALL_DEFINEx() to get them out of the compat namespace and make it possible to build them on 32-bit architectures. Any system call that ends in 'time' gets a '32' suffix on its name for that version, while the others get a '_time32' suffix, to distinguish them from the normal version, which takes a 64-bit time argument in the future. In this step, only 64-bit architectures are changed, doing this rename first lets us avoid touching the 32-bit architectures twice. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-01-29timers: Mark expected switch fall-throughsGustavo A. R. Silva1-1/+1
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where fall through is indeed expected. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Stephen Boyd <sboyd@kernel.org> Link: https://lkml.kernel.org/r/20190123081413.GA3949@embeddedor
2018-11-23hrtimers/tick/clockevents: Remove sloppy license referencesThomas Gleixner1-2/+0
"For licencing details see kernel-base/COPYING" and similar license references have no value over the SPDX identifier. Remove them. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: John Stultz <john.stultz@linaro.org> Acked-by: Corey Minyard <cminyard@mvista.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Peter Anvin <hpa@zytor.com> Cc: Russell King <rmk+kernel@armlinux.org.uk> Cc: Richard Cochran <richardcochran@gmail.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: David Riley <davidriley@chromium.org> Cc: Colin Cross <ccross@android.com> Cc: Mark Brown <broonie@kernel.org> Link: https://lkml.kernel.org/r/20181031182252.963632760@linutronix.de
2018-11-23time: Add SPDX license identifiersThomas Gleixner1-0/+1
Update the time(r) core files files with the correct SPDX license identifier based on the license text in the file itself. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This work is based on a script and data from Philippe Ombredanne, Kate Stewart and myself. The data has been created with two independent license scanners and manual inspection. The following files do not contain any direct license information and have been omitted from the big initial SPDX changes: timeconst.bc: The .bc files were not touched time.c, timer.c, timekeeping.c: Licence was deduced from EXPORT_SYMBOL_GPL As those files do not contain direct license references they fall under the project license, i.e. GPL V2 only. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: John Stultz <john.stultz@linaro.org> Acked-by: Corey Minyard <cminyard@mvista.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Russell King <rmk+kernel@armlinux.org.uk> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: David Riley <davidriley@chromium.org> Cc: Colin Cross <ccross@android.com> Cc: Mark Brown <broonie@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: https://lkml.kernel.org/r/20181031182252.879109557@linutronix.de
2018-11-23time: Remove useless filenames in top level commentsThomas Gleixner1-12/+4
Remove the pointless filenames in the top level comments. They have no value at all and just occupy space. While at it tidy up some of the comments and remove a stale one. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Nicolas Pitre <nico@linaro.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: John Stultz <john.stultz@linaro.org> Acked-by: Corey Minyard <cminyard@mvista.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Peter Anvin <hpa@zytor.com> Cc: Russell King <rmk+kernel@armlinux.org.uk> Cc: Richard Cochran <richardcochran@gmail.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: David Riley <davidriley@chromium.org> Cc: Colin Cross <ccross@android.com> Cc: Mark Brown <broonie@kernel.org> Link: https://lkml.kernel.org/r/20181031182252.794898238@linutronix.de
2018-08-27y2038: globally rename compat_time to old_time32Arnd Bergmann1-4/+4
Christoph Hellwig suggested a slightly different path for handling backwards compatibility with the 32-bit time_t based system calls: Rather than simply reusing the compat_sys_* entry points on 32-bit architectures unchanged, we get rid of those entry points and the compat_time types by renaming them to something that makes more sense on 32-bit architectures (which don't have a compat mode otherwise), and then share the entry points under the new name with the 64-bit architectures that use them for implementing the compatibility. The following types and interfaces are renamed here, and moved from linux/compat_time.h to linux/time32.h: old new --- --- compat_time_t old_time32_t struct compat_timeval struct old_timeval32 struct compat_timespec struct old_timespec32 struct compat_itimerspec struct old_itimerspec32 ns_to_compat_timeval() ns_to_old_timeval32() get_compat_itimerspec64() get_old_itimerspec32() put_compat_itimerspec64() put_old_itimerspec32() compat_get_timespec64() get_old_timespec32() compat_put_timespec64() put_old_timespec32() As we already have aliases in place, this patch addresses only the instances that are relevant to the system call interface in particular, not those that occur in device drivers and other modules. Those will get handled separately, while providing the 64-bit version of the respective interfaces. I'm not renaming the timex, rusage and itimerval structures, as we are still debating what the new interface will look like, and whether we will need a replacement at all. This also doesn't change the names of the syscall entry points, which can be done more easily when we actually switch over the 32-bit architectures to use them, at that point we need to change COMPAT_SYSCALL_DEFINEx to SYSCALL_DEFINEx with a new name, e.g. with a _time32 suffix. Suggested-by: Christoph Hellwig <hch@infradead.org> Link: https://lore.kernel.org/lkml/20180705222110.GA5698@infradead.org/ Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-07-12Merge branch 'fortglx/4.19/time' of https://git.linaro.org/people/john.stultz/linux into timers/coreThomas Gleixner1-1/+1
Pull timekeeping updates from John Stultz: - Make the timekeeping update more precise when NTP frequency is set directly by updating the multiplier. - Adjust selftests
2018-07-12hrtimer: Improve kernel message printingGeert Uytterhoeven1-4/+3
- Join split message for easier grepping, - Use pr_*() instead of printk*(), - Use %u to format unsigned cpu numbers. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20180712144118.8819-1-geert+renesas@glider.be
2018-06-19posix-timers: Fix nanosleep_copyout() for CONFIG_COMPAT_32BIT_TIMEArnd Bergmann1-1/+1
Commit b5793b0d92c9 added support for building the nanosleep compat system call on 32-bit architectures, but missed one change in nanosleep_copyout(), which would trigger a BUG() as soon as any architecture is switched over to use it. Use the proper config symbol to enable the code path. Fixes: Commit b5793b0d92c9 ("posix-timers: Make compat syscalls depend on CONFIG_COMPAT_32BIT_TIME") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: y2038@lists.linaro.org Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Deepa Dinamani <deepa.kernel@gmail.com> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Link: https://lkml.kernel.org/r/20180618140811.2998503-1-arnd@arndb.de
2018-05-02Merge branch 'timers/urgent' into timers/coreThomas Gleixner1-2/+14
Pick up urgent fixes to apply dependent cleanup patch
2018-04-26Revert: Unify CLOCK_MONOTONIC and CLOCK_BOOTTIMEThomas Gleixner1-2/+14
Revert commits 92af4dcb4e1c ("tracing: Unify the "boot" and "mono" tracing clocks") 127bfa5f4342 ("hrtimer: Unify MONOTONIC and BOOTTIME clock behavior") 7250a4047aa6 ("posix-timers: Unify MONOTONIC and BOOTTIME clock behavior") d6c7270e913d ("timekeeping: Remove boot time specific code") f2d6fdbfd238 ("Input: Evdev - unify MONOTONIC and BOOTTIME clock behavior") d6ed449afdb3 ("timekeeping: Make the MONOTONIC clock behave like the BOOTTIME clock") 72199320d49d ("timekeeping: Add the new CLOCK_MONOTONIC_ACTIVE clock") As stated in the pull request for the unification of CLOCK_MONOTONIC and CLOCK_BOOTTIME, it was clear that we might have to revert the change. As reported by several folks systemd and other applications rely on the documented behaviour of CLOCK_MONOTONIC on Linux and break with the above changes. After resume daemons time out and other timeout related issues are observed. Rafael compiled this list: * systemd kills daemons on resume, after >WatchdogSec seconds of suspending (Genki Sky). [Verified that that's because systemd uses CLOCK_MONOTONIC and expects it to not include the suspend time.] * systemd-journald misbehaves after resume: systemd-journald[7266]: File /var/log/journal/016627c3c4784cd4812d4b7e96a34226/system.journal corrupted or uncleanly shut down, renaming and replacing. (Mike Galbraith). * NetworkManager reports "networking disabled" and networking is broken after resume 50% of the time (Pavel). [May be because of systemd.] * MATE desktop dims the display and starts the screensaver right after system resume (Pavel). * Full system hang during resume (me). [May be due to systemd or NM or both.] That happens on debian and open suse systems. It's sad, that these problems were neither catched in -next nor by those folks who expressed interest in this change. Reported-by: Rafael J. Wysocki <rjw@rjwysocki.net> Reported-by: Genki Sky <sky@genki.is>, Reported-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kevin Easton <kevin@guarana.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salyzyn <salyzyn@android.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org>
2018-04-19time: Change nanosleep to safe __kernel_* typesDeepa Dinamani1-2/+6
Change over clock_nanosleep syscalls to use y2038 safe __kernel_timespec times. This will enable changing over of these syscalls to use new y2038 safe syscalls when the architectures define the CONFIG_64BIT_TIME. Note that nanosleep syscall is deprecated and does not have a plan for making it y2038 safe. But, the syscall should work as before on 64 bit machines and on 32 bit machines, the syscall works correctly until y2038 as before using the existing compat syscall version. There is no new syscall for supporting 64 bit time_t on 32 bit architectures. Cc: linux-api@vger.kernel.org Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-04-19posix-timers: Make compat syscalls depend on CONFIG_COMPAT_32BIT_TIMEDeepa Dinamani1-1/+1
clock_gettime, clock_settime, clock_getres and clock_nanosleep compat syscalls are also repurposed to provide backward compatibility to support 32 bit time_t on 32 bit systems. Note that nanosleep compat syscall will also be treated the same way as the above syscalls as it shares common handler functions with clock_nanosleep. But, there is no plan to provide y2038 safe solution for nanosleep. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-04-11Merge branches 'pm-cpuidle' and 'pm-qos'Rafael J. Wysocki1-2/+51
* pm-cpuidle: tick-sched: avoid a maybe-uninitialized warning cpuidle: Add definition of residency to sysfs documentation time: hrtimer: Use timerqueue_iterate_next() to get to the next timer nohz: Avoid duplication of code related to got_idle_tick nohz: Gather tick_sched booleans under a common flag field cpuidle: menu: Avoid selecting shallow states with stopped tick cpuidle: menu: Refine idle state selection for running tick sched: idle: Select idle state before stopping the tick time: hrtimer: Introduce hrtimer_next_event_without() time: tick-sched: Split tick_nohz_stop_sched_tick() cpuidle: Return nohz hint from cpuidle_select() jiffies: Introduce USER_TICK_USEC and redefine TICK_USEC sched: idle: Do not stop the tick before cpuidle_idle_call() sched: idle: Do not stop the tick upfront in the idle loop time: tick-sched: Reorganize idle tick management code * pm-qos: PM / QoS: mark expected switch fall-throughs
2018-04-09time: hrtimer: Use timerqueue_iterate_next() to get to the next timerRafael J. Wysocki1-3/+1
Use timerqueue_iterate_next() to get to the next timer in __hrtimer_next_event_base() without browsing the timerqueue details diredctly. No intentional changes in functionality. Suggested-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-07time: hrtimer: Introduce hrtimer_next_event_without()Rafael J. Wysocki1-2/+53
The next set of changes will need to compute the time to the next hrtimer event over all hrtimers except for the scheduler tick one. To that end introduce a new helper function, hrtimer_next_event_without(), for computing the time until the next hrtimer event over all timers except for one and modify the underlying code in __hrtimer_next_event_base() to prepare it for being called by that new function. No intentional changes in functionality. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
2018-03-13hrtimer: Unify MONOTONIC and BOOTTIME clock behaviorThomas Gleixner1-14/+2
Now that th MONOTONIC and BOOTTIME clocks are indentical remove all the special casing. The user space visible interfaces still support both clocks, but their behavior is identical. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kevin Easton <kevin@guarana.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salyzyn <salyzyn@android.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20180301165150.410218515@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-06hrtimer: remove unneeded kallsyms includeSergey Senozhatsky1-1/+0
hrtimer does not seem to use any of kallsyms functions/defines. Link: http://lkml.kernel.org/r/20171208025616.16267-9-sergey.senozhatsky@gmail.com Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-27Merge branch 'timers/urgent' into timers/coreThomas Gleixner1-0/+4
Pick up urgent bug fix and resolve the conflict. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2018-01-27hrtimer: Reset hrtimer cpu base proper on CPU hotplugThomas Gleixner1-0/+3
The hrtimer interrupt code contains a hang detection and mitigation mechanism, which prevents that a long delayed hrtimer interrupt causes a continous retriggering of interrupts which prevent the system from making progress. If a hang is detected then the timer hardware is programmed with a certain delay into the future and a flag is set in the hrtimer cpu base which prevents newly enqueued timers from reprogramming the timer hardware prior to the chosen delay. The subsequent hrtimer interrupt after the delay clears the flag and resumes normal operation. If such a hang happens in the last hrtimer interrupt before a CPU is unplugged then the hang_detected flag is set and stays that way when the CPU is plugged in again. At that point the timer hardware is not armed and it cannot be armed because the hang_detected flag is still active, so nothing clears that flag. As a consequence the CPU does not receive hrtimer interrupts and no timers expire on that CPU which results in RCU stalls and other malfunctions. Clear the flag along with some other less critical members of the hrtimer cpu base to ensure starting from a clean state when a CPU is plugged in. Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the root cause of that hard to reproduce heisenbug. Once understood it's trivial and certainly justifies a brown paperbag. Fixes: 41d2e4949377 ("hrtimer: Tune hrtimer_interrupt hang logic") Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Sewior <bigeasy@linutronix.de> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801261447590.2067@nanos
2018-01-16hrtimer: Implement SOFT/HARD clock base selectionAnna-Maria Gleixner1-4/+11
All prerequisites to handle hrtimers for expiry in either hard or soft interrupt context are in place. Add the missing bit in hrtimer_init() which associates the timer to the hard or the softirq clock base. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: keescook@chromium.org Link: http://lkml.kernel.org/r/20171221104205.7269-30-anna-maria@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16hrtimer: Implement support for softirq based hrtimersAnna-Maria Gleixner1-24/+172
hrtimer callbacks are always invoked in hard interrupt context. Several users in tree require soft interrupt context for their callbacks and achieve this by combining a hrtimer with a tasklet. The hrtimer schedules the tasklet in hard interrupt context and the tasklet callback gets invoked in softirq context later. That's suboptimal and aside of that the real-time patch moves most of the hrtimers into softirq context. So adding native support for hrtimers expiring in softirq context is a valuable extension for both mainline and the RT patch set. Each valid hrtimer clock id has two associated hrtimer clock bases: one for timers expiring in hardirq context and one for timers expiring in softirq context. Implement the functionality to associate a hrtimer with the hard or softirq related clock bases and update the relevant functions to take them into account when the next expiry time needs to be evaluated. Add a check into the hard interrupt context handler functions to check whether the first expiring softirq based timer has expired. If it's expired the softirq is raised and the accounting of softirq based timers to evaluate the next expiry time for programming the timer hardware is skipped until the softirq processing has finished. At the end of the softirq processing the regular processing is resumed. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: keescook@chromium.org Link: http://lkml.kernel.org/r/20171221104205.7269-29-anna-maria@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16hrtimer: Prepare handling of hard and softirq based hrtimersAnna-Maria Gleixner1-9/+29
The softirq based hrtimer can utilize most of the existing hrtimers functions, but need to operate on a different data set. Add an 'active_mask' parameter to various functions so the hard and soft bases can be selected. Fixup the existing callers and hand in the ACTIVE_HARD mask. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: keescook@chromium.org Link: http://lkml.kernel.org/r/20171221104205.7269-28-anna-maria@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16hrtimer: Add clock bases and hrtimer mode for softirq contextAnna-Maria Gleixner1-0/+20
Currently hrtimer callback functions are always executed in hard interrupt context. Users of hrtimers, which need their timer function to be executed in soft interrupt context, make use of tasklets to get the proper context. Add additional hrtimer clock bases for timers which must expire in softirq context, so the detour via the tasklet can be avoided. This is also required for RT, where the majority of hrtimer is moved into softirq hrtimer context. The selection of the expiry mode happens via a mode bit. Introduce HRTIMER_MODE_SOFT and the matching combinations with the ABS/REL/PINNED bits and update the decoding of hrtimer_mode in tracepoints. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: keescook@chromium.org Link: http://lkml.kernel.org/r/20171221104205.7269-27-anna-maria@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16hrtimer: Use irqsave/irqrestore around __run_hrtimer()Anna-Maria Gleixner1-13/+18
__run_hrtimer() is called with the hrtimer_cpu_base.lock held and interrupts disabled. Before invoking the timer callback the base lock is dropped, but interrupts stay disabled. The upcoming support for softirq based hrtimers requires that interrupts are enabled before the timer callback is invoked. To avoid code duplication, take hrtimer_cpu_base.lock with raw_spin_lock_irqsave(flags) at the call site and hand in the flags as a parameter. So raw_spin_unlock_irqrestore() before the callback invocation will either keep interrupts disabled in interrupt context or restore to interrupt enabled state when called from softirq context. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: keescook@chromium.org Link: http://lkml.kernel.org/r/20171221104205.7269-26-anna-maria@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16hrtimer: Factor out __hrtimer_next_event_base()Anna-Maria Gleixner1-4/+16
Preparatory patch for softirq based hrtimers to avoid code duplication. No functional change. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: keescook@chromium.org Link: http://lkml.kernel.org/r/20171221104205.7269-25-anna-maria@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16hrtimer: Factor out __hrtimer_start_range_ns()Anna-Maria Gleixner1-20/+24
Preparatory patch for softirq based hrtimers to avoid code duplication, factor out the __hrtimer_start_range_ns() function from hrtimer_start_range_ns(). No functional change. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: keescook@chromium.org Link: http://lkml.kernel.org/r/20171221104205.7269-24-anna-maria@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16hrtimer: Remove the 'base' parameter from hrtimer_reprogram()Anna-Maria Gleixner1-3/+3
hrtimer_reprogram() must have access to the hrtimer_clock_base of the new first expiring timer to access hrtimer_clock_base.offset for adjusting the expiry time to CLOCK_MONOTONIC. This is required to evaluate whether the new left most timer in the hrtimer_clock_base is the first expiring timer of all clock bases in a hrtimer_cpu_base. The only user of hrtimer_reprogram() is hrtimer_start_range_ns(), which has a pointer to hrtimer_clock_base() already and hands it in as a parameter. But hrtimer_start_range_ns() will be split for the upcoming support for softirq based hrtimers to avoid code duplication and will lose the direct access to the clock base pointer. Instead of handing in timer and timer->base as a parameter remove the base parameter from hrtimer_reprogram() instead and retrieve the clock base internally. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: keescook@chromium.org Link: http://lkml.kernel.org/r/20171221104205.7269-23-anna-maria@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16hrtimer: Make remote enqueue decision less restrictiveAnna-Maria Gleixner1-1/+1
The current decision whether a timer can be queued on a remote CPU checks for timer->expiry <= remote_cpu_base.expires_next. This is too restrictive because a timer with the same expiry time as an existing timer will be enqueued on right-hand size of the existing timer inside the rbtree, i.e. behind the first expiring timer. So its safe to allow enqueuing timers with the same expiry time as the first expiring timer on a remote CPU base. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: keescook@chromium.org Link: http://lkml.kernel.org/r/20171221104205.7269-22-anna-maria@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-16hrtimer: Unify remote enqueue handlingAnna-Maria Gleixner1-12/+6
hrtimer_reprogram() is conditionally invoked from hrtimer_start_range_ns() when hrtimer_cpu_base.hres_active is true. In the !hres_active case there is a special condition for the nohz_active case: If the newly enqueued timer expires before the first expiring timer on a remote CPU then the remote CPU needs to be notified and woken up from a NOHZ idle sleep to take the new first expiring timer into account. Previous changes have already established the prerequisites to make the remote enqueue behaviour the same whether high resolution mode is active or not: If the to be enqueued timer expires before the first expiring timer on a remote CPU, then it cannot be enqueued there. This was done for the high resolution mode because there is no way to access the remote CPU timer hardware. The same is true for NOHZ, but was handled differently by unconditionally enqueuing the timer and waking up the remote CPU so it can reprogram its timer. Again there is no compelling reason for this difference. hrtimer_check_target(), which makes the 'can remote enqueue' decision is already unconditional, but not yet functional because nothing updates hrtimer_cpu_base.expires_next in the !hres_active case. To unify this the following changes are required: 1) Make the store of the new first expiry time unconditonal in hrtimer_reprogram() and check __hrtimer_hres_active() before proceeding to the actual hardware access. This check also lets the compiler eliminate the rest of the function in case of CONFIG_HIGH_RES_TIMERS=n. 2) Invoke hrtimer_reprogram() unconditionally from hrtimer_start_range_ns() 3) Remove the remote wakeup special case for the !high_res && nohz_active case. Confine the timers_nohz_active static key to timer.c which is the only user now. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: keescook@chromium.org Link: http://lkml.kernel.org/r/20171221104205.7269-21-anna-maria@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>