diff options
author | 2024-07-12 08:20:33 -1000 | |
---|---|---|
committer | 2024-07-12 08:20:33 -1000 | |
commit | 5b26f7b920f76b2b9cc398c252a9e35e44bf5bb9 (patch) | |
tree | 7453394e4a7ae37afb8bac18cc9479dc9cefb63d /kernel/sched/sched.h | |
parent | sched_ext: s/SCX_RQ_BALANCING/SCX_RQ_IN_BALANCE/ and add SCX_RQ_IN_WAKEUP (diff) | |
download | wireguard-linux-5b26f7b920f76b2b9cc398c252a9e35e44bf5bb9.tar.xz wireguard-linux-5b26f7b920f76b2b9cc398c252a9e35e44bf5bb9.zip |
sched_ext: Allow SCX_DSQ_LOCAL_ON for direct dispatches
In ops.dispatch(), SCX_DSQ_LOCAL_ON can be used to dispatch the task to the
local DSQ of any CPU. However, during direct dispatch from ops.select_cpu()
and ops.enqueue(), this isn't allowed. This is because dispatching to the
local DSQ of a remote CPU requires locking both the task's current and new
rq's and such double locking can't be done directly from ops.enqueue().
While waking up a task, as ops.select_cpu() can pick any CPU and both
ops.select_cpu() and ops.enqueue() can use SCX_DSQ_LOCAL as the dispatch
target to dispatch to the DSQ of the picked CPU, the BPF scheduler can still
do whatever it wants to do. However, while a task is being enqueued for a
different reason, e.g. after its slice expiration, only ops.enqueue() is
called and there's no way for the BPF scheduler to directly dispatch to the
local DSQ of a remote CPU. This gap in API forces schedulers into
work-arounds which are not straightforward or optimal such as skipping
direct dispatches in such cases.
Implement deferred enqueueing to allow directly dispatching to the local DSQ
of a remote CPU from ops.select_cpu() and ops.enqueue(). Such tasks are
temporarily queued on rq->scx.ddsp_deferred_locals. When the rq lock can be
safely released, the tasks are taken off the list and queued on the target
local DSQs using dispatch_to_local_dsq().
v2: - Add missing return after queue_balance_callback() in
schedule_deferred(). (David).
- dispatch_to_local_dsq() now assumes that @rq is locked but unpinned
and thus no longer takes @rf. Updated accordingly.
- UP build warning fix.
Signed-off-by: Tejun Heo <tj@kernel.org>
Tested-by: Andrea Righi <righi.andrea@gmail.com>
Acked-by: David Vernet <void@manifault.com>
Cc: Dan Schatzberg <schatzberg.dan@gmail.com>
Cc: Changwoo Min <changwoo@igalia.com>
Diffstat (limited to '')
-rw-r--r-- | kernel/sched/sched.h | 3 |
1 files changed, 3 insertions, 0 deletions
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 8a0e8052f6b0..be7be54484c0 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -746,6 +746,7 @@ enum scx_rq_flags { struct scx_rq { struct scx_dispatch_q local_dsq; struct list_head runnable_list; /* runnable tasks on this rq */ + struct list_head ddsp_deferred_locals; /* deferred ddsps from enq */ unsigned long ops_qseq; u64 extra_enq_flags; /* see move_task_to_local_dsq() */ u32 nr_running; @@ -757,6 +758,8 @@ struct scx_rq { cpumask_var_t cpus_to_preempt; cpumask_var_t cpus_to_wait; unsigned long pnt_seq; + struct balance_callback deferred_bal_cb; + struct irq_work deferred_irq_work; struct irq_work kick_cpus_irq_work; }; #endif /* CONFIG_SCHED_CLASS_EXT */ |