summaryrefslogtreecommitdiffstats
path: root/kernel/sched/sched.h
diff options
context:
space:
mode:
authorJoel Fernandes (Google) <joel@joelfernandes.org>2020-11-18 00:19:39 +0100
committerPeter Zijlstra <peterz@infradead.org>2021-05-12 11:43:29 +0200
commitc6047c2e3af68dae23ad884249e0d42ff28d2d1b (patch)
treef7aafa01ceb8a9f2ecc413e190b49718e4860fd1 /kernel/sched/sched.h
parentsched: Fix priority inversion of cookied task with sibling (diff)
downloadlinux-c6047c2e3af68dae23ad884249e0d42ff28d2d1b.tar.xz
linux-c6047c2e3af68dae23ad884249e0d42ff28d2d1b.zip
sched/fair: Snapshot the min_vruntime of CPUs on force idle
During force-idle, we end up doing cross-cpu comparison of vruntimes during pick_next_task. If we simply compare (vruntime-min_vruntime) across CPUs, and if the CPUs only have 1 task each, we will always end up comparing 0 with 0 and pick just one of the tasks all the time. This starves the task that was not picked. To fix this, take a snapshot of the min_vruntime when entering force idle and use it for comparison. This min_vruntime snapshot will only be used for cross-CPU vruntime comparison, and nothing else. A note about the min_vruntime snapshot and force idling: During selection: When we're not fi, we need to update snapshot. when we're fi and we were not fi, we must update snapshot. When we're fi and we were already fi, we must not update snapshot. Which gives: fib fi update 0 0 1 0 1 1 1 0 1 1 1 0 Where: fi: force-idled now fib: force-idled before So the min_vruntime snapshot needs to be updated when: !(fib && fi). Also, the cfs_prio_less() function needs to be aware of whether the core is in force idle or not, since it will be use this information to know whether to advance a cfs_rq's min_vruntime_fi in the hierarchy. So pass this information along via pick_task() -> prio_less(). Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Don Hiatt <dhiatt@digitalocean.com> Tested-by: Hongyu Ning <hongyu.ning@linux.intel.com> Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20210422123308.738542617@infradead.org
Diffstat (limited to 'kernel/sched/sched.h')
-rw-r--r--kernel/sched/sched.h8
1 files changed, 8 insertions, 0 deletions
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index db555143380d..4a898abc60ce 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -526,6 +526,11 @@ struct cfs_rq {
u64 exec_clock;
u64 min_vruntime;
+#ifdef CONFIG_SCHED_CORE
+ unsigned int forceidle_seq;
+ u64 min_vruntime_fi;
+#endif
+
#ifndef CONFIG_64BIT
u64 min_vruntime_copy;
#endif
@@ -1089,6 +1094,7 @@ struct rq {
unsigned int core_pick_seq;
unsigned long core_cookie;
unsigned char core_forceidle;
+ unsigned int core_forceidle_seq;
#endif
};
@@ -1162,6 +1168,8 @@ static inline raw_spinlock_t *__rq_lockp(struct rq *rq)
return &rq->__lock;
}
+bool cfs_prio_less(struct task_struct *a, struct task_struct *b, bool fi);
+
#else /* !CONFIG_SCHED_CORE */
static inline bool sched_core_enabled(struct rq *rq)