Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
https://github.com/0day-ci/linux
Joonwoo-Park/sched-fix-incorrect-wait-time-and-wait-count-statistics/20151027-094629
commit 4183fdd83194ded503e1cec873556dd560f43dd4
Author: Joonwoo Park <joonwoop(a)codeaurora.org>
AuthorDate: Mon Oct 26 18:44:48 2015 -0700
Commit: 0day robot <fengguang.wu(a)intel.com>
CommitDate: Tue Oct 27 09:46:30 2015 +0800
sched: fix incorrect wait time and wait count statistics
On 10/25/2015 03:26 AM, Peter Zijlstra wrote:
On Sat, Oct 24, 2015 at 10:23:14PM -0700, Joonwoo Park wrote:
> @@ -1069,7 +1069,7 @@ static struct rq *move_queued_task(struct rq *rq, struct
task_struct *p, int new
> {
> lockdep_assert_held(&rq->lock);
>
> - dequeue_task(rq, p, 0);
> + dequeue_task(rq, p, DEQUEUE_MIGRATING);
> p->on_rq = TASK_ON_RQ_MIGRATING;
> set_task_cpu(p, new_cpu);
> raw_spin_unlock(&rq->lock);
> @@ -5656,7 +5671,7 @@ static void detach_task(struct task_struct *p, struct lb_env
*env)
> {
> lockdep_assert_held(&env->src_rq->lock);
>
> - deactivate_task(env->src_rq, p, 0);
> + deactivate_task(env->src_rq, p, DEQUEUE_MIGRATING);
> p->on_rq = TASK_ON_RQ_MIGRATING;
> set_task_cpu(p, env->dst_cpu);
> }
Also note that on both sites we also set TASK_ON_RQ_MIGRATING -- albeit
late. Can't you simply set that earlier (and back to QUEUED later) and
test for task_on_rq_migrating() instead of blowing up the fastpath like
you did?
Yes it's doable. I also find it's much simpler.
Please find patch v2. I verified v2 does same job as v1 by comparing sched_stat_wait
time with sched_switch - sched_wakeup timestamp.
Thanks,
Joonwoo
From 98d615d46211a90482a0f9b7204265c54bba8520 Mon Sep 17 00:00:00 2001
From: Joonwoo Park <joonwoop(a)codeaurora.org>
Date: Mon, 26 Oct 2015 16:37:47 -0700
Subject: [PATCH v2] sched: fix incorrect wait time and wait count statistics
At present scheduler resets task's wait start timestamp when the task
migrates to another rq. This misleads scheduler itself into reporting
less wait time than actual by omitting time spent for waiting prior to
migration and also more wait count than actual by counting migration as
wait end event which can be seen by trace or /proc/<pid>/sched with
CONFIG_SCHEDSTATS=y.
Carry forward migrating task's wait time prior to migration and
don't count migration as a wait end event to fix such statistics error.
In order to determine whether task is migrating mark task->on_rq with
TASK_ON_RQ_MIGRATING while dequeuing and enqueuing due to migration.
To: Ingo Molnar <mingo(a)kernel.org>
To: Peter Zijlstra <peterz(a)infradead.org>
Cc: linux-kernel(a)vger.kernel.org
Signed-off-by: Joonwoo Park <joonwoop(a)codeaurora.org>
+--------------------------------------------------------+------------+------------+------------+
| | 8a5ec90292 | 4183fdd831 |
7cb0326d8e |
+--------------------------------------------------------+------------+------------+------------+
| boot_successes | 112 | 0 | 0
|
| boot_failures | 4 | 26 | 39
|
| IP-Config:Auto-configuration_of_network_failed | 4 | |
|
| WARNING:at_kernel/sched/fair.c:#put_prev_entity() | 0 | 26 | 39
|
| BUG:KASan:out_of_bounds_access | 0 | 26 | 39
|
| BUG_kmalloc-#(Tainted:G_W):kasan:bad_access_detected | 0 | 26 | 39
|
| INFO:Slab#objects=#used=#fp=0x(null)flags= | 0 | 22 | 39
|
| INFO:Object#@offset=#fp= | 0 | 26 | 39
|
| BUG:KASan:use_after_free | 0 | 10 | 12
|
| BUG_kmalloc-#(Tainted:G_B_W):kasan:bad_access_detected | 0 | 26 | 39
|
| INFO:Slab#objects=#used=#fp=#flags= | 0 | 11 |
|
| backtrace:do_execve | 0 | 2 | 4
|
| backtrace:SyS_execve | 0 | 2 | 4
|
| backtrace:do_wait | 0 | 10 | 16
|
| backtrace:SyS_wait4 | 0 | 10 | 16
|
| backtrace:torture_stutter | 0 | 24 | 37
|
| backtrace:torture_shuffle | 0 | 24 | 36
|
| backtrace:vfs_write | 0 | 4 | 7
|
| backtrace:SyS_write | 0 | 4 | 7
|
| backtrace:smpboot_thread_fn | 0 | 14 | 22
|
| backtrace:vfs_read | 0 | 4 | 7
|
| backtrace:SyS_read | 0 | 4 | 7
|
| backtrace:arch_ptrace | 0 | 9 | 12
|
| backtrace:SyS_ptrace | 0 | 9 | 12
|
| backtrace:lock_torture_writer | 0 | 20 | 32
|
| backtrace:core_sys_select | 0 | 10 | 14
|
| backtrace:SyS_select | 0 | 10 | 14
|
| backtrace:vm_munmap | 0 | 2 | 2
|
| backtrace:SyS_munmap | 0 | 2 | 2
|
| backtrace:hrtimer_nanosleep | 0 | 5 |
|
| backtrace:SyS_nanosleep | 0 | 5 |
|
| backtrace:__sys_sendmsg | 0 | 1 | 2
|
| backtrace:SyS_sendmsg | 0 | 1 | 2
|
| backtrace:compat_SyS_wait4 | 0 | 2 | 7
|
| backtrace:_do_fork | 0 | 1 | 3
|
| backtrace:SyS_clone | 0 | 1 | 3
|
| INFO:Object | 0 | 0 | 1
|
| backtrace:call_usermodehelper_exec_work | 0 | 0 | 4
|
| backtrace:vm_mmap_pgoff | 0 | 0 | 1
|
| backtrace:SyS_mmap_pgoff | 0 | 0 | 1
|
| backtrace:SyS_mmap | 0 | 0 | 1
|
| backtrace:compat_core_sys_select | 0 | 0 | 8
|
| backtrace:compat_SyS_select | 0 | 0 | 8
|
| INFO:Obje | 0 | 0 | 1
|
+--------------------------------------------------------+------------+------------+------------+
[ 64.057664] page_owner is disabled
[ 64.059765] Key type encrypted registered
[ 64.087483] Freeing unused kernel memory: 952K (ffffffff82190000 - ffffffff8227e000)
[ 64.109779] WARNING: CPU: 0 PID: 1 at kernel/sched/fair.c:260
put_prev_entity+0x20f/0x2d0()
[ 64.110020] Modules linked in:
[ 64.110020] CPU: 0 PID: 1 Comm: init Not tainted 4.3.0-rc6-00788-g4183fdd #1
[ 64.110020] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1
04/01/2014
[ 64.110020] 0000000000000104 ffff88000eca7d00 ffffffff81415919 ffff88000eca7d40
[ 64.110020] ffffffff810a3308 ffffffff810dfd0f ffff880013961a00 ffffffff820728e0
[ 64.110020] ffff880013961a38 00000004ca03a0da ffffffff820728e0 ffff88000eca7d50
[ 64.110020] Call Trace:
[ 64.110020] [<ffffffff81415919>] dump_stack+0x19/0x20
[ 64.110020] [<ffffffff810a3308>] warn_slowpath_common+0x98/0xe0
[ 64.110020] [<ffffffff810dfd0f>] ? put_prev_entity+0x20f/0x2d0
[ 64.110020] [<ffffffff810a3405>] warn_slowpath_null+0x15/0x20
[ 64.110020] [<ffffffff810dfd0f>] put_prev_entity+0x20f/0x2d0
[ 64.110020] [<ffffffff810dd6db>] ? pick_next_entity+0x14b/0x2b0
[ 64.110020] [<ffffffff810dffde>] pick_next_task_fair+0x1be/0x2a0
[ 64.110020] [<ffffffff81b5eaca>] __schedule+0xea/0xb70
[ 64.110020] [<ffffffff81b5f74a>] preempt_schedule_common+0x2a/0x50
[ 64.110020] [<ffffffff81b5f786>] _cond_resched+0x16/0x20
[ 64.110020] [<ffffffff810c9356>] task_work_run+0xa6/0xe0
[ 64.110020] [<ffffffff81002a1f>] exit_to_usermode_loop+0xbf/0xd0
[ 64.110020] [<ffffffff810031b5>] do_syscall_32_irqs_off+0x175/0x190
[ 64.110020] [<ffffffff81b670d3>] entry_INT80_compat+0x33/0x40
[ 64.110020] ---[ end trace a212c87aecbb765b ]---
[ 64.182849] ==================================================================
git bisect start 7cb0326d8efd27c3936596f5abc6fac4400a5c48
32b88194f71d6ae7768a29f87fbba454728273ee --
git bisect good 515bb57a7bf106bb1a37e7a6ba0079d6322d94fd # 14:25 20+ 4 Merge
'l2-mtd/master' into devel-spot-201510271532
git bisect bad 5583b4496b3139343f9ba1c3e8d7869cb7d8d2d8 # 14:25 0- 41 Merge
'nvdimm/for-4.4/memremap' into devel-spot-201510271532
git bisect bad 2a8b8ef9e200879f7ee88706e4cf3d5625cf2a73 # 14:25 0- 24 Merge
'slave-dma/topic/edma' into devel-spot-201510271532
git bisect bad e0ba1a24b2093dd8195955039b918ba5f7301789 # 14:25 0- 54 Merge
'luto/x86/entry_compat' into devel-spot-201510271532
git bisect good 148a7732a83c50558948de14d79e7b34e602c007 # 15:17 21+ 4 Merge
'linux-review/Chao-Yu/f2fs-support-file-defragment/20151027-095717' into
devel-spot-201510271532
git bisect bad 10eb6ed23a5f3e60087bf7276358b5b348a20a67 # 15:17 0- 26 Merge
'linux-review/Joonwoo-Park/sched-fix-incorrect-wait-time-and-wait-count-statistics/20151027-094629'
into devel-spot-201510271532
git bisect good 295e5999f7bba512a3d744e11ed0327ba877fa3a # 15:27 22+ 0 Merge
branch 'perf/core'
git bisect good 1b9b556081cbc68c6bf5a8e540899171cecb12dc # 15:31 22+ 4 Merge
branch 'x86/apic'
git bisect good d58edcaab85edb3b1b72c0fee753d219cb4cee98 # 15:36 22+ 4 Merge
branch 'x86/debug'
git bisect good daa0f3856bb0673158f1952f494483e590e1a6d5 # 15:48 22+ 0 manual
merge of x86/fpu
git bisect good 2043d3d0c2ffb6afb71074df3f76a7bf179a31d6 # 15:56 22+ 0 Merge
branch 'x86/headers'
git bisect good 5fb50b1a1654b879ed4c7a940aef64f9bb4e3831 # 16:41 21+ 4 Merge
branch 'x86/platform'
git bisect bad 4183fdd83194ded503e1cec873556dd560f43dd4 # 16:41 0- 26 sched:
fix incorrect wait time and wait count statistics
git bisect good 8a5ec90292d4a7e5b133fd8d91ea3d1ddb9c2bff # 16:44 22+ 4 Merge
branch 'x86/urgent'
# first bad commit: [4183fdd83194ded503e1cec873556dd560f43dd4] sched: fix incorrect wait
time and wait count statistics
git bisect good 8a5ec90292d4a7e5b133fd8d91ea3d1ddb9c2bff # 16:52 63+ 4 Merge
branch 'x86/urgent'
# extra tests with DEBUG_INFO
git bisect bad 4183fdd83194ded503e1cec873556dd560f43dd4 # 17:02 0- 3 sched:
fix incorrect wait time and wait count statistics
# extra tests on HEAD of linux-devel/devel-spot-201510271532
git bisect bad f727564839f222a773ceb14f5c5753d1ba550acf # 17:03 0- 1116 0day
head guard for 'devel-spot-201510271532'
# extra tests on tree/branch
linux-review/Joonwoo-Park/sched-fix-incorrect-wait-time-and-wait-count-statistics/20151027-094629
git bisect bad 4183fdd83194ded503e1cec873556dd560f43dd4 # 17:03 0- 26 sched:
fix incorrect wait time and wait count statistics
# extra tests with first bad commit reverted
git bisect good 55f7afd3fcf8c06219adbb88fd94de4f60c18df2 # 17:22 61+ 0 Revert
"sched: fix incorrect wait time and wait count statistics"
# extra tests on tree/branch linus/master
git bisect good 38dab9ac1c017e96dc98e978111e365134d41d13 # 17:38 62+ 0 Merge
branch 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
# extra tests on tree/branch linux-next/master
git bisect good 6dcf94ff0c9e28e5790799e53641dd256745f425 # 17:50 62+ 0 Add
linux-next specific files for 20151022
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation