FYI, we noticed the following commit (built with gcc-7):
commit: 0a9efc8e562f66f927876db2effcbd6b80191476 ("sched: Basic tracking of matching tasks")
https://github.com/digitalocean/linux-coresched coresched
in testcase: rcutorture
with the following parameters:
runtime: 300s
test: cpuhotplug
torture_type: srcud
test-description: rcutorture is the rcutorture kernel module load/unload test.
test-url: https://www.kernel.org/doc/Documentation/RCU/torture.txt
on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 2G
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
+-------------------------------------------------------+------------+------------+
|                                                       | 6395f75d43 | 0a9efc8e56 |
+-------------------------------------------------------+------------+------------+
| boot_successes                                        | 2          | 4          |
| boot_failures                                         | 112        | 115        |
| BUG:kernel_hang_in_boot-around-mounting-root_stage    | 112        | 113        |
| WARNING:at_kernel/sched/sched.h:#migrate_tasks        | 0          | 2          |
| RIP:migrate_tasks                                     | 0          | 2          |
| WARNING:possible_circular_locking_dependency_detected | 0          | 2          |
+-------------------------------------------------------+------------+------------+
If you fix the issue, kindly add the following tag
Reported-by: kernel test robot <rong.a.chen(a)intel.com>
[ 133.050138] WARNING: CPU: 1 PID: 14 at kernel/sched/sched.h:1763 migrate_tasks+0x24f/0x7a9
[ 133.060259] Modules linked in: rcutorture torture crct10dif_pclmul crc32c_intel input_leds pcspkr i2c_piix4 evdev
[ 133.062468] CPU: 1 PID: 14 Comm: migration/1 Tainted: G T 5.2.0-rc1-00109-g0a9efc8 #1
[ 133.064365] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 133.066105] RIP: 0010:migrate_tasks+0x24f/0x7a9
[ 133.067087] Code: 48 ff 05 72 25 a8 05 48 8d bb f8 09 00 00 48 ff 05 4c 25 a8 05 e8 cb 5d 2c 00 4c 39 ab f8 09 00 00 74 17 48 ff 05 57 25 a8 05 <0f> 0b 48 ff 05 56 25 a8 05 48 ff 05 57 25 a8 05 49 8d bd 88 03 00
[ 133.083817] RSP: 0018:ffff8880595cfc08 EFLAGS: 00010002
[ 133.084869] RAX: ffffed100b426800 RBX: ffff88805a133c80 RCX: ffffffff81181019
[ 133.086227] RDX: ffff88805a133c80 RSI: 2000040000000000 RDI: ffff88805a134678
[ 133.087637] RBP: ffff8880595cfc60 R08: 0000000000000007 R09: 0000000000000007
[ 133.089093] R10: ffffed100b42694f R11: 0000000000000000 R12: ffff88805a133c80
[ 133.090560] R13: ffff88803ce7a100 R14: ffff8880595cfc98 R15: ffffffff83471e60
[ 133.091940] FS: 0000000000000000(0000) GS:ffff88805a100000(0000) knlGS:0000000000000000
[ 133.106073] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 133.107285] CR2: 00000000004216d0 CR3: 0000000044ec4000 CR4: 00000000000406a0
[ 133.108742] Call Trace:
[ 133.109350] sched_cpu_dying+0x205/0x3d1
[ 133.110239] ? sched_cpu_starting+0x233/0x233
[ 133.111198] ? irq_work_run+0x4d/0x56
[ 133.112011] ? flush_smp_call_function_queue+0x356/0x369
[ 133.113166] ? sched_cpu_starting+0x233/0x233
[ 133.114111] cpuhp_invoke_callback+0x401/0x1675
[ 133.115082] ? _raw_spin_unlock+0x37/0x70
[ 133.115996] ? unlock_vector_lock+0x17/0x20
[ 133.116929] ? lapic_offline+0x2f/0x38
[ 133.117745] take_cpu_down+0xdb/0x180
[ 133.131667] ? multi_cpu_stop+0x14f/0x25a
[ 133.132589] ? cpuhp_invoke_callback+0x1675/0x1675
[ 133.133655] multi_cpu_stop+0x156/0x25a
[ 133.134507] ? cpu_stop_queue_work+0x1d4/0x1d4
[ 133.135494] cpu_stopper_thread+0x160/0x23d
[ 133.136382] ? cpu_stop_create+0x55/0x55
[ 133.137182] smpboot_thread_fn+0x605/0x651
[ 133.138027] ? sort_range+0x3e/0x3e
[ 133.138704] ? __kthread_parkme+0x27/0x10f
[ 133.139480] ? __kthread_parkme+0xfa/0x10f
[ 133.140303] kthread+0x254/0x270
[ 133.141003] ? sort_range+0x3e/0x3e
[ 133.141793] ? kthread_stop+0x566/0x566
[ 133.142643] ret_from_fork+0x24/0x30
[ 133.143436] irq event stamp: 2278
[ 133.144202] hardirqs last enabled at (2277): [<ffffffff82eeeb43>] _raw_spin_unlock_irq+0x43/0x8a
[ 133.151128] hardirqs last disabled at (2278): [<ffffffff812b9e48>] multi_cpu_stop+0x115/0x25a
[ 133.152925] softirqs last enabled at (2114): [<ffffffff832006fa>] __do_softirq+0x6fa/0x774
[ 133.154642] softirqs last disabled at (2025): [<ffffffff81122f4b>] irq_exit+0xaf/0x161
[ 133.156292] ---[ end trace b03815f8d0d80b05 ]---
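When reading a trace like the one above, it can help to list only the frames that the unwinder did not mark with "?" (entries marked "?" are unreliable guesses found on the stack). The following is a minimal sketch, not part of the original report: the helper name reliable_frames and the log file name are hypothetical, and it assumes the dmesg output was saved to a text file with one log entry per line.

```shell
# reliable_frames (hypothetical helper): print the call-trace frames from a
# saved dmesg log, skipping the frames the unwinder marked with "?".
reliable_frames() {
    awk '
        /Call Trace:/ { in_trace = 1; next }
        in_trace {
            line = $0
            # Drop the leading "[ timestamp]" prefix, if present.
            sub(/^\[ *[0-9.]+\] */, "", line)
            # Skip unreliable frames.
            if (line ~ /^\? /) next
            # A frame has the form symbol+0xoffset/0xsize.
            if (line ~ /^[A-Za-z_.][A-Za-z0-9_.]*\+0x[0-9a-f]+\/0x[0-9a-f]+/) {
                print line
                next
            }
            # Anything else ends the trace.
            in_trace = 0
        }
    ' "$1"
}
```

Usage, assuming the log was saved as dmesg.txt: reliable_frames dmesg.txt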
To reproduce:
# build kernel
cd linux
cp config-5.2.0-rc1-00109-g0a9efc8 .config
make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 olddefconfig
make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 prepare
make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 modules_prepare
make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 SHELL=/bin/bash
make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 bzImage
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
Thanks,
Rong Chen