Greetings,
FYI, we noticed the following commit (built with gcc-9):
commit: 152b0b530094bc4aa9f2ba10e2046a1cf7c1cd25 ("sched/fair: reduce cases for active balance")
https://git.linaro.org/people/vincent.guittot/kernel.git sched/pelt
in testcase: trinity
version: trinity-static-x86_64-x86_64-1c734c75-1_2020-01-06
with the following parameters:
runtime: 300s
test-description: Trinity is a Linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/
on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 8G
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
+-------------------------------------------------------+------------+------------+
| | feff2e65ef | 152b0b5300 |
+-------------------------------------------------------+------------+------------+
| boot_successes | 143 | 141 |
| boot_failures | 7 | 21 |
| INFO:rcu_sched_self-detected_stall_on_CPU | 2 | 1 |
| RIP:iov_iter_copy_from_user_atomic | 1 | 1 |
| BUG:soft_lockup-CPU##stuck_for#s![trinity-c4:#] | 2 | 1 |
| Kernel_panic-not_syncing | 2 | 1 |
| IP-Config:Auto-configuration_of_network_failed | 1 | 1 |
| BUG:workqueue_lockup-pool | 4 | |
| RIP:__asan_load8 | 1 | |
| RIP:__asan_load4 | 1 | |
| UBSAN:shift-out-of-bounds_in_kernel/sched/fair.c | 0 | 18 |
| RIP:lock_is_held_type | 0 | 1 |
| WARNING:possible_circular_locking_dependency_detected | 0 | 17 |
| RIP:_raw_spin_unlock_irq | 0 | 1 |
| WARNING:at_fs/read_write.c:#vfs_copy_file_range | 0 | 1 |
| RIP:vfs_copy_file_range | 0 | 1 |
| RIP:__slab_free | 0 | 1 |
| RIP:default_idle | 0 | 4 |
| RIP:rcutorture_one_extend | 0 | 1 |
| RIP:unwind_next_frame | 0 | 1 |
| RIP:__x64_sys_setrlimit | 0 | 1 |
| BUG:KASAN:double-free_or_invalid-free_in_s | 0 | 1 |
| RIP:mutex_spin_on_owner | 0 | 1 |
| RIP:begin_new_exec | 0 | 1 |
| WARNING:at_net/sched/sch_generic.c:#dev_watchdog | 0 | 1 |
| RIP:dev_watchdog | 0 | 1 |
| RIP:clear_page_rep | 0 | 1 |
+-------------------------------------------------------+------------+------------+
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen(a)intel.com>
[ 169.729452] UBSAN: shift-out-of-bounds in kernel/sched/fair.c:7683:14
[ 169.731792] shift exponent 188 is too large for 64-bit type 'long unsigned int'
[ 169.733890] CPU: 1 PID: 2259 Comm: trinity-c5 Not tainted 5.9.0-rc1-00141-g152b0b530094b #1
[ 169.736206] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 169.738580] Call Trace:
[ 169.742120] <IRQ>
[ 169.742877] dump_stack+0x9e/0xe0
[ 169.743930] ubsan_epilogue+0x5/0x40
[ 169.745032] __ubsan_handle_shift_out_of_bounds.cold+0x53/0x100
[ 169.749683] ? can_migrate_task+0x410/0x600
[ 169.750945] load_balance.cold+0x18/0x24
[ 169.752100] ? __lock_acquire+0x817/0xef0
[ 169.753381] ? find_busiest_group+0x4e0/0x4e0
[ 169.754675] ? rcu_read_lock_held+0xaa/0xc0
[ 169.755916] ? rcu_read_lock_sched_held+0xe0/0xe0
[ 169.757273] rebalance_domains+0x4c6/0x860
[ 169.758545] ? load_balance+0x1880/0x1880
[ 169.759783] ? _raw_spin_unlock_irqrestore+0x39/0x40
[ 169.761238] ? trace_hardirqs_on+0x1e/0x140
[ 169.762504] __do_softirq+0x100/0x6b6
[ 169.763622] asm_call_on_stack+0xf/0x20
[ 169.764782] </IRQ>
[ 169.765549] do_softirq_own_stack+0x59/0x70
[ 169.766778] irq_exit_rcu+0xb2/0xd0
[ 169.767876] sysvec_apic_timer_interrupt+0x43/0xa0
[ 169.769332] asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 169.770825] RIP: 0010:unwind_next_frame+0x23a/0xb40
[ 169.772263] Code: 4d 85 f6 75 14 48 8d 7b 34 49 c7 c6 40 55 35 83 e8 cb b3 30 00 c6 43 34 01 4d 8d 7e 04 4c 89 ff e8 4b b3 30 00 41 f6 46 04 0f <0f> 84 b7 00 00 00 be 02 00 00 00 4c 89 ff e8 43 bb 30 00 41 0f b6
[ 169.777030] RSP: 0018:ffff8881d8e87a30 EFLAGS: 00000206
[ 169.781848] RAX: 0000000000000000 RBX: ffff8881d8e87b30 RCX: ffffffff81083915
[ 169.783650] RDX: 1ffffffff083c876 RSI: ffffffff841e433c RDI: ffffffff841e43b2
[ 169.785586] RBP: 1ffff1103b1d0f4f R08: ffffffff81083323 R09: fffffbfff083c842
[ 169.787561] R10: 0000000000014010 R11: fffffbfff083c841 R12: 0000000000000001
[ 169.789550] R13: ffff8881d8e87b78 R14: ffffffff841e43ae R15: ffffffff841e43b2
[ 169.791488] ? __orc_find+0x63/0xc0
[ 169.792566] ? unwind_next_frame+0x235/0xb40
[ 169.793851] ? __kasan_kmalloc+0xc2/0xd0
[ 169.795301] ? get_reg+0xd0/0xd0
[ 169.796315] ? check_prevs_add+0x13a0/0x13a0
[ 169.797608] ? __unwind_start+0x2f3/0x370
[ 169.798831] ? stack_trace_save+0xc0/0xc0
[ 169.800057] arch_stack_walk+0x81/0xf0
[ 169.801206] ? __kasan_kmalloc+0xc2/0xd0
[ 169.802647] stack_trace_save+0x8c/0xc0
[ 169.803854] ? irqentry_exit_cond_resched+0x30/0x30
[ 169.805295] ? lock_downgrade+0x370/0x370
[ 169.806492] kasan_save_stack+0x1b/0x40
[ 169.807605] ? kasan_save_stack+0x1b/0x40
[ 169.808746] ? __kasan_kmalloc+0xc2/0xd0
[ 169.810110] ? do_raw_spin_unlock+0x9e/0x130
[ 169.811386] ? _raw_spin_unlock+0x1a/0x30
[ 169.812577] ? deactivate_slab+0x7c6/0x800
[ 169.813856] ? fsnotify_alloc_group+0x30/0x180
[ 169.815173] ? set_track+0x4a/0xc0
[ 169.816256] ? init_object+0x49/0x80
[ 169.817403] ? alloc_debug_processing+0x42/0x160
[ 169.818782] ? ___slab_alloc+0x6b3/0x970
[ 169.820196] ? fsnotify_alloc_group+0x30/0x180
[ 169.821532] ? mark_held_locks+0x23/0x90
[ 169.822717] ? fsnotify_alloc_group+0x30/0x180
[ 169.824022] ? lockdep_hardirqs_on_prepare+0x155/0x250
[ 169.825508] ? __slab_alloc+0x4b/0x60
[ 169.826868] ? kasan_unpoison_shadow+0x33/0x40
[ 169.828183] __kasan_kmalloc+0xc2/0xd0
[ 169.829592] ? fsnotify_alloc_group+0x30/0x180
[ 169.830865] kmem_cache_alloc_trace+0xdc/0x2e0
[ 169.832162] fsnotify_alloc_group+0x30/0x180
[ 169.833463] do_inotify_init+0x2c/0x280
[ 169.834554] __x64_sys_inotify_init1+0x16/0x20
[ 169.835797] do_syscall_64+0x2d/0x40
[ 169.836892] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 169.838398] RIP: 0033:0x463519
[ 169.839384] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db 59 00 00 c3 66 2e 0f 1f 84 00 00 00 00
[ 169.844369] RSP: 002b:00007ffe787d3648 EFLAGS: 00000246 ORIG_RAX: 0000000000000126
[ 169.846436] RAX: ffffffffffffffda RBX: 0000000000000126 RCX: 0000000000463519
[ 169.848395] RDX: 00000000ffffefff RSI: 00e3cb3139591d83 RDI: 0000000000000000
[ 169.850374] RBP: 00007f6f4f7f4000 R08: 0000000022000028 R09: 28a81d29208010a8
[ 169.852333] R10: 000000000000f774 R11: 0000000000000246 R12: 0000000000000002
[ 169.854317] R13: 00007f6f4f7f4058 R14: 000000000109a850 R15: 00007f6f4f7f4000
[ 169.856329] ================================================================================
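
For context on the UBSAN message above: in C, shifting a value by an amount greater than or equal to the width of its type is undefined behavior, so an exponent of 188 on a 64-bit 'long unsigned int' is out of range. We have not identified which expression at kernel/sched/fair.c:7683 produces the exponent. The sketch below only illustrates one common way to keep such a shift defined, by clamping the exponent; the helper name shr_clamped is hypothetical and is not taken from the kernel source.

```c
#include <limits.h>

/*
 * Hypothetical helper (an assumption for illustration, not kernel code):
 * right-shift `val`, clamping the exponent to (width - 1) so that the
 * shift is always defined, even for exponents such as 188.
 */
static inline unsigned long shr_clamped(unsigned long val, unsigned int shift)
{
	/* Largest exponent that is still defined for this type. */
	const unsigned int max_shift = sizeof(val) * CHAR_BIT - 1;

	return val >> (shift < max_shift ? shift : max_shift);
}
```

With this clamp, an oversized exponent behaves like the largest valid one, so a small value shifted by 188 yields 0 instead of invoking undefined behavior. Whether clamping is the right fix here depends on what the shifted value is meant to represent.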
[ 169.858715]
[ 169.858719] ======================================================
[ 169.858721] WARNING: possible circular locking dependency detected
[ 169.858724] 5.9.0-rc1-00141-g152b0b530094b #1 Not tainted
[ 169.858727] ------------------------------------------------------
[ 169.858729] trinity-c5/2259 is trying to acquire lock:
[ 169.858730] ffffffff833e95a0 (console_owner){-.-.}-{0:0}, at: console_unlock+0x201/0x730
[ 169.858738]
[ 169.858740] but task is already holding lock:
[ 169.858742] ffff8881e99f8018 (&rq->lock){-.-.}-{2:2}, at: load_balance+0x7c7/0x1880
[ 169.858749]
[ 169.858751] which lock already depends on the new lock.
[ 169.858753]
[ 169.858754]
[ 169.858756] the existing dependency chain (in reverse order) is:
[ 169.858758]
[ 169.858759] -> #4 (&rq->lock){-.-.}-{2:2}:
[ 169.858766] __lock_acquire+0x7f3/0xef0
[ 169.858768] lock_acquire+0x15d/0x550
[ 169.858770] _raw_spin_lock+0x2c/0x70
[ 169.858772] task_fork_fair+0x2e/0x2b0
[ 169.858773] sched_fork+0x146/0x2c0
[ 169.858776] copy_process+0xdd5/0x2ef0
[ 169.858777] _do_fork+0xf4/0x840
[ 169.858779] kernel_thread+0xa3/0xe0
[ 169.858781] rest_init+0x1e/0x2fa
[ 169.858783] start_kernel+0x3a5/0x3c3
[ 169.858785] secondary_startup_64+0xb6/0xc0
[ 169.858786]
[ 169.858787] -> #3 (&p->pi_lock){-.-.}-{2:2}:
[ 169.858795] __lock_acquire+0x7f3/0xef0
[ 169.858797] lock_acquire+0x15d/0x550
[ 169.858799] _raw_spin_lock_irqsave+0x37/0x80
[ 169.858801] try_to_wake_up+0xa2/0x1020
[ 169.858804] autoremove_wake_function+0x10/0x70
[ 169.858806] __wake_up_common+0xbe/0x250
[ 169.858808] __wake_up_common_lock+0xd0/0x130
[ 169.858810] tty_port_default_wakeup+0x16/0x30
[ 169.858812] serial8250_tx_chars+0x20d/0x3e0
[ 169.858815] serial8250_handle_irq+0xdf/0x170
[ 169.858816] serial8250_interrupt+0x83/0xd0
[ 169.858819] __handle_irq_event_percpu+0x80/0x4e0
[ 169.858821] handle_irq_event_percpu+0x6a/0xf0
[ 169.858823] handle_irq_event+0x50/0x7f
[ 169.858825] handle_edge_irq+0xfd/0x380
[ 169.858826] asm_call_on_stack+0xf/0x20
[ 169.858828] common_interrupt+0xea/0x1a0
[ 169.858830] asm_common_interrupt+0x1e/0x40
[ 169.858832] lock_acquire+0x18d/0x550
[ 169.858834] rcu_torture_read_lock+0x28/0x80
[ 169.858836] rcutorture_one_extend+0x26a/0x4b0
[ 169.858838] rcu_torture_one_read+0x30e/0x670
[ 169.858840] rcu_torture_reader+0x13b/0x2d0
[ 169.858842] kthread+0x1f4/0x220
[ 169.858844] ret_from_fork+0x1f/0x30
[ 169.858846]
[ 169.858847] -> #2 (&tty->write_wait){-.-.}-{2:2}:
[ 169.858855] __lock_acquire+0x7f3/0xef0
[ 169.858857] lock_acquire+0x15d/0x550
[ 169.858859] _raw_spin_lock_irqsave+0x37/0x80
[ 169.858861] __wake_up_common_lock+0xb4/0x130
[ 169.858863] tty_port_default_wakeup+0x16/0x30
[ 169.858865] serial8250_tx_chars+0x20d/0x3e0
[ 169.858867] serial8250_handle_irq+0xdf/0x170
[ 169.858869] serial8250_interrupt+0x83/0xd0
[ 169.858872] __handle_irq_event_percpu+0x80/0x4e0
[ 169.858874] handle_irq_event_percpu+0x6a/0xf0
[ 169.858876] handle_irq_event+0x50/0x7f
[ 169.858878] handle_edge_irq+0xfd/0x380
[ 169.858879] asm_call_on_stack+0xf/0x20
[ 169.858881] common_interrupt+0xea/0x1a0
[ 169.858883] asm_common_interrupt+0x1e/0x40
[ 169.858885] lock_acquire+0x18d/0x550
[ 169.858887] rcu_torture_read_lock+0x28/0x80
[ 169.858889] rcutorture_one_extend+0x26a/0x4b0
[ 169.858891] rcu_torture_one_read+0x30e/0x670
[ 169.858894] rcu_torture_reader+0x13b/0x2d0
[ 169.858895] kthread+0x1f4/0x220
[ 169.858898] ret_from_fork+0x1f/0x30
[ 169.858899]
[ 169.858900] -> #1 (&port->lock){-.-.}-{2:2}:
[ 169.858907] __lock_acquire+0x7f3/0xef0
[ 169.858909] lock_acquire+0x15d/0x550
[ 169.858911] _raw_spin_lock_irqsave+0x37/0x80
[ 169.858914] serial8250_console_write+0xfe/0x4f0
[ 169.858916] console_unlock+0x4c9/0x730
[ 169.858918] vprintk_emit+0xe5/0x290
[ 169.858920] printk+0xad/0xde
[ 169.858922] register_console+0x25a/0x3c0
[ 169.858924] univ8250_console_init+0x1f/0x22
[ 169.858926] console_init+0x21a/0x324
[ 169.858928] start_kernel+0x29a/0x3c3
[ 169.858931] secondary_startup_64+0xb6/0xc0
[ 169.858933]
[ 169.858934] -> #0 (console_owner){-.-.}-{0:0}:
[ 169.858941] check_prevs_add+0x325/0x13a0
[ 169.858943] validate_chain+0xde3/0x2860
[ 169.858945] __lock_acquire+0x7f3/0xef0
[ 169.858947] lock_acquire+0x15d/0x550
[ 169.858949] console_unlock+0x25f/0x730
[ 169.858951] vprintk_emit+0xe5/0x290
[ 169.858953] printk+0xad/0xde
[ 169.858955] ubsan_prologue+0x23/0x44
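
Our reading of the lockdep output above is that the existing chain orders the locks as console_owner, then &port->lock, then &tty->write_wait, then &p->pi_lock, then &rq->lock. The printk issued from ubsan_prologue tries to take console_owner while load_balance already holds &rq->lock, which would close the loop. The following is a minimal sketch of that kind of check: a depth-first search over recorded "taken while held" edges. It is not lockdep's implementation, and all type and function names in it are hypothetical.

```c
#include <stdbool.h>

/* Lock classes named in the report; the enum itself is illustrative. */
enum { CONSOLE_OWNER, PORT_LOCK, TTY_WRITE_WAIT, PI_LOCK, RQ_LOCK, NR_LOCKS };

/* taken_under[a][b] is true when lock b was acquired while a was held. */
static bool taken_under[NR_LOCKS][NR_LOCKS];

/* Depth-first search: is `to` reachable from `from` along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCKS; next++)
		if (taken_under[from][next] && !seen[next] &&
		    reachable(next, to, seen))
			return true;
	return false;
}

/* Taking `wanted` while holding `held` closes a cycle if `held` is
 * already reachable from `wanted`. */
static bool closes_cycle(int held, int wanted)
{
	bool seen[NR_LOCKS] = { false };

	return reachable(wanted, held, seen);
}

/* Record the existing chain (#1 through #4) as read from the report. */
static void load_reported_chain(void)
{
	taken_under[CONSOLE_OWNER][PORT_LOCK] = true;   /* #1 */
	taken_under[PORT_LOCK][TTY_WRITE_WAIT] = true;  /* #2 */
	taken_under[TTY_WRITE_WAIT][PI_LOCK] = true;    /* #3 */
	taken_under[PI_LOCK][RQ_LOCK] = true;           /* #4 */
}
```

After load_reported_chain(), closes_cycle(RQ_LOCK, CONSOLE_OWNER) returns true, which corresponds to the warning. If this reading is right, the lockdep warning is a side effect of UBSAN printing under &rq->lock rather than a separate problem.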
To reproduce:
# build kernel
cd linux
cp config-5.9.0-rc1-00141-g152b0b530094b .config
make HOSTCC=gcc-9 CC=gcc-9 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached to this email
Thanks,
Rong Chen