FYI, we noticed the following commit (built with gcc-5):
commit: 9165c9007c162918d9309ad97ec47520bfa58f7b ("[PATCH tip/core/rcu 4/5] rcu:
Diagnostics for grace-period hangs")
url:
https://github.com/0day-ci/linux/commits/Paul-E-McKenney/Pre-gp_seq-misce...
in testcase: trinity
with following parameters:
runtime: 300s
test-description: Trinity is a linux system call fuzz tester.
test-url:
http://codemonkey.org.uk/projects/trinity/
on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -m 512M
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+------------------------------------------------+------------+------------+
| | c21b784637 | 9165c9007c |
+------------------------------------------------+------------+------------+
| boot_successes | 4 | 2 |
| boot_failures | 18 | 20 |
| BUG:soft_lockup-CPU##stuck_for#s | 18 | 4 |
| RIP:unwind_get_return_address | 1 | |
| Kernel_panic-not_syncing:softlockup:hung_tasks | 18 | 4 |
| RIP:check_memory_region | 1 | 1 |
| RIP:deref_stack_reg | 4 | 1 |
| RIP:arch_local_irq_restore | 4 | 4 |
| RIP:__asan_load8 | 1 | |
| RIP:stack_access_ok | 1 | |
| RIP:__asan_load4 | 1 | 1 |
| RIP:__fentry | 1 | 1 |
| RIP:__orc_find | 2 | 2 |
| RIP:unwind_next_frame | 1 | |
| RIP:__asan_load1 | 1 | 1 |
| RIP:__save_stack_trace | 0 | 1 |
| WARNING:possible_recursive_locking_detected | 0 | 16 |
| BUG:kernel_hang_in_boot_stage | 0 | 16 |
| RIP:__read_once_size_nocheck | 0 | 1 |
| RIP:__asan_store8 | 0 | 1 |
| RIP:save_stack_address | 0 | 1 |
| RIP:depot_save_stack | 0 | 1 |
| RIP:save_stack | 0 | 1 |
| RIP:__mutex_lock | 0 | 1 |
| RIP:on_stack | 0 | 1 |
| RIP:kobject_get_unless_zero | 0 | 1 |
+------------------------------------------------+------------+------------+
[ 2.080000] WARNING: possible recursive locking detected
[ 2.080000] 4.17.0-rc5-00051-g9165c90 #1 Not tainted
[ 2.080000] --------------------------------------------
[ 2.080000] swapper/0/1 is trying to acquire lock:
[ 2.080000] (ptrval) (rcu_node_0){..-.}, at: rcu_process_callbacks+0x563/0xbab
[ 2.080000]
[ 2.080000] but task is already holding lock:
[ 2.080000] (ptrval) (rcu_node_0){..-.}, at: rcu_process_callbacks+0x462/0xbab
[ 2.080000]
[ 2.080000] other info that might help us debug this:
[ 2.080000] Possible unsafe locking scenario:
[ 2.080000]
[ 2.080000] CPU0
[ 2.080000] ----
[ 2.080000] lock(rcu_node_0);
[ 2.080000] lock(rcu_node_0);
[ 2.080000]
[ 2.080000] *** DEADLOCK ***
[ 2.080000]
[ 2.080000] May be due to missing lock nesting notation
[ 2.080000]
[ 2.080000] 1 lock held by swapper/0/1:
[ 2.080000] #0: (ptrval) (rcu_node_0){..-.}, at:
rcu_process_callbacks+0x462/0xbab
[ 2.080000]
[ 2.080000] stack backtrace:
[ 2.080000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.17.0-rc5-00051-g9165c90 #1
[ 2.080000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1
04/01/2014
[ 2.080000] Call Trace:
[ 2.080000] <IRQ>
[ 2.080000] dump_stack+0x8d/0xd3
[ 2.080000] __lock_acquire+0x13dd/0x14ab
[ 2.080000] ? debug_show_all_locks+0x19a/0x19a
[ 2.080000] ? debug_show_all_locks+0x19a/0x19a
[ 2.080000] ? ftrace_likely_update+0x24/0x39
[ 2.080000] ? pvclock_clocksource_read+0x107/0x1a2
[ 2.080000] ? rcu_process_callbacks+0x357/0xbab
[ 2.080000] ? sched_clock_local+0xbe/0xdc
[ 2.080000] ? match_held_lock+0x13/0xf0
[ 2.080000] ? find_held_lock+0x79/0x89
[ 2.080000] ? rcu_process_callbacks+0x563/0xbab
[ 2.080000] ? lock_acquire+0x142/0x219
[ 2.080000] ? lock_contended+0x479/0x479
[ 2.080000] lock_acquire+0x142/0x219
[ 2.080000] ? rcu_process_callbacks+0x563/0xbab
[ 2.080000] _raw_spin_lock+0x30/0x61
[ 2.080000] ? rcu_process_callbacks+0x563/0xbab
[ 2.080000] rcu_process_callbacks+0x563/0xbab
[ 2.080000] ? rcu_nocb_kthread+0x7a0/0x7a0
[ 2.080000] ? __lock_is_held+0x2a/0x87
[ 2.080000] __do_softirq+0x1f8/0x595
[ 2.080000] irq_exit+0x80/0x15b
[ 2.080000] smp_apic_timer_interrupt+0x1e8/0x2db
[ 2.080000] apic_timer_interrupt+0xf/0x20
[ 2.080000] </IRQ>
[ 2.080000] RIP: 0010:__orc_find+0x4c/0x93
[ 2.080000] RSP: 0000:ffff88000fcc76e8 EFLAGS: 00000212 ORIG_RAX: ffffffffffffff13
[ 2.080000] RAX: 0000000000000000 RBX: ffffffff83e1a058 RCX: ffffffff8104768f
[ 2.080000] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffffff83e1a058
[ 2.080000] RBP: ffffffff83e1a03c R08: dffffc0000000000 R09: 0000000000000000
[ 2.080000] R10: fffffbfff0803f2a R11: fffffbfff0803f2b R12: ffffffff83e1a060
[ 2.080000] R13: ffffffff83e1a058 R14: ffffffff83e1a05c R15: 0000000000000002
[ 2.080000] ? __orc_find+0x58/0x93
[ 2.080000] ? __orc_find+0x58/0x93
[ 2.080000] ? __save_stack_trace+0x75/0xe2
[ 2.080000] ? __save_stack_trace+0x75/0xe2
[ 2.080000] unwind_next_frame+0x2e0/0x9c8
[ 2.080000] ? unwind_get_return_address_ptr+0x47/0x47
[ 2.080000] ? mark_lock+0x3a/0x262
[ 2.080000] ? sched_clock_local+0xbe/0xdc
[ 2.080000] __unwind_start+0x2a1/0x2fc
[ 2.080000] ? unwind_next_frame+0x9c8/0x9c8
[ 2.080000] ? ftrace_likely_update+0x24/0x39
[ 2.080000] ? pvclock_clocksource_read+0x107/0x1a2
[ 2.080000] __save_stack_trace+0x76/0xe2
[ 2.080000] ? __save_stack_trace+0x76/0xe2
[ 2.080000] save_stack+0x33/0xa6
[ 2.080000] ? debug_show_all_locks+0x19a/0x19a
[ 2.080000] ? ftrace_likely_update+0x24/0x39
[ 2.080000] ? pvclock_clocksource_read+0x107/0x1a2
[ 2.080000] ? ftrace_likely_update+0x24/0x39
[ 2.080000] ? pvclock_clocksource_read+0x107/0x1a2
[ 2.080000] ? fs_reclaim_release+0xe/0x27
[ 2.080000] ? sched_clock_local+0xbe/0xdc
[ 2.080000] ? sched_clock_cpu+0x138/0x154
[ 2.080000] ? lock_release+0x2bf/0x4c2
[ 2.080000] ? ftrace_likely_update+0x24/0x39
[ 2.080000] ? quarantine_reduce+0x63/0x1db
[ 2.080000] ? kasan_unpoison_shadow+0xf/0x2e
[ 2.080000] kasan_kmalloc+0x8e/0x9d
[ 2.080000] slab_post_alloc_hook+0x31/0x55
[ 2.080000] kmem_cache_alloc+0x11c/0x259
[ 2.080000] __kernfs_new_node+0x52/0x1c3
[ 2.080000] kernfs_new_node+0x32/0x59
[ 2.080000] __kernfs_create_file+0x2e/0x148
[ 2.080000] sysfs_add_file_mode_ns+0x1fb/0x24e
[ 2.080000] internal_create_group+0x315/0x4ce
[ 2.080000] param_sysfs_init+0x5f6/0x680
[ 2.080000] ? locate_module_kobject+0xe3/0xe3
[ 2.080000] ? __lock_is_held+0x2a/0x87
[ 2.080000] ? locate_module_kobject+0xe3/0xe3
[ 2.080000] do_one_initcall+0x134/0x3d3
To reproduce:
git clone
https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
Thanks,
Xiaolong