Greetings,
The 0day kernel testing robot got the dmesg below, and the first bad commit is:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
commit b16c29191dc89bd877af99a7b04ce4866728a3e0
Author: Sasha Levin <sasha.levin(a)oracle.com>
AuthorDate: Mon Jan 18 19:23:51 2016 -0500
Commit: Pablo Neira Ayuso <pablo(a)netfilter.org>
CommitDate: Wed Jan 20 14:15:31 2016 +0100
netfilter: nf_conntrack: use safer way to lock all buckets
When we need to lock all buckets in the connection hashtable we'd attempt to
lock 1024 spinlocks, which is way more preemption levels than supported by
the kernel. Furthermore, this behavior was hidden by checking if lockdep is
enabled, and if it was - use only 8 buckets(!).
Fix this by using a global lock and synchronize all buckets on it when we
need to lock them all. This is pretty heavyweight, but is only done when we
need to resize the hashtable, and that doesn't happen often enough (or at all).
Signed-off-by: Sasha Levin <sasha.levin(a)oracle.com>
Acked-by: Jesper Dangaard Brouer <brouer(a)redhat.com>
Reviewed-by: Florian Westphal <fw(a)strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
+-------------------------------+------------+------------+------------+
|                               | 35b815392a | b16c29191d | 9f59b59fdd |
+-------------------------------+------------+------------+------------+
| boot_successes                | 910        | 118        | 38         |
| boot_failures                 | 0          | 2          | 8          |
| BUG:spinlock_recursion_on_CPU | 0          | 2          | 8          |
| backtrace:cleanup_net         | 0          | 2          | 7          |
+-------------------------------+------------+------------+------------+
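To put the counts above in proportion, each column can be turned into a failure rate. This is a minimal sketch: `fail_rate` is a hypothetical helper, not part of the 0day tooling, and it assumes at least one boot was recorded.

```shell
#!/bin/bash
# fail_rate SUCCESSES FAILURES: print failures as a percentage of all boots.
# Hypothetical helper for reading the table; assumes SUCCESSES + FAILURES > 0.
fail_rate() {
    awk -v ok="$1" -v bad="$2" \
        'BEGIN { printf "%.1f%%\n", 100 * bad / (ok + bad) }'
}

fail_rate 910 0   # parent commit 35b815392a    -> 0.0%
fail_rate 118 2   # first bad commit b16c29191d -> 1.7%
fail_rate 38 8    # branch head 9f59b59fdd      -> 17.4%
```

At the first bad commit the failure shows up in only about one boot in sixty, so a single clean boot says little about whether a kernel is affected.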
[child0:1965] uid changed! Was: 0, now 65535
[child1:1964] child exiting.
Bailing main loop. Exit reason: UID changed.
[ 24.822451] BUG: spinlock recursion on CPU#1, kworker/u4:2/89
[ 24.823358] lock: nf_conntrack_locks+0x0/0x12780, .magic: dead4ead, .owner: kworker/u4:2/89, .owner_cpu: 1
[ 24.824802] CPU: 1 PID: 89 Comm: kworker/u4:2 Not tainted 4.4.0-03418-gb16c291 #1
[ 24.825913] Workqueue: netns cleanup_net
[ 24.826552] ffffffff81e08880 ffff88001265fbe8 ffffffff813cf0c9 ffff8800125e6100
[ 24.827743] ffff88001265fc08 ffffffff810b37e8 ffffffff81e08880 ffffffff81e08880
[ 24.828930] ffff88001265fc38 ffffffff810b39aa ffffffff81e08898 ffffffff81e08880
[ 24.830114] Call Trace:
[ 24.830510] [<ffffffff813cf0c9>] dump_stack+0x4b/0x72
[ 24.831277] [<ffffffff810b37e8>] spin_dump+0x78/0xc0
[ 24.832043] [<ffffffff810b39aa>] do_raw_spin_lock+0x11a/0x150
[ 24.832922] [<ffffffff8185022d>] _raw_spin_lock+0x5d/0x80
[ 24.833753] [<ffffffff81701922>] ? nf_conntrack_lock+0x12/0x60
[ 24.834648] [<ffffffff81701922>] nf_conntrack_lock+0x12/0x60
[ 24.835353] caif:caif_disconnect_client(): nothing to disconnect
[ 24.849447] [<ffffffff8170bce2>] ctnl_untimeout+0x82/0xb0
[ 24.850371] [<ffffffff8170bd3b>] cttimeout_net_exit+0x2b/0x80
[ 24.851245] [<ffffffff816b4ff8>] ops_exit_list+0x38/0x60
[ 24.852153] [<ffffffff816b5dae>] cleanup_net+0x1ae/0x270
[ 24.852986] [<ffffffff81080b71>] process_one_work+0x1c1/0x500
[ 24.853879] [<ffffffff81080ae9>] ? process_one_work+0x139/0x500
[ 24.854805] [<ffffffff81080efe>] worker_thread+0x4e/0x490
[ 24.855649] [<ffffffff81080eb0>] ? process_one_work+0x500/0x500
[ 24.856571] [<ffffffff81080eb0>] ? process_one_work+0x500/0x500
[ 24.857507] [<ffffffff81087851>] kthread+0x101/0x120
[ 24.858285] [<ffffffff81087750>] ? kthread_stop+0x120/0x120
[ 24.859186] [<ffffffff8185146f>] ret_from_fork+0x3f/0x70
[ 24.860035] [<ffffffff81087750>] ? kthread_stop+0x120/0x120
[ 34.718759] init: tty4 main process (1966) terminated with status 1
[ 34.721009] init: tty4 main process ended, respawning
[ 34.730863] init: tty5 main process (1967) terminated with status 1
git bisect start 9f59b59fddd4ad4a398007059b179cc804e9a85f 92e963f50fc74041b5e9e744c330dca48e04f08d --
git bisect bad d2907ebfd7892cc6919288e306782d28b1aec054 # 11:25 14- 3 Merge 'linux-review/Nicolas-Ferre/spi-atmel-fix-gpio-chip-select-in-case-of-non-DT-platform/20160128-005054' into devel-spot-201601280923
git bisect good 2f451ad637b99e6073555ee77b4861ba4c0b33d8 # 11:37 120+ 0 Merge 'asoc/for-next' into devel-spot-201601280923
git bisect bad 5bf7f3913eb09a58a29f8c49d35e6d564521efc3 # 11:43 0- 1 Merge 'asoc/topic/ssm4567' into devel-spot-201601280923
git bisect good c0cd80413f024d2d2b0306cd73ff905211b2dfdf # 12:00 120+ 0 Merge 'spi/for-next' into devel-spot-201601280923
git bisect good cec507cd4a1097ff1db46a2138b5a63a34e19a4e # 12:25 119+ 0 Merge 'linux-review/Ross-Zwisler/ext2-ext4-Fix-issue-with-missing-journal-entry/20160128-030742' into devel-spot-201601280923
git bisect bad fab357e5a0b27fe98447194f4ff3b57cf161353a # 12:48 5- 5 Merge 'bluetooth/master' into devel-spot-201601280923
git bisect bad 41a751e8480e093999fed3af75cfd17ad3822f77 # 13:02 4- 8 Merge 'linux-review/Eric-Dumazet/tcp-beware-of-alignments-in-tcp_get_info/20160128-025549' into devel-spot-201601280923
git bisect bad 0e03f563a04207cc8e5db6afe63309a585995de7 # 13:21 3- 5 net: mvneta: sort the headers in alphabetic order
git bisect bad 8034e1efcb330d2aecef8cbf8a83f206270c1775 # 13:39 2- 3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
git bisect good 81e8f2e930fe76b9814c71b9d87c30760b5eb705 # 13:48 120+ 0 net: dp83640: Fix tx timestamp overflow handling.
git bisect good d6b3347bf178266259af64b1f27b5cf54acf62c8 # 14:19 120+ 0 netfilter: xt_TCPMSS: handle CHECKSUM_COMPLETE in tcpmss_tg6()
git bisect bad b16c29191dc89bd877af99a7b04ce4866728a3e0 # 16:32 75- 2 netfilter: nf_conntrack: use safer way to lock all buckets
git bisect good 35b815392a6b6c268baf3b63d7f2ba350597024f # 17:14 310+ 0 netfilter: nf_tables_netdev: fix error path in module initialization
# first bad commit: [b16c29191dc89bd877af99a7b04ce4866728a3e0] netfilter: nf_conntrack: use safer way to lock all buckets
git bisect good 35b815392a6b6c268baf3b63d7f2ba350597024f # 17:27 905+ 0 netfilter: nf_tables_netdev: fix error path in module initialization
# extra tests with DEBUG_INFO
git bisect bad b16c29191dc89bd877af99a7b04ce4866728a3e0 # 17:51 104- 4 netfilter: nf_conntrack: use safer way to lock all buckets
# extra tests on HEAD of linux-devel/devel-spot-201601280923
git bisect bad 9f59b59fddd4ad4a398007059b179cc804e9a85f # 17:51 0- 8 0day head guard for 'devel-spot-201601280923'
# extra tests on tree/branch linux-next/master
git bisect bad 888c8375131656144c1605071eab2eb6ac49abc3 # 18:04 16- 3 Add linux-next specific files for 20160128
# extra tests with first bad commit reverted
git bisect good 8aef8bd67e3bcbe1894bf46992346fc0a5ef4647 # 18:32 905+ 0 Revert "netfilter: nf_conntrack: use safer way to lock all buckets"
# extra tests on tree/branch linus/master
git bisect good 03c21cb775a313f1ff19be59c5d02df3e3526471 # 19:20 910+ 0 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
# extra tests on tree/branch linux-next/master
git bisect bad 888c8375131656144c1605071eab2eb6ac49abc3 # 19:20 0- 64 Add linux-next specific files for 20160128
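The bisect log above mixes git commands with the robot's own annotations after each `#`. As a rough aid for reading it, the sketch below prints only the verdict and an abbreviated commit ID for each tested commit. `bisect_verdicts` is a hypothetical helper, and it assumes the log is passed as a file or on standard input.

```shell
#!/bin/bash
# bisect_verdicts [FILE...]: print "<verdict> <abbreviated commit>" for each
# "git bisect good|bad" line, skipping the start line, comment lines and the
# trailing annotations. Hypothetical helper, not part of git or 0day.
bisect_verdicts() {
    awk '$1 == "git" && $2 == "bisect" && ($3 == "good" || $3 == "bad") {
             print $3, substr($4, 1, 10)
         }' "$@"
}

# Example on two entries from the log.
bisect_verdicts <<'EOF'
git bisect bad b16c29191dc89bd877af99a7b04ce4866728a3e0 # 16:32 75- 2 netfilter: nf_conntrack: use safer way to lock all buckets
git bisect good 35b815392a6b6c268baf3b63d7f2ba350597024f # 17:14 310+ 0 netfilter: nf_tables_netdev: fix error path in module initialization
EOF
```

For these two entries it prints `bad b16c29191d` and `good 35b815392a`, the same abbreviated IDs used in the results table.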
This script may reproduce the error.
----------------------------------------------------------------------------
#!/bin/bash
kernel=$1
initrd=quantal-core-x86_64.cgz
wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd
kvm=(
qemu-system-x86_64
-enable-kvm
-cpu kvm64
-kernel "$kernel"
-initrd "$initrd"
-m 300
-smp 2
-device e1000,netdev=net0
-netdev user,id=net0
-boot order=nc
-no-reboot
-watchdog i6300esb
-rtc base=localtime
-serial stdio
-display none
-monitor null
)
append=(
hung_task_panic=1
earlyprintk=ttyS0,115200
systemd.log_level=err
debug
apic=debug
sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100
panic=-1
softlockup_panic=1
nmi_watchdog=panic
oops=panic
load_ramdisk=2
prompt_ramdisk=0
console=ttyS0,115200
console=tty0
vga=normal
root=/dev/ram0
rw
drbd.minor_count=8
)
"${kvm[@]}" --append "${append[*]}"
----------------------------------------------------------------------------
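Since the failure appeared in only a small fraction of boots at the first bad commit, one run of the script may well boot cleanly. The sketch below repeats the boot and checks each log for the reported BUG line. It rests on several assumptions: `./reproduce.sh` is a hypothetical file name for the script above, the default of 50 attempts and the 300-second limit per boot are arbitrary, and `timeout` from GNU coreutils is available.

```shell
#!/bin/bash
# Sketch only: ./reproduce.sh is a hypothetical name for the script above,
# and the attempt count and per-boot time limit are arbitrary choices.

# has_recursion FILE: succeed if the log contains the reported BUG line.
has_recursion() {
    grep -q 'BUG: spinlock recursion' "$1"
}

# hunt KERNEL [ATTEMPTS]: boot repeatedly until the BUG line shows up.
hunt() {
    local kernel=$1 attempts=${2:-50} i
    for ((i = 1; i <= attempts; i++)); do
        # timeout(1) bounds each boot in case the guest never exits.
        timeout 300 ./reproduce.sh "$kernel" > "boot-$i.log" 2>&1
        if has_recursion "boot-$i.log"; then
            echo "reproduced on attempt $i, see boot-$i.log"
            return 0
        fi
    done
    echo "not reproduced in $attempts attempts"
    return 1
}

# Run only when a kernel image is given on the command line.
if [ "$#" -ge 1 ]; then
    hunt "$@"
fi
```

Keeping one log per attempt means the full dmesg of the failing boot is still available after the loop stops.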
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation