Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
https://github.com/0day-ci/linux
George-Spelvin/vfs-add-simple-direct-mapped-dcache-lookup-front-end/20160612-075440
commit a542b0c761819c80d2d24cb7f3d4d5cf1f147434
Author: George Spelvin <linux(a)sciencehorizons.net>
AuthorDate: Sat Jun 11 19:51:30 2016 -0400
Commit: 0day robot <fengguang.wu(a)intel.com>
CommitDate: Sun Jun 12 07:54:43 2016 +0800
vfs: add simple direct-mapped dcache lookup front-end
This is an old patch by Linus that he asked if I could fix the race
conditions in. Posted for comment on the RCU abuse (search for "Evil
RCU Hack") and performance in general.
[Linus speaking, Thu May 31, 2012]
I've pushed __d_lookup_rcu() just about as far as I could, and it still
had some problems.
The problems were mainly due to:
- the complexity of the slow-case handling causes register spills,
- the hash chain lookup loop causes not only register pressure, but
also the extra magic "mask off lock bit from the hash chain head
pointer" etc. logic, and
- the hash list needs to be dynamically sized (we want *big* caches, but
you can't use the same size for big and small machines), which causes
the initial hash lookup itself to be more complex.
This looks like a viable solution to all three problems, and it is
actually surprisingly simple: make a trivial fixed-size direct-mapped L1
dentry cache. No chains, no locking, no nothing.
This gives measurable improvement on my microbenchmark, and gets good
hit-rates on both kernel compiles and even on something like "updatedb",
which I'd have expected to be one of the worst possible cases.
Apparently updatedb still ends up looking up the same files (/etc/fstab
etc) a lot. So those good hit-rates seem to often be due to really
stupid programming, but hey, I think we all agree that "stupid
programming" is likely the common case that we generally do need to also
optimize for ;)
For my kernel compile benchmark ("make -j" on a fully built tree), the
profile shows (this is kernel-only profile, so user space overhead
removed):
8.19% [k] link_path_walk
7.74% [k] __d_lookup_rcu
5.66% [k] selinux_inode_permission
3.73% [k] do_lookup
2.86% [k] path_lookupat
2.72% [k] avc_has_perm_noaudit
2.71% [k] inode_has_perm.isra.49.constprop.71
2.68% [k] avc_lookup
2.51% [k] generic_permission
...
0.78% [k] __d_lookup_rcu_slow
...
where "__d_lookup_rcu_slow()" is the exact same old __d_lookup_rcu(), so
it's not really "slow", but it's quite noticeably slower than the
new
streamlined __d_lookup_rcu(). And as you can tell, that means that we
must have a 90%+ hitrate in the new L1 dcache lookup, since we only see
10% as much time in the slow routine as in the L1 front-end.
[George Spelvin speaking]
I fixed the race conditions in Linus's code, added Kconfig support,
and a number of comments.
I have two concerns about the performance of this code:
1) Since it was first written, RCU dcache lookup has gone completely
lockless and is even faster.
2) By adding a shared data structure that is written randomly by multiple
CPUs, this patch undoes a lot of that optimization. Even if it's a
win on a single-socket system, it may be a net loss on a larger one.
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Nick Piggin <npiggin(a)gmail.com>
Cc: Miklos Szeredi <mszeredi(a)suse.cz>
Cc: Paul E. McKenney <paulmck(a)linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: George Spelvin <linux(a)sciencehorizons.net>
+------------------------------------------------+------------+------------+------------+
| | 45b00c94be | a542b0c761 | db0d5369f2 |
+------------------------------------------------+------------+------------+------------+
| boot_successes | 857 | 241 | 25 |
| boot_failures | 52 | 73 | 7 |
| BUG:kernel_test_crashed | 52 | 12 | |
| WARNING:at_fs/dcache.c:#dentry_free | 0 | 59 | 4 |
| backtrace:user_path_at_empty | 0 | 57 | 2 |
| backtrace:SyS_faccessat | 0 | 57 | 2 |
| backtrace:SyS_access | 0 | 57 | 2 |
| backtrace:do_execve | 0 | 1 | |
| backtrace:SyS_execve | 0 | 1 | |
| backtrace:vfs_stat | 0 | 2 | 1 |
| backtrace:SyS_stat64 | 0 | 2 | 1 |
| IP-Config:Auto-configuration_of_network_failed | 0 | 2 | |
| backtrace:unix_stream_connect | 0 | 0 | 1 |
| backtrace:SyS_connect | 0 | 0 | 1 |
| backtrace:SyS_socketcall | 0 | 0 | 1 |
| BUG:unable_to_handle_kernel | 0 | 0 | 1 |
| Oops | 0 | 0 | 1 |
| EIP_is_at__dentry_kill | 0 | 0 | 1 |
| Kernel_panic-not_syncing:Fatal_exception | 0 | 0 | 1 |
| backtrace:vfs_fstatat | 0 | 0 | 1 |
| backtrace:SyS_fstatat64 | 0 | 0 | 1 |
| BUG:kernel_test_hang | 0 | 0 | 2 |
+------------------------------------------------+------------+------------+------------+
udevd[271]: failed to execute '/sbin/modprobe' '/sbin/modprobe -bv
acpi:PNP0700:': No such file or directory
[ 5.782865] ------------[ cut here ]------------
[ 5.782865] ------------[ cut here ]------------
[ 5.789054] WARNING: CPU: 0 PID: 338 at fs/dcache.c:345 dentry_free+0x1e/0x59
[ 5.789054] WARNING: CPU: 0 PID: 338 at fs/dcache.c:345 dentry_free+0x1e/0x59
[ 5.798410] CPU: 0 PID: 338 Comm: sh Not tainted 4.7.0-rc2-00326-ga542b0c #1
[ 5.798410] CPU: 0 PID: 338 Comm: sh Not tainted 4.7.0-rc2-00326-ga542b0c #1
[ 5.800780] 00000000
[ 5.800780] 00000000 00000000 00000000 c0621dac c0621dac c8fa077e c8fa077e 00000000
00000000 c8ef7308 c8ef7308 c0621dc4 c0621dc4 c8e3f04e c8e3f04e
[ 5.803388] 00000159
[ 5.803388] 00000159 d18d0950 d18d0950 00000001 00000001 d18d09b0 d18d09b0 c0621dd8
c0621dd8 c8e3f0d1 c8e3f0d1 00000009 00000009 00000000 00000000
[ 5.805886] 00000000
[ 5.805886] 00000000 c0621de4 c0621de4 c8ef7308 c8ef7308 d18d0950 d18d0950 c0621df8
c0621df8 c8ef7e11 c8ef7e11 d18d0950 d18d0950 d18d09b0 d18d09b0
[ 5.808486] Call Trace:
[ 5.808486] Call Trace:
[ 5.809279] [<c8fa077e>] dump_stack+0x58/0x72
[ 5.809279] [<c8fa077e>] dump_stack+0x58/0x72
[ 5.810672] [<c8ef7308>] ? dentry_free+0x1e/0x59
[ 5.810672] [<c8ef7308>] ? dentry_free+0x1e/0x59
[ 5.812081] [<c8e3f04e>] __warn+0xad/0xc4
[ 5.812081] [<c8e3f04e>] __warn+0xad/0xc4
[ 5.813344] [<c8e3f0d1>] warn_slowpath_null+0x11/0x16
[ 5.813344] [<c8e3f0d1>] warn_slowpath_null+0x11/0x16
[ 5.814945] [<c8ef7308>] dentry_free+0x1e/0x59
[ 5.814945] [<c8ef7308>] dentry_free+0x1e/0x59
[ 5.816286] [<c8ef7e11>] __dentry_kill+0x166/0x16e
[ 5.816286] [<c8ef7e11>] __dentry_kill+0x166/0x16e
[ 5.817841] [<c8ef82f4>] dput+0x1e3/0x24e
[ 5.817841] [<c8ef82f4>] dput+0x1e3/0x24e
[ 5.819069] [<c8eef36d>] path_put+0xe/0x18
[ 5.819069] [<c8eef36d>] path_put+0xe/0x18
[ 5.820375] [<c8eeffaa>] terminate_walk+0x1c/0x87
[ 5.820375] [<c8eeffaa>] terminate_walk+0x1c/0x87
[ 5.821941] [<c8ef0bac>] path_lookupat+0xb1/0xd3
[ 5.821941] [<c8ef0bac>] path_lookupat+0xb1/0xd3
[ 5.823476] [<c8ef0c20>] filename_lookup+0x52/0xc9
[ 5.823476] [<c8ef0c20>] filename_lookup+0x52/0xc9
[ 5.825079] [<c8ede062>] ? cache_alloc_debugcheck_after+0x30/0x132
[ 5.825079] [<c8ede062>] ? cache_alloc_debugcheck_after+0x30/0x132
[ 5.827148] [<c8ede4f4>] ? kmem_cache_alloc+0xd9/0xfa
[ 5.827148] [<c8ede4f4>] ? kmem_cache_alloc+0xd9/0xfa
[ 5.828780] [<c8ef195b>] user_path_at_empty+0x25/0x2c
[ 5.828780] [<c8ef195b>] user_path_at_empty+0x25/0x2c
[ 5.830427] [<c8eeaab6>] vfs_fstatat+0x49/0x7d
[ 5.830427] [<c8eeaab6>] vfs_fstatat+0x49/0x7d
[ 5.831896] [<c8eeaafd>] vfs_stat+0x13/0x15
[ 5.831896] [<c8eeaafd>] vfs_stat+0x13/0x15
[ 5.833147] [<c8eeacb7>] SyS_stat64+0x11/0x22
[ 5.833147] [<c8eeacb7>] SyS_stat64+0x11/0x22
[ 5.834558] [<c8e00d3d>] do_int80_syscall_32+0x48/0x84
[ 5.834558] [<c8e00d3d>] do_int80_syscall_32+0x48/0x84
[ 5.836269] [<c940abfa>] entry_INT80_32+0x2a/0x2a
[ 5.836269] [<c940abfa>] entry_INT80_32+0x2a/0x2a
[ 5.837849] ---[ end trace 4c316ed3dc6a1f1b ]---
[ 5.837849] ---[ end trace 4c316ed3dc6a1f1b ]---
git bisect start db0d5369f27706587ea26e571a1ef66321df0bba
5edb56491d4812c42175980759da53388e5d86f5 --
git bisect good 3700238e620892e500c6b7fabc2f0e611d908ac8 # 00:54 92+ 1 Merge
'agross/for-next' into devel-spot-201606122334
git bisect bad be45a2e3f83754faca0e50667e3666b20b2aa289 # 00:58 78- 6 Merge
'linux-review/Florian-Westphal/netfilter-move-zone-info-into-struct-nf_conn/20160612-040022'
into devel-spot-201606122334
git bisect good 298554eb44aad2aa38094f2d1fa76ddd4562e387 # 01:11 306+ 25 Merge
'jeffmerkey/master' into devel-spot-201606122334
git bisect bad 443ced919991f3586629089d4e382a2ae76bb9aa # 01:17 1- 1 Merge
'linux-review/George-Spelvin/vfs-add-simple-direct-mapped-dcache-lookup-front-end/20160612-075440'
into devel-spot-201606122334
git bisect bad a542b0c761819c80d2d24cb7f3d4d5cf1f147434 # 01:24 3- 3 vfs: add
simple direct-mapped dcache lookup front-end
# first bad commit: [a542b0c761819c80d2d24cb7f3d4d5cf1f147434] vfs: add simple
direct-mapped dcache lookup front-end
git bisect good 45b00c94be33db5d00595046663163ce55cbbfb9 # 01:44 905+ 52 Merge
tag 'scsi-fixes' of
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
# extra tests with CONFIG_DEBUG_INFO_REDUCED
git bisect bad a542b0c761819c80d2d24cb7f3d4d5cf1f147434 # 01:49 40- 3 vfs: add
simple direct-mapped dcache lookup front-end
# extra tests on HEAD of linux-devel/devel-spot-201606122334
git bisect bad db0d5369f27706587ea26e571a1ef66321df0bba # 01:49 0- 7 0day
head guard for 'devel-spot-201606122334'
# extra tests on tree/branch
linux-review/George-Spelvin/vfs-add-simple-direct-mapped-dcache-lookup-front-end/20160612-075440
git bisect bad a542b0c761819c80d2d24cb7f3d4d5cf1f147434 # 02:07 0- 73 vfs: add
simple direct-mapped dcache lookup front-end
# extra tests with first bad commit reverted
git bisect good 01c6ae35b5bea0fe16b14900cbbe411e4969c97d # 02:32 910+ 46 Revert
"vfs: add simple direct-mapped dcache lookup front-end"
# extra tests on tree/branch linus/master
git bisect good 5edb56491d4812c42175980759da53388e5d86f5 # 03:03 910+ 72 Linux
4.7-rc3
# extra tests on tree/branch linux-next/master
git bisect good 8f6027f7e808ed7c1fd8c8d37fc7a5076c683c4f # 03:19 910+ 46 Add
linux-next specific files for 20160609
This script may reproduce the error.
----------------------------------------------------------------------------
#!/bin/bash
kernel=$1
initrd=quantal-core-i386.cgz
wget --no-clobber
https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd
kvm=(
qemu-system-x86_64
-enable-kvm
-cpu kvm64
-kernel $kernel
-initrd $initrd
-m 300
-smp 2
-device e1000,netdev=net0
-netdev user,id=net0
-boot order=nc
-no-reboot
-watchdog i6300esb
-rtc base=localtime
-serial stdio
-display none
-monitor null
)
append=(
hung_task_panic=1
earlyprintk=ttyS0,115200
systemd.log_level=err
debug
apic=debug
sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100
panic=-1
softlockup_panic=1
nmi_watchdog=panic
oops=panic
load_ramdisk=2
prompt_ramdisk=0
console=ttyS0,115200
console=tty0
vga=normal
root=/dev/ram0
rw
drbd.minor_count=8
)
"${kvm[@]}" --append "${append[*]}"
----------------------------------------------------------------------------
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation