Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
git://anongit.freedesktop.org/drm-intel topic/core-for-CI
commit 3f7a5fb77af4f5b38e514b85d491da0046d6bfb8
Author: Peter Zijlstra <peterz(a)infradead.org>
AuthorDate: Wed Jan 11 17:43:02 2017 +0100
Commit: Daniel Vetter <daniel.vetter(a)ffwll.ch>
CommitDate: Mon Feb 6 10:04:48 2017 +0100
locking/mutex: Clear mutex-handoff flag on interrupt
On Mon, Jan 09, 2017 at 11:52:03AM +0000, Chris Wilson wrote:
> If we abort the mutex_lock() due to an interrupt, or other error from

s/interrupt/signal/, right?

> ww_mutex, we need to relinquish the handoff flag if we applied it.
> Otherwise, we may cause missed wakeups as the current owner may try to
> handoff to a new thread that is not expecting the handoff and so sleep
> thinking the lock is already claimed (and since the owner unlocked there
> may never be a new wakeup).

Isn't that the exact same scenario as Nicolai fixed here:

http://lkml.kernel.org/r/1482346000-9927-3-git-send-email-nhaehnle@gmail.com

Did you, like Nicolai, find this by inspection, or can you reproduce?

FWIW, I have the below patch that should also solve this problem afaict.
d790812fa2 mm/vmalloc: Replace opencoded 4-level page walkers
3f7a5fb77a locking/mutex: Clear mutex-handoff flag on interrupt
+--------------------------------------------------------------+------------+------------+
|                                                              | d790812fa2 | 3f7a5fb77a |
+--------------------------------------------------------------+------------+------------+
| boot_successes                                               | 174        | 3          |
| boot_failures                                                | 0          | 6          |
| invoked_oom-killer:gfp_mask=0x                               | 0          | 1          |
| Mem-Info                                                     | 0          | 1          |
| WARNING:at_arch/x86/include/asm/fpu/internal.h:#__switch_to  | 0          | 5          |
| WARNING:at_arch/x86/include/asm/fpu/internal.h:#fpu__copy    | 0          | 5          |
| WARNING:at_arch/x86/include/asm/fpu/internal.h:#fpu__restore | 0          | 5          |
+--------------------------------------------------------------+------------+------------+
[ 3.682260] ### dt-test ### end of unittest - 149 passed, 0 failed
[ 16.196304] Freeing unused kernel memory: 976K
[ 16.198261] Write protecting the kernel text: 12980k
[ 16.200258] Write protecting the kernel read-only data: 3808k
[ 16.203781] ------------[ cut here ]------------
[ 16.205683] WARNING: CPU: 0 PID: 1 at arch/x86/include/asm/fpu/internal.h:348 __switch_to+0x32b/0x350
[ 16.208506] Modules linked in:
[ 16.210235] CPU: 0 PID: 1 Comm: init Tainted: G S 4.10.0-rc7-00002-g3f7a5fb #1
[ 16.214583] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 16.217474] Call Trace:
[ 16.219571] ---[ end trace b0ed16ad1b787fda ]---
[ 16.227887] ------------[ cut here ]------------
[ 16.229927] WARNING: CPU: 0 PID: 132 at arch/x86/include/asm/fpu/internal.h:363 __switch_to+0x303/0x350
[ 16.233012] Modules linked in:
[ 16.234867] CPU: 0 PID: 132 Comm: rc.local Tainted: G S W 4.10.0-rc7-00002-g3f7a5fb #1
[ 16.238180] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 16.241092] Call Trace:
[ 16.242936] ---[ end trace b0ed16ad1b787fdb ]---
[ 16.246981] ------------[ cut here ]------------
[ 16.249093] WARNING: CPU: 0 PID: 132 at arch/x86/include/asm/fpu/internal.h:348 fpu__copy+0x183/0x2d0
[ 16.252242] Modules linked in:
[ 16.255251] CPU: 0 PID: 132 Comm: rc.local Tainted: G S W 4.10.0-rc7-00002-g3f7a5fb #1
[ 16.258238] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 16.261230] Call Trace:
[ 16.263287] dump_stack+0x16/0x1d
[ 16.265308] __warn+0xd9/0x100
[ 16.267624] ? fpu__copy+0x183/0x2d0
[ 16.269672] warn_slowpath_null+0x2a/0x30
[ 16.271895] fpu__copy+0x183/0x2d0
[ 16.274107] arch_dup_task_struct+0x34/0x40
[ 16.276322] copy_process+0xec/0x1410
[ 16.278587] ? __might_sleep+0x32/0xa0
[ 16.280642] ? __might_sleep+0x32/0xa0
[ 16.282811] _do_fork+0xda/0x340
[ 16.284735] ? _copy_to_user+0x4e/0x80
[ 16.286880] SyS_clone+0x2c/0x30
[ 16.288722] do_int80_syscall_32+0x5b/0xc0
[ 16.290793] entry_INT80_32+0x31/0x31
[ 16.292675] EIP: 0xb76f21b2
[ 16.294474] EFLAGS: 00000286 CPU: 0
[ 16.296356] EAX: ffffffda EBX: 01200011 ECX: 00000000 EDX: 00000000
[ 16.298685] ESI: 00000000 EDI: b7522728 EBP: bfc98a08 ESP: bfc989ac
[ 16.300922] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[ 16.303165] ---[ end trace b0ed16ad1b787fdc ]---
[ 16.307565] ------------[ cut here ]------------
[ 16.309524] WARNING: CPU: 0 PID: 1 at arch/x86/include/asm/fpu/internal.h:363 fpu__restore+0x1d3/0x1e0
[ 16.312573] Modules linked in:
[ 16.314539] CPU: 0 PID: 1 Comm: init Tainted: G S W 4.10.0-rc7-00002-g3f7a5fb #1
[ 16.317147] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 16.320117] Call Trace:
[ 16.322080] dump_stack+0x16/0x1d
[ 16.324886] __warn+0xd9/0x100
[ 16.326912] ? fpu__restore+0x1d3/0x1e0
[ 16.328932] warn_slowpath_null+0x2a/0x30
[ 16.330823] fpu__restore+0x1d3/0x1e0
[ 16.332703] __fpu__restore_sig+0x1dc/0x5c0
[ 16.334579] fpu__restore_sig+0x2f/0x50
[ 16.336455] restore_sigcontext+0xe5/0x100
[ 16.338370] sys_sigreturn+0xaa/0xe0
[ 16.340214] do_int80_syscall_32+0x5b/0xc0
[ 16.342001] entry_INT80_32+0x31/0x31
[ 16.343718] EIP: 0xb77421b0
[ 16.345258] EFLAGS: 00000246 CPU: 0
[ 16.348153] EAX: 00000003 EBX: 0000000a ECX: bf977a84 EDX: 0000000c
[ 16.352019] ESI: 00000086 EDI: 8006e4f8 EBP: bf979ac8 ESP: bf977824
[ 16.355205] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[ 16.357189] ---[ end trace b0ed16ad1b787fdd ]---
[ 16.404685] init: Failed to create pty - disabling logging for job
git bisect start 9f5f3fa2fdc98e99cf2861be423d3dace4e586cb c470abd4fde40ea6a0846a2beab642a578c0b8cd --
git bisect bad c1533ff89438c8b2ca8a1d5054b25fb005a4f87f # 12:04 0- 41 Merge 'ext4/dev' into devel-spot-201702201311
git bisect bad 909f1974f14908d58bed1df7ac8c2b86cc7301a7 # 12:18 0- 2 Merge 'linux-review/Tobin-C-Harding/x86-purgatory-Fix-sparse-warning-symbol-not-declared/20170220-071752' into devel-spot-201702201311
git bisect bad 313ad3777370eacce32486170ceeddc1840d5493 # 12:28 0- 44 Merge 'linux-review/jianchao-wang/MIPS-wrong-usage-of-l_exc_copy-in-octeon-memcpy-S/20170220-115401' into devel-spot-201702201311
git bisect bad f46faf2dc962eef03bc8b74d927c6845f5f047bd # 21:27 0- 3 Merge 'gvt-linux/gvt-staging' into devel-spot-201702201311
git bisect good 5d3e978ff92b6dab9828ef1e15cb71a18f5d8bdf # 01:18 40+ 0 0day base guard for 'devel-spot-201702201311'
git bisect good bc1e59b24d55320a8729eaf68b727ff65dfb521d # 02:04 42+ 0 drm/amdgpu:insert switch buffer only for VM submit
git bisect good 287599cf2d7719c812774ff49db9ae8ca4fa844a # 02:45 41+ 0 ALSA: add Intel HDMI LPE audio driver for BYT/CHT-T
git bisect good 7fff8126d9cc902b2636d05d5d34894a75174993 # 03:08 43+ 0 drm/i915/gen9+: Enable hotplug detection early
git bisect good 72affdf9729d9e9a81498196ed5ada4d8f1c599e # 03:26 42+ 0 drm/i915: Silence compiler for GTT selftests
git bisect good 2d42c033aec9f8e7e175c551ae62ea3f4dc200b9 # 11:27 42+ 0 ALSA: x86: Minor code rearrangement
git bisect good a4b10ccead4de0cf46bffb32fcb9e134b202676b # 01:14 41+ 0 drm: Constify drm_mode_config atomic helper private pointer
git bisect good ec62ed3e1d93843b382c222bc0d81546f12c97b8 # 01:45 40+ 0 drm/i915: Restore context and pd for ringbuffer submission after reset
git bisect good 262fd485ac6b476479f41f00bb104f6a1766ae66 # 14:13 43+ 0 drm/i915: Only enable hotplug interrupts if the display interrupts are enabled
git bisect good 6ef88df694873e9fbb4717f35a28d6dd9ea5d316 # 19:02 40+ 0 Merge remote-tracking branch 'sound/for-next' into drm-tip
git bisect good f642de16c86e9f31d084aa98a50d3a2c923450c2 # 07:12 41+ 0 dma-buf/dma-fence: improve doc for dma_fence_add_callback()
git bisect bad 0e65631f8a7a3bd5caa6545b210e218216624e7b # 07:25 0- 2 Merge remote-tracking branch 'drm-misc/drm-misc-next' into drm-tip
git bisect bad 3f7a5fb77af4f5b38e514b85d491da0046d6bfb8 # 07:25 0- 6 locking/mutex: Clear mutex-handoff flag on interrupt
git bisect good d790812fa2577bf238c3c9f0cd81c361a0fe0870 # 07:38 44+ 0 mm/vmalloc: Replace opencoded 4-level page walkers
# first bad commit: [3f7a5fb77af4f5b38e514b85d491da0046d6bfb8] locking/mutex: Clear mutex-handoff flag on interrupt
git bisect good d790812fa2577bf238c3c9f0cd81c361a0fe0870 # 07:55 122+ 0 mm/vmalloc: Replace opencoded 4-level page walkers
# extra tests with CONFIG_DEBUG_INFO_REDUCED
git bisect bad 3f7a5fb77af4f5b38e514b85d491da0046d6bfb8 # 10:46 0- 7 locking/mutex: Clear mutex-handoff flag on interrupt
# extra tests on HEAD of linux-devel/devel-spot-201702201311
git bisect bad 9f5f3fa2fdc98e99cf2861be423d3dace4e586cb # 10:46 0- 38 0day head guard for 'devel-spot-201702201311'
# extra tests on tree/branch drm-intel/topic/core-for-CI
git bisect bad ce3000be4f666479e49a4e844bda2a469b0bbb4d # 10:48 0- 9 e1000e: Undo e1000e_pm_freeze if __e1000_shutdown fails
# extra tests with first bad commit reverted
git bisect good 12f1a8d7fabf15efb1e509500ffe71551aae8717 # 13:08 126+ 0 Revert "locking/mutex: Clear mutex-handoff flag on interrupt"
---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/lkp                          Intel Corporation