Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
git://anongit.freedesktop.org/drm-intel topic/core-for-CI
commit 585774774191cce113cd3ab1419499d8a7f7687e
Author: Peter Zijlstra <peterz(a)infradead.org>
AuthorDate: Wed Jan 11 17:43:02 2017 +0100
Commit: Chris Wilson <chris(a)chris-wilson.co.uk>
CommitDate: Mon Feb 20 17:07:24 2017 +0000
locking/mutex: Clear mutex-handoff flag on interrupt

On Mon, Jan 09, 2017 at 11:52:03AM +0000, Chris Wilson wrote:
> If we abort the mutex_lock() due to an interrupt, or other error from

s/interrupt/signal/, right?

> ww_mutex, we need to relinquish the handoff flag if we applied it.
> Otherwise, we may cause missed wakeups as the current owner may try to
> handoff to a new thread that is not expecting the handoff and so sleep
> thinking the lock is already claimed (and since the owner unlocked there
> may never be a new wakeup).

Isn't that the exact same scenario as Nicolai fixed here:

http://lkml.kernel.org/r/1482346000-9927-3-git-send-email-nhaehnle@gmail.com

Did you, like Nicolai, find this by inspection, or can you reproduce?

FWIW, I have the below patch that should also solve this problem afaict.
d8870ff73d mm/vmalloc: Replace opencoded 4-level page walkers
5857747741 locking/mutex: Clear mutex-handoff flag on interrupt
+------------------------------------------------------------------------+------------+------------+
|                                                                        | d8870ff73d | 5857747741 |
+------------------------------------------------------------------------+------------+------------+
| boot_successes                                                         | 169        | 2          |
| boot_failures                                                          | 0          | 43         |
| WARNING:at_arch/x86/include/asm/fpu/internal.h:#__switch_to            | 0          | 43         |
| WARNING:at_arch/x86/include/asm/fpu/internal.h:#copy_fpregs_to_fpstate | 0          | 43         |
| WARNING:at_arch/x86/include/asm/fpu/internal.h:#copy_kernel_to_xregs   | 0          | 43         |
| BUG:kernel_hang_in_test_stage                                          | 0          | 1          |
+------------------------------------------------------------------------+------------+------------+
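For a quick read of the table, the boot counts can be turned into a failure rate per commit. The snippet below is only an illustrative sketch and not part of the robot's output; the helper name `failure_rate` is made up here, and the counts are copied from the boot_successes and boot_failures rows above.

```python
# Boot counts copied from the table above: (boot_successes, boot_failures).
boots = {
    "d8870ff73d": (169, 0),   # parent commit
    "5857747741": (2, 43),    # first bad commit
}


def failure_rate(successes, failures):
    """Return the fraction of boots that failed (hypothetical helper)."""
    total = successes + failures
    return failures / total if total else 0.0


rates = {commit: failure_rate(*counts) for commit, counts in boots.items()}

for commit, rate in rates.items():
    print(f"{commit}: {rate:.1%} of boots failed")
```

With these numbers the parent commit never failed in 169 boots, while the first bad commit failed in 43 of 45 boots.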
[ 5.162880] No soundcards found.
[ 5.164629] Freeing unused kernel memory: 1040K
[ 5.166493] Write protecting the kernel text: 16632k
[ 5.168354] Write protecting the kernel read-only data: 4240k
[ 5.171189] ------------[ cut here ]------------
[ 5.172580] WARNING: CPU: 0 PID: 1 at arch/x86/include/asm/fpu/internal.h:348 __switch_to+0x576/0x937
[ 5.175473] Modules linked in:
[ 5.176591] CPU: 0 PID: 1 Comm: init Not tainted 4.10.0-00002-g5857747 #1
[ 5.178260] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 5.181011] Call Trace:
[ 5.182035] ---[ end trace ebfa18e150d2487f ]---
[ 5.202948] ------------[ cut here ]------------
[ 5.202948] ------------[ cut here ]------------
[ 5.204774] WARNING: CPU: 0 PID: 1 at arch/x86/include/asm/fpu/internal.h:348 copy_fpregs_to_fpstate+0x159/0x1a0
[ 5.208551] Modules linked in:
[ 5.210071] CPU: 0 PID: 1 Comm: init Tainted: G W 4.10.0-00002-g5857747 #1
[ 5.213457] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 5.216979] Call Trace:
[ 5.217995] dump_stack+0x138/0x189
[ 5.218843] ? copy_fpregs_to_fpstate+0x159/0x1a0
[ 5.219799] __warn+0x14c/0x18d
[ 5.220668] warn_slowpath_null+0x2d/0x40
[ 5.221590] copy_fpregs_to_fpstate+0x159/0x1a0
[ 5.222572] fpu__copy+0x114/0x174
[ 5.223366] arch_dup_task_struct+0x39/0x4b
[ 5.224515] copy_process+0x2d6/0x245c
[ 5.225360] ? strncpy_from_user+0x7a/0x228
[ 5.226305] ? recalc_sigpending_tsk+0x72/0xb7
[ 5.227206] _do_fork+0xc6/0x4b2
[ 5.228010] ? _copy_to_user+0xf7/0x10e
[ 5.228936] SyS_clone+0x35/0x53
[ 5.229745] do_int80_syscall_32+0x8b/0xb9
[ 5.230656] entry_INT80_32+0x2a/0x2a
[ 5.231485] EIP: 0x47f11df2
[ 5.232242] EFLAGS: 00000286 CPU: 0
[ 5.233047] EAX: ffffffda EBX: 01200011 ECX: 00000000 EDX: 00000000
[ 5.234344] ESI: 00000000 EDI: b77b6728 EBP: bff2b368 ESP: bff2b324
[ 5.235427] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[ 5.236539] ---[ end trace ebfa18e150d24880 ]---
[ 5.237894] ------------[ cut here ]------------
[ 5.237894] ------------[ cut here ]------------
[ 5.238931] WARNING: CPU: 0 PID: 1 at arch/x86/include/asm/fpu/internal.h:363 __switch_to+0x86b/0x937
[ 5.240950] Modules linked in:
[ 5.241768] CPU: 0 PID: 1 Comm: init Tainted: G W 4.10.0-00002-g5857747 #1
[ 5.243439] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 5.245433] Call Trace:
[ 5.246178] ---[ end trace ebfa18e150d24881 ]---
[ 5.250897] ------------[ cut here ]------------
[ 5.250897] ------------[ cut here ]------------
[ 5.251941] WARNING: CPU: 0 PID: 118 at arch/x86/include/asm/fpu/internal.h:363 copy_kernel_to_xregs+0x6b/0x7c
[ 5.254356] Modules linked in:
[ 5.255119] CPU: 0 PID: 118 Comm: rcS Tainted: G W 4.10.0-00002-g5857747 #1
[ 5.256892] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 5.258780] Call Trace:
[ 5.259515] dump_stack+0x138/0x189
[ 5.260385] ? copy_kernel_to_xregs+0x6b/0x7c
[ 5.261449] __warn+0x14c/0x18d
[ 5.262347] warn_slowpath_null+0x2d/0x40
[ 5.263231] copy_kernel_to_xregs+0x6b/0x7c
[ 5.264472] copy_kernel_to_fpregs+0x14a/0x17d
[ 5.265521] fpu__restore+0x57/0x7b
[ 5.266349] __fpu__restore_sig+0x521/0x99b
[ 5.267218] fpu__restore_sig+0x9e/0xb0
[ 5.268046] restore_sigcontext+0x21d/0x266
[ 5.269681] sys_sigreturn+0x171/0x1c9
[ 5.270978] do_int80_syscall_32+0x8b/0xb9
[ 5.271830] entry_INT80_32+0x2a/0x2a
[ 5.272635] EIP: 0x47ea7ee7
[ 5.273338] EFLAGS: 00000246 CPU: 0
[ 5.274126] EAX: 00000000 EBX: 00000002 ECX: bf976f00 EDX: 00000000
[ 5.275182] ESI: 00000008 EDI: 47fdcff4 EBP: bf976f00 ESP: bf976dd0
[ 5.276240] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[ 5.277237] ---[ end trace ebfa18e150d24882 ]---
[ 6.491623] genirq: Flags mismatch irq 4. 00000000 (serial) vs. 00000080 (goldfish_pdev_bus)
git bisect start 460111e8535eb05d059287bfda6fc8f981c889fd c470abd4fde40ea6a0846a2beab642a578c0b8cd --
git bisect bad 41e6e84b43ae97c96732a66076b889effca6fffe # 09:50 0- 19 Merge 'linux-review/Masanari-Iida/xenbus-Remove-duplicate-inclusion-of-linux-init-h/20170226-152118' into devel-spot-201702270443
git bisect bad ab4cf7fa90893dbb2ab56fedd8fcc036d0612512 # 10:29 0- 41 Merge 'rui/next' into devel-spot-201702270443
git bisect bad 9904ffa584c74bcef337587b9f622b242cf2ac30 # 10:44 0- 6 Merge 'linux-review/Adriana-Constantinescu/ASoC-omap-Remove-unnecessary-out-of-memory-message/20170227-032559' into devel-spot-201702270443
git bisect good 7b89813a86589e53488565e5fe6cd9777afab7d3 # 11:05 41+ 0 0day base guard for 'devel-spot-201702270443'
git bisect bad fa02746b1d2b7a7d6adaea3ddb69464dea384f68 # 11:36 0- 9 Merge 'drm-intel/topic/designware-baytrail' into devel-spot-201702270443
git bisect good 3d5dbb10f34ad0521bfb7091bd5b47f6c984b9aa # 12:06 41+ 0 drm/i915: Pass dev_priv to remainder of the cdclk functions
git bisect good 5089f3d6df2356bde254551d4bacc70d254d2601 # 12:40 41+ 0 Merge remote-tracking branch 'airlied/drm-next' into drm-tip
git bisect good 141dee78c40ac2c43aa4ff306688d625e1c731de # 13:50 41+ 1 Merge remote-tracking branches 'asoc/topic/wm8753' and 'asoc/topic/zte' into asoc-next
git bisect good 5d81296b5e7849ba3bcc5bf430ffd37bf67ff7dc # 14:05 41+ 0 ALSA: line6: Always setup isochronous transfer properties
git bisect good 00d3c14f14d51babd8aeafd5fa734ccf04f5ca3d # 15:36 41+ 0 drm: Add name for DRM_DP_DUAL_MODE_LSPCON
git bisect good 0b6b524f3915f88eb4562e8d927528224a8eab41 # 15:48 41+ 0 ALSA: x86: Don't enable runtime PM as default
git bisect bad 19c1f53ab6946a928263340f1b7c7465e862c3a6 # 16:01 0- 29 drm/i915: implement hsw WaDisableVFUnitClockGating
git bisect bad 637c0397a1c3442b29561e7ad95ee2b51fa30af7 # 16:11 0- 18 Merge remote-tracking branch 'intel/topic/core-for-CI' into drm-tip
git bisect good 7086b7b3d101e0e6fca2bf7ca2f14483fc881837 # 16:23 41+ 0 ALSA: usb-audio: Tidy up mixer_us16x08.c
git bisect bad 585774774191cce113cd3ab1419499d8a7f7687e # 16:33 0- 16 locking/mutex: Clear mutex-handoff flag on interrupt
git bisect good d8870ff73dff1928d039de955eff94434e4d7b23 # 16:44 41+ 0 mm/vmalloc: Replace opencoded 4-level page walkers
# first bad commit: [585774774191cce113cd3ab1419499d8a7f7687e] locking/mutex: Clear mutex-handoff flag on interrupt
git bisect good d8870ff73dff1928d039de955eff94434e4d7b23 # 16:50 116+ 0 mm/vmalloc: Replace opencoded 4-level page walkers
# extra tests with CONFIG_DEBUG_INFO_REDUCED
git bisect bad 585774774191cce113cd3ab1419499d8a7f7687e # 17:01 0- 1 locking/mutex: Clear mutex-handoff flag on interrupt
# extra tests on HEAD of linux-devel/devel-spot-201702270443
git bisect bad 460111e8535eb05d059287bfda6fc8f981c889fd # 17:01 0- 38 0day head guard for 'devel-spot-201702270443'
# extra tests on tree/branch drm-intel/topic/core-for-CI
git bisect bad ce3000be4f666479e49a4e844bda2a469b0bbb4d # 17:14 0- 2 e1000e: Undo e1000e_pm_freeze if __e1000_shutdown fails
# extra tests with first bad commit reverted
git bisect good f3959c36d27bf9a5a37144f383501597473be0f9 # 17:41 124+ 0 Revert "locking/mutex: Clear mutex-handoff flag on interrupt"
---
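The bisect log above can also be checked mechanically. The following is a rough sketch, not something the robot ships: it assumes each step appears on one line as `git bisect good|bad <40-hex-sha> # ...` and that the result is reported as `# first bad commit: [<sha>]`. The function name `parse_bisect_log` is hypothetical, and the sample lines are copied from the log with line wrapping removed.

```python
import re

# A few entries copied from the bisect log above, one entry per line.
LOG = """\
git bisect good 7086b7b3d101e0e6fca2bf7ca2f14483fc881837 # 16:23 41+ 0 ALSA: usb-audio: Tidy up mixer_us16x08.c
git bisect bad 585774774191cce113cd3ab1419499d8a7f7687e # 16:33 0- 16 locking/mutex: Clear mutex-handoff flag on interrupt
git bisect good d8870ff73dff1928d039de955eff94434e4d7b23 # 16:44 41+ 0 mm/vmalloc: Replace opencoded 4-level page walkers
# first bad commit: [585774774191cce113cd3ab1419499d8a7f7687e] locking/mutex: Clear mutex-handoff flag on interrupt
"""

STEP = re.compile(r"^git bisect (good|bad) ([0-9a-f]{40})\b")
FIRST_BAD = re.compile(r"^# first bad commit: \[([0-9a-f]{40})\]")


def parse_bisect_log(text):
    """Return ({sha: verdict}, first_bad_sha) for a bisect log (hypothetical helper)."""
    verdicts = {}
    first_bad = None
    for line in text.splitlines():
        step = STEP.match(line)
        if step:
            verdicts[step.group(2)] = step.group(1)
            continue
        result = FIRST_BAD.match(line)
        if result:
            first_bad = result.group(1)
    return verdicts, first_bad


verdicts, first_bad = parse_bisect_log(LOG)
print("first bad commit:", first_bad)
print("marked as:", verdicts.get(first_bad))
```

The trailing annotations after `#` (time and boot counts) are deliberately ignored, since their exact format is not specified in this report.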
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation