[lkp-robot] [mm] 44b163e12f: kernel_BUG_at_mm/swap.c
by kernel test robot
FYI, we noticed the following commit (built with gcc-7):
commit: 44b163e12fd4a133016482d94ad11d8f3365ddd2 ("mm: split up release_pages into non-sentinel and sentinel passes")
url: https://github.com/0day-ci/linux/commits/daniel-m-jordan-oracle-com/mm-ad...
in testcase: boot
on test machine: qemu-system-i386 -enable-kvm -m 360M
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+-----------------------------------------------------+------------+------------+
|                                                     | 6fe15c1d7a | 44b163e12f |
+-----------------------------------------------------+------------+------------+
| boot_successes                                      | 0          | 0          |
| boot_failures                                       | 46         | 12         |
| WARNING:possible_recursive_locking_detected         | 46         | 12         |
| WARNING:at_arch/x86/mm/dump_pagetables.c:#note_page | 8          | 2          |
| EIP:note_page                                       | 8          | 2          |
| kernel_BUG_at_mm/swap.c                             | 0          | 12         |
| invalid_opcode:#[##]                                | 0          | 12         |
| EIP:release_pages                                   | 0          | 12         |
| Kernel_panic-not_syncing:Fatal_exception            | 0          | 12         |
+-----------------------------------------------------+------------+------------+
[ 245.413373] kernel BUG at mm/swap.c:754!
[ 245.424199] invalid opcode: 0000 [#1] SMP
[ 245.432437] CPU: 0 PID: 164 Comm: sh Not tainted 4.15.0-00012-g44b163e #153
[ 245.445522] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 245.461052] EIP: release_pages+0x26/0x3ab
[ 245.468947] EFLAGS: 00010202 CPU: 0
[ 245.476401] EAX: c9c6200c EBX: c9c62000 ECX: c9c6dd80 EDX: 00000297
[ 245.490767] ESI: 00000000 EDI: c9c6de3c EBP: c9c6ddd8 ESP: c9c6dd64
[ 245.502693] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 245.513095] CR0: 80050033 CR2: 08138000 CR3: 0c9c0220 CR4: 000006b0
[ 245.524953] Call Trace:
[ 245.530908] ? cpumask_next+0x21/0x24
[ 245.537234] ? cpumask_any_but+0x1d/0x2d
[ 245.544004] ? flush_tlb_mm_range+0xcc/0x103
[ 245.552467] tlb_flush_mmu_free+0x17/0x33
[ 245.560820] tlb_flush_mmu+0x12/0x15
[ 245.568370] arch_tlb_finish_mmu+0x28/0x47
[ 245.575761] tlb_finish_mmu+0x1d/0x2c
[ 245.582080] exit_mmap+0xbc/0x10c
[ 245.588629] ? trace_hardirqs_off_caller+0x1b/0x99
[ 245.598128] mmput+0x53/0xc1
[ 245.604470] flush_old_exec+0x59f/0x60e
[ 245.612514] load_elf_binary+0x238/0x9d4
[ 245.620644] ? search_binary_handler+0x5c/0xbe
[ 245.629747] ? search_binary_handler+0x5c/0xbe
[ 245.638823] search_binary_handler+0x50/0xbe
[ 245.647474] do_execveat_common+0x545/0x7af
[ 245.656070] do_execve+0x14/0x16
[ 245.663265] SyS_execve+0x16/0x18
[ 245.670448] do_fast_syscall_32+0x11b/0x222
[ 245.679075] entry_SYSENTER_32+0x53/0x86
[ 245.687212] EIP: 0xb7eecbe5
[ 245.693652] EFLAGS: 00000292 CPU: 0
[ 245.701007] EAX: ffffffda EBX: 08138028 ECX: 081382a8 EDX: 08136008
[ 245.712423] ESI: 081382a8 EDI: b7ebbff4 EBP: 00000000 ESP: bfb82ed4
[ 245.723085] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[ 245.733522] Code: 7c f1 ff 5d c3 55 89 e5 57 56 53 83 ec 68 8d 4d a8 65 8b 35 14 00 00 00 89 75 f0 31 f6 81 fa 00 02 00 00 89 4d a8 89 4d ac 7e 02 <0f> 0b 8d 4a 1f c1 e9 05 c1 e1 02 83 f9 40 89 55 94 89 45 8c 76
[ 245.767993] EIP: release_pages+0x26/0x3ab SS:ESP: 0068:c9c6dd64
[ 245.779532] ---[ end trace 9116e5f455646a7b ]---
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
Thanks,
Xiaolong
0a5ca4b36c ("rcu: Parallelize expedited grace-period .."): WARNING: CPU: 0 PID: 0 at kernel/workqueue.c:2866 flush_work
by kernel test robot
Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/dev
commit 0a5ca4b36cf64ab6c333d2b0f6b79fca044d30ed
Author: Paul E. McKenney <paulmck(a)linux.vnet.ibm.com>
AuthorDate: Thu Feb 1 22:05:38 2018 -0800
Commit: Paul E. McKenney <paulmck(a)linux.vnet.ibm.com>
CommitDate: Fri Feb 2 03:28:18 2018 -0800
rcu: Parallelize expedited grace-period initialization
The latency of RCU expedited grace periods grows with increasing numbers
of CPUs, eventually failing to be all that expedited. Much of the growth
in latency is in the initialization phase, so this commit uses workqueues
to carry out this initialization concurrently on a rcu_node-by-rcu_node
basis.
Signed-off-by: Paul E. McKenney <paulmck(a)linux.vnet.ibm.com>
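The approach the commit message describes (queue one work item per rcu_node, then wait for all of them) can be illustrated with a small userspace sketch. This is Python rather than kernel C, and the names `init_node` and `init_all_nodes` as well as the dictionary layout of a node are assumptions made for illustration only; they are not taken from the patch.

```python
from concurrent.futures import ThreadPoolExecutor


def init_node(node):
    """Stand-in for per-rcu_node setup: build a bitmask of the node's CPUs."""
    mask = 0
    for cpu in node["cpus"]:
        mask |= 1 << cpu
    return node["id"], mask


def init_all_nodes(nodes, max_workers=4):
    """Queue one work item per node, then wait for every item to finish."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Analogue of queueing work: each node is initialized concurrently.
        futures = [pool.submit(init_node, node) for node in nodes]
        # Analogue of flushing work: block until each item has completed.
        return dict(future.result() for future in futures)


if __name__ == "__main__":
    nodes = [{"id": 0, "cpus": [0, 1]}, {"id": 1, "cpus": [2, 3]}]
    print(init_all_nodes(nodes))
```

The blocking wait in `init_all_nodes` only works once the worker pool exists and the caller is allowed to block. The dmesg in this report shows the warning raised in flush_work with PID 0 (swapper) very early in boot, so the corresponding wait step in the kernel is the natural place to start looking.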
0f0a62adf1 torture: Provide more sensible nreader/nwriter defaults for rcuperf
0a5ca4b36c rcu: Parallelize expedited grace-period initialization
+-------------------------------------------------------+------------+------------+
|                                                       | 0f0a62adf1 | 0a5ca4b36c |
+-------------------------------------------------------+------------+------------+
| boot_successes                                        | 35         | 0          |
| boot_failures                                         | 0          | 15         |
| WARNING:at_kernel/workqueue.c:#flush_work             | 0          | 15         |
| EIP:flush_work                                        | 0          | 15         |
| BUG:scheduling_while_atomic                           | 0          | 15         |
| WARNING:at_kernel/locking/lockdep.c:#lock_release     | 0          | 15         |
| EIP:lock_release                                      | 0          | 15         |
| WARNING:at_kernel/locking/lockdep.c:#lock_unpin_lock  | 0          | 15         |
| EIP:lock_unpin_lock                                   | 0          | 15         |
| WARNING:CPU:#PID:#at_kernel/locking/lockdep.c:#lock_u | 0          | 1          |
+-------------------------------------------------------+------------+------------+
[ 0.020000] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.020000] mce: CPU supports 10 MCE banks
[ 0.020077] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
[ 0.020676] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
[ 0.021493] CPU: Intel Core Processor (Haswell) (family: 0x6, model: 0x3c, stepping: 0x4)
[ 0.023238] WARNING: CPU: 0 PID: 0 at kernel/workqueue.c:2866 flush_work+0x278/0x290
[ 0.024330] Modules linked in:
[ 0.024756] CPU: 0 PID: 0 Comm: swapper Not tainted 4.15.0-rc1-00112-g0a5ca4b #1
[ 0.025664] task: 818cbe00 task.stack: 818c0000
[ 0.026214] EIP: flush_work+0x278/0x290
[ 0.026698] EFLAGS: 00210246 CPU: 0
[ 0.027145] EAX: 00000000 EBX: 81a250c0 ECX: 00000006 EDX: 00000001
[ 0.027856] ESI: 81a25094 EDI: 81a25358 EBP: 818c1e00 ESP: 818c1d30
[ 0.028743] DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[ 0.029397] CR0: 80050033 CR2: ffffffff CR3: 01b0d000 CR4: 001406d0
[ 0.030000] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 0.030000] DR6: fffe0ff0 DR7: 00000400
[ 0.030000] Call Trace:
[ 0.030000] ? kvm_clock_read+0x14/0x30
[ 0.030000] ? kvm_sched_clock_read+0x9/0x20
[ 0.030000] ? sched_clock+0x9/0x10
[ 0.030000] ? sched_clock_local+0x8a/0x160
[ 0.030000] ? sched_clock_cpu+0xe5/0x120
[ 0.030000] ? mark_held_locks+0x4a/0x70
[ 0.030000] ? queue_work_on+0x36/0x80
[ 0.030000] ? sync_rcu_exp_select_cpus+0x129/0x220
[ 0.030000] ? rcu_preempt_qs+0x70/0x70
[ 0.030000] ? _synchronize_rcu_expedited+0x2dc/0x300
[ 0.030000] ? __native_set_fixmap+0x30/0x30
[ 0.030000] ? ___ratelimit+0xb7/0x120
[ 0.030000] ? apply_paravirt+0x92/0x110
[ 0.030000] ? sched_clock_local+0x73/0x160
[ 0.030000] ? acpi_hw_read_port+0x4d/0xbd
[ 0.030000] ? synchronize_rcu+0x5d/0x70
[ 0.030000] ? acpi_hw_read_multiple+0x1e/0x65
[ 0.030000] ? acpi_hw_register_read+0x59/0xd4
[ 0.030000] ? acpi_read_bit_register+0x2e/0x59
[ 0.030000] ? rcu_test_sync_prims+0x5/0x20
[ 0.030000] ? rest_init+0x9/0x1d0
[ 0.030000] ? start_kernel+0x3df/0x3f8
[ 0.030000] ? startup_32_smp+0x15f/0x170
[ 0.030000] Code: cb e0 01 00 00 e9 d9 fe ff ff 89 f6 8d bc 27 00 00 00 00 8b 85 34 ff ff ff e8 d5 bf 5c 00 31 c0 e9 5d ff ff ff 8d b6 00 00 00 00 <0f> ff e9 50 ff ff ff e8 2c b2 fe ff 8d b6 00 00 00 00 8d bf 00
[ 0.030000] ---[ end trace 94904b8e3de7dea6 ]---
[ 0.030038] BUG: scheduling while atomic: swapper/0/0x00000002
# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start e2b60513fa8b07a40502812988533cf8cb10591e d8a5b80568a9cb66810e75b182018e9edb68e8ff --
git bisect bad 58b10f77084bf6cc7f8197c8fdbe4175e05abf18 # 07:48 B 0 9 23 0 Merge 'linux-review/Shreyas-NC/ASoC-Add-Multi-CPU-DAI-support/20180201-111320' into devel-hourly-2018020301
git bisect good dbfc3e24124c8e8763ec087f5c8b0a23d5985668 # 08:27 G 11 0 0 0 Merge 'linux-review/SF-Markus-Elfring/input-joystick-gamecon-Adjustments-for-gc_attach/20180130-143827' into devel-hourly-2018020301
git bisect good 495e3a5de5b1dc60026e1d984fdad1db0b6e2edb # 08:42 G 11 0 0 0 Merge 'regulator/fix/suspend' into devel-hourly-2018020301
git bisect bad 689177b3136567e1f74a36f42d2fb45ea5454fbe # 09:12 B 0 10 25 1 Merge 'sunxi/sunxi/for-next-next' into devel-hourly-2018020301
git bisect good 92def772c47a7d1d7a5715d114cc0b5d1c84ab46 # 09:46 G 11 0 0 0 Merge 'linux-review/Derek-Basehore/cpu_pm-add-syscore_suspend-error-handling/20180201-133540' into devel-hourly-2018020301
git bisect bad 75057dfeeadd8fb283d00bd49e1629ccfb60ea43 # 10:05 B 0 11 27 2 Merge 'linux-review/Lyude-Paul/Implement-full-clockgating-for-Kepler1-and-2/20180201-231630' into devel-hourly-2018020301
git bisect bad 16507d8c12819310bec815dfb343edac194d03c8 # 10:19 B 0 2 16 0 Merge 'rcu/rcu/dev' into devel-hourly-2018020301
git bisect good ad47dc6d915bf9ab013898220f3256be1cc9ad47 # 10:35 G 11 0 0 0 Merge 'linux-review/Valentin-Vidic/staging-pi433-fix-CamelCase-for-syncValues/20180129-233738' into devel-hourly-2018020301
git bisect good ccb640f29b5f1cd14f11724b0d4c062dbf48eeb5 # 11:08 G 11 0 0 0 Merge 'linux-review/Stephen-Boyd/MAINTAINERS-Update-sboyd-s-email-address/20180201-021431' into devel-hourly-2018020301
git bisect good 91b4c53238fd26ed23e5adccb7f91c549e02f0ee # 11:26 G 11 0 0 0 rcu: Remove obsolete callback-invocation statistics for debugfs
git bisect good 152de4e4e835bf56392c04bd64dc2cc3ce1c6625 # 11:37 G 11 0 0 0 rcu: Fix misprint in srcu_funnel_exp_start
git bisect good a2958a05be14e20ebb6f82c081d59add1bbfb1fb # 11:59 G 11 0 0 0 rcu: Add more tracing of expedited grace periods
git bisect good caff197f675357096a10a6ac7400675096c390d3 # 12:12 G 10 0 0 0 rcu: Meke expedited RCU CPU selection avoid unnecessary stores
git bisect bad 0a5ca4b36cf64ab6c333d2b0f6b79fca044d30ed # 12:27 B 0 7 22 1 rcu: Parallelize expedited grace-period initialization
git bisect good 0f0a62adf1c0fbec3c0e5988fb225e80b66d0340 # 12:40 G 11 0 0 0 torture: Provide more sensible nreader/nwriter defaults for rcuperf
# first bad commit: [0a5ca4b36cf64ab6c333d2b0f6b79fca044d30ed] rcu: Parallelize expedited grace-period initialization
git bisect good 0f0a62adf1c0fbec3c0e5988fb225e80b66d0340 # 12:43 G 31 0 0 0 torture: Provide more sensible nreader/nwriter defaults for rcuperf
# extra tests with debug options
git bisect bad 0a5ca4b36cf64ab6c333d2b0f6b79fca044d30ed # 12:53 B 0 11 25 0 rcu: Parallelize expedited grace-period initialization
# extra tests on HEAD of linux-devel/devel-hourly-2018020301
git bisect bad e2b60513fa8b07a40502812988533cf8cb10591e # 12:53 B 0 46 63 0 0day head guard for 'devel-hourly-2018020301'
# extra tests on tree/branch rcu/rcu/dev
git bisect bad 0a5ca4b36cf64ab6c333d2b0f6b79fca044d30ed # 13:04 B 0 15 29 0 rcu: Parallelize expedited grace-period initialization
# extra tests with first bad commit reverted
git bisect good d7ab22c6a90b87c44e778ace204f2fa18fec128d # 13:16 G 11 0 0 0 Revert "rcu: Parallelize expedited grace-period initialization"
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation
552d316987 ("of: cache phandle nodes to decrease cost of .."): BUG: sleeping function called from invalid context at mm/slab.h:419
by kernel test robot
Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
https://github.com/0day-ci/linux/commits/frowand-list-gmail-com/of-cache-...
commit 552d316987dc98a6df43a2da659d80031dc146e6
Author: Frank Rowand <frank.rowand(a)sony.com>
AuthorDate: Wed Jan 31 12:05:42 2018 -0800
Commit: 0day robot <fengguang.wu(a)intel.com>
CommitDate: Fri Feb 2 18:59:11 2018 +0800
of: cache phandle nodes to decrease cost of of_find_node_by_phandle()
Create a cache of the nodes that contain a phandle property. Use this
cache to find the node for a given phandle value instead of scanning
the devicetree to find the node. If the phandle value is not found
in the cache, of_find_node_by_phandle() will fall back to the tree
scan algorithm.
The cache is initialized in of_core_init().
The cache is freed via a late_initcall_sync().
Signed-off-by: Frank Rowand <frank.rowand(a)sony.com>
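The lookup scheme in the commit message (consult a cache first, fall back to a full tree scan on a miss, and keep working after the cache is freed) can be sketched as follows. This is a userspace Python illustration, not the patch itself; `Node`, `PhandleCache`, `scan_tree`, `populate` and `free` are hypothetical names chosen for the example.

```python
class Node:
    """Minimal devicetree-like node; phandle is None when the node has none."""

    def __init__(self, name, phandle=None, children=None):
        self.name = name
        self.phandle = phandle
        self.children = children if children is not None else []


def scan_tree(root, phandle):
    """Slow path: walk the whole tree looking for a matching phandle."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.phandle == phandle:
            return node
        stack.extend(node.children)
    return None


class PhandleCache:
    def __init__(self, root):
        self.root = root
        self.cache = None  # not populated until populate() is called

    def populate(self):
        """Record every node that carries a phandle."""
        self.cache = {}
        stack = [self.root]
        while stack:
            node = stack.pop()
            if node.phandle is not None:
                self.cache[node.phandle] = node
            stack.extend(node.children)

    def free(self):
        """Drop the cache; later lookups use the tree scan only."""
        self.cache = None

    def find(self, phandle):
        if self.cache is not None:
            hit = self.cache.get(phandle)
            if hit is not None:
                return hit
        # Cache absent or miss: fall back to the tree scan.
        return scan_tree(self.root, phandle)
```

Note that `populate` allocates memory. In the kernel, the context in which that allocation runs determines which allocation flags are legal, and the dmesg in this report flags exactly such an allocation (__kmalloc reached from of_core_init with in_atomic() and irqs_disabled() both set).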
3a6fbcb2e2 xtensa: remove arch specific early DT functions
552d316987 of: cache phandle nodes to decrease cost of of_find_node_by_phandle()
+----------------------------------------------------------------+------------+------------+
|                                                                | 3a6fbcb2e2 | 552d316987 |
+----------------------------------------------------------------+------------+------------+
| boot_successes                                                 | 33         | 0          |
| boot_failures                                                  | 0          | 15         |
| BUG:sleeping_function_called_from_invalid_context_at_mm/slab.h | 0          | 15         |
+----------------------------------------------------------------+------------+------------+
[ 0.011671] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
[ 0.012002] CPU: Intel Core Processor (Haswell) (family: 0x6, model: 0x3c, stepping: 0x4)
[ 0.014596] Performance Events: unsupported p6 CPU model 60 no PMU driver, software events only.
[ 0.015145] TSC deadline timer enabled
[ 0.016000] devtmpfs: initialized
[ 0.016147] BUG: sleeping function called from invalid context at mm/slab.h:419
[ 0.017000] in_atomic(): 1, irqs_disabled(): 1, pid: 1, name: swapper
[ 0.017000] CPU: 0 PID: 1 Comm: swapper Not tainted 4.15.0-rc3-00024-g552d316 #2
[ 0.017000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 0.017000] Call Trace:
[ 0.017000] ___might_sleep+0x107/0x120
[ 0.017000] __kmalloc+0x6e/0x200
[ 0.017000] ? do_early_param+0x88/0x88
[ 0.017000] of_core_init+0x77/0x19e
[ 0.017000] kernel_init_freeable+0x8e/0x17c
[ 0.017000] ? rest_init+0xa0/0xa0
[ 0.017000] kernel_init+0x5/0xe0
[ 0.017000] ret_from_fork+0x1f/0x30
[ 0.017586] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[ 0.018010] futex hash table entries: 256 (order: 1, 12288 bytes)
[ 0.019087] prandom: seed boundary self test passed
[ 0.020585] prandom: 100 self tests passed
[ 0.021005] pinctrl core: initialized pinctrl subsystem
# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start ddae25ea4e02e6de3e40fdc11b80dc1abed0737a d8a5b80568a9cb66810e75b182018e9edb68e8ff --
git bisect good d73d92fec48507279b8ed464b9aa2020d84c8b95 # 01:21 G 11 0 1 1 Merge 'drm-drm-misc/topic/backlight_for_lag' into devel-catchup-201802022013
git bisect good 64dca479224eb6c8a7a72b0cfb76ebe4ba2b8721 # 01:55 G 11 0 0 0 Merge 'linux-review/Thinh-Nguyen/usb-dwc3-Add-new-updates-for-DWC_usb31/20180202-034333' into devel-catchup-201802022013
git bisect bad 6822f02287e929f4877898ab804ee7bcd9c7fff3 # 02:22 B 0 11 25 0 Merge 'gfs2/iomap-write' into devel-catchup-201802022013
git bisect good 183c37d81b7259fcb313510b35ae2ab9cb895ce7 # 03:00 G 11 0 1 1 Merge 'linux-review/changbin-du-intel-com/tracing-fgraph-Missed-irq-return-mark-for-leaf-entry/20180202-133421' into devel-catchup-201802022013
git bisect good a558c2cc2ac473ebdca1deaf86421ff4d65b57b4 # 03:27 G 10 0 1 1 Merge 'linux-review/Nadav-Amit/x86-Align-TLB-invalidation-info/20180202-181650' into devel-catchup-201802022013
git bisect good d4001e077d3fcae48f6f0bbfddac74a96cf3e56b # 03:49 G 11 0 0 0 Merge 'asoc/for-next' into devel-catchup-201802022013
git bisect bad 5a3488601d4c2228d86fa43648b523eb578e5fed # 04:13 B 0 5 20 1 Merge 'linux-review/frowand-list-gmail-com/of-cache-phandle-nodes-to-decrease-cost-of-of_find_node_by_phandle/20180202-185903' into devel-catchup-201802022013
git bisect bad 552d316987dc98a6df43a2da659d80031dc146e6 # 04:44 B 0 5 19 0 of: cache phandle nodes to decrease cost of of_find_node_by_phandle()
# first bad commit: [552d316987dc98a6df43a2da659d80031dc146e6] of: cache phandle nodes to decrease cost of of_find_node_by_phandle()
git bisect good 3a6fbcb2e2e4b263df1cc8647ce1858c57ddc805 # 05:27 G 31 0 4 4 xtensa: remove arch specific early DT functions
# extra tests with debug options
git bisect bad 552d316987dc98a6df43a2da659d80031dc146e6 # 05:49 B 0 11 25 0 of: cache phandle nodes to decrease cost of of_find_node_by_phandle()
# extra tests on HEAD of linux-devel/devel-catchup-201802022013
git bisect bad ddae25ea4e02e6de3e40fdc11b80dc1abed0737a # 05:49 B 0 13 30 0 0day head guard for 'devel-catchup-201802022013'
# extra tests on tree/branch linux-review/frowand-list-gmail-com/of-cache-phandle-nodes-to-decrease-cost-of-of_find_node_by_phandle/20180202-185903
git bisect bad 552d316987dc98a6df43a2da659d80031dc146e6 # 05:53 B 0 15 29 0 of: cache phandle nodes to decrease cost of of_find_node_by_phandle()
# extra tests with first bad commit reverted
git bisect good 9c494d48153fcace74e5bcf084f48f26a1f90a31 # 06:18 G 11 0 1 1 Revert "of: cache phandle nodes to decrease cost of of_find_node_by_phandle()"
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation
fe01f91314 ("mm: convert to-be-refactored lru_lock callsites .."): WARNING: possible recursive locking detected
by kernel test robot
Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
https://github.com/0day-ci/linux/commits/daniel-m-jordan-oracle-com/mm-ad...
commit fe01f913149162bc7d3eaad69b27dd7fbe907c14
Author: daniel.m.jordan(a)oracle.com <daniel.m.jordan(a)oracle.com>
AuthorDate: Wed Jan 31 18:04:07 2018 -0500
Commit: 0day robot <fengguang.wu(a)intel.com>
CommitDate: Fri Feb 2 13:11:33 2018 +0800
mm: convert to-be-refactored lru_lock callsites to lock-all API
Use the heavy locking API for now to allow us to focus on the path we're
measuring to prove the concept--the release_pages path. In that path,
LRU batch locking will be used, but everywhere else will be heavy.
For now, exclude compaction since this would be a nontrivial
refactoring. We can deal with that in a future series.
Signed-off-by: Daniel Jordan <daniel.m.jordan(a)oracle.com>
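The distinction the commit message draws between "heavy" lock-all callers and per-batch callers can be pictured with a userspace sketch. This is Python with ordinary mutexes, not the kernel API; the class `LruBatchLocks` and its methods `batch` and `all` are hypothetical names for illustration.

```python
import threading
from contextlib import contextmanager


class LruBatchLocks:
    """An array of per-batch locks plus a heavy path that takes all of them."""

    def __init__(self, count):
        self.locks = [threading.Lock() for _ in range(count)]

    @contextmanager
    def batch(self, index):
        """Light path: hold only the lock that covers one batch."""
        with self.locks[index]:
            yield

    @contextmanager
    def all(self):
        """Heavy path: take every lock in index order, release in reverse."""
        for lock in self.locks:
            lock.acquire()
        try:
            yield
        finally:
            for lock in reversed(self.locks):
                lock.release()
```

Taking the locks in one fixed order keeps two heavy-path callers from deadlocking against each other. In the kernel, locks from the same array normally share one lockdep class, so acquiring several of them in a loop usually needs a nesting annotation; the lockdep report in this message says the warning "May be due to missing lock nesting notation".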
2ec368073f mm: add lru_[un]lock_all APIs
fe01f91314 mm: convert to-be-refactored lru_lock callsites to lock-all API
b11d52e255 mm: splice local lists onto the front of the LRU
+--------------------------------------------------------------------------------+------------+------------+------------+
|                                                                                | 2ec368073f | fe01f91314 | b11d52e255 |
+--------------------------------------------------------------------------------+------------+------------+------------+
| boot_successes                                                                 | 61         | 0          | 0          |
| boot_failures                                                                  | 0          | 15         | 43         |
| WARNING:possible_recursive_locking_detected                                    | 0          | 15         | 43         |
| Kernel_panic-not_syncing:stack-protector:Kernel_stack_is_corrupted_in          | 0          | 0          | 41         |
| Kernel_panic-not_syncing:stack-protector:Kernel_stack_is_corrupted_in:ca#c97   | 0          | 0          | 1          |
| Kernel_panic-not_syncing:stack-protector:Kernel_stack_is_corrupted_in:fb#b7654 | 0          | 0          | 1          |
+--------------------------------------------------------------------------------+------------+------------+------------+
[ 1.649963] pci 0000:00:02.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
[ 1.651430] PCI: CLS 0 bytes, default 64
[ 1.652486] Unpacking initramfs...
[ 1.654432]
[ 1.654733] ============================================
[ 1.655710] WARNING: possible recursive locking detected
[ 1.655992] 4.15.0-00007-gfe01f91 #1 Not tainted
[ 1.655992] --------------------------------------------
[ 1.655992] swapper/1 is trying to acquire lock:
[ 1.655992] (&(&pgdat->lru_batch_locks[i].lock)->rlock){....}, at: [<50565ce3>] pagevec_lru_move_fn+0x1ca/0x360
[ 1.655992]
[ 1.655992] but task is already holding lock:
[ 1.655992] (&(&pgdat->lru_batch_locks[i].lock)->rlock){....}, at: [<50565ce3>] pagevec_lru_move_fn+0x1ca/0x360
[ 1.655992]
[ 1.655992] other info that might help us debug this:
[ 1.655992] Possible unsafe locking scenario:
[ 1.655992]
[ 1.655992] CPU0
[ 1.655992] ----
[ 1.655992] lock(&(&pgdat->lru_batch_locks[i].lock)->rlock);
[ 1.655992]
[ 1.655992] *** DEADLOCK ***
[ 1.655992]
[ 1.655992] May be due to missing lock nesting notation
[ 1.655992]
[ 1.655992] 3 locks held by swapper/1:
[ 1.655992] #0: (sb_writers#2){.+.+}, at: [<7eb1a34b>] vfs_write+0x35a/0x390
[ 1.655992] #1: (&sb->s_type->i_mutex_key#2){++++}, at: [<bda546a1>] generic_file_write_iter+0x2c/0x5c0
[ 1.655992] #2: (&(&pgdat->lru_batch_locks[i].lock)->rlock){....}, at: [<50565ce3>] pagevec_lru_move_fn+0x1ca/0x360
[ 1.655992]
[ 1.655992] stack backtrace:
[ 1.655992] CPU: 0 PID: 1 Comm: swapper Not tainted 4.15.0-00007-gfe01f91 #1
[ 1.655992] Call Trace:
[ 1.655992] dump_stack+0x16/0x18
[ 1.655992] __lock_acquire+0xf1f/0x1770
[ 1.655992] lock_acquire+0xbd/0x1dd
[ 1.655992] ? pagevec_lru_move_fn+0x1ca/0x360
[ 1.655992] ? native_restore_fl+0x10/0x10
[ 1.655992] _raw_spin_lock+0x2b/0x40
[ 1.655992] ? pagevec_lru_move_fn+0x1ca/0x360
[ 1.655992] pagevec_lru_move_fn+0x1ca/0x360
[ 1.655992] ? lru_lazyfree_fn+0x480/0x480
[ 1.655992] __lru_cache_add+0x8b/0x110
[ 1.655992] lru_cache_add+0xd/0x10
[ 1.655992] add_to_page_cache_lru+0x114/0x1e0
[ 1.655992] pagecache_get_page+0x2f9/0x5e0
[ 1.655992] ? unlock_page+0x25/0x70
[ 1.655992] grab_cache_page_write_begin+0x3e/0x70
[ 1.655992] simple_write_begin+0x23/0x220
[ 1.655992] generic_perform_write+0xe8/0x290
[ 1.655992] __generic_file_write_iter+0x222/0x2b0
[ 1.655992] generic_file_write_iter+0x472/0x5c0
[ 1.655992] __vfs_write+0x1b4/0x240
[ 1.655992] vfs_write+0x201/0x390
[ 1.655992] SyS_write+0x6b/0x120
[ 1.655992] xwrite+0x28/0x9f
[ 1.655992] do_copy+0xcd/0x125
[ 1.655992] write_buffer+0x22/0x31
[ 1.655992] flush_buffer+0x44/0xe8
[ 1.655992] __gunzip+0x3ac/0x4c6
[ 1.655992] ? bunzip2+0x5f9/0x5f9
[ 1.655992] ? __gunzip+0x4c6/0x4c6
[ 1.655992] gunzip+0x16/0x18
[ 1.655992] ? error+0x31/0x31
[ 1.655992] ? do_start+0x20/0x20
[ 1.655992] unpack_to_rootfs+0x1cf/0x33f
[ 1.655992] ? error+0x31/0x31
[ 1.655992] ? do_start+0x20/0x20
[ 1.655992] ? do_header+0x297/0x297
[ 1.655992] populate_rootfs+0x7d/0xf7
[ 1.655992] do_one_initcall+0xa5/0x22d
[ 1.655992] ? do_early_param+0xb0/0xb0
[ 1.655992] kernel_init_freeable+0xe2/0x19b
[ 1.655992] ? rest_init+0x240/0x240
[ 1.655992] kernel_init+0xd/0x180
[ 1.655992] ret_from_fork+0x2e/0x40
[ 2.017319] Freeing initrd memory: 4084K
[ 2.020746] PCLMULQDQ-NI instructions are not detected.
[ 2.021583] The force parameter has not been set to 1. The Iris poweroff handler will not be installed.
[ 2.023006] NatSemi SCx200 Driver
[ 2.023622] spin_lock-torture:--- Start of test [debug]: nwriters_stress=2 nreaders_stress=0 stat_interval=60 verbose=1 shuffle_interval=3 stutter=5 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0
# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start b11d52e25588e5267868ac7a33f0b93cbb12ac2b d8a5b80568a9cb66810e75b182018e9edb68e8ff --
git bisect good 2ec368073f6dd8ec95f4d0555037088ca6e110ad # 16:33 G 11 0 0 0 mm: add lru_[un]lock_all APIs
git bisect bad c61170a78526b4c065a6a2c90d72af4daedb22ed # 16:59 B 0 11 50 26 mm: introduce add-only version of pagevec_lru_move_fn
git bisect bad fe01f913149162bc7d3eaad69b27dd7fbe907c14 # 17:21 B 0 1 15 1 mm: convert to-be-refactored lru_lock callsites to lock-all API
# first bad commit: [fe01f913149162bc7d3eaad69b27dd7fbe907c14] mm: convert to-be-refactored lru_lock callsites to lock-all API
git bisect good 2ec368073f6dd8ec95f4d0555037088ca6e110ad # 17:29 G 31 0 0 0 mm: add lru_[un]lock_all APIs
# extra tests on HEAD of linux-review/daniel-m-jordan-oracle-com/mm-add-a-percpu_pagelist_batch-sysctl-interface/20180202-131129
git bisect bad b11d52e25588e5267868ac7a33f0b93cbb12ac2b # 17:30 B 0 43 59 0 mm: splice local lists onto the front of the LRU
# extra tests on tree/branch linux-review/daniel-m-jordan-oracle-com/mm-add-a-percpu_pagelist_batch-sysctl-interface/20180202-131129
git bisect bad b11d52e25588e5267868ac7a33f0b93cbb12ac2b # 17:32 B 0 43 59 0 mm: splice local lists onto the front of the LRU
# extra tests with first bad commit reverted
git bisect good 25adbd15d95547e45e91961614537fadf6f3096f # 18:20 G 11 0 11 33 Revert "mm: convert to-be-refactored lru_lock callsites to lock-all API"
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation
b11d52e255 ("mm: splice local lists onto the front of the LRU"): kernel BUG at include/linux/mm_inline.h:60!
by kernel test robot
Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
https://github.com/0day-ci/linux/commits/daniel-m-jordan-oracle-com/mm-ad...
commit b11d52e25588e5267868ac7a33f0b93cbb12ac2b
Author: daniel.m.jordan(a)oracle.com <daniel.m.jordan(a)oracle.com>
AuthorDate: Wed Jan 31 18:04:13 2018 -0500
Commit: 0day robot <fengguang.wu(a)intel.com>
CommitDate: Fri Feb 2 13:11:34 2018 +0800
mm: splice local lists onto the front of the LRU
Now that release_pages is scaling better with concurrent removals from
the LRU, the performance results (included below) showed increased
contention on lru_lock in the add-to-LRU path.
To alleviate some of this contention, do more work outside the LRU lock.
Prepare a local list of pages to be spliced onto the front of the LRU,
including setting PageLRU in each page, before taking lru_lock. Since
other threads use this page flag in certain checks outside lru_lock,
ensure each page's LRU links have been properly initialized before
setting the flag, and use memory barriers accordingly.
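The splice idea described above can be sketched in userspace as follows. This is a Python illustration only: `Page`, `LruList` and `add_batch` are hypothetical names, and Python has no equivalent of the memory barriers the patch relies on, so only the ordering of the steps is shown.

```python
import threading
from collections import deque


class Page:
    def __init__(self, name):
        self.name = name
        self.links_ready = False  # stands in for initialized LRU links
        self.on_lru = False       # stands in for the PageLRU flag


class LruList:
    def __init__(self):
        self.lock = threading.Lock()
        self.pages = deque()  # index 0 is the front of the LRU

    def add_batch(self, new_pages):
        # Outside the lock: build a local list, initializing each page's
        # links before setting its flag, as the commit message requires.
        local = []
        for page in new_pages:
            page.links_ready = True
            page.on_lru = True
            local.append(page)
        # Under the lock: a single splice onto the front of the list.
        with self.lock:
            # extendleft reverses its input, so reverse first to keep order.
            self.pages.extendleft(reversed(local))
```

The lock hold time then covers one splice rather than one insertion per page, which is the contention reduction the commit message is after.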
Performance Results
This is a will-it-scale run of page_fault1 using 4 different kernels.
kernel              kern #
4.15-rc2                 1
large-zone-batch         2
lru-lock-base            3
lru-lock-splice          4
Each kernel builds on the last. The first is a baseline, the second
makes zone->lock more scalable by increasing an order-0 per-cpu
pagelist's 'batch' and 'high' values to 310 and 1860 respectively
(courtesy of Aaron Lu's patch), the third scales lru_lock without
splicing pages (the previous patch in this series), and the fourth adds
page splicing (this patch).
N tasks mmap, fault, and munmap anonymous pages in a loop until the test
time has elapsed.
The process case generally does better than the thread case most likely
because of mmap_sem acting as a bottleneck. There's ongoing work
upstream[*] to scale this lock, however, and once it goes in, my
hypothesis is the thread numbers here will improve.
kern #  ntask     proc      thr        proc    stdev         thr    stdev
               speedup  speedup       pgf/s                pgf/s
     1      1                       705,533    1,644     705,227    1,122
     2      1     2.5%     2.8%     722,912      453     724,807      728
     3      1     2.6%     2.6%     724,215      653     723,213      941
     4      1     2.3%     2.8%     721,746      272     724,944      728

kern #  ntask     proc      thr        proc    stdev         thr    stdev
               speedup  speedup       pgf/s                pgf/s
     1      4                     2,525,487    7,428   1,973,616   12,568
     2      4     2.6%     7.6%   2,590,699    6,968   2,123,570   10,350
     3      4     2.3%     4.4%   2,584,668   12,833   2,059,822   10,748
     4      4     4.7%     5.2%   2,643,251   13,297   2,076,808    9,506

kern #  ntask     proc      thr        proc    stdev         thr    stdev
               speedup  speedup       pgf/s                pgf/s
     1     16                     6,444,656   20,528   3,226,356   32,874
     2     16     1.9%    10.4%   6,566,846   20,803   3,560,437   64,019
     3     16    18.3%     6.8%   7,624,749   58,497   3,447,109   67,734
     4     16    28.2%     2.5%   8,264,125   31,677   3,306,679   69,443

kern #  ntask     proc      thr        proc    stdev         thr    stdev
               speedup  speedup       pgf/s                pgf/s
     1     32                    11,564,988   32,211   2,456,507   38,898
     2     32     1.8%     1.5%  11,777,119   45,418   2,494,064   27,964
     3     32    16.1%    -2.7%  13,426,746   94,057   2,389,934   40,186
     4     32    26.2%     1.2%  14,593,745   28,121   2,486,059   42,004

kern #  ntask     proc      thr        proc    stdev         thr    stdev
               speedup  speedup       pgf/s                pgf/s
     1     64                    12,080,629   33,676   2,443,043   61,973
     2     64     3.9%     9.9%  12,551,136  206,202   2,684,632   69,483
     3     64    15.0%    -3.8%  13,892,933  351,657   2,351,232   67,875
     4     64    21.9%     1.8%  14,728,765   64,945   2,485,940   66,839
[*] https://lwn.net/Articles/724502/ Range reader/writer locks
https://lwn.net/Articles/744188/ Speculative page faults
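The speedup columns in the performance results appear to be computed against kernel 1 at the same task count. A short check, using figures copied from the results and a helper name (`speedup_pct`) that is an assumption for illustration:

```python
def speedup_pct(baseline_pgf, value_pgf):
    """Percentage change in page faults per second relative to the baseline."""
    return round((value_pgf / baseline_pgf - 1) * 100, 1)


if __name__ == "__main__":
    # ntask = 1, process case, kernel 2 vs kernel 1
    print(speedup_pct(705_533, 722_912))      # 2.5
    # ntask = 16, process case, kernel 4 vs kernel 1
    print(speedup_pct(6_444_656, 8_264_125))  # 28.2
    # ntask = 32, thread case, kernel 3 vs kernel 1
    print(speedup_pct(2_456_507, 2_389_934))  # -2.7
```

These values match the corresponding entries in the results, which supports reading every speedup as relative to the 4.15-rc2 baseline rather than to the preceding kernel in the series.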
Signed-off-by: Daniel Jordan <daniel.m.jordan(a)oracle.com>
44b163e12f mm: split up release_pages into non-sentinel and sentinel passes
b11d52e255 mm: splice local lists onto the front of the LRU
+------------------------------------------+------------+------------+
|                                          | 44b163e12f | b11d52e255 |
+------------------------------------------+------------+------------+
| boot_successes                           | 35         | 0          |
| boot_failures                            | 0          | 34         |
| kernel_BUG_at_include/linux/mm_inline.h  | 0          | 34         |
| invalid_opcode:#[##]                     | 0          | 34         |
| RIP:__activate_page                      | 0          | 33         |
| Kernel_panic-not_syncing:Fatal_exception | 0          | 34         |
| RIP:release_pages                        | 0          | 1          |
+------------------------------------------+------------+------------+
[ 5.261954] flags: 0x600000000038(uptodate|dirty|lru)
[ 5.261958] raw: 0000600000000038 ffff9c0c180cae78 0000000000000000 00000001ffffffff
[ 5.261961] raw: ffff9c0c1e0c5320 ffff9c0c1e0aa920 0000000000000000 0000000000000000
[ 5.261963] page dumped because: VM_BUG_ON_PAGE(!second_page->lru_sentinel)
[ 5.261983] ------------[ cut here ]------------
[ 5.261985] kernel BUG at include/linux/mm_inline.h:60!
[ 5.261992] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[ 5.261994] CPU: 0 PID: 176 Comm: 10-help-text Not tainted 4.15.0-00013-gb11d52e #193
[ 5.261996] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 5.262002] RIP: 0010:__activate_page+0x23e/0x2d6
[ 5.262003] RSP: 0018:ffffac1680257d78 EFLAGS: 00010096
[ 5.262006] RAX: 000000000000003f RBX: ffff9c0c1e0aa900 RCX: ee3388f000000000
[ 5.262007] RDX: ee3388f000000000 RSI: 0000000011fa3063 RDI: 0000000000000046
[ 5.262008] RBP: 0000000000000001 R08: ffff9c0c179dbdc0 R09: 0000000089d0656d
[ 5.262010] R10: 0000000000000000 R11: 0000000000000060 R12: 0000000000000000
[ 5.262011] R13: 0000000018000001 R14: 0000000000000000 R15: ffffffffbd961ef0
[ 5.262013] FS: 0000000000000000(0000) GS:ffff9c0c1d400000(0000) knlGS:0000000000000000
[ 5.262014] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.262016] CR2: 00007fb195035688 CR3: 0000000008816000 CR4: 00000000000006f0
[ 5.262019] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5.262020] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 5.262021] Call Trace:
[ 5.262025] ? pagevec_move_tail_fn+0x30e/0x30e
[ 5.262027] pagevec_lru_move_fn+0xde/0x140
[ 5.262030] lru_add_drain+0x13/0x22
[ 5.262035] exit_mmap+0x5e/0x112
[ 5.262040] mmput+0x64/0xe5
[ 5.262042] do_exit+0x3e1/0x967
[ 5.262046] ? up_read+0x17/0x2c
[ 5.262049] ? __do_page_fault+0x35f/0x3e0
[ 5.262051] do_group_exit+0xad/0xad
[ 5.262053] SyS_exit_group+0xb/0xb
[ 5.262056] entry_SYSCALL_64_fastpath+0x24/0x8c
[ 5.262058] RIP: 0033:0x7fb194d3d408
[ 5.262060] RSP: 002b:00007ffe8553fa78 EFLAGS: 00000246
[ 5.262061] Code: 77 08 48 89 7b 20 48 89 53 28 48 89 32 48 8b 73 20 c6 43 3c 01 80 7e 1c 00 48 8d 7e e0 75 0e 48 c7 c6 bf 09 5f bd e8 08 75 01 00 <0f> 0b 8b 50 44 89 53 38 8b 48 40 8d 51 01 83 fa 0d 89 50 40 76
[ 5.262120] RIP: __activate_page+0x23e/0x2d6 RSP: ffffac1680257d78
[ 5.262122] ---[ end trace c12061551a396e73 ]---
[ 5.262124] Kernel panic - not syncing: Fatal exception
# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start b11d52e25588e5267868ac7a33f0b93cbb12ac2b d8a5b80568a9cb66810e75b182018e9edb68e8ff --
git bisect good 2ec368073f6dd8ec95f4d0555037088ca6e110ad # 14:17 G 11 0 0 0 mm: add lru_[un]lock_all APIs
git bisect good c61170a78526b4c065a6a2c90d72af4daedb22ed # 14:24 G 11 0 0 0 mm: introduce add-only version of pagevec_lru_move_fn
git bisect good 6fe15c1d7af584aea805223aac5011c846bbe7c6 # 14:31 G 10 0 0 1 mm: use lru_batch locking in release_pages
git bisect good 44b163e12fd4a133016482d94ad11d8f3365ddd2 # 14:40 G 11 0 0 0 mm: split up release_pages into non-sentinel and sentinel passes
# first bad commit: [b11d52e25588e5267868ac7a33f0b93cbb12ac2b] mm: splice local lists onto the front of the LRU
git bisect good 44b163e12fd4a133016482d94ad11d8f3365ddd2 # 14:43 G 31 0 0 0 mm: split up release_pages into non-sentinel and sentinel passes
# extra tests on HEAD of linux-review/daniel-m-jordan-oracle-com/mm-add-a-percpu_pagelist_batch-sysctl-interface/20180202-131129
git bisect bad b11d52e25588e5267868ac7a33f0b93cbb12ac2b # 14:43 B 0 33 82 1 mm: splice local lists onto the front of the LRU
# extra tests on tree/branch linux-review/daniel-m-jordan-oracle-com/mm-add-a-percpu_pagelist_batch-sysctl-interface/20180202-131129
git bisect bad b11d52e25588e5267868ac7a33f0b93cbb12ac2b # 14:45 B 0 33 82 1 mm: splice local lists onto the front of the LRU
# extra tests with first bad commit reverted
git bisect good 73f8c593c869f0aa408485c758fd5b56c9eb36aa # 14:52 G 11 0 0 0 Revert "mm: splice local lists onto the front of the LRU"
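The bisect log above follows the column header `HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD`. As a reading aid, here is a minimal sketch of a parser for those lines. The names `BisectEntry` and `parse_bisect_log` are hypothetical and not part of lkp-tests, and the regular expression only assumes the layout visible in this report.

```python
import re
from dataclasses import dataclass

# Layout assumed from the log above:
# git bisect <good|bad> <sha> # HH:MM <G|B> GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD subject
LINE_RE = re.compile(
    r"^git bisect (good|bad) ([0-9a-f]{40}) # (\d\d:\d\d) ([GB]) "
    r"(\d+) (\d+) (\d+) (\d+) (.*)$"
)

@dataclass
class BisectEntry:
    verdict: str          # "good" or "bad"
    sha: str
    time: str             # HH:MM
    good: int
    bad: int
    good_but_dirty: int
    dirty_not_bad: int
    subject: str

def parse_bisect_log(text):
    """Return one BisectEntry per result line; comment lines are skipped."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        verdict, sha, time, _flag, g, b, gd, dnb, subject = m.groups()
        entries.append(
            BisectEntry(verdict, sha, time, int(g), int(b), int(gd), int(dnb), subject)
        )
    return entries

SAMPLE = """\
# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect good 2ec368073f6dd8ec95f4d0555037088ca6e110ad # 14:17 G 11 0 0 0 mm: add lru_[un]lock_all APIs
git bisect bad b11d52e25588e5267868ac7a33f0b93cbb12ac2b # 14:43 B 0 33 82 1 mm: splice local lists onto the front of the LRU
"""

entries = parse_bisect_log(SAMPLE)
for e in entries:
    print(e.verdict, e.sha[:10], e.good, e.bad, e.subject)
```

Read this way, the second sample line records 0 good boots, 33 bad boots, 82 good-but-dirty boots and 1 dirty-not-bad boot for b11d52e255.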
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation
eb75544520 ("struct page: add field for vm_struct"): BUG: unable to handle kernel paging request at 0000000003fb4008
by kernel test robot
Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
https://github.com/0day-ci/linux/commits/Igor-Stoppa/mm-security-ro-prote...
commit eb755445201ed5425b5b12ecffec44ba8fd02a54
Author: Igor Stoppa <igor.stoppa(a)huawei.com>
AuthorDate: Tue Jan 30 17:14:43 2018 +0200
Commit: 0day robot <fengguang.wu(a)intel.com>
CommitDate: Fri Feb 2 12:34:39 2018 +0800
struct page: add field for vm_struct
When a page is used for virtual memory, it is often necessary to obtain
a handler to the corresponding vm_struct, which refers to the virtually
continuous area generated when invoking vmalloc.
The struct page has a "mapping" field, which can be re-used, to store a
pointer to the parent area. This will avoid more expensive searches.
As an example, the function find_vm_area is reimplemented, to take advantage
of the newly introduced field.
Signed-off-by: Igor Stoppa <igor.stoppa(a)huawei.com>
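The idea in the commit message above can be illustrated with a toy model. This is not the patch itself and not kernel code: `VmArea`, `Page` and `ToyVmalloc` are hypothetical stand-ins for vm_struct, struct page and the vmalloc bookkeeping, and the `area` attribute plays the role of the reused "mapping" field. Addresses are assumed to be page-aligned.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SIZE = 4096

@dataclass
class VmArea:
    """Toy stand-in for vm_struct: a virtually contiguous range."""
    addr: int
    size: int

@dataclass
class Page:
    """Toy stand-in for struct page; 'area' models the reused mapping field."""
    area: Optional[VmArea] = None

class ToyVmalloc:
    def __init__(self):
        self.areas = []   # every area, walked by the search-based lookup
        self.pages = {}   # virtual page number -> Page

    def vmalloc(self, addr, size):
        area = VmArea(addr, size)
        self.areas.append(area)
        first = addr // PAGE_SIZE
        last = (addr + size + PAGE_SIZE - 1) // PAGE_SIZE
        for vpn in range(first, last):
            # Store the back-pointer in every page that backs the area.
            self.pages[vpn] = Page(area=area)
        return area

    def find_area_by_search(self, addr):
        """Lookup that walks all areas (the 'more expensive search')."""
        for area in self.areas:
            if area.addr <= addr < area.addr + area.size:
                return area
        return None

    def find_area_by_page(self, addr):
        """Lookup that follows the pointer stored in the page."""
        page = self.pages.get(addr // PAGE_SIZE)
        return page.area if page is not None else None

vm = ToyVmalloc()
a = vm.vmalloc(0x10000, 2 * PAGE_SIZE)
print(vm.find_area_by_page(0x11000) is a)
print(vm.find_area_by_page(0x20000))
```

In this toy model the page-based lookup returns None for an address with no backing page. Whether the real lookup has an equivalent guard depends on the patch, which is not shown in this report.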
6717c15611 genalloc: selftest
eb75544520 struct page: add field for vm_struct
c463d254c1 Pmalloc: self-test
+------------------------------------------+------------+------------+------------+
| | 6717c15611 | eb75544520 | c463d254c1 |
+------------------------------------------+------------+------------+------------+
| boot_successes | 36 | 0 | 0 |
| boot_failures | 0 | 41 | 13 |
| BUG:unable_to_handle_kernel | 0 | 41 | 13 |
| Oops:#[##] | 0 | 41 | 13 |
| RIP:find_vm_area | 0 | 41 | 13 |
| Kernel_panic-not_syncing:Fatal_exception | 0 | 41 | 13 |
+------------------------------------------+------------+------------+------------+
[ 5.529354] 00:06: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 115200) is a 16550A
[ 5.531437] Initializing Nozomi driver 2.1d
[ 5.532196] Applicom driver: $Id: ac.c,v 1.30 2000/03/22 16:03:57 dwmw2 Exp $
[ 5.533309] ac.o: No PCI boards found.
[ 5.533914] ac.o: For an ISA board you must supply memory and irq parameters.
[ 5.535263] BUG: unable to handle kernel paging request at 0000000003fb4008
[ 5.536085] IP: find_vm_area+0x2b/0x30
[ 5.536085] PGD 0 P4D 0
[ 5.536085] Oops: 0000 [#1] PREEMPT SMP PTI
[ 5.536085] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.15.0-09942-geb75544 #192
[ 5.536085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 5.536085] RIP: 0010:find_vm_area+0x2b/0x30
[ 5.536085] RSP: 0000:ffffb23d0000bc20 EFLAGS: 00010206
[ 5.536085] RAX: 0000000003fb4000 RBX: ffffb23d00105000 RCX: 00000000000001f0
[ 5.536085] RDX: 0000000000000000 RSI: 0000000000027063 RDI: 80000000fed00073
[ 5.536085] RBP: ffffb23d0000bd18 R08: 00003ffffffff000 R09: 00000000ddcb162f
[ 5.536085] R10: 00000000000fed00 R11: 00000000000fed00 R12: ffff96b3d9e257c0
[ 5.536085] R13: ffff96b3d9e257e8 R14: 0000000000000000 R15: 0000000000000000
[ 5.536085] FS: 0000000000000000(0000) GS:ffff96b3dd600000(0000) knlGS:0000000000000000
[ 5.536085] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.536085] CR2: 0000000003fb4008 CR3: 0000000008816000 CR4: 00000000000006e0
[ 5.536085] Call Trace:
[ 5.536085] iounmap+0x50/0xaa
[ 5.536085] hpet_resources+0x43/0x9c
[ 5.536085] hpet_resources+0x5e/0x6b
[ 5.536085] ? acpi_ut_update_ref_count+0x7f/0x2da
[ 5.536085] ? acpi_ut_update_object_reference+0x119/0x187
[ 5.536085] ? hpet_resources+0x9c/0x9c
[ 5.536085] acpi_walk_resource_buffer+0x43/0x7a
[ 5.536085] ? hpet_resources+0x9c/0x9c
[ 5.536085] acpi_walk_resources+0x8b/0xad
[ 5.536085] hpet_acpi_add+0x34/0x6e
[ 5.536085] acpi_device_probe+0x48/0xf3
[ 5.536085] driver_probe_device+0x153/0x2b3
[ 5.536085] __driver_attach+0x69/0x89
[ 5.536085] ? driver_probe_device+0x2b3/0x2b3
[ 5.536085] bus_for_each_dev+0x5e/0x7c
[ 5.536085] bus_add_driver+0xe4/0x1bd
[ 5.536085] ? set_debug_rodata+0xc/0xc
[ 5.536085] driver_register+0x7d/0xaf
[ 5.536085] hpet_init+0x35/0x60
[ 5.536085] ? hpet_mmap_enable+0x40/0x40
[ 5.536085] do_one_initcall+0x83/0x118
[ 5.536085] ? set_debug_rodata+0xc/0xc
[ 5.536085] kernel_init_freeable+0x19a/0x218
[ 5.536085] ? rest_init+0x134/0x134
[ 5.536085] kernel_init+0x5/0xe1
[ 5.536085] ret_from_fork+0x35/0x40
[ 5.536085] Code: 48 8b 05 37 ab 6f 01 48 39 c7 73 03 31 c0 c3 48 ba ff ff ff ff ff 1f 00 00 48 01 d0 48 39 c7 73 eb e8 f6 f5 ff ff 48 85 c0 74 e1 <48> 8b 40 08 c3 55 53 31 d2 be d5 05 00 00 48 89 fb 48 c7 c7 17
[ 5.536085] RIP: find_vm_area+0x2b/0x30 RSP: ffffb23d0000bc20
[ 5.536085] CR2: 0000000003fb4008
[ 5.536085] ---[ end trace 18d603729549dc6e ]---
[ 5.536085] Kernel panic - not syncing: Fatal exception
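A quick consistency check on the oops above: the bytes marked `<48> 8b 40 08` in the Code line appear to encode `mov 0x8(%rax),%rax` (my reading of the x86-64 encoding, not something the report states), so the faulting address in CR2 should equal RAX plus 8. The sketch below checks that against the logged values; the helper names `register` and `trapping_bytes` are hypothetical.

```python
import re

# Lines copied from the oops above.
OOPS = """\
[ 5.536085] RAX: 0000000003fb4000 RBX: ffffb23d00105000 RCX: 00000000000001f0
[ 5.536085] CR2: 0000000003fb4008 CR3: 0000000008816000 CR4: 00000000000006e0
[ 5.536085] Code: 48 8b 05 37 ab 6f 01 48 39 c7 73 03 31 c0 c3 48 ba ff ff ff ff ff 1f 00 00 48 01 d0 48 39 c7 73 eb e8 f6 f5 ff ff 48 85 c0 74 e1 <48> 8b 40 08 c3 55 53 31 d2 be d5 05 00 00 48 89 fb 48 c7 c7 17
"""

def register(text, name):
    """Return the hex value logged for a register such as RAX or CR2."""
    m = re.search(rf"\b{name}: ([0-9a-f]+)", text)
    if m is None:
        raise ValueError(f"register {name} not found")
    return int(m.group(1), 16)

def trapping_bytes(text, count):
    """Return `count` code bytes starting at the one marked with <..>."""
    code = re.search(r"Code: (.*)", text).group(1).split()
    start = next(i for i, b in enumerate(code) if b.startswith("<"))
    return [int(b.strip("<>"), 16) for b in code[start:start + count]]

rax = register(OOPS, "RAX")
cr2 = register(OOPS, "CR2")
insn = trapping_bytes(OOPS, 4)   # REX.W prefix, opcode, ModRM, 8-bit displacement
disp = insn[3]

print([hex(b) for b in insn])
print(hex(cr2 - rax))            # 0x8
```

If that decoding is right, find_vm_area faulted while reading the field at offset 8 of a pointer whose value (0x3fb4000) is not a kernel address.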
# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start c463d254c109fedb4dcdf7ec4a1c9da2960f3063 4bf772b14675411a69b3c807f73006de0fe4b649 --
git bisect bad eb755445201ed5425b5b12ecffec44ba8fd02a54 # 13:41 B 0 1 16 2 struct page: add field for vm_struct
git bisect good 4738e41690f2ad4956f2178fb50119f3f533012f # 13:52 G 11 0 0 0 genalloc: track beginning of allocations
git bisect good 6717c15611ae0926f6f9c466247c32c06d82ae00 # 14:02 G 11 0 0 0 genalloc: selftest
# first bad commit: [eb755445201ed5425b5b12ecffec44ba8fd02a54] struct page: add field for vm_struct
git bisect good 6717c15611ae0926f6f9c466247c32c06d82ae00 # 14:09 G 31 0 0 0 genalloc: selftest
# extra tests on HEAD of linux-review/Igor-Stoppa/mm-security-ro-protection-for-dynamic-data/20180202-123437
git bisect bad c463d254c109fedb4dcdf7ec4a1c9da2960f3063 # 14:09 B 0 13 29 0 Pmalloc: self-test
# extra tests on tree/branch linux-review/Igor-Stoppa/mm-security-ro-protection-for-dynamic-data/20180202-123437
git bisect bad c463d254c109fedb4dcdf7ec4a1c9da2960f3063 # 14:10 B 0 13 29 0 Pmalloc: self-test
# extra tests with first bad commit reverted
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation
[fw_cfg] c8bf448ff3: kernel_BUG_at_arch/x86/mm/physaddr.c
by kernel test robot
FYI, we noticed the following commit (built with gcc-7):
commit: c8bf448ff3899860de51fbae61a43619c912ddf2 ("fw_cfg: do DMA read operation")
https://git.kernel.org/cgit/linux/kernel/git/mst/vhost.git vhost
in testcase: boot
on test machine: qemu-system-x86_64 -enable-kvm -m 420M
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+------------------------------------------+------------+------------+
| | b4b818b1f7 | c8bf448ff3 |
+------------------------------------------+------------+------------+
| boot_successes | 8 | 0 |
| boot_failures | 0 | 8 |
| kernel_BUG_at_arch/x86/mm/physaddr.c | 0 | 8 |
| invalid_opcode:#[##] | 0 | 8 |
| RIP:__phys_addr | 0 | 8 |
| Kernel_panic-not_syncing:Fatal_exception | 0 | 8 |
+------------------------------------------+------------+------------+
[ 19.254526] kernel BUG at arch/x86/mm/physaddr.c:27!
[ 19.255580] invalid opcode: 0000 [#1]
[ 19.256147] Modules linked in:
[ 19.256561] CPU: 0 PID: 1 Comm: swapper Not tainted 4.15.0-00020-gc8bf448 #1
[ 19.256561] RIP: 0010:__phys_addr+0x4f/0x90
[ 19.256561] RSP: 0000:ffffc9000000bc50 EFLAGS: 00010287
[ 19.256561] RAX: 0000780000000000 RBX: ffff880017c5ff20 RCX: ffff880017c5ff20
[ 19.256561] RDX: 0000000080000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 19.256561] RBP: ffffc9000000bc50 R08: ffff88000002e750 R09: 0000000000000000
[ 19.256561] R10: ffff880017c5ff20 R11: 0000000000000000 R12: 0000000004000000
[ 19.256561] R13: 0000000000000004 R14: 0000000000000000 R15: 0000000004000000
[ 19.256561] FS: 0000000000000000(0000) GS:ffffffff82849000(0000) knlGS:0000000000000000
[ 19.256561] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 19.256561] CR2: 0000000000000000 CR3: 000000000281b000 CR4: 00000000000006b0
[ 19.256561] Call Trace:
[ 19.256561] fw_cfg_dma_transfer+0x5c/0x130
[ 19.256561] fw_cfg_read_blob+0x132/0x250
[ 19.256561] fw_cfg_sysfs_probe+0x43e/0xa40
[ 19.256561] ? mutex_unlock+0x1d/0x30
[ 19.256561] platform_drv_probe+0x5e/0x130
[ 19.256561] driver_probe_device+0x5c2/0x770
[ 19.256561] __driver_attach+0x14c/0x1d0
[ 19.256561] ? driver_probe_device+0x770/0x770
[ 19.256561] bus_for_each_dev+0xa7/0xf0
[ 19.256561] driver_attach+0x21/0x30
[ 19.256561] bus_add_driver+0x318/0x420
[ 19.256561] ? firmware_map_add_early+0x84/0x84
[ 19.256561] driver_register+0xa7/0x190
[ 19.256561] ? firmware_map_add_early+0x84/0x84
[ 19.256561] __platform_driver_register+0x39/0x50
[ 19.256561] fw_cfg_sysfs_init+0x4e/0x8e
[ 19.256561] ? firmware_map_add_early+0x84/0x84
[ 19.256561] do_one_initcall+0x53/0x285
[ 19.256561] kernel_init_freeable+0x1dc/0x2d8
[ 19.256561] ? rest_init+0x140/0x140
[ 19.256561] kernel_init+0x11/0x1d0
[ 19.256561] ret_from_fork+0x1f/0x30
[ 19.256561] Code: 01 f8 48 39 c2 72 24 0f b6 0d 9a 17 e3 01 48 89 c2 48 83 05 e3 5e fc 01 01 48 d3 ea 48 85 d2 75 0a 48 83 05 db 5e fc 01 01 5d c3 <0f> 0b 48 83 05 d7 5e fc 01 01 48 8b 05 30 54 7c 01 48 83 05 98
[ 19.256561] RIP: __phys_addr+0x4f/0x90 RSP: ffffc9000000bc50
[ 19.285532] ---[ end trace 4c809434fb988277 ]---
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
Thanks,
lkp
d753570949 ("per-cpu free_area list v1"): BUG: Bad page state in process swapper/0 pfn:0ee61
by kernel test robot
Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
aaron/free_area_per_cpu_list
commit d75357094987e78929937ddce021a619f9ba0f3d
Author: Aaron Lu <aaron.lu(a)intel.com>
AuthorDate: Wed Jan 31 14:07:16 2018 +0800
Commit: Aaron Lu <aaron.lu(a)intel.com>
CommitDate: Wed Jan 31 14:07:16 2018 +0800
per-cpu free_area list v1
8cc71e74d1 __free_one_page: skip merge for order-0 page unless compaction is in progress
d753570949 per-cpu free_area list v1
d0376610ff use per cpu
+-------------------------------------------------------+------------+------------+------------+
| | 8cc71e74d1 | d753570949 | d0376610ff |
+-------------------------------------------------------+------------+------------+------------+
| boot_successes | 2 | 0 | 0 |
| boot_failures | 57 | 26 | 21 |
| WARNING:at_arch/x86/mm/dump_pagetables.c:#note_page | 57 | 12 | 2 |
| EIP:note_page | 57 | 12 | 2 |
| Mem-Info | 3 | 1 | |
| WARNING:at_drivers/pci/pci-sysfs.c:#pci_mmap_resource | 1 | | |
| EIP:pci_mmap_resource | 1 | | |
| BUG:Bad_page_state_in_process | 0 | 26 | 21 |
+-------------------------------------------------------+------------+------------+------------+
[ 0.850007] PCI: CLS 0 bytes, default 64
[ 0.851072] Unpacking initramfs...
[ 1.021516] Freeing initrd memory: 2140K
[ 1.024267] Machine check injector initialized
[ 1.025382] Scanning for low memory corruption every 60 seconds
[ 1.026444] BUG: Bad page state in process swapper/0 pfn:0ee61
[ 1.027330] page:fef9d2f0 count:0 mapcount:0 mapping:8947b258 index:0x200
[ 1.028020] flags: 0x800000()
[ 1.028020] raw: 00000100 00000200 00800000 ffffffff 00000000 00000100 00000200 00000000
[ 1.028020] page dumped because: non-NULL mapping
[ 1.028020] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc4-00003-gd753570 #547
[ 1.028020] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 1.028020] Call Trace:
[ 1.028020] dump_stack+0x7b/0xaf
[ 1.028020] bad_page+0xda/0xf7
[ 1.028020] check_new_page_bad+0x46/0x48
[ 1.028020] get_page_from_freelist+0xa2d/0xd5e
[ 1.028020] __alloc_pages_nodemask+0x116/0xb57
[ 1.028020] ? __save_stack_trace+0x81/0xc2
[ 1.028020] ? ret_from_fork+0x19/0x24
[ 1.028020] ? mark_held_locks+0x43/0x5c
[ 1.028020] ? new_slab+0x6d/0x2b9
[ 1.028020] new_slab+0xae/0x2b9
[ 1.028020] ? kvm_clock_read+0x1f/0x29
[ 1.028020] ___slab_alloc+0x242/0x351
[ 1.028020] ? crypto_larval_alloc+0x27/0x83
[ 1.028020] ? __lock_is_held+0x30/0x64
[ 1.028020] __slab_alloc+0x30/0x5f
[ 1.028020] ? __slab_alloc+0x30/0x5f
[ 1.028020] ? crypto_larval_alloc+0x27/0x83
[ 1.028020] kmem_cache_alloc_trace+0x70/0x1e5
[ 1.028020] ? crypto_larval_alloc+0x27/0x83
[ 1.028020] crypto_larval_alloc+0x27/0x83
[ 1.028020] __crypto_register_alg+0x93/0x154
[ 1.028020] crypto_register_alg+0x27/0x4c
[ 1.028020] ? pt_dump_debug_init+0x2c/0x2c
[ 1.028020] aes_init+0xd/0xf
[ 1.028020] do_one_initcall+0x8b/0x12d
[ 1.028020] ? do_early_param+0x75/0x75
[ 1.028020] kernel_init_freeable+0x181/0x206
[ 1.028020] ? rest_init+0x1ee/0x1ee
[ 1.028020] kernel_init+0x8/0xcb
[ 1.028020] ret_from_fork+0x19/0x24
[ 1.028020] Disabling lock debugging due to kernel taint
[ 1.061998] BUG: Bad page state in process kthreadd pfn:0cee1
[ 1.062834] page:c522119e count:0 mapcount:0 mapping:8947b258 index:0x200
[ 1.063763] flags: 0x800000()
[ 1.064256] raw: 00000100 00000200 00800000 ffffffff 00000000 00000100 00000200 00000000
[ 1.065398] page dumped because: non-NULL mapping
[ 1.065506] CPU: 0 PID: 2 Comm: kthreadd Tainted: G B 4.15.0-rc4-00003-gd753570 #547
[ 1.065506] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 1.065506] Call Trace:
[ 1.065506] dump_stack+0x7b/0xaf
[ 1.065506] bad_page+0xda/0xf7
[ 1.065506] check_new_page_bad+0x46/0x48
[ 1.065506] get_page_from_freelist+0xa2d/0xd5e
[ 1.065506] __alloc_pages_nodemask+0x116/0xb57
[ 1.065506] ? _raw_spin_unlock+0x1d/0x27
[ 1.065506] ? deactivate_slab+0x507/0x53e
[ 1.065506] ? trace_hardirqs_on+0xb/0xd
[ 1.065506] ? new_slab+0x6d/0x2b9
[ 1.065506] new_slab+0xae/0x2b9
[ 1.065506] ___slab_alloc+0x242/0x351
[ 1.065506] ? copy_process+0xec/0x17b3
[ 1.065506] ? fs_reclaim_acquire+0xc/0x2a
[ 1.065506] __slab_alloc+0x30/0x5f
[ 1.065506] ? __slab_alloc+0x30/0x5f
[ 1.065506] ? copy_process+0xec/0x17b3
[ 1.065506] kmem_cache_alloc+0x6d/0x1e7
[ 1.065506] ? copy_process+0xec/0x17b3
[ 1.065506] copy_process+0xec/0x17b3
[ 1.065506] ? __kthread_bind_mask+0x4a/0x4a
[ 1.065506] ? set_next_entity+0x3bf/0x8bd
[ 1.065506] ? put_prev_entity+0x33/0x636
[ 1.065506] ? trace_hardirqs_on+0xb/0xd
[ 1.065506] _do_fork+0x79/0x519
[ 1.065506] ? kthreadd+0x18e/0x1a9
[ 1.065506] ? kthreadd+0x153/0x1a9
[ 1.065506] ? __kthread_bind_mask+0x4a/0x4a
[ 1.065506] kernel_thread+0x1c/0x21
[ 1.065506] kthreadd+0x164/0x1a9
[ 1.065506] ? kthread_stop+0x21a/0x21a
[ 1.065506] ret_from_fork+0x19/0x24
[ 1.088722] The force parameter has not been set to 1. The Iris poweroff handler will not be installed.
[ 1.090122] NatSemi SCx200 Driver
[ 1.090783] spin_lock-torture:--- Start of test [debug]: nwriters_stress=2 nreaders_stress=0 stat_interval=60 verbose=1 shuffle_interval=3 stutter=5 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0
[ 1.093212] spin_lock-torture: Creating torture_shuffle task
[ 1.094121] spin_lock-torture: Creating torture_stutter task
# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start d0376610ff3246a87b792ffac272a32973e42592 1291a0d5049dbc06baaaf66a9ff3f53db493b19b --
git bisect good 8cc71e74d1fb231fedb937ba63baf0bca5ab2103 # 13:12 G 10 0 10 22 __free_one_page: skip merge for order-0 page unless compaction is in progress
git bisect bad d75357094987e78929937ddce021a619f9ba0f3d # 13:18 B 0 1 19 5 per-cpu free_area list v1
# first bad commit: [d75357094987e78929937ddce021a619f9ba0f3d] per-cpu free_area list v1
git bisect good 8cc71e74d1fb231fedb937ba63baf0bca5ab2103 # 13:26 G 32 0 32 56 __free_one_page: skip merge for order-0 page unless compaction is in progress
# extra tests with debug options
git bisect bad d75357094987e78929937ddce021a619f9ba0f3d # 13:37 B 0 3 16 0 per-cpu free_area list v1
# extra tests on HEAD of aaron/free_area_per_cpu_list
git bisect bad d0376610ff3246a87b792ffac272a32973e42592 # 13:43 B 0 21 37 0 use per cpu
# extra tests on tree/branch aaron/free_area_per_cpu_list
git bisect bad d0376610ff3246a87b792ffac272a32973e42592 # 13:44 B 0 21 37 0 use per cpu
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation