9f4835fb96 ("x86/fpu: Tighten validation of user-supplied .."): Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
by kernel test robot
Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.x86/fpu
commit 9f4835fb965d8eea7e608d0cb62c246c804dec90
Author: Eric Biggers <ebiggers(a)google.com>
AuthorDate: Fri Sep 22 10:41:55 2017 -0700
Commit: Ingo Molnar <mingo(a)kernel.org>
CommitDate: Sat Sep 23 11:02:00 2017 +0200
x86/fpu: Tighten validation of user-supplied xstate_header
Move validation of user-supplied xstate_headers into a helper function
and call it from both the ptrace and sigreturn syscall paths. The new
function also considers it to be an error if *any* reserved bits are
set, whereas before we were just clearing most of them.
This should reduce the chance of bugs that fail to correctly validate
user-supplied XSAVE areas. It also will expose any broken userspace
programs that set the other reserved bits; this is desirable because
such programs will lose compatibility with future CPUs and kernels if
those bits are ever used for anything. (There shouldn't be any such
programs, and in fact in the case where the compacted format is in use
we were already validating xfeatures. But you never know...)
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Reviewed-by: Rik van Riel <riel(a)redhat.com>
Acked-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Fenghua Yu <fenghua.yu(a)intel.com>
Cc: Kevin Hao <haokexin(a)gmail.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Michael Halcrow <mhalcrow(a)google.com>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Wanpeng Li <wanpeng.li(a)hotmail.com>
Cc: Yu-cheng Yu <yu-cheng.yu(a)intel.com>
Cc: kernel-hardening(a)lists.openwall.com
Link: http://lkml.kernel.org/r/20170922174156.16780-3-ebiggers3@gmail.com
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
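For readers following the commit message above, a minimal userspace sketch of this kind of strict header check might look like the following. The struct, the helper name validate_header_sketch() and the supported_mask parameter are illustrative assumptions, not the kernel's actual code; the 64-byte layout (two 64-bit fields followed by 48 reserved bytes) follows the XSAVE header format. Requiring xcomp_bv to be zero is an assumption based on the preceding "bogus xcomp_bv" commit.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative copy of the 64-byte XSAVE header layout; not the kernel's definition. */
struct xstate_header_sketch {
	uint64_t xfeatures;    /* XSTATE_BV: which state components are present */
	uint64_t xcomp_bv;     /* XCOMP_BV: non-zero only for the compacted format */
	uint64_t reserved[6];  /* 48 reserved bytes, expected to be zero */
};

/* Hypothetical helper: returns 0 if the header is acceptable, -EINVAL otherwise. */
static int validate_header_sketch(const struct xstate_header_sketch *hdr,
				  uint64_t supported_mask)
{
	size_t i;

	/* Reject feature bits that the caller-supplied mask does not allow. */
	if (hdr->xfeatures & ~supported_mask)
		return -EINVAL;

	/* Assumption: user-supplied buffers use the standard (non-compacted) format. */
	if (hdr->xcomp_bv != 0)
		return -EINVAL;

	/* Any set reserved bit is an error, rather than being silently cleared. */
	for (i = 0; i < sizeof(hdr->reserved) / sizeof(hdr->reserved[0]); i++)
		if (hdr->reserved[i] != 0)
			return -EINVAL;

	return 0;
}
```

The point of rejecting instead of clearing is that a broken caller gets an error at once, which matches the sigreturn failures seen in the reports below.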
29ed270cd3 x86/fpu: Don't let userspace set bogus xcomp_bv
9f4835fb96 x86/fpu: Tighten validation of user-supplied xstate_header
8d3e268d89 x86/fpu: Rename fpu__activate_fpstate_read/write() to fpu__read/write()
e7c6e36753 Merge branch 'x86/urgent'
+-----------------------------------------------------------+------------+------------+------------+------------+
| | 29ed270cd3 | 9f4835fb96 | 8d3e268d89 | e7c6e36753 |
+-----------------------------------------------------------+------------+------------+------------+------------+
| boot_successes | 35 | 2 | 6 | 0 |
| boot_failures | 0 | 13 | 13 | 11 |
| Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode= | 0 | 13 | 13 | 11 |
+-----------------------------------------------------------+------------+------------+------------+------------+
[ 1.610349]
[ 1.611017] ======================================================
[ 1.611575] WARNING: possible circular locking dependency detected
[ 1.612125] 4.14.0-rc1-00218-g9f4835f #1 Not tainted
[ 1.612762] ------------------------------------------------------
[ 1.613483] kworker/0:1/13 is trying to acquire lock:
[ 1.613483] (ww_class_mutex){+.+.}, at: [<81151595>] test_abba_work+0xea/0x571
[ 1.613483]
[ 1.613483] but now in release context of a crosslock acquired at the following:
[ 1.613483] ((complete)&abba.b_ready){+.+.}, at: [<83104c1c>] wait_for_completion+0x25/0x35
[ 1.613483]
[ 1.613483] which lock already depends on the new lock.
[ 1.613483]
[ 1.613483] the existing dependency chain (in reverse order) is:
[ 1.613483]
[ 1.613483] -> #1 ((complete)&abba.b_ready){+.+.}:
[ 1.613483] validate_chain+0xf47/0x1171
# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start f8fce8fa419bb00ed5a5d6e91abe6dbed75f5842 2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e --
git bisect good 330ac28434f18e4dfc62985e9d2ed5119c224781 # 23:44 G 11 0 0 0 Merge 'rdma/k.o/net-next-base' into devel-spot-201709232001
git bisect good 2cf018879b36a0d3681086cfc1c08c6cc9bef52a # 00:58 G 11 0 0 0 Merge 'linux-review/Thiebaud-Weksteen/Call-GetEventLog-before-ExitBootServices/20170923-004848' into devel-spot-201709232001
git bisect good 422c87daea34f0298708f6afdf4591e5a0f9b9ea # 01:13 G 10 0 0 0 Merge 'linux-review/Colin-King/video-fbdev-radeon-make-const-array-post_divs-static-reduces-object-code-size/20170922-203140' into devel-spot-201709232001
git bisect good 3303d4863ae6dd72e2481abfd247e127933a5631 # 01:31 G 11 0 0 0 Merge 'ceph-client/testing' into devel-spot-201709232001
git bisect bad 5310cfb68118cd2970a7e8b6d4693c23c2535564 # 01:50 B 0 3 15 0 Merge 'anholt/bcm2835-soc-next-v2' into devel-spot-201709232001
git bisect bad c346b48b4f79509e371f96aafb72f40f60810571 # 02:13 B 0 3 15 0 Merge 'tip/WIP.x86/fpu' into devel-spot-201709232001
git bisect good 1a4a586e67792afc4b3a070ce64e0aa7b1cd5bc0 # 02:40 G 11 0 0 0 x86/fpu: Remove 'kbuf' parameter from the copy_user_to_xstate() API
git bisect good 9e7deb522d8fa604f687b61dcd4c13358df9c753 # 03:34 G 11 0 0 0 x86/fpu: Decouple fpregs_activate()/fpregs_deactivate() from fpu->fpregs_active
git bisect good e9758265c677494bb8c532520cb950b14cf8709a # 03:55 G 11 0 0 0 x86/fpu: Fix boolreturn.cocci warnings
git bisect good 29ed270cd32335003f65dae9a6981c7819f3467c # 04:11 G 11 0 0 0 x86/fpu: Don't let userspace set bogus xcomp_bv
git bisect bad 9f4835fb965d8eea7e608d0cb62c246c804dec90 # 04:27 B 0 11 23 0 x86/fpu: Tighten validation of user-supplied xstate_header
# first bad commit: [9f4835fb965d8eea7e608d0cb62c246c804dec90] x86/fpu: Tighten validation of user-supplied xstate_header
git bisect good 29ed270cd32335003f65dae9a6981c7819f3467c # 04:34 G 31 0 0 0 x86/fpu: Don't let userspace set bogus xcomp_bv
# extra tests with CONFIG_DEBUG_INFO_REDUCED
git bisect bad 9f4835fb965d8eea7e608d0cb62c246c804dec90 # 04:51 B 0 11 23 0 x86/fpu: Tighten validation of user-supplied xstate_header
# extra tests on HEAD of linux-devel/devel-spot-201709232001
git bisect bad f8fce8fa419bb00ed5a5d6e91abe6dbed75f5842 # 04:51 B 0 31 51 4 0day head guard for 'devel-spot-201709232001'
# extra tests on tree/branch tip/WIP.x86/fpu
git bisect bad 8d3e268d89523abba613763da67c7eb47a744ad7 # 05:41 B 0 10 22 0 x86/fpu: Rename fpu__activate_fpstate_read/write() to fpu__read/write()
# extra tests with first bad commit reverted
git bisect good ab2a8bbacf8d609fb05ea05464eb6a00747a9459 # 06:05 G 11 0 0 0 Revert "x86/fpu: Tighten validation of user-supplied xstate_header"
# extra tests on tree/branch tip/master
git bisect bad e7c6e36753316c8dee2a7fe939db0c3046c5f357 # 06:36 B 0 11 23 0 Merge branch 'x86/urgent'
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation
[lkp-robot] [x86/fpu] 14e633085a: Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode=
by kernel test robot
FYI, we noticed the following commit:
commit: 14e633085ab2716f757f0c3d994efe14e5fe604e ("x86/fpu: don't let userspace set bogus xcomp_bv")
url: https://github.com/0day-ci/linux/commits/Eric-Biggers/x86-fpu-prevent-lea...
in testcase: boot
on test machine: qemu-system-x86_64 -enable-kvm -m 420M
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+-----------------------------------------------------------+------------+------------+
| | dc1fb16d0a | 14e633085a |
+-----------------------------------------------------------+------------+------------+
| boot_successes | 10 | 0 |
| boot_failures | 0 | 4 |
| Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode= | 0 | 4 |
+-----------------------------------------------------------+------------+------------+
[ 6.272786] init: Console is alive
[ 6.273405] init: - watchdog -
[ 6.275390] kmodloader (122) used greatest stack depth: 14352 bytes left
[ 7.274437] init: - preinit -
[ 7.286014] init[1] bad frame in 32bit sigreturn frame:00000000fff5eb2c ip:f7f369b5 sp:fff5f08c orax:ffffffffffffffff in libuClibc-0.9.33.2.so[f7f2c000+4f000]
[ 7.288482] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 7.288482]
[ 7.289898] CPU: 0 PID: 1 Comm: init Not tainted 4.14.0-rc1-00021-g14e6330 #38
[ 7.290988] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 7.292562] Call Trace:
[ 7.292960] dump_stack+0x61/0x7e
[ 7.293485] panic+0xd3/0x20f
[ 7.293967] do_exit+0x4f2/0x983
[ 7.294440] do_group_exit+0x45/0xb0
[ 7.294966] get_signal+0x4b8/0x4e4
[ 7.295483] do_signal+0x23/0x5bc
[ 7.295964] ? force_sig_info+0xc6/0xd5
[ 7.296521] ? force_sig+0x11/0x13
[ 7.297027] ? signal_fault+0xb8/0xc1
[ 7.297557] exit_to_usermode_loop+0x3a/0x72
[ 7.298178] do_int80_syscall_32+0xe9/0xfe
[ 7.298784] entry_INT80_compat+0x2a/0x40
[ 7.299374] RIP: 0023:0xf7f369b5
[ 7.299848] RSP: 002b:00000000fff5f08c EFLAGS: 00000246
[ 7.300587] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 00000000f7fb670c
[ 7.301609] RDX: 000000000000000a RSI: 0000000000000f9f RDI: 0000000000000fa0
[ 7.302624] RBP: 00000000fff5f0f8 R08: 0000000000000000 R09: 0000000000000000
[ 7.303631] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 7.304666] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 7.305677] Kernel Offset: 0x7000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Elapsed time: 10
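As a reading aid for the panic line above: the exitcode value appears to use the same encoding as a wait(2) status, so it can be decoded with the standard macros. The sketch below assumes that encoding; terminating_signal() is a hypothetical helper name.

```c
#include <signal.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: returns the terminating signal encoded in a
 * wait-style status word, or 0 if the status does not describe a
 * signal death.
 */
static int terminating_signal(int status)
{
	return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux, terminating_signal(0x0000000b) yields 11, which is SIGSEGV. That is consistent with the "bad frame in 32bit sigreturn" message, after which the kernel forces a SIGSEGV on the task.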
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
Thanks,
Xiaolong
[lkp-robot] [cpufreq] aa7519af45: [No primary change] pm-qa.time.involuntary_context_switches +1204%
by kernel test robot
Greetings,
There is no primary KPI change in this test; below is the data collected by multiple monitors running in the background, just for your information.
commit: aa7519af450d3c62a057aece24877c34562fa25a ("cpufreq: Use transition_delay_us for legacy governors as well")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: pm-qa
on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
with following parameters:
test: cpuhotplug
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
testcase/path_params/tbox_group/run: pm-qa/cpuhotplug/lkp-bdw-ep3d
2d045036322c29b6 aa7519af450d3c62a057aece24
---------------- --------------------------
%stddev change %stddev
\ | \
13667 1204% 178195 pm-qa.time.involuntary_context_switches
24.71 -6% 23.12 pm-qa.time.system_time
pm-qa.time.involuntary_context_switches
[ASCII plot omitted: y-axis from 0 to 180000; bisect-bad samples lie mostly along the 180000 line, bisect-good samples near the bottom of the scale]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong
[lkp-robot] [x86/fpu] 29ed270cd3: BUG:KASAN:slab-out-of-bounds
by kernel test robot
FYI, we noticed the following commit:
commit: 29ed270cd32335003f65dae9a6981c7819f3467c ("x86/fpu: Don't let userspace set bogus xcomp_bv")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git WIP.x86/fpu
in testcase: trinity
with following parameters:
runtime: 300s
test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/
on test machine: qemu-system-x86_64 -enable-kvm -smp 2 -m 512M
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+-----------------------------------------------------------+------------+------------+
| | 331ac0b067 | 29ed270cd3 |
+-----------------------------------------------------------+------------+------------+
| boot_successes | 10 | 0 |
| boot_failures | 2 | 12 |
| IP-Config:Auto-configuration_of_network_failed | 2 | 2 |
| BUG:kernel_hang_in_test_stage | 0 | 2 |
| BUG:KASAN:slab-out-of-bounds | 0 | 8 |
| Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode= | 0 | 8 |
+-----------------------------------------------------------+------------+------------+
[ 26.224248] BUG: KASAN: slab-out-of-bounds in __fpu__restore_sig+0xf34/0x1050
[ 26.225765] Read of size 8 at addr ffff880019200b88 by task init/1
[ 26.226982]
[ 26.227408] CPU: 0 PID: 1 Comm: init Not tainted 4.14.0-rc1-00217-g29ed270 #1
[ 26.228807] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 26.230764] Call Trace:
[ 26.231354] dump_stack+0xb9/0xfa
[ 26.232095] print_address_description+0x95/0x3f0
[ 26.233066] ? __fpu__restore_sig+0xf34/0x1050
[ 26.233986] kasan_report+0x1ee/0x3d0
[ 26.234789] __asan_report_load8_noabort+0x29/0x40
[ 26.235775] __fpu__restore_sig+0xf34/0x1050
[ 26.236674] ? save_fsave_header+0x1c0/0x1c0
[ 26.237571] ? do_signal+0x49b/0x1df0
[ 26.238364] ? ep_poll_readyevents_proc+0x90/0x90
[ 26.239356] ? __might_sleep+0xb2/0x1e0
[ 26.240178] ? recalc_sigpending+0x23/0xc0
[ 26.241044] ? __set_task_blocked+0xdc/0x220
[ 26.241944] ? retarget_shared_pending+0x250/0x250
[ 26.242940] ? _copy_to_user+0xc5/0xf0
[ 26.243745] fpu__restore_sig+0xa1/0x120
[ 26.244585] ? __set_current_blocked+0xee/0x130
[ 26.245535] ia32_restore_sigcontext+0x48f/0x560
[ 26.246478] sys32_sigreturn+0x23a/0x2e0
[ 26.247311] ? get_sigframe+0x6b0/0x6b0
[ 26.248414] ? exit_to_usermode_loop+0x107/0x190
[ 26.249387] ? get_sigframe+0x6b0/0x6b0
[ 26.250496] do_int80_syscall_32+0x1d1/0x520
[ 26.251396] entry_INT80_compat+0x2d/0x40
[ 26.252244] RIP: 0023:0xf7ed99b5
[ 26.252965] RSP: 002b:00000000ff9139bc EFLAGS: 00000246
[ 26.254031] RAX: 00000000fffffffc RBX: 0000000000000004 RCX: 00000000f7f5970c
[ 26.255413] RDX: 000000000000000a RSI: 0000000000000f9c RDI: 0000000000000fa0
[ 26.256789] RBP: 00000000ff913a28 R08: 0000000000000000 R09: 0000000000000000
[ 26.258162] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 26.259566] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 26.260947]
[ 26.261360] Allocated by task 0:
[ 26.262081] save_stack_trace+0x23/0x30
[ 26.262900] save_stack+0x4e/0x170
[ 26.263645] kasan_kmalloc+0xd5/0x130
[ 26.264428] kasan_slab_alloc+0x1a/0x30
[ 26.265254] kmem_cache_alloc_node+0x18d/0x330
[ 26.266173] copy_process+0x28a/0x4c50
[ 26.266980] _do_fork+0xf5/0x9c0
[ 26.267700] kernel_thread+0x31/0x40
[ 26.268468] rest_init+0x30/0x170
[ 26.269279] start_kernel+0x82e/0x859
[ 26.270077] x86_64_start_reservations+0x40/0x49
[ 26.271033] x86_64_start_kernel+0xc3/0xcd
[ 26.271892] verify_cpu+0x0/0xfb
[ 26.272608]
[ 26.273069] Freed by task 0:
[ 26.273725] (stack is not available)
[ 26.274488]
[ 26.274909] The buggy address belongs to the object at ffff880019200040
[ 26.274909] which belongs to the cache task_struct of size 2880
[ 26.277272] The buggy address is located 8 bytes to the right of
[ 26.277272] 2880-byte region [ffff880019200040, ffff880019200b80)
[ 26.279629] The buggy address belongs to the page:
[ 26.280602] page:ffffea0000648000 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0
[ 26.282496] flags: 0x80000000008100(slab|head)
[ 26.283422] raw: 0080000000008100 0000000000000000 0000000000000000 0000000100090009
[ 26.284979] raw: ffff880019ad1470 ffffea000064a220 ffff880019815bc0 0000000000000000
[ 26.286543] page dumped because: kasan: bad access detected
[ 26.287644]
[ 26.288057] Memory state around the buggy address:
[ 26.289040] ffff880019200a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 26.290556] ffff880019200b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 26.292010] >ffff880019200b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 26.293465] ^
[ 26.294225] ffff880019200c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 26.295710] ffff880019200c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 26.297260] ==================================================================
[ 26.298719] Disabling lock debugging due to kernel taint
[ 26.299916] init[1] bad frame in 32bit sigreturn frame:00000000ff91342c ip:f7ed99b5 sp:ff9139bc orax:ffffffffffffffff in libuClibc-0.9.33.2.so[f7ecf000+4f000]
[ 26.303002] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 26.303002]
[ 26.304893] CPU: 0 PID: 1 Comm: init Tainted: G B 4.14.0-rc1-00217-g29ed270 #1
[ 26.306554] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 26.308508] Call Trace:
[ 26.309095] dump_stack+0xb9/0xfa
[ 26.309946] panic+0x1d9/0x3ee
[ 26.310637] ? __warn+0x212/0x212
[ 26.311360] ? perf_event_exit_task+0x88c/0xbe0
[ 26.312305] ? preempt_count_add+0x1bf/0x210
[ 26.313205] do_exit+0x2785/0x32e0
[ 26.313956] ? __sigqueue_free+0x9b/0xc0
[ 26.314789] ? kmem_cache_free+0x91/0x300
[ 26.315646] ? is_current_pgrp_orphaned+0xc0/0xc0
[ 26.316612] ? __sigqueue_free+0x9b/0xc0
[ 26.317438] ? __dequeue_signal+0x349/0x790
[ 26.318328] ? try_to_wake_up+0xcd/0x1220
[ 26.319185] do_group_exit+0xf7/0x360
[ 26.320204] get_signal+0x67c/0x1330
[ 26.320990] do_signal+0x9d/0x1df0
[ 26.321745] ? __send_signal+0x63f/0xc30
[ 26.322580] ? vprintk_default+0x27/0x40
[ 26.323404] ? setup_sigcontext+0x8e0/0x8e0
[ 26.324289] ? _raw_spin_unlock_irqrestore+0x73/0xd0
[ 26.325315] ? force_sig_info+0x268/0x340
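The "8 bytes to the right" figure in the KASAN report above can be checked by hand from the addresses it prints. A minimal sketch of that arithmetic, with bytes_past_end() as a hypothetical helper name:

```c
#include <stdint.h>

/*
 * Hypothetical helper: distance in bytes from the end of the object
 * [start, start + size) to a later address.
 */
static uint64_t bytes_past_end(uint64_t start, uint64_t size, uint64_t addr)
{
	return addr - (start + size);
}
```

With start 0xffff880019200040, size 2880 (0xb40) and the faulting address 0xffff880019200b88, the object ends at 0xffff880019200b80 and the result is 8. The reported 8-byte read therefore begins 8 bytes beyond the end of the task_struct object.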
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
Thanks,
Xiaolong
[lkp-robot] [x86/fpu] 5192698d8d: Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode=
by kernel test robot
FYI, we noticed the following commit:
commit: 5192698d8d89d3158e9ed54d1a3b6e7b6daaad3b ("x86/fpu: Don't let userspace set bogus xcomp_bv")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git WIP.x86/fpu
in testcase: boot
on test machine: qemu-system-x86_64 -enable-kvm -m 420M
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+-----------------------------------------------------------+------------+------------+
| | 03eaec81ac | 5192698d8d |
+-----------------------------------------------------------+------------+------------+
| boot_successes | 8 | 0 |
| boot_failures | 0 | 3 |
| Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode= | 0 | 3 |
+-----------------------------------------------------------+------------+------------+
[ 6.342102] init: Console is alive
[ 6.342551] init: - watchdog -
[ 6.343905] kmodloader (122) used greatest stack depth: 14352 bytes left
[ 7.343383] init: - preinit -
[ 7.353277] init[1] bad frame in 32bit sigreturn frame:00000000ff9933ec ip:f7f5d9b5 sp:ff99394c orax:ffffffffffffffff in libuClibc-0.9.33.2.so[f7f53000+4f000]
[ 7.355067] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 7.355067]
[ 7.356164] CPU: 0 PID: 1 Comm: init Not tainted 4.14.0-rc1-00217-g5192698 #1
[ 7.357002] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 7.358202] Call Trace:
[ 7.358506] dump_stack+0x61/0x7e
[ 7.358906] panic+0xd3/0x20f
[ 7.359274] do_exit+0x4f2/0x983
[ 7.359663] do_group_exit+0x45/0xb0
[ 7.360095] get_signal+0x4b8/0x4e4
[ 7.360519] do_signal+0x23/0x5bc
[ 7.360919] ? force_sig_info+0xc6/0xd5
[ 7.361350] ? force_sig+0x11/0x13
[ 7.361662] ? signal_fault+0xb8/0xc1
[ 7.361996] exit_to_usermode_loop+0x3a/0x72
[ 7.362398] do_int80_syscall_32+0xcb/0xe0
[ 7.362771] entry_INT80_compat+0x2a/0x40
[ 7.363140] RIP: 0023:0xf7f5d9b5
[ 7.363437] RSP: 002b:00000000ff99394c EFLAGS: 00000246
[ 7.363907] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 00000000f7fdd70c
[ 7.364548] RDX: 000000000000000a RSI: 0000000000000f9f RDI: 0000000000000fa0
[ 7.365188] RBP: 00000000ff9939b8 R08: 0000000000000000 R09: 0000000000000000
[ 7.365825] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 7.366485] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 7.367183] Kernel Offset: 0x24000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Elapsed time: 10
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
Thanks,
Xiaolong
041cd640b2 ("cgroup: Implement cgroup2 basic CPU usage .."): BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0
by kernel test robot
Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
https://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-cgroup2-cpu-on-basic-acct
commit 041cd640b2f3c5607171c59d8712b503659d21f7
Author: Tejun Heo <tj(a)kernel.org>
AuthorDate: Mon Sep 25 08:12:05 2017 -0700
Commit: Tejun Heo <tj(a)kernel.org>
CommitDate: Mon Sep 25 08:12:05 2017 -0700
cgroup: Implement cgroup2 basic CPU usage accounting
In cgroup1, while cpuacct isn't actually controlling any resources, it
is a separate controller due to combination of two factors -
1. enabling cpu controller has significant side effects, and 2. we
have to pick one of the hierarchies to account CPU usages on. cpuacct
controller is effectively used to designate a hierarchy to track CPU
usages on.
cgroup2's unified hierarchy removes the second reason and we can
account basic CPU usages by default. While we can use cpuacct for
this purpose, both its interface and implementation leave a lot to be
desired - it collects and exposes two sources of truth which don't
agree with each other and some of the exposed statistics don't make
much sense. Also, it propagates all the way up the hierarchy on each
accounting event which is unnecessary.
This patch adds basic resource accounting mechanism to cgroup2's
unified hierarchy and accounts CPU usages using it.
* All accountings are done per-cpu and don't propagate immediately.
It just bumps the per-cgroup per-cpu counters and links to the
parent's updated list if not already on it.
* On a read, the per-cpu counters are collected into the global ones
and then propagated upwards. Only the per-cpu counters which have
changed since the last read are propagated.
* CPU usage stats are collected and shown in "cgroup.stat" with "cpu."
prefix. Total usage is collected from scheduling events. User/sys
breakdown is sourced from tick sampling and adjusted to the usage
using cputime_adjust().
This keeps the accounting side hot path O(1) and per-cpu and the read
side O(nr_updated_since_last_read).
v2: Minor changes and documentation updates as suggested by Waiman and
Roman.
Signed-off-by: Tejun Heo <tj(a)kernel.org>
Acked-by: Peter Zijlstra <peterz(a)infradead.org>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Li Zefan <lizefan(a)huawei.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Waiman Long <longman(a)redhat.com>
Cc: Roman Gushchin <guro(a)fb.com>
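The accounting scheme described in the commit message above can be illustrated with a small single-threaded model. This is only a sketch under simplifying assumptions: there is no locking or real per-cpu storage, every name (group_sketch, account_sketch, read_total_sketch, NR_CPUS_SKETCH) is hypothetical, and the upward propagation is done directly on a read, which may differ from what the patch does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4

struct group_sketch {
	struct group_sketch *parent;
	uint64_t total;                                            /* usage already collected */
	uint64_t pending[NR_CPUS_SKETCH];                          /* per-cpu usage not yet collected */
	struct group_sketch *updated_children[NR_CPUS_SKETCH];     /* children with pending work */
	struct group_sketch *updated_next[NR_CPUS_SKETCH];         /* link in parent's list */
	bool on_list[NR_CPUS_SKETCH];
};

/* Hot path: bump the per-cpu counter and link upward until an ancestor is already linked. */
static void account_sketch(struct group_sketch *g, int cpu, uint64_t delta)
{
	g->pending[cpu] += delta;
	for (; g->parent && !g->on_list[cpu]; g = g->parent) {
		g->on_list[cpu] = true;
		g->updated_next[cpu] = g->parent->updated_children[cpu];
		g->parent->updated_children[cpu] = g;
	}
}

/* Collect pending usage for one cpu from g and its updated descendants only. */
static uint64_t flush_cpu_sketch(struct group_sketch *g, int cpu)
{
	uint64_t delta = g->pending[cpu];
	struct group_sketch *child = g->updated_children[cpu];

	g->pending[cpu] = 0;
	g->updated_children[cpu] = NULL;
	while (child) {
		struct group_sketch *next = child->updated_next[cpu];

		child->on_list[cpu] = false;
		child->updated_next[cpu] = NULL;
		delta += flush_cpu_sketch(child, cpu);
		child = next;
	}
	g->total += delta;
	return delta;
}

/* Read path: flush every cpu, then propagate the collected delta to all ancestors. */
static uint64_t read_total_sketch(struct group_sketch *g)
{
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		uint64_t delta = flush_cpu_sketch(g, cpu);

		for (struct group_sketch *p = g->parent; p; p = p->parent)
			p->total += delta;
	}
	return g->total;
}
```

In this model the accounting side touches only the group's own counter once the group is linked, and a read visits only groups that were updated since the previous read.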
d2cc5ed694 cpuacct: Introduce cgroup_account_cputime[_field]()
041cd640b2 cgroup: Implement cgroup2 basic CPU usage accounting
8e3456f0a9 sched: Implement interface for cgroup unified hierarchy
+-------------------------------------------------------+------------+------------+------------+
| | d2cc5ed694 | 041cd640b2 | 8e3456f0a9 |
+-------------------------------------------------------+------------+------------+------------+
| boot_successes | 188 | 1 | 21 |
| boot_failures | 0 | 1 | 3 |
| BUG:unable_to_handle_kernel | 0 | 1 | 3 |
| Oops:#[##] | 0 | 1 | 3 |
| Kernel_panic-not_syncing:Fatal_exception_in_interrupt | 0 | 1 | 3 |
+-------------------------------------------------------+------------+------------+------------+
[ 0.003004] pid_max: default: 32768 minimum: 301
[ 0.004070] ACPI: Core revision 20170728
[ 0.008575] ACPI: 1 ACPI AML tables successfully acquired and loaded
[ 0.009110] Security Framework initialized
[ 0.010014] SELinux: Initializing.
[ 0.011087] BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0
[ 0.011132] IP: account_system_index_time+0x60/0x90
[ 0.011135] PGD 0 P4D 0
[ 0.011147] Oops: 0000 [#1] SMP
[ 0.011152] Modules linked in:
[ 0.011168] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-rc2-00003-g041cd64 #10
[ 0.011171] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 0.011175] task: ffffffff81e10480 task.stack: ffffffff81e00000
[ 0.011183] RIP: 0010:account_system_index_time+0x60/0x90
[ 0.011184] RSP: 0000:ffff880011e03cb8 EFLAGS: 00010002
[ 0.011188] RAX: ffffffff81ef8800 RBX: ffffffff81e10480 RCX: 0000000000000003
[ 0.011190] RDX: 0000000000000000 RSI: 00000000000f4240 RDI: 0000000000000000
[ 0.011193] RBP: ffff880011e03cc0 R08: 0000000000010000 R09: 0000000000000000
[ 0.011193] R10: 0000000000000020 R11: 0000003b9aca0000 R12: 000000000001c100
[ 0.011195] R13: 0000000000000000 R14: ffffffff81e10480 R15: ffffffff81e03cd8
[ 0.011205] FS: 0000000000000000(0000) GS:ffff880011e00000(0000) knlGS:0000000000000000
[ 0.011207] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.011210] CR2: 00000000000000b0 CR3: 0000000001e09000 CR4: 00000000000006b0
[ 0.011229] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.011231] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 0.011234] Call Trace:
[ 0.011241] <IRQ>
[ 0.011255] account_system_time+0x45/0x60
[ 0.011261] account_process_tick+0x5a/0x140
[ 0.011275] update_process_times+0x22/0x60
[ 0.011289] tick_periodic+0x2b/0x90
[ 0.011291] tick_handle_periodic+0x25/0x70
[ 0.011297] timer_interrupt+0x15/0x20
[ 0.011308] __handle_irq_event_percpu+0x7e/0x1b0
[ 0.011311] handle_irq_event_percpu+0x23/0x60
[ 0.011313] handle_irq_event+0x42/0x70
[ 0.011321] handle_level_irq+0x83/0x100
[ 0.011335] handle_irq+0x6f/0x110
[ 0.011344] do_IRQ+0x46/0xd0
[ 0.011354] common_interrupt+0x9d/0x9d
[ 0.011367] RIP: 0010:native_irq_enable+0x6/0x10
[ 0.011370] RSP: 0000:ffff880011e03f30 EFLAGS: 00000202 ORIG_RAX: ffffffffffffffcf
[ 0.011375] RAX: 0000000000019840 RBX: 0000000000000030 RCX: 0000000000000000
[ 0.011376] RDX: 0000000000000002 RSI: 0000000000200002 RDI: ffff8800119036a4
[ 0.011376] RBP: ffff880011e03f98 R08: 0000000000000000 R09: 0000000000000005
[ 0.011377] R10: 0000000000000020 R11: 0000000000000000 R12: ffffffff81e03cd8
[ 0.011378] R13: ffff880011903600 R14: 0000000000000030 R15: 0000000000000000
[ 0.011385] ? __do_softirq+0x70/0x2ab
[ 0.011395] irq_exit+0xf1/0x100
[ 0.011397] do_IRQ+0x4f/0xd0
[ 0.011399] common_interrupt+0x9d/0x9d
[ 0.011400] </IRQ>
[ 0.011402] RIP: 0010:native_restore_fl+0x6/0x10
[ 0.011402] RSP: 0000:ffffffff81e03d88 EFLAGS: 00000247 ORIG_RAX: ffffffffffffffcf
[ 0.011406] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000ffffffff
[ 0.011409] RDX: ffffffff81e03d50 RSI: 0000000000000004 RDI: 0000000000000247
[ 0.011413] RBP: ffffffff81e03d88 R08: 0000000000000000 R09: 0000000000000000
[ 0.011416] R10: ffff8800000b9d60 R11: 072007200720075b R12: 0000000000000027
[ 0.011418] R13: ffffffff820763e0 R14: 0000000000000000 R15: ffffffff823522a0
[ 0.011440] console_unlock+0x1f5/0x4f0
[ 0.011442] vprintk_emit+0x302/0x3b0
[ 0.011444] vprintk_default+0x1f/0x30
[ 0.011445] vprintk_func+0x27/0x60
[ 0.011447] printk+0x43/0x4b
[ 0.011454] selinux_init+0x51/0x17c
[ 0.011456] security_init+0x4d/0x58
[ 0.011458] start_kernel+0x3be/0x421
[ 0.011463] x86_64_start_reservations+0x2a/0x2c
[ 0.011465] x86_64_start_kernel+0x72/0x75
[ 0.011478] secondary_startup_64+0xa5/0xa5
[ 0.011480] Code: ff ff ff 74 08 f0 48 01 b0 f8 00 00 00 48 63 c1 65 ff 05 74 cf f5 7e 65 48 01 34 c5 c0 de 00 00 48 8b 83 08 0d 00 00 48 8b 78 68 <48> 83 bf b0 00 00 00 00 74 0a 48 89 f2 89 ce e8 bc 21 07 00 48
[ 0.011511] RIP: account_system_index_time+0x60/0x90 RSP: ffff880011e03cb8
[ 0.011511] CR2: 00000000000000b0
[ 0.011540] ---[ end trace be658dd14e22cef1 ]---
[ 0.011547] Kernel panic - not syncing: Fatal exception in interrupt
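A note on reading the oops above: the faulting instruction bytes (48 83 bf b0 00 00 00 00) decode to a compare against memory at offset 0xb0 from RDI, and RDI is 0 in the register dump, so CR2 is simply the field offset applied to a NULL base pointer. A minimal sketch of that relationship, using a made-up struct (obj_sketch) purely to produce a field at offset 0xb0:

```c
#include <stddef.h>
#include <stdint.h>

/* Made-up layout for illustration only: a field that sits 0xb0 bytes into the object. */
struct obj_sketch {
	char pad[0xb0];
	uint64_t field;
};

/* Hypothetical helper: the address touched when reading a field at 'offset' from 'base'. */
static uintptr_t fault_address(uintptr_t base, size_t offset)
{
	return base + offset;
}
```

Here fault_address(0, offsetof(struct obj_sketch, field)) is 0xb0, matching the reported address. Which kernel structure the NULL pointer was supposed to reference is not stated in the report and is not assumed here.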
# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start 8e3456f0a97a8042b0c2485b938d2a62001f0f7c e19b205be43d11bff638cad4487008c48d21c103 --
git bisect good d2cc5ed6949085cfba30ec5228816cf6eb1d02b9 # 02:15 G 47 0 0 0 cpuacct: Introduce cgroup_account_cputime[_field]()
git bisect bad 041cd640b2f3c5607171c59d8712b503659d21f7 # 02:15 B 0 1 17 0 cgroup: Implement cgroup2 basic CPU usage accounting
# first bad commit: [041cd640b2f3c5607171c59d8712b503659d21f7] cgroup: Implement cgroup2 basic CPU usage accounting
git bisect good d2cc5ed6949085cfba30ec5228816cf6eb1d02b9 # 02:44 G 137 0 0 0 cpuacct: Introduce cgroup_account_cputime[_field]()
# extra tests on HEAD of cgroup/review-cgroup2-cpu-on-basic-acct
git bisect bad 8e3456f0a97a8042b0c2485b938d2a62001f0f7c # 02:44 B 20 3 0 0 sched: Implement interface for cgroup unified hierarchy
# extra tests on tree/branch cgroup/review-cgroup2-cpu-on-basic-acct
git bisect bad 8e3456f0a97a8042b0c2485b938d2a62001f0f7c # 02:50 B 20 3 0 0 sched: Implement interface for cgroup unified hierarchy
# extra tests with first bad commit reverted
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation
Re: [LKP] [lkp-robot] [sched/fair] 6d46bd3d97: netperf.Throughput_tps -11.3% regression
by Joel Fernandes
Hi Mike,
On Sun, Sep 17, 2017 at 10:30 PM, Mike Galbraith <efault(a)gmx.de> wrote:
> On Sun, 2017-09-17 at 14:41 -0700, Joel Fernandes wrote:
>> Hi Mike,
>>
>> On Sun, Sep 17, 2017 at 9:47 AM, Mike Galbraith <efault(a)gmx.de> wrote:
>> > On Sat, 2017-09-16 at 23:42 -0700, Joel Fernandes wrote:
>> >>
>> >> Yes I understand. However with my 'strong sync' patch, such a
>> >> balancing check could be useful which is what I was trying to do in a
>> >> different way in my patch - but it could be that my way is not good
>> >> enough and potentially the old wake_affine check could help here
>> >
>> > Help how? The old wake_affine() check contained zero concurrency
>> > information, it served to exclude excessive stacking, defeating the
>> > purpose of SMP. A truly synchronous wakeup has absolutely nothing to
>> > do with load balance in the general case: you can neither generate nor
>> > cure an imbalance by replacing one (nice zero) task with another. The
>> > mere existence of a load based braking mechanism speaks volumes.
>>
>> This is the part I didn't get.. you said "neither generate an
>> imbalance", it is possible that a CPU with a high blocked load but
>> just happen to be running 1 task at the time and did a sync wake up
>> for another task. In this case dragging the task onto the waker's CPU
>> might hurt it since it will face more competition than if it were
>> woken up on its previous CPU which is possibly more lightly loaded than the
>> waker's CPU?
>
> Sure, a preexisting imbalance, but what does that have to do with
> generic pass the baton behavior, and knowing whether pull will result
> in a performance win or loss? In the overloaded and imbalanced case, a
> lower load number on this/that CPU means little in and of itself wrt
> latency expectations... the wakee may or may not be next in line, may
> or may not be able to cut the line (preempt).
In my patch, I used nr_running == 1, so if the waker slept immediately,
then the wakee would be next in line. I guess this is the part I
misunderstood: I assumed that the sync waker sleeps immediately in all
cases, whereas that only holds for the binder use cases. We have
control over the binder userspace bits that sleep immediately since
they are part of the framework - user apps don't decide whether to stay
awake or not - the framework will put them to sleep since any progress
can happen only after waiting for the transaction to return. However,
this assumption may not be true for other users of the regular sync
flag, as you pointed out :-/.
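To make the condition concrete, here is a minimal sketch of the nr_running == 1 check described above, written as a standalone userspace model and not the actual patch. The names rq_model and sync_wakeup_stays_local are hypothetical; only the nr_running == 1 test and the sync hint come from this discussion.

```c
#include <stdbool.h>

/* Hypothetical model of a runqueue: only the field the check needs. */
struct rq_model {
	unsigned int nr_running;	/* runnable tasks, including the waker */
};

/*
 * Return true if a sync wakeup would keep the wakee on the waker's CPU.
 * This is only safe if the waker really sleeps right away: with
 * nr_running == 1 the waker is the sole runnable task, so once it
 * sleeps the wakee is next in line.
 */
static bool sync_wakeup_stays_local(const struct rq_model *waker_rq, bool sync)
{
	return sync && waker_rq->nr_running == 1;
}
```

The check says nothing about whether the waker actually sleeps, which is exactly the assumption that breaks for non-binder users of the sync flag.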
Just for completeness' sake, I wanted to mention that we have a patch
similar to this on the Pixel product, and it works really well for all
the use cases we care about:
https://android.googlesource.com/kernel/msm.git/+/android-msm-marlin-3.18...
It will probably not work well if someone tries to run netperf or
other use cases where the waker doesn't sleep immediately, and it is
definitely not mainlinable considering the wreckage in $SUBJECT.
>
>> Also the other thing I didn't fully get is why is concurrency a
>> discussion point here, in this case A wakes up B and goes to sleep,
>> and then B wakes up A. They never run concurrently. Could you let me
>> know what I am missing?
>
> If waker does not IMMEDIATELY sleep, and you wake CPU affine, you toss
> the time interval between wakeup and context switch out the window, the
> wakee could have already been executing elsewhere when the waker gets
> around to getting the hell out of the way.
>
> The point I'm belaboring here is that we had that load check and more,
> and it proved insufficient. Yeah, pipe-test absolutely loves running
> on one CPU, but who cares, pipe-test is an overhead measurement tool,
> NOT a benchmark, those try to emulate the real world.
Ok.
> Hell, take two large footprint tasks, and let them exchange data
> briefly via pipe.. what would make it a wonderful idea to pull these
> tasks together each and every time they did that? Because pipe-test
> can thus play super fast ping-pong with itself? Obviously not.
Yes, I agree with you.
[..]
>>
>> >> > "Strong sync" wakeups like you propose would also
>> >> > change the semantics of wake_wide() and potentially
>> >> > other bits of code...doesn't make much sense for
>> >> >
>> >>
>> >> I understand, I am not very confident that wake_wide does the right
>> >> thing anyway. At least for Android, wake_wide doesn't seem to mirror
>> >> the most common usecase of display pipeline well. It seems that we
>> >> have cases where the 'flip count' is really high and causes wake_wide
>> >> all the time and sends us straight to the wake up slow path causing
>> >> regressions in Android benchmarks.
>> >
>> > Hm. It didn't pull those counts out of the vacuum, it measured them.
>> > It definitely does not force Android into the full balance path, that
>> > is being done by Android developers, as SD_BALANCE_WAKE is off by
>> > default. It was briefly on by default, but was quickly turned back off
>> > because it... induced performance regressions.
>> >
>> > In any case, if you have cause to believe that wake_wide() is causing
>> > you grief, why the heck are you bending up the sync hint?
>>
>> So it's not just wake_wide causing the grief, even select_idle_sibling
>> doesn't seem to be doing the right thing. We really don't want to wake
>> up a task on an idle CPU if the current CPU is a better candidate.
>
> That is the nut, defining better candidate is not so trivial. Lacking
> concrete knowledge (omniscience?), you're always gambling, betting on
> concurrency potential existing is much safer than betting against it if
> your prime directive is to maximize utilization.
Ok, so I think reusing the sync flag for binder in mainline
doesn't make much sense, and we could instead define a new flag that
indicates the waker will sleep immediately and call it WF_HSYNC (as in
hard sync), since in the binder case we know that the waker is
going to sleep waiting for the transaction to finish. I think the
mistake I made was trying to change the semantics of the original sync
flag, which cannot assume that the waker sleeps immediately.
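A rough sketch of how such a flag could sit next to the existing ones, as a standalone model and not a proposed patch. The WF_SYNC, WF_FORK and WF_MIGRATED values are what I recall from kernel/sched/sched.h around v4.14; the WF_HSYNC value and the hsync_wake_local() helper are hypothetical and picked only for illustration.

```c
#include <stdbool.h>

/* Existing wake flags; values as recalled from kernel/sched/sched.h (~v4.14). */
#define WF_SYNC		0x01	/* waker goes to sleep after wakeup (a hint) */
#define WF_FORK		0x02	/* child wakeup after fork */
#define WF_MIGRATED	0x04	/* internal use, task got migrated */

/* Hypothetical: the flag suggested in this mail, value is an assumption. */
#define WF_HSYNC	0x08	/* waker promises to sleep immediately */

/*
 * Decide whether the wakee may be placed on the waker's CPU right away.
 * Plain WF_SYNC keeps its current soft-hint meaning and never forces
 * this; only the hard flag does, and only when the waker is alone.
 */
static bool hsync_wake_local(int wake_flags, unsigned int waker_nr_running)
{
	if (!(wake_flags & WF_HSYNC))
		return false;		/* WF_SYNC alone stays a soft hint */
	return waker_nr_running == 1;	/* waker is the only runnable task */
}
```

Keeping the two flags separate means existing WF_SYNC users see no behaviour change, and only callers that can really guarantee an immediate sleep (binder here) would opt in.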
>> Binder microbenchmarks show that (as you can see in the results in
>> this patch) it performs better to wake up on the current CPU (no wake
>> up from idle latency on a sibling etc). Perhaps we should fix
>> select_idle_sibling to also consider latency of CPU to come out of
>> idle?
>
> That would add complexity. If you're gonna do that, you may as well go
> the extra mile, nuke it, and unify the full balance path instead.
Ok, I agree it's not a good idea to make the fast path any slower just for this.
>> Using the sync hint was/is a way on the product kernel to
>> prevent both these paths. On current products, sync is ignored though
>> if the system is in an "overutilized" state (any of the CPUs are more
>> than 80% utilized) but otherwise the sync is used as a hard flag. This
>> is all probably wrong though - considering the concurrency point you
>> brought up..
>
> My experience with this space was that it's remarkably easy to wreck
> performance, why I was inspired to bang on podium with shoe. I'm not
> trying to discourage your explorations, merely passing along a well
> meant "Danger Will Robinson".
Thanks for the guidance, and sharing your knowledge/experience about this,
regards,
- Joel
[writeback] e0530257f5: INFO:trying_to_register_non-static_key
by kernel test robot
FYI, we noticed the following commit:
commit: e0530257f5ee716e14559f82a3ac7abd78ec6f0b ("writeback: introduce super_operations->write_metadata")
https://git.kernel.org/cgit/linux/kernel/git/josef/btrfs-next.git new-kill-btree-inode
in testcase: boot
on test machine: qemu-system-x86_64 -enable-kvm -cpu host -smp 2 -m 2G
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+------------------------------------------+------------+------------+
|                                          | 7ef2ff57b2 | e0530257f5 |
+------------------------------------------+------------+------------+
| boot_successes                           | 16         | 4          |
| boot_failures                            | 0          | 12         |
| INFO:trying_to_register_non-static_key   | 0          | 11         |
| BUG:unable_to_handle_kernel              | 0          | 1          |
| Oops:#[##]                               | 0          | 1          |
| Kernel_panic-not_syncing:Fatal_exception | 0          | 1          |
+------------------------------------------+------------+------------+
[ 203.021546] INFO: trying to register non-static key.
[ 203.029973] the code is fine but needs lockdep annotation.
[ 203.037678] turning off the locking correctness validator.
[ 203.047115] CPU: 0 PID: 3907 Comm: mount.nfs Not tainted 4.14.0-rc1-00041-ge053025 #142
[ 203.059329] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 203.075796] Call Trace:
[ 203.079813] dump_stack+0x61/0x7e
[ 203.085737] register_lock_class+0x17f/0x494
[ 203.092420] __lock_acquire+0x9b/0x783
[ 203.099086] lock_acquire+0x53/0x76
[ 203.103179] ? deactivate_locked_super+0x5e/0xb2
[ 203.110146] _raw_spin_lock+0x2f/0x65
[ 203.116973] ? deactivate_locked_super+0x5e/0xb2
[ 203.124145] deactivate_locked_super+0x5e/0xb2
[ 203.132052] deactivate_super+0x33/0x36
[ 203.138072] cleanup_mnt+0x44/0x62
[ 203.143555] __cleanup_mnt+0xd/0xf
[ 203.148933] task_work_run+0x7d/0xa1
[ 203.154597] exit_to_usermode_loop+0x55/0x72
[ 203.161276] syscall_return_slowpath+0x71/0x86
[ 203.167530] entry_SYSCALL_64_fastpath+0xac/0xae
[ 203.174858] RIP: 0033:0x7f87c032298a
[ 203.180073] RSP: 002b:00007ffccfe9b368 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[ 203.191807] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f87c032298a
[ 203.202145] RDX: 000000000061bf30 RSI: 000000000061bf10 RDI: 000000000061bef0
[ 203.210831] RBP: 00007ffccfe9b470 R08: 000000000061c890 R09: 000000000061c890
[ 203.221750] R10: 0000000000000000 R11: 0000000000000202 R12: 00007ffccfe9b470
[ 203.232733] R13: 000000000061c650 R14: 0000000000000010 R15: 000000000061a010
[ 203.252920] mount.nfs (3907) used greatest stack depth: 11744 bytes left
[ 203.521056] run-job /lkp/scheduled/vm-lkp-wsx03-2G-8/boot-1-debian-x86_64-2016-08-31.cgz-e0530257f5ee716e14559f82a3ac7abd78ec6f0b-20170923-22048-1j30qi9-0.yaml
[ 203.521088]
[ 203.948760] /usr/bin/curl -sSf http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/scheduled... -o /dev/null
[ 203.948795]
[ 208.084918] kill 3943 dmesg --follow --decode
[ 208.084949]
[ 208.158474] wait for background monitors: 3946 oom-killer
[ 208.158506]
[ 213.764255] /lkp/lkp/src/bin/post-run: 187: /lkp/lkp/src/bin/post-run: cannot create /inn/result/boot/1/vm-lkp-wsx03-2G/debian-x86_64-2016-08-31.cgz/x86_64-acpi-redef/gcc-6/e0530257f5ee716e14559f82a3ac7abd78ec6f0b/0/time-debug: Input/output error
[ 213.764288]
[ 215.061984] /usr/bin/curl -sSf http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/scheduled... -o /dev/null
[ 215.062019]
[ 215.441280] /usr/bin/curl -sSf http://inn:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/scheduled... -o /dev/null
[ 215.441314]
[ 216.581819] LKP: rebooting
[ 216.581851]
Elapsed time: 230
initrds=(
/osimage/debian/debian-x86_64-2016-08-31.cgz
/lkp/scheduled/vm-lkp-wsx03-2G-8/boot-1-debian-x86_64-2016-08-31.cgz-e0530257f5ee716e14559f82a3ac7abd78ec6f0b-20170923-22048-1j30qi9-0.cgz
/lkp/lkp/lkp-x86_64.cgz
/osimage/deps/debian-x86_64-2016-08-31.cgz/lkp_2017-08-01.cgz
/osimage/deps/debian-x86_64-2016-08-31.cgz/rsync-rootfs_2016-11-15.cgz
/osimage/deps/debian-x86_64-2016-08-31.cgz/run-ipconfig_2016-11-15.cgz
)
cat "${initrds[@]}" > /fs/sdc1/initrd-vm-lkp-wsx03-2G-8
kvm=(
qemu-system-x86_64
-enable-kvm
-cpu host
-kernel /pkg/linux/x86_64-acpi-redef/gcc-6/e0530257f5ee716e14559f82a3ac7abd78ec6f0b/vmlinuz-4.14.0-rc1-00041-ge053025
-initrd /fs/sdc1/initrd-vm-lkp-wsx03-2G-8
-m 2048
-smp 2
-device e1000,netdev=net0
-netdev user,id=net0,hostfwd=tcp::23637-:22
-boot order=nc
-no-reboot
-watchdog i6300esb
-watchdog-action debug
-rtc base=localtime
-drive file=/fs/sdc1/disk0-vm-lkp-wsx03-2G-8,media=disk,if=virtio
-drive file=/fs/sdc1/disk1-vm-lkp-wsx03-2G-8,media=disk,if=virtio
-pidfile /dev/shm/kboot/pid-vm-lkp-wsx03-2G-8
-serial file:/dev/shm/kboot/vm-lkp-wsx03-2G-8/serial
-serial file:/dev/shm/kboot/vm-lkp-wsx03-2G-8/kmsg
-daemonize
-display none
-monitor null
)
append=(
ip=::::vm-lkp-wsx03-2G-8::dhcp
root=/dev/ram0
user=lkp
job=/lkp/scheduled/vm-lkp-wsx03-2G-8/boot-1-debian-x86_64-2016-08-31.cgz-e0530257f5ee716e14559f82a3ac7abd78ec6f0b-20170923-22048-1j30qi9-0.yaml
ARCH=x86_64
kconfig=x86_64-acpi-redef
branch=linux-devel/devel-spot-201709231300
commit=e0530257f5ee716e14559f82a3ac7abd78ec6f0b
BOOT_IMAGE=/pkg/linux/x86_64-acpi-redef/gcc-6/e0530257f5ee716e14559f82a3ac7abd78ec6f0b/vmlinuz-4.14.0-rc1-00041-ge053025
max_uptime=600
RESULT_ROOT=/result/boot/1/vm-lkp-wsx03-2G/debian-x86_64-2016-08-31.cgz/x86_64-acpi-redef/gcc-6/e0530257f5ee716e14559f82a3ac7abd78ec6f0b/0
LKP_SERVER=inn
debug
apic=debug
sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100
net.ifnames=0
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
Thanks,
lkp
[writeback] bee30fb09c: general_protection_fault:#[##]
by kernel test robot
FYI, we noticed the following commit:
commit: bee30fb09ca96981fc19edfa22e10ee9ea0265d5 ("writeback: introduce super_operations->write_metadata")
https://git.kernel.org/cgit/linux/kernel/git/josef/btrfs-next.git new-kill-btree-inode
in testcase: boot
on test machine: qemu-system-x86_64 -enable-kvm -cpu IvyBridge -smp 2 -m 1G
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+-------------------------------------------------+------------+------------+
|                                                 | eecf7db193 | bee30fb09c |
+-------------------------------------------------+------------+------------+
| boot_successes                                  | 45         | 2          |
| boot_failures                                   | 1          | 27         |
| BUG:kernel_reboot-without-warning_in_test_stage | 1          |            |
| BUG:spinlock_bad_magic_on_CPU                   | 0          | 20         |
| general_protection_fault:#[##]                  | 0          | 5          |
| Kernel_panic-not_syncing:Fatal_exception        | 0          | 5          |
| INFO:trying_to_register_non-static_key          | 0          | 2          |
+-------------------------------------------------+------------+------------+
[ 130.931901]
[ 131.051529] RESULT_ROOT=/result/boot/1/vm-ivb41-1G/debian-x86_64-2016-08-31.cgz/x86_64-nfsroot/gcc-6/bee30fb09ca96981fc19edfa22e10ee9ea0265d5/0
[ 131.051585]
[ 131.072270] job=/lkp/scheduled/vm-ivb41-1G-2/boot-1-debian-x86_64-2016-08-31.cgz-bee30fb09ca96981fc19edfa22e10ee9ea0265d5-20170923-92490-1faw20g-0.yaml
[ 131.072302]
[ 131.216278] general protection fault: 0000 [#1] SMP
[ 131.220454] Modules linked in: snd_pcsp
[ 131.224669] CPU: 1 PID: 7434 Comm: mount.nfs Not tainted 4.14.0-rc1-00041-gbee30fb #24
[ 131.239538] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 131.250690] task: ffff897a48218040 task.stack: ffffb4ba83134000
[ 131.255633] RIP: 0010:__lock_acquire+0xcd/0x822
[ 131.259930] RSP: 0018:ffffb4ba83137d78 EFLAGS: 00010002
[ 131.264465] RAX: 6b6b6b6b6b6b6b6b RBX: ffff897a48218040 RCX: ffff897a7a015a68
[ 131.270134] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff897a7a015a68
[ 131.275698] RBP: ffffb4ba83137db8 R08: 0000000000000001 R09: 0000000000000000
[ 131.281488] R10: 0000000000000001 R11: 00000000000057d6 R12: 0000000000000000
[ 131.287149] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000001
[ 131.292708] FS: 00007f2309820880(0000) GS:ffff897a73a00000(0000) knlGS:0000000000000000
[ 131.300377] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 131.305207] CR2: 000055d3fab6a8c8 CR3: 000000003b567000 CR4: 00000000001406e0
[ 131.311209] Call Trace:
[ 131.314283] ? kvm_sched_clock_read+0x9/0x12
[ 131.318426] lock_acquire+0x142/0x1dd
[ 131.322152] ? deactivate_locked_super+0x6c/0xc7
[ 131.326374] _raw_spin_lock+0x34/0x6a
[ 131.330298] ? deactivate_locked_super+0x6c/0xc7
[ 131.334604] deactivate_locked_super+0x6c/0xc7
[ 131.338952] deactivate_super+0x38/0x3b
[ 131.342763] cleanup_mnt+0x49/0x67
[ 131.346347] __cleanup_mnt+0x12/0x14
[ 131.350145] task_work_run+0x82/0xab
[ 131.353815] exit_to_usermode_loop+0x6d/0x99
[ 131.357880] syscall_return_slowpath+0xc1/0xd6
[ 131.362013] entry_SYSCALL_64_fastpath+0xbc/0xbe
[ 131.366239] RIP: 0033:0x7f2308ede98a
[ 131.369893] RSP: 002b:00007fffd4c3a778 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[ 131.377116] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f2308ede98a
[ 131.382645] RDX: 000000000061bf30 RSI: 000000000061bf10 RDI: 000000000061bef0
[ 131.388202] RBP: 00007fffd4c3a880 R08: 000000000061c890 R09: 000000000061c890
[ 131.393759] R10: 0000000000000000 R11: 0000000000000206 R12: 00007fffd4c3a880
[ 131.399318] R13: 000000000061c650 R14: 0000000000000010 R15: 000000000061a010
[ 131.404879] Code: c4 48 89 4d c8 e8 79 b3 ff ff 48 8b 4d c8 48 85 c0 44 8b 4d c4 75 14 45 31 e4 e9 4f 07 00 00 89 f0 48 8b 44 c1 08 48 85 c0 74 cd <f0> ff 80 38 01 00 00 83 3d f5 70 52 02 00 44 8b ab e0 08 00 00
[ 131.419712] RIP: __lock_acquire+0xcd/0x822 RSP: ffffb4ba83137d78
[ 131.424702] ---[ end trace d5fbfc4b8d2d3e93 ]---
[ 131.428920] Kernel panic - not syncing: Fatal exception
[ 131.433651] Kernel Offset: 0xe000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Elapsed time: 140
initrds=(
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
Thanks,
lkp
1ee5b5d213 ("serial: core: Release memory obtained by kasprintf"): BUG: -1 unexpected failures (out of 262) - debugging disabled! |
by kernel test robot
Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
https://github.com/0day-ci/linux/commits/Arvind-Yadav/serial-core-Release...
commit 1ee5b5d213183b5aa42e87c7ce16c7d5de520cb2
Author: Arvind Yadav <arvind.yadav.cs(a)gmail.com>
AuthorDate: Wed Sep 20 12:33:39 2017 +0530
Commit: 0day robot <fengguang.wu(a)intel.com>
CommitDate: Fri Sep 22 13:47:05 2017 +0800
serial: core: Release memory obtained by kasprintf
Free memory region, if uart_add_one_port is not successful.
Signed-off-by: Arvind Yadav <arvind.yadav.cs(a)gmail.com>
be7da1a2b7 serial: imx: default to half duplex rs485
1ee5b5d213 serial: core: Release memory obtained by kasprintf
1ee5b5d213 serial: core: Release memory obtained by kasprintf
+---------------------------------------------------------+------------+------------+------------+
|                                                         | be7da1a2b7 | 1ee5b5d213 | 1ee5b5d213 |
+---------------------------------------------------------+------------+------------+------------+
| boot_successes                                          | 9          | 0          | 0          |
| boot_failures                                           | 24         | 15         | 15         |
| BUG:-#unexpected_failures(out_of#)-debugging_disabled!  | 24         | 15         | 15         |
| WARNING:at_fs/sysfs/dir.c:#sysfs_warn_dup               | 0          | 15         | 15         |
| EIP:sysfs_warn_dup                                      | 0          | 15         | 15         |
| WARNING:at_fs/proc/generic.c:#__proc_create             | 0          | 15         | 15         |
| EIP:__proc_create                                       | 0          | 15         | 15         |
+---------------------------------------------------------+------------+------------+------------+
[ 0.010000] context:failed| ok | ok |
[ 0.010000] try:failed| ok |failed|
[ 0.010000] block:failed| ok |failed|
[ 0.010000] spinlock:failed| ok |failed|
[ 0.010000] -----------------------------------------------------------------
[ 0.010000] BUG: -1 unexpected failures (out of 262) - debugging disabled! |
[ 0.010000] -----------------------------------------------------------------
[ 0.010000] ODEBUG: selftest passed
[ 0.010000] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
[ 0.010000] hpet clockevent registered
[ 0.010006] tsc: Detected 2925.998 MHz processor
# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start 666b090a99a6c17730e76385bb309818aaa39ed7 2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e --
git bisect good dcfb7fbe6f933f00d0eb704ba606dc42c6f2261f # 19:46 G 10 0 7 7 Merge 'nomadik/gemini-defconfig' into devel-hourly-2017092213
git bisect good 5048858ca9a8861977f4e7837e6903727be77c85 # 20:01 G 11 0 5 5 Merge 'linux-review/Jacob-Chen/media-i2c-OV5647-ensure-clock-lane-in-LP-11-state-before-streaming-on/20170913-173307' into devel-hourly-2017092213
git bisect good 2120a2f1001eeec43802302bb50b180cc9a2e8ba # 20:19 G 11 0 6 6 Merge 'linux-review/Naohiro-Aota/btrfs-propagate-error-to-btrfs_cmp_data_prepare-caller/20170911-085221' into devel-hourly-2017092213
git bisect good 428784dfd22d0cf01e45ed75d39cc9d40bfd954d # 20:35 G 11 0 4 4 Merge 'linux-review/Prateek-Sood/cgroup-cpuset-remove-circular-dependency-deadlock/20170910-055343' into devel-hourly-2017092213
git bisect good 9ca6b36be4830bcd6ad8952d0437661f83639e2e # 20:56 G 11 0 8 12 Merge 'linux-review/Dan-Carpenter/staging-iio-tsl2x7x-clean-up-limit-checks/20170909-162850' into devel-hourly-2017092213
git bisect good 82d52ec780bf4a3b1634d6cfdbec1b2bc4dd7ee9 # 21:10 G 11 0 7 9 Merge 'ipmi/for-next' into devel-hourly-2017092213
git bisect good adbc127857229a108b740e0e77226fe0c2e3cf16 # 21:26 G 11 0 6 8 Merge 'kdave-btrfs-devel/for-next-20170914' into devel-hourly-2017092213
git bisect bad 4efc234a90a50a6dc1d3af9dd96ce294fab9aee8 # 21:40 B 0 1 14 0 Merge 'linux-review/Haishuang-Yan/ipv4-Namespaceify-tcp_fastopen-knob/20170922-130021' into devel-hourly-2017092213
git bisect good 4e1a95a48120de1ad2a9d3e477d84b5efefab820 # 22:09 G 11 0 8 12 Merge 'kdave-btrfs-devel/for-next-20170913' into devel-hourly-2017092213
git bisect bad c85e3bc599eb5fea1de3b7b08d1b6402bbbbebef # 22:24 B 0 11 24 0 Merge 'linux-review/Arvind-Yadav/serial-core-Release-memory-obtained-by-kasprintf/20170922-134703' into devel-hourly-2017092213
git bisect bad 1ee5b5d213183b5aa42e87c7ce16c7d5de520cb2 # 22:44 B 0 2 15 0 serial: core: Release memory obtained by kasprintf
# first bad commit: [1ee5b5d213183b5aa42e87c7ce16c7d5de520cb2] serial: core: Release memory obtained by kasprintf
git bisect good be7da1a2b714a387e6ac5e3db21a1760c9969ae0 # 23:17 G 31 0 22 22 serial: imx: default to half duplex rs485
# extra tests with CONFIG_DEBUG_INFO_REDUCED
git bisect bad 1ee5b5d213183b5aa42e87c7ce16c7d5de520cb2 # 23:30 B 0 5 18 0 serial: core: Release memory obtained by kasprintf
# extra tests on HEAD of linux-devel/devel-hourly-2017092213
git bisect bad 666b090a99a6c17730e76385bb309818aaa39ed7 # 23:35 B 0 13 29 0 0day head guard for 'devel-hourly-2017092213'
# extra tests on tree/branch linux-review/Arvind-Yadav/serial-core-Release-memory-obtained-by-kasprintf/20170922-134703
git bisect bad 1ee5b5d213183b5aa42e87c7ce16c7d5de520cb2 # 23:36 B 0 15 28 0 serial: core: Release memory obtained by kasprintf
# extra tests with first bad commit reverted
git bisect good c15c7d00955b33cdeed7a3887c96ed9f390e42f1 # 23:54 G 11 0 7 7 Revert "serial: core: Release memory obtained by kasprintf"
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation