[tracing] 8e130b0d92: WARNING:at_kernel/rcu/tree.c:#rcu_irq_exit
by kernel test robot
FYI, we noticed the following commit (built with gcc-7):
commit: 8e130b0d9284a0a01ca1d6ecf8f0896cfc28b112 ("tracing: Improve design of preemptirq tracepoints and its users")
url: https://github.com/0day-ci/linux/commits/Joel-Fernandes/tracing-Improve-d...
in testcase: boot
on test machine: qemu-system-x86_64 -enable-kvm -m 512M
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+----------------------------------------------------+------------+------------+
|                                                    | e4c1091cb4 | 8e130b0d92 |
+----------------------------------------------------+------------+------------+
| boot_successes                                     | 1          | 0          |
| boot_failures                                      | 0          | 43         |
| WARNING:at_kernel/rcu/tree.c:#rcu_irq_exit         | 0          | 43         |
| RIP:rcu_irq_exit                                   | 0          | 43         |
| WARNING:at_kernel/rcu/tree.c:#rcu_irq_enter        | 0          | 43         |
| RIP:rcu_irq_enter                                  | 0          | 43         |
| WARNING:at_kernel/rcu/tree.c:#rcu_eqs_enter_common | 0          | 43         |
| RIP:rcu_eqs_enter_common                           | 0          | 43         |
+----------------------------------------------------+------------+------------+
[ 0.001000] WARNING: CPU: 0 PID: 0 at kernel/rcu/tree.c:892 rcu_irq_exit+0x4d/0x19a
[ 0.001000] Modules linked in:
[ 0.001000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc7-00430-g8e130b0 #35
[ 0.001000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 0.001000] RIP: 0010:rcu_irq_exit+0x4d/0x19a
[ 0.001000] RSP: 0000:ffffffff92803e18 EFLAGS: 00010082
[ 0.001000] RAX: 000000000000001d RBX: 0000000000000082 RCX: 70a3d70a3d70a3e0
[ 0.001000] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000046
[ 0.001000] RBP: ffffffff910fa412 R08: 00000000c5610c9a R09: 0000000000000004
[ 0.001000] R10: 0000000000000001 R11: ffffffff93dc0469 R12: 0000000000000000
[ 0.001000] R13: ffffffff93dc0880 R14: 0000000000000002 R15: 0000000000000048
[ 0.001000] FS: 0000000000000000(0000) GS:ffff8bfa5f600000(0000) knlGS:0000000000000000
[ 0.001000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.001000] CR2: 00000000ffffffff CR3: 0000000009824000 CR4: 00000000000006b0
[ 0.001000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.001000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 0.001000] Call Trace:
[ 0.001000] ? vprintk_emit+0x26c/0x29c
[ 0.001000] rcu_irq_exit_irqson+0x21/0x48
[ 0.001000] trace_hardirqs_on+0xc2/0xd0
[ 0.001000] vprintk_emit+0x26c/0x29c
[ 0.001000] printk+0x43/0x4b
[ 0.001000] lockdep_init+0x36/0xcf
[ 0.001000] start_kernel+0x2fd/0x416
[ 0.001000] secondary_startup_64+0xa5/0xb0
[ 0.001000] Code: 08 00 00 00 75 27 83 b8 88 08 00 00 00 74 1e 80 3d ef 68 8c 01 00 75 15 48 c7 c7 a9 e3 5e 92 c6 05 df 68 8c 01 01 e8 31 9d f8 ff <0f> ff 48 c7 c5 a0 9d 1d 00 65 48 03 2d 70 cf ef 6e 83 7d 08 00
[ 0.001000] ---[ end trace 2507864299958132 ]---
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
Thanks,
lkp
1b73fe146b ("rmqueue_bulk: avoid touching page structures .."): kernel BUG at mm/page_alloc.c:798!
by kernel test robot
Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
https://github.com/0day-ci/linux/commits/Aaron-Lu/__free_one_page-skip-me...
commit 1b73fe146bd01f57fd619d54761ac7f6efac5594
Author: Aaron Lu <aaron.lu(a)intel.com>
AuthorDate: Mon Feb 5 13:32:25 2018 +0800
Commit: 0day robot <fengguang.wu(a)intel.com>
CommitDate: Wed Feb 7 15:08:06 2018 +0800
rmqueue_bulk: avoid touching page structures under zone->lock
Profile on Intel Skylake server shows the most time consuming part
under zone->lock on allocation path is accessing those to-be-returned
page's struct page in rmqueue_bulk() and its child functions.
We do not really need to touch all those to-be-returned pages under
zone->lock, just need to move them out of the order 0's free_list and
adjust area->nr_free under zone->lock, other operations on page structure
like rmv_page_order(page) etc. could be done outside zone->lock.
So if it's possible to know the 1st and the last page structure of the
pcp->batch number pages in the free_list, we can achieve the above
without needing to touch all those page structures in between. The
problem is, the free pages are linked in a doubly linked list, so we only know
where the head and tail are, but not the Nth element in the list.
Assume order0 mt=Movable free_list has 7 pages available:
head <-> p7 <-> p6 <-> p5 <-> p4 <-> p3 <-> p2 <-> p1
One experiment I have done here is to add a new list for it: say
cluster list, where it will link pages of every pcp->batch(th) element
in the free_list.
Take pcp->batch=3 as an example, we have:
free_list: head <-> p7 <-> p6 <-> p5 <-> p4 <-> p3 <-> p2 <-> p1
cluster_list: head <--------> p6 <---------------> p3
Let's call p6-p4 a cluster, similarly, p3-p1 is another cluster.
Then every time rmqueue_bulk() is called to get 3 pages, we will iterate
the cluster_list first. If cluster list is not empty, we can quickly locate
the first and last page, p6 and p4 in this case (p4 is retrieved by checking
p6's next on cluster_list and then p3's prev on free_list). This way,
we can reduce the need to touch all those page structures in between under
zone->lock.
Note: a common pcp->batch should be 31 since it is the default PCP batch number.
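The lookup described above can be sketched in userspace C. This is only an illustrative model, not code from the patch: struct page_model, struct zone_model, zone_init() and take_cluster() are hypothetical names, and plain pointers stand in for the kernel's list_head. Treating the free_list tail as the last page of the final cluster is also an assumption. The point is that taking one cluster only touches its first page, its last page and their two neighbours on free_list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; not the kernel's struct page. */
struct page_model {
    int id;                                    /* p7 has id 7, p1 has id 1 */
    struct page_model *free_prev, *free_next;  /* links on free_list */
    struct page_model *cluster_next;           /* next cluster head, or NULL */
};

struct zone_model {
    struct page_model *free_head, *free_tail;  /* ends of free_list */
    struct page_model *cluster_head;           /* first entry of cluster_list */
};

/* Link pages[0..n-1] as p(n)..p1 and mark a cluster head every `batch` pages,
 * counted from the tail so that every cluster is complete. */
void zone_init(struct zone_model *z, struct page_model *pages,
               size_t n, size_t batch)
{
    struct page_model *prev_head = NULL;

    assert(n > 0 && batch > 0);
    z->free_head = &pages[0];
    z->free_tail = &pages[n - 1];
    z->cluster_head = NULL;
    for (size_t i = 0; i < n; i++) {
        pages[i].id = (int)(n - i);
        pages[i].free_prev = i ? &pages[i - 1] : NULL;
        pages[i].free_next = (i + 1 < n) ? &pages[i + 1] : NULL;
        pages[i].cluster_next = NULL;
        if ((n - i) % batch == 0) {
            if (prev_head)
                prev_head->cluster_next = &pages[i];
            else
                z->cluster_head = &pages[i];
            prev_head = &pages[i];
        }
    }
}

/* Detach the first cluster from free_list. Returns 1 and sets *first and
 * *last on success, 0 if cluster_list is empty. */
int take_cluster(struct zone_model *z,
                 struct page_model **first, struct page_model **last)
{
    struct page_model *f = z->cluster_head, *l, *next;

    if (!f)
        return 0;
    next = f->cluster_next;
    /* Last page: prev of the next cluster head on free_list; for the
     * final cluster, assume it is the free_list tail. */
    l = next ? next->free_prev : z->free_tail;

    /* Splice [f..l] out; the pages in between are never dereferenced. */
    if (f->free_prev)
        f->free_prev->free_next = l->free_next;
    else
        z->free_head = l->free_next;
    if (l->free_next)
        l->free_next->free_prev = f->free_prev;
    else
        z->free_tail = f->free_prev;

    z->cluster_head = next;
    f->free_prev = NULL;
    l->free_next = NULL;
    *first = f;
    *last = l;
    return 1;
}
```

With 7 pages and a batch of 3, the first take_cluster() call returns p6..p4 and leaves p7 <-> p3 <-> p2 <-> p1 on the free list; the second call returns p3..p1.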
With this change, on a 2-socket Skylake server running the will-it-scale/page_fault1
full load test, zone lock contention is gone, lru_lock contention rose to 70% and
performance increased by 16.7% compared to vanilla.
There are some fundamental problems with this patch though:
1 When compaction occurs, the number of pages in a cluster could be less than
predefined; this will make "1 cluster can satisfy the request" not true any more.
Due to this reason, the patch currently requires no compaction to happen;
2 When new pages are freed to order 0 free_list, it could merge with its buddy
and that would also cause fewer pages left in a cluster. Thus, no merge
for order-0 is required for this patch to work;
3 Similarly, when fallback allocation happens, the same problem could happen again.
Considering the problems listed above, this patch can only serve as a POC showing
that cache misses are the most time-consuming operation on big servers. Your comments
on a possible way to overcome them are greatly appreciated.
Suggested-by: Ying Huang <ying.huang(a)intel.com>
Signed-off-by: Aaron Lu <aaron.lu(a)intel.com>
e990ff63d9 __free_one_page: skip merge for order-0 page unless compaction is in progress
1b73fe146b rmqueue_bulk: avoid touching page structures under zone->lock
+-----------------------------------------------------+------------+------------+
|                                                     | e990ff63d9 | 1b73fe146b |
+-----------------------------------------------------+------------+------------+
| boot_successes                                      | 2          | 0          |
| boot_failures                                       | 57         | 21         |
| WARNING:at_arch/x86/mm/dump_pagetables.c:#note_page | 57         |            |
| EIP:note_page                                       | 57         |            |
| kernel_BUG_at_mm/page_alloc.c                       | 0          | 21         |
| invalid_opcode:#[##]                                | 0          | 21         |
| EIP:__do_merge                                      | 0          | 21         |
| Kernel_panic-not_syncing:Fatal_exception            | 0          | 21         |
+-----------------------------------------------------+------------+------------+
[ 0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
[ 0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
[ 0.000000] Initializing CPU#0
[ 0.000000] Initializing HighMem for node 0 (00000000:00000000)
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] kernel BUG at mm/page_alloc.c:798!
[ 0.000000] invalid opcode: 0000 [#1] PREEMPT SMP
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.15.0-00002-g1b73fe1 #82
[ 0.000000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 0.000000] EIP: __do_merge+0x1d/0x13b
[ 0.000000] EFLAGS: 00210046 CPU: 0
[ 0.000000] EAX: cfbc8340 EBX: 00000000 ECX: c8a7b540 EDX: 0000009a
[ 0.000000] ESI: 00000000 EDI: 00000000 EBP: c8983e94 ESP: c8983e78
[ 0.000000] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 0.000000] CR0: 80050033 CR2: 00000000 CR3: 08b5d000 CR4: 000406b0
[ 0.000000] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 0.000000] DR6: fffe0ff0 DR7: 00000400
[ 0.000000] Call Trace:
[ 0.000000] ? __mod_zone_page_state+0x66/0x71
[ 0.000000] free_pcppages_bulk+0x8b/0xb0
[ 0.000000] free_unref_page_commit+0x70/0x79
[ 0.000000] free_unref_page+0x36/0x41
[ 0.000000] __free_pages+0x12/0x1b
[ 0.000000] __free_pages_bootmem+0x8a/0x92
[ 0.000000] free_all_bootmem+0x110/0x171
[ 0.000000] mem_init+0x3b/0x1e3
[ 0.000000] start_kernel+0x229/0x3d0
[ 0.000000] i386_start_kernel+0x95/0x99
[ 0.000000] startup_32_smp+0x164/0x168
[ 0.000000] Code: ff 89 43 1c eb b8 83 c4 10 5b 5e 5f 5d c3 55 89 e5 57 56 53 89 d3 89 c2 83 ec 10 2b 15 04 3d c7 c8 89 4d f0 c1 fa 05 85 db 75 02 <0f> 0b 89 c6 6b c3 24 03 45 f0 89 45 e8 83 fb 09 77 44 bf 01 00
[ 0.000000] EIP: __do_merge+0x1d/0x13b SS:ESP: 0068:c8983e78
[ 0.000000] ---[ end trace 20c59bb79bbbff9e ]---
[ 0.000000] Kernel panic - not syncing: Fatal exception
# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start 1b73fe146bd01f57fd619d54761ac7f6efac5594 d8a5b80568a9cb66810e75b182018e9edb68e8ff --
git bisect good e990ff63d932b965e3a22b32349bae1285bbcc28 # 16:05 G 11 0 11 22 __free_one_page: skip merge for order-0 page unless compaction is in progress
# first bad commit: [1b73fe146bd01f57fd619d54761ac7f6efac5594] rmqueue_bulk: avoid touching page structures under zone->lock
git bisect good e990ff63d932b965e3a22b32349bae1285bbcc28 # 16:11 G 32 0 32 54 __free_one_page: skip merge for order-0 page unless compaction is in progress
# extra tests with debug options
git bisect bad 1b73fe146bd01f57fd619d54761ac7f6efac5594 # 16:17 B 0 11 24 0 rmqueue_bulk: avoid touching page structures under zone->lock
# extra tests on HEAD of linux-review/Aaron-Lu/__free_one_page-skip-merge-for-order-0-page-unless-compaction-is-in-progress/20180207-150802
git bisect bad 1b73fe146bd01f57fd619d54761ac7f6efac5594 # 16:17 B 0 21 37 0 rmqueue_bulk: avoid touching page structures under zone->lock
# extra tests on tree/branch linux-review/Aaron-Lu/__free_one_page-skip-merge-for-order-0-page-unless-compaction-is-in-progress/20180207-150802
git bisect bad 1b73fe146bd01f57fd619d54761ac7f6efac5594 # 16:18 B 0 21 37 0 rmqueue_bulk: avoid touching page structures under zone->lock
# extra tests with first bad commit reverted
git bisect good fa6c803da2a1a49d44f098528815694ae658a5f8 # 16:29 G 11 0 11 22 Revert "rmqueue_bulk: avoid touching page structures under zone->lock"
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation
a57c0243cb ("mm/sparse.c: Add nr_present_sections to change .."): BUG: kernel reboot-without-warning in early-boot stage, last printk: early console in setup code
by kernel test robot
Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
https://github.com/0day-ci/linux/commits/Baoquan-He/Optimize-the-code-of-...
commit a57c0243cb37ab6dc5375ea699054afa3a5d7f05
Author: Baoquan He <bhe(a)redhat.com>
AuthorDate: Thu Feb 1 15:19:56 2018 +0800
Commit: 0day robot <fengguang.wu(a)intel.com>
CommitDate: Fri Feb 2 12:17:20 2018 +0800
mm/sparse.c: Add nr_present_sections to change the mem_map allocation
In sparse_init(), we allocate usemap_map and map_map, which are pointer
arrays with the size of NR_MEM_SECTIONS. The memory consumption is
negligible in 4-level paging mode, but in 5-level paging it costs
a lot of memory, 512M. The kdump kernel can't even boot up with a normal
'crashkernel=' setting.
Here add a new variable to record the number of present sections. Let's
allocate the usemap_map and map_map with the size of nr_present_sections.
We only need to make sure that for the ith present section, usemap_map[i]
and map_map[i] store its usemap and mem_map respectively.
This change can save a lot of memory on most systems. In general, we should
avoid defining arrays or allocating memory with the size of NR_MEM_SECTIONS.
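As a rough illustration of the indexing scheme (a userspace sketch, not the patch itself): the array is sized by the number of present sections, and the ith present section uses slot i. The names count_present() and alloc_compact_map() are hypothetical, and a section number stands in for the usemap or mem_map pointer that the kernel would store.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Count how many sections are marked present. */
size_t count_present(const bool *present, size_t nr_sections)
{
    size_t n = 0;

    for (size_t i = 0; i < nr_sections; i++)
        if (present[i])
            n++;
    return n;
}

/* Allocate an array with one slot per present section (instead of one per
 * possible section) and fill slot i with the number of the ith present
 * section. Returns NULL if nothing is present or allocation fails. */
size_t *alloc_compact_map(const bool *present, size_t nr_sections,
                          size_t *nr_present)
{
    size_t *map, slot = 0;

    *nr_present = count_present(present, nr_sections);
    if (*nr_present == 0)
        return NULL;
    map = malloc(*nr_present * sizeof(*map));
    if (!map)
        return NULL;
    for (size_t i = 0; i < nr_sections; i++)
        if (present[i])
            map[slot++] = i;
    assert(slot == *nr_present);
    return map;
}
```

The allocation then scales with the memory that is actually present rather than with the size of the address space.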
Signed-off-by: Baoquan He <bhe(a)redhat.com>
c9e4aaa455 mm/sparsemem: Defer the ms->section_mem_map clearing a little later
a57c0243cb mm/sparse.c: Add nr_present_sections to change the mem_map allocation
+-----------------------------------------------------------------------------------------------+------------+------------+
|                                                                                               | c9e4aaa455 | a57c0243cb |
+-----------------------------------------------------------------------------------------------+------------+------------+
| boot_successes                                                                                | 36         | 0          |
| boot_failures                                                                                 | 0          | 13         |
| BUG:kernel_reboot-without-warning_in_early-boot_stage,last_printk:early_console_in_setup_code | 0          | 13         |
+-----------------------------------------------------------------------------------------------+------------+------------+
early console in setup code
BUG: kernel reboot-without-warning in early-boot stage, last printk: early console in setup code
Linux version 4.15.0-09941-ga57c024 #30
Command line: root=/dev/ram0 hung_task_panic=1 debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 net.ifnames=0 printk.devkmsg=on panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0 drbd.minor_count=8 systemd.log_level=err ignore_loglevel console=tty0 earlyprintk=ttyS0,115200 console=ttyS0,115200 vga=normal rw link=/kbuild-tests/run-queue/kvm/x86_64-acpi-redef/linux-review:Baoquan-He:Optimize-the-code-of-mem_map-allocation-in:20180202-121717:a57c0243cb37ab6dc5375ea699054afa3a5d7f05/.vmlinuz-a57c0243cb37ab6dc5375ea699054afa3a5d7f05-20180202125702-1:yocto-ivb41-116 branch=linux-review/Baoquan-He/Optimize-the-code-of-mem_map-allocation-in/20180202-121717 BOOT_IMAGE=/pkg/linux/x86_64-acpi-redef/gcc-7/a57c0243cb37ab6dc5375ea699054afa3a5d7f05/vmlinuz-4.15.0-09941-ga57c024 drbd.minor_count=8 rcuperf.shutdown=0
# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start a57c0243cb37ab6dc5375ea699054afa3a5d7f05 4bf772b14675411a69b3c807f73006de0fe4b649 --
git bisect good c9e4aaa4552e677e84a77961c074b3cf8670e6b3 # 13:31 G 11 0 1 1 mm/sparsemem: Defer the ms->section_mem_map clearing a little later
# first bad commit: [a57c0243cb37ab6dc5375ea699054afa3a5d7f05] mm/sparse.c: Add nr_present_sections to change the mem_map allocation
git bisect good c9e4aaa4552e677e84a77961c074b3cf8670e6b3 # 13:46 G 30 0 0 1 mm/sparsemem: Defer the ms->section_mem_map clearing a little later
# extra tests on HEAD of linux-review/Baoquan-He/Optimize-the-code-of-mem_map-allocation-in/20180202-121717
git bisect bad a57c0243cb37ab6dc5375ea699054afa3a5d7f05 # 13:46 B 0 13 29 0 mm/sparse.c: Add nr_present_sections to change the mem_map allocation
# extra tests on tree/branch linux-review/Baoquan-He/Optimize-the-code-of-mem_map-allocation-in/20180202-121717
git bisect bad a57c0243cb37ab6dc5375ea699054afa3a5d7f05 # 13:47 B 0 13 29 0 mm/sparse.c: Add nr_present_sections to change the mem_map allocation
# extra tests with first bad commit reverted
git bisect good 7b1911c8253773f4b3260937f17d5b5cd0b29474 # 14:17 G 11 0 2 2 Revert "mm/sparse.c: Add nr_present_sections to change the mem_map allocation"
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation
Re: [LKP] e7e61e51f9 ("fw_cfg: do DMA read operation"): kernel BUG at arch/x86/mm/physaddr.c:75!
by Marc-Andre Lureau
Hi
On Tue, Feb 6, 2018 at 10:01 PM, Michael S. Tsirkin <mst(a)redhat.com> wrote:
> On Fri, Feb 02, 2018 at 11:59:00AM +0100, Marc-Andre Lureau wrote:
>> Hi
>>
>> On Fri, Feb 2, 2018 at 3:19 AM, Michael S. Tsirkin <mst(a)redhat.com> wrote:
>> > On Fri, Feb 02, 2018 at 06:01:01AM +0800, kernel test robot wrote:
>> That's actually the regression: before Peter's patch was squashed in v10, we
>> checked if the given address wasn't NULL before doing dma_map_single.
>> Now it is gone and raises the panic.
>>
>> Thus, the diff I propose is:
>>
>> diff --git a/drivers/firmware/qemu_fw_cfg.c b/drivers/firmware/qemu_fw_cfg.c
>> index 3b3cf6222c97..08309939cd94 100644
>> --- a/drivers/firmware/qemu_fw_cfg.c
>> +++ b/drivers/firmware/qemu_fw_cfg.c
>> @@ -128,7 +128,7 @@ static ssize_t fw_cfg_dma_transfer(struct device *dev,
>> }
>>
>> *d = (struct fw_cfg_dma) {
>> - .address = cpu_to_be64(virt_to_phys(address)),
>> + .address = address ? cpu_to_be64(virt_to_phys(address)) : 0,
>> .length = cpu_to_be32(length),
>> .control = cpu_to_be32(control)
>> };
>> @@ -518,7 +518,8 @@ static ssize_t fw_cfg_sysfs_read_raw(struct file
>> *filp, struct kobject *kobj,
>> if (count > entry->f.size - pos)
>> count = entry->f.size - pos;
>>
>> - return fw_cfg_read_blob(dev, entry->f.select, buf, pos, count, true);
>> + /* do not use DMA, virt_to_phys(buf) might not be ok */
>> + return fw_cfg_read_blob(dev, entry->f.select, buf, pos, count, false);
>> }
>>
>> (and I tested it works both with x86 and x86_64 kernel/qemu)
>>
>> thanks
>
> Point is there are callers where address is on stack.
Which one do you see left where we don't pass dma=false?
> Let's try to do a minimal approach - enable coredump
> writes first. Defer speedups to next version.
Fair enough (although it would have been nice to ask before reaching v12)
Patch "x86/asm: Fix inline asm call constraints for GCC 4.4" has been added to the 4.9-stable tree
by gregkh@linuxfoundation.org
This is a note to let you know that I've just added the patch titled
x86/asm: Fix inline asm call constraints for GCC 4.4
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=s...
The filename of the patch is:
x86-asm-fix-inline-asm-call-constraints-for-gcc-4.4.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 520a13c530aeb5f63e011d668c42db1af19ed349 Mon Sep 17 00:00:00 2001
From: Josh Poimboeuf <jpoimboe(a)redhat.com>
Date: Thu, 28 Sep 2017 16:58:26 -0500
Subject: x86/asm: Fix inline asm call constraints for GCC 4.4
From: Josh Poimboeuf <jpoimboe(a)redhat.com>
commit 520a13c530aeb5f63e011d668c42db1af19ed349 upstream.
The kernel test bot (run by Xiaolong Ye) reported that the following commit:
f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
is causing double faults in a kernel compiled with GCC 4.4.
Linus subsequently diagnosed the crash pattern and the buggy commit and found that
the issue is with this code:
register unsigned int __asm_call_sp asm("esp");
#define ASM_CALL_CONSTRAINT "+r" (__asm_call_sp)
Even on a 64-bit kernel, it's using ESP instead of RSP. That causes GCC
to produce the following bogus code:
ffffffff8147461d: 89 e0 mov %esp,%eax
ffffffff8147461f: 4c 89 f7 mov %r14,%rdi
ffffffff81474622: 4c 89 fe mov %r15,%rsi
ffffffff81474625: ba 20 00 00 00 mov $0x20,%edx
ffffffff8147462a: 89 c4 mov %eax,%esp
ffffffff8147462c: e8 bf 52 05 00 callq ffffffff814c98f0 <copy_user_generic_unrolled>
Despite the absurdity of it backing up and restoring the stack pointer
for no reason, the bug is actually the fact that it's only backing up
and restoring the lower 32 bits of the stack pointer. The upper 32 bits
are getting cleared out, corrupting the stack pointer.
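A small userspace model of the truncation, for illustration only. save_restore_low32() and round_trip_corrupts() are hypothetical helpers that mimic the 'mov %esp,%eax' / 'mov %eax,%esp' pair; the model relies on the x86-64 rule that a write to a 32-bit register zero-extends into the full 64-bit register. The addresses in truncation_demo() are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Model of saving and restoring only the low half of the stack pointer. */
uint64_t save_restore_low32(uint64_t sp)
{
    uint32_t saved = (uint32_t)sp;  /* mov %esp,%eax: keeps the low 32 bits */
    return (uint64_t)saved;         /* mov %eax,%esp: upper 32 bits become 0 */
}

/* Returns 1 if the round trip changes the value, i.e. an upper bit was set. */
int round_trip_corrupts(uint64_t sp)
{
    return save_restore_low32(sp) != sp;
}

/* Example values only; not taken from the report. */
void truncation_demo(void)
{
    /* A kernel-style address loses its upper half. */
    assert(save_restore_low32(0xffffc90000003e18ULL) == 0x3e18ULL);
    /* A value that fits in 32 bits survives unchanged. */
    assert(!round_trip_corrupts(0x7ffcULL));
}
```

Since kernel stack addresses on x86-64 have their upper bits set, any such round trip yields an invalid stack pointer.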
So change the '__asm_call_sp' register variable to be associated with
the actual full-size stack pointer.
This also requires changing the __ASM_SEL() macro to be based on the
actual compiled arch size, rather than the CONFIG value, because
CONFIG_X86_64 compiles some files with '-m32' (e.g., realmode and vdso).
Otherwise Clang fails to build the kernel because it complains about the
use of a 64-bit register (RSP) in a 32-bit file.
Reported-and-Bisected-and-Tested-by: kernel test robot <xiaolong.ye(a)intel.com>
Diagnosed-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: LKP <lkp(a)01.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Matthias Kaehlcke <mka(a)chromium.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin(a)linux.intel.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Fixes: f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
Link: http://lkml.kernel.org/r/20170928215826.6sdpmwtkiydiytim@treble
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Matthias Kaehlcke <mka(a)chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/asm.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -11,10 +11,12 @@
# define __ASM_FORM_COMMA(x) " " #x ","
#endif
-#ifdef CONFIG_X86_32
+#ifndef __x86_64__
+/* 32 bit */
# define __ASM_SEL(a,b) __ASM_FORM(a)
# define __ASM_SEL_RAW(a,b) __ASM_FORM_RAW(a)
#else
+/* 64 bit */
# define __ASM_SEL(a,b) __ASM_FORM(b)
# define __ASM_SEL_RAW(a,b) __ASM_FORM_RAW(b)
#endif
Patches currently in stable-queue which might be from jpoimboe(a)redhat.com are
queue-4.9/x86-asm-fix-inline-asm-call-constraints-for-gcc-4.4.patch
Patch "x86/asm: Fix inline asm call constraints for GCC 4.4" has been added to the 4.4-stable tree
by gregkh@linuxfoundation.org
This is a note to let you know that I've just added the patch titled
x86/asm: Fix inline asm call constraints for GCC 4.4
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=s...
The filename of the patch is:
x86-asm-fix-inline-asm-call-constraints-for-gcc-4.4.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
From 520a13c530aeb5f63e011d668c42db1af19ed349 Mon Sep 17 00:00:00 2001
From: Josh Poimboeuf <jpoimboe(a)redhat.com>
Date: Thu, 28 Sep 2017 16:58:26 -0500
Subject: x86/asm: Fix inline asm call constraints for GCC 4.4
From: Josh Poimboeuf <jpoimboe(a)redhat.com>
commit 520a13c530aeb5f63e011d668c42db1af19ed349 upstream.
The kernel test bot (run by Xiaolong Ye) reported that the following commit:
f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
is causing double faults in a kernel compiled with GCC 4.4.
Linus subsequently diagnosed the crash pattern and the buggy commit and found that
the issue is with this code:
register unsigned int __asm_call_sp asm("esp");
#define ASM_CALL_CONSTRAINT "+r" (__asm_call_sp)
Even on a 64-bit kernel, it's using ESP instead of RSP. That causes GCC
to produce the following bogus code:
ffffffff8147461d: 89 e0 mov %esp,%eax
ffffffff8147461f: 4c 89 f7 mov %r14,%rdi
ffffffff81474622: 4c 89 fe mov %r15,%rsi
ffffffff81474625: ba 20 00 00 00 mov $0x20,%edx
ffffffff8147462a: 89 c4 mov %eax,%esp
ffffffff8147462c: e8 bf 52 05 00 callq ffffffff814c98f0 <copy_user_generic_unrolled>
Despite the absurdity of it backing up and restoring the stack pointer
for no reason, the bug is actually the fact that it's only backing up
and restoring the lower 32 bits of the stack pointer. The upper 32 bits
are getting cleared out, corrupting the stack pointer.
So change the '__asm_call_sp' register variable to be associated with
the actual full-size stack pointer.
This also requires changing the __ASM_SEL() macro to be based on the
actual compiled arch size, rather than the CONFIG value, because
CONFIG_X86_64 compiles some files with '-m32' (e.g., realmode and vdso).
Otherwise Clang fails to build the kernel because it complains about the
use of a 64-bit register (RSP) in a 32-bit file.
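To illustrate why the selection has to follow the compiler's target rather than the kernel config, here is a standalone sketch. SKETCH_SEL() and sketch_sp_name() are hypothetical stand-ins, not the kernel's __ASM_SEL(); the sketch only relies on the compiler predefining __x86_64__ for 64-bit x86 targets, so a file built with '-m32' picks the 32-bit name even when the overall configuration is 64-bit.

```c
#include <assert.h>
#include <string.h>

#ifndef __x86_64__
/* 32 bit (or non-x86) compilation unit: pick the first argument */
# define SKETCH_SEL(a, b) #a
#else
/* 64 bit compilation unit: pick the second argument */
# define SKETCH_SEL(a, b) #b
#endif

/* Name of the stack pointer register matching how this file was compiled. */
const char *sketch_sp_name(void)
{
    const char *name = SKETCH_SEL(esp, rsp);

    assert(strlen(name) == 3);
    return name;
}
```

Because the choice is made per compilation unit, it stays correct for the files that a 64-bit configuration builds in 32-bit mode.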
Reported-and-Bisected-and-Tested-by: kernel test robot <xiaolong.ye(a)intel.com>
Diagnosed-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: LKP <lkp(a)01.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Matthias Kaehlcke <mka(a)chromium.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin(a)linux.intel.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Fixes: f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
Link: http://lkml.kernel.org/r/20170928215826.6sdpmwtkiydiytim@treble
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Cc: Matthias Kaehlcke <mka(a)chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/include/asm/asm.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -11,10 +11,12 @@
# define __ASM_FORM_COMMA(x) " " #x ","
#endif
-#ifdef CONFIG_X86_32
+#ifndef __x86_64__
+/* 32 bit */
# define __ASM_SEL(a,b) __ASM_FORM(a)
# define __ASM_SEL_RAW(a,b) __ASM_FORM_RAW(a)
#else
+/* 64 bit */
# define __ASM_SEL(a,b) __ASM_FORM(b)
# define __ASM_SEL_RAW(a,b) __ASM_FORM_RAW(b)
#endif
Patches currently in stable-queue which might be from jpoimboe(a)redhat.com are
queue-4.4/x86-asm-fix-inline-asm-call-constraints-for-gcc-4.4.patch
[lkp-robot] [rcu] 5bb856a179: WARNING:at_kernel/workqueue.c:#flush_work
by kernel test robot
FYI, we noticed the following commit (built with gcc-7):
commit: 5bb856a17977672f2142d06e68ac60e746e7bca2 ("rcu: Parallelize expedited grace-period initialization")
https://git.kernel.org/cgit/linux/kernel/git/paulmck/linux-rcu.git rcu/dev
in testcase: trinity
with following parameters:
runtime: 300s
test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/
on test machine: qemu-system-i386 -enable-kvm -smp 2 -m 320M
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+-------------------------------------------------------+------------+------------+
|                                                       | 5314d220f6 | 5bb856a179 |
+-------------------------------------------------------+------------+------------+
| boot_successes                                        | 0          | 0          |
| boot_failures                                         | 16         | 60         |
| WARNING:possible_circular_locking_dependency_detected | 16         |            |
| genirq:Flags_mismatch_irq##(ttyS0)vs.#(sir_ir)        | 16         |            |
| INFO:rcu_preempt_self-detected_stall_on_CPU           | 10         |            |
| EIP:__do_softirq                                      | 10         |            |
| EIP:_raw_spin_unlock_irqrestore                       | 10         |            |
| WARNING:at_kernel/workqueue.c:#flush_work             | 0          | 60         |
| EIP:flush_work                                        | 0          | 60         |
| BUG:scheduling_while_atomic                           | 0          | 60         |
| WARNING:at_kernel/locking/lockdep.c:#lock_release     | 0          | 60         |
| EIP:lock_release                                      | 0          | 60         |
| WARNING:at_kernel/locking/lockdep.c:#lock_unpin_lock  | 0          | 60         |
| EIP:lock_unpin_lock                                   | 0          | 60         |
| WARNING:CPU:#PID:#at_kernel/                          | 0          | 1          |
+-------------------------------------------------------+------------+------------+
[ 0.018242] WARNING: CPU: 0 PID: 0 at kernel/workqueue.c:2866 flush_work+0x27/0x260
[ 0.019000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.15.0-rc1-00112-g5bb856a #125
[ 0.019000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 0.019000] task: cb412dc0 task.stack: cb408000
[ 0.019000] EIP: flush_work+0x27/0x260
[ 0.019000] EFLAGS: 00210246 CPU: 0
[ 0.019000] EAX: 00000000 EBX: cb43fac0 ECX: 00000001 EDX: 00000001
[ 0.019000] ESI: cb43fa9c EDI: cb43fd00 EBP: cb409e10 ESP: cb409d5c
[ 0.019000] DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[ 0.019000] CR0: 80050033 CR2: ffffffff CR3: 0b62d000 CR4: 00000690
[ 0.019000] Call Trace:
[ 0.019000] ? __queue_work+0x423/0x450
[ 0.019000] ? mark_held_locks+0x61/0x90
[ 0.019000] ? queue_work_on+0x65/0x80
[ 0.019000] ? trace_hardirqs_on+0xb/0x10
[ 0.019000] sync_rcu_exp_select_cpus+0x201/0x230
[ 0.019000] ? _synchronize_rcu_expedited+0x316/0x340
[ 0.019000] ? rcu_report_exp_cpu_mult+0x60/0x60
[ 0.019000] _synchronize_rcu_expedited+0x210/0x340
[ 0.019000] ? apply_paravirt+0xc5/0x108
[ 0.019000] ? acpi_hw_validate_io_request+0xe/0x10d
[ 0.019000] ? acpi_os_read_port+0xc/0x60
[ 0.019000] ? acpi_hw_read_port+0x44/0xa8
[ 0.019000] ? acpi_hw_read+0x101/0x16b
[ 0.019000] synchronize_rcu_expedited+0x8e/0x90
[ 0.019000] synchronize_rcu+0x97/0xc0
[ 0.019000] ? acpi_hw_read_multiple+0x1b/0x54
[ 0.019000] ? ___might_sleep+0x5d/0x1e0
[ 0.019000] ? acpi_hw_register_read+0x56/0xbc
[ 0.019000] ? __might_sleep+0x74/0x80
[ 0.019000] ? acpi_hw_get_mode+0x2f/0x42
[ 0.019000] ? preempt_count_sub+0xa/0x60
[ 0.019000] ? synchronize_sched_expedited+0xa5/0xc0
[ 0.019000] rcu_test_sync_prims+0xd/0x30
[ 0.019000] rcu_scheduler_starting+0x27/0x30
[ 0.019000] rest_init+0xe/0x1d0
[ 0.019000] start_kernel+0x4b4/0x4cc
[ 0.019000] i386_start_kernel+0xca/0xe3
[ 0.019000] startup_32_smp+0x164/0x168
[ 0.019000] Code: 7b ff ff ff 55 89 e5 57 56 53 81 ec a8 00 00 00 e8 f3 f9 fc ff 89 c6 65 a1 14 00 00 00 89 45 f0 31 c0 80 3d a8 d2 63 cb 00 75 09 <0f> ff e9 02 02 00 00 66 90 31 c9 ba f4 0a 00 00 b8 bc c6 32 cb
[ 0.019000] ---[ end trace ec20ab83de8b0a87 ]---
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
Thanks,
Xiaolong
[lkp-robot] [mm] f7f99100d8: BUG:unable_to_handle_kernel
by kernel test robot
FYI, we noticed the following commit (built with gcc-7):
commit: f7f99100d8d95dbcf09e0216a143211e79418b9f ("mm: stop zeroing memory during allocation in vmemmap")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: trinity
with following parameters:
runtime: 300s
test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/
on test machine: qemu-system-x86_64 -enable-kvm -smp 2 -m 32G
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+-------------------------------------------------+------------+------------+
|                                                 | e17d8025f0 | f7f99100d8 |
+-------------------------------------------------+------------+------------+
| boot_successes                                  | 2          | 1          |
| boot_failures                                   | 10         | 11         |
| BUG:kernel_reboot-without-warning_in_boot_stage | 10         |            |
| BUG:unable_to_handle_kernel                     | 0          | 11         |
| Oops:#[##]                                      | 0          | 11         |
| RIP:per_cpu_ptr_to_phys                         | 0          | 11         |
| Kernel_panic-not_syncing:Fatal_exception        | 0          | 11         |
+-------------------------------------------------+------------+------------+
[ 0.000000] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 0.000000] IP: per_cpu_ptr_to_phys+0x7f/0xd8
[ 0.000000] PGD 0 P4D 0
[ 0.000000] Oops: 0000 [#1]
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.14.0-04321-gf7f9910 #1
[ 0.000000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 0.000000] task: ffffffff9741a500 task.stack: ffffffff97400000
[ 0.000000] RIP: 0010:per_cpu_ptr_to_phys+0x7f/0xd8
[ 0.000000] RSP: 0000:ffffffff97403ee0 EFLAGS: 00010046
[ 0.000000] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9f1274b33230
[ 0.000000] RDX: ffff9f127ffc2000 RSI: 0000000000000000 RDI: 000000000007ffff
[ 0.000000] RBP: 0000000000000000 R08: ffffffff97403ef4 R09: ffff9f12607bd000
[ 0.000000] R10: 0000000000042000 R11: ffffffff97fb482c R12: ffffffff97668640
[ 0.000000] R13: ffff9f127ffbd8c0 R14: 0000000000000000 R15: 0000000000000000
[ 0.000000] FS: 0000000000000000(0000) GS:ffffffff97436000(0000) knlGS:0000000000000000
[ 0.000000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.000000] CR2: 0000000000000000 CR3: 000000052b415000 CR4: 00000000000006b0
[ 0.000000] Call Trace:
[ 0.000000] cpu_init+0x1da/0x1f9
[ 0.000000] trap_init+0x42/0x52
[ 0.000000] start_kernel+0x277/0x48a
[ 0.000000] secondary_startup_64+0xa5/0xb0
[ 0.000000] Code: 75 0c 48 39 d8 b9 01 00 00 00 76 ed eb 45 48 89 df e8 da ed ff ff 48 8b 38 48 89 c1 81 e3 ff 0f 00 00 48 c1 ef 2d e8 95 e8 ff ff <48> 8b 00 48 83 e0 f8 48 29 c1 48 89 c8 48 b9 b7 6d db b6 6d db
[ 0.000000] RIP: per_cpu_ptr_to_phys+0x7f/0xd8 RSP: ffffffff97403ee0
[ 0.000000] CR2: 0000000000000000
[ 0.000000] ---[ end trace 142a0423c71f6258 ]---
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
Thanks,
Xiaolong
[lkp-robot] [rcu] 6c0a1d639c: BUG:scheduling_while_atomic
by kernel test robot
FYI, we noticed the following commit (built with gcc-4.9):
commit: 6c0a1d639cb7e989007cc3153b2f4eafb2e5bb7b ("rcu: Parallelize expedited grace-period initialization")
https://git.kernel.org/cgit/linux/kernel/git/paulmck/linux-rcu.git rcu/dev
in testcase: trinity
with following parameters:
runtime: 300s
test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/
on test machine: qemu-system-i386 -enable-kvm -smp 2 -m 320M
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+-------------------------------------------------------+------------+------------+
| | 0f0a62adf1 | 6c0a1d639c |
+-------------------------------------------------------+------------+------------+
| boot_successes | 0 | 0 |
| boot_failures | 196 | 67 |
| WARNING:possible_circular_locking_dependency_detected | 196 | |
| WARNING:at_drivers/pci/pci-sysfs.c:#pci_mmap_resource | 20 | |
| EIP:pci_mmap_resource | 22 | |
| invoked_oom-killer:gfp_mask=0x | 1 | |
| Mem-Info | 10 | |
| EIP:__put_user_4 | 1 | |
| BUG:scheduling_while_atomic | 0 | 67 |
| WARNING:at_kernel/locking/lockdep.c:#lock_release | 0 | 67 |
| EIP:lock_release | 0 | 67 |
| WARNING:at_kernel/locking/lockdep.c:#lock_unpin_lock | 0 | 67 |
| EIP:lock_unpin_lock | 0 | 67 |
+-------------------------------------------------------+------------+------------+
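A comparison table like the one above can be read mechanically to list the failures that appear only with the tested commit. The following is a minimal sketch, not part of the robot's tooling: the function name `parse_lkp_table` is hypothetical, and treating a blank cell as a count of 0 is an assumption about the table format.

```python
def parse_lkp_table(text):
    """Parse an ASCII comparison table into {row_label: {commit: count}}."""
    rows = [line.strip() for line in text.splitlines()
            if line.strip().startswith("|")]
    # Split on '|' and drop the empty pieces outside the outer delimiters.
    cells = [[c.strip() for c in row.split("|")[1:-1]] for row in rows]
    header, body = cells[0], cells[1:]
    commits = header[1:]  # first header cell is the empty label column
    table = {}
    for label, *counts in body:
        # Assumption: a blank cell means the failure was not observed (0).
        table[label] = {c: int(n) if n else 0 for c, n in zip(commits, counts)}
    return table


sample = """\
+-----------------------------+------------+------------+
| | 0f0a62adf1 | 6c0a1d639c |
+-----------------------------+------------+------------+
| boot_failures | 196 | 67 |
| EIP:pci_mmap_resource | 22 | |
| BUG:scheduling_while_atomic | 0 | 67 |
+-----------------------------+------------+------------+
"""

table = parse_lkp_table(sample)
# Failures absent on the parent commit but present on the tested commit.
new_failures = [label for label, counts in table.items()
                if counts["0f0a62adf1"] == 0 and counts["6c0a1d639c"] > 0]
print(new_failures)
```

Filtering on "zero before, non-zero after" is what separates the regression (`BUG:scheduling_while_atomic`) from failures that already existed on the parent commit.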
[ 0.066192] BUG: scheduling while atomic: swapper/0/0/0x00000002
[ 0.068160] 1 lock held by swapper/0/0:
[ 0.070023] #0: (rcu_preempt_state.exp_mutex){+.+.}, at: [<c10d920a>] _synchronize_rcu_expedited+0x7aa/0xa10
[ 0.073303] Modules linked in:
[ 0.074521] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc1-00112-g6c0a1d63 #774
[ 0.077215] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 0.080000] Call Trace:
[ 0.080000] dump_stack+0x76/0xa9
[ 0.080000] __schedule_bug+0x72/0xa0
[ 0.080000] __schedule+0x871/0xb30
[ 0.080000] ? mark_held_locks+0x73/0xa0
[ 0.080000] ? _raw_spin_unlock_irqrestore+0x4d/0x70
[ 0.080000] schedule+0x35/0x80
[ 0.080000] schedule_timeout+0x192/0x490
[ 0.080000] ? collect_expired_timers+0xc0/0xc0
[ 0.080000] rcu_exp_wait_wake+0xaf/0x940
[ 0.080000] ? trace_hardirqs_on+0xb/0x10
[ 0.080000] _synchronize_rcu_expedited+0x931/0xa10
[ 0.080000] ? native_set_pte+0x10/0x10
[ 0.080000] ? native_set_pte_at+0x10/0x10
[ 0.080000] ? __ptep_modify_prot_start+0x10/0x10
[ 0.080000] ? __paravirt_pgd_alloc+0x10/0x10
[ 0.080000] ? rcu_report_exp_cpu_mult+0x70/0x70
[ 0.080000] ? __paravirt_pgd_alloc+0x10/0x10
[ 0.080000] ? rewind_stack_do_exit+0x13/0x13
[ 0.080000] ? rewind_stack_do_exit+0x13/0x13
[ 0.080000] ? rewind_stack_do_exit+0x13/0x13
[ 0.080000] ? end_pv_irq_ops_irq_disable+0x1/0x1
[ 0.080000] ? ___ratelimit+0xb7/0x110
[ 0.080000] ? trace_hardirqs_on+0xb/0x10
[ 0.080000] ? apply_paravirt+0xc7/0x130
[ 0.080000] ? ___ratelimit+0xb7/0x110
[ 0.080000] ? wake_up_klogd+0x8/0x50
[ 0.080000] ? ___might_sleep+0x2f/0x1f0
[ 0.080000] ? acpi_hw_validate_io_request+0xe/0xed
[ 0.080000] ? acpi_os_read_port+0xb/0x60
[ 0.080000] synchronize_rcu_expedited+0x37/0xb0
[ 0.080000] synchronize_rcu+0xb5/0xe0
[ 0.080000] ? acpi_hw_read_multiple+0x1a/0x47
[ 0.080000] ? ___might_sleep+0x2f/0x1f0
[ 0.080000] ? acpi_hw_register_read+0x56/0xbc
[ 0.080000] ? __might_sleep+0x33/0x90
[ 0.080000] ? find_next_bit+0x12/0x20
[ 0.080000] ? cpumask_next+0x15/0x20
[ 0.080000] ? synchronize_sched_expedited+0x3c/0xe0
[ 0.080000] rcu_test_sync_prims+0xd/0x30
[ 0.080000] rcu_scheduler_starting+0x34/0x60
[ 0.080000] rest_init+0xe/0x1f0
[ 0.080000] start_kernel+0x44f/0x457
[ 0.080000] i386_start_kernel+0x8f/0x93
[ 0.080000] startup_32_smp+0x164/0x170
[ 0.080044] ------------[ cut here ]------------
[ 0.082082] releasing a pinned lock
[ 0.083741] WARNING: CPU: 0 PID: 0 at kernel/locking/lockdep.c:3780 lock_release+0x45c/0x4b0
[ 0.087752] Modules linked in:
[ 0.089169] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.15.0-rc1-00112-g6c0a1d63 #774
[ 0.090000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 0.090000] task: c1ba3340 task.stack: c1b98000
[ 0.090000] EIP: lock_release+0x45c/0x4b0
[ 0.090000] EFLAGS: 00210082 CPU: 0
[ 0.090000] EAX: 00000017 EBX: d244b2c0 ECX: c10c08f0 EDX: c10c090c
[ 0.090000] ESI: c1ba39b8 EDI: c1ba3340 EBP: c1b99c64 ESP: c1b99c34
[ 0.090000] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 0.090000] CR0: 80050033 CR2: ffffffff CR3: 01e7a000 CR4: 00000690
[ 0.090000] Call Trace:
[ 0.090000] ? dequeue_task_idle+0x13/0x40
[ 0.090000] _raw_spin_unlock_irq+0x1b/0x50
[ 0.090000] dequeue_task_idle+0x13/0x40
[ 0.090000] deactivate_task+0xb8/0x160
[ 0.090000] __schedule+0x55c/0xb30
[ 0.090000] ? _raw_spin_unlock_irqrestore+0x4d/0x70
[ 0.090000] schedule+0x35/0x80
[ 0.090000] schedule_timeout+0x192/0x490
[ 0.090000] ? collect_expired_timers+0xc0/0xc0
[ 0.090000] rcu_exp_wait_wake+0xaf/0x940
[ 0.090000] ? trace_hardirqs_on+0xb/0x10
[ 0.090000] _synchronize_rcu_expedited+0x931/0xa10
[ 0.090000] ? native_set_pte+0x10/0x10
[ 0.090000] ? native_set_pte_at+0x10/0x10
[ 0.090000] ? __ptep_modify_prot_start+0x10/0x10
[ 0.090000] ? __paravirt_pgd_alloc+0x10/0x10
[ 0.090000] ? rcu_report_exp_cpu_mult+0x70/0x70
[ 0.090000] ? __paravirt_pgd_alloc+0x10/0x10
[ 0.090000] ? rewind_stack_do_exit+0x13/0x13
[ 0.090000] ? rewind_stack_do_exit+0x13/0x13
[ 0.090000] ? rewind_stack_do_exit+0x13/0x13
[ 0.090000] ? end_pv_irq_ops_irq_disable+0x1/0x1
[ 0.090000] ? ___ratelimit+0xb7/0x110
[ 0.090000] ? trace_hardirqs_on+0xb/0x10
[ 0.090000] ? apply_paravirt+0xc7/0x130
[ 0.090000] ? ___ratelimit+0xb7/0x110
[ 0.090000] ? wake_up_klogd+0x8/0x50
[ 0.090000] ? ___might_sleep+0x2f/0x1f0
[ 0.090000] ? acpi_hw_validate_io_request+0xe/0xed
[ 0.090000] ? acpi_os_read_port+0xb/0x60
[ 0.090000] synchronize_rcu_expedited+0x37/0xb0
[ 0.090000] synchronize_rcu+0xb5/0xe0
[ 0.090000] ? acpi_hw_read_multiple+0x1a/0x47
[ 0.090000] ? ___might_sleep+0x2f/0x1f0
[ 0.090000] ? acpi_hw_register_read+0x56/0xbc
[ 0.090000] ? __might_sleep+0x33/0x90
[ 0.090000] ? find_next_bit+0x12/0x20
[ 0.090000] ? cpumask_next+0x15/0x20
[ 0.090000] ? synchronize_sched_expedited+0x3c/0xe0
[ 0.090000] rcu_test_sync_prims+0xd/0x30
[ 0.090000] rcu_scheduler_starting+0x34/0x60
[ 0.090000] rest_init+0xe/0x1f0
[ 0.090000] start_kernel+0x44f/0x457
[ 0.090000] i386_start_kernel+0x8f/0x93
[ 0.090000] startup_32_smp+0x164/0x170
[ 0.090000] Code: c1 ba 3f 00 00 00 b8 08 11 a9 c1 c6 05 6e f3 c5 c1 01 e8 08 b4 ff ff e9 cd fd ff ff 8d 76 00 c7 04 24 9f fc a8 c1 e8 34 5c fa ff <0f> ff e9 6c fc ff ff e8 c2 c4 f4 ff e9 57 fc ff ff 8d 44 03 74
[ 0.090000] ---[ end trace 677d86ae136591a1 ]---
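In the call traces above, entries prefixed with `?` are addresses the kernel found on the stack but could not confirm as part of the call chain, so they are usually noise when reading the trace. A minimal sketch for extracting only the unmarked frames follows; the helper name `reliable_frames` and the regular expression are illustrative assumptions, not part of any kernel tool.

```python
import re

# Matches "[ 0.080000] symbol+0xOFF/0xSIZE", optionally with a leading "? ".
FRAME = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(dmesg_text):
    """Return symbol names of stack frames that are not marked with '?'."""
    frames = []
    for line in dmesg_text.splitlines():
        m = FRAME.match(line.strip())
        if m and not m.group(1):
            frames.append(m.group(2))
    return frames


trace = """\
[ 0.080000] Call Trace:
[ 0.080000] dump_stack+0x76/0xa9
[ 0.080000] __schedule_bug+0x72/0xa0
[ 0.080000] __schedule+0x871/0xb30
[ 0.080000] ? mark_held_locks+0x73/0xa0
[ 0.080000] schedule+0x35/0x80
[ 0.080000] schedule_timeout+0x192/0x490
[ 0.080000] ? collect_expired_timers+0xc0/0xc0
[ 0.080000] rcu_exp_wait_wake+0xaf/0x940
[ 0.090000] EIP: lock_release+0x45c/0x4b0
"""

frames = reliable_frames(trace)
print(" <- ".join(frames))
```

Applied to the first trace in this report, the same filter reduces the output to the chain from `startup_32_smp` up to `__schedule_bug`, which shows `schedule_timeout` being reached from `rcu_exp_wait_wake` during early boot.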
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
Thanks,
Xiaolong
e7e61e51f9 ("fw_cfg: do DMA read operation"): kernel BUG at arch/x86/mm/physaddr.c:75!
by kernel test robot
Greetings,
0day kernel testing robot got the below dmesg and the first bad commit is
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
commit e7e61e51f97355bd0bfcba115829fafe81681133
Author: Marc-André Lureau <marcandre.lureau(a)redhat.com>
AuthorDate: Tue Jan 23 17:40:39 2018 +0100
Commit: Michael S. Tsirkin <mst(a)redhat.com>
CommitDate: Wed Jan 31 01:47:37 2018 +0200
fw_cfg: do DMA read operation
Modify fw_cfg_read_blob() to use DMA if the device supports it.
Return errors, because the operation may fail.
The DMA operation is expected to run synchronously with today qemu,
but the specification states that it may become async, so we run
"control" field check in a loop for eventual changes.
We may want to switch all the *buf addresses to use only kmalloc'ed
buffers (instead of using stack/image addresses with dma=false).
Signed-off-by: Marc-André Lureau <marcandre.lureau(a)redhat.com>
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
Acked-by: Peter Xu <peterx(a)redhat.com>
2e63155a7c fw_cfg: add DMA register
e7e61e51f9 fw_cfg: do DMA read operation
f26e52e08a Add linux-next specific files for 20180201
+------------------------------------------+------------+------------+---------------+
| | 2e63155a7c | e7e61e51f9 | next-20180201 |
+------------------------------------------+------------+------------+---------------+
| boot_successes | 35 | 0 | 0 |
| boot_failures | 0 | 15 | 11 |
| kernel_BUG_at_arch/x86/mm/physaddr.c | 0 | 15 | 11 |
| invalid_opcode:#[##] | 0 | 15 | 11 |
| EIP:__phys_addr | 0 | 15 | 11 |
| Kernel_panic-not_syncing:Fatal_exception | 0 | 15 | 11 |
+------------------------------------------+------------+------------+---------------+
[ 12.081587] Driver for 1-wire Dallas network protocol.
[ 12.082466] w1_f0d_init()
[ 12.083421] sdhci: Secure Digital Host Controller Interface driver
[ 12.084331] sdhci: Copyright(c) Pierre Ossman
[ 12.090633] ------------[ cut here ]------------
[ 12.091317] kernel BUG at arch/x86/mm/physaddr.c:75!
[ 12.092306] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 12.093194] Modules linked in:
[ 12.093354] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.15.0-00009-ge7e61e5 #1
[ 12.093354] EIP: __phys_addr+0x48/0x90
[ 12.093354] EFLAGS: 00210213 CPU: 0
[ 12.093354] EAX: 00000000 EBX: cd4dde40 ECX: 7c3dbb35 EDX: 00000001
[ 12.093354] ESI: 00000004 EDI: 04000000 EBP: cf01bd9c ESP: cf01bd88
[ 12.093354] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 12.093354] CR0: 80050033 CR2: 00000000 CR3: 01afa000 CR4: 00140690
[ 12.093354] Call Trace:
[ 12.093354] ? fw_cfg_dma_transfer+0x33/0xb0
[ 12.093354] fw_cfg_read_blob+0x9d/0x220
[ 12.093354] fw_cfg_sysfs_probe+0x263/0x670
[ 12.093354] ? mutex_unlock+0xb/0x10
[ 12.093354] platform_drv_probe+0x44/0xc0
[ 12.093354] ? devices_kset_move_last+0x67/0x90
[ 12.093354] driver_probe_device+0x23d/0x2e0
[ 12.093354] ? acpi_driver_match_device+0x27/0x70
[ 12.093354] __driver_attach+0xa1/0xb0
[ 12.093354] ? driver_probe_device+0x2e0/0x2e0
[ 12.093354] bus_for_each_dev+0x4f/0x90
[ 12.093354] driver_attach+0x19/0x20
[ 12.093354] ? driver_probe_device+0x2e0/0x2e0
[ 12.093354] bus_add_driver+0x197/0x210
[ 12.093354] ? firmware_map_add_early+0x47/0x47
[ 12.093354] driver_register+0x54/0xe0
[ 12.093354] __platform_driver_register+0x2a/0x30
[ 12.093354] fw_cfg_sysfs_init+0x2f/0x53
[ 12.093354] do_one_initcall+0x41/0x183
[ 12.093354] ? parse_args+0xb6/0x300
[ 12.093354] ? do_early_param+0x73/0x73
[ 12.093354] kernel_init_freeable+0x178/0x1fc
[ 12.093354] ? rest_init+0xe0/0xe0
[ 12.093354] kernel_init+0xb/0x100
[ 12.093354] ? schedule_tail_wrapper+0x9/0xc
[ 12.093354] ret_from_fork+0x2e/0x38
[ 12.093354] Code: 87 0f c2 85 d2 74 12 89 d9 c1 e9 0c 39 ca 72 13 e8 7e ce ff ff 39 c3 75 4a 89 d8 5b 5d c3 90 8d 74 26 00 0f 0b 8d b6 00 00 00 00 <0f> 0b 8d b6 00 00 00 00 8b 0d 00 87 0f c2 8d 91 00 00 80 00 39
[ 12.093354] EIP: __phys_addr+0x48/0x90 SS:ESP: 0068:cf01bd88
[ 12.119458] ---[ end trace ac8e79b92be7b5f1 ]---
[ 12.120156] Kernel panic - not syncing: Fatal exception
# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start ab57acc2ef0458b7beb8067a0912e301ced5a481 d8a5b80568a9cb66810e75b182018e9edb68e8ff --
git bisect bad a789dae55d5661ea7317e2ec848d18c744ec5ba4 # 03:31 B 0 11 27 2 Merge 'pinctrl/devel' into devel-spot-201802011417
git bisect bad ed2a2a43d202a5c35c50abc07b55f13e2c78473b # 03:42 B 0 9 23 0 Merge 'linux-review/Hui-Wang/ALSA-hda-Fix-headset-mic-detection-problem-for-two-Dell-machines/20180201-114516' into devel-spot-201802011417
git bisect bad 95b346b758bb1a0131826b4aeea504704bd78ad6 # 03:52 B 0 1 15 0 Merge 'kgdb/kgdb-next' into devel-spot-201802011417
git bisect bad 9aa6129bdef4ba1a880967714cd6ddb14cbeecee # 04:07 B 0 11 25 0 Merge 'kas/la57/boot-switching/wip' into devel-spot-201802011417
git bisect good b68ba2f68adffbcdaa8ad504644618a25fcb8c60 # 04:20 G 11 0 1 1 0day base guard for 'devel-spot-201802011417'
git bisect bad 15859da5c121d822ed223bf923f136e9e7df3bca # 04:29 B 0 11 25 0 Merge 'vhost/vhost' into devel-spot-201802011417
git bisect good 2e63155a7c9e3fa55138295a44fa138a0864c569 # 04:41 G 11 0 0 0 fw_cfg: add DMA register
git bisect bad 84d3ba05c3e8ed0ddf7d1c49c29e754a82b3acfb # 05:00 B 0 11 25 0 virtio: make VIRTIO a menuconfig to ease disabling it all
git bisect bad 95afb03a149615e85b268bb5cd221ea72b30e353 # 05:10 B 0 11 25 0 crash: export paddr_vmcoreinfo_note()
git bisect bad e7e61e51f97355bd0bfcba115829fafe81681133 # 05:30 B 0 6 20 0 fw_cfg: do DMA read operation
# first bad commit: [e7e61e51f97355bd0bfcba115829fafe81681133] fw_cfg: do DMA read operation
git bisect good 2e63155a7c9e3fa55138295a44fa138a0864c569 # 05:33 G 30 0 0 0 fw_cfg: add DMA register
# extra tests with debug options
git bisect bad e7e61e51f97355bd0bfcba115829fafe81681133 # 05:43 B 0 11 25 0 fw_cfg: do DMA read operation
# extra tests on HEAD of linux-devel/devel-spot-201802011417
git bisect bad ab57acc2ef0458b7beb8067a0912e301ced5a481 # 05:44 B 0 13 30 0 0day head guard for 'devel-spot-201802011417'
# extra tests on tree/branch linux-next/master
git bisect bad f26e52e08ab8e56f528ac14aa7929b3477de5616 # 06:00 B 0 11 25 0 Add linux-next specific files for 20180201
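The bisect log above follows the column layout named in its header line (`HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD`). A minimal sketch for turning it into structured records is shown below. The names `Step` and `parse_bisect_log` are hypothetical, and reading the GOOD and BAD columns as per-commit boot counts is an assumption based only on that header.

```python
import re
from typing import NamedTuple


class Step(NamedTuple):
    verdict: str   # "good" or "bad"
    sha: str
    time: str
    good: int      # GOOD column
    bad: int       # BAD column
    subject: str


STEP = re.compile(
    r"^git bisect (good|bad) ([0-9a-f]{40})\s+#\s+(\d\d:\d\d)\s+[GB]"
    r"\s+(\d+)\s+(\d+)\s+\d+\s+\d+\s+(.*)$"
)
FIRST_BAD = re.compile(r"^# first bad commit: \[([0-9a-f]{40})\]")


def parse_bisect_log(text):
    """Return (steps, first_bad_sha); first_bad_sha is None if absent."""
    steps, first_bad = [], None
    for line in text.splitlines():
        m = STEP.match(line)
        if m:
            verdict, sha, time, good, bad, subject = m.groups()
            steps.append(Step(verdict, sha, time, int(good), int(bad), subject))
            continue
        m = FIRST_BAD.match(line)
        if m:
            first_bad = m.group(1)
    return steps, first_bad


log = """\
git bisect good b68ba2f68adffbcdaa8ad504644618a25fcb8c60 # 04:20 G 11 0 1 1 0day base guard for 'devel-spot-201802011417'
git bisect good 2e63155a7c9e3fa55138295a44fa138a0864c569 # 04:41 G 11 0 0 0 fw_cfg: add DMA register
git bisect bad e7e61e51f97355bd0bfcba115829fafe81681133 # 05:30 B 0 6 20 0 fw_cfg: do DMA read operation
# first bad commit: [e7e61e51f97355bd0bfcba115829fafe81681133] fw_cfg: do DMA read operation
"""

steps, first_bad = parse_bisect_log(log)
for s in steps:
    print(s.time, s.verdict, s.sha[:10], s.subject)
print("first bad:", first_bad)
```

The four numeric columns are matched explicitly before the subject so that subjects beginning with a digit, such as "0day base guard", are not mistaken for counts.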
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation