Greetings,
FYI, we noticed the following commit (built with gcc-9):
commit: 54b675d9b28d9a56289d06a813250472bc621f40 ("[HACK] demonstrate lazy tlb issues")
https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git sched/bad_lazymm
in testcase: xfstests
version: xfstests-x86_64-99bc497-1_20211101
with the following parameters:
	disk: 2pmem
	fs: ext4
	test: ext4-dax
	ucode: 0x7000019
test-description: xfstests is a regression test suite for xfs and other filesystems.
test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
on test machine: 16 threads 1 socket Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 48G memory
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
+------------------------------------+------------+------------+
|                                    | 0304e8588d | 54b675d9b2 |
+------------------------------------+------------+------------+
| boot_successes                     | 11         | 0          |
| boot_failures                      | 0          | 13         |
| WARNING:at_kernel/fork.c:#__mmdrop | 0          | 13         |
| RIP:__mmdrop                       | 0          | 13         |
+------------------------------------+------------+------------+
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang(a)intel.com>
[ 11.139437][ T230] WARNING: CPU: 10 PID: 230 at kernel/fork.c:749 __mmdrop (kernel/fork.c:749 (discriminator 1))
[ 11.147364][ T230] Modules linked in:
[ 11.151125][ T230] CPU: 10 PID: 230 Comm: modprobe Not tainted 5.15.0-00004-g54b675d9b28d #1
[ 11.159670][ T230] Hardware name: Supermicro SYS-5018D-FN4T/X10SDV-8C-TLN4F, BIOS 1.1 03/02/2016
[ 11.168529][ T230] RIP: 0010:__mmdrop (kernel/fork.c:749 (discriminator 1))
[ 11.172685][ T209] usb 3-4.1: new low-speed USB device number 4 using xhci_hcd
[ 11.173251][ T230] Code: 48 89 ee 5d 41 5c e9 ad 1c 28 00 be 03 00 00 00 4c 89 c7 e8 60 62 50 00 eb de e8 d9 af 0f 00 eb d7 0f 0b 0f 0b e9 3d ff ff ff <0f> 0b e9 4c ff ff ff 48 89 ef e8 7f 41 27 00 e9 61 ff ff ff 66 66
All code
========
0: 48 89 ee mov %rbp,%rsi
3: 5d pop %rbp
4: 41 5c pop %r12
6: e9 ad 1c 28 00 jmpq 0x281cb8
b: be 03 00 00 00 mov $0x3,%esi
10: 4c 89 c7 mov %r8,%rdi
13: e8 60 62 50 00 callq 0x506278
18: eb de jmp 0xfffffffffffffff8
1a: e8 d9 af 0f 00 callq 0xfaff8
1f: eb d7 jmp 0xfffffffffffffff8
21: 0f 0b ud2
23: 0f 0b ud2
25: e9 3d ff ff ff jmpq 0xffffffffffffff67
2a:* 0f 0b ud2 <-- trapping instruction
2c: e9 4c ff ff ff jmpq 0xffffffffffffff7d
31: 48 89 ef mov %rbp,%rdi
34: e8 7f 41 27 00 callq 0x2741b8
39: e9 61 ff ff ff jmpq 0xffffffffffffff9f
3e: 66 data16
3f: 66 data16
Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: e9 4c ff ff ff jmpq 0xffffffffffffff53
7: 48 89 ef mov %rbp,%rdi
a: e8 7f 41 27 00 callq 0x27418e
f: e9 61 ff ff ff jmpq 0xffffffffffffff75
14: 66 data16
15: 66 data16
[ 11.199988][ T230] RSP: 0018:ffffc900005bbe88 EFLAGS: 00010246
[ 11.205915][ T230] RAX: ffff8881011b4f80 RBX: ffff8881011b4f80 RCX: ffff8881584c5300
[ 11.213733][ T230] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff888103960cc0
[ 11.221550][ T230] RBP: ffff888103960cc0 R08: ffff8881584c4d00 R09: ffffffff81300300
[ 11.229384][ T230] R10: ffff888c7cf09b58 R11: 0000000000000001 R12: ffff8881011b4f80
[ 11.237220][ T230] R13: 0000000000000001 R14: 0000000000000000 R15: ffff8881011b5bf0
[ 11.245055][ T230] FS: 0000000000000000(0000) GS:ffff888c3d080000(0000)
knlGS:0000000000000000
[ 11.253842][ T230] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 11.260271][ T230] CR2: 00007f4c016bf114 CR3: 0000000c7ec10001 CR4: 00000000003706e0
[ 11.268115][ T230] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 11.275965][ T230] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 11.283802][ T230] Call Trace:
[ 11.283994][ T209] usb 3-4.1: New USB device found, idVendor=0557, idProduct=2419,
bcdDevice= 1.00
[ 11.286944][ T230] do_exit (arch/x86/include/asm/bitops.h:207
include/asm-generic/bitops/instrumented-non-atomic.h:135 include/linux/thread_info.h:118
kernel/exit.c:502 kernel/exit.c:812)
[ 11.295979][ T209] usb 3-4.1: New USB device strings: Mfr=0, Product=0,
SerialNumber=0
[ 11.299897][ T230] do_group_exit (include/linux/list.h:282
include/linux/sched/signal.h:686 kernel/exit.c:907)
[ 11.312160][ T230] __x64_sys_exit_group (kernel/exit.c:933)
[ 11.313067][ T209] input: HID 0557:2419 as
/devices/pci0000:00/0000:00:14.0/usb3/3-4/3-4.1/3-4.1:1.0/0003:0557:2419.0002/input/input4
[ 11.317038][ T230] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[ 11.317044][ T230] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113)
[ 11.339130][ T230] RIP: 0033:0x7f4c012c19d6
[ 11.343393][ T230] Code: Unable to access opcode bytes at RIP 0x7f4c012c19ac.
[ 11.350621][ T230] RSP: 002b:00007ffe398efea8 EFLAGS: 00000246 ORIG_RAX:
00000000000000e7
[ 11.358889][ T230] RAX: ffffffffffffffda RBX: 00007f4c013b2760 RCX: 00007f4c012c19d6
[ 11.366706][ T230] RDX: 0000000000000001 RSI: 000000000000003c RDI: 0000000000000001
[ 11.374524][ T230] RBP: 0000000000000001 R08: 00000000000000e7 R09: ffffffffffffff80
[ 11.380812][ T209] hid-generic 0003:0557:2419.0002: input,hidraw1: USB HID v1.00
Keyboard [HID 0557:2419] on usb-0000:00:14.0-4.1/input0
[ 11.382358][ T230] R10: 00007ffe398efd5c R11: 0000000000000246 R12: 00007f4c013b2760
[ 11.382360][ T230] R13: 0000000000000001 R14: 00007f4c013bb428 R15: 0000000000000000
[ 11.395475][ T209] input: HID 0557:2419 as
/devices/pci0000:00/0000:00:14.0/usb3/3-4/3-4.1/3-4.1:1.1/0003:0557:2419.0003/input/input5
[ 11.402518][ T230] ---[ end trace 8a0cdd37e7ac904b ]---
[ 11.404617][ T1] Loaded X.509 cert 'Build time autogenerated kernel key:
825f8f632f2835177197bb4b0ca2da2b106df827'
[ 11.410430][ T209] hid-generic 0003:0557:2419.0003: input,hidraw2: USB HID v1.00 Mouse
[HID 0557:2419] on usb-0000:00:14.0-4.1/input1
[ 11.422482][ T1] zswap: loaded using pool lzo/zbud
[ 11.455839][ T1] Key type ._fscrypt registered
[ 11.460543][ T1] Key type .fscrypt registered
[ 11.465177][ T1] Key type fscrypt-provisioning registered
[ 11.470889][ T1] pstore: Using crash dump compression: deflate
[ 11.479754][ T1] Key type encrypted registered
[ 11.658636][ T1] pps pps0: new PPS source ptp2
[ 11.663412][ T1] ixgbe 0000:03:00.0: registered PHC device on eth2
[ 11.876201][ T1] pps pps1: new PPS source ptp3
[ 11.880977][ T1] ixgbe 0000:03:00.1: registered PHC device on eth3
[ 12.845928][ C4] random: fast init done
[ 15.035123][ T195] igb 0000:05:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full
Duplex, Flow Control: RX
[ 15.045789][ T195] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 15.057691][ T1] Sending DHCP requests ...., OK
[ 26.900567][ T1] IP-Config: Got DHCP answer from 192.168.3.2, my address is
192.168.3.82
[ 26.908908][ T1] IP-Config: Complete:
[ 26.912824][ T1] device=eth0, hwaddr=0c:c4:7a:c4:ab:7a, ipaddr=192.168.3.82,
mask=255.255.255.0, gw=192.168.3.200
[ 26.923770][ T1] host=lkp-bdw-de1,
domain=lkp.intel.com, nis-domain=(none)
[ 26.931351][ T1] bootserver=192.168.3.200, rootserver=192.168.3.200, rootpath=
[ 26.931353][ T1] nameserver0=192.168.3.200
[ 26.989125][ T1] ixgbe 0000:03:00.0: removed PHC on eth2
[ 28.116657][ T1] ixgbe 0000:03:00.1: removed PHC on eth3
[ 29.605079][ T1] Freeing unused kernel image (initmem) memory: 2548K
[ 29.613687][ T1] Write protecting the kernel read-only data: 24576k
[ 29.620764][ T1] Freeing unused kernel image (text/rodata gap) memory: 2036K
[ 29.628382][ T1] Freeing unused kernel image (rodata/data gap) memory: 1340K
[ 29.644737][ T1] Run /init as init process
[ 29.649096][ T1] with arguments:
[ 29.652777][ T1] /init
[ 29.655741][ T1] nokaslr
[ 29.658877][ T1] with environment:
[ 29.662709][ T1] HOME=/
[ 29.665758][ T1] TERM=linux
[ 29.669155][ T1] user=lkp
[ 29.672381][ T1]
job=/lkp/jobs/scheduled/lkp-bdw-de1/xfstests-2pmem-ext4-ext4-dax-ucode=0x7000019-debian-10.4-x86_64-20200603.cgz-54b675d9b28d9a56289d06a813250472bc621f40-20211107-33854-12dhxvn-6.yaml
[ 29.690798][ T1] ARCH=x86_64
[ 29.694280][ T1] kconfig=x86_64-rhel-8.3-func
[ 29.699238][ T1] branch=luto/sched/bad_lazymm
[ 29.704198][ T1] commit=54b675d9b28d9a56289d06a813250472bc621f40
[ 29.710809][ T1]
BOOT_IMAGE=/pkg/linux/x86_64-rhel-8.3-func/gcc-9/54b675d9b28d9a56289d06a813250472bc621f40/vmlinuz-5.15.0-00004-g54b675d9b28d
[ 29.724102][ T1] max_uptime=2100
[ 29.727950][ T1]
RESULT_ROOT=/result/xfstests/2pmem-ext4-ext4-dax-ucode=0x7000019/lkp-bdw-de1/debian-10.4-x86_64-20200603.cgz/x86_64-rhel-8.3-func/gcc-9/54b675d9b28d9a56289d06a813250472bc621f40/6
[ 29.745925][ T1] LKP_SERVER=internal-lkp-server
[ 29.751073][ T1] selinux=0
[ 29.754400][ T1] softlockup_panic=1
[ 29.758510][ T1] prompt_ramdisk=0
[ 29.767904][ T1] systemd[1]: RTC configured in localtime, applying delta of 0
minutes to system time.
[ 29.815353][ T252] random: lvmconfig: uninitialized urandom read (4 bytes read)
[ 29.870206][ T280] random: systemd-random-: uninitialized urandom read (512 bytes
read)
[ 29.969670][ T321] IPMI message handler: version 39.2
[ 29.976632][ T321] ipmi device interface
[ 29.980769][ T329] dca service started, version 1.12.1
[ 29.987470][ T321] ipmi_si: IPMI System Interface driver
[ 29.993083][ T321] ipmi_si dmi-ipmi-si.0: ipmi_platform: probing via SMBIOS
[ 30.000160][ T321] ipmi_platform: ipmi_si: SMBIOS: io 0xca2 regsize 1 spacing 1 irq 0
[ 30.011663][ T321] ipmi_si: Adding SMBIOS-specified kcs state machine
[ 30.018552][ T321] ipmi_si IPI0001:00: ipmi_platform: probing via ACPI
[ 30.025355][ T321] ipmi_si IPI0001:00: ipmi_platform: [io 0x0ca2] regsize 1 spacing 1
irq 0
[ 30.042600][ T327] gpio_ich gpio_ich.1.auto: GPIO from 948 to 1023
[ 30.049182][ T326] ioatdma: Intel(R) QuickData Technology Driver 5.00
[ 30.056063][ T321] ipmi_si dmi-ipmi-si.0: Removing SMBIOS-specified kcs state machine
in favor of ACPI
[ 30.065467][ T321] ipmi_si: Adding ACPI-specified kcs state machine
[ 30.065565][ T321] ipmi_si: Trying ACPI-specified kcs state machine at i/o address
0xca2, slave address 0x20, irq 0
[ 30.077176][ T326] IOAPIC[9]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1
Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:EF Dest:00000000 SID:002C SQ:0 SVT:1)
[ 30.096532][ T326] IOAPIC[1]: Preconfigured routing entry (9-13 -> IRQ 88 Level:1
ActiveLow:1)
[ 30.105769][ T358] libata version 3.00 loaded.
[ 30.106791][ T326] IOAPIC[9]: Set IRTE entry (P:1 FPD:0 Dst_Mode:0 Redir_hint:1
Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:EF Dest:00000000 SID:002C SQ:0 SVT:1)
[ 30.121899][ T435] random: ln: uninitialized urandom read (6 bytes read)
[ 30.124481][ T326] IOAPIC[1]: Preconfigured routing entry (9-14 -> IRQ 90 Level:1
ActiveLow:1)
LKP: HOSTNAME lkp-bdw-de1, MAC 0c:c4:7a:c4:ab:7a, kernel 5.15.0-00004-g54b675d9b28d 1,
serial console /dev/ttyS0
[ 30.161780][ T358] ahci 0000:00:1f.2: version 3.0
[ 30.166594][ T321] ipmi_si IPI0001:00: The BMC does not support clearing the recv irq
bit, compensating, but the BMC needs to be fixed.
[ 30.167637][ T358] ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 6 Gbps 0x3f impl
SATA mode
[ 30.167649][ T358] ahci 0000:00:1f.2: flags: 64bit ncq pm led clo pio slum part ems
apst
[ 30.170722][ T94] mei_me 0000:00:16.0: Device doesn't have valid ME Interface
[ 30.184055][ T358] scsi host0: ahci
[ 30.191544][ T360] nd_pmem namespace0.0: unable to guarantee persistence of writes
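The kernel `Code:` line in the trace above marks the trapping byte with angle brackets (`<0f>`), and the disassembly shows it at offset `2a`. As a rough illustration of how that offset falls out of the raw line, here is a minimal Python sketch. The helper name `trapping_offset` is hypothetical and not part of lkp-tests or the kernel's decode scripts; it only assumes the space-separated hex format shown in this report.

```python
def trapping_offset(code_line):
    """Return (byte_offset, opcode_byte) for the <xx>-marked byte of a
    kernel 'Code:' line, or None if no byte is marked.

    Hypothetical helper for illustration; assumes space-separated hex bytes.
    """
    # Drop the timestamp/task prefix and keep only the hex dump.
    hex_part = code_line.split("Code:", 1)[1]
    for offset, token in enumerate(hex_part.split()):
        # The kernel wraps the byte at the faulting RIP in angle brackets.
        if token.startswith("<") and token.endswith(">"):
            return offset, int(token[1:-1], 16)
    return None


# The Code: line from this report, joined onto one line.
line = (
    "[ 11.173251][ T230] Code: 48 89 ee 5d 41 5c e9 ad 1c 28 00 be 03 00 00 00 "
    "4c 89 c7 e8 60 62 50 00 eb de e8 d9 af 0f 00 eb d7 0f 0b 0f 0b e9 3d ff ff ff "
    "<0f> 0b e9 4c ff ff ff 48 89 ef e8 7f 41 27 00 e9 61 ff ff ff 66 66"
)

offset, opcode = trapping_offset(line)
print(f"trapping byte 0x{opcode:02x} at offset 0x{offset:x}")
```

For this report the marked byte is `0f`, the first byte of the `ud2` that the disassembly flags as the trapping instruction.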
To reproduce:
        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file
        # if you come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.
---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation
Thanks,
Oliver Sang