Greetings,
FYI, we noticed a -24.0% regression of stress-ng.lockf.ops_per_sec due to commit:
commit: b47291ef02b0bee85ffb7efd6c336060ad1fe1a4 ("mm, slub: change percpu partial accounting from objects to pages")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: stress-ng
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G
memory
with following parameters:
nr_threads: 10%
disk: 1HDD
testtime: 60s
fs: xfs
class: filesystem
test: lockf
cpufreq_governor: performance
ucode: 0x5003006
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@intel.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# If you come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.
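As a quick sanity check of the headline figure, the -24.0% can be recomputed from the per-commit stress-ng.lockf.ops_per_sec means reported in the comparison table below (a minimal sketch; 357404 and 271697 are the averages for the parent and the bisected commit):

```python
# Recompute the headline regression from the reported ops_per_sec means.
parent_ops = 357404    # d0fe47c641 (parent commit)
child_ops = 271697     # b47291ef02 (bisected commit)

# Relative change in percent, as printed in the %change column.
change_pct = (child_ops - parent_ops) / parent_ops * 100
print(f"{change_pct:+.1f}%")  # -24.0%
```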
=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
filesystem/gcc-9/performance/1HDD/xfs/x86_64-rhel-8.3/10%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp7/lockf/stress-ng/60s/0x5003006
commit:
d0fe47c641 ("slub: add back check for free nonslab objects")
b47291ef02 ("mm, slub: change percpu partial accounting from objects to pages")
d0fe47c64152a63c b47291ef02b0bee85ffb7efd6c3
---------------- ---------------------------
%stddev %change %stddev
\ | \
21444470 -24.0% 16302033 ± 3% stress-ng.lockf.ops
357404 -24.0% 271697 ± 3% stress-ng.lockf.ops_per_sec
39082 ± 2% -25.3% 29179 ± 3% stress-ng.time.voluntary_context_switches
0.16 ± 12% -0.0 0.13 ± 9% mpstat.cpu.all.usr%
111717 ± 9% -16.2% 93652 ± 9% numa-meminfo.node0.SUnreclaim
27929 ± 9% -16.2% 23412 ± 9% numa-vmstat.node0.nr_slab_unreclaimable
8521 ± 6% +15.4% 9833 ± 15% softirqs.CPU45.SCHED
146.10 +1.6% 148.47 turbostat.RAMWatt
3238 -8.7% 2955 ± 2% vmstat.system.cs
206396 -12.3% 180914 meminfo.SUnreclaim
320235 -10.6% 286436 meminfo.Slab
28459 -7.3% 26380 proc-vmstat.nr_slab_reclaimable
51598 -12.3% 45228 proc-vmstat.nr_slab_unreclaimable
0.87 ± 11% -0.2 0.65 ± 8% perf-profile.children.cycles-pp.llseek
0.40 ± 13% -0.1 0.29 ± 6% perf-profile.children.cycles-pp.ksys_lseek
0.27 ± 10% -0.1 0.20 ± 12% perf-profile.children.cycles-pp.__entry_text_start
0.25 ± 11% -0.1 0.19 ± 13% perf-profile.children.cycles-pp.memset_erms
0.07 ± 17% -0.0 0.04 ± 45% perf-profile.children.cycles-pp.xfs_file_llseek
0.08 ± 6% -0.0 0.06 ± 9% perf-profile.children.cycles-pp.stress_lockf_contention
0.05 ± 47% +0.1 0.13 ± 11% perf-profile.children.cycles-pp.locks_insert_lock_ctx
0.04 ± 72% +0.1 0.12 ± 20% perf-profile.children.cycles-pp.locks_release_private
0.11 ± 11% +0.1 0.20 ± 12% perf-profile.children.cycles-pp.locks_dispose_list
0.05 ± 45% +0.1 0.18 ± 23% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
0.00 +0.2 0.21 ± 31% perf-profile.children.cycles-pp.__unfreeze_partials
0.00 +0.2 0.25 ± 29% perf-profile.children.cycles-pp.get_partial_node
0.24 ± 10% -0.1 0.18 ± 12% perf-profile.self.cycles-pp.memset_erms
0.22 ± 8% -0.0 0.17 ± 12% perf-profile.self.cycles-pp.kmem_cache_alloc
0.14 ± 10% -0.0 0.11 ± 11% perf-profile.self.cycles-pp.__entry_text_start
0.07 ± 17% -0.0 0.04 ± 45% perf-profile.self.cycles-pp.xfs_file_llseek
0.04 ± 73% +0.1 0.12 ± 20% perf-profile.self.cycles-pp.locks_release_private
0.00 +0.1 0.09 ± 29% perf-profile.self.cycles-pp.__unfreeze_partials
0.04 ± 45% +0.1 0.15 ± 22% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.00 +0.2 0.16 ± 29% perf-profile.self.cycles-pp.get_partial_node
0.26 ± 12% +0.2 0.48 ± 5% perf-profile.self.cycles-pp._raw_spin_lock
0.67 ± 8% +1032.2% 7.57 ± 3% perf-stat.i.MPKI
6.339e+09 -23.8% 4.829e+09 ± 3% perf-stat.i.branch-instructions
11570593 ± 7% -16.3% 9688662 ± 6% perf-stat.i.branch-misses
3504821 ± 39% +730.5% 29107036 ± 26% perf-stat.i.cache-misses
13767953 ± 9% +853.6% 1.313e+08 ± 6% perf-stat.i.cache-references
2889 -8.7% 2636 perf-stat.i.context-switches
1.20 +30.6% 1.57 ± 3% perf-stat.i.cpi
10237 ± 41% -87.6% 1269 ± 31% perf-stat.i.cycles-between-cache-misses
0.00 ± 18% +0.0 0.00 ± 47% perf-stat.i.dTLB-load-miss-rate%
79657 ± 23% +111.8% 168752 ± 33% perf-stat.i.dTLB-load-misses
7.894e+09 -22.8% 6.096e+09 ± 3% perf-stat.i.dTLB-loads
1.195e+09 -21.5% 9.385e+08 ± 3% perf-stat.i.dTLB-stores
2756251 ± 3% -14.2% 2364908 ± 4% perf-stat.i.iTLB-load-misses
2.264e+10 -23.7% 1.727e+10 ± 3% perf-stat.i.instructions
8215 ± 3% -11.1% 7303 ± 2% perf-stat.i.instructions-per-iTLB-miss
0.83 -22.9% 0.64 ± 3% perf-stat.i.ipc
11.34 ± 2% -3.8% 10.92 ± 4% perf-stat.i.major-faults
179.74 ± 7% -32.9% 120.57 ± 20% perf-stat.i.metric.K/sec
160.68 -22.3% 124.90 ± 3% perf-stat.i.metric.M/sec
1133752 ± 45% +392.3% 5581428 ± 24% perf-stat.i.node-load-misses
394916 ± 51% +290.3% 1541328 ± 27% perf-stat.i.node-loads
122286 ± 17% +438.7% 658797 ± 37% perf-stat.i.node-store-misses
0.61 ± 9% +1149.6% 7.59 ± 3% perf-stat.overall.MPKI
1.20 +30.5% 1.57 ± 3% perf-stat.overall.cpi
9089 ± 37% -88.6% 1038 ± 41% perf-stat.overall.cycles-between-cache-misses
0.00 ± 23% +0.0 0.00 ± 34% perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 4% +0.0 0.00 ± 26% perf-stat.overall.dTLB-store-miss-rate%
8224 ± 3% -11.1% 7308 ± 2% perf-stat.overall.instructions-per-iTLB-miss
0.83 -23.3% 0.64 ± 3% perf-stat.overall.ipc
6.238e+09 -23.8% 4.752e+09 ± 3% perf-stat.ps.branch-instructions
11350804 ± 7% -16.1% 9518421 ± 6% perf-stat.ps.branch-misses
3449652 ± 39% +730.2% 28640228 ± 26% perf-stat.ps.cache-misses
13541733 ± 9% +854.1% 1.292e+08 ± 6% perf-stat.ps.cache-references
2842 -8.7% 2595 perf-stat.ps.context-switches
78208 ± 23% +112.4% 166150 ± 33% perf-stat.ps.dTLB-load-misses
7.77e+09 -22.8% 5.999e+09 ± 3% perf-stat.ps.dTLB-loads
1.176e+09 -21.5% 9.234e+08 ± 3% perf-stat.ps.dTLB-stores
2712629 ± 3% -14.2% 2327097 ± 4% perf-stat.ps.iTLB-load-misses
2.228e+10 -23.7% 1.7e+10 ± 3% perf-stat.ps.instructions
1116097 ± 45% +392.1% 5492089 ± 24% perf-stat.ps.node-load-misses
388697 ± 51% +290.2% 1516521 ± 27% perf-stat.ps.node-loads
120277 ± 17% +439.0% 648261 ± 37% perf-stat.ps.node-store-misses
1.408e+12 -23.1% 1.083e+12 ± 3% perf-stat.total.instructions
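A note on the large MPKI jump above: the reported values are consistent with cache-references (not cache-misses) per kilo-instruction, computed from the perf-stat.ps averages. A small sketch of that interpretation (my reading of the metric, not something the report spells out):

```python
# MPKI here appears to be cache-references per 1000 instructions
# (an assumption; derived from the perf-stat.ps rows above).
def mpki(cache_refs, instructions):
    return cache_refs / (instructions / 1000)

base = mpki(13541733, 2.228e10)    # parent commit, ~0.61 (reported 0.61)
regressed = mpki(1.292e8, 1.7e10)  # b47291ef02,   ~7.60 (reported 7.59)
print(f"{base:.2f} -> {regressed:.2f}")
```

The ~12x increase in cache references per instruction matches the +730% cache-miss rise and the drop in cycles-between-cache-misses, pointing at the extra partial-slab traffic (__unfreeze_partials, get_partial_node) seen in the profile.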
26590 ± 3% -60.7% 10448 ± 4% slabinfo.anon_vma.active_objs
577.50 ± 3% -60.1% 230.17 ± 4% slabinfo.anon_vma.active_slabs
26590 ± 3% -60.1% 10618 ± 4% slabinfo.anon_vma.num_objs
577.50 ± 3% -60.1% 230.17 ± 4% slabinfo.anon_vma.num_slabs
61948 ± 6% -79.2% 12866 slabinfo.anon_vma_chain.active_objs
971.33 ± 6% -78.8% 206.00 slabinfo.anon_vma_chain.active_slabs
62204 ± 6% -78.8% 13209 slabinfo.anon_vma_chain.num_objs
971.33 ± 6% -78.8% 206.00 slabinfo.anon_vma_chain.num_slabs
141234 ± 2% -23.7% 107719 slabinfo.dentry.active_objs
3385 ± 2% -23.5% 2589 slabinfo.dentry.active_slabs
142212 ± 2% -23.5% 108771 slabinfo.dentry.num_objs
3385 ± 2% -23.5% 2589 slabinfo.dentry.num_slabs
22577 ± 11% -25.4% 16841 ± 3% slabinfo.file_lock_cache.active_objs
610.17 ± 11% -23.2% 468.83 ± 2% slabinfo.file_lock_cache.active_slabs
22590 ± 11% -23.1% 17368 ± 2% slabinfo.file_lock_cache.num_objs
610.17 ± 11% -23.2% 468.83 ± 2% slabinfo.file_lock_cache.num_slabs
29547 ± 4% -67.3% 9675 slabinfo.filp.active_objs
954.50 ± 4% -58.6% 395.00 ± 2% slabinfo.filp.active_slabs
30553 ± 4% -58.6% 12658 ± 2% slabinfo.filp.num_objs
954.50 ± 4% -58.6% 395.00 ± 2% slabinfo.filp.num_slabs
9641 ± 2% -22.8% 7445 slabinfo.kmalloc-256.active_objs
9641 ± 2% -20.5% 7666 slabinfo.kmalloc-256.num_objs
5391 ± 5% -31.1% 3712 slabinfo.kmalloc-2k.active_objs
340.67 ± 4% -30.4% 237.17 slabinfo.kmalloc-2k.active_slabs
5460 ± 4% -30.3% 3806 slabinfo.kmalloc-2k.num_objs
340.67 ± 4% -30.4% 237.17 slabinfo.kmalloc-2k.num_slabs
15908 ± 3% -26.3% 11720 slabinfo.kmalloc-512.active_objs
498.00 ± 3% -25.5% 370.83 slabinfo.kmalloc-512.active_slabs
15945 ± 3% -25.4% 11888 slabinfo.kmalloc-512.num_objs
498.00 ± 3% -25.5% 370.83 slabinfo.kmalloc-512.num_slabs
3377 ± 7% -18.1% 2767 ± 6% slabinfo.kmalloc-cg-1k.active_objs
3377 ± 7% -18.1% 2767 ± 6% slabinfo.kmalloc-cg-1k.num_objs
554.67 ± 13% -46.2% 298.67 ± 22% slabinfo.kmalloc-rcl-128.active_objs
554.67 ± 13% -26.0% 410.67 ± 15% slabinfo.kmalloc-rcl-128.num_objs
7747 ± 5% -16.4% 6475 ± 5% slabinfo.kmalloc-rcl-64.active_objs
7771 ± 5% -16.3% 6501 ± 5% slabinfo.kmalloc-rcl-64.num_objs
2443 ± 6% -25.5% 1820 ± 2% slabinfo.kmalloc-rcl-96.active_objs
2443 ± 6% -24.4% 1848 ± 5% slabinfo.kmalloc-rcl-96.num_objs
4608 ± 4% -10.1% 4142 ± 5% slabinfo.signal_cache.active_objs
4373 ± 4% -13.9% 3765 slabinfo.skbuff_head_cache.active_objs
4373 ± 4% -13.0% 3802 slabinfo.skbuff_head_cache.num_objs
55302 ± 8% -76.1% 13209 slabinfo.vm_area_struct.active_objs
1384 ± 8% -75.9% 334.17 slabinfo.vm_area_struct.active_slabs
55407 ± 8% -75.9% 13377 slabinfo.vm_area_struct.num_objs
1384 ± 8% -75.9% 334.17 slabinfo.vm_area_struct.num_slabs
14355 -9.7% 12956 slabinfo.vmap_area.active_objs
14358 -9.5% 12988 slabinfo.vmap_area.num_objs
stress-ng.lockf.ops
2.5e+07 +-----------------------------------------------------------------+
| |
|.++.++.++.++.++.++.++.+.++.++.++.++.++.++.+. +.++.++.+ |
2e+07 |-+ + |
| O O O O O O |
| OO O O O OO O OO O OO O O OO O OO O OO O O OO OO OO O |
1.5e+07 |-+ O O O |
| |
1e+07 |-+ |
| |
| |
5e+06 |-+ |
| |
| |
0 +-----------------------------------------------------------------+
stress-ng.lockf.ops_per_sec
400000 +------------------------------------------------------------------+
|.+ .++. .+ .+. +. +. +.+ .+ .++. +.+.++.+ ++. .+. |
350000 |-++ ++ + + + + +.+ + + :+ ++ + |
| + |
300000 |-+O O O O O O O O O O O O OO |
250000 |-O OO O OO O OO OO O O O O OO O O OO O O OO OO |
| |
200000 |-+ |
| |
150000 |-+ |
100000 |-+ |
| |
50000 |-+ |
| |
0 +------------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org Intel Corporation
Thanks,
Oliver Sang