Greetings,
FYI, we noticed a 57.3% improvement in will-it-scale.per_process_ops due to commit:
commit: 3510ca20ece0150af6b10c77a74ff1b5c198e3e2 ("Minor page waitqueue cleanups")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 72 threads Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz with 128G memory
with the following parameters:
    nr_task: 100%
    mode: process
    test: pread2
    cpufreq_governor: performance
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel
copies to see if the testcase will scale. It builds both a process-based and a thread-based
test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
    git clone https://github.com/01org/lkp-tests.git
    cd lkp-tests
    bin/lkp install job.yaml  # job file is attached in this email
    bin/lkp run job.yaml
testcase/path_params/tbox_group/run: will-it-scale/100%-process-pread2-performance/lkp-hsw-ep4
0cc3b0ec23ce4c69        3510ca20ece0150af6b10c77a7
----------------        --------------------------
         %stddev  change          %stddev
            \       |                \
      6671           57%      10495        will-it-scale.per_process_ops
    468302           40%     657813        vmstat.system.cs
     14.64           53%      22.42        perf-stat.cache-miss-rate%
  1.42e+08           41%  1.996e+08        perf-stat.context-switches
 1.826e+09           39%  2.546e+09        perf-stat.iTLB-loads
 8.607e+08           39%  1.198e+09        perf-stat.node-store-misses
 2.482e+11           35%  3.361e+11        perf-stat.dTLB-stores
      0.34           33%       0.45 ±  3%  perf-stat.branch-miss-rate%
 3.512e+09           28%   4.49e+09        perf-stat.branch-misses
 3.972e+09           26%  4.992e+09        perf-stat.cache-misses
 2.475e+09           24%  3.067e+09        perf-stat.node-load-misses
 6.177e+08           14%  7.073e+08        perf-stat.node-stores
     14841 ± 10%     13%      16765 ±  3%  perf-stat.instructions-per-iTLB-miss
     58.21            8%      62.87        perf-stat.node-store-miss-rate%
     36936            6%      39270        perf-stat.cpu-migrations
      0.29            6%       0.31        perf-stat.ipc
 1.032e+12           -4%  9.917e+11        perf-stat.branch-instructions
      3.40           -6%       3.20        perf-stat.cpi
 1.472e+13           -7%  1.363e+13 ±  3%  perf-stat.cpu-cycles
 2.965e+08 ± 13%    -14%  2.543e+08        perf-stat.iTLB-load-misses
 2.714e+10          -18%  2.226e+10        perf-stat.cache-references
 2.191e+08 ±  4%    -24%  1.667e+08        perf-stat.dTLB-store-misses
     13.95 ± 12%    -35%       9.08        perf-stat.iTLB-load-miss-rate%
      0.09 ±  4%    -44%       0.05        perf-stat.dTLB-store-miss-rate%
      0.11 ± 36%    -54%       0.05 ±  5%  perf-stat.dTLB-load-miss-rate%
 1.247e+09 ± 37%    -55%  5.627e+08 ±  4%  perf-stat.dTLB-load-misses
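
For anyone who wants to sanity-check the change column, here is a minimal Python sketch. It is not part of lkp-tests, and the helper names pct_change and ratio_pct are made up for illustration. It assumes the change column is the relative difference between the two commits' mean values, that cache-miss-rate% is cache-misses divided by cache-references, and that ipc is the reciprocal of cpi. The inputs are the rounded values from the table, so derived figures may differ from the reported ones in the last digit.

```python
# Illustrative helpers only; these names are not part of lkp-tests.

def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def ratio_pct(part: float, whole: float) -> float:
    """part expressed as a percentage of whole (e.g. misses / references)."""
    return part / whole * 100.0


if __name__ == "__main__":
    # will-it-scale.per_process_ops: 6671 -> 10495
    print(f"per_process_ops change: {pct_change(6671, 10495):.1f}%")  # 57.3%

    # Assumption: cache-miss-rate% = cache-misses / cache-references.
    print(f"base cache-miss-rate: {ratio_pct(3.972e9, 2.714e10):.2f}%")
    print(f"new cache-miss-rate:  {ratio_pct(4.992e9, 2.226e10):.2f}%")

    # Assumption: ipc is the reciprocal of cpi.
    print(f"base ipc: {1 / 3.40:.2f}, new ipc: {1 / 3.20:.2f}")
```

The first line reproduces the 57.3% figure quoted at the top of this report. The derived ratios land close to the perf-stat rows above under the stated assumptions.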
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong