Greetings,
We noticed a -10% regression in aim7.jobs-per-min due to commit:
commit: bb95d9ab2a9d4afd03b59a603cccb2c601f68b78 ("f2fs: drop exist_data for inline_data when truncated to 0")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
in testcase: aim7
on test machine: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz with 384G memory
with following parameters:
disk: 4BRD_12G
md: RAID1
fs: f2fs
test: creat-clo
load: 1500
cpufreq_governor: performance
test-description: AIM7 is a traditional UNIX system-level benchmark suite used to test and measure the performance of multiuser systems.
test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
testcase/path_params/tbox_group/run: aim7/4BRD_12G-RAID1-f2fs-creat-clo-1500-performance/lkp-ivb-ep01
363fa4e078cbdc97        bb95d9ab2a9d4afd03b59a603c
----------------        --------------------------
        %stddev  change          %stddev
              \      |                  \
    69986  ± 3%    -10%      63066        aim7.jobs-per-min
   226578  ± 8%     87%     423593  ± 4%  aim7.time.involuntary_context_switches
     4502  ± 3%     15%       5157        aim7.time.system_time
      129  ± 3%     11%        142        aim7.time.elapsed_time
      129  ± 3%     11%        142        aim7.time.elapsed_time.max
   644595 ± 17%    -28%     462051  ± 3%  aim7.time.voluntary_context_switches
    12748  ± 8%    -11%      11310  ± 3%  vmstat.system.cs
     4192  ± 3%    -10%       3772        iostat.md0.wkB/s
     2622            5%       2747        turbostat.Avg_MHz
    87.84            3%      90.68        turbostat.%Busy
      128            3%        132        turbostat.CorWatt
      155                      159        turbostat.PkgWatt
   316869 ± 24%   7e+05     972706 ± 23%  latency_stats.avg.call_rwsem_down_write_failed.path_openat.do_filp_open.do_sys_open.SyS_creat.do_syscall_64.return_from_SYSCALL_64
     3618 ± 83%   1e+04      18282 ± 43%  latency_stats.avg.perf_event_alloc.SYSC_perf_event_open.SyS_perf_event_open.entry_SYSCALL_64_fastpath
     3618 ± 83%   1e+04      18282 ± 43%  latency_stats.max.perf_event_alloc.SYSC_perf_event_open.SyS_perf_event_open.entry_SYSCALL_64_fastpath
     3618 ± 83%   1e+04      18282 ± 43%  latency_stats.sum.perf_event_alloc.SYSC_perf_event_open.SyS_perf_event_open.entry_SYSCALL_64_fastpath
     0.32 ± 23%     40%       0.45  ± 9%  perf-stat.dTLB-load-miss-rate%
     0.16           38%       0.22        perf-stat.branch-miss-rate%
1.854e+09  ± 3%     23%  2.282e+09        perf-stat.branch-misses
1.362e+13  ± 4%     16%  1.574e+13        perf-stat.cpu-cycles
1.176e+09  ± 3%     14%  1.343e+09        perf-stat.node-store-misses
1.447e+10  ± 7%     14%  1.645e+10  ± 4%  perf-stat.cache-references
1.579e+09  ± 4%      9%   1.72e+09        perf-stat.node-stores
3.156e+11            3%  3.261e+11        perf-stat.dTLB-stores
    42.68                    43.85        perf-stat.node-store-miss-rate%
    45.56                    45.00        perf-stat.node-load-miss-rate%
    25.79           -3%      24.97        perf-stat.cache-miss-rate%
1.189e+12  ± 4%    -11%  1.058e+12        perf-stat.branch-instructions
5.785e+12  ± 4%    -16%  4.837e+12        perf-stat.instructions
1.552e+12  ± 4%    -17%  1.286e+12        perf-stat.dTLB-loads
     0.42          -28%       0.31        perf-stat.ipc
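As a rough cross-check of the comparison table, the sketch below recomputes two of its entries from the raw values. It assumes that the "change" column is the relative difference of the right-hand column against the left-hand one, and that perf-stat.ipc is instructions divided by cpu-cycles. The helper names `percent_change` and `ipc` are hypothetical and are not part of lkp-tests.

```python
def percent_change(base: float, patched: float) -> float:
    """Relative change of `patched` against `base`, in percent (assumed definition)."""
    return (patched - base) / base * 100.0


def ipc(instructions: float, cycles: float) -> float:
    """Instructions per cycle (assumed to be perf-stat.instructions / perf-stat.cpu-cycles)."""
    return instructions / cycles


# Values copied from the comparison table: left column is 363fa4e078cbdc97,
# right column is bb95d9ab2a9d4afd03b59a603c.
jobs_change = percent_change(69986, 63066)
ipc_base = ipc(5.785e12, 1.362e13)
ipc_patched = ipc(4.837e12, 1.574e13)
ipc_change = percent_change(ipc_base, ipc_patched)

print(f"aim7.jobs-per-min change: {jobs_change:.1f}%")            # -9.9%
print(f"perf-stat.ipc: {ipc_base:.2f} -> {ipc_patched:.2f}")      # 0.42 -> 0.31
print(f"perf-stat.ipc change: {ipc_change:.1f}%")                 # -27.6%
```

Under these assumptions the recomputed figures round to the table's -10% and -28%. Fewer instructions were retired (-16%) while more cycles were spent (+16%), which is consistent with the higher system time and the longer rwsem write-lock wait reported in latency_stats.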
perf-stat.node-store-miss-rate%
[ASCII plot, layout lost: bisect-good samples (*) lie at roughly 42.6 to 42.8, bisect-bad samples (O) at roughly 43.8 to 44.4]
perf-stat.ipc
[ASCII plot, layout lost: bisect-good samples (*) lie at roughly 0.42 to 0.44, bisect-bad samples (O) at roughly 0.27 to 0.31]
turbostat.Avg_MHz
[ASCII plot, layout lost: bisect-good samples (*) lie at roughly 2560 to 2640, bisect-bad samples (O) at roughly 2740 to 2790]
turbostat.%Busy
[ASCII plot, layout lost: bisect-good samples (*) lie at roughly 86.5 to 88, bisect-bad samples (O) at roughly 90.5 to 91]
aim7.time.involuntary_context_switches
[ASCII plot, layout lost: bisect-good samples (*) lie at roughly 175000 to 250000, bisect-bad samples (O) at roughly 375000 to 575000]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong