Greetings,
We noticed a 32% regression in aim7.jobs-per-min due to commit:
commit: ae75f0ca76d8ad07f9e902f0f9e46ec042acf47c ("f2fs: introduce free nid bitmap")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
in testcase: aim7
on test machine: 40 threads Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz with 384G memory
with following parameters:
disk: 4BRD_12G
md: RAID0
fs: f2fs
test: disk_cp
load: 3000
cpufreq_governor: performance
test-description: AIM7 is a traditional UNIX system-level benchmark suite used to
test and measure the performance of a multiuser system.
test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached to this email
bin/lkp run job.yaml
testcase/path_params/tbox_group/run:
aim7/4BRD_12G-RAID0-f2fs-disk_cp-3000-performance/lkp-ivb-ep01
8559922435601e8b  ae75f0ca76d8ad07f9e902f0f9
----------------  --------------------------
         %stddev    change           %stddev
             \          |                \
    117068            -32%       79734        aim7.jobs-per-min
       154             47%         226        aim7.time.elapsed_time
       154             47%         226        aim7.time.elapsed_time.max
     22.64             38%       31.13 ± 8%   aim7.time.user_time
      5387             14%        6147        aim7.time.system_time
    399647             11%      442359        interrupts.CAL:Function_call_interrupts
      1080 ± 10%      -17%         898 ± 5%   iostat.md0.w/s
     21463            -31%       14725        iostat.md0.wkB/s
      7688            -32%        5263        vmstat.io.bo
     47120             -4%       45046        vmstat.system.in
     18949 ± 5%       -25%       14119        vmstat.system.cs
     38.21             -8%       35.11        turbostat.RAMWatt
     87.77            -21%       69.39        turbostat.%Busy
       158            -32%         107        turbostat.PkgWatt
    130.42            -38%       80.32 ± 3%   turbostat.CorWatt
      2642            -41%        1570 ± 3%   turbostat.Avg_MHz
      0.19             28%        0.25 ± 3%   perf-stat.branch-miss-rate%
 1.899e+09             22%   2.317e+09        perf-stat.branch-misses
    752803             19%      897581        perf-stat.page-faults
    752798             19%      897578        perf-stat.minor-faults
   2951975 ± 4%         9%     3215525 ± 4%   perf-stat.context-switches
      0.27              8%        0.29        perf-stat.ipc
 3.115e+09             -4%    2.99e+09        perf-stat.node-store-misses
 4.396e+12             -4%   4.199e+12        perf-stat.instructions
     31.26             -5%       29.61        perf-stat.cache-miss-rate%
 9.801e+11             -5%   9.323e+11        perf-stat.branch-instructions
 4.499e+09             -6%   4.215e+09        perf-stat.node-stores
 1.191e+12             -7%   1.107e+12        perf-stat.dTLB-loads
    822029 ± 5%       -12%      725505 ± 4%   perf-stat.cpu-migrations
 1.634e+13            -12%   1.439e+13        perf-stat.cpu-cycles
    495167 ±121%    -4e+05      124905 ±150%  latency_stats.avg.do_write_page.[f2fs].write_data_page.[f2fs].f2fs_convert_inline_page.[f2fs].f2fs_convert_inline_inode.[f2fs].f2fs_preallocate_blocks.[f2fs].f2fs_file_write_iter.[f2fs].__vfs_write.vfs_write.SyS_write.do_syscall_64.return_from_SYSCALL_64
      1119 ±164%     5e+03        5832 ±113%  latency_stats.avg.pipe_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
     28663 ± 81%    -1e+04       14258 ± 27%  latency_stats.avg.call_rwsem_down_read_failed.build_free_nids.[f2fs].alloc_nid.[f2fs].f2fs_new_inode.[f2fs].f2fs_create.[f2fs].path_openat.do_filp_open.do_sys_open.SyS_creat.entry_SYSCALL_64_fastpath
     21190 ±116%    -2e+04        3463 ± 8%   latency_stats.avg.call_rwsem_down_read_failed.build_free_nids.[f2fs].f2fs_balance_fs_bg.[f2fs].f2fs_balance_fs.[f2fs].f2fs_create.[f2fs].path_openat.do_filp_open.do_sys_open.SyS_creat.entry_SYSCALL_64_fastpath
     37304 ±109%     4e+05      418035 ± 23%  latency_stats.max.call_rwsem_down_read_failed.build_free_nids.[f2fs].f2fs_balance_fs_bg.[f2fs].f2fs_balance_fs.[f2fs].f2fs_create.[f2fs].path_openat.do_filp_open.do_sys_open.SyS_creat.entry_SYSCALL_64_fastpath
         0           2e+05      186976 ± 18%  latency_stats.max.call_rwsem_down_read_failed.build_free_nids.[f2fs].f2fs_balance_fs_bg.[f2fs].f2fs_balance_fs.[f2fs].f2fs_unlink.[f2fs].vfs_unlink.do_unlinkat.SyS_unlink.entry_SYSCALL_64_fastpath
    495167 ±121%    -4e+05      124905 ±150%  latency_stats.max.do_write_page.[f2fs].write_data_page.[f2fs].f2fs_convert_inline_page.[f2fs].f2fs_convert_inline_inode.[f2fs].f2fs_preallocate_blocks.[f2fs].f2fs_file_write_iter.[f2fs].__vfs_write.vfs_write.SyS_write.do_syscall_64.return_from_SYSCALL_64
      3255 ±167%     2e+04       19600 ±137%  latency_stats.max.pipe_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
 7.859e+09 ± 12%     2e+10    3.16e+10        latency_stats.sum.call_rwsem_down_write_failed.do_unlinkat.SyS_unlink.entry_SYSCALL_64_fastpath
    102988 ± 98%     7e+06     7189879 ± 7%   latency_stats.sum.call_rwsem_down_read_failed.build_free_nids.[f2fs].f2fs_balance_fs_bg.[f2fs].f2fs_balance_fs.[f2fs].f2fs_create.[f2fs].path_openat.do_filp_open.do_sys_open.SyS_creat.entry_SYSCALL_64_fastpath
         0           2e+06     2172771 ± 13%  latency_stats.sum.call_rwsem_down_read_failed.build_free_nids.[f2fs].f2fs_balance_fs_bg.[f2fs].f2fs_balance_fs.[f2fs].f2fs_unlink.[f2fs].vfs_unlink.do_unlinkat.SyS_unlink.entry_SYSCALL_64_fastpath
    495167 ±121%    -4e+05      124905 ±150%  latency_stats.sum.do_write_page.[f2fs].write_data_page.[f2fs].f2fs_convert_inline_page.[f2fs].f2fs_convert_inline_inode.[f2fs].f2fs_preallocate_blocks.[f2fs].f2fs_file_write_iter.[f2fs].__vfs_write.vfs_write.SyS_write.do_syscall_64.return_from_SYSCALL_64
      3312 ±167%     2e+04       19776 ±135%  latency_stats.sum.pipe_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
  41381473 ± 71%    -2e+07    19676236 ± 29%  latency_stats.sum.invalidate_blocks.[f2fs].truncate_data_blocks_range.[f2fs].truncate_blocks.[f2fs].f2fs_truncate.[f2fs].f2fs_evict_inode.[f2fs].evict.iput.dentry_unlink_inode.__dentry_kill.dput.__fput.____fput
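
The percentages and ratios in the comparison above can be recomputed from the raw values. The sketch below is not part of the LKP tooling. It assumes the change column is (new - base) / base rounded to a whole percent, that perf-stat.ipc is instructions / cpu-cycles, and that perf-stat.branch-miss-rate% is branch-misses / branch-instructions. These assumptions match the rows checked here, but the helper name pct_change and the variable names are illustrative only.

```python
# Values copied from the comparison table: (base 8559922435601e8b, ae75f0ca76d8).
def pct_change(base: float, new: float) -> int:
    """Relative change from base to new, rounded to a whole percent (assumed)."""
    return round((new - base) / base * 100)

jobs_per_min = (117068, 79734)
elapsed_time = (154, 226)

print(pct_change(*jobs_per_min))   # -32
print(pct_change(*elapsed_time))   # 47

# Throughput should scale inversely with elapsed time if the load is fixed.
jobs_ratio = round(jobs_per_min[0] / jobs_per_min[1], 2)
time_ratio = round(elapsed_time[1] / elapsed_time[0], 2)
print(jobs_ratio, time_ratio)

# Derived perf-stat metrics, recomputed under the assumed definitions.
instructions = (4.396e12, 4.199e12)
cpu_cycles = (1.634e13, 1.439e13)
ipc = [round(i / c, 2) for i, c in zip(instructions, cpu_cycles)]

branch_misses = (1.899e9, 2.317e9)
branch_instructions = (9.801e11, 9.323e11)
miss_rate = [round(m / b * 100, 2) for m, b in zip(branch_misses, branch_instructions)]

print(ipc)        # compare with perf-stat.ipc
print(miss_rate)  # compare with perf-stat.branch-miss-rate%
```

The two ratios printed in the middle agree to two decimal places, which suggests the drop in jobs-per-min is accounted for by the longer elapsed time rather than by a change in the amount of work done.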
[plot: aim7.jobs-per-min per bisect sample, y-axis 0 to 140000. Good samples sit at roughly 120000 to 140000, bad samples at roughly 80000; three bad samples and one good sample read 0.]
[plot: turbostat.Avg_MHz per bisect sample, y-axis 0 to 3000. Good samples sit at roughly 2500 to 3000, bad samples at roughly 1500 to 2000; three bad samples and one good sample read 0.]
[plot: turbostat.%Busy per bisect sample, y-axis 0 to 90. Good samples sit at roughly 90, bad samples at roughly 70; three bad samples and one good sample read 0.]
[plot: turbostat.PkgWatt per bisect sample, y-axis 0 to 180. Good samples sit at roughly 160, bad samples at roughly 100 to 120; three bad samples and one good sample read 0.]
[plot: turbostat.CorWatt per bisect sample, y-axis 0 to 140. Good samples sit at roughly 130, bad samples at roughly 80; three bad samples and one good sample read 0.]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong