Greetings,
FYI, we noticed a 20.1% regression of fio.latency_10us% due to commit:
commit: 51f2c7c0900521da299f5b28f642582ea97dc47a ("mm, vmscan: consider eligible zones in get_scan_count")
https://git.kernel.org/cgit/linux/kernel/git/mhocko/mm.git to_test/linus-tree/oom_hickups
in testcase: fio-basic
on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 128G memory
with the following parameters:
disk: 1SSD
fs: ext4
runtime: 300s
nr_task: 8
rw: write
bs: 4k
ioengine: sync
test_size: 512g
cpufreq_governor: performance
test-description: Fio is a tool that will spawn a number of threads or processes doing a
particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
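For readers without the LKP harness, the parameters above map roughly onto a plain fio invocation. The sketch below only builds and prints such a command line. The mount point `/mnt/test`, the job name, the use of `--time_based` and `--group_reporting`, and the even split of test_size across nr_task jobs are all assumptions made for illustration, not details taken from the attached job file.

```python
# Parameters copied from the report; everything else is an assumption.
params = {
    "runtime": "300s",
    "nr_task": 8,
    "rw": "write",
    "bs": "4k",
    "ioengine": "sync",
    "test_size": "512g",
}


def fio_argv(p, directory="/mnt/test"):
    """Build a fio argv list. `directory` is a hypothetical ext4 mount point."""
    total_gib = int(p["test_size"].rstrip("g"))
    # Assumption: the total test size is split evenly across the jobs.
    per_job_gib = total_gib // p["nr_task"]
    return [
        "fio",
        "--name=lkp-like-write",  # hypothetical job name
        f"--directory={directory}",
        f"--rw={p['rw']}",
        f"--bs={p['bs']}",
        f"--ioengine={p['ioengine']}",
        f"--numjobs={p['nr_task']}",
        f"--runtime={p['runtime']}",
        "--time_based",  # assumption: run for the full runtime
        f"--size={per_job_gib}g",
        "--group_reporting",  # assumption: aggregate the per-job statistics
    ]


print(" ".join(fio_argv(params)))
```

The script does not run fio itself; the printed command can be adjusted and run by hand on a scratch filesystem.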
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
testcase/path_params/tbox_group/run: fio-basic/1SSD-ext4-300s-8-write-4k-sync-512g-performance/lkp-bdw-ep2

c675d2353ec11e10       51f2c7c0900521da299f5b28f6
----------------       --------------------------
     base ±sdev change        new ±sdev  metric
    14.19          20%      17.04 ±  5%  fio.latency_10us%
     0.50 ±  3%    12%       0.56 ±  3%  fio.latency_50us%
      636                     628        fio.write_clat_stddev
    83.22                   81.09        fio.latency_4us%
     1.41 ± 10%   -55%       0.63        fio.latency_2us%
      386           5%        406        fio.time.system_time
      152           5%        159        fio.time.percent_of_cpu_this_job_got
    95250           4%      99321        fio.time.voluntary_context_switches
4.315e+11           7%    4.6e+11 ±  3%  perf-stat.branch-instructions
2.178e+12           6%  2.314e+12        perf-stat.instructions
     8887           6%       9424 ±  3%  perf-stat.instructions-per-iTLB-miss
6.109e+11           6%  6.459e+11        perf-stat.dTLB-loads
2.404e+09           4%  2.502e+09        perf-stat.branch-misses
3.751e+11           4%  3.894e+11        perf-stat.dTLB-stores
7.819e+08           4%  8.111e+08 ±  3%  perf-stat.dTLB-load-misses
1.663e+08               1.638e+08        perf-stat.dTLB-store-misses
     0.04          -5%       0.04        perf-stat.dTLB-store-miss-rate%
        0        1e+04      14354 ±100%  latency_stats.avg.balance_dirty_pages.balance_dirty_pages_ratelimited.generic_perform_write.__generic_file_write_iter.ext4_file_write_iter.__vfs_write.vfs_write.SyS_write.do_syscall_64.return_from_SYSCALL_64
        0        8e+04      82365 ±  5%  latency_stats.max.call_rwsem_down_read_failed.ext4_da_get_block_prep.__block_write_begin_int.__block_write_begin.ext4_da_write_begin.generic_perform_write.__generic_file_write_iter.ext4_file_write_iter.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
        0        1e+04      14354 ±100%  latency_stats.max.balance_dirty_pages.balance_dirty_pages_ratelimited.generic_perform_write.__generic_file_write_iter.ext4_file_write_iter.__vfs_write.vfs_write.SyS_write.do_syscall_64.return_from_SYSCALL_64
        0        1e+06    1353230 ± 11%  latency_stats.sum.call_rwsem_down_read_failed.ext4_da_get_block_prep.__block_write_begin_int.__block_write_begin.ext4_da_write_begin.generic_perform_write.__generic_file_write_iter.ext4_file_write_iter.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
        0        1e+04      14354 ±100%  latency_stats.sum.balance_dirty_pages.balance_dirty_pages_ratelimited.generic_perform_write.__generic_file_write_iter.ext4_file_write_iter.__vfs_write.vfs_write.SyS_write.do_syscall_64.return_from_SYSCALL_64
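The change column in the comparison above can be cross-checked from the two value columns. The following is a minimal sketch, assuming the change is the plain relative difference between the two commits' values; the helper name `pct_change` is made up for illustration. Because the table prints rounded values, results may differ slightly from the harness's own figures. The first row reproduces the 20.1% quoted at the top of this report.

```python
def pct_change(base, new):
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


# (metric, value on c675d2353ec11e10, value on 51f2c7c0900521da299f5b28f6)
rows = [
    ("fio.latency_10us%", 14.19, 17.04),
    ("fio.latency_50us%", 0.50, 0.56),
    ("fio.latency_2us%", 1.41, 0.63),
    ("fio.time.system_time", 386, 406),
    ("fio.time.voluntary_context_switches", 95250, 99321),
]

for metric, base, new in rows:
    print(f"{pct_change(base, new):+7.1f}%  {metric}")
```

Rows whose base value is 0 (the latency_stats entries) have no defined relative change, so they are left out of this check.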
[ASCII plot: perf-stat.dTLB-stores per run. Bisect-good samples (*) lie roughly between 3.74e+11 and 3.78e+11; bisect-bad samples (O) lie roughly between 3.88e+11 and 3.92e+11.]
[ASCII plot: fio.time.voluntary_context_switches per run. Bisect-good samples (*) lie roughly between 94500 and 96000; bisect-bad samples (O) lie roughly between 99000 and 100500.]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong