Hello!
On Tue, Jan 02, 2018 at 02:35:28PM +0800, kernel test robot wrote:
Greetings,
FYI, we noticed a -16.1% regression of fio.read_bw_MBps due to commit:
commit: 2b0f904a5a8781498417d67226fd12c5e56053ae ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
in testcase: fio-basic
on test machine: 56 threads Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz with 256G memory
with the following parameters:
disk: 2pmem
fs: ext4
runtime: 200s
nr_task: 50%
time_based: tb
rw: randread
bs: 2M
ioengine: mmap
test_size: 200G
cpufreq_governor: performance
test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based:
2M/gcc-7/performance/2pmem/ext4/mmap/x86_64-rhel-7.2/50%/debian-x86_64-2016-08-31.cgz/200s/randread/lkp-hsw-ep6/200G/fio-basic/tb
commit:
f6572f9cd2 ("mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request")
2b0f904a5a ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")
f6572f9cd248df2c 2b0f904a5a8781498417d67226
---------------- --------------------------
         %stddev     %change         %stddev
             \          |                \
11451 -16.1% 9605 fio.read_bw_MBps
0.29 ± 5% +0.1 0.40 ± 3% fio.latency_1000us%
19.35 ± 5% -4.7 14.69 ± 3% fio.latency_10ms%
7.92 ± 3% +12.2 20.15 fio.latency_20ms%
0.05 ± 11% +0.0 0.09 ± 8% fio.latency_2ms%
70.22 -8.9 61.36 fio.latency_4ms%
0.29 ± 13% +0.0 0.33 ± 3% fio.latency_500us%
0.45 ± 29% +1.0 1.45 ± 4% fio.latency_50ms%
1.37 +0.1 1.44 fio.latency_750us%
9792 +31.7% 12896 fio.read_clat_90%_us
10560 +33.0% 14048 fio.read_clat_95%_us
15376 ± 10% +46.9% 22592 fio.read_clat_99%_us
4885 +19.2% 5825 fio.read_clat_mean_us
5725 -16.1% 4802 fio.read_iops
4.598e+09 -16.4% 3.845e+09 fio.time.file_system_inputs
453153 -8.4% 415215 fio.time.involuntary_context_switches
5.748e+08 -16.4% 4.806e+08 fio.time.major_page_faults
1822257 +23.7% 2254706 fio.time.maximum_resident_set_size
5089 +1.6% 5172 fio.time.system_time
514.50 -16.3% 430.48 fio.time.user_time
System time increased while user time decreased. There is a clue in the stats below.
24569 ± 2% +9.6% 26917 ± 2%  fio.time.voluntary_context_switches
54443725 -14.9% 46353339 interrupts.CAL:Function_call_interrupts
0.00 ± 79% -0.0 0.00 ± 17% mpstat.cpu.iowait%
4.45 -0.7 3.71 mpstat.cpu.usr%
1467516 +21.3% 1779543 ± 3% meminfo.Active
1276031 +23.7% 1578443 ± 4% meminfo.Active(file)
25789 ± 3% -76.7% 6013 ± 4% meminfo.CmaFree
1.296e+08 -12.6% 1.133e+08 turbostat.IRQ
41.89 -3.4% 40.47 turbostat.RAMWatt
17444 ± 2% -13.5% 15092 ± 3% turbostat.SMI
10896428 -16.4% 9111830 vmstat.io.bi
6010 -6.2% 5637 vmstat.system.cs
317438 -12.1% 278980 vmstat.system.in
1072892 ± 3% +21.5% 1303487 numa-meminfo.node0.Active
978318 +21.6% 1189809 ± 2% numa-meminfo.node0.Active(file)
222968 -25.2% 166818 numa-meminfo.node0.PageTables
47374 ± 2% +10.6% 52402 ± 7% numa-meminfo.node0.SUnreclaim
165213 +31.9% 217870 numa-meminfo.node1.PageTables
222405 +10.4% 245633 ± 2% numa-meminfo.node1.SReclaimable
102992 ± 46% -80.8% 19812 ± 38% numa-meminfo.node1.Shmem
2.475e+08 ± 2% -24.0% 1.881e+08 numa-numastat.node0.local_node
39371795 ± 14% +167.1% 1.052e+08 ± 2% numa-numastat.node0.numa_foreign
2.475e+08 ± 2% -24.0% 1.881e+08 numa-numastat.node0.numa_hit
31890417 ± 17% +40.2% 44705135 ± 8% numa-numastat.node0.numa_miss
31899482 ± 17% +40.2% 44713255 ± 8% numa-numastat.node0.other_node
2.566e+08 ± 2% -44.2% 1.433e+08 numa-numastat.node1.local_node
31890417 ± 17% +40.2% 44705135 ± 8% numa-numastat.node1.numa_foreign
2.566e+08 ± 2% -44.2% 1.433e+08 numa-numastat.node1.numa_hit
39371795 ± 14% +167.1% 1.052e+08 ± 2% numa-numastat.node1.numa_miss
39373660 ± 14% +167.1% 1.052e+08 ± 2% numa-numastat.node1.other_node
6047 ± 39% -66.5% 2028 ± 63% sched_debug.cfs_rq:/.exec_clock.min
461.37 ± 8% +64.9% 760.74 ± 20% sched_debug.cfs_rq:/.load_avg.avg
1105 ± 13% +1389.3% 16467 ± 56% sched_debug.cfs_rq:/.load_avg.max
408.99 ± 3% +495.0% 2433 ± 49% sched_debug.cfs_rq:/.load_avg.stddev
28746 ± 12% -18.7% 23366 ± 14% sched_debug.cfs_rq:/.min_vruntime.min
752426 ± 3% -12.7% 656636 ± 4% sched_debug.cpu.avg_idle.avg
144956 ± 61% -85.4% 21174 ± 26% sched_debug.cpu.avg_idle.min
245684 ± 11% +44.6% 355257 ± 2% sched_debug.cpu.avg_idle.stddev
236035 ± 15% +51.8% 358264 ± 16% sched_debug.cpu.nr_switches.max
42039 ± 22% +34.7% 56616 ± 8% sched_debug.cpu.nr_switches.stddev
3204 ± 24% -48.1% 1663 ± 30% sched_debug.cpu.sched_count.min
2132 ± 25% +38.7% 2957 ± 11% sched_debug.cpu.sched_count.stddev
90.67 ± 32% -71.8% 25.58 ± 26% sched_debug.cpu.sched_goidle.min
6467 ± 15% +22.3% 7912 ± 15% sched_debug.cpu.ttwu_count.max
1513 ± 27% -55.7% 670.92 ± 22% sched_debug.cpu.ttwu_count.min
1025 ± 20% +68.4% 1727 ± 9% sched_debug.cpu.ttwu_count.stddev
1057 ± 16% -62.9% 391.85 ± 31% sched_debug.cpu.ttwu_local.min
244876 +21.6% 297770 ± 2% numa-vmstat.node0.nr_active_file
88.00 ± 5% +19.3% 105.00 ± 5% numa-vmstat.node0.nr_isolated_file
55778 -25.1% 41765 numa-vmstat.node0.nr_page_table_pages
11843 ± 2% +10.6% 13100 ± 7% numa-vmstat.node0.nr_slab_unreclaimable
159.25 ± 42% -74.9% 40.00 ± 52%  numa-vmstat.node0.nr_vmscan_immediate_reclaim
244862 +21.6% 297739 ± 2% numa-vmstat.node0.nr_zone_active_file
19364320 ± 19% +187.2% 55617595 ± 2% numa-vmstat.node0.numa_foreign
268155 ± 3% +49.6% 401089 ± 4% numa-vmstat.node0.workingset_activate
1.229e+08 -19.0% 99590617 numa-vmstat.node0.workingset_refault
6345 ± 3% -76.5% 1489 ± 3% numa-vmstat.node1.nr_free_cma
41335 +32.0% 54552 numa-vmstat.node1.nr_page_table_pages
25770 ± 46% -80.8% 4956 ± 38% numa-vmstat.node1.nr_shmem
55684 +10.4% 61475 ± 2% numa-vmstat.node1.nr_slab_reclaimable
1.618e+08 ± 8% -47.6% 84846798 ± 17% numa-vmstat.node1.numa_hit
1.617e+08 ± 8% -47.6% 84676284 ± 17% numa-vmstat.node1.numa_local
19365342 ± 19% +187.2% 55620100 ± 2% numa-vmstat.node1.numa_miss
19534837 ± 19% +185.6% 55790654 ± 2% numa-vmstat.node1.numa_other
1.296e+08 -21.0% 1.024e+08 numa-vmstat.node1.workingset_refault
1.832e+12 -7.5% 1.694e+12 perf-stat.branch-instructions
0.25 -0.0 0.23 perf-stat.branch-miss-rate%
4.666e+09 -16.0% 3.918e+09 perf-stat.branch-misses
39.88 +1.1 40.98 perf-stat.cache-miss-rate%
2.812e+10 -11.6% 2.485e+10 perf-stat.cache-misses
7.051e+10 -14.0% 6.064e+10 perf-stat.cache-references
1260521 -6.1% 1183071 perf-stat.context-switches
1.87 +9.6% 2.05 perf-stat.cpi
6707 ± 2% -5.2% 6359 perf-stat.cpu-migrations
1.04 ± 11% -0.3 0.77 ± 4% perf-stat.dTLB-load-miss-rate%
2.365e+10 ± 7% -25.9% 1.751e+10 ± 9% perf-stat.dTLB-load-misses
1.05e+12 ± 4% -9.5% 9.497e+11 ± 2% perf-stat.dTLB-stores
28.16 +2.2 30.35 ± 2% perf-stat.iTLB-load-miss-rate%
2.56e+08 -10.4% 2.295e+08 perf-stat.iTLB-loads
8.974e+12 -9.2% 8.151e+12 perf-stat.instructions
89411 -8.8% 81529 perf-stat.instructions-per-iTLB-miss
0.54 -8.8% 0.49 perf-stat.ipc
5.748e+08 -16.4% 4.806e+08 perf-stat.major-faults
52.82 +5.8 58.61 ± 2% perf-stat.node-load-miss-rate%
7.206e+09 ± 2% -18.6% 5.867e+09 ± 3% perf-stat.node-loads
17.96 ± 8% +15.7 33.69 ± 2% perf-stat.node-store-miss-rate%
2.055e+09 ± 8% +65.1% 3.393e+09 ± 4% perf-stat.node-store-misses
9.391e+09 ± 2% -28.9% 6.675e+09 perf-stat.node-stores
5.753e+08 -16.4% 4.811e+08 perf-stat.page-faults
305865 -16.3% 256108 proc-vmstat.allocstall_movable
1923 ± 14% -72.1% 537.00 ± 12% proc-vmstat.allocstall_normal
0.00 +Inf% 1577 ± 67% proc-vmstat.compact_isolated
1005 ± 4% -65.8% 344.00 ± 7% proc-vmstat.kswapd_low_wmark_hit_quickly
320062 +23.2% 394374 ± 4% proc-vmstat.nr_active_file
6411 ± 2% -76.4% 1511 ± 4% proc-vmstat.nr_free_cma
277.00 ± 12% -51.4% 134.75 ± 52% proc-vmstat.nr_vmscan_immediate_reclaim
320049 +23.2% 394353 ± 4% proc-vmstat.nr_zone_active_file
71262212 ± 15% +110.3% 1.499e+08 ± 3% proc-vmstat.numa_foreign
5.042e+08 ± 2% -34.3% 3.314e+08 proc-vmstat.numa_hit
5.041e+08 ± 2% -34.3% 3.314e+08 proc-vmstat.numa_local
71262212 ± 15% +110.3% 1.499e+08 ± 3% proc-vmstat.numa_miss
71273176 ± 15% +110.3% 1.499e+08 ± 3% proc-vmstat.numa_other
1007 ± 4% -65.6% 346.25 ± 7% proc-vmstat.pageoutrun
23070268 -16.0% 19386190 proc-vmstat.pgalloc_dma32
5.525e+08 -16.7% 4.603e+08 proc-vmstat.pgalloc_normal
5.753e+08 -16.4% 4.812e+08 proc-vmstat.pgfault
5.751e+08 -16.3% 4.813e+08 proc-vmstat.pgfree
5.748e+08 -16.4% 4.806e+08 proc-vmstat.pgmajfault
2.299e+09 -16.4% 1.923e+09 proc-vmstat.pgpgin
8.396e+08 -17.8% 6.901e+08 proc-vmstat.pgscan_direct
3.018e+08 ± 2% -13.0% 2.627e+08 proc-vmstat.pgscan_kswapd
4.1e+08 -15.1% 3.48e+08 proc-vmstat.pgsteal_direct
1.542e+08 ± 3% -20.9% 1.22e+08 ± 3% proc-vmstat.pgsteal_kswapd
23514 ± 4% -23.1% 18076 ± 16% proc-vmstat.slabs_scanned
343040 ± 2% +40.3% 481253 ± 2% proc-vmstat.workingset_activate
2.525e+08 -20.1% 2.018e+08 proc-vmstat.workingset_refault
13.64 ± 3% -1.7 11.96 ± 2%  perf-profile.calltrace.cycles-pp.ext4_mpage_readpages.filemap_fault.ext4_filemap_fault.__do_fault.__handle_mm_fault
11.67 ± 3% -1.4 10.29 ± 2%  perf-profile.calltrace.cycles-pp.submit_bio.ext4_mpage_readpages.filemap_fault.ext4_filemap_fault.__do_fault
11.64 ± 3% -1.4 10.25 ± 2%  perf-profile.calltrace.cycles-pp.generic_make_request.submit_bio.ext4_mpage_readpages.filemap_fault.ext4_filemap_fault
11.10 ± 3% -1.3 9.82 ± 2%  perf-profile.calltrace.cycles-pp.pmem_make_request.generic_make_request.submit_bio.ext4_mpage_readpages.filemap_fault
9.21 ± 3% -1.2 8.04 ± 3%  perf-profile.calltrace.cycles-pp.pmem_do_bvec.pmem_make_request.generic_make_request.submit_bio.ext4_mpage_readpages
27.33 ± 4% -1.0 26.35 ± 5%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
27.33 ± 4% -1.0 26.35 ± 5%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
27.33 ± 4% -1.0 26.35 ± 5%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
27.33 ± 4% -1.0 26.35 ± 5%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
26.79 ± 4% -0.8 25.98 ± 5%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
27.98 ± 3% -0.8 27.22 ± 4%  perf-profile.calltrace.cycles-pp.secondary_startup_64
5.36 ± 12% -0.6 4.76 ± 7%  perf-profile.calltrace.cycles-pp.kswapd.kthread.ret_from_fork
5.36 ± 12% -0.6 4.76 ± 7%  perf-profile.calltrace.cycles-pp.shrink_node.kswapd.kthread.ret_from_fork
5.30 ± 12% -0.6 4.71 ± 7%  perf-profile.calltrace.cycles-pp.shrink_inactive_list.shrink_node_memcg.shrink_node.kswapd.kthread
5.35 ± 12% -0.6 4.76 ± 7%  perf-profile.calltrace.cycles-pp.shrink_node_memcg.shrink_node.kswapd.kthread.ret_from_fork
5.43 ± 12% -0.5 4.88 ± 7%  perf-profile.calltrace.cycles-pp.ret_from_fork
5.43 ± 12% -0.5 4.88 ± 7%  perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
11.04 ± 2% -0.2 10.82 ± 2%  perf-profile.calltrace.cycles-pp.shrink_page_list.shrink_inactive_list.shrink_node_memcg.shrink_node.do_try_to_free_pages
62.44 ± 2% +1.9 64.38  perf-profile.calltrace.cycles-pp.page_fault
62.38 ± 2% +2.0 64.33  perf-profile.calltrace.cycles-pp.__do_page_fault.do_page_fault.page_fault
62.38 ± 2% +2.0 64.34  perf-profile.calltrace.cycles-pp.do_page_fault.page_fault
61.52 ± 2% +2.1 63.58  perf-profile.calltrace.cycles-pp.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
61.34 ± 2% +2.1 63.44  perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
30.18 ± 3% +2.3 32.45 ± 2%  perf-profile.calltrace.cycles-pp.shrink_inactive_list.shrink_node_memcg.shrink_node.do_try_to_free_pages.try_to_free_pages
7.98 ± 3% +2.3 10.33 ± 2%  perf-profile.calltrace.cycles-pp.add_to_page_cache_lru.filemap_fault.ext4_filemap_fault.__do_fault.__handle_mm_fault
30.48 ± 3% +2.4 32.83 ± 2%  perf-profile.calltrace.cycles-pp.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask.filemap_fault.ext4_filemap_fault
30.46 ± 3% +2.4 32.81 ± 2%  perf-profile.calltrace.cycles-pp.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask.filemap_fault
30.46 ± 3% +2.4 32.81 ± 2%  perf-profile.calltrace.cycles-pp.shrink_node.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask
30.37 ± 3% +2.4 32.75 ± 2%  perf-profile.calltrace.cycles-pp.shrink_node_memcg.shrink_node.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath
5.58 ± 4% +2.5 8.08 ± 2%  perf-profile.calltrace.cycles-pp.__lru_cache_add.add_to_page_cache_lru.filemap_fault.ext4_filemap_fault.__do_fault
32.88 ± 3% +2.5 35.38 ± 2%  perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.filemap_fault.ext4_filemap_fault.__do_fault.__handle_mm_fault
5.51 ± 4% +2.5 8.02 ± 2%  perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.__lru_cache_add.add_to_page_cache_lru.filemap_fault.ext4_filemap_fault
4.24 ± 4% +2.5 6.76 ± 2%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.pagevec_lru_move_fn.__lru_cache_add.add_to_page_cache_lru.filemap_fault
4.18 ± 4% +2.5 6.70 ± 2%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.pagevec_lru_move_fn.__lru_cache_add.add_to_page_cache_lru
18.64 ± 3% +2.5 21.16 ± 2%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.shrink_inactive_list.shrink_node_memcg.shrink_node
31.65 ± 3% +2.7 34.31 ± 2%  perf-profile.calltrace.cycles-pp.__alloc_pages_slowpath.__alloc_pages_nodemask.filemap_fault.ext4_filemap_fault.__do_fault
17.21 ± 3% +2.7 19.93 ± 2%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.shrink_inactive_list.shrink_node_memcg.shrink_node.do_try_to_free_pages
It looks like there is more lru lock contention. This would be caused by the patch managing the CMA memory through ZONE_MOVABLE: reclaim for normal memory now has to skip the lru pages that sit in the movable zone, so it needs more time to find enough reclaim target pages. That increases the lru lock hold time and then causes the contention.
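For reference, where the file LRU pages actually sit can be checked per zone with a convenience one-liner (a sketch, assuming the kernel exports the per-zone nr_zone_*_file counters in /proc/zoneinfo, which the nr_zone_active_file numbers above suggest it does):

    # show per-zone file LRU occupancy; "Node X, zone ..." headers plus counters
    grep -E 'Node|nr_zone_inactive_file|nr_zone_active_file' /proc/zoneinfo

If most of the file cache lands in the Movable zone that hosts the CMA area, reclaim for normal allocations has to walk past those pages while holding the lru lock.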
Could you give me one more stat, 'pgskip_XXX' from /proc/vmstat, to confirm my theory?
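A minimal way to capture it around a single run might look like this (just a sketch: it reuses the job.yaml invocation from the reproduce steps above, the pgskip.before/after file names are placeholders, and it assumes the kernel exports the per-zone pgskip_* counters in /proc/vmstat, which kernels of this vintage do):

    # snapshot the per-zone skip counters before and after one fio run
    grep pgskip /proc/vmstat > pgskip.before
    bin/lkp run job.yaml
    grep pgskip /proc/vmstat > pgskip.after
    diff pgskip.before pgskip.after

The pgskip_movable delta relative to pgskip_normal should show whether reclaim is really skipping over the movable zone.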
BTW, the other memory-management stats look better than before: less nr_free_cma, fewer pgmajfault, and less reclaim scan/steal.
Thanks.