Greetings,
FYI, we noticed a 31.0% improvement in stress-ng.inode-flags.ops_per_sec due to commit:
commit: be05dd0e68ac9991ee0f3f30dd436e8c7579b5bd ("xfs: Add order IDs to log items in CIL")
https://git.kernel.org/cgit/linux/kernel/git/dgc/linux-xfs.git xfs-cil-scale-2
in testcase: stress-ng
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
with the following parameters:
nr_threads: 10%
disk: 1HDD
testtime: 60s
fs: xfs
class: filesystem
test: inode-flags
cpufreq_governor: performance
ucode: 0x5003006
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
bin/lkp run generated-yaml-file
=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
filesystem/gcc-9/performance/1HDD/xfs/x86_64-rhel-8.3/10%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp7/inode-flags/stress-ng/60s/0x5003006
commit:
7f3b7c463f ("xfs: convert CIL busy extents to per-cpu")
be05dd0e68 ("xfs: Add order IDs to log items in CIL")
7f3b7c463f00c996 be05dd0e68ac9991ee0f3f30dd4
---------------- ---------------------------
%stddev %change %stddev
\ | \
3012308 ± 2% +31.0% 3945286 ± 3% stress-ng.inode-flags.ops
50204 ± 2% +31.0% 65754 ± 3% stress-ng.inode-flags.ops_per_sec
3360614 -45.5% 1830302 ± 26% turbostat.C1
0.40 ± 2% -0.2 0.20 ± 22% turbostat.C1%
25097741 ± 2% -50.5% 12433379 ± 22% cpuidle.C1.time
3365317 -45.5% 1834770 ± 26% cpuidle.C1.usage
2183568 ± 8% +105.2% 4481098 ± 4% cpuidle.POLL.time
421819 ± 10% +232.8% 1403957 ± 9% cpuidle.POLL.usage
2.32 ± 45% -75.9% 0.56 ± 82% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
11.48 ± 55% -84.3% 1.80 ± 77% perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
1.50 ± 36% -63.2% 0.55 ± 82% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
6.97 ± 58% -74.2% 1.80 ± 77% perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
502.50 ± 8% -75.3% 124.00 slabinfo.numa_policy.active_objs
502.50 ± 8% -75.3% 124.00 slabinfo.numa_policy.num_objs
129.00 -100.0% 0.00 slabinfo.xfs_icr.active_objs
129.00 -100.0% 0.00 slabinfo.xfs_icr.num_objs
18568 ± 5% -12.9% 16173 ± 4% softirqs.CPU0.SCHED
19212 ± 9% -16.0% 16135 ± 11% softirqs.CPU18.SCHED
20664 ± 6% -21.5% 16220 ± 12% softirqs.CPU20.SCHED
20316 ± 7% -20.7% 16101 ± 7% softirqs.CPU21.SCHED
18802 ± 6% -20.0% 15048 ± 10% softirqs.CPU23.SCHED
20092 ± 7% -17.4% 16600 ± 7% softirqs.CPU33.SCHED
19321 ± 12% -15.6% 16306 ± 9% softirqs.CPU34.SCHED
18689 ± 8% -14.2% 16029 ± 12% softirqs.CPU35.SCHED
19921 ± 13% -24.5% 15046 ± 13% softirqs.CPU36.SCHED
20074 ± 7% -19.1% 16246 ± 8% softirqs.CPU37.SCHED
20125 ± 8% -23.3% 15435 ± 7% softirqs.CPU40.SCHED
20708 ± 9% -23.0% 15955 ± 10% softirqs.CPU42.SCHED
21025 ± 4% -7.9% 19357 ± 4% softirqs.CPU48.SCHED
1727003 -9.1% 1569991 ± 3% softirqs.SCHED
7.72 ± 3% +17.7% 9.09 ± 4% perf-stat.i.MPKI
0.43 ± 5% +0.1 0.51 ± 3% perf-stat.i.branch-miss-rate%
24265367 ± 3% +23.1% 29860571 ± 3% perf-stat.i.branch-misses
2.653e+08 ± 3% +17.7% 3.122e+08 ± 4% perf-stat.i.cache-references
1.404e+09 ± 2% +23.9% 1.739e+09 ± 2% perf-stat.i.dTLB-stores
62.36 +10.7 73.01 ± 2% perf-stat.i.iTLB-load-miss-rate%
10552635 ± 5% +45.9% 15391618 ± 8% perf-stat.i.iTLB-load-misses
6185500 -13.0% 5378443 ± 4% perf-stat.i.iTLB-loads
3312 ± 5% -29.0% 2350 ± 9% perf-stat.i.instructions-per-iTLB-miss
179.64 +2.6% 184.25 perf-stat.i.metric.M/sec
90.96 -2.0 88.98 perf-stat.i.node-load-miss-rate%
10087148 ± 6% +15.5% 11647587 ± 4% perf-stat.i.node-store-misses
7.87 ± 3% +17.5% 9.25 ± 4% perf-stat.overall.MPKI
0.34 ± 3% +0.1 0.42 ± 3% perf-stat.overall.branch-miss-rate%
63.01 +11.0 74.01 ± 2% perf-stat.overall.iTLB-load-miss-rate%
3203 ± 5% -30.9% 2212 ± 9% perf-stat.overall.instructions-per-iTLB-miss
91.33 -1.9 89.40 perf-stat.overall.node-load-miss-rate%
23868916 ± 3% +23.1% 29373779 ± 3% perf-stat.ps.branch-misses
2.612e+08 ± 3% +17.7% 3.073e+08 ± 4% perf-stat.ps.cache-references
1.381e+09 ± 2% +23.9% 1.711e+09 ± 2% perf-stat.ps.dTLB-stores
10387330 ± 5% +45.9% 15152041 ± 8% perf-stat.ps.iTLB-load-misses
6087865 -13.0% 5294078 ± 4% perf-stat.ps.iTLB-loads
9928691 ± 6% +15.4% 11459578 ± 4% perf-stat.ps.node-store-misses
264556 ± 3% +55.0% 410193 ± 6% interrupts.CAL:Function_call_interrupts
188.00 ± 19% -35.7% 120.83 ± 22% interrupts.CPU25.RES:Rescheduling_interrupts
2577 ± 19% +49.9% 3863 ± 17% interrupts.CPU26.CAL:Function_call_interrupts
2552 ± 16% +54.3% 3938 ± 15% interrupts.CPU27.CAL:Function_call_interrupts
2682 ± 20% +49.1% 3999 ± 18% interrupts.CPU28.CAL:Function_call_interrupts
171.67 ± 17% -45.2% 94.00 ± 13% interrupts.CPU30.RES:Rescheduling_interrupts
2999 ± 13% +53.1% 4593 ± 15% interrupts.CPU32.CAL:Function_call_interrupts
3149 ± 12% +43.0% 4503 ± 8% interrupts.CPU33.CAL:Function_call_interrupts
3185 ± 11% +46.4% 4664 ± 14% interrupts.CPU37.CAL:Function_call_interrupts
194.83 ± 20% -48.3% 100.67 ± 24% interrupts.CPU37.RES:Rescheduling_interrupts
3126 ± 18% +49.1% 4662 ± 12% interrupts.CPU38.CAL:Function_call_interrupts
2576 ± 16% +40.1% 3609 ± 13% interrupts.CPU4.CAL:Function_call_interrupts
180.83 ± 13% -39.6% 109.17 ± 19% interrupts.CPU40.RES:Rescheduling_interrupts
3116 ± 19% +44.5% 4504 ± 26% interrupts.CPU41.CAL:Function_call_interrupts
218.50 ± 15% -45.0% 120.17 ± 34% interrupts.CPU42.RES:Rescheduling_interrupts
3252 ± 8% +55.5% 5057 ± 18% interrupts.CPU44.CAL:Function_call_interrupts
2893 ± 24% +54.3% 4464 ± 14% interrupts.CPU47.CAL:Function_call_interrupts
3766 ± 5% +56.9% 5908 ± 15% interrupts.CPU48.CAL:Function_call_interrupts
3251 ± 8% +47.4% 4791 ± 14% interrupts.CPU49.CAL:Function_call_interrupts
3446 ± 13% +50.1% 5174 ± 16% interrupts.CPU50.CAL:Function_call_interrupts
3177 ± 13% +70.1% 5404 ± 24% interrupts.CPU51.CAL:Function_call_interrupts
3085 ± 16% +48.2% 4574 ± 15% interrupts.CPU52.CAL:Function_call_interrupts
3093 ± 18% +57.2% 4861 ± 13% interrupts.CPU53.CAL:Function_call_interrupts
3051 ± 18% +43.8% 4389 ± 20% interrupts.CPU54.CAL:Function_call_interrupts
2793 ± 21% +71.5% 4791 ± 20% interrupts.CPU59.CAL:Function_call_interrupts
2580 ± 15% +49.3% 3852 ± 14% interrupts.CPU6.CAL:Function_call_interrupts
2075 ± 33% +46.1% 3032 ± 14% interrupts.CPU67.CAL:Function_call_interrupts
5727 ± 12% -34.2% 3770 ± 43% interrupts.CPU67.NMI:Non-maskable_interrupts
5727 ± 12% -34.2% 3770 ± 43% interrupts.CPU67.PMI:Performance_monitoring_interrupts
1909 ± 20% +89.7% 3622 ± 28% interrupts.CPU69.CAL:Function_call_interrupts
1921 ± 35% +73.2% 3328 ± 20% interrupts.CPU70.CAL:Function_call_interrupts
2000 ± 19% +74.9% 3498 ± 22% interrupts.CPU71.CAL:Function_call_interrupts
2612 ± 17% +133.0% 6087 ± 9% interrupts.CPU73.CAL:Function_call_interrupts
2970 ± 15% +88.8% 5607 ± 20% interrupts.CPU74.CAL:Function_call_interrupts
3044 ± 16% +78.8% 5445 ± 22% interrupts.CPU75.CAL:Function_call_interrupts
2885 ± 16% +91.5% 5525 ± 9% interrupts.CPU76.CAL:Function_call_interrupts
2916 ± 21% +82.9% 5334 ± 27% interrupts.CPU77.CAL:Function_call_interrupts
2323 ± 17% +113.5% 4961 ± 17% interrupts.CPU78.CAL:Function_call_interrupts
2482 ± 28% +112.5% 5274 ± 29% interrupts.CPU79.CAL:Function_call_interrupts
2217 ± 21% +113.5% 4733 ± 31% interrupts.CPU80.CAL:Function_call_interrupts
2208 ± 26% +110.9% 4657 ± 19% interrupts.CPU81.CAL:Function_call_interrupts
2669 ± 28% +80.1% 4805 ± 21% interrupts.CPU82.CAL:Function_call_interrupts
2154 ± 31% +134.2% 5044 ± 19% interrupts.CPU84.CAL:Function_call_interrupts
2262 ± 16% +101.5% 4559 ± 18% interrupts.CPU85.CAL:Function_call_interrupts
2184 ± 34% +95.3% 4266 ± 14% interrupts.CPU86.CAL:Function_call_interrupts
2160 ± 16% +103.1% 4386 ± 41% interrupts.CPU87.CAL:Function_call_interrupts
2031 ± 20% +140.9% 4894 ± 16% interrupts.CPU88.CAL:Function_call_interrupts
1833 ± 27% +150.9% 4599 ± 27% interrupts.CPU90.CAL:Function_call_interrupts
2152 ± 18% +109.3% 4503 ± 40% interrupts.CPU91.CAL:Function_call_interrupts
1501 ± 28% +107.2% 3112 ± 28% interrupts.CPU94.CAL:Function_call_interrupts
1862 ± 33% +102.1% 3763 ± 33% interrupts.CPU95.CAL:Function_call_interrupts
11.11 ± 10% -9.3 1.76 ±116% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.xlog_cil_insert_items.xlog_cil_commit.__xfs_trans_commit
11.42 ± 10% -9.2 2.25 ± 90% perf-profile.calltrace.cycles-pp._raw_spin_lock.xlog_cil_insert_items.xlog_cil_commit.__xfs_trans_commit.xfs_fileattr_set
12.70 ± 9% -8.8 3.86 ± 50% perf-profile.calltrace.cycles-pp.xlog_cil_insert_items.xlog_cil_commit.__xfs_trans_commit.xfs_fileattr_set.vfs_fileattr_set
15.75 ± 8% -4.1 11.64 ± 12% perf-profile.calltrace.cycles-pp.xlog_cil_commit.__xfs_trans_commit.xfs_fileattr_set.vfs_fileattr_set.do_vfs_ioctl
15.81 ± 8% -4.1 11.73 ± 12% perf-profile.calltrace.cycles-pp.__xfs_trans_commit.xfs_fileattr_set.vfs_fileattr_set.do_vfs_ioctl.__x64_sys_ioctl
0.38 ± 70% +2.9 3.30 ± 34% perf-profile.calltrace.cycles-pp.xlog_grant_add_space.xfs_log_reserve.xfs_trans_reserve.xfs_trans_alloc.xfs_trans_alloc_ichange
1.22 ± 8% +4.1 5.35 ± 28% perf-profile.calltrace.cycles-pp.xfs_log_ticket_ungrant.xlog_cil_commit.__xfs_trans_commit.xfs_fileattr_set.vfs_fileattr_set
1.05 ± 9% +4.2 5.24 ± 32% perf-profile.calltrace.cycles-pp.xfs_log_reserve.xfs_trans_reserve.xfs_trans_alloc.xfs_trans_alloc_ichange.xfs_fileattr_set
1.06 ± 9% +4.2 5.25 ± 32% perf-profile.calltrace.cycles-pp.xfs_trans_reserve.xfs_trans_alloc.xfs_trans_alloc_ichange.xfs_fileattr_set.vfs_fileattr_set
1.21 ± 8% +4.2 5.45 ± 32% perf-profile.calltrace.cycles-pp.xfs_trans_alloc.xfs_trans_alloc_ichange.xfs_fileattr_set.vfs_fileattr_set.do_vfs_ioctl
1.52 ± 7% +4.3 5.78 ± 30% perf-profile.calltrace.cycles-pp.xfs_trans_alloc_ichange.xfs_fileattr_set.vfs_fileattr_set.do_vfs_ioctl.__x64_sys_ioctl
11.12 ± 10% -9.4 1.76 ±116% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
11.72 ± 10% -9.1 2.59 ± 77% perf-profile.children.cycles-pp._raw_spin_lock
12.70 ± 9% -8.8 3.86 ± 50% perf-profile.children.cycles-pp.xlog_cil_insert_items
15.75 ± 8% -4.1 11.65 ± 12% perf-profile.children.cycles-pp.xlog_cil_commit
15.81 ± 8% -4.1 11.73 ± 12% perf-profile.children.cycles-pp.__xfs_trans_commit
0.07 ± 18% +0.0 0.10 ± 14% perf-profile.children.cycles-pp.memset_erms
0.08 ± 11% +0.0 0.12 ± 15% perf-profile.children.cycles-pp.fput_many
0.12 ± 17% +0.0 0.16 ± 16% perf-profile.children.cycles-pp.osq_unlock
0.09 ± 11% +0.0 0.14 ± 24% perf-profile.children.cycles-pp.__fget_files
0.10 ± 12% +0.1 0.15 ± 22% perf-profile.children.cycles-pp.__fget_light
0.17 ± 12% +0.1 0.24 ± 11% perf-profile.children.cycles-pp.kmem_cache_alloc
0.03 ± 70% +0.1 0.11 ± 13% perf-profile.children.cycles-pp.poll_idle
0.26 ± 10% +0.1 0.34 ± 15% perf-profile.children.cycles-pp.xfs_trans_log_inode
0.23 ± 14% +0.1 0.31 ± 15% perf-profile.children.cycles-pp.down_write
0.26 ± 9% +0.1 0.36 ± 13% perf-profile.children.cycles-pp.xfs_inode_item_format
0.11 ± 12% +0.1 0.21 ± 35% perf-profile.children.cycles-pp.xlog_calc_unit_res
0.22 ± 11% +0.1 0.36 ± 12% perf-profile.children.cycles-pp.security_capable
0.20 ± 11% +0.1 0.34 ± 13% perf-profile.children.cycles-pp.apparmor_capable
0.24 ± 13% +0.1 0.38 ± 13% perf-profile.children.cycles-pp.ns_capable_common
0.34 ± 9% +0.1 0.49 ± 11% perf-profile.children.cycles-pp.up_write
0.20 ± 10% +0.1 0.35 ± 26% perf-profile.children.cycles-pp.xlog_ticket_alloc
0.48 ± 7% +0.2 0.66 ± 10% perf-profile.children.cycles-pp.xfs_iunlock
0.47 ± 7% +0.2 0.68 ± 13% perf-profile.children.cycles-pp.up_read
0.66 ± 11% +0.2 0.88 ± 12% perf-profile.children.cycles-pp.xfs_fileattr_get
0.68 ± 10% +0.2 0.92 ± 10% perf-profile.children.cycles-pp.down_read
0.21 ± 5% +0.7 0.95 ± 31% perf-profile.children.cycles-pp.xlog_grant_push_ail
0.20 ± 6% +0.7 0.95 ± 31% perf-profile.children.cycles-pp.xlog_grant_push_threshold
0.22 ± 10% +0.9 1.15 ± 34% perf-profile.children.cycles-pp.xlog_space_left
0.14 ± 30% +1.1 1.24 ± 35% perf-profile.children.cycles-pp.xfs_log_space_wake
0.54 ± 9% +2.8 3.32 ± 34% perf-profile.children.cycles-pp.xlog_grant_add_space
1.22 ± 8% +4.1 5.36 ± 28% perf-profile.children.cycles-pp.xfs_log_ticket_ungrant
1.06 ± 9% +4.2 5.24 ± 32% perf-profile.children.cycles-pp.xfs_log_reserve
1.06 ± 9% +4.2 5.26 ± 32% perf-profile.children.cycles-pp.xfs_trans_reserve
1.21 ± 8% +4.2 5.45 ± 32% perf-profile.children.cycles-pp.xfs_trans_alloc
1.52 ± 7% +4.3 5.78 ± 30% perf-profile.children.cycles-pp.xfs_trans_alloc_ichange
11.04 ± 10% -9.3 1.76 ±115% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
0.12 ± 15% +0.0 0.15 ± 14% perf-profile.self.cycles-pp.__might_sleep
0.07 ± 14% +0.0 0.10 ± 14% perf-profile.self.cycles-pp.memset_erms
0.08 ± 8% +0.0 0.12 ± 13% perf-profile.self.cycles-pp.fput_many
0.13 ± 10% +0.0 0.17 ± 15% perf-profile.self.cycles-pp.xfs_trans_log_inode
0.12 ± 19% +0.0 0.16 ± 16% perf-profile.self.cycles-pp.osq_unlock
0.08 ± 14% +0.0 0.13 ± 25% perf-profile.self.cycles-pp.__fget_files
0.03 ± 70% +0.1 0.10 ± 12% perf-profile.self.cycles-pp.poll_idle
0.20 ± 13% +0.1 0.27 ± 16% perf-profile.self.cycles-pp.down_write
0.28 ± 16% +0.1 0.37 ± 16% perf-profile.self.cycles-pp.rwsem_down_write_slowpath
0.11 ± 12% +0.1 0.21 ± 36% perf-profile.self.cycles-pp.xlog_calc_unit_res
0.20 ± 11% +0.1 0.33 ± 13% perf-profile.self.cycles-pp.apparmor_capable
0.34 ± 10% +0.2 0.48 ± 11% perf-profile.self.cycles-pp.up_write
0.46 ± 7% +0.2 0.67 ± 13% perf-profile.self.cycles-pp.up_read
0.60 ± 13% +0.2 0.82 ± 12% perf-profile.self.cycles-pp._raw_spin_lock
0.63 ± 9% +0.2 0.85 ± 10% perf-profile.self.cycles-pp.down_read
0.70 ± 6% +0.3 1.03 ± 15% perf-profile.self.cycles-pp.xlog_cil_commit
0.74 ± 7% +0.5 1.24 ± 13% perf-profile.self.cycles-pp.xlog_cil_insert_items
0.22 ± 9% +0.9 1.14 ± 34% perf-profile.self.cycles-pp.xlog_space_left
0.14 ± 30% +1.1 1.23 ± 36% perf-profile.self.cycles-pp.xfs_log_space_wake
0.54 ± 9% +2.8 3.30 ± 34% perf-profile.self.cycles-pp.xlog_grant_add_space
1.07 ± 6% +3.0 4.09 ± 26% perf-profile.self.cycles-pp.xfs_log_ticket_ungrant
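For readers who want to sanity-check the comparison table above, here is a minimal Python sketch. It assumes each row has the layout "<base> [± N%] <change>[%] <new> [± N%] <metric>", which is inferred from the rows in this report rather than taken from the lkp-tests source. It also assumes that a change ending in "%" is a relative difference and that a bare change (as used for metrics that are already percentages) is an absolute difference. The names parse_row and recompute_change are hypothetical helpers, not part of lkp-tests.

```python
import re

# One number as it appears in the table: 3012308, 0.40, 2.653e+08, ...
NUM = r"\d+(?:\.\d+)?(?:e[+-]?\d+)?"

# Assumed row layout (inferred from this report, not from lkp-tests):
#   <base> [± N%] <change>[%] <new> [± N%] <metric>
ROW_RE = re.compile(
    rf"^\s*(?P<base>{NUM})\s*(?:±\s*\d+%)?\s+"
    rf"(?P<change>[+-]{NUM})(?P<pct>%?)\s+"
    rf"(?P<new>{NUM})\s*(?:±\s*\d+%)?\s+"
    rf"(?P<metric>\S+)\s*$"
)


def parse_row(line):
    """Return the fields of one comparison row, or None if the line is not a row."""
    m = ROW_RE.match(line)
    if m is None:
        return None
    return {
        "metric": m["metric"],
        "base": float(m["base"]),
        "new": float(m["new"]),
        "change": float(m["change"]),
        # True when the change column carries a trailing '%'.
        "is_percent": m["pct"] == "%",
    }


def recompute_change(base, new, is_percent):
    """Recompute the change column from the two mean values.

    Assumption: '%' rows are relative differences in percent; rows without
    '%' are plain differences (percentage points).
    """
    if is_percent:
        return (new - base) / base * 100.0
    return new - base


if __name__ == "__main__":
    sample = "3012308 ± 2% +31.0% 3945286 ± 3% stress-ng.inode-flags.ops"
    row = parse_row(sample)
    recomputed = recompute_change(row["base"], row["new"], row["is_percent"])
    print(row["metric"], row["change"], round(recomputed, 1))
```

Recomputed values can differ from the reported column in the last digit, since the table shows rounded means.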
[ASCII plot: stress-ng.inode-flags.ops_per_sec per sample, y-axis from 40000 to 85000. Samples plotted as O lie between roughly 60000 and 85000; samples plotted as + lie between roughly 40000 and 55000.]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
---
0DAY/LKP+ Test Infrastructure
Open Source Technology Center, Intel Corporation
https://lists.01.org/hyperkitty/list/lkp@lists.01.org
Thanks,
Oliver Sang