Greetings,
FYI, we noticed a -5.1% regression of will-it-scale.per_thread_ops due to commit:
commit: 5f02a877638472e83cb5e335f9eec27052b1c7c2 ("fsnotify: annotate directory entry modification events")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 8G memory
with the following parameters:
nr_task: 16
mode: thread
test: unlink1
cpufreq_governor: performance
ucode: 0x20
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel
copies to see if the testcase will scale. It builds both a process-based and a thread-based
version of each test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
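For readers unfamiliar with the workload, here is a minimal Python sketch of the kind of loop the unlink1 test appears to exercise. The call traces in this report (path_openat with shmem_mknod on the open side, do_unlinkat with vfs_unlink on the unlink side) suggest that each task repeatedly creates and unlinks a file in a shared directory. This is not the will-it-scale source: the names worker and run, the temporary directory, and the timing are all assumptions for illustration, and Python's interpreter lock makes the numbers meaningless as a benchmark.

```python
import os
import tempfile
import threading
import time


def worker(directory, index, stop, counts):
    """Create and unlink one file in a loop, counting completed iterations."""
    path = os.path.join(directory, f"task{index}")  # hypothetical file name
    ops = 0
    while True:
        # open(O_CREAT) followed by unlink: both modify the parent directory.
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        os.close(fd)
        os.unlink(path)
        ops += 1
        if stop.is_set():
            break
    counts[index] = ops


def run(nr_task=4, seconds=0.2):
    """Run nr_task threads against one shared directory; return per-thread ops."""
    stop = threading.Event()
    counts = [0] * nr_task
    with tempfile.TemporaryDirectory() as directory:
        threads = [
            threading.Thread(target=worker, args=(directory, i, stop, counts))
            for i in range(nr_task)
        ]
        for t in threads:
            t.start()
        time.sleep(seconds)
        stop.set()
        for t in threads:
            t.join()
    return counts


if __name__ == "__main__":
    per_thread = run(nr_task=4, seconds=0.2)
    print("per-thread ops:", per_thread, "total:", sum(per_thread))
```

Because every thread works in the same directory, all creates and unlinks serialize on that directory's lock, which is consistent with the rwsem and osq_lock entries in the profile.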
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-7/performance/x86_64-rhel-7.6/thread/16/debian-x86_64-2018-04-03.cgz/lkp-ivb-d02/unlink1/will-it-scale/0x20
commit:
v5.0-rc4
5f02a87763 ("fsnotify: annotate directory entry modification events")
v5.0-rc4 5f02a877638472e83cb5e335f9e
---------------- ---------------------------
%stddev %change %stddev
\ | \
19323 -5.1% 18346 will-it-scale.per_thread_ops
191165 ± 2% +4.1% 198910 will-it-scale.time.involuntary_context_switches
762321 ± 15% +61.3% 1229571 ± 7% will-it-scale.time.voluntary_context_switches
309177 -5.1% 293549 will-it-scale.workload
432464 ± 27% +102.6% 876020 ± 11% cpuidle.POLL.usage
283.25 ± 28% +66.6% 472.00 ± 21% slabinfo.skbuff_head_cache.active_objs
352.25 ± 21% +41.9% 500.00 ± 14% slabinfo.skbuff_head_cache.num_objs
1303292 -15.7% 1099147 vmstat.memory.cache
10069 ± 7% +31.0% 13187 ± 4% vmstat.system.cs
1261118 -16.1% 1057781 meminfo.Cached
1617123 -12.6% 1412934 meminfo.Memused
1244897 -16.4% 1041195 meminfo.Unevictable
5403 -12.5% 4730 meminfo.max_used_kB
227.62 ± 15% -22.8% 175.76 ± 11% sched_debug.cfs_rq:/.util_est_enqueued.avg
488355 ± 5% +23.7% 603940 ± 2% sched_debug.cpu.nr_switches.avg
510161 ± 5% +23.8% 631404 ± 3% sched_debug.cpu.nr_switches.max
468309 ± 5% +23.6% 578813 ± 3% sched_debug.cpu.nr_switches.min
157186 +3.2% 162275 proc-vmstat.nr_dirty_background_threshold
314758 +3.2% 324948 proc-vmstat.nr_dirty_threshold
315280 -16.1% 264445 proc-vmstat.nr_file_pages
1607301 +3.2% 1658312 proc-vmstat.nr_free_pages
10531 -1.8% 10338 proc-vmstat.nr_slab_reclaimable
311224 -16.4% 260298 proc-vmstat.nr_unevictable
311224 -16.4% 260298 proc-vmstat.nr_zone_unevictable
8845997 -6.8% 8245187 proc-vmstat.numa_hit
8845997 -6.8% 8245187 proc-vmstat.numa_local
16702645 -6.5% 15612083 proc-vmstat.pgalloc_normal
16691594 -6.5% 15601846 proc-vmstat.pgfree
1993 ± 13% -20.9% 1576 ± 10% interrupts.24:PCI-MSI.1572864-edge.eth0
28179 ± 5% +13.3% 31936 ± 4% interrupts.CPU0.RES:Rescheduling_interrupts
4167 ± 55% -99.7% 14.00 ±145% interrupts.CPU1.NMI:Non-maskable_interrupts
4167 ± 55% -99.7% 14.00 ±145% interrupts.CPU1.PMI:Performance_monitoring_interrupts
28494 ± 5% +11.4% 31739 ± 5% interrupts.CPU1.RES:Rescheduling_interrupts
1993 ± 13% -20.9% 1576 ± 10% interrupts.CPU2.24:PCI-MSI.1572864-edge.eth0
28158 ± 6% +13.9% 32060 ± 4% interrupts.CPU2.RES:Rescheduling_interrupts
1947 ± 2% +10.2% 2145 ± 5% interrupts.CPU3.CAL:Function_call_interrupts
2677 ±100% +205.9% 8191 ± 32% interrupts.CPU3.NMI:Non-maskable_interrupts
2677 ±100% +205.9% 8191 ± 32% interrupts.CPU3.PMI:Performance_monitoring_interrupts
28455 ± 5% +14.3% 32526 ± 3% interrupts.CPU3.RES:Rescheduling_interrupts
113288 ± 5% +13.2% 128262 ± 4% interrupts.RES:Rescheduling_interrupts
1.321e+09 -3.7% 1.272e+09 perf-stat.i.branch-instructions
15326992 -5.4% 14505572 perf-stat.i.branch-misses
4546846 ± 3% -6.5% 4251911 perf-stat.i.cache-misses
10143 ± 7% +31.2% 13307 ± 4% perf-stat.i.context-switches
1.62 +3.1% 1.67 perf-stat.i.cpi
308.48 +4.4% 322.20 ± 2% perf-stat.i.cpu-migrations
2321 ± 3% +5.8% 2456 perf-stat.i.cycles-between-cache-misses
1.943e+09 -3.8% 1.869e+09 perf-stat.i.dTLB-loads
1.084e+09 -3.6% 1.045e+09 perf-stat.i.dTLB-stores
933404 +7.0% 998689 perf-stat.i.iTLB-load-misses
38106 ± 4% -7.0% 35421 ± 2% perf-stat.i.iTLB-loads
6.52e+09 -3.9% 6.263e+09 perf-stat.i.instructions
7049 ± 2% -10.5% 6311 perf-stat.i.instructions-per-iTLB-miss
0.62 -3.1% 0.60 perf-stat.i.ipc
5.15 ± 2% +4.1% 5.36 perf-stat.overall.MPKI
1.61 +3.1% 1.66 perf-stat.overall.cpi
2313 ± 3% +5.8% 2448 perf-stat.overall.cycles-between-cache-misses
6986 ± 2% -10.2% 6272 perf-stat.overall.instructions-per-iTLB-miss
0.62 -3.0% 0.60 perf-stat.overall.ipc
1.318e+09 -3.8% 1.268e+09 perf-stat.ps.branch-instructions
15293474 -5.5% 14458274 perf-stat.ps.branch-misses
4536767 ± 3% -6.6% 4237911 perf-stat.ps.cache-misses
10119 ± 7% +31.1% 13264 ± 4% perf-stat.ps.context-switches
307.80 +4.3% 321.14 ± 2% perf-stat.ps.cpu-migrations
1.939e+09 -3.9% 1.863e+09 perf-stat.ps.dTLB-loads
1.082e+09 -3.7% 1.042e+09 perf-stat.ps.dTLB-stores
931359 +6.9% 995399 perf-stat.ps.iTLB-load-misses
38023 ± 4% -7.1% 35306 ± 2% perf-stat.ps.iTLB-loads
6.505e+09 -4.0% 6.242e+09 perf-stat.ps.instructions
1.973e+12 -4.0% 1.894e+12 perf-stat.total.instructions
2.94 ± 11% -0.5 2.46 ± 8% perf-profile.calltrace.cycles-pp.osq_lock.rwsem_down_write_failed.call_rwsem_down_write_failed.down_write.do_unlinkat
14.02 -0.4 13.65 perf-profile.calltrace.cycles-pp.call_rwsem_down_write_failed.down_write.path_openat.do_filp_open.do_sys_open
13.98 -0.4 13.61 perf-profile.calltrace.cycles-pp.rwsem_down_write_failed.call_rwsem_down_write_failed.down_write.path_openat.do_filp_open
14.29 -0.3 13.95 perf-profile.calltrace.cycles-pp.down_write.path_openat.do_filp_open.do_sys_open.do_syscall_64
2.77 ± 5% -0.3 2.50 ± 4% perf-profile.calltrace.cycles-pp.dput.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
1.50 ± 3% -0.3 1.24 ± 6% perf-profile.calltrace.cycles-pp.dentry_kill.dput.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
8.64 -0.2 8.39 perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_down_write_failed.call_rwsem_down_write_failed.down_write.path_openat
0.62 ± 2% +0.1 0.71 ± 5% perf-profile.calltrace.cycles-pp.__alloc_fd.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe.__GI___libc_open
0.67 ± 7% +0.1 0.77 ± 7% perf-profile.calltrace.cycles-pp.selinux_inode_init_security.security_inode_init_security.shmem_mknod.path_openat.do_filp_open
1.21 ± 3% +0.1 1.31 ± 4% perf-profile.calltrace.cycles-pp.call_rwsem_wake.up_write.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.77 ± 10% +0.1 0.88 ± 9% perf-profile.calltrace.cycles-pp.selinux_inode_permission.security_inode_permission.link_path_walk.path_parentat.filename_parentat
0.88 ± 3% +0.2 1.05 ± 5% perf-profile.calltrace.cycles-pp.security_inode_init_security.shmem_mknod.path_openat.do_filp_open.do_sys_open
3.61 +0.2 3.81 ± 3% perf-profile.calltrace.cycles-pp.path_parentat.filename_parentat.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.86 ± 3% +0.6 5.44 ± 4% perf-profile.calltrace.cycles-pp.vfs_unlink.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
1.19 ± 3% +0.9 2.09 ± 3% perf-profile.calltrace.cycles-pp.d_delete.vfs_unlink.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
7.54 ± 5% -0.6 6.93 ± 5% perf-profile.children.cycles-pp.osq_lock
1.51 ± 2% -0.3 1.25 ± 5% perf-profile.children.cycles-pp.dentry_kill
0.95 ± 11% -0.1 0.80 ± 7% perf-profile.children.cycles-pp.__follow_mount_rcu
0.71 ± 4% -0.1 0.60 ± 8% perf-profile.children.cycles-pp.iput
0.20 ± 7% -0.1 0.14 ± 6% perf-profile.children.cycles-pp.prandom_u32
0.16 ± 8% -0.0 0.11 ± 15% perf-profile.children.cycles-pp.shmem_create
0.17 ± 29% -0.0 0.13 ± 6% perf-profile.children.cycles-pp.irq_work_run_list
0.12 ± 9% -0.0 0.07 ± 14% perf-profile.children.cycles-pp.selinux_d_instantiate
0.12 ± 12% -0.0 0.08 ± 19% perf-profile.children.cycles-pp.mpol_shared_policy_init
0.15 ± 12% -0.0 0.12 ± 9% perf-profile.children.cycles-pp.irq_work_interrupt
0.15 ± 12% -0.0 0.12 ± 9% perf-profile.children.cycles-pp.smp_irq_work_interrupt
0.15 ± 13% -0.0 0.12 ± 9% perf-profile.children.cycles-pp.irq_work_run
0.15 ± 13% -0.0 0.12 ± 9% perf-profile.children.cycles-pp.printk
0.15 ± 13% -0.0 0.12 ± 9% perf-profile.children.cycles-pp.vprintk_emit
0.09 ± 27% -0.0 0.06 ± 28% perf-profile.children.cycles-pp.module_put
0.08 ± 34% +0.0 0.12 ± 14% perf-profile.children.cycles-pp.process_measurement
0.01 ±173% +0.1 0.06 ± 13% perf-profile.children.cycles-pp.dequeue_entity
0.07 ± 63% +0.1 0.12 ± 7% perf-profile.children.cycles-pp.locks_remove_file
0.15 ± 15% +0.1 0.21 ± 12% perf-profile.children.cycles-pp.map_id_up
0.00 +0.1 0.07 ± 17% perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
0.18 ± 14% +0.1 0.25 ± 11% perf-profile.children.cycles-pp.___d_drop
0.00 +0.1 0.07 ± 14% perf-profile.children.cycles-pp.unlink@plt
0.20 ± 13% +0.1 0.29 ± 10% perf-profile.children.cycles-pp.__d_drop
0.16 ± 13% +0.1 0.26 ± 8% perf-profile.children.cycles-pp.memcpy_erms
0.68 ± 6% +0.1 0.78 ± 6% perf-profile.children.cycles-pp.selinux_inode_init_security
0.63 ± 3% +0.1 0.73 ± 4% perf-profile.children.cycles-pp.__alloc_fd
0.07 ± 58% +0.1 0.17 ± 22% perf-profile.children.cycles-pp.expand_files
0.90 ± 2% +0.2 1.07 ± 6% perf-profile.children.cycles-pp.security_inode_init_security
3.64 +0.2 3.82 ± 3% perf-profile.children.cycles-pp.path_parentat
0.00 +0.3 0.28 ± 10% perf-profile.children.cycles-pp.lockref_get_not_zero
0.00 +0.3 0.28 ± 7% perf-profile.children.cycles-pp.take_dentry_name_snapshot
0.00 +0.3 0.31 ± 12% perf-profile.children.cycles-pp.dget_parent
4.89 ± 3% +0.6 5.46 ± 4% perf-profile.children.cycles-pp.vfs_unlink
1.19 ± 3% +0.9 2.12 ± 3% perf-profile.children.cycles-pp.d_delete
7.36 ± 5% -0.6 6.77 ± 5% perf-profile.self.cycles-pp.osq_lock
0.54 ± 13% -0.1 0.42 ± 11% perf-profile.self.cycles-pp.__follow_mount_rcu
0.44 ± 7% -0.1 0.36 ± 8% perf-profile.self.cycles-pp.__fput
0.12 ± 27% -0.1 0.04 ±107% perf-profile.self.cycles-pp.security_inode_free
0.36 ± 8% -0.1 0.28 ± 8% perf-profile.self.cycles-pp.may_link
0.90 ± 3% -0.1 0.82 ± 6% perf-profile.self.cycles-pp.link_path_walk
0.52 ± 7% -0.1 0.45 ± 3% perf-profile.self.cycles-pp._cond_resched
0.29 ± 10% -0.1 0.22 ± 17% perf-profile.self.cycles-pp.iput
0.17 ± 6% -0.1 0.11 ± 19% perf-profile.self.cycles-pp.shmem_unlink
0.15 ± 7% -0.1 0.09 ± 7% perf-profile.self.cycles-pp.shmem_create
0.16 ± 15% -0.1 0.10 ± 34% perf-profile.self.cycles-pp.dentry_unlink_inode
0.15 ± 13% -0.0 0.11 ± 28% perf-profile.self.cycles-pp.__x64_sys_unlink
0.31 ± 3% -0.0 0.27 ± 10% perf-profile.self.cycles-pp.lockref_put_or_lock
0.09 ± 8% -0.0 0.06 ± 59% perf-profile.self.cycles-pp.security_file_open
0.11 ± 10% -0.0 0.07 ± 11% perf-profile.self.cycles-pp.mpol_shared_policy_init
0.09 ± 14% -0.0 0.06 ± 16% perf-profile.self.cycles-pp.selinux_d_instantiate
0.05 ± 8% +0.0 0.08 ± 8% perf-profile.self.cycles-pp.security_task_getsecid
0.17 ± 5% +0.0 0.20 ± 11% perf-profile.self.cycles-pp.shmem_free_inode
0.09 ± 23% +0.0 0.13 ± 6% perf-profile.self.cycles-pp.always_delete_dentry
0.25 ± 5% +0.0 0.29 ± 5% perf-profile.self.cycles-pp.__alloc_fd
0.15 ± 13% +0.0 0.20 ± 8% perf-profile.self.cycles-pp.simple_lookup
0.07 ± 63% +0.0 0.11 ± 4% perf-profile.self.cycles-pp.locks_remove_file
0.04 ± 58% +0.1 0.09 ± 20% perf-profile.self.cycles-pp.shmem_truncate_range
0.11 ± 34% +0.1 0.16 ± 17% perf-profile.self.cycles-pp.d_delete
0.06 ± 61% +0.1 0.12 ± 21% perf-profile.self.cycles-pp.get_cached_acl
0.07 ± 26% +0.1 0.13 ± 17% perf-profile.self.cycles-pp.security_transition_sid
0.00 +0.1 0.07 ± 17% perf-profile.self.cycles-pp.unlink@plt
0.17 ± 14% +0.1 0.24 ± 10% perf-profile.self.cycles-pp.___d_drop
0.28 ± 10% +0.1 0.35 ± 12% perf-profile.self.cycles-pp.do_unlinkat
0.20 ± 10% +0.1 0.28 ± 6% perf-profile.self.cycles-pp.security_inode_init_security
0.07 ± 59% +0.1 0.14 ± 23% perf-profile.self.cycles-pp.expand_files
0.19 ± 6% +0.1 0.28 ± 17% perf-profile.self.cycles-pp.lookup_fast
0.52 ± 10% +0.1 0.63 ± 5% perf-profile.self.cycles-pp.dput
0.14 ± 13% +0.1 0.25 ± 8% perf-profile.self.cycles-pp.memcpy_erms
0.38 ± 9% +0.1 0.49 ± 10% perf-profile.self.cycles-pp.__virt_addr_valid
0.00 +0.3 0.26 ± 11% perf-profile.self.cycles-pp.lockref_get_not_zero
1.29 ± 2% +0.3 1.56 ± 3% perf-profile.self.cycles-pp.selinux_inode_permission
[Plot omitted: will-it-scale.per_thread_ops per bisect sample, y-axis from 18000 to 19600.
Bisect-good samples lie at roughly 18900 to 19500; bisect-bad samples lie at roughly 18100 to 18700.]
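The %change column in the comparison table can be sanity-checked from the two means on each row. The sketch below assumes the usual relative-change formula, (new - base) / base, and assumes that the perf-profile rows report an absolute difference in percentage points instead; both assumptions match the figures in this report but are not taken from the lkp-tests source. The helper names pct_change and abs_delta are hypothetical.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from the base mean to the new mean, in percent."""
    return (new - base) / base * 100.0


def abs_delta(base: float, new: float) -> float:
    """Absolute difference, as the perf-profile rows appear to use."""
    return new - base


# (metric, mean on v5.0-rc4, mean on 5f02a87763, reported %change),
# copied from the table in this report.
ROWS = [
    ("will-it-scale.per_thread_ops", 19323, 18346, -5.1),
    ("will-it-scale.time.voluntary_context_switches", 762321, 1229571, 61.3),
    ("vmstat.system.cs", 10069, 13187, 31.0),
]

for name, base, new, reported in ROWS:
    computed = pct_change(base, new)
    # The table shows rounded means, so allow half a unit in the last digit.
    status = "ok" if abs(computed - reported) <= 0.05 else "mismatch"
    print(f"{name}: computed {computed:+.1f}%, reported {reported:+.1f}% ({status})")

# perf-profile rows: d_delete under vfs_unlink went from 1.19 to 2.09.
print(f"d_delete delta: {abs_delta(1.19, 2.09):+.1f}")
```

Because the table lists rounded means, a recomputed value can differ from the reported one in the last digit for some rows.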
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen