Greetings,

FYI, we noticed a 3.8% improvement in will-it-scale.per_thread_ops due to commit:
commit: ec6aba3d2be1ed75b3f4c894bb64a36d40db1f55 ("kprobes: Remove kprobe::fault_handler")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git perf/core
in testcase: will-it-scale
on test machine: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory
with following parameters:
nr_task: 100%
mode: thread
test: getppid1
cpufreq_governor: performance
ucode: 0x5003006
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel
copies to see whether the testcase will scale. It builds both a process-based and a
thread-based variant of each test in order to expose any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
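
For context, the getppid1 testcase is essentially a tight getppid() loop whose
per-task iteration counter the will-it-scale harness samples to compute
per_thread_ops. The sketch below is not the upstream source (the real testcase
lives in the test-url repo and plugs into the harness's testcase() interface,
which runs 1..n parallel copies); it is a minimal standalone approximation that
times a fixed number of getppid() calls from a single thread:

/*
 * Rough single-thread approximation of the getppid1 testcase: hammer the
 * getppid() syscall and report calls/sec.  The real will-it-scale harness
 * instead runs 1..n parallel copies and samples a shared iteration counter.
 */
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
	const unsigned long long loops = 10ULL * 1000 * 1000;
	struct timespec start, end;
	double secs;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (unsigned long long i = 0; i < loops; i++)
		getppid();		/* the measured syscall */
	clock_gettime(CLOCK_MONOTONIC, &end);

	secs = (end.tv_sec - start.tv_sec) +
	       (end.tv_nsec - start.tv_nsec) / 1e9;
	printf("%.0f getppid() calls/sec (single thread)\n", loops / secs);
	return 0;
}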
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
bin/lkp run generated-yaml-file
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/thread/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp9/getppid1/will-it-scale/0x5003006
commit:
9ce4d216fe ("uprobes: Update uprobe_write_opcode() kernel-doc comment")
ec6aba3d2b ("kprobes: Remove kprobe::fault_handler")
9ce4d216fe8b581e ec6aba3d2be1ed75b3f4c894bb6
---------------- ---------------------------
%stddev %change %stddev
\ | \
7.36e+08 +3.8% 7.641e+08 will-it-scale.88.threads
8363471 +3.8% 8682920 will-it-scale.per_thread_ops
7.36e+08 +3.8% 7.641e+08 will-it-scale.workload
29916 ± 11% -13.2% 25973 ± 9% softirqs.CPU3.RCU
323.00 ± 2% +12.3% 362.83 ± 12% interrupts.CPU14.RES:Rescheduling_interrupts
4510 ± 76% -59.9% 1806 ± 8% interrupts.CPU81.CAL:Function_call_interrupts
3.828e+10 +3.8% 3.973e+10 perf-stat.i.branch-instructions
0.49 -0.2 0.25 ± 2% perf-stat.i.branch-miss-rate%
1.859e+08 -46.3% 99837696 ± 2% perf-stat.i.branch-misses
1.36 -3.7% 1.31 perf-stat.i.cpi
36366 ± 8% -13.2% 31556 ± 7% perf-stat.i.dTLB-load-misses
6.104e+10 +3.8% 6.335e+10 perf-stat.i.dTLB-loads
0.00 -0.0 0.00 ± 5% perf-stat.i.dTLB-store-miss-rate%
4.263e+10 +3.8% 4.425e+10 perf-stat.i.dTLB-stores
1.924e+08 ± 2% -36.2% 1.227e+08 ± 11% perf-stat.i.iTLB-load-misses
29467 ± 18% +3793.7% 1147349 ±214% perf-stat.i.iTLB-loads
1.782e+11 +3.8% 1.85e+11 perf-stat.i.instructions
928.95 ± 2% +68.2% 1562 ± 10% perf-stat.i.instructions-per-iTLB-miss
0.74 +3.9% 0.76 perf-stat.i.ipc
1613 +3.8% 1674 perf-stat.i.metric.M/sec
0.49 -0.2 0.25 ± 2% perf-stat.overall.branch-miss-rate%
1.36 -3.7% 1.31 perf-stat.overall.cpi
0.00 ± 13% -0.0 0.00 ± 5% perf-stat.overall.dTLB-load-miss-rate%
0.00 -0.0 0.00 ± 5% perf-stat.overall.dTLB-store-miss-rate%
926.74 ± 2% +64.7% 1526 ± 10% perf-stat.overall.instructions-per-iTLB-miss
0.74 +3.9% 0.76 perf-stat.overall.ipc
3.815e+10 +3.8% 3.96e+10 perf-stat.ps.branch-instructions
1.853e+08 -46.3% 99536406 ± 2% perf-stat.ps.branch-misses
39607 ± 13% -18.1% 32424 ± 5% perf-stat.ps.dTLB-load-misses
6.084e+10 +3.8% 6.314e+10 perf-stat.ps.dTLB-loads
4.249e+10 +3.8% 4.41e+10 perf-stat.ps.dTLB-stores
1.917e+08 ± 2% -36.2% 1.223e+08 ± 11% perf-stat.ps.iTLB-load-misses
29376 ± 18% +3795.4% 1144323 ±214% perf-stat.ps.iTLB-loads
1.776e+11 +3.8% 1.843e+11 perf-stat.ps.instructions
5.363e+13 +3.9% 5.573e+13 perf-stat.total.instructions
32.44 -2.9 29.56 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
40.06 -2.3 37.73 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.getppid
12.76 -1.9 10.84 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
9.08 -1.3 7.82 perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
13.43 -1.2 12.24 perf-profile.calltrace.cycles-pp.__x64_sys_getppid.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
2.23 ± 2% -0.7 1.52 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
9.04 -0.2 8.84 perf-profile.calltrace.cycles-pp.__task_pid_nr_ns.__x64_sys_getppid.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
3.02 +0.2 3.20 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.getppid
2.98 +0.3 3.30 perf-profile.calltrace.cycles-pp.syscall_enter_from_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
8.84 +0.4 9.28 ± 5% perf-profile.calltrace.cycles-pp.testcase
7.14 +0.5 7.62 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.getppid
43.06 +1.5 44.54 perf-profile.calltrace.cycles-pp.__entry_text_start.getppid
33.32 -2.8 30.49 perf-profile.children.cycles-pp.do_syscall_64
40.77 -2.3 38.48 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
13.89 -1.9 11.95 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
14.01 -1.3 12.75 perf-profile.children.cycles-pp.__x64_sys_getppid
9.41 -1.2 8.20 perf-profile.children.cycles-pp.exit_to_user_mode_prepare
2.44 -0.6 1.84 perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
9.73 -0.2 9.52 perf-profile.children.cycles-pp.__task_pid_nr_ns
1.18 -0.1 1.07 perf-profile.children.cycles-pp.rcu_read_unlock_strict
0.85 ± 2% -0.0 0.81 ± 2% perf-profile.children.cycles-pp.rcu_nocb_flush_deferred_wakeup
3.22 +0.2 3.43 perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
3.05 +0.3 3.38 perf-profile.children.cycles-pp.syscall_enter_from_user_mode
22.54 +0.9 23.49 perf-profile.children.cycles-pp.syscall_return_via_sysret
27.86 +1.0 28.86 perf-profile.children.cycles-pp.__entry_text_start
8.21 -1.1 7.11 perf-profile.self.cycles-pp.exit_to_user_mode_prepare
3.92 -1.0 2.91 perf-profile.self.cycles-pp.__x64_sys_getppid
2.04 ± 2% -0.6 1.47 perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
2.46 -0.2 2.29 perf-profile.self.cycles-pp.syscall_exit_to_user_mode
0.81 ± 2% -0.1 0.72 ± 3% perf-profile.self.cycles-pp.rcu_nocb_flush_deferred_wakeup
0.81 -0.1 0.74 perf-profile.self.cycles-pp.rcu_read_unlock_strict
2.72 +0.1 2.78 perf-profile.self.cycles-pp.do_syscall_64
3.19 +0.2 3.39 perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
2.38 +0.2 2.62 perf-profile.self.cycles-pp.syscall_enter_from_user_mode
12.62 +0.5 13.13 perf-profile.self.cycles-pp.__entry_text_start
7.57 +0.5 8.11 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
19.30 +0.7 20.04 perf-profile.self.cycles-pp.getppid
22.23 +0.9 23.17 perf-profile.self.cycles-pp.syscall_return_via_sysret
will-it-scale.per_thread_ops

  [trend chart: bisect-good ([*]) samples around 8.35e+06-8.55e+06 ops/thread,
   bisect-bad ([O]) samples around 8.65e+06-8.7e+06 ops/thread]
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org Intel Corporation
Thanks,
Oliver Sang