Greeting,
There is no change in the primary KPI in this test. The data below was collected by
multiple monitors running in the background and is provided for your information only.
commit d4550809586dc011ef8c5a9179e6a4325e7b98e1 ("sched: look for idle cpu at wake up")
https://git.linaro.org/people/vincent.guittot/kernel.git sched/pelt
in testcase: hackbench
on test machine: 8 threads Ivy Bridge with 16G memory
with following parameters:
nr_threads: 50%
mode: process
ipc: socket
cpufreq_governor: performance
test-description: Hackbench is both a benchmark and a stress test for the Linux kernel
scheduler.
test-url:
https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/sc...
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/ipc/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
gcc-6/performance/socket/x86_64-rhel-7.2/process/50%/debian-x86_64-2016-08-31.cgz/lkp-ivb-d01/hackbench
commit:
fba9fb5f5f ("sched: use load_avg for selecting idlest group")
d455080958 ("sched: look for idle cpu at wake up")
fba9fb5f5f79866e    d4550809586dc011ef8c5a9179
----------------    --------------------------
value ± %stddev     %change  value ± %stddev     metric
34162896 ± 1% -8.0% 31437510 ± 1% hackbench.time.involuntary_context_switches
733.00 ± 0% +1.3% 742.25 ± 0% hackbench.time.percent_of_cpu_this_job_got
1.547e+08 ± 1% -9.7% 1.397e+08 ± 0% hackbench.time.voluntary_context_switches
20736 ± 20% +254.5% 73507 ± 25% softirqs.NET_RX
432890 ± 15% +89.1% 818743 ± 18% softirqs.RCU
528670 ± 0% +18.9% 628410 ± 1% softirqs.SCHED
75.25 ± 1% -11.3% 66.75 ± 0% vmstat.procs.r
363821 ± 0% -6.3% 341078 ± 0% vmstat.system.cs
56805 ± 0% -7.4% 52594 ± 0% vmstat.system.in
6352414 ± 2% -17.8% 5221427 ± 1% proc-vmstat.numa_hit
6352382 ± 2% -17.8% 5221393 ± 1% proc-vmstat.numa_local
18746436 ± 1% -18.5% 15281493 ± 1% proc-vmstat.pgalloc_normal
18734627 ± 1% -18.5% 15269275 ± 1% proc-vmstat.pgfree
92.92 ± 0% +1.3% 94.16 ± 0% turbostat.%Busy
3431 ± 0% +1.3% 3476 ± 0% turbostat.Avg_MHz
6.95 ± 0% -17.6% 5.72 ± 0% turbostat.CPU%c1
0.03 ± 0% -50.0% 0.02 ± 33% turbostat.CPU%c3
49520562 ± 3% -47.6% 25951758 ± 1% cpuidle.C1E-IVB.time
874280 ± 1% -17.1% 725099 ± 1% cpuidle.C1E-IVB.usage
24613560 ± 5% -78.2% 5355682 ± 1% cpuidle.C3-IVB.time
196895 ± 3% -63.3% 72279 ± 1% cpuidle.C3-IVB.usage
12706889 ± 5% -45.6% 6918345 ± 1% cpuidle.C6-IVB.time
37427 ± 8% -75.0% 9356 ± 1% cpuidle.C6-IVB.usage
181016 ± 1% +21.0% 218998 ± 2% cpuidle.POLL.usage
0.50 ± 0% +1.7% 0.51 ± 0% perf-stat.branch-miss-rate%
10.45 ± 3% -16.3% 8.75 ± 1% perf-stat.cache-miss-rate%
1.051e+11 ± 1% +13.1% 1.189e+11 ± 1% perf-stat.cache-references
2.227e+08 ± 1% -6.8% 2.075e+08 ± 0% perf-stat.context-switches
15827085 ± 0% +55.5% 24608395 ± 1% perf-stat.cpu-migrations
2.82e+12 ± 1% -2.5% 2.749e+12 ± 1% perf-stat.dTLB-stores
9.469e+08 ± 1% -3.6% 9.126e+08 ± 1% perf-stat.iTLB-load-misses
0.74 ± 0% -2.7% 0.72 ± 1% perf-stat.ipc
16738 ± 11% -31.2% 11515 ± 16% sched_debug.cfs_rq:/.min_vruntime.stddev
16747 ± 11% -31.3% 11512 ± 16% sched_debug.cfs_rq:/.spread0.stddev
1107 ± 4% +11.0% 1229 ± 3% sched_debug.cfs_rq:/.util_avg.avg
1497 ± 6% +12.4% 1683 ± 3% sched_debug.cfs_rq:/.util_avg.max
239.41 ± 11% +19.2% 285.40 ± 5% sched_debug.cfs_rq:/.util_avg.stddev
226803 ± 7% -34.6% 148369 ± 8% sched_debug.cpu.avg_idle.max
69458 ± 8% -41.4% 40697 ± 35% sched_debug.cpu.avg_idle.stddev
1518435 ± 2% +16.3% 1766592 ± 6% sched_debug.cpu.nr_switches.stddev
3.33 ± 2% -8.0% 3.06 ± 1% perf-profile.calltrace.cycles-pp.__kmalloc_reserve.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg
3.02 ± 5% +16.7% 3.52 ± 2% perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_sync_key.sock_def_readable.unix_stream_sendmsg.sock_sendmsg
4.20 ± 6% +26.3% 5.30 ± 2% perf-profile.calltrace.cycles-pp.__wake_up_sync_key.sock_def_readable.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
1.01 ± 12% +38.0% 1.39 ± 9% perf-profile.calltrace.cycles-pp._raw_spin_lock.sock_sendmsg.sock_write_iter.__vfs_write.vfs_write
0.77 ± 21% +114.2% 1.66 ± 2% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__wake_up_sync_key.sock_def_readable.unix_stream_sendmsg.sock_sendmsg
2.92 ± 5% +17.5% 3.43 ± 2% perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_sync_key.sock_def_readable.unix_stream_sendmsg
4.46 ± 2% -18.0% 3.65 ± 1% perf-profile.calltrace.cycles-pp.call_cpuidle.cpu_startup_entry.start_secondary
1.05 ± 9% -22.4% 0.82 ± 7% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
5.43 ± 2% -13.8% 4.68 ± 1% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary
4.45 ± 2% -18.1% 3.65 ± 1% perf-profile.calltrace.cycles-pp.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
4.42 ± 2% -18.4% 3.61 ± 1% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
2.88 ± 5% +17.2% 3.38 ± 2% perf-profile.calltrace.cycles-pp.default_wake_function.autoremove_wake_function.__wake_up_common.__wake_up_sync_key.sock_def_readable
4.78 ± 3% -25.7% 3.55 ± 1% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry
0.71 ± 22% +116.1% 1.54 ± 2% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.__wake_up_sync_key.sock_def_readable.unix_stream_sendmsg
0.00 ± -1% +Inf% 0.82 ± 22% perf-profile.calltrace.cycles-pp.select_task_rq_fair.try_to_wake_up.default_wake_function.autoremove_wake_function.__wake_up_common
1.77 ± 7% +15.5% 2.04 ± 2% perf-profile.calltrace.cycles-pp.skb_queue_tail.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.__vfs_write
5.76 ± 4% +22.9% 7.07 ± 1% perf-profile.calltrace.cycles-pp.sock_def_readable.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.__vfs_write
5.47 ± 2% -13.8% 4.71 ± 1% perf-profile.calltrace.cycles-pp.start_secondary
2.79 ± 5% +18.4% 3.31 ± 3% perf-profile.calltrace.cycles-pp.try_to_wake_up.default_wake_function.autoremove_wake_function.__wake_up_common.__wake_up_sync_key
1.63 ± 2% -13.6% 1.41 ± 1% perf-profile.children.cycles-pp.___slab_alloc
1.79 ± 2% -14.0% 1.54 ± 1% perf-profile.children.cycles-pp.__slab_alloc
3.47 ± 5% +18.1% 4.10 ± 1% perf-profile.children.cycles-pp.__wake_up_common
4.66 ± 5% +26.3% 5.89 ± 1% perf-profile.children.cycles-pp.__wake_up_sync_key
3.78 ± 4% +34.6% 5.08 ± 1% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
3.37 ± 4% +18.9% 4.00 ± 2% perf-profile.children.cycles-pp.autoremove_wake_function
5.01 ± 1% -17.8% 4.12 ± 1% perf-profile.children.cycles-pp.call_cpuidle
6.11 ± 1% -13.4% 5.29 ± 0% perf-profile.children.cycles-pp.cpu_startup_entry
5.00 ± 1% -17.7% 4.11 ± 1% perf-profile.children.cycles-pp.cpuidle_enter
4.96 ± 1% -18.0% 4.07 ± 1% perf-profile.children.cycles-pp.cpuidle_enter_state
3.32 ± 5% +18.5% 3.93 ± 2% perf-profile.children.cycles-pp.default_wake_function
4.89 ± 1% -18.2% 4.00 ± 1% perf-profile.children.cycles-pp.intel_idle
2.36 ± 5% +47.3% 3.48 ± 4% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.76 ± 6% +54.4% 1.18 ± 4% perf-profile.children.cycles-pp.prepare_to_wait
0.69 ± 6% +89.2% 1.31 ± 4% perf-profile.children.cycles-pp.select_task_rq_fair
5.80 ± 4% +22.7% 7.11 ± 1% perf-profile.children.cycles-pp.sock_def_readable
5.47 ± 2% -13.8% 4.71 ± 1% perf-profile.children.cycles-pp.start_secondary
3.21 ± 4% +19.8% 3.85 ± 2% perf-profile.children.cycles-pp.try_to_wake_up
1.48 ± 2% +10.7% 1.64 ± 3% perf-profile.children.cycles-pp.unix_write_space
2.44 ± 1% +16.1% 2.83 ± 1% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
4.89 ± 1% -18.2% 4.00 ± 1% perf-profile.self.cycles-pp.intel_idle
2.36 ± 5% +47.3% 3.48 ± 4% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
1.58 ± 0% +13.3% 1.79 ± 3% perf-profile.self.cycles-pp.sock_def_readable
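
For readers who want to sanity-check the %change column, here is a minimal Python sketch. It assumes %change is the relative difference between the two commits' mean values, (new - base) / base * 100, which is consistent with the rows above. The helper names percent_change and format_change are hypothetical and are not part of lkp-tests.

```python
import math


def percent_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent (assumed formula)."""
    if base == 0:
        # A zero base has no finite relative change; the report prints
        # such rows as +Inf%, so return a signed infinity here.
        if new == 0:
            return 0.0
        return math.copysign(math.inf, new)
    return (new - base) / base * 100.0


def format_change(base: float, new: float) -> str:
    """Render the change in the style of the %change column, e.g. '-8.0%'."""
    change = percent_change(base, new)
    if math.isinf(change):
        return "+Inf%" if change > 0 else "-Inf%"
    return f"{change:+.1f}%"


# (base, new) mean values copied from rows of the table above.
rows = {
    "hackbench.time.involuntary_context_switches": (34162896, 31437510),
    "perf-stat.cpu-migrations": (15827085, 24608395),
    "softirqs.NET_RX": (20736, 73507),
}

for name, (base, new) in rows.items():
    print(f"{format_change(base, new):>8}  {name}")
```

The table prints rounded means, so recomputing from the displayed values may not reproduce every row exactly. turbostat.CPU%c3 is one such case: the displayed 0.03 and 0.02 give about -33%, while the report shows -50.0%.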
perf-stat.cpu-migrations
[ASCII plot, y-axis 1.4e+07 to 2.8e+07: bisect-good samples (*) at about 1.6e+07; bisect-bad samples (O) at about 2.4e+07 to 2.7e+07]
turbostat.Avg_MHz
[ASCII plot, y-axis 3430 to 3490: bisect-good samples (*) at about 3430 to 3440; bisect-bad samples (O) at about 3475 to 3485]
turbostat._Busy (listed as turbostat.%Busy in the table above)
[ASCII plot, y-axis 92.8 to 94.6: bisect-good samples (*) at about 92.9 to 93.2; bisect-bad samples (O) at about 94.1 to 94.5]
hackbench.time.percent_of_cpu_this_job_got
[ASCII plot, y-axis 732 to 746: bisect-good samples (*) at about 733 to 735; bisect-bad samples (O) at about 742 to 745]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong