FYI, we noticed a -3.8% regression of netperf.Throughput_Mbps due to commit:
commit 99a544df069c325e161d4f5246586161d84e9e82 ("timer: Switch to a non cascading wheel")
https://github.com/yyu168/linux.git WIP.timers
in testcase: netperf
on test machine: 48 threads Ivytown Ivy Bridge-EP with 64G memory
with the following parameters:
cluster=cs-localhost/cpufreq_governor=performance/ip=ipv4/nr_threads=200%/runtime=300s/test=SCTP_STREAM
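The parameter line above is a flat list of key=value fields separated by "/". For scripting against it, a minimal sketch of a parser follows. The helper name parse_job_params is hypothetical and not part of lkp-tests; it assumes no value contains "/" or "=", which holds for the line above.

```python
def parse_job_params(spec):
    """Split a 'key=value/key=value' parameter string into a dict.

    Hypothetical helper for illustration; not an lkp-tests API.
    Assumes values contain neither '/' nor '='.
    """
    params = {}
    for field in spec.split("/"):
        key, sep, value = field.partition("=")
        if not sep:
            # A field without '=' does not fit the assumed format.
            raise ValueError("field without '=': %r" % field)
        params[key] = value
    return params


spec = ("cluster=cs-localhost/cpufreq_governor=performance/ip=ipv4/"
        "nr_threads=200%/runtime=300s/test=SCTP_STREAM")
params = parse_job_params(spec)
print(params["test"], params["nr_threads"], params["runtime"])
```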
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
cs-localhost/gcc-4.9/performance/ipv4/x86_64-rhel/200%/debian-x86_64-2015-02-07.cgz/300s/ivb43/SCTP_STREAM/netperf
commit:
672573b753 ("timer: Reduce the CPU index space to 256k")
99a544df06 ("timer: Switch to a non cascading wheel")
672573b753e951a6 99a544df069c325e161d4f5246
---------------- --------------------------
fail:runs %reproduction fail:runs
| | |
:4 25% 1:4 kmsg.DHCP/BOOTP:Reply_not_for_us,op[#]xid[#]
1:4 -25% :4 kmsg.Spurious_LAPIC_timer_interrupt_on_cpu
%stddev %change %stddev
\ | \
4.25 ± 0% -3.8% 4.09 ± 0% netperf.Throughput_Mbps
73653 ± 0% -2.9% 71503 ± 0% netperf.time.voluntary_context_switches
748.50 ± 9% -22.5% 580.25 ± 3% cpuidle.POLL.usage
2745 ± 2% -5.1% 2605 ± 1% vmstat.system.cs
5.58 ± 0% +0.8% 5.62 ± 0% turbostat.%Busy
86.00 ± 0% +2.0% 87.75 ± 0% turbostat.Avg_MHz
1265 ± 6% -8.4% 1158 ± 4% sched_debug.cfs_rq:/.min_vruntime.avg
3948 ± 2% -8.7% 3603 ± 1% sched_debug.cpu.nr_load_updates.stddev
0.23 ± 1% +8.0% 0.25 ± 3% sched_debug.cpu.nr_running.stddev
4.66 ± 7% +29.9% 6.05 ± 7% perf-profile.cycles-pp.__tick_nohz_idle_enter.tick_nohz_irq_exit.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt
2.29 ± 27% -26.2% 1.69 ± 5% perf-profile.cycles-pp.call_cpuidle.cpu_startup_entry.rest_init.start_kernel.x86_64_start_reservations
2.29 ± 27% -26.2% 1.69 ± 5% perf-profile.cycles-pp.cpuidle_enter.call_cpuidle.cpu_startup_entry.rest_init.start_kernel
1.33 ± 3% -20.5% 1.06 ± 15% perf-profile.cycles-pp.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry.rest_init
0.00 ± -1% +Inf% 1.58 ± 18% perf-profile.cycles-pp.get_next_timer_interrupt.tick_nohz_stop_sched_tick.__tick_nohz_idle_enter.tick_nohz_irq_exit.irq_exit
8.03 ± 5% +16.3% 9.34 ± 3% perf-profile.cycles-pp.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle
2.73 ± 44% -31.3% 1.87 ± 1% perf-profile.cycles-pp.rest_init.start_kernel.x86_64_start_reservations.x86_64_start_kernel
2.73 ± 44% -31.3% 1.87 ± 1% perf-profile.cycles-pp.start_kernel.x86_64_start_reservations.x86_64_start_kernel
4.83 ± 6% +29.9% 6.28 ± 6% perf-profile.cycles-pp.tick_nohz_irq_exit.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter
3.53 ± 8% +38.3% 4.88 ± 7% perf-profile.cycles-pp.tick_nohz_stop_sched_tick.__tick_nohz_idle_enter.tick_nohz_irq_exit.irq_exit.smp_apic_timer_interrupt
2.73 ± 44% -31.3% 1.87 ± 1% perf-profile.cycles-pp.x86_64_start_kernel
2.73 ± 44% -31.3% 1.87 ± 1% perf-profile.cycles-pp.x86_64_start_reservations.x86_64_start_kernel
3.797e+10 ± 1% +6.7% 4.052e+10 ± 3% perf-stat.L1-dcache-loads
2.954e+08 ± 1% +9.0% 3.219e+08 ± 2% perf-stat.L1-dcache-prefetch-misses
1.406e+09 ± 2% +4.3% 1.466e+09 ± 1% perf-stat.LLC-prefetches
1.897e+10 ± 3% +16.1% 2.202e+10 ± 3% perf-stat.branch-instructions
2.176e+09 ± 3% +5.2% 2.29e+09 ± 3% perf-stat.branch-load-misses
1.84e+10 ± 3% +17.5% 2.161e+10 ± 4% perf-stat.branch-loads
4.727e+10 ± 0% +1.7% 4.805e+10 ± 1% perf-stat.bus-cycles
831778 ± 2% -5.0% 789901 ± 1% perf-stat.context-switches
9.632e+11 ± 1% +3.5% 9.97e+11 ± 0% perf-stat.cpu-cycles
174418 ± 0% -1.7% 171529 ± 0% perf-stat.cpu-migrations
3.815e+10 ± 2% +8.2% 4.129e+10 ± 5% perf-stat.dTLB-loads
20335861 ± 1% -16.1% 17057586 ± 2% perf-stat.iTLB-loads
9.431e+10 ± 2% +14.8% 1.083e+11 ± 2% perf-stat.instructions
3.957e+08 ± 0% -6.2% 3.711e+08 ± 2% perf-stat.node-loads
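As a sanity check, the %change column can be recomputed from the two per-commit means in the table. The sketch below assumes %change is (new - base) / base relative to the parent commit 672573b753, which matches the reported rows; percent_change is a hypothetical helper name, not something from lkp-tests. Because the printed means are rounded, recomputed values for other rows may differ in the last digit.

```python
def percent_change(base, new):
    """Relative change of `new` against `base`, in percent.

    Hypothetical helper; assumes the report's %change column is
    (new - base) / base * 100 with the parent commit as base.
    """
    return (new - base) / base * 100.0


# Means copied from the comparison table above:
# (parent commit 672573b753, tested commit 99a544df06).
rows = {
    "netperf.Throughput_Mbps": (4.25, 4.09),
    "netperf.time.voluntary_context_switches": (73653, 71503),
    "cpuidle.POLL.usage": (748.50, 580.25),
}

changes = {name: percent_change(base, new) for name, (base, new) in rows.items()}
for name, change in changes.items():
    print("%+.1f%% %s" % (change, name))
```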
netperf.Throughput_Mbps
(ASCII plot, layout lost in extraction. The y-axis spans 4.08 to 4.26. Bisect-good samples (*) sit just below 4.26; bisect-bad samples (O) sit between 4.08 and 4.10.)
netperf.time.voluntary_context_switches
(ASCII plot, layout lost in extraction. The y-axis spans 70000 to 74500. Bisect-good samples (*) sit between roughly 73500 and 74500; bisect-bad samples (O) sit between roughly 70000 and 72000.)
[*] bisect-good sample
[O] bisect-bad sample
Thanks,
Xiaolong