Hello!
I have been testing the performance of Hyperscan after compiling it with different
optimization levels using gcc 4.9.2, and got somewhat unexpected results.
The tests were performed on the Ivy Bridge architecture.
Unexpected results:
1. -O2 runs about 5%-10% faster than -O3.
2. Compiling with the AVX extension gives the same or even lower performance than compiling
without it (using SSE4.2).
If anyone has done similar optimization tuning and can suggest any optimizations,
including using a compiler other than gcc (for example, the Intel compiler), it would be much
appreciated!
Thanks!
Stas
Hi Stas,
In general we see good performance with recent (5.x) versions of GCC and the default build
configuration - i.e. a release build with the "-O3 -march=native" flags.
It's quite possible that for a particular test, you might see different results with
different compiler flags.
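One way to check whether a gap such as the 5%-10% reported above is larger than run-to-run
noise is to repeat each build's benchmark several times and compare medians. The following
is a minimal Python sketch, not part of Hyperscan. The function names `summarize` and
`compare` are made up for illustration, the timings in the usage section are placeholders
rather than measurements, and the noise check is a crude heuristic, not a statistical test.

```python
import statistics


def summarize(times):
    """Return (median, relative spread) for a list of run times in seconds."""
    med = statistics.median(times)
    spread = (max(times) - min(times)) / med
    return med, spread


def compare(baseline, candidate):
    """Compare two sets of run times.

    Returns the relative change in median run time (negative means the
    candidate is faster) and a flag that is True when the change exceeds
    the larger of the two run-to-run spreads.
    """
    base_med, base_spread = summarize(baseline)
    cand_med, cand_spread = summarize(candidate)
    change = (cand_med - base_med) / base_med
    return change, abs(change) > max(base_spread, cand_spread)


if __name__ == "__main__":
    # Placeholder timings (seconds), NOT real measurements: replace them with
    # wall-clock times from repeated runs of the same test against each build.
    o3_times = [1.00, 1.02, 0.98]
    o2_times = [0.92, 0.93, 0.91]
    change, beyond_noise = compare(o3_times, o2_times)
    print(f"median change: {change:+.1%}, beyond noise: {beyond_noise}")
```

If the flag comes back False, the difference between the two builds is probably within
measurement noise for that test and more runs are needed before drawing conclusions.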
Different scanning engines within Hyperscan make different use of the architectural
features available. For example, the literal matching engines and the NFA can
make use of AVX2 where available, while a pattern set dominated by DFA execution will not.
In general, we would expect adding AVX2 support to the build not to reduce performance, so
that is unusual.
(I should note that Hyperscan does not directly make use of features provided by AVX, only
AVX2.)
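Since only AVX2 matters here, it is worth confirming what the test machine actually
reports. Ivy Bridge processors support AVX but not AVX2 (AVX2 arrived with Haswell), which
would be consistent with seeing no gain from an AVX build. The sketch below is a small
Python helper, not part of Hyperscan. It assumes a Linux x86 system where /proc/cpuinfo has
a "flags" line, and the function names `parse_cpu_flags` and `report` are made up for
illustration.

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (for example, on a non-x86 system)


def report(flags):
    """Map each extension mentioned in this thread to whether the CPU reports it."""
    return {name: name in flags for name in ("sse4_2", "avx", "avx2")}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only; other systems need another source
    text = path.read_text() if path.exists() else ""
    for name, present in report(parse_cpu_flags(text)).items():
        print(f"{name}: {'yes' if present else 'no'}")
```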
We can help tune your performance if you can describe your test in some detail; there are
many variables that can make a difference.
Can you share the following data?
- Which Hyperscan version you are using.
- A sample of the patterns you are scanning for.
- A description of the traffic you are scanning.
- How you are performing your test; are you using the "pcapbench" example, or some other test?
- Are you scanning in block, streaming or vectored mode?
Have you tried profiling your performance test with a sampling profiler (like the Linux
"perf" tool)? If so, the output of that will make it easy to see where the cycles are
being spent as you are scanning.
Best regards,
Justin
From: Hyperscan [mailto:hyperscan-bounces@lists.01.org] On Behalf Of Stanislav Podolsky
Sent: Sunday, May 08, 2016 11:08 PM
To: hyperscan(a)lists.01.org
Subject: [Hyperscan] hyperscan library, optimizing compilation