Hyperscan v4.4.0
by Barr, Matthew
Hi All,
We have just released Hyperscan v4.4.0 on Github.
https://github.com/01org/hyperscan/releases/tag/v4.4.0
This release contains many changes since v4.3; however, other than one small addition, the external API is unchanged.
Here is the change log entry:
[4.4.0] 2017-01-20
- Introduce the "fat runtime" build. This will build several variants of the
  Hyperscan scanning engine specialised for different processor feature sets,
  and use the appropriate one for the host at runtime. This uses the "ifunc"
  indirect function attribute provided by GCC and is currently available on
  Linux only, where it is the default for release builds.
- New API function: add the `hs_valid_platform()` function. This function tests
  whether the host provides the SSSE3 instruction set required by Hyperscan.
- Introduce a new standard benchmarking tool, "hsbench". This provides an easy
  way to measure Hyperscan's performance for a particular set of patterns and
  corpus of data to be scanned.
- Introduce a 64-bit GPR LimEx NFA model, which uses 64-bit GPRs on 64-bit
  hosts and SSE registers on 32-bit hosts.
- Introduce a new DFA model ("McSheng") which is a hybrid of the existing
  McClellan and Sheng models. This improves scanning performance for some
  cases.
- Introduce lookaround specialisations to improve scanning performance.
- Improve the handling of long literals by moving confirmation to the Rose
  interpreter and simplifying the hash table used to track them in streaming
  mode.
- Improve compile time optimisation for removing redundant paths from
  expression graphs.
- Build: improve support for building with MSVC toolchain.
- Reduce the size of small write DFAs used for small scans in block mode.
- Introduce a custom graph type (`ue2_graph`) used in place of the Boost Graph
  Library's `adjacency_list` type. Improves compile time performance and type
  safety.
- Improve scanning performance of the McClellan DFA.
- Bugfix for a very unusual SOM case where the incorrect start offset was
  reported for a match.
- Bugfix for issue #37, removing execute permissions from some source files.
- Bugfix for issue #41, handle Windows line endings in pattern files.
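To illustrate the kind of host check the changelog describes for `hs_valid_platform()`, here is a minimal sketch that tests for SSSE3 without linking against Hyperscan. It assumes GCC or Clang on x86 and uses the compiler built-ins `__builtin_cpu_init()` and `__builtin_cpu_supports()`; the helper name `host_has_ssse3` is hypothetical, and this is not how Hyperscan itself implements the check.

```c
#include <stdbool.h>

/* Hypothetical helper (not part of Hyperscan): returns true when the host
 * CPU reports SSSE3 support. Assumes GCC or Clang on an x86 target; on any
 * other compiler or architecture it conservatively returns false. */
bool host_has_ssse3(void) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();                       /* populate the CPU feature data */
    return __builtin_cpu_supports("ssse3") != 0;
#else
    return false;
#endif
}
```

An application that links against Hyperscan 4.4 would normally call `hs_valid_platform()` itself, before compiling or scanning, rather than rolling its own check.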
regards,
Matt.
4 years, 1 month
Re: [Hyperscan] Hyperscan v4.4.0
by Jason Taylor
Great work, thanks to everyone that worked on this!
JT
On Jan 19, 2017 22:43, "Barr, Matthew" <matthew.barr(a)intel.com> wrote:
Upcoming Hyperscan 4.4 release
by Barr, Matthew
Hi all,
We are very close to releasing v4.4 of Hyperscan, which includes a new feature that we are calling the "fat runtime". This has been pushed to the develop branch on Github if anyone would like to test it before the release.
Essentially, when building the fat runtime, the Hyperscan runtime code will be compiled multiple times for different instruction sets, and these compiled objects are combined into one library. There are no changes to how user applications are built against this library.
When an application is executed, the correct version of the runtime is selected for the machine it is running on. This is done using a CPUID check for the presence of the instruction set, and then an indirect function is resolved so that the right version of each API function is used. There is no impact on function call performance, as the check and resolution are performed by the ELF loader once, when the binary is loaded.
As this requires compiler, libc, and binutils support, at this time the fat runtime will only be enabled for Linux builds where the compiler supports the indirect function "ifunc" function attribute:
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-...
This attribute should be available in all supported versions of GCC, and in recent versions of Clang and ICC. There is currently no support for this feature on non-Linux systems.
Release builds of Hyperscan will default to having the fat runtime enabled - this is the "FAT_RUNTIME" flag in CMake.
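The dispatch mechanism described above can be sketched in a few lines. This is an illustration of the GCC "ifunc" attribute only, assuming GCC or Clang on x86 Linux with glibc; the names `count_byte`, `count_generic`, `count_avx2` and `resolve_count_byte` are hypothetical and do not appear in Hyperscan. Both variants here contain the same plain C loop; in a real fat build each would be compiled for a different instruction set.

```c
#include <stddef.h>

/* Baseline variant (hypothetical): counts occurrences of c in buf. */
static size_t count_generic(const unsigned char *buf, size_t len,
                            unsigned char c) {
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        n += (buf[i] == c);
    }
    return n;
}

/* Stand-in for a variant built with AVX2 enabled. In this sketch the logic
 * is identical to the baseline. */
static size_t count_avx2(const unsigned char *buf, size_t len,
                         unsigned char c) {
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        n += (buf[i] == c);
    }
    return n;
}

typedef size_t (*count_fn)(const unsigned char *, size_t, unsigned char);

/* Resolver: run once by the dynamic loader when the symbol is bound. It
 * returns the address of the variant that matches the host CPU. */
static count_fn resolve_count_byte(void) {
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? count_avx2 : count_generic;
}

/* Public entry point: callers see one ordinary function. */
size_t count_byte(const unsigned char *buf, size_t len, unsigned char c)
    __attribute__((ifunc("resolve_count_byte")));
```

Callers simply invoke `count_byte(data, len, 'a')`; no per-call feature check is involved, which matches the point above about function call performance.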
Regards,
Matt.
Exceeding max pattern length causing segfault
by Matt Grimm
I am getting a segfault from hs_compile_multi() when trying to compile a
pattern of length ~1.5 MB. Using the same test code, I can compile a
similarly structured, but much smaller (~1K) pattern without error.
I see in grey.h that there is a fixed limit on pattern size, but also that
an error should be thrown when my pattern exceeds the limit:
https://github.com/01org/hyperscan/blob/master/src/compiler/compiler.cpp#L229
Unfortunately, it is the strlen() call on that line that is causing the
segfault:
Program received signal SIGSEGV, Segmentation fault.
strlen () at ../sysdeps/x86_64/strlen.S:106
106 ../sysdeps/x86_64/strlen.S: No such file or directory.
(gdb) bt
#0 strlen () at ../sysdeps/x86_64/strlen.S:106
#1 0x000000000040a9d6 in ue2::addExpression (ng=..., index=index@entry=0, expression=0x7ffff7075028 <error: Cannot access memory at address 0x7ffff7075028>, flags=1, ext=0x0, id=0) at /home/ubuntu/workspace/hyperscan/src/compiler/compiler.cpp:229
#2 0x0000000000408845 in ue2::hs_compile_multi_int (expressions=expressions@entry=0x8fa260, flags=flags@entry=0x8fa280, ids=ids@entry=0x8fa2a0, ext=ext@entry=0x0, elements=elements@entry=1, mode=mode@entry=1, platform=platform@entry=0x0, db=db@entry=0x7fffffffde60, comp_error=comp_error@entry=0x7fffffffde68, g=...) at /home/ubuntu/workspace/hyperscan/src/hs.cpp:229
#3 0x0000000000408fec in hs_compile_multi (expressions=0x8fa260, flags=0x8fa280, ids=0x8fa2a0, elements=1, mode=1, platform=0x0, db=0x7fffffffde60, error=0x7fffffffde68) at /home/ubuntu/workspace/hyperscan/src/hs.cpp:297
#4 0x0000000000406aae in main (argc=2, argv=0x7fffffffe1e8) at hs_compile.c:35
A few questions:
1) How can I tell at runtime what the max allowed pattern size is?
2) Is the above segfault a bug or a misuse of the library?
Thanks,
m.
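The thread does not establish the cause of this fault. The backtrace shows `strlen()` reading an expression pointer that gdb cannot access, and the compile functions take NUL-terminated C strings, so one thing worth ruling out on the caller's side is how the large pattern buffer is allocated and terminated. The sketch below is one possible way to load a pattern file into a heap buffer with an explicit terminator; the helper name `read_pattern_file` is hypothetical, and the test program in the report may well do something different.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole file at `path` into a heap buffer,
 * appends a NUL terminator, and stores the byte count in *len_out (if it is
 * non-NULL). Returns NULL on any error. The caller frees the result. */
char *read_pattern_file(const char *path, size_t *len_out) {
    FILE *f = fopen(path, "rb");
    if (!f) {
        return NULL;
    }
    if (fseek(f, 0, SEEK_END) != 0) {
        fclose(f);
        return NULL;
    }
    long size = ftell(f);
    if (size < 0) {
        fclose(f);
        return NULL;
    }
    rewind(f);

    /* One extra byte for the terminator that strlen() relies on. */
    char *buf = malloc((size_t)size + 1);
    if (!buf) {
        fclose(f);
        return NULL;
    }
    size_t got = fread(buf, 1, (size_t)size, f);
    fclose(f);
    if (got != (size_t)size) {
        free(buf);
        return NULL;
    }
    buf[got] = '\0';
    if (len_out) {
        *len_out = got;
    }
    return buf;
}
```

Allocating on the heap also avoids placing a buffer of around 1.5 MB on the stack, and the returned length can be compared against whatever limit the library enforces before the pattern is passed to it.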
Encountering "no such instruction" errors and poor performance with Hyperscan
by Datong Li
Hi all, has anyone else encountered the problem below?
[root@XXXX hs_build]# cmake --build .
/usr/bin/make64 MAC=64
Scanning dependencies of target ragel_Parser
[ 0%] Generating src/parser/Parser.cpp
[ 0%] Built target ragel_Parser
Scanning dependencies of target hs_exec
[ 0%] Building C object CMakeFiles/hs_exec.dir/src/alloc.c.o
[ 1%] Building C object CMakeFiles/hs_exec.dir/src/runtime.c.o
/tmp/ccoj8u4q.s: Assembler messages:
/tmp/ccoj8u4q.s:460: Error: no such instruction: `vpbroadcastb %xmm2,%xmm2'
/tmp/ccoj8u4q.s:470: Error: no such instruction: `vpbroadcastb %xmm3,%xmm3'
/tmp/ccoj8u4q.s:836: Error: no such instruction: `vpbroadcastb %xmm2,%xmm2'
/tmp/ccoj8u4q.s:846: Error: no such instruction: `vpbroadcastb %xmm3,%xmm3'
/tmp/ccoj8u4q.s:1191: Error: no such instruction: `vpbroadcastb %xmm1,%xmm1'
/tmp/ccoj8u4q.s:1318: Error: no such instruction: `vpbroadcastb %xmm1,%xmm1'
/tmp/ccoj8u4q.s:1933: Error: no such instruction: `shlx %r8,%rax,%rax'
/tmp/ccoj8u4q.s:2033: Error: no such instruction: `shlx %rdx,%r11,%rdx'
/tmp/ccoj8u4q.s:2052: Error: no such instruction: `shrx %esi,%eax,%eax'
/tmp/ccoj8u4q.s:2132: Error: no such instruction: `shlx %rax,%r12,%rax'
/tmp/ccoj8u4q.s:4303: Error: no such instruction: `shlx %rdx,%r11,%rdx'
/tmp/ccoj8u4q.s:4322: Error: no such instruction: `shrx %r8d,%eax,%eax'
/tmp/ccoj8u4q.s:4367: Error: no such instruction: `shlx %rax,%r15,%rax'
/tmp/ccoj8u4q.s:4437: Error: no such instruction: `shlx %rsi,%rax,%rax'
/tmp/ccoj8u4q.s:5203: Error: no such instruction: `shlx %rsi,%rax,%rax'
/tmp/ccoj8u4q.s:5388: Error: no such instruction: `shlx %rsi,%rax,%rax'
/tmp/ccoj8u4q.s:5675: Error: no such instruction: `shlx %rdx,%rax,%rdx'
/tmp/ccoj8u4q.s:5694: Error: no such instruction: `shrx %edi,%eax,%eax'
/tmp/ccoj8u4q.s:5747: Error: no such instruction: `shlx %rax,%rsi,%rax'
/tmp/ccoj8u4q.s:5838: Error: no such instruction: `shlx %rdx,%rax,%rdx'
/tmp/ccoj8u4q.s:5857: Error: no such instruction: `shrx %r8d,%eax,%eax'
/tmp/ccoj8u4q.s:5910: Error: no such instruction: `shlx %rax,%rsi,%rax'
/tmp/ccoj8u4q.s:6527: Error: no such instruction: `shlx %rdx,%r11,%rdx'
/tmp/ccoj8u4q.s:6546: Error: no such instruction: `shrx %r8d,%eax,%eax'
/tmp/ccoj8u4q.s:6590: Error: no such instruction: `shlx %rax,%r14,%rax'
/tmp/ccoj8u4q.s:6660: Error: no such instruction: `shlx %rsi,%rax,%rax'
/tmp/ccoj8u4q.s:7455: Error: no such instruction: `shlx %rdx,%r11,%rdx'
/tmp/ccoj8u4q.s:7474: Error: no such instruction: `shrx %r8d,%eax,%eax'
/tmp/ccoj8u4q.s:7519: Error: no such instruction: `shlx %rax,%r15,%rax'
/tmp/ccoj8u4q.s:7589: Error: no such instruction: `shlx %rsi,%rax,%rax'
/tmp/ccoj8u4q.s:8398: Error: no such instruction: `shlx %rdi,%rax,%rax'
/tmp/ccoj8u4q.s:8976: Error: no such instruction: `shlx %r8,%rax,%rax'
/tmp/ccoj8u4q.s:9384: Error: no such instruction: `shlx %rdx,%rax,%rdx'
/tmp/ccoj8u4q.s:9403: Error: no such instruction: `shrx %r8d,%eax,%eax'
/tmp/ccoj8u4q.s:9456: Error: no such instruction: `shlx %rax,%rdi,%rax'
/tmp/ccoj8u4q.s:9547: Error: no such instruction: `shlx %rdx,%rax,%rdx'
/tmp/ccoj8u4q.s:9566: Error: no such instruction: `shrx %ecx,%eax,%eax'
/tmp/ccoj8u4q.s:9619: Error: no such instruction: `shlx %rax,%r8,%rax'
/tmp/ccoj8u4q.s:9977: Error: no such instruction: `shlx %rcx,%r15,%rcx'
/tmp/ccoj8u4q.s:9996: Error: no such instruction: `shrx %r9d,%edx,%edx'
/tmp/ccoj8u4q.s:10039: Error: no such instruction: `shlx %rdx,%r14,%rdx'
/tmp/ccoj8u4q.s:10107: Error: no such instruction: `shlx %rsi,%rdx,%rdx'
make64[2]: *** [CMakeFiles/hs_exec.dir/src/runtime.c.o] Error 1
make64[1]: *** [CMakeFiles/hs_exec.dir/all] Error 2
make64: *** [all] Error 2
It looks like a CPU instruction set problem. My /proc/cpuinfo returns:
processor : 11
vendor_id : GenuineIntel
cpu family : 6
model : 63
model name : Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
stepping : 2
microcode : 0x31
cpu MHz : 2400.000
cache size : 15360 KB
physical id : 1
siblings : 6
core id : 5
cpu cores : 6
apicid : 26
initial apicid : 26
fpu : yes
fpu_exception : yes
cpuid level : 15
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor
ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2
x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm
abm arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
bogomips : 4793.25
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
[root@XXXX hs_build]# gcc -march=native -Q --help=target | grep march
-march= core-avx2
When I disabled the AVX and BMI2 instruction sets, the compilation completed,
but Hyperscan's performance is only about 500Mb/s, which is a long way from
the official numbers. I suspect this may be related to the instruction sets.
--
With best regards,
Datong Li