MM SPM, bench tool and general Simd/v3 #11617

victorjulien · 2024-08-07T19:07:20Z

Tool to benchmark detection engine content inspection, which is the inspection of individual groups of content, etc. matches for a buffer. Also add a set of basic tests for the various single pattern matching (SPM) implementations. Output is CSV: written to files for the rule-based tests and to stdout for the SPM tests.

Rename to match coding style. Update callers.

AVX2 implementation that compares 32 bytes at a time. Rearrange code to make parts reusable. When the (remaining) buffer is smaller than 32 bytes, fall back to the other SIMD implementations that handle 16 bytes of data per iteration. Add 16/32/64 byte implementations using AVX512.
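The fallback order described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `memcmp_chunked_sketch` is a hypothetical name, the return convention (0 for equal, 1 for different) is an assumption, and the AVX512 step is left out. The SIMD paths are only compiled when the compiler defines `__AVX2__` / `__SSE2__`; otherwise the scalar loop handles everything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__AVX2__) || defined(__SSE2__)
#include <immintrin.h>
#endif

/* Hypothetical helper, not the PR's function: returns 0 when the buffers
 * are equal and 1 otherwise (assumed convention). */
static int memcmp_chunked_sketch(const void *s1, const void *s2, size_t len)
{
    const uint8_t *a = s1;
    const uint8_t *b = s2;
    size_t off = 0;

    assert(len == 0 || (a != NULL && b != NULL));

#if defined(__AVX2__)
    /* 32 bytes per iteration while at least 32 bytes remain */
    while (len - off >= 32) {
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + off));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + off));
        /* one bit per byte, set where the two bytes are equal */
        uint32_t eq = (uint32_t)_mm256_movemask_epi8(_mm256_cmpeq_epi8(va, vb));
        if (eq != 0xFFFFFFFFu)
            return 1;
        off += 32;
    }
#endif
#if defined(__SSE2__)
    /* remaining buffer smaller than 32 bytes: 16 bytes per iteration */
    while (len - off >= 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + off));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + off));
        if (_mm_movemask_epi8(_mm_cmpeq_epi8(va, vb)) != 0xFFFF)
            return 1;
        off += 16;
    }
#endif
    /* scalar tail for the last few bytes (or for builds without SIMD) */
    for (; off < len; off++) {
        if (a[off] != b[off])
            return 1;
    }
    return 0;
}
```

Because each stage only consumes whole chunks and advances a shared offset, the same tail code can be reused by every wider implementation.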

Implement for AVX512, AVX2 and SSE42.

Wrapper around `memmem`. The case-sensitive search is implemented by calling `memmem` directly. As there is no case-insensitive variant available, a wrapper around `memmem` is created that takes a sliding window approach:

1. take a slice of the haystack
2. convert it to lowercase
3. search it using `memmem`
4. move the window forward
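A minimal sketch of the sliding window steps, under stated assumptions: the function name, the 64 byte `WINDOW_SIZE`, the requirement that the needle is already lowercase, and the scalar lowercasing are all illustrative choices, not taken from this PR. The one detail that matters for correctness is that consecutive windows overlap by `needle_len - 1` bytes, so a match that straddles a window boundary is still found.

```c
#define _GNU_SOURCE /* memmem() is a GNU/BSD extension */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Explicit prototype in case the feature macro above is seen too late. */
void *memmem(const void *haystack, size_t haystacklen, const void *needle, size_t needlelen);

#define WINDOW_SIZE 64 /* hypothetical window size */

/* Hypothetical helper: case-insensitive search. needle_lc must already be
 * lowercase and must fit in one window. Returns a pointer into haystack
 * or NULL. */
static const uint8_t *memmem_nocase_sketch(const uint8_t *haystack, size_t haystack_len,
        const uint8_t *needle_lc, size_t needle_len)
{
    uint8_t window[WINDOW_SIZE];
    size_t off = 0;

    assert(needle_len <= WINDOW_SIZE);
    if (needle_len == 0 || needle_len > haystack_len)
        return NULL;

    while (haystack_len - off >= needle_len) {
        /* 1. take a slice of the haystack */
        size_t n = haystack_len - off;
        if (n > WINDOW_SIZE)
            n = WINDOW_SIZE;
        /* 2. convert it to lowercase (ASCII only) */
        for (size_t i = 0; i < n; i++) {
            uint8_t c = haystack[off + i];
            window[i] = (c >= 'A' && c <= 'Z') ? (uint8_t)(c + 0x20) : c;
        }
        /* 3. search it using memmem */
        const uint8_t *hit = memmem(window, n, needle_lc, needle_len);
        if (hit != NULL)
            return haystack + off + (size_t)(hit - window);
        if (n < WINDOW_SIZE)
            break; /* this window already reached the end of the haystack */
        /* 4. move the window forward, keeping needle_len - 1 bytes of overlap */
        off += n - (needle_len - 1);
    }
    return NULL;
}
```

Needles longer than the window would need a separate path; the sketch simply asserts that case away.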

For the transform tolower, use new SIMD enabled tolower logic. On an AVX2 system, this gives a noticeable speed up. Prefilter profiling output ("Stats for: total") for `file_data#172 (to_lowercase)`:

| Run | Date | Ticks | Called | Max Ticks | Avg | Bytes | Called | Max Bytes | Avg Bytes | Ticks/Byte |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Non-SIMD | 8/4/2024 -- 20:06:58 | 3318781786 | 424799 | 3201525 | 7812.00 | 252733244 | 20026 | 98304 | 12620.00 | 13.00 |
| AVX2 | 8/4/2024 -- 20:08:11 | 865647888 | 424798 | 487271 | 2037.00 | 249009608 | 20326 | 98304 | 12250.00 | 3.00 |
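For readers less familiar with SIMD, here is a rough sketch of what a vectorized tolower copy can look like, shown with SSE2 for brevity rather than the AVX2/AVX512 variants mentioned in this PR. The function name is hypothetical. The idea is to build a per-byte mask of uppercase letters using two range comparisons against `0x40` ("A" - 1) and `0x5B` ("Z" + 1), then add `0x20` only where the mask is set. The compare is signed, so bytes `0x80` and above compare as negative and are left untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: copy src to dst, lowercasing ASCII 'A'..'Z'. */
static void memcpy_tolower_sketch(uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t off = 0;

    assert(len == 0 || (dst != NULL && src != NULL));

#if defined(__SSE2__)
    const __m128i lo = _mm_set1_epi8(0x40);    /* "A" - 1 */
    const __m128i hi = _mm_set1_epi8(0x5B);    /* "Z" + 1 */
    const __m128i delta = _mm_set1_epi8(0x20); /* 'a' - 'A' */

    for (; len - off >= 16; off += 16) {
        __m128i b = _mm_loadu_si128((const __m128i *)(src + off));
        /* 0xFF in every lane where 0x40 < byte < 0x5B, else 0x00 */
        __m128i is_upper = _mm_and_si128(_mm_cmpgt_epi8(b, lo), _mm_cmpgt_epi8(hi, b));
        /* add 0x20 only in the uppercase lanes */
        b = _mm_add_epi8(b, _mm_and_si128(is_upper, delta));
        _mm_storeu_si128((__m128i *)(dst + off), b);
    }
#endif
    /* scalar tail, also the full path on builds without SSE2 */
    for (; off < len; off++) {
        uint8_t c = src[off];
        dst[off] = (c >= 'A' && c <= 'Z') ? (uint8_t)(c + 0x20) : c;
    }
}
```

The wider variants follow the same pattern with 32 or 64 byte registers and the matching intrinsics.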

codecov · 2024-08-07T20:03:37Z

Codecov Report

Attention: Patch coverage is 56.81159% with 149 lines in your changes missing coverage. Please review.

Project coverage is 82.48%. Comparing base (61cb14d) to head (68ab0c2).
Report is 140 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #11617      +/-   ##
==========================================
- Coverage   82.53%   82.48%   -0.05%     
==========================================
  Files         923      924       +1     
  Lines      248838   249228     +390     
==========================================
+ Hits       205381   205587     +206     
- Misses      43457    43641     +184

Flag	Coverage Δ
fuzzcorpus	`60.48% <45.45%> (-0.09%)`	⬇️
livemode	`18.57% <27.84%> (-0.08%)`	⬇️
pcap	`43.98% <45.45%> (-0.16%)`	⬇️
suricata-verify	`61.78% <45.96%> (-0.04%)`	⬇️
unittests	`59.04% <58.46%> (-0.04%)`	⬇️

Flags with carried forward coverage won't be shown.

suricata-qa · 2024-08-07T23:50:02Z

Information: QA ran without warnings.

Pipeline 22015

src/util-spm.h

jasonish · 2024-08-08T17:50:42Z

src/util-memcmp.h

 #define UPPER_LOW   0x40 /* "A" - 1 */
 #define UPPER_HIGH  0x5B /* "Z" + 1 */

-static inline int SCMemcmpLowercase(const void *s1, const void *s2, size_t len)
+// clang-format off
+static char scmemcmp_sse41_ul[16] __attribute__((aligned(16))) = {


In general, as someone not that familiar with sse/avx/etc, I'd like to see more comments around what these are for?

src/util-memcmp.c

inashivb

Fun code.
Comments:

Minor nits inline. Nothing blocking the merge.
Tool worked well and created the expected benchmarks ✅
SIMD calculations to convert to lower seemed correct ✅
memmem SPM seemed to work as intended ✅

tools/benches/bench-content-inspect/main.c

inashivb · 2024-08-12T11:02:51Z

tools/benches/bench-content-inspect/main.c

+
+        uint64_t nsecs = diff.tv_sec * 1000000000ULL + diff.tv_nsec;
+        uint64_t nsecs_avg = nsecs / cnt;
+        total_nsecs += nsecs_avg;


Naming is a bit odd. It makes you wonder why the average gets added to the total.
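One possible way to make the intent more obvious, assuming the total is meant to be a sum of per-test averages: give the accumulator a name that says so. All names below (`timespec_diff_ns`, `struct bench_totals`, `bench_record`) are hypothetical and only illustrate the naming, not the tool's actual code.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Elapsed time between two timestamps, in nanoseconds. */
static uint64_t timespec_diff_ns(struct timespec start, struct timespec end)
{
    struct timespec diff;
    diff.tv_sec = end.tv_sec - start.tv_sec;
    diff.tv_nsec = end.tv_nsec - start.tv_nsec;
    if (diff.tv_nsec < 0) {
        /* borrow one second */
        diff.tv_sec--;
        diff.tv_nsec += 1000000000L;
    }
    return (uint64_t)diff.tv_sec * 1000000000ULL + (uint64_t)diff.tv_nsec;
}

/* Hypothetical accumulator: the field name states that averages are summed. */
struct bench_totals {
    uint64_t sum_of_avg_ns; /* sum of the per-test average run times */
    uint64_t evals;         /* number of tests recorded */
};

/* Record one test that ran `iterations` times in `run_ns` nanoseconds. */
static void bench_record(struct bench_totals *t, uint64_t run_ns, uint64_t iterations)
{
    assert(iterations > 0);
    t->sum_of_avg_ns += run_ns / iterations;
    t->evals++;
}
```

In use, the two timestamps would typically come from `clock_gettime(CLOCK_MONOTONIC, ...)` taken before and after the measured loop.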

inashivb · 2024-08-12T11:06:55Z

tools/benches/bench-content-inspect/main.c

+        uint64_t nsecs = diff.tv_sec * 1000000000ULL + diff.tv_nsec;
+        uint64_t nsecs_avg = nsecs / cnt;
+        total_nsecs += nsecs_avg;
+        total_evals++;


nit: Could we just use i?

inashivb · 2024-08-28T08:48:57Z

src/util-memcmp.h

+        mask1 = _mm256_cmpgt_epi8(b2, upper1);
+        /* mark all chars lower than upper2 */
+        mask2 = _mm256_cmpgt_epi8(upper2, b2);
+        /* merge the two, leaving only those that are true in both */


nit: They just have to be equal, not necessarily true.

victorjulien · 2024-09-01

Good catch, but not a nit. I think this is a logic error. The goal is to take both masks (one for the lower bound and one for the upper bound) and create a mask that is only true for bytes that satisfy both. Switching to `_mm256_and_si256`.

inashivb · 2024-09-02 (edited)

OK. I tested this one on a small dataset and it gave correct results. 🤔
Could you please share a string where this shows the logical error?

Edit: checked it again. It seems like an unneeded op indeed, but not wrong. It's just that the condition in which both masks are false cannot happen.
In clearer words: this looks to me like a not-so-straightforward way of ANDing the masks, as intended. It would indeed be good to replace it with a proper AND call. lmk wdyt
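The point about the two merge strategies can be checked exhaustively with a small scalar model. This is a sketch under an assumption: the thread does not quote the original merge line, so the model assumes it compared the two masks for equality per byte. It emulates one signed 8-bit lane of `cmpgt` and counts the byte values for which "masks are equal" and "masks ANDed" disagree.

```c
#include <assert.h>
#include <stdint.h>

#define UPPER_LOW  0x40 /* "A" - 1 */
#define UPPER_HIGH 0x5B /* "Z" + 1 */

/* One lane of a signed 8-bit greater-than compare: 0xFF if a > b, else 0x00. */
static uint8_t lane_cmpgt(int8_t a, int8_t b)
{
    return (a > b) ? 0xFF : 0x00;
}

/* Count byte values where merging by equality differs from merging by AND. */
static int count_mask_disagreements(void)
{
    int disagreements = 0;
    for (int v = 0; v < 256; v++) {
        /* reinterpret as signed, as the SIMD compare does
         * (wraps on GCC and Clang) */
        int8_t b = (int8_t)(uint8_t)v;
        uint8_t mask1 = lane_cmpgt(b, UPPER_LOW);  /* byte > 0x40 */
        uint8_t mask2 = lane_cmpgt(UPPER_HIGH, b); /* byte < 0x5B */
        uint8_t by_eq = (mask1 == mask2) ? 0xFF : 0x00;
        uint8_t by_and = mask1 & mask2;
        if (by_eq != by_and)
            disagreements++;
    }
    return disagreements;
}
```

The two only differ when both masks are false, which would need a byte that is at most `0x40` and at least `0x5B` at the same time, so the count stays at zero. The AND form is still the clearer way to express the intent.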

catenacyber

Needs a rebase and there is a logic error to fix apparently

catenacyber · 2025-01-28T15:06:26Z

What is the plan for this, Victor?

victorjulien added 8 commits August 7, 2024 18:46

github-actions: build and run bench tool

8226070

memcpy: rename memcpy_tolower

136c972

Rename to match coding style. Update callers.

memcpy: implement tolower using SIMD

f89b316

Implement for AVX512, AVX2 and SSE42.

spm: minor unittest cleanup

1467690

victorjulien requested review from jasonish and a team as code owners August 7, 2024 19:07

This was referenced Aug 7, 2024

New spm "mm"; Simd optimizations v2 #11615

Closed

Ci bench tool/v18 #11614

Closed

jasonish reviewed Aug 8, 2024

src/util-spm.h (resolved)

jasonish reviewed Aug 8, 2024

src/util-memcmp.c (resolved)

inashivb reviewed Aug 28, 2024

catenacyber added the needs rebase Needs rebase to master label Sep 3, 2024

catenacyber requested changes Sep 3, 2024

victorjulien marked this pull request as draft September 11, 2024 08:29

victorjulien mentioned this pull request Sep 13, 2024

Implement Memcmp SIMD for arm64 NEON and SVE #11725

Open

5 tasks

victorjulien commented Aug 7, 2024

codecov bot commented Aug 7, 2024 (edited)

suricata-qa commented Aug 7, 2024

jasonish Aug 8, 2024

inashivb left a comment

inashivb Aug 12, 2024

inashivb Aug 12, 2024

inashivb Aug 28, 2024

victorjulien Sep 1, 2024

inashivb Sep 2, 2024 (edited)

catenacyber left a comment

catenacyber commented Jan 28, 2025
