Snort mailing list archives

Re: Optimized implementation of AC and AC_Q pattern matching algorithms


From: abed mohammad kamaluddin <abedamu () gmail com>
Date: Tue, 29 Jan 2013 00:17:52 +0530

Hi Pablo,

I tried you modification on OCTEON today and was disappointed that
there was a drop in performance on it, however on Intel there is
significant improvement in perf. In comparison to unmodified snort
using Network traffic and ac-q:

                                      cross-compiling (Octeon)
      gcc (Intel)
Using your modification - 15% down                                      21% up
Using mine                  -  11% up
       12% up


On Intel, we can remove sindex altogether:

       state = ps[2 + xlatcase[T[0]]]

Maybe it could be because I am cross compiling ; there were lot of
wasted cycles in my case. Unless I introduce an intermediate state, I
am not getting any improvement. I had earlier analyzed this routine in
detail and each cycle reduced in this macro gave significant perf
improvements.

Could you please mention the following:
- Which compiler have you used (gcc/icc version).
- If you can provide the pcap file you are using for the tests it would help.

I will try your modifications on a bigger Intel machine sometime this
week and report the findings.

Thanks,
Abed M K


On Sun, Jan 27, 2013 at 12:27 AM, abed mohammad kamaluddin
<abedamu () gmail com> wrote:

Hi Pablo,

That looks neater :) I will try your modifications on our system on Monday
when I return to work. I used zero-alert traffic to measure perf which gave
me 11% up on 2.9.0.4.

PS: I have noticed a more than 15% overall performance drop on 2.9.4 as
compare to 2.9.3.1 when compiled with exactly same options. I haven't
analyzed it so far, however this particular optimization has the same
relative improvement on all versions (2.9.0.4, 2.9.3.1 and 2.9.4) .

Thanks,
Abed



Message: 2

Date: Sat, 26 Jan 2013 16:50:37 +0100
From: Pablo Cantos <pablocantos () gmail com>
Subject: Re: [Snort-devel] Optimized implementation of AC and AC_Q
        pattern matching algorithms
To: snort-devel () lists sourceforge net
Message-ID:

<CADQcQ1OzJDso9FEQhG77dGtSTEiX=J80bxRDMv9t8Yz7WhH0-g () mail gmail com>
Content-Type: text/plain; charset="iso-8859-1"

Hi Abed,

First of all, thanks for your contribution.

I have checked your proposal in snort 2.9.1 by using two different pcap
files, one of them (612MB-sized) throws around 900k alerts and the other
one (1GB-sized) throws just 55 alerts. I have used an AMD Turion II
dual-core mobile M520 at 2.3GHz, 512KB cache L2 by each core and 4GB RAM.

These are the performance jumps that I have obtained:

612MB file -> 4.24% for MPSE, 5.93% for ac-q
1GB file -> 7.93% for MPSE, 8.03% for ac-q

These results were obtained by measuring the times taken to analyze each
pcap file by the MPSE and AC, separately.

I have been working for several months with Snort to get improvements in
other areas of the MPSE, I have also studied the AC_SEARCH macros and I
didn't find them completely efficients but I didn't go further. Now,
after
seeing your proposal and checking the AC_SEARCH_Q again I thought that
taking small changes it could work even better.

You have suggested to pre-fetch the new state before was used it:

#define AC_SEARCH_Q \
+   *acstate_t new_state;* \
    for (; T < Tend; T++) \
    { \
        ps = NextState[state]; \
        sindex = xlatcase[T[0]]; \
+       *new_state = ps[2 + sindex];*  \
        if (ps[1]) \
        { \
            if (MatchList[state]) \
            { \
                if (_add_queue(&acsm->q,MatchList[state])) \
                { \
                    if (_process_queue(&acsm->q, Match,data)) \
                    { \
                        *current_state = state; \
                        return 1; \
                    } \
                } \
            } \
        } \
*-       state = ps[2 + sindex]; \
*
+       *state = new_state;* \
    }
 #endif

But I think the routine could be more efficient too if we just fetch the
new state in the end, and we move down one line:

#define AC_SEARCH_Q \
    for (; T < Tend; T++) \
    { \
        ps = NextState[state]; \
*-       sindex = xlatcase[T[0]]; \*
        if (ps[1]) \
        { \
            if (MatchList[state]) \
            { \
                if (_add_queue(&acsm->q,MatchList[state])) \
                { \
                    if (_process_queue(&acsm->q, Match,data)) \
                    { \
                        *current_state = state; \
                        return 1; \
                    } \
                } \
            } \
        } \
*+       sindex = xlatcase[T[0]]; \*
        state = ps[2 + sindex]; \
    }
#endif

In this way, I think the number of instructions could be reduced too.

And these are the results that I have obtained:

612MB file -> 15.66% for MPSE, 22.13% for ac-q
1GB file -> 34.49% for MPSE, 35.13% for ac-q

I will try to repeat these tests using other more powerful computers. In
any case, we will give more information on Monday.
-------------- next part --------------
An HTML attachment was scrubbed...

------------------------------

Message: 3
Date: Sat, 26 Jan 2013 11:26:55 -0500
From: Joel Esler <jesler () sourcefire com>
Subject: Re: [Snort-devel] Optimized implementation of AC and AC_Q
        pattern matching algorithms
To: Pablo Cantos <pablocantos () gmail com>
Cc: "snort-devel () lists sourceforge net"
        <snort-devel () lists sourceforge net>
Message-ID: <2B25430D-9736-49BB-9A81-EC9AF163C28F () sourcefire com>
Content-Type: text/plain; charset="utf-8"

Pablo,

Great stuff. We need to make sure we are testing and modifying against
2.9.4 though.

--
Joel Esler
Sent from my iPhone ?

On Jan 26, 2013, at 10:50 AM, Pablo Cantos <pablocantos () gmail com> wrote:

Hi Abed,

First of all, thanks for your contribution.

I have checked your proposal in snort 2.9.1 by using two different pcap
files, one of them (612MB-sized) throws around 900k alerts and the other one
(1GB-sized) throws just 55 alerts. I have used an AMD Turion II dual-core
mobile M520 at 2.3GHz, 512KB cache L2 by each core and 4GB RAM.

These are the performance jumps that I have obtained:

612MB file -> 4.24% for MPSE, 5.93% for ac-q
1GB file -> 7.93% for MPSE, 8.03% for ac-q

These results were obtained by measuring the times taken to analyze
each pcap file by the MPSE and AC, separately.

I have been working for several months with Snort to get improvements
in other areas of the MPSE, I have also studied the AC_SEARCH macros and I
didn't find them completely efficients but I didn't go further. Now, after
seeing your proposal and checking the AC_SEARCH_Q again I thought that
taking small changes it could work even better.

You have suggested to pre-fetch the new state before was used it:

#define AC_SEARCH_Q \
+   acstate_t new_state; \
    for (; T < Tend; T++) \
    { \
        ps = NextState[state]; \
        sindex = xlatcase[T[0]]; \
+       new_state = ps[2 + sindex];  \
        if (ps[1]) \
        { \
            if (MatchList[state]) \
            { \
                if (_add_queue(&acsm->q,MatchList[state])) \
                { \
                    if (_process_queue(&acsm->q, Match,data)) \
                    { \
                        *current_state = state; \
                        return 1; \
                    } \
                } \
            } \
        } \
-       state = ps[2 + sindex]; \
+       state = new_state; \
    }
#endif

But I think the routine could be more efficient too if we just fetch
the new state in the end, and we move down one line:

#define AC_SEARCH_Q \
    for (; T < Tend; T++) \
    { \
        ps = NextState[state]; \
-       sindex = xlatcase[T[0]]; \
        if (ps[1]) \
        { \
            if (MatchList[state]) \
            { \
                if (_add_queue(&acsm->q,MatchList[state])) \
                { \
                    if (_process_queue(&acsm->q, Match,data)) \
                    { \
                        *current_state = state; \
                        return 1; \
                    } \
                } \
            } \
        } \
+       sindex = xlatcase[T[0]]; \
        state = ps[2 + sindex]; \
    }
#endif

In this way, I think the number of instructions could be reduced too.

And these are the results that I have obtained:

612MB file -> 15.66% for MPSE, 22.13% for ac-q
1GB file -> 34.49% for MPSE, 35.13% for ac-q

I will try to repeat these tests using other more powerful computers.
In any case, we will give more information on Monday.


------------------------------------------------------------------------------
Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, CSS,
MVC, Windows 8 Apps, JavaScript and much more. Keep your skills current
with LearnDevNow - 3,200 step-by-step video tutorials by Microsoft
MVPs and experts. ON SALE this month only -- learn more at:
http://p.sf.net/sfu/learnnow-d2d
_______________________________________________
Snort-devel mailing list
Snort-devel () lists sourceforge net
https://lists.sourceforge.net/lists/listinfo/snort-devel
Archive:
http://sourceforge.net/mailarchive/forum.php?forum_name=snort-devel

Please visit http://blog.snort.org for the latest news about Snort!
-------------- next part --------------
An HTML attachment was scrubbed...

------------------------------



------------------------------------------------------------------------------
Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, CSS,
MVC, Windows 8 Apps, JavaScript and much more. Keep your skills current
with LearnDevNow - 3,200 step-by-step video tutorials by Microsoft
MVPs and experts. ON SALE this month only -- learn more at:
http://p.sf.net/sfu/learnnow-d2d

------------------------------


_______________________________________________
Snort-devel mailing list
Snort-devel () lists sourceforge net
https://lists.sourceforge.net/lists/listinfo/snort-devel


End of Snort-devel Digest, Vol 78, Issue 10
*******************************************



------------------------------------------------------------------------------
Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, CSS,
MVC, Windows 8 Apps, JavaScript and much more. Keep your skills current
with LearnDevNow - 3,200 step-by-step video tutorials by Microsoft
MVPs and experts. ON SALE this month only -- learn more at:
http://p.sf.net/sfu/learnnow-d2d
_______________________________________________
Snort-devel mailing list
Snort-devel () lists sourceforge net
https://lists.sourceforge.net/lists/listinfo/snort-devel
Archive:
http://sourceforge.net/mailarchive/forum.php?forum_name=snort-devel

Please visit http://blog.snort.org for the latest news about Snort!


Current thread: