Firewall Wizards mailing list archives

Re: parsing logs ultra-fast inline


From: "Adrian Grigorof" <adi () grigorof com>
Date: Mon, 6 Feb 2006 23:08:59 -0500

To clarify this issue, I did not mean the total number of message types that
you can find in the documentation. I meant the message types that you find
in a typical log. While I was not thinking about a Cisco VPN concentrator in
particular when I posted my initial message, I just ran a script against
a log from such a device and there were 103 message types. I would say that's
closer to 50 than to 2049. And Cisco seems to be quite verbose when it comes
to logging. But I remember trying to get some information from a Nortel
Contivity VPN a few years ago - in debug mode there were maybe 10 or 15
message types. Which one would you prefer?

Another example is the Cisco PIX firewall - across all the logs that we
gathered from our customers, and there were quite a few, we found a little
over 160 unique message types. If you check the documentation, you'll
probably find that in theory there could be thousands as well.

I am not a super-programmer but I did a little test. I took a Cisco VPN log
entry and measured how long it took me to write a regex for it. So for this:

2005-04-22 10:10:43 Local0.Notice 192.168.83.130 4721869 04/22/2005
10:10:43.280 SEV=4 AUTH/23 RPT=1560 192.168.176.104  User [192.168.176.104]
Group [192.168.176.104] disconnected: duration: 11:36:49

it took me 2 minutes to write this:

(\d{4})-(\d{2})-(\d{2}) (\d\d:\d\d:\d\d)\t(.*?)\t(.*?)\t(.*?) (.*?) (.*?)
SEV=(.*?) (.*?) RPT=(.*?) (.*?) User \[(.*?)\] Group \[(.*?)\] (.*?):
duration: (.*)

That is a regex that captures the information applicable to an AUTH/23
message type. And I can say that it was more a typing skills issue rather
than one of regex knowledge. Now, 103 unique message types for a Cisco VPN
x 2 min = 206 min or approx. 3.5 hours. But let's make it 35 hours... it is
still just a few days of work for a medium level programmer to generate the
regexes for the most common messages from a Cisco VPN. You want to do all
2049 of them? Do the math - 8.5 days (8 hours per day) but let's be a good
boss, give the programmer a whole month! Better still, you can outsource it
to Eastern Europe or India. The good part, you only need to do it once.
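
As a rough illustration of how such a regex might be used, here is a minimal
Python sketch that applies the pattern above to the sample entry. It assumes
the original log line was tab-delimited where the pattern has \t (the
archived copy shows spaces). The function name parse_auth23 and the field
names in the returned dict are made up for this example.

```python
import re

# The pattern from the message, joined onto one line.
# Assumption: the \t separators match real tabs in the original log file.
AUTH23 = re.compile(
    r"(\d{4})-(\d{2})-(\d{2}) (\d\d:\d\d:\d\d)\t(.*?)\t(.*?)\t(.*?) (.*?) (.*?) "
    r"SEV=(.*?) (.*?) RPT=(.*?) (.*?) User \[(.*?)\] Group \[(.*?)\] (.*?): "
    r"duration: (.*)"
)

# The sample entry, rebuilt with tabs where the pattern expects them.
SAMPLE = (
    "2005-04-22 10:10:43\tLocal0.Notice\t192.168.83.130\t4721869 04/22/2005 "
    "10:10:43.280 SEV=4 AUTH/23 RPT=1560 192.168.176.104  "
    "User [192.168.176.104] Group [192.168.176.104] "
    "disconnected: duration: 11:36:49"
)


def parse_auth23(line):
    """Return a dict of selected fields, or None if the line does not match."""
    m = AUTH23.match(line)
    if m is None:
        return None
    g = m.groups()  # 17 capture groups, in the order they appear above
    return {
        "host": g[5],
        "severity": g[9],
        "msg_type": g[10],
        "user": g[13],
        "group": g[14],
        "event": g[15],
        "duration": g[16],
    }


if __name__ == "__main__":
    print(parse_auth23(SAMPLE))
```

A line of any other message type simply returns None here, which is what
makes it easy to try one pattern after another.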

So let's face it, for practical purposes, the number of message types is not
an issue, not if you are willing to hardcode your program for a particular
device. What is an issue is the size of the logs - and yes, if you have to
analyze large log files, each with a large number of message types, it may
require all sorts of tricks to do the job, including the type of parallel
processing that Marcus was mentioning. But what choice do you have when this
is the case? For performance reasons, you have to hardcode the parsing of
each message type. More than that, you have to hardcode them so the most
common ones are checked first.
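
The "most common ones first" idea can be sketched roughly as follows in
Python. The three patterns and the sample lines are hypothetical toy
examples, not real Cisco message formats. A real parser would carry one
hardcoded pattern per message type and would derive the ordering from
actual logs.

```python
import re
from collections import Counter

# Hypothetical patterns, deliberately listed in a poor order.
PATTERNS = [
    ("denied", re.compile(r"denied from (?P<src>\S+)")),
    ("connected", re.compile(r"User \[(?P<user>[^\]]*)\] connected")),
    ("disconnected", re.compile(
        r"User \[(?P<user>[^\]]*)\] .*disconnected: duration: (?P<duration>\S+)"
    )),
]

# Hypothetical sample used only to measure how often each type occurs.
SAMPLE_LINES = [
    "User [alice] Group [vpn] disconnected: duration: 0:10:00",
    "User [bob] Group [vpn] disconnected: duration: 1:00:00",
    "User [alice] connected",
]


def order_by_frequency(patterns, sample_lines):
    """Sort (name, regex) pairs so the most frequently seen types come first."""
    hits = Counter()
    for line in sample_lines:
        for name, rx in patterns:
            if rx.search(line):
                hits[name] += 1
                break  # count each line once, under the first type that matches
    return sorted(patterns, key=lambda p: hits[p[0]], reverse=True)


def classify(line, ordered_patterns):
    """Try patterns in order; return (type name, captured fields)."""
    for name, rx in ordered_patterns:
        m = rx.search(line)
        if m:
            return name, m.groupdict()
    return None, {}


ORDERED = order_by_frequency(PATTERNS, SAMPLE_LINES)

if __name__ == "__main__":
    print([name for name, _ in ORDERED])
    print(classify("User [carol] Group [vpn] disconnected: duration: 2:00:00",
                   ORDERED))
```

With the ordering computed once from a representative log, the bulk of the
lines match on the first or second attempt and the rare types pay the cost
of the longer search.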

That being said, interpreting the results of parsing is as hard for small
logs as it is for large logs, and much harder than writing some regular
expressions. And that was the point of my initial post: there were several
discussions about how to parse the logs but hardly anything about what to do
with the results.

I remember the
http://airsnarf.shmoo.com/pipermail/loganalysis/2005-December/002906.html
thread - there were a couple of interesting messages, mostly about the
inability to correlate logs from various sources. I couldn't see though how
that was a log parsing problem - and in fact hardly anybody complained about
being unable to parse the logs. Instead, the most common issue was the
(in)ability to extract useful data from these logs.


Regards,

Adrian Grigorof
Altair Technologies
www.altairtech.ca
www.eventid.net


----- Original Message ----- 
From: "Anton Chuvakin" <anton () chuvakin org>
To: "Adrian Grigorof" <adi () grigorof com>;
<firewall-wizards () honor icsalabs com>
Sent: Monday, February 06, 2006 17:05
Subject: Re: [fw-wiz] parsing logs ultra-fast inline


All,

While I am preparing to enter this discussion in full force :-), I
figured I'd shoot a quick one on this:

meaning. Take Tina's VPN example - how many types of log entries you would
expect from a VPN concentrator? From my experience, not more than 20 but
let's assume there are 50. Give a sample from each entry to a Perl

He-he, no :-) I just looked at the old documentation bundle of Cisco
VPN 3000 messages and it's nowhere near the above. How about 2049
unique messages documented by Cisco?

Parsing IS often a challenge, e.g. see this and the discussion that
ensued:
http://airsnarf.shmoo.com/pipermail/loganalysis/2005-December/002906.html

Syslog is where it becomes just plain extreme (50,000 message types
anybody?), as Marcus pointed out, but there are some other fun areas
where it is tough.

Best,
--
Anton Chuvakin, Ph.D., GCIA, GCIH, GCFA
     http://www.chuvakin.org
http://www.securitywarrior.com



