Firewall Wizards mailing list archives
Re: parsing logs ultra-fast inline
From: "Adrian Grigorof" <adi () grigorof com>
Date: Mon, 6 Feb 2006 23:08:59 -0500
To clarify this issue, I did not mean the total number of message types that you can find in the documentation. I meant the message types that you find in a typical log. While I was not thinking about a Cisco VPN concentrator in particular when I posted my initial message, I've just run a script against a log from such a device and there were 103 message types. I would say that's closer to 50 than to 2049. And Cisco seems to be quite verbose when it comes to logging. But I remember trying to get some information from a Nortel Contivity VPN a few years ago - in debug mode there were maybe 10 or 15 message types. Which one would you prefer? Another example is the Cisco PIX firewall - from all the logs that we gathered from our customers, and there were quite a few, we found a little over 160 unique message types. If you check the documentation, you'll probably find that in theory there could be thousands as well.

I am not a super-programmer but I did a little test. I took a Cisco VPN log entry and measured how long it took me to write a regex for it. So for this:

2005-04-22 10:10:43 Local0.Notice 192.168.83.130 4721869 04/22/2005 10:10:43.280 SEV=4 AUTH/23 RPT=1560 192.168.176.104 User [192.168.176.104] Group [192.168.176.104] disconnected: duration: 11:36:49

it took me 2 minutes to write this:

(\d{4})-(\d{2})-(\d{2}) (\d\d:\d\d:\d\d)\t(.*?)\t(.*?)\t(.*?) (.*?) (.*?) SEV=(.*?) (.*?) RPT=(.*?) (.*?) User \[(.*?)\] Group \[(.*?)\] (.*?): duration: (.*)

A regex that would capture the information applicable to an AUTH/23 message type. And I can say that it was more a typing skills issue rather than one of regex knowledge.

Now, 103 unique message types for a Cisco VPN x 2 min = 206 min or approx. 3.5 hours. But let's make it 35 hours... it is still just a few days of work for a medium-level programmer to generate the regexes for the most common messages from a Cisco VPN. You want to do all 2049 of them?
Do the math - 2049 x 2 min = 4098 min, or about 68 hours, which is 8.5 days (8 hours per day). But let's be a good boss and give the programmer a whole month! Better still, you can outsource it to Eastern Europe or India. The good part is that you only need to do it once.

So let's face it, for practical purposes the number of message types is not an issue, not if you are willing to hardcode your program for a particular device. What is an issue is the size of the logs - and yes, if you have to analyze large log files, each with a large number of message types, it may require all sorts of tricks to do the job, including the type of parallel processing that Marcus was mentioning. But what choice do you have when this is the case? For performance reasons, you have to hardcode the parsing of each message type. More than that, you have to hardcode them so the most common ones are checked first.

That being said, interpreting the results of parsing is as hard for small logs as it is for large logs, and much harder than writing some regular expressions. And that was the point of my initial post: there were several discussions about how to parse the logs but hardly anything about what to do with the results. I remember the http://airsnarf.shmoo.com/pipermail/loganalysis/2005-December/002906.html thread - there were a couple of interesting messages, mostly about the inability to correlate logs from various sources. I couldn't see, though, how that was a log parsing problem - and in fact hardly anybody complained about being unable to parse the logs. Instead, the most common issue was the (in)ability to extract useful data from these logs.
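As a rough illustration of the two ideas above (one hand-written regex per message type, with the most common types checked first), here is a minimal Python sketch. The AUTH/23 regex is the one from the message, unchanged. Everything else is an assumption made for the example: the helper names, the MESSAGE_TYPE pattern used to count types, and the guess that the first three separators in the sample line were tabs that the archive rendering turned into spaces (the regex expects \t there).

```python
import re
from collections import Counter

# Regex from the message above, unchanged, for the AUTH/23 message type.
AUTH_23 = re.compile(
    r"(\d{4})-(\d{2})-(\d{2}) (\d\d:\d\d:\d\d)\t(.*?)\t(.*?)\t(.*?) (.*?) (.*?) "
    r"SEV=(.*?) (.*?) RPT=(.*?) (.*?) User \[(.*?)\] Group \[(.*?)\] (.*?): duration: (.*)"
)

# Assumption: the message type is the token between SEV=<n> and RPT=<n>,
# as in the sample line. Used only to count the types seen in a log.
MESSAGE_TYPE = re.compile(r"SEV=\d+ (\S+) RPT=\d+")

# Hand-written regexes keyed by message type (only one is available here).
PATTERNS = {"AUTH/23": AUTH_23}


def count_message_types(lines):
    """Count how often each message type occurs in a sample of log lines."""
    counts = Counter()
    for line in lines:
        m = MESSAGE_TYPE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


def order_patterns(counts, patterns=PATTERNS):
    """Return (type, regex) pairs with the most frequent type first."""
    return sorted(patterns.items(), key=lambda item: counts[item[0]], reverse=True)


def parse(line, ordered):
    """Try each regex in order; return (type, captured groups) or None."""
    for msg_type, regex in ordered:
        m = regex.match(line)
        if m:
            return msg_type, m.groups()
    return None


# Sample entry from the message; the first three separators are written
# as tabs here (assumed), because the regex expects \t in those positions.
SAMPLE = (
    "2005-04-22 10:10:43\tLocal0.Notice\t192.168.83.130\t4721869 04/22/2005 "
    "10:10:43.280 SEV=4 AUTH/23 RPT=1560 192.168.176.104 "
    "User [192.168.176.104] Group [192.168.176.104] disconnected: duration: 11:36:49"
)

if __name__ == "__main__":
    ordered = order_patterns(count_message_types([SAMPLE]))
    print(parse(SAMPLE, ordered))
```

With a real log, count_message_types would be run once over a representative sample and the resulting order reused for every later line. Looking up the regex directly by message type would be another option; the ordered list is shown only because it mirrors the "most common ones first" approach described above.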
Regards,
Adrian Grigorof
Altair Technologies
www.altairtech.ca
www.eventid.net

----- Original Message -----
From: "Anton Chuvakin" <anton () chuvakin org>
To: "Adrian Grigorof" <adi () grigorof com>; <firewall-wizards () honor icsalabs com>
Sent: Monday, February 06, 2006 17:05
Subject: Re: [fw-wiz] parsing logs ultra-fast inline

All,

While I am preparing to enter this discussion in full force :-), I figured I'd shoot a quick one on this:
> meaning. Take Tina's VPN example - how many types of log entries you would expect from a VPN concentrator? From my experience, not more than 20 but let's assume there are 50. Give a sample from each entry to a Perl
He-he, no :-) I just looked at the old documentation bundle of Cisco VPN 3000 messages and it's nowhere near the above. How about 2049 unique messages documented by Cisco?

Parsing IS often a challenge, e.g. see this and the discussion that ensued:
http://airsnarf.shmoo.com/pipermail/loganalysis/2005-December/002906.html

Syslog is where it becomes just plain extreme (50,000 message types anybody?), as Marcus pointed out, but there are some other fun areas where it is tough.

Best,
--
Anton Chuvakin, Ph.D., GCIA, GCIH, GCFA
http://www.chuvakin.org
http://www.securitywarrior.com
_______________________________________________
firewall-wizards mailing list
firewall-wizards () honor icsalabs com
http://honor.icsalabs.com/mailman/listinfo/firewall-wizards