Snort mailing list archives

Re: Snort + HTSQL dashboard application


From: Dan Ferris <dferris () prometheusresearch com>
Date: Fri, 06 May 2011 14:30:28 -0600

Right now the backend for my demo is Postgresql.  Queries are being auto 
limited to 10,000 rows because the database is in a KVM VM and I don't 
want a query to crash anything.  The DB is around 500,000 rows as of 
yesterday afternoon.  I could probably have a few million rows, except I 
have Snort mostly firewalled off from the world and threshold.conf is 
set to only allow a few events per hour.

HTSQL supports Postgres, Mysql, Sqlite, Oracle, and MSSQL.  Oracle and 
MSSQL require commercial licenses.

Martin, you are right about what you said concerning views and 
simplifying that SQL query.  I was trying to find a complex example 
intentionally.  We feel that HTSQL is easier than regular SQL, but that 
isn't to say there is absolutely no learning curve or that it replaces 
SQL in all circumstances and applications.

HTSQL is designed to provide a few niche services:

1.  It's easier for less technical users than regular SQL.  Currently, 
Our main users are medical research scientists.  They wanted a way to 
access their raw data without needing to learn SQL or having to use a 
Thick database client like psql.  All they need is a browser and they 
can query the DB.

2.  You can use the queries in interesting ways.  Since it's a URL you 
can bookmark it, email it, IM it, etc.

3.  HTSQL makes creating things like dashboards pretty easy.  If you 
noticed, the Snort dashboard doesn't use any code except for HTML and 
Javascirpt.  I imagine writing the same thing in Python would take a few 
weeks and be a few hundred lines of code at least.

Thank you for the feedback,

Dan

On 5/6/2011 11:53 AM, beenph wrote:
Im almost on the same pace as Martin here as i would add, how does it
scale when you have ~10k event (just wondering).

And does it support multiple DBMS backend?
Would it be wrong to say that the query it generated would mainly only
run on mysql?

-elz


On Fri, May 6, 2011 at 11:26 AM, Martin Holste<mcholste () gmail com>  wrote:
Here's an example of what I mean, this is ths HTSQL query used to generate
the Events / Protocol / Hour graph:

http://snort.htsql.org
/((event?date(timestamp)=today())^{hour(timestamp)}){*1 :as
Day,count(^.tcphdr) :as TCP,count(^.udphdr) :as UDP, count(^.icmphdr) :as
ICMP}

This is the SQL equivalent as generated by the HTSQL engine:

  SELECT "event"."hour",
        COALESCE("tcphdr"."count", 0),
        COALESCE("udphdr"."count", 0),
        COALESCE("icmphdr"."count", 0)
  FROM (SELECT CAST(EXTRACT(HOUR FROM "event"."timestamp") AS INTEGER) AS
"hour"
       FROM "public"."event" AS "event"
       WHERE (CAST(EXTRACT(HOUR FROM "event"."timestamp") AS INTEGER) IS NOT
NULL)
             AND (CAST("event"."timestamp" AS DATE) = CURRENT_DATE)
       GROUP BY 1) AS "event"
      LEFT OUTER JOIN (SELECT COUNT(TRUE) AS "count",
                              CAST(EXTRACT(HOUR FROM "event"."timestamp") AS
INTEGER) AS "hour"
                       FROM "public"."event" AS "event"
                            INNER JOIN "public"."tcphdr" AS "tcphdr"
                                       ON (("event"."sid" = "tcphdr"."sid")
AND ("event"."cid" = "tcphdr"."cid"))
                       WHERE (CAST(EXTRACT(HOUR FROM "event"."timestamp") AS
INTEGER) IS NOT NULL)
                             AND (CAST("event"."timestamp" AS DATE) =
CURRENT_DATE)
                       GROUP BY 2) AS "tcphdr"
                      ON ("event"."hour" = "tcphdr"."hour")
      LEFT OUTER JOIN (SELECT COUNT(TRUE) AS "count",
                              CAST(EXTRACT(HOUR FROM "event"."timestamp") AS
INTEGER) AS "hour"
                       FROM "public"."event" AS "event"
                            INNER JOIN "public"."udphdr" AS "udphdr"
                                       ON (("event"."sid" = "udphdr"."sid")
AND ("event"."cid" = "udphdr"."cid"))
                       WHERE (CAST(EXTRACT(HOUR FROM "event"."timestamp") AS
INTEGER) IS NOT NULL)
                             AND (CAST("event"."timestamp" AS DATE) =
CURRENT_DATE)
                       GROUP BY 2) AS "udphdr"
                      ON ("event"."hour" = "udphdr"."hour")
      LEFT OUTER JOIN (SELECT COUNT(TRUE) AS "count",
                              CAST(EXTRACT(HOUR FROM "event"."timestamp") AS
INTEGER) AS "hour"
                       FROM "public"."event" AS "event"
                            INNER JOIN "public"."icmphdr" AS "icmphdr"
                                       ON (("event"."sid" = "icmphdr"."sid")
AND ("event"."cid" = "icmphdr"."cid"))
                       WHERE (CAST(EXTRACT(HOUR FROM "event"."timestamp") AS
INTEGER) IS NOT NULL)
                             AND (CAST("event"."timestamp" AS DATE) =
CURRENT_DATE)
                       GROUP BY 2) AS "icmphdr"
                      ON ("event"."hour" = "icmphdr"."hour")
  ORDER BY 1 ASC


Fair enough, but most of the time the ugly parts of SQL are abstracted
away in a view.  For instance, I have a view called v_events which
looks very ugly when defining the view:
CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY
DEFINER VIEW `v_events` AS select `event`.`sid` AS `sid`,`event`.`cid`
AS `cid`,concat(`event`.`sid`,_latin1'-',`event`.`cid`) AS
`id`,`event`.`timestamp` AS
`timestamp`,unix_timestamp(`event`.`timestamp`) AS
`timestamp_int`,inet_ntoa(`iphdr`.`ip_src`) AS
`srcip`,`iphdr`.`ip_src` AS `srcip_int`,inet_ntoa(`iphdr`.`ip_dst`) AS
`dstip`,`iphdr`.`ip_dst` AS `dstip_int`,`iphdr`.`ip_proto` AS
`ip_proto`,if((`tcphdr`.`tcp_sport` is not
null),`tcphdr`.`tcp_sport`,if((`udphdr`.`udp_sport` is not
null),`udphdr`.`udp_sport`,if((`icmphdr`.`icmp_type` is not
null),`icmphdr`.`icmp_type`,_latin1''))) AS
`srcport`,if((`tcphdr`.`tcp_dport` is not
null),`tcphdr`.`tcp_dport`,if((`udphdr`.`udp_dport` is not
null),`udphdr`.`udp_dport`,if((`icmphdr`.`icmp_code` is not
null),`icmphdr`.`icmp_type`,_latin1''))) AS
`dstport`,`signature`.`sig_name` AS
`sig_name`,`signature`.`sig_priority` AS
`sig_priority`,`signature`.`sig_sid` AS
`sig_sid`,`sig_class`.`sig_class_name` AS `sig_class_name` from
((((((`event` join `iphdr` on(((`event`.`sid` = `iphdr`.`sid`) and
(`event`.`cid` = `iphdr`.`cid`)))) join `signature`
on((`event`.`signature` = `signature`.`sig_id`))) join `sig_class`
on((`signature`.`sig_class_id` = `sig_class`.`sig_class_id`))) left
join `tcphdr` on(((`event`.`sid` = `tcphdr`.`sid`) and (`event`.`cid`
= `tcphdr`.`cid`)))) left join `udphdr` on(((`event`.`sid` =
`udphdr`.`sid`) and (`event`.`cid` = `udphdr`.`cid`)))) left join
`icmphdr` on(((`event`.`sid` = `icmphdr`.`sid`) and (`event`.`cid` =
`icmphdr`.`cid`))))

but provides nice, clean columns the way HTSQL does:
+----------------+---------------------+------+-----+---------+-------+
| Field          | Type                | Null | Key | Default | Extra |
+----------------+---------------------+------+-----+---------+-------+
| sid            | int(10) unsigned    | NO   |     | NULL    |       |
| cid            | int(10) unsigned    | NO   |     | NULL    |       |
| id             | varbinary(23)       | NO   |     |         |       |
| timestamp      | datetime            | NO   |     | NULL    |       |
| timestamp_int  | bigint(10)          | NO   |     | 0       |       |
| srcip          | varbinary(31)       | NO   |     |         |       |
| srcip_int      | int(10) unsigned    | NO   |     | NULL    |       |
| dstip          | varbinary(31)       | NO   |     |         |       |
| dstip_int      | int(10) unsigned    | NO   |     | NULL    |       |
| ip_proto       | tinyint(3) unsigned | NO   |     | NULL    |       |
| srcport        | varbinary(6)        | YES  |     | NULL    |       |
| dstport        | varbinary(6)        | YES  |     | NULL    |       |
| sig_name       | varchar(255)        | NO   |     | NULL    |       |
| sig_priority   | int(10) unsigned    | YES  |     | NULL    |       |
| sig_sid        | int(10) unsigned    | YES  |     | NULL    |       |
| sig_class_name | varchar(60)         | NO   |     | NULL    |       |
+----------------+---------------------+------+-----+---------+-------+

So the actual SQL query is much cleaner than you've shown and would be
more like:
SELECT COUNT(*) AS num_events, HOUR(timestamp), ip_proto FROM v_events
WHERE DATE(timestamp)=DATE(NOW()) GROUP BY 2,3 ORDER BY 2,3;

I think the main thing I'm wondering is this: your query looks just as
indecipherable to most users as the SQL, so this is clearly something
that web apps are going to be dynamically building behind the scenes,
right?  If so, then I'm having trouble believing that it is easier to
programmatically generate HTSQL versus SQL, because SQL was structured
for exactly such a purpose.  If not and you are expecting end users to
understand HTSQL, then I think it is unfair to compare it against raw
SQL table queries, because most of the time, users are presented views
of the data which are much cleaner.

My skepticism aside, this is a niche that needs to be filled--there
are not nearly enough decent web app OLAP/dimensional analysis tool
sets out there, so I applaud and encourage projects like these.

------------------------------------------------------------------------------
WhatsUp Gold - Download Free Network Management Software
The most intuitive, comprehensive, and cost-effective network
management toolset available today.  Delivers lowest initial
acquisition cost and overall TCO of any competing solution.
http://p.sf.net/sfu/whatsupgold-sd
_______________________________________________
Snort-users mailing list
Snort-users () lists sourceforge net
Go to this URL to change user options or unsubscribe:
https://lists.sourceforge.net/lists/listinfo/snort-users
Snort-users list archive:
http://www.geocrawler.com/redir-sf.php3?list=snort-users


------------------------------------------------------------------------------
WhatsUp Gold - Download Free Network Management Software
The most intuitive, comprehensive, and cost-effective network 
management toolset available today.  Delivers lowest initial 
acquisition cost and overall TCO of any competing solution.
http://p.sf.net/sfu/whatsupgold-sd
_______________________________________________
Snort-users mailing list
Snort-users () lists sourceforge net
Go to this URL to change user options or unsubscribe:
https://lists.sourceforge.net/lists/listinfo/snort-users
Snort-users list archive:
http://www.geocrawler.com/redir-sf.php3?list=snort-users


Current thread: