Dailydave mailing list archives

Re: data mining, computer security


From: Vitaly Osipov <vitaly.osipov () gmail com>
Date: Fri, 7 May 2010 09:48:15 +1000

This is a cool-sounding idea - data mining hostile data. I don't think there
has been much research in this area - all well-known algorithms assume a
mountain of inert data that they try to extract knowledge from. At the same
time, most of these algorithms have well-researched problems, and it is
usually possible to come up with a data set that will confuse any given
algorithm.

It would be really interesting to see some research on how a small
contamination can be introduced into various data sets to render one or
several data mining algorithms ineffective, and then on how to counteract
this contamination by using multiple algorithms, all kinds of aggregation,
bagging and so on. Pity I am not a postdoc...
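
A minimal sketch of the contamination idea above, assuming a toy
one-dimensional anomaly detector. All names and numbers here are
hypothetical and not taken from any real system. A handful of planted
extreme values inflates the mean and standard deviation enough that a
mean-based detector stops flagging a suspicious value, while a majority
vote across three simple detectors still catches it. Note that this is
plain voting over different algorithms, not bagging proper.

```python
import statistics


def mean_std_detector(train, k=3.0):
    """Flag x when it lies more than k standard deviations from the mean."""
    mu = statistics.mean(train)
    sigma = statistics.pstdev(train)
    return lambda x: abs(x - mu) > k * sigma


def median_mad_detector(train, k=3.0):
    """Flag x when it lies more than k scaled MADs from the median."""
    med = statistics.median(train)
    mad = statistics.median([abs(v - med) for v in train])
    return lambda x: abs(x - med) > k * 1.4826 * mad


def iqr_detector(train, whisker=1.5):
    """Flag x when it falls outside the usual box-plot fences."""
    q1, _, q3 = statistics.quantiles(train, n=4)
    iqr = q3 - q1
    return lambda x: x < q1 - whisker * iqr or x > q3 + whisker * iqr


def majority_vote(detectors):
    """Flag x when more than half of the detectors flag it."""
    return lambda x: sum(d(x) for d in detectors) > len(detectors) / 2


# Hypothetical data: 105 ordinary values between 40 and 60.
clean = [float(v) for v in range(40, 61)] * 5
# The attacker plants five extreme points (under 5% of the data).
poisoned = clean + [500.0] * 5
suspicious = 90.0

naive_clean = mean_std_detector(clean)
naive_poisoned = mean_std_detector(poisoned)
ensemble = majority_vote([
    mean_std_detector(poisoned),
    median_mad_detector(poisoned),
    iqr_detector(poisoned),
])

# prints: True False True
print(naive_clean(suspicious), naive_poisoned(suspicious), ensemble(suspicious))
```

The ensemble only helps here because two of its three members rely on
order statistics, which a few extreme points barely move. An attacker who
knows the ensemble's makeup could presumably pick a different kind of
contamination.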

On Thu, May 6, 2010 at 3:50 PM, Josh Saxe <joshsaxe () yahoo com> wrote:

Hi --

Here's a thought I thought might get an interesting response on this list.

I've been thinking: cybersecurity, narrowly construed, is focused on
protecting IT resources from being compromised or disabled.  Perhaps search
and classification problems are also an arena for cybersecurity, the goals
being to obfuscate patterns in a dataset, or to find them.

Anyone who wants to avoid classification by a facial recognition algorithm
should know how that algorithm works, and use that knowledge to evade
detection (is there a shape of fake mustache that optimally throws off a
given face classifier? =).  On the other hand, the designers of the facial
recognition algorithms need to learn from the 'evaders' and respond with
more sophisticated algorithms.  This isn't dissimilar from the cat and mouse
game played between exploit developers and application security whitehats.
And as the size of the world's databases and the 'instrumentation' of
human activity grow, I wonder if this search / evasion problem won't become
more and more important.

There are definitely a lot of arenas where this is important.  Credit card
fraudsters can strengthen their positions by understanding the anomaly
detection algorithms used by the credit card companies and modifying their
purchasing behaviors to evade detection.  Terrorists building training
camps need to understand image analysis algorithms that government agencies
run on satellite photos.  Of course, spammers have already been playing this
game with text classification algorithms, 'link-spammers' have played this
game with web-graph analysis algorithms, etc., so this isn't a new idea, but
it is one that I think is likely to become more important.
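
As an illustration of the spammer case, here is a rough sketch of a
"good word" padding attack against a tiny naive Bayes text classifier.
The training messages, function names, and the add-one smoothing choice
are all assumptions made for this example and do not describe any real
spam filter. The point is only that a sender who knows how the classifier
scores words can append innocuous vocabulary until the score flips.

```python
import math
from collections import Counter


def train_naive_bayes(docs):
    """docs is a list of (label, text) pairs with labels 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for label, text in docs:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, doc_counts


def spam_log_odds(model, text):
    """Log-odds that text is spam; positive means 'classified as spam'."""
    word_counts, doc_counts = model
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    totals = {label: sum(c.values()) for label, c in word_counts.items()}
    score = math.log(doc_counts["spam"] / doc_counts["ham"])
    for word in text.lower().split():
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        # Add-one (Laplace) smoothing so unseen class/word pairs are not zero.
        p_spam = (word_counts["spam"][word] + 1) / (totals["spam"] + len(vocab))
        p_ham = (word_counts["ham"][word] + 1) / (totals["ham"] + len(vocab))
        score += math.log(p_spam / p_ham)
    return score


# Hypothetical training corpus, far too small for real use.
training = [
    ("spam", "cheap pills buy now"),
    ("spam", "buy cheap watches now"),
    ("spam", "win money now"),
    ("ham", "meeting agenda for monday"),
    ("ham", "project report attached"),
    ("ham", "lunch on monday"),
]
model = train_naive_bayes(training)

original = "buy cheap pills now"
# The evader appends words the classifier associates with legitimate mail.
padded = original + " monday meeting project report agenda monday"

# prints: True False
print(spam_log_odds(model, original) > 0, spam_log_odds(model, padded) > 0)
```

The defender's obvious next move (for example, ignoring words that appear
only in a trailing block, or retraining on padded spam) is the other half
of the cat and mouse game described above.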

Josh


_______________________________________________
Dailydave mailing list
Dailydave () lists immunitysec com
http://lists.immunitysec.com/mailman/listinfo/dailydave


