WebApp Sec mailing list archives

Re: Hit Throttling - Content Theft Prevention


From: Steve Shah <sshah () risingedge org>
Date: Wed, 19 Oct 2005 08:13:18 -0700

On Wed, Oct 19, 2005 at 04:11:56AM -0600, Kurt Seifried wrote:
> use the same user agent....) and they also come (for my site anyways) from
> a few well defined class C's. Whitelisting legitimate crawlers isn't too
> hard (user agent string, network blocks, reverse DNS, etc.).

There are several well-known, large search engines out there that
you'll want to whitelist (MSN, Google, Yahoo, Amazon's, AskJeeves, etc.).
The challenge, however, is that their servers move *all the time*. Heck,
the servers don't even have to move for their netblocks to change. An
IP whitelist can be dangerous if you aren't vigilant about tracking
where the search engines currently live.
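
One way to sidestep the moving-netblock problem is the
forward-confirmed reverse DNS check Kurt mentions above. A rough
sketch in Python follows; the bot names and domain suffixes in it are
only illustrative (check each engine's own documentation for the
authoritative values), and the function name is mine:

    import socket

    # Illustrative mapping from a crawler's user-agent token to the
    # reverse-DNS suffixes its hosts are expected to resolve into.
    CRAWLER_DOMAINS = {
        "googlebot": (".googlebot.com", ".google.com"),
        "msnbot":    (".search.msn.com",),
    }

    def is_legit_crawler(ip, user_agent):
        # Forward-confirmed reverse DNS: the PTR record must land in
        # the crawler's domain, and that hostname must resolve back
        # to the original IP.
        ua = user_agent.lower()
        for bot, suffixes in CRAWLER_DOMAINS.items():
            if bot in ua:
                try:
                    host, _, _ = socket.gethostbyaddr(ip)
                except socket.herror:
                    return False
                if not host.endswith(suffixes):
                    return False
                try:
                    return ip in socket.gethostbyname_ex(host)[2]
                except socket.gaierror:
                    return False
        # The UA doesn't claim to be a known crawler at all.
        return False

The nice part is that you never have to track IP ranges yourself: the
DNS round trip does the tracking for you.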

Assuming that you do want to whitelist search engines (not a good
idea, IMO; a previous poster already pointed out the paradox), you
may want to do it with a warning system instead of a blocking system.
If a crawler-like user agent comes from an IP address that is not on
your whitelist, have the system send you an email notification on a
regular basis. Take a look at the situation and then decide whether
you want to blacklist the IP originating the requests.
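
A minimal sketch of that warn-don't-block approach, again in Python.
Everything in it -- the netblock prefixes, the mail host, and the
addresses -- is a placeholder you'd replace with your own:

    import smtplib
    from email.message import EmailMessage

    WHITELISTED_PREFIXES = ("66.249.", "207.46.")   # placeholder blocks
    ADMIN_ADDR = "webmaster@example.org"            # placeholder address

    suspects = set()   # crawler-looking UAs seen outside the whitelist

    def check_request(ip, user_agent):
        # Warn rather than block: just remember the IP for review.
        looks_like_bot = any(w in user_agent.lower()
                             for w in ("bot", "crawl", "spider"))
        if looks_like_bot and not ip.startswith(WHITELISTED_PREFIXES):
            suspects.add(ip)

    def send_daily_report():
        # Run from cron; mails the day's suspects so a human decides
        # whether any of them deserve a blacklist entry.
        if not suspects:
            return
        msg = EmailMessage()
        msg["Subject"] = "Unrecognized crawler IPs seen today"
        msg["From"] = ADMIN_ADDR
        msg["To"] = ADMIN_ADDR
        msg.set_content("\n".join(sorted(suspects)))
        with smtplib.SMTP("localhost") as smtp:
            smtp.send_message(msg)
        suspects.clear()

Nothing ever gets blocked automatically; the worst a false positive
costs you is a line in an email.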

Of course, if that person is coming through AOL's shared proxies, you
may not want to do that: blacklisting one of those IPs cuts off a lot
of legitimate users along with the offender.

-Steve

-- 
Steve Shah
sshah () RisingEdge org 
