WebApp Sec mailing list archives
Re: Hit Throttling - Content Theft Prevention
From: "Kurt Seifried" <bt () seifried org>
Date: Wed, 19 Oct 2005 04:11:56 -0600
If their is anything worth harvesting off the site that is publicly available and one blocks the IP address Google cache is always an option. Hidden links in white text: What will happen if a legitimate spider such as Google (again) comes across this link? shall they be blocked also. Is that what you want to do?
Google is easy to deal with since they are well behaved (in my experience with a site that has over 100,000 pages google is the best to index timely and sanely without bringing my servers to their knees). google cache can be disabled in the document with a directive. Google uses a well known user agent that can be white listed (but wait.. wouldn't this cause attackers to use the same user agent....) and they also come (for my site anyways) from a few well defined class C's. Whitelisting legitimate crawlers isn't to hard (user agent string, network blocks, reverse DNS, etc.).
-Eoin
-Kurt
Current thread:
- Hit Throttling - Content Theft Prevention Nik Cubrilovic (Oct 18)
- Re: Hit Throttling - Content Theft Prevention Kurt Seifried (Oct 18)
- Re: Hit Throttling - Content Theft Prevention Nik Cubrilovic (Oct 19)
- Re: Hit Throttling - Content Theft Prevention Peter Conrad (Oct 19)
- Re: Hit Throttling - Content Theft Prevention Eoin Keary (Oct 19)
- Re: Hit Throttling - Content Theft Prevention Kurt Seifried (Oct 19)
- Re: Hit Throttling - Content Theft Prevention Steve Shah (Oct 19)
- Re: Hit Throttling - Content Theft Prevention Nik Cubrilovic (Oct 19)
- Re: Hit Throttling - Content Theft Prevention Kurt Seifried (Oct 18)
- Message not available
- Re: Hit Throttling - Content Theft Prevention focus (Oct 19)
- Re: Hit Throttling - Content Theft Prevention Nik Cubrilovic (Oct 19)