Full Disclosure mailing list archives
Re: Google's robots.txt handling
From: Jeffrey Walton <noloader () gmail com>
Date: Thu, 13 Dec 2012 17:37:52 -0500
On Thu, Dec 13, 2012 at 7:52 AM, Philip Whitehouse <philip () whiuk com> wrote:
I restate my email's second point. Google is indexing robots.txt because (from all the examples I can see) robots.txt doesn't contain a line to disallow indexing of robots.txt. It is possible that some web sites provide actual content in a file that happens to be called robots.txt (e.g. a website concerned with AI development). Could Google do better by removing the file? Sure. But as webmasters haven't told them not to, even though they have listed other files not to index, Google is doing exactly what they were asked.
Webmasters don't have to in the US - the Computer Fraud and Abuse Act (CFAA) means Google (et al) must operate within the authority granted by the webmasters. If the webmasters decide they don't want their site crawled and Google (et al) crawls it anyway, then Google has exceeded its authority and broken US Federal law. Just ask Weev.

This system needs a submission-based whitelist.

Jeff
_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/
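For readers following the quoted point about robots.txt not disallowing itself, here is a minimal sketch of how a rule parser treats that case. It uses Python's standard urllib.robotparser, not anything Google runs. The host example.com, the agent name ExampleBot, the helper allowed() and the /private/ rule are all made up for illustration. It only shows how this one parser evaluates the rules and says nothing about how any real crawler decides what to index.

```python
from urllib.robotparser import RobotFileParser

BASE = "http://example.com"  # hypothetical site, for illustration only


def allowed(robots_lines, url, agent="ExampleBot"):
    """Return True if `agent` may fetch `url` under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse in-memory lines, no network access
    return parser.can_fetch(agent, url)


# A typical file: it hides a directory but says nothing about robots.txt itself.
without_self = ["User-agent: *", "Disallow: /private/"]

# The same file with an explicit rule covering robots.txt.
with_self = without_self + ["Disallow: /robots.txt"]

print(allowed(without_self, BASE + "/robots.txt"))       # True
print(allowed(with_self, BASE + "/robots.txt"))          # False
print(allowed(with_self, BASE + "/private/page.html"))   # False
```

With no rule covering it, /robots.txt is treated like any other path that was never disallowed, which is the behaviour Philip describes.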