nanog mailing list archives

Re: STILL Paging Google...


From: Michael.Dillon () btradianz com
Date: Wed, 16 Nov 2005 14:39:04 +0000


matthew () elvey com (Matthew Elvey) [Wed 16 Nov 2005, 01:56 CET]:
> Still no word from google, or indication that there's anything wrong
> with the robots.txt.  Google's estimated hit count is going slightly up,
> instead of way down.

Way back in the early '90s someone came up with an
elegant solution to this problem. When building a site
in a folder named /httproot, all dynamic pages, i.e.
scripts, were placed in a folder named /httproot/cgi-bin.
Then somebody invented robots.txt to allow people to
tell spiders to leave the cgi-bin folder alone.
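
For illustration only, here is a minimal sketch of that convention, using
Python's standard urllib.robotparser module. The two-line robots.txt, the
agent name "ExampleBot" and the helper is_allowed are assumptions made up
for this example; they are not taken from the site under discussion.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site that keeps every script under /cgi-bin/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def is_allowed(path, agent="ExampleBot"):
    """Return True if a crawler honouring ROBOTS_TXT may fetch `path`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    print(is_allowed("/index.html"))         # static page
    print(is_allowed("/cgi-bin/search.pl"))  # dynamic page under cgi-bin
```

With the dynamic pages confined to one folder, a single Disallow line is
enough: the static page is reported as fetchable and the script is not.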

Sites which follow the ancient paradigm do not run
into these kinds of problems. Some people would say that
asking the world to re-engineer the robots.txt protocol
instead of building sites compliant with the protocol
is in violation of the robustness principle as expressed
by Jon Postel in RFC 793 section 2.10 and reiterated in 
section 4.5 of RFC 3117.

When something doesn't work, the correct operational
response is to fix it.

--Michael Dillon

