nanog mailing list archives
Re: www.gigablast.com
From: Stephane Bortzmeyer <bortzmeyer () nic fr>
Date: Mon, 17 Jul 2006 11:07:32 +0200
On Wed, Jul 12, 2006 at 06:24:08PM -0400, Jim Popovitch <jimpop () yahoo com> wrote a message of 32 lines which said:
The strangeness is that some of their crawling is looking for URLs with multiple exclamation points; those URLs never existed. This may be indicative of a character translation problem on my system or theirs.
From my experience (and I talked with people - or at least intelligent bots - at Gigablast), their HTML parser is seriously broken and quite often generates nonexistent URLs. For instance, <a href="http://www.example.fr/Cafe%20au%20lait"> will make their crawler ask for "/Cafe". I reported the problem months ago but got nothing except a standard "Thanks for telling us" reply.
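
To illustrate the behaviour I would expect, here is a minimal sketch in Python using only the standard library. It extracts the href from the example above and shows that the path to request should stay percent-encoded. The names HrefCollector, request_path and naive_path are made up for this illustration, and naive_path is only a guess at the kind of mistake (decoding first, then cutting at whitespace) that could produce "/Cafe"; I do not know how Gigablast's parser actually works.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, unquote


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def request_path(href):
    """Path to send in the HTTP request line: left percent-encoded."""
    return urlsplit(href).path


def naive_path(href):
    """Hypothetical buggy variant: decode %20 first, then cut at whitespace."""
    return unquote(urlsplit(href).path).split()[0]


collector = HrefCollector()
collector.feed('<a href="http://www.example.fr/Cafe%20au%20lait">menu</a>')
href = collector.hrefs[0]

print(request_path(href))  # /Cafe%20au%20lait
print(naive_path(href))    # /Cafe
```

The point of the sketch is only that "%20" is part of the URL and has to survive until the request is sent; a crawler that decodes it early and then splits on spaces will ask for resources that never existed.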