Penetration Testing mailing list archives
RE: Spidering
From: <merigoth () gmail com>
Date: Wed, 23 Jan 2008 10:02:03 -1000
Also, I'd recommend the book "Webbots, Spiders, and Screen Scrapers" by Michael Schrenk. It has some good ideas about spidering and crawlers.

-----Original Message-----
From: listbounce () securityfocus com [mailto:listbounce () securityfocus com] On Behalf Of Tonnerre Lombard
Sent: Sunday, January 20, 2008 9:38 PM
To: me me
Cc: pen-test () securityfocus com
Subject: Re: Spidering

Salut,

On Thu, 17 Jan 2008 17:58:41 +0000 "me me" <securityoneoone () googlemail com> wrote:
> Whilst I don't expect it to get everything (JavaScript etc. is going to
> take manual intervention, as are a number of other possible
> technologies), I have never really found a tool that I consider to be
> the de facto spidering tool from this perspective. One of the biggest
> problems is that a lot of the spiders seem to choke on really big sites,
> or go into infinite loops, etc.

Yes, Microsoft Passport is very evil there, as an example. My trick to solve the Microsoft Passport problem is to check every link to see whether it contains a URL-encoded version of the current URL and, if it does, ignore it. That appears to avoid dead loops. I haven't seen other dead loops as far as I remember, but then again I haven't indexed very much yet.

Tonnerre

--
SyGroup GmbH
Tonnerre Lombard

Solutions Systematiques
Tel:+41 61 333 80 33    Güterstrasse 86
Fax:+41 61 383 14 67    4053 Basel
Web:www.sygroup.ch      tonnerre.lombard () sygroup ch
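
For anyone wanting to try the link filter described above, here is a minimal sketch in Python. It is only an illustration, not Tonnerre's actual code: the function names `embeds_current_url` and `filter_links` and the example URLs are made up, and it assumes the login link carries the current URL in fully percent-encoded form (the way `urllib.parse.quote` with `safe=""` produces it). Sites that encode the return URL differently, or encode it twice, would need extra handling.

```python
from urllib.parse import quote


def embeds_current_url(link: str, current_url: str) -> bool:
    """Return True if `link` contains a percent-encoded copy of `current_url`."""
    # safe="" also encodes "/" and ":", e.g. http%3A%2F%2Fexample.com%2Fpage
    encoded = quote(current_url, safe="")
    # Compare in lower case, since %3a and %3A are equivalent escapes.
    # (This also ignores case in the rest of the URL, which is an
    # approximation that may over-match on case-sensitive paths.)
    return encoded.lower() in link.lower()


def filter_links(links, current_url):
    """Drop links that point back at the current page via an encoded parameter."""
    return [link for link in links if not embeds_current_url(link, current_url)]


if __name__ == "__main__":
    # Hypothetical page and links, for illustration only.
    page = "http://example.com/inbox?page=2"
    found = [
        "http://example.com/about",
        "https://login.example.net/signin?return=" + quote(page, safe=""),
    ]
    for link in filter_links(found, page):
        print(link)
```

The spider would call `filter_links` on the links extracted from each page before queueing them, so login redirects that merely wrap the current page never enter the queue.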
Current thread:
- Spidering me me (Jan 18)
- RE: Spidering Thor (Hammer of God) (Jan 22)
- Re: Spidering Tonnerre Lombard (Jan 22)
- RE: Spidering merigoth (Jan 23)
- <Possible follow-ups>
- RE: Spidering thomas chamberlain (Jan 23)
- Re: Re: Spidering metabolic_76 (Jan 23)