Penetration Testing mailing list archives

Re: Optimal wildcard search algorithm

From: Tim <tim-pentest () sentinelchicken org>
Date: Tue, 28 Nov 2006 09:02:37 -0500

You'd get around your storage problems by using a depth-first search, I 
believe.  But at the same time, I don't think it's the problem you think 
it is. 

Consider the following algorithm:

function recursiveInject(string S)
For each character C in [a-z0-9] DO
   inject 'SC'
   if match then store SC // i.e., a username exists that is only the 
value of SC
   inject 'SC%' (for SQL)
   if match then recursiveInject (SC)
Next C

Your storage usage is the number of total usernames times the average 
length of each username.  And, since you only search strings that 
indicate matches, you are spending at most (size of alphabet) * (number 
of  usernames) * (average length of usernames) on queries that won't go 
anywhere, which is significantly better than traditional brute-forcing, 
where you expect to waste (size of alphabet) ^ (max username length) - 
(number of usernames) queries.  I think if what you've described is the 
extent of your injection exploit's power (i.e., it only indicates 
whether a username matches a given string, which may or may not include 
wildcards), then you can't expect to get any better than this.



I guess you're right... since the number of usernames is probably not
incredibly large, the storage wouldn't be that bad in any real-world
system.

However, might other approaches be faster than a simple,
single-character depth or breadth first attack?  For instance, with a
relatively small number of users, narrowing the alphabet first may speed
up subsequent searches significantly.  Suppose we try the following:

*a*
*b*
...
*9*

and we determine that just a single letter can be eliminated from the
alphabet.  How many users must be in the system, with usernames of say,
8 characters in length, in order for this to be an overall benefit?  Of
course, mid-way through a search, one could check the character set on a
remaining subset of users with queries like:

jo*a*
jo*b*
...

I imagine there's some break point where the up-front cost of checking
the whole alphabet is beneficial if there's enough users AND there's at
least one letter that can be eliminated.  On the other hand, more users
means there's a greater likelyhood that all characters are used at least
once, which makes it pointless.

Any thoughts on how one would approach this question?  One reason I am
so concerned with optimality, is that I often start off by manually
brute forcing apps like this, and I'd like to minimize the guessing if
possible.

thanks for the feedback,
tim

------------------------------------------------------------------------
This List Sponsored by: Cenzic

Need to secure your web apps?
Cenzic Hailstorm finds vulnerabilities fast.
Click the link to buy it, try it or download Hailstorm for FREE.
http://www.cenzic.com/products_services/download_hailstorm.php?camp=701600000008bOW
------------------------------------------------------------------------

Current thread:

Re: Optimal wildcard search algorithm Tim (Dec 01)
- <Possible follow-ups>
- Re: Optimal wildcard search algorithm Dathan Bennett (Dec 01)
- Re: Optimal wildcard search algorithm Rogan Dawes (Dec 01)