Vulnerability Development mailing list archives

Re: combinations of 4


From: Valdis.Kletnieks () vt edu
Date: Tue, 09 Apr 2002 00:10:57 -0400

On Mon, 08 Apr 2002 23:08:02 EDT, Michael Greenberg said:

1) "do not contain null" - that's *different* than "A..Z".  There's a
*lot* of instructions that contain non-printable non-null characters.
Most of them, in fact.  For a 32-bit instruction, there are 2^32 (or
4,294,967,296, of which 4,228,250,625 (or 255^4) do *not* contain nulls.
Only 66,716,671 (or about 1.5%) of all possible 32-bit instructions *DO*
contain a null.

While your other comments are insightful (and incredibly interesting), 
I believe KF intended to simply insert all of these generated sequences 
into a call to __asm__() or a similar function and then 'sort the rest 
out in gdb', that is, see what calls were valid.  While your method 
will get you better results, isn't a kludge, and will probably work 
while his will not, that's not quite what he was asking.

Well.. Hmm.. 4 billion or so possibilities, at 4 bytes a pop - that's
16G for the .o file - and the .c/.s will be bigger (all those "__asm__()"
chew it up - plus any dorking around you need to get gcc, xlc, /bin/as
or whatever to *create* a 16G .o file (hint - most Power chips are 32-bit ;)

I hope you have a *big* /tmp and lots of time... ;)

But you already *know* that of the 4G entries, only 66M or so are
"interesting" in that they contain a null - so turning it around and
making a list of "everything that DOES have a null" and saying "If
it's not in this list, it doesn't have a null" is a *lot* faster....

What *might* be interesting is generating *just those 66M* bit
patterns, and getting gdb to disassemble them, so you have a list of
66M instructions to avoid...  and that's almost plausible to do.

for (a=0;a<256;a++)
  for (b=0;b<256;b++)
    for (c=0;c<256;c++)
      print "\0$a$b$c $a\0$b$c $a$b\0$c $a$b$c\0";

Yes, this generates some duplicates.  Deal with it - /bin/uniq ;)

My analysis of instruction formats took me all of 15 minutes with 2
IBM AIX manuals - and hopefully produced a lot more insight into the
problem than simply brute-forcing all 4 billion possibilities For
instance, all the 'X' format instructions have primary opcode 31 or 63
- and the ones with 63 tend be the floating-point instructions while
31 tends to be integer and memory references. (Hmm..  I smell a
5-input AND gate here ;)

4 billion instructions - or 11 instruction formats (of which 3 have only
one instruction of that format), and for each format it's pretty easy
to figure it out.  Work smarter, not harder. ;)

-- 
                                Valdis Kletnieks
                                Computer Systems Senior Engineer
                                Virginia Tech

Attachment: _bin
Description:


Current thread: