Information Security News mailing list archives

Cracking an algorithm bit by bit conclusion.

From: InfoSec News <isn () c4i org>
Date: Thu, 6 Feb 2003 00:18:32 -0600 (CST)
+------------------------------------------------------------------+
|  Linux Security: Tips, Tricks, and Hackery                       |
|  Published by Onsight, Inc.                                      |
|                                                                  |
|  05-February-2003                                                |
|  http://www.hackinglinuxexposed.com/articles/20030205.html       |
+------------------------------------------------------------------+

This issue sponsored by Onsight, Inc, your source for open-source
solutions.

Onsight offers on-site training on Basic Perl programming, Advanced
Perl programming, CGI programming using Perl, Tcl/Tk, XML and
JavaScript. All courses are hands-on, designed by real-world
consultants and fully customizable. Because all classes are on-site,
our overhead is low and our prices consistently beat those of our
competitors. Every Onsight instructor is a seasoned consultant able
to provide back-end web programming, network security, system
administration and other support services.

For more information, visit http://www.onsight.com

--------------------------------------------------------------------

Cracking an algorithm bit by bit conclusion.
By Brian Hatch

Summary: We complete our reverse engineering of a terribly-lame
encryption algorithm.

Last week we stripped off the outer layer of our home-grown crypto
algorithm through some intuition. The data was made pure ASCII using
a modified uuencode system.

(For those of you keeping score, David Nash was the winner of the
contest, having written a perl script earily similar to the one at
the end of this newsletter. One copy of HLEv2 is coming his way.)

Again, here are the sample strings you had to work with:

  # String 1
  !!@!1P!=P!?P!=P!?`!>`!<0!?0!;0!?0!=@!B`!,`!>@!A0!,P!>@!B@!A`

  # String 2
  !T`!M0!TP!V0!X0!Y0!C@!P0!Y0!W0!UP!Y@!H@

  # String 3
  !8@!>@!E`!EP!H`!GP!I0!GP!60!A@!I`!J@!L@!M@!7P!A0!N0!L@!L@!MP!J@!J@

  # String 4
  !^`!P0![0![0!IP!]0!H@!Z`!Y0!^0!I@!``![0!]0!]@!^@!`P!K0!`0!_0!_P!"`!P`

  # String 5
  !S@!O0!W`!SP!BP!V0!X@!X@!XP!D`!G@!D@!YP!V0![0!Z@!EP!X0![`!F@!W0!X0!\`!\@!K0


Using the following perl script, you can reformat and uudecode the
output to it's original binary state:


  prompt$ cat decrypt.pl
  #!/usr/bin/perl
  use strict;

  my $encrypted =
     '!!@!1P!=P!?P!=P!?`!>`!<0!?0!;0!?0!=@!B`!,`!>@!A0!,P!>@!B@!A`';

  # Add the backticks again
  $encrypted =~ s/(...)/\1``/g;

  print unpack( "u", $encrypted );

  prompt$ perl decrypt.pl
  Gww|xq}m}v_0z_3z__b


(Note that, since some letters will be outside the printable range,
I'll show them as underscores for the rest of this article.)

Using this first string, we have some printable characters as our
output, but other encrypted strings yielded very ugly results,
outside the printable ASCII range. So instead, let's see the actual
numerical values of those characters:


  prompt$ cat decrypt.pl
  #/usr/bin/perl
  use strict;

  my $encrypted =
     '!!@!1P!=P!?P!=P!?`!>`!<0!?0!;0!?0!=@!B`!,`!>@!A0!,P!>@!B@!A`';

  # Add the backticks again
  $encrypted =~ s/(...)/\1``/g;

  my $uudecoded = unpack( "u", $encrypted);

  # Loop through the letters one by one
  for my $char ( split //, $uudecoded ) {
          print ord($char), " ";
  }
  print "\n";

  prompt$ perl decrypt.pl
  6 71 119 127 119 124 120 113 125 109 125 118 136 48 122 133 51 122 138 132


Hmmn. Now most of those numbers seem to come from a very small range,
109 -> 138, and the difference (29) is pretty close to the size of
the English alphabet (26). What if they are all shifted over from the
actual letters? Let's see what happens if we ASCII-ify our encrypted
string, and try to bring the letters back into a proper range:


  prompt$ cat decrypt.pl
  #!/usr/bin/perl
  ...
  # Loop through the letters one by one
  for ( my $offset=0; $offset < 50; $offset++ ) {
          for my $char ( split //, $uudecoded ) {
                  print chr( ord($char) - $offset ), " ";
          }
          print "\n";
  }

  prompt$ perl decrypt.pl
  _ G w _ w | x q } m } v _ 0 z _ 3 z _ _
  _ F v ~ v { w p | l | u _ / y _ 2 y _ _
  _ E u } u z v o { k { t _ . x _ 1 x _ _
  _ D t | t y u n z j z s _ - w _ 0 w _ _
  _ C s { s x t m y i y r _ , v _ / v _ _
  _ B r z r w s l x h x q _ + u _ . u _
  _ A q y q v r k w g w p _ * t _ _ t _ ~
  ...


Well, this certainly looks more like normal letters here, but it's
definitely not the answer. Note how the beginning begins with a
capital, which could make sense.

Then I tried to figure out if it was a substitution cipher from this
point. The garbage characters toward the end, with two printable
characters in the middle may be spaces. If that were the case, then
the two letter word might be something like "an", "if", or "is".
Unfortunately, my attempts to come up with a direct substitution
cipher didn't get anywhere.

When comparing the output of this decryption stage against the other
strings I had to work with, I noticed that subsequent characters
tended to get larger on the whole. Perhaps there was a bit more
manipulation going on here. So I tweaked my loop again, adding a
$count variable which would be subtracted from each ASCII value, with
$count growing by one each time:

  prompt$ cat decrypt.pl
  #!/usr/bin/perl
  ...
  # Loop through the letters one by one
  for ( my $offset=0; $offset < 50; $offset++ ) {
          my $count=0;
          for my $char ( split //, $uudecoded ) {
                  print chr( ord($char) - $offset - $count), " ";
                  $count++;
          }
          print "\n";
  }

  prompt$ perl decrypt.pl
  _ F u | s w r j u d s k | # l v # i x q
  _ E t { r v q i t c r j { " k u " h w p
  _ D s z q u p h s b q i z ! j t ! g v o
  _ C r y p t o g r a p h y   i s   f u n
  _ B q x o s n f q ` o g x  h r  e t m
  _ A p w n r m e p _ n f w  g q  d s l
  _ @ o v m q l d o ^ m e v  f p  c r k
  ...

Pay dirt! Look at the fourth line, "_Cryptography is fun" -- seems
like we've figured out the encoding. This occurred when $offset was
set to 3. So it would seem that the algorithm to encrypt was
something like this:

  * Take the ASCII value of the character to be 'encrypted'.
  * Add 3 ($offset)
  * Add the position of this character ($count)
  * Uuencode
  * Strip the last two characters of the uuencoded value (``)

I ran the same program, this time with the next encrypted string as
input, and got the following:

  # Using string 2 this time
  prompt$ perl decrypt.pl
  Ð ´ Ñ Ö Ý à _ º Ý Ô Í Û _
  Ï ³ Ð Õ Ü ß _ ¹ Ü Ó Ì Ú _
  Î ² Ï Ô Û Þ _ ¸ Û Ò Ë Ù _
  Í ± Î Ó Ú Ý _ · Ú Ñ Ê Ø _
  Ì ° Í Ò Ù Ü _ ¶ Ù Ð É × _
  Ë ¯ Ì Ñ Ø Û _ µ Ø Ï È Ö _
  Ê ® Ë Ð × Ú _ ´ × Î Ç Õ _
  É  Ê Ï Ö Ù _ ³ Ö Í Æ Ô _
  È ¬ É Î Õ Ø _ ² Õ Ì Å Ó _
  Ç « È Í Ô × _ ± Ô Ë Ä Ò _
  ...

Drat. I was hoping to see text on the 4th line, but was just getting
garbage. Well, not exactly garbage, there were still unprintable
characters in the same place as if they'd be word breaks, while the
rest was filled with bytes with the high bit set. I compared the
ASCII values of this set against the first set. This second string
was offset much more than 3 characters. I tweaked the script to go
from an $offset of 0 to 256 to see if anywhere there was a readable
string. I also prefixed the $offset value so I could keep better
track of things:

  # Using string 2
  prompt$ perl decrypt.pl
  0: Ð ´ Ñ Ö Ý à _ º Ý Ô Í Û _
  1: Ï ³ Ð Õ Ü ß _ ¹ Ü Ó Ì Ú _
  2: Î ² Ï Ô Û Þ _ ¸ Û Ò Ë Ù _
  3: Í ± Î Ó Ú Ý _ · Ú Ñ Ê Ø _
  4: Ì ° Í Ò Ù Ü _ ¶ Ù Ð É × _
  ...
  101: k O l q x { # U x o h v 1
  102: j N k p w z " T w n g u 0
  103: i M j o v y ! S v m f t /
  104: h L i n u x   R u l e s .
  105: g K h m t w  Q t k d r -
  106: f J g l s v  P s j c q ,
  ...

Aha! Look at line #104 - "hLinux Rules.". We're definitely onto
something.

So, it seems our algorithm was correct, but the offset kept changing.
Running the rest of the encrypted strings also yields readable output
with different values for $offset. Although, for some reason, one of
them was not quite readable:

  # Using string 4 this time
  prompt$ perl decrypt.pl
  ...
  124: | D o n ' t   e a t   _ e l l o _ _ _ n o _ .
  ...

Hmmn - those underscores (originally unprintable characters)
indicated I didn't quite have my algorithm tweaked correctly. Looking
at the raw ASCII values, I realized I'd not taken into account the
fact that when you take an original ASCII value and add or subtract
increasing numbers, eventually you'll go beyond the bounds of an 8
bit number. So, tweaking a bit further I added the following and
re-ran it:


  prompt$ cat decrypt.pl
  #!/usr/bin/perl
  ...
          for my $char ( split //, $uudecoded ) {
                  my $ascii = ord($char) - $offset - $count;
                  $ascii += 256 while $ascii < 0;
                  print chr( $ascii ), " ";
                  $count++;
          }
          print "\n";
  }

  prompt$ perl decrypt.pl
  ...
  124: | D o n ' t   e a t   y e l l o w   s n o w .
  ...


Excellent. Now we can decrypt the strings correctly. Unfortunately,
there's still one nagging problem: we have no way to find the right
$offset value, and need to try many of them and either find the right
one using our eyes, or write something to check and see when the
results match our expectations.

I tried seeing if the offset was related to the size of the message,
any of the other related fields in the database, or the message
itself.[1] Then it struck me how blind I was - the offset must be
stored in the mystery first character.

Converting the initial character of each string to it's uudecoded
ascii value you get:

  String 1: !!@`` == 6
  String 2: !T``` == 208
  String 3: !8@`` == 98
  String 4: !^``` == 248
  String 5: !S@`` == 206

The first thing that jumps at you is that each value is even. The
next thing that jumps out is that if you divide them by two, you get
the offset that we need. The first message had offset 3, the 4th had
offset 124, etc. Problem solved in full.

So what problems were built into this custom crypto?

  * There was a random offset, but this offset was contained within
    the encrypted strings barely obfuscated.
   
  * The randomization (adding the character number to the ascii value
    before uuencoding) was extremely week, and pretty easy to guess.
   
  * The uuencoding was used to make the encrypted version pure
    printable text, but made the output regular and led to some
    correct assumptions about the plaintext and the method of
    creating the ciphertext.
   
  * The random initializer, in order to only occupy one byte, needed
    to be between 0 and 127, which is only 7 bits of 'security'.[2]

I whipped up a set of perl subroutines to encrypt/decrypt text that
was consistent with the algorithm that was in use, and presented them
to the client who was relying on this home-grown cryptography to
protect their sensitive data. After the panic and four letter words
subsided, they quickly petitioned to create a committee to asses the
feasibility of having new cryptographic security systems investigated
to replace their bad algorithm.[3]

Later they showed me their encryption/decryption C code. It was many
pages long, not counting the pages it took to re-implement a one-byte
uuencode/decode algorithm. The amount of mathematics and manipulation
that went into the algorithm were far more advanced than the simple
solution I had come across, involving several independent[4]
variables taken together as per-message keys. Their algorithm did
require the knowledge of a secret key, only available on their
server. However the way in which these values were (mis)used, ended
up cancelling out the need for this key, and everything reduced to
the much simpler algorithm which I'd reverse engineered.

This, of course, leads us to another tenant of cryptography:

  * Complexity does not guarantee security

So, thus ends our tale of one bad crypto system. Unfortunately, there
are many many more out there, waiting to be broken.[5]

Here are the encryption/decryption routines for this bad crypto.
Unfortunately, I can't provide the original C sources that were used
for comparison. Feel free to use this as a learning/investigative
tool, but for goodness sake don't use them for anything that should
be secret.



 #!/usr/bin/perl
 # badcrypto.pl
 # copyright 2003, Brian Hatch, released under the GPL.
 #
 # Reverse-engineered algorithms.  Don't use these - they
 # are excellent examples of cryptography done poorly.
 #
 # Usage:  require "VeryBadCrypto.pl";
 #         $encrypted = VeryBadCrypto::encrypt( $string );
 #         $decrypted = VeryBadCrypto::decrypt( $encrypted );
 
 package VeryBadCrypto;
 
 sub encrypt {
        my($message) = join "", @_;
        my $encrypted;

        # Pick random initializer.  Must be less than 128
        my $count = int rand(127);

        # Break up message into individual characters
        my @characters = (chr($count), split //, $message);

        for my $char ( @characters ) {

                my $ascii = $count + ord($char);

                # convert to a 0-255 number
                $ascii = $ascii % 256;

                my $uuencoded = pack "u", chr($ascii);

                # Strip the trailing `` chars we have from uuencoding
                $uuencoded = substr($uuencoded,0,3);

                $encrypted .= $uuencoded;

                $count++;

                # Avoid potential problems when integers wrap around
                $count %= 256;
        }
        $encrypted
 }

 sub decrypt {
        my($encrypted) = join "", @_;
        my $decrypted;

        # Put back trailing "``" characters
        $encrypted =~ s/(...)/\1``/g;

        my $uudecoded = unpack "u", $encrypted;
        my @characters = split //, $uudecoded;

        my $initializer = shift @characters;

        # Find original initializer value
        $initializer = ord($initializer) / 2;

        # add one to compensate since the first one was used
        # to encode the initializer itself.
        $initializer++;
        
        # Ok, we've got our initializer, time to decrypt the rest.
        for my $char ( @characters ) {
                my $ascii = ord($char);

                $ascii -= $initializer;
                $ascii += 256 while $ascii < 0;

                $decrypted .= chr($ascii);

                $initializer++;
        }
        $decrypted;
 }


NOTES:

[1] Which was pretty foolish, since the message wouldn't be available
to the decrypt routine, but I wasn't thinking clearly.

[2] If you could even call this algorithm secure.

[3] Ahh, now if that isn't a definitive step, I don't know what is.

[4] But, by their error, these other values were not independent,
which is why they ended up being irrelevant.

[5] And if you live in the US, you'd better hope they're not being
used to protect copyrighted works, or your knowledge of rot13 could
land you in jail for DMCA violations.

                            -------------                            
Brian Hatch is Chief Hacker at Onsight, Inc and author of Hacking
Linux Exposed and Building Linux VPNs. He developed his first
(horribly insecure) cryptographic algorithm when he was six. It was
no better than rot13, but took up a lot more space. Brian can be
reached at brian () hackinglinuxexposed com.

--------------------------------------------------------------------
This newsletter is distributed by Onsight, Inc.

The list is managed with MailMan (http://www.list.org). You can
subscribe, unsubscribe, or change your password by visiting
http://lists.onsight.com/ or by sending email to
linux_security-request () lists onsight com.

Archives of this and previous newsletters are available at
http://www.hackinglinuxexposed.com/articles/

--------------------------------------------------------------------

Copyright 2003, Brian Hatch.



-
ISN is currently hosted by Attrition.org

To unsubscribe email majordomo () attrition org with 'unsubscribe isn'
in the BODY of the mail.
Current thread:

Cracking an algorithm bit by bit conclusion. InfoSec News (Feb 05)