Interesting People mailing list archives

IP: Steganography 101


From: David Farber <dave () farber net>
Date: Fri, 02 Nov 2001 14:41:51 -0500


Date: Fri, 2 Nov 2001 14:31:17 -0500
To: farber () cis upenn edu
From: Peter Wayner <pcw () flyzone com>


if any IP goes, let me have a brief review djf

Dave--

I can't tell you what Niels Provos is going to talk about, but I have followed his research and that of others while completing the second edition of _Disappearing Cryptography_, a book focusing on steganographic topics. So here's a quick summary of the area.

Most of the recent media interest in steganography deals with hiding data in the noise of digitized images or sound files. These are very compelling stories, especially when newspaper writers can hint that the terrorists hid their secret plans in pornography. There's no real evidence that those involved in the Sept. 11 attacks used the technology. In fact, the coverage now includes backpedaling from at least one person who was said to have found it (http://www.heise.de/tp/english/inhalt/te/11027/1.html). In any case, there's no reason why the attackers couldn't have used it. The technology exists and it is not hard to use.

The basic technique sets the least significant bit of each pixel component to encode a message. So if one pixel has a red component of 140 (10001100 in binary), a single bit of 1 can be hidden by adding a tiny bit more red and increasing the strength to 141 (10001101 in binary). A bit of 0 can be sent by leaving that red component unchanged.
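
The 140/141 example can be sketched in a few lines of Python. This is only an illustration; the function names embed_bit and extract_bit are made up for this note and do not come from any particular tool.

```python
def embed_bit(component, bit):
    """Return an 8-bit color component with its least significant bit set to `bit`."""
    return (component & 0b11111110) | (bit & 1)


def extract_bit(component):
    """Read the hidden bit back out of a component."""
    return component & 1


red = 140  # 10001100 in binary

hidden_one = embed_bit(red, 1)
hidden_zero = embed_bit(red, 0)

print(hidden_one, format(hidden_one, "08b"))    # 141 10001101
print(hidden_zero, format(hidden_zero, "08b"))  # 140 10001100
print(extract_bit(hidden_one), extract_bit(hidden_zero))  # 1 0
```

The component never changes by more than 1 out of 255, which is why the change is so hard to see.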

A large file is hidden in a picture by breaking it up into bits and spreading them around an image. One basic technique uses a key to drive a cryptographically secure random number generator which chooses the pixels to hold the message. Only the key holder can pick out the bits in the right order.
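
A rough sketch of key-driven pixel selection follows. It is an assumption-laden illustration, not how any specific program works: the pixel order is derived by sorting indices on an HMAC-SHA256 tag of each index, which stands in for the keyed generator described above, and the names pixel_order, hide, and reveal are hypothetical. The cover is a flat list of 8-bit values rather than a real image file, and the receiver is assumed to know the message length.

```python
import hashlib
import hmac


def pixel_order(key, n_pixels):
    """Keyed pseudorandom ordering of pixel indices (illustrative stand-in)."""
    def tag(i):
        return hmac.new(key, i.to_bytes(8, "big"), hashlib.sha256).digest()
    return sorted(range(n_pixels), key=tag)


def to_bits(data):
    """Split bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def from_bits(bits):
    """Pack a list of bits (most significant first) back into bytes."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for b in bits[i:i + 8]:
            value = (value << 1) | b
        out.append(value)
    return bytes(out)


def hide(pixels, message, key):
    """Write the message bits into the LSBs of key-selected pixels."""
    bits = to_bits(message)
    if len(bits) > len(pixels):
        raise ValueError("message too long for this cover")
    stego = list(pixels)
    for position, bit in zip(pixel_order(key, len(pixels)), bits):
        stego[position] = (stego[position] & 0xFE) | bit
    return stego


def reveal(pixels, n_bytes, key):
    """Read n_bytes back from the LSBs, visiting pixels in the keyed order."""
    order = pixel_order(key, len(pixels))[: n_bytes * 8]
    return from_bits([pixels[p] & 1 for p in order])


cover = [(i * 37) % 256 for i in range(1024)]  # made-up pixel values
stego = hide(cover, b"meet at noon", b"shared secret")
print(reveal(stego, 12, b"shared secret"))  # b'meet at noon'
```

Someone without the key sees the same pixels but has no way to know which ones carry the message or in what order to read them.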

The amount of space available is truly staggering. 1/8th of the image file can be replaced with hidden information without much distortion. 1/4 is certainly possible. I might point out that there are many legitimate uses around. Some medical x-ray folks talk about using steganography to add doctors' comments to a digitized image. (They use a better algorithm, which is beyond the scope of this note.) The image still works with legacy machines, saving clinics money, but newer software can extract the doctor's comments. The RIAA/MPAA are also big fans of using steganography to defend the rights of copyright holders.
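
As a rough check on the 1/8 and 1/4 figures, here is the arithmetic for an uncompressed image with 8 bits per color component. The 640x480 RGB image and the helper name capacity_bytes are assumptions chosen for illustration; compressed formats such as JPEG behave differently.

```python
def capacity_bytes(width, height, channels=3, bits_per_component=1):
    """Bytes of hidden data that fit when each 8-bit component gives up some low bits."""
    total_components = width * height * channels
    return total_components * bits_per_component // 8


raw_size = 640 * 480 * 3  # bytes of pixel data in the assumed image

print(raw_size)                                          # 921600
print(capacity_bytes(640, 480))                          # 115200, i.e. 1/8 of the data
print(capacity_bytes(640, 480, bits_per_component=2))    # 230400, i.e. 1/4 of the data
```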

How can we detect steganography? There are two different types of detection:

*) Full detection-- You find the key and extract the message.
*) Statistical detection-- You check the image's statistical profile and see something fishy.

The first technique is not easy to accomplish. If the cryptographic random number generator is any good, you will need plenty of hard work and time. Most of the programs aren't very sophisticated, however, and I believe that Provos was able to recover a file from an image created by ABC news.

The second technique is the one that generates all of the discussion about eBay, porn, and other random images floating around the net. The problem is that many basic steganographic tools leave statistical artifacts in their messages. The people searching for hidden information download lots of images and then check the statistical profile. A bad set of statistics raises an alarm bell, but it's far from certain. Statistics are just averages and they don't always work out with mathematical precision. Without actually breaking the system completely, recovering the key, and reading the message, it's almost impossible to be certain that something's hidden.

What do these statistical artifacts look like? In most cases, the files get _more_ random looking as data is hidden inside them. This is because digital cameras and scanners aren't very precise. The least significant bit is often highly correlated with the more significant bits. Think of a very bright spot on the image, perhaps caused by a glint of sunlight. Such spots peg the pixels at the maximum value, usually 255 (11111111 in binary). There aren't that many 254s, which skews the number of 1's and 0's in the least significant bit plane.

The data inserted into the file, however, is usually more random, especially if it's been compressed before insertion. The numbers of 0's and 1's are close to equal and they usually have no correlation with the higher order bits. The basic steganographic tools don't pay attention to the structure of the cover image. (Better ones do, but they're beyond the range of the basic detectors...) So a very random least significant bit plane sticks out like a sore thumb.

One simple way to detect randomness like this is a chi-squared test. Checking the distribution of 0's and 1's in the different bit planes will identify some glaring steganography. Better tests compare and contrast the various bit planes of pixels with their neighbors.
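
A minimal sketch of the simplest version of such a test follows: it compares the counts of 0's and 1's in the least significant bit plane against an even split. Everything in the demo is an assumption made for illustration, including the synthetic cover (70% saturated pixels), the fixed random seed, and the function name lsb_chi_squared. Real detectors use more refined statistics than this.

```python
import random


def lsb_chi_squared(values):
    """Chi-squared statistic of the LSB counts against a 50/50 split (1 degree of freedom)."""
    n = len(values)
    ones = sum(v & 1 for v in values)
    zeros = n - ones
    expected = n / 2
    return ((zeros - expected) ** 2 + (ones - expected) ** 2) / expected


rng = random.Random(0)  # fixed seed so the demo is repeatable

# Hypothetical cover: about 70% saturated highlights (255), the rest arbitrary values.
cover = [255 if rng.random() < 0.7 else rng.randrange(256) for _ in range(10_000)]

# Overwrite every LSB with random message bits, as a naive tool would.
stego = [(v & 0xFE) | rng.getrandbits(1) for v in cover]

CRITICAL_5_PERCENT = 3.841  # chi-squared critical value for 1 degree of freedom

for name, data in (("cover", cover), ("stego", stego)):
    stat = lsb_chi_squared(data)
    verdict = "skewed" if stat > CRITICAL_5_PERCENT else "close to an even split"
    print(f"{name}: chi2 = {stat:.1f} -> {verdict}")
```

A large statistic means the bit plane is far from an even split, as in the skewed cover. A small one means the plane looks random, which is the warning sign described above.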

This test has one big problem. The amount of randomness in the final image depends upon the size of the inserted message and the amount of randomness in it. If the secret message is not very random or it's not very large, the chi-squared test doesn't notice it. Size is very important. Many of the tests use hidden messages that are 5-12% of the cover image. Shorter messages don't raise problems. The easiest way to thwart the detectors is to be brief. (I should point out that there's plenty of bandwidth and plenty of cover material. 1% of a 100k image is still more than 1000 characters.)
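
The effect of message size can be worked out with a little arithmetic. The sketch below assumes a cover whose least significant bits are 85% ones (a made-up figure), a message of evenly balanced random bits, and an embedding rate defined as the fraction of least significant bits that get overwritten. The function names are hypothetical.

```python
def ones_fraction_after_embedding(cover_ones, embed_rate):
    """Expected share of 1 bits once a fraction of LSBs is replaced by random bits."""
    return (1 - embed_rate) * cover_ones + embed_rate * 0.5


def chi_squared_for_fraction(ones_fraction, n):
    """Chi-squared statistic against a 50/50 split for n bits with this share of ones."""
    return n * (2 * ones_fraction - 1) ** 2


n_bits = 100_000
for rate in (0.0, 0.01, 0.05, 0.12, 1.0):
    share = ones_fraction_after_embedding(0.85, rate)
    stat = chi_squared_for_fraction(share, n_bits)
    print(f"rate {rate:4.2f}: ones = {share:.4f}, chi2 = {stat:.0f}")

# 1% of a 100k image, taking 100k to mean 100 * 1024 bytes
print(100 * 1024 // 100)  # 1024
```

Under these assumptions a 1% message barely moves the share of ones away from the cover's own value, so the bit plane still looks like an ordinary skewed image. Only at high rates does it drift toward the even split that gives the message away.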

The test is also prone to false positives. If you set the threshold very low, images of highly textured scenes start to look pretty random and thus suspicious. Unless you find the algorithm that created the hidden data, you can't prove anything except that there's a lot of randomness.


I'm sorry this is just a quick write-up of the field. More information can be found by either reading some of the papers or reading _Disappearing Cryptography_. A BibTeX bibliography follows.

Feel free to write with specific questions.

-Peter






@inproceedings { ettinger98steganalysis,
    author = "Mark Ettinger",
    title = "Steganalysis and Game Equilibria",
    booktitle = "Information Hiding",
    pages = "319-328",
    year = "1998",
    url = "citeseer.nj.nec.com/71354.html"
}

@inproceedings { Westfeld2001,
    author = "Andreas Westfeld",
    title = "High Capacity Despite Better Steganalysis: F5- A Steganographic Algorithm",
    pages = "301-315",
    booktitle = "Fourth Information Hiding Workshop",
    year = "2001"
}

@misc{ rui-steganalysis,
    author = "J. Fridrich and Rui Du and Meng Long",
    title = "Steganalysis Of Lsb Encoding In Color Images",
    url = "citeseer.nj.nec.com/403441.html"
}



@inproceedings { johnson98steganalysis,
    author = "Neil F. Johnson and Sushil Jajodia",
    title = "Steganalysis of Images Created Using Current Steganography Software",
    booktitle = "Information Hiding, Second International Workshop",
    pages = "273-289",
    year = "1998",
    url = "citeseer.nj.nec.com/johnson98steganalysis.html"
}



@misc{ johnson98steganalysis2,
    author = "Neil F. Johnson and Sushil Jajodia",
    title = "Steganalysis: The Investigation of Hidden Information",
    text = "N. Johnson and S. Jajodia, Steganalysis: The Investigation of Hidden Information,
      Proc. of the 1998 IEEE Information Technology Conference, Syracuse, New
      York, September 1-3, 1998.",
    year = "1998"
}


@techreport{ provos01probabilistic,
    author = "Niels Provos",
    title = "Probabilistic Methods for Improving Information Hiding",
    month = "January",
    institution = "University of Michigan",
    number = "01-1",
    text = "Probabilistic Methods for Improving Information Hiding. CITI
      Technical Report 01-1, January 2001. Submitted for publication.",
    year = "2001",
    url = "citeseer.nj.nec.com/provos01probabilistic.html"
}

@misc{ provos-defending,
    author = "Niels Provos",
    title = "Defending Against Statistical Steganalysis",
    url = "citeseer.nj.nec.com/provos01defending.html"
}


@book{ wayner96,
    author = "Peter Wayner",
    title = "Disappearing Cryptography",
    year = "1996",
    publisher = "AP Professional",
    address = "Chestnut Hill, MA"
}

Second edition due next year.


For archives see:
http://www.interesting-people.org/archives/interesting-people/

