Interesting People mailing list archives

IP: Steganography 101

From: David Farber <dave () farber net>
Date: Fri, 02 Nov 2001 14:41:51 -0500

Date: Fri, 2 Nov 2001 14:31:17 -0500
To: farber () cis upenn edu
From: Peter Wayner <pcw () flyzone com>
if any IP goes, let me have a brief review djf
Dave--
I can't tell you what Niels Provos is going to talk about, but I have followed his research and that of others while completing the second edition of _Disappearing Cryptography_, a book focusing on steganographic topics. So here's a quick summary of the area.
Most of the recent media interest in steganography deals with hiding data in the noise of digitized images or sound files. These are very compelling stories, especially when newspaper writers can hint that the terrorists hid their secret plans in pornography. There's no real evidence that those involved in the Sept. 11 attacks used the technology. In fact, the media coverage now includes backpedaling from at least one person who was said to have found it. (http://www.heise.de/tp/english/inhalt/te/11027/1.html) In any case, there's no reason why the attackers couldn't have used it. The technology exists and it is not hard to use.
The basic technique adjusts the least significant bit of each pixel to encode a message. So if one pixel has a red component of 140 (10001100 in binary), a single bit of 1 can be hidden by adding a tiny bit more red and increasing the strength to 141 (10001101 in binary). A message of 0 can be sent by leaving that red component unchanged.
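The 140-to-141 example can be written in a few lines. This is only a sketch of the general least-significant-bit idea, not the code of any particular tool; the function names `embed_bit` and `extract_bit` are made up for illustration.

```python
def embed_bit(value, bit):
    """Return an 8-bit channel value whose least significant bit equals `bit`."""
    # Clear the lowest bit, then set it to the message bit.
    return (value & 0xFE) | (bit & 1)


def extract_bit(value):
    """Read the hidden bit back out of a channel value."""
    return value & 1


red = 140  # 10001100 in binary
print(format(embed_bit(red, 1), "08b"))  # 10001101, i.e. 141
print(format(embed_bit(red, 0), "08b"))  # 10001100, unchanged
```

The value never moves by more than 1, which is why the change is hard to see.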
A large file is hidden in a picture by breaking it up into bits and spreading them around an image. One basic technique uses a key to drive a cryptographically secure random number generator which chooses the pixels to hold the message. Only the key holder can pick out the bits in the right order.
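A rough sketch of key-driven pixel selection follows. Everything in it is an assumption made for illustration: the generator is HMAC-SHA256 run in counter mode, the pixel order comes from a Fisher-Yates shuffle, and the image is a flat list of 8-bit values. Real tools differ, and the small modulo bias in the shuffle would need fixing in serious use.

```python
import hashlib
import hmac


def keyed_stream(key):
    """Yield 64-bit integers derived from the key (HMAC-SHA256, counter mode)."""
    counter = 0
    while True:
        block = hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        for i in range(0, len(block), 8):
            yield int.from_bytes(block[i:i + 8], "big")
        counter += 1


def pixel_order(key, n_pixels):
    """Return a key-dependent permutation of pixel positions."""
    stream = keyed_stream(key)
    order = list(range(n_pixels))
    for i in range(n_pixels - 1, 0, -1):
        j = next(stream) % (i + 1)  # slight modulo bias; fine for a sketch
        order[i], order[j] = order[j], order[i]
    return order


def embed(pixels, bits, key):
    """Hide `bits` in the least significant bits of key-selected pixels."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover")
    out = list(pixels)
    for pos, bit in zip(pixel_order(key, len(out)), bits):
        out[pos] = (out[pos] & ~1) | bit
    return out


def extract(pixels, n_bits, key):
    """Recover the first `n_bits` hidden bits; requires the same key."""
    order = pixel_order(key, len(pixels))
    return [pixels[pos] & 1 for pos in order[:n_bits]]
```

Calling `extract(embed(cover, bits, key), len(bits), key)` returns `bits`; with a different key the positions are visited in a different order and the output is scrambled.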
The amount of space available is truly staggering. 1/8th of the image file can be replaced with hidden information without much distortion. 1/4 is certainly possible. I might point out that there are many legitimate uses around. Some medical x-ray folks talk about using steganography to add doctors' comments to a digitized image. (They use a better algorithm which is beyond this note.) The image still works with legacy machines, saving clinics money, but newer software can extract the doctor's comments. The RIAA/MPAA are also big fans of using steganography to defend the rights of copyright holders.
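The one-eighth figure is consistent with replacing one low bit in each 8-bit sample, and one quarter with replacing two. That reading is an assumption here, as are the uncompressed 640x480 RGB image and the helper name `lsb_capacity_bytes`.

```python
def lsb_capacity_bytes(n_samples, bits_per_sample=1):
    """Payload size in bytes when the low bits of each 8-bit sample carry data."""
    if not 0 <= bits_per_sample <= 8:
        raise ValueError("bits_per_sample must be between 0 and 8")
    return n_samples * bits_per_sample // 8


samples = 640 * 480 * 3  # one byte each for red, green and blue
print(lsb_capacity_bytes(samples, 1))  # 115200 bytes, 1/8 of the pixel data
print(lsb_capacity_bytes(samples, 2))  # 230400 bytes, 1/4 of the pixel data
```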
How can we detect steganography? There are two different types of detection:

*) Full detection-- You find the key and extract the message.
*) Statistical detection-- You check the image's statistical profile and see something fishy.
The first technique is not easy to accomplish. If the cryptographic random number generator is any good, you will need plenty of hard work and time. Most of the programs aren't very sophisticated, however, and I believe that Provos was able to recover a file from an image created by ABC News.
The second technique is the one that generates all of the discussion about eBay, porn, and other random images floating around the net. The problem is that many basic steganographic tools leave statistical artifacts in their messages. The people searching for hidden information download lots of images and then check the statistical profile. A bad set of statistics raises an alarm bell, but it's far from certain. Statistics are just averages and they don't always work out with mathematical precision. Without actually breaking the system completely, recovering the key, and reading the message, it's almost impossible to be certain that something's hidden.
What do these statistical artifacts look like? In most cases, the files get _more_ random looking as data is hidden inside them. This is because digital cameras and scanners aren't very precise. The least significant bit is often highly correlated with the more significant bits. Think of a very bright spot on the image, perhaps caused by a glint of sunlight. These peg the pixels at the maximum value, usually 255 (11111111 in binary). There aren't that many 254s, which skews the balance of 1's and 0's in the least significant bit plane.
The data inserted into the file, however, is usually more random, especially if it's been compressed before inserting it. The numbers of 0's and 1's are close to equal and they usually have no correlation with the highest order bits. The basic steganographic tools don't pay attention to the structure of the cover image. (Better ones do, but they're beyond the range of the basic detectors...) So a very random least significant bit plane sticks out like a sore thumb.
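The effect is easy to reproduce on a toy cover. The sketch below assumes a saturated patch made of 9000 pixels at 255 and 1000 at 254, and overwrites the least significant bits with pseudo-random bits. Python's `random` module stands in for a compressed message here; it is not cryptographic, and the function names are made up.

```python
import random


def lsb_ones_fraction(pixels):
    """Fraction of pixels whose least significant bit is 1."""
    return sum(p & 1 for p in pixels) / len(pixels)


def embed_random_bits(pixels, fraction, seed=0):
    """Overwrite the LSB of a random `fraction` of the pixels with random bits."""
    rng = random.Random(seed)  # toy generator, not a cryptographic one
    out = list(pixels)
    count = int(len(out) * fraction)
    for pos in rng.sample(range(len(out)), count):
        out[pos] = (out[pos] & ~1) | rng.getrandbits(1)
    return out


cover = [255] * 9000 + [254] * 1000  # a glint of sunlight: mostly pegged at 255
stego = embed_random_bits(cover, 1.0)

print(lsb_ones_fraction(cover))  # 0.9, heavily skewed toward 1
print(lsb_ones_fraction(stego))  # close to 0.5 once every LSB is replaced
```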
One simple way to detect randomness like this is a chi-squared test. Checking the distribution of 0's and 1's in the different bit planes will identify some glaring steganography. Better tests compare and contrast the various bit planes of pixels with their neighbors.
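Here is a minimal sketch of the simplest version of that check: a chi-squared goodness-of-fit test of one bit plane against an even split of 0's and 1's. It is not the test used by any specific detector, and the function names are invented. With one degree of freedom the p-value can be computed from `math.erfc`, so no extra library is needed.

```python
import math


def bit_plane(pixels, k=0):
    """Extract bit k (0 = least significant) from every pixel value."""
    return [(p >> k) & 1 for p in pixels]


def chi_squared_balance(bits):
    """Chi-squared statistic and p-value for 'zeros and ones are equally likely'."""
    n = len(bits)
    if n == 0:
        raise ValueError("empty bit plane")
    ones = sum(bits)
    zeros = n - ones
    expected = n / 2
    stat = ((zeros - expected) ** 2 + (ones - expected) ** 2) / expected
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


skewed = [255] * 900 + [254] * 100    # typical of a saturated cover
balanced = [254, 255] * 500           # what random hidden data tends toward

print(chi_squared_balance(bit_plane(skewed)))    # large statistic, p near 0
print(chi_squared_balance(bit_plane(balanced)))  # (0.0, 1.0)
```

Following the reasoning above, it is the plane that fits the even split too well (small statistic, high p-value) that looks suspicious. Where to set the threshold is a judgment call.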
This test has one big problem. The amount of randomness in the final image depends upon the size of the inserted message and the amount of randomness in it. If the secret message is not very random or it's not very large, the chi-squared test doesn't notice it. Size is very important. Many of the tests use hidden messages that are 5-12% of the cover image. Shorter messages don't raise problems. The easiest way to thwart the detectors is to be brief. (I should point out that there's plenty of bandwidth and plenty of cover material. 1% of a 100k image is still more than 1000 characters.)
The test is also prone to false positives. If you set the threshold very low, images of highly textured scenes start to look pretty random and thus suspicious. Unless you find the algorithm that created the hidden data, you can't prove anything except that there's a lot of randomness.
I'm sorry this is just a quick write up of the field. More information can be found by either reading some of the papers or reading _Disappearing Cryptography_. A BIBTEX bibliography follows.
Feel free to write with specific questions.

-Peter






@inproceedings{ ettinger98steganalysis,
    author = "Mark Ettinger",
    title = "Steganalysis and Game Equilibria",
    booktitle = "Information Hiding",
    pages = "319-328",
    year = "1998",
    url = "citeseer.nj.nec.com/71354.html"
}

@inproceedings{ Westfeld2001,
    author = "Andreas Westfeld",
    title = "High Capacity Despite Better Steganalysis: F5 - A Steganographic Algorithm",
    pages = "301-315",
    booktitle = "Fourth Information Hiding Workshop",
    year = "2001"
}

@misc{ rui-steganalysis,
    author = "J. Fridrich and Rui Du and Meng Long",
    title = "Steganalysis Of Lsb Encoding In Color Images",
    url = "citeseer.nj.nec.com/403441.html"
}



@inproceedings{ johnson98steganalysis,
    author = "Neil F. Johnson and Sushil Jajodia",
    title = "Steganalysis of Images Created Using Current Steganography Software",
    booktitle = "Information Hiding, Second International Workshop",
    pages = "273-289",
    year = "1998",
    url = "citeseer.nj.nec.com/johnson98steganalysis.html"
}



@misc{ johnson98steganalysis2,
    author = "Neil F. Johnson and Sushil Jajodia",
    title = "Steganalysis: The Investigation of Hidden Information",
    text = "N. Johnson and S. Jajodia, Steganalysis: The Investigation of Hidden Information,
      Proc. of the 1998 IEEE Information Technology Conference, Syracuse, New
      York, September 1-3, 1998.",
    year = "1998"
}


@techreport{ provos01probabilistic,
    author = "Niels Provos",
    title = "Probabilistic Methods for Improving Information Hiding",
    month="January",
    institution="University of Michigan",
    number="01-1",
    text = " Probabilistic Methods for Improving Information Hiding. CITI
      Technical Report 01-1, January 2001. Submitted for publication.",
    year = "2001",
    url = "citeseer.nj.nec.com/provos01probabilistic.html"
}

@misc{ provos-defending,
    author = "Niels Provos",
    title = "Defending Against Statistical Steganalysis",
    url = "citeseer.nj.nec.com/provos01defending.html"
}


@book{ wayner96,
    author = "Peter Wayner",
    title = "Disappearing Cryptography",
    year = "1996",
    publisher = "AP Professional",
    address = "Chestnut Hill, MA"
}

Second edition due next year.



For archives see:
http://www.interesting-people.org/archives/interesting-people/
