Vulnerability Development mailing list archives
Re: Audio fingerprinting (was Re: hacksdmi?)
From: Geoff Schmidt <geoff () TUNEPRINT COM>
Date: Mon, 16 Oct 2000 05:11:02 -0400
Lincoln Yeoh wrote:
If you only need to write 32 bits into a 120 second long song it doesn't look too bad. But once you start trying to shove PKI type keys into it, woohoo. A bit a second at 90db S/N is probably noticeable to the "Golden Ears" folks out there, so good luck getting 10 bits a sec....
Another aspect to keep in mind is that the watermarking camp generally tries to design their algorithms to permit the recovery of the watermark data from *any* n-second segment of the track, where n is typically less than, oh, say, five.
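To put rough numbers on that, here is a minimal sketch of the arithmetic. The 32-bit payload, the 120-second song and the five-second window come from the discussion above; the 1024-bit figure for a "PKI type" key is an assumption for illustration only, and the helper name is hypothetical. It ignores synchronization and error-correction overhead, so these are lower bounds.

```python
def required_rate_bps(payload_bits: int, segment_seconds: float) -> float:
    """Bits per second needed so that one segment of this length carries the whole payload."""
    return payload_bits / segment_seconds

# 32 bits spread once over the whole 120-second song.
whole_song_rate = required_rate_bps(32, 120)

# 32 bits recoverable from any 5-second segment.
any_segment_rate = required_rate_bps(32, 5)

# Assumed 1024-bit key (illustrative size only) recoverable from any 5-second segment.
key_segment_rate = required_rate_bps(1024, 5)

print(whole_song_rate, any_segment_rate, key_segment_rate)
```

The point of the comparison is that requiring recovery from any short segment raises the needed rate by the ratio of song length to segment length, before the payload gets any bigger.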
I think Shannon's famous formula will help tell you how much you have to play with given an acceptable signal level for music. It doesn't tell you how to do it, but at least it's easy to prove at which point it becomes very difficult.
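For reference, the formula in question is presumably the Shannon-Hartley capacity, C = B * log2(1 + S/N). A minimal sketch follows, treating the watermark as the signal and the music as Gaussian noise, which is a crude model. The 20 kHz bandwidth and the -30 dB watermark-to-music ratio are assumptions chosen only to show the shape of the calculation; neither figure appears in the thread.

```python
import math

def shannon_capacity_bps(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon-Hartley upper bound on bits per second for a Gaussian channel."""
    snr_linear = 10 ** (snr_db / 10)  # convert dB to a plain power ratio
    return bandwidth_hz * math.log2(1 + snr_linear)

# Assumed values: 20 kHz of audio bandwidth, watermark 30 dB below the music.
capacity = shannon_capacity_bps(20_000, -30)
print(f"{capacity:.1f} bits/s")
```

With these assumed numbers the bound comes out on the order of 29 bits per second. It is only an upper limit under the Gaussian model and says nothing about audibility.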
You actually have a lot more leeway than Shannon would lead you to believe, because of psychoacoustic considerations: because of how the 'spectrum analyzer' in the ear works, at any given point in time there are chunks of spectrum where human beings are very insensitive to noise. (The location of these areas is a complicated and not fully understood function of the sound.) And yes, the people who do this kind of research use data from 'golden ears' :)
On the other hand, it struck me as I was writing this that in an important way watermarking is harder than mp3 compression. Gains in mp3 compression (i.e., file size reduction) come from two sources: encoding multiple psychoacoustically identical (or similar) waveforms as the same thing, and storing the signal in a more compact way that uses fewer bytes. These map broadly to the lossy and lossless stages of the algorithm, in that order, and the sum of these two effects is the redundancy that an mp3 coder identifies and extracts from the waveform.
You can think of watermarking from an information-theoretic perspective: if you're inserting an n-bit watermark into a signal per unit time, you either have to find n bits of redundancy in the signal per unit time, or you have to lose n bits of information from the signal per unit time (or a combo of the two). Otherwise you'd be getting something for nothing.
But here's the kicker, and the point of this little digression: a watermark algorithm can only use the _first_ of the two sources of redundancy, because it has to survive file format conversion, and anything hidden in the way the bytes are packed is gone as soon as the audio is re-encoded.
At this point it's reasonable to wonder how much of an mp3 coder's gains are due to the first source and how much are due to the second. The party line is that it varies with time: for spectrally simple blocks, the psychoacoustics don't help much but you get great compression; for spectrally complex blocks, the psychoacoustics are a big win but you don't get much compression. (This balance is usually recognized as a design win.)
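The budget argument above can be sketched in a few lines. Everything here is hypothetical: the block labels, the per-block bit counts and the names are invented for illustration, not measured from any real coder. The only thing taken from the text is the rule that a watermark may draw on the first (psychoacoustic) source of redundancy but not on the second (compact storage).

```python
from dataclasses import dataclass

@dataclass
class Block:
    label: str
    perceptual_bits: float  # source 1: perceptually equivalent waveforms
    coding_bits: float      # source 2: more compact storage (unusable for watermarks)

def blocks_that_fail(blocks, watermark_bits):
    """Labels of blocks whose perceptual redundancy cannot hold the watermark."""
    return [b.label for b in blocks if b.perceptual_bits < watermark_bits]

# Invented numbers: simple blocks compress well but offer little perceptual slack.
blocks = [
    Block("sparse solo flute", perceptual_bits=2, coding_bits=60),
    Block("dense orchestral", perceptual_bits=40, coding_bits=8),
    Block("near silence", perceptual_bits=0, coding_bits=70),
]

failing = blocks_that_fail(blocks, watermark_bits=10)
print(failing)
```

An mp3 coder would do well on all three of these made-up blocks, since it can draw on whichever source is large. A constant 10-bit watermark only fits in the dense one.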
The upshot is that watermarks don't get the benefit of that balance: if you're trying to cram in a constant number of watermark bits per unit time, you have to rely on just the psychoacoustics, which means there are some points in a track (and some entire tracks!) where you lose. People should step in and correct me if I'm wrong on that last rumination.
Geoff