funsec mailing list archives

Re: An OCR plug-in for Spamassassin


From: Valdis.Kletnieks () vt edu
Date: Mon, 17 Apr 2006 00:22:09 -0400

On Sun, 16 Apr 2006 22:07:16 MDT, Dude VanWinkle said:

Does anyone know how CPU-intensive it would be to OCR the volume of
messages a large corporation receives in one day? I seem to remember
OmniPage taking a fair bit of the cycles when it was running, and even
though the OCR systems in development will be a sight better, I am
still curious whether this technology will at first end up being a
possible DoS vector before it matures.

OCR it the first time, and then save the result.  When another JPG
shows up, just compute a hash of it, and if you've seen that hash before,
reuse the OCR output from the first time.

That should work till the miscreants start pumping out differing tweaked images
for each spam... ;)
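
A minimal sketch of the hash-then-reuse idea, in Python. The OcrCache
class, its text_for method, and the ocr callable passed in are all
hypothetical names for illustration; the post does not name a hash
function or an OCR engine, so SHA-256 and a caller-supplied OCR function
are assumptions here.

```python
import hashlib
from typing import Callable, Dict


class OcrCache:
    """Cache OCR output keyed by a hash of the raw image bytes."""

    def __init__(self, ocr: Callable[[bytes], str]) -> None:
        # `ocr` is a stand-in for whatever OCR engine is actually used.
        self._ocr = ocr
        self._seen: Dict[str, str] = {}
        self.ocr_calls = 0  # counts how often the expensive step ran

    def text_for(self, image_bytes: bytes) -> str:
        # Hashing is cheap compared to OCR, so do it for every image.
        digest = hashlib.sha256(image_bytes).hexdigest()
        if digest not in self._seen:
            # First sighting of this exact image: pay for OCR once.
            self.ocr_calls += 1
            self._seen[digest] = self._ocr(image_bytes)
        # Repeat sightings reuse the stored text.
        return self._seen[digest]
```

Byte-identical images hit the cache, but an image that differs by even
one byte hashes differently and goes through OCR again, which is the
weakness noted above once each spam carries its own tweaked image.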

_______________________________________________
Fun and Misc security discussion for OT posts.
https://linuxbox.org/cgi-bin/mailman/listinfo/funsec
Note: funsec is a public and open mailing list.
