Educause Security Discussion mailing list archives

Re: SSN file scanner (C source available)


From: Wyman Miles <wm63 () CORNELL EDU>
Date: Fri, 12 May 2006 10:02:54 -0400

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

At their heart, all of these tools are one flavor or another of pcregrep.

A somewhat organized "find it and nuke it" movement has started at Cornell,
where the departments are conducting periodic, organized searches for
confidential data and either encrypting, moving, or removing it.

What we're striving to build here are LAN-capable tools with centralized
logging and unattended operation to support that effort.

--On Friday, May 12, 2006 8:17 AM -0500 Roger Safian
<r-safian () NORTHWESTERN EDU> wrote:

If it's of any use, here's a post I made a while back to our local
user group about looking for SSNs and credit card numbers using
grep.

--

Greetings,

We have been asked to recommend tools for examining the files on a machine
to see if they contain sensitive data.  These requests are typically for
compliance purposes and are often focused on Social Security numbers
and credit card data.  With the upcoming effective date for Illinois
HB 1633, there is perhaps renewed interest in this topic.

A tool that you can use to examine file contents is available by
default on both Mac OS X and Unix systems.  This tool is grep.
There are versions of grep available for the PC as well.

I examined three of the PC grep tools available for download:
PowerGrep 3.2, Examine32 4.31, and Windows Grep 2.3.  All of
these tools worked fine in my tests on my machine.  My preference
was for Windows Grep, since it displays the results in a clickable
window within the program.  All are available for download
at download.com, or Google the name for other locations.

Grep can use regular expressions to look for data within a file.
The following expressions, when used with grep, will find strings
formatted like Social Security and credit card numbers.

SSNs 123-45-6789 or 123 45 6789

[0-9][0-9][0-9]\-[0-9][0-9]\-[0-9][0-9][0-9][0-9]|[0-9][0-9][0-9]\ [0-9][0-9]\ [0-9][0-9][0-9][0-9]
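
For anyone who would rather script the search than run grep by hand,
here is a minimal sketch of the same SSN expression using Python's
re module.  The names SSN_RE and find_ssn_like are made up for this
illustration, and the {n} repetition counts are only a shorter way
of writing the repeated [0-9] groups above.

```python
import re

# Same layout as the grep expression above: 123-45-6789 or 123 45 6789.
# Each alternative requires one consistent separator throughout.
SSN_RE = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}|[0-9]{3} [0-9]{2} [0-9]{4}")


def find_ssn_like(text):
    """Return every substring of text that has the SSN layout."""
    return SSN_RE.findall(text)


sample = "id 123-45-6789, alt 987 65 4321, mixed 123-45 6789"
print(find_ssn_like(sample))  # ['123-45-6789', '987 65 4321']
```

Like the grep version, this is not anchored, so it will also report
an SSN-shaped run of digits that sits inside a longer number.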

Visa/Mastercard/Discover 1234-5678-1234-5678 or 1234 5678 1234 5678

[0-9][0-9][0-9][0-9]\-[0-9][0-9][0-9][0-9]\-[0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]|[0-9][0-9][0-9][0-9]\ [0-9][0-9][0-9][0-9]\ [0-9][0-9][0-9][0-9]\ [0-9][0-9][0-9][0-9]

American Express 1234-567890-12345 or 1234 567890 12345

[0-9][0-9][0-9][0-9]\-[0-9][0-9][0-9][0-9][0-9][0-9]\-[0-9][0-9][0-9][0-9][0-9]|[0-9][0-9][0-9][0-9]\ [0-9][0-9][0-9][0-9][0-9][0-9]\ [0-9][0-9][0-9][0-9][0-9]
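
The two card expressions can be sketched the same way in Python.
This is only an illustration: CARD_PATTERNS and find_card_like are
hypothetical names, and the labels simply mirror the headings above
(4-4-4-4 digits for Visa/Mastercard/Discover, 4-6-5 for American
Express).

```python
import re

# Digit layouts taken from the grep expressions above, with either
# hyphens or spaces as the separator.
CARD_PATTERNS = {
    "visa_mc_discover": re.compile(
        r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{4}"
        r"|[0-9]{4} [0-9]{4} [0-9]{4} [0-9]{4}"
    ),
    "amex": re.compile(
        r"[0-9]{4}-[0-9]{6}-[0-9]{5}|[0-9]{4} [0-9]{6} [0-9]{5}"
    ),
}


def find_card_like(text):
    """Return (label, match) pairs for every card-shaped string in text."""
    hits = []
    for label, pattern in CARD_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(0)))
    return hits


print(find_card_like("a 1234-5678-1234-5678 b 1234 567890 12345"))
```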

Please note that each of the above search strings should be entered
on a single line.  You may need to modify them slightly for your
particular OS.  In particular, you may need to replace the "|" with
"\|" for some versions of Unix, including Mac OS.  You should also
look at all files, including binary files, and examine the contents
of zip files if possible.  On a Unix system, the -a option makes
grep treat binary files as text.
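
As a rough sketch of the "all files, including binary and zip" advice,
the Python below walks a directory tree, reads every file as raw
bytes (roughly what -a does for grep), and also looks inside zip
archives.  The function names scan_bytes and scan_tree are invented
for this example, only the SSN layout is searched, and each file is
read into memory whole, so this is a starting point rather than a
finished tool.

```python
import re
import sys
import zipfile
from pathlib import Path

# Bytes pattern, so files are searched without decoding them first.
SSN_RE = re.compile(rb"[0-9]{3}-[0-9]{2}-[0-9]{4}|[0-9]{3} [0-9]{2} [0-9]{4}")


def scan_bytes(data):
    """Return the SSN-shaped strings found in a bytes object."""
    return [m.group(0).decode("ascii") for m in SSN_RE.finditer(data)]


def scan_tree(root):
    """Yield (location, match) for each hit under root, including zip members."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            if zipfile.is_zipfile(str(path)):
                with zipfile.ZipFile(str(path)) as zf:
                    for name in zf.namelist():
                        for hit in scan_bytes(zf.read(name)):
                            yield f"{path}!{name}", hit
            else:
                for hit in scan_bytes(path.read_bytes()):
                    yield str(path), hit
        except (OSError, RuntimeError, zipfile.BadZipFile) as err:
            # Unreadable or password-protected files are reported and skipped.
            print(f"skipped {path}: {err}", file=sys.stderr)
```

A loop such as "for where, hit in scan_tree(some_directory):
print(where, hit)" would list each hit with the file (and zip member)
it came from.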

Please examine the contents of any files carefully.  On my own system,
for example, I found a file of flow data that matched the Social
Security number format.  Just because you get a particular hit does not
automatically mean the data is of concern.
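
One common way to triage credit card hits, not part of the search
strings above, is the Luhn checksum that payment card numbers
satisfy.  The sketch below is an assumption about how one might
apply it (luhn_ok is a hypothetical helper name).  A string that
fails the check is very unlikely to be a real card number, but a
string that passes still needs to be looked at by a person, and the
check says nothing about SSN hits.

```python
def luhn_ok(candidate):
    """Return True if the digits in candidate pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_ok("4111 1111 1111 1111"))  # True (a widely used test number)
print(luhn_ok("1234-5678-1234-5678"))  # False (format example from above)
```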

Please contact me with any questions or concerns.

--
Roger A. Safian
r-safian () northwestern edu (email)   public key available on many key servers.
(847) 491-4058   (voice)
(847) 467-6500   (Fax)   "You're never too old to have a great childhood!"



Wyman Miles
Senior Security Engineer
Cornell University, Ithaca, NY
(607) 255-8421
-----BEGIN PGP SIGNATURE-----
Version: Mulberry PGP Plugin v3.0
Comment: processed by Mulberry PGP Plugin

iQA/AwUBRGSVjsRE6QfTb3V0EQJoVACg4DfjB/NnXfxE9xIhLHF9ozJIGNYAn0rn
yd2C09rGCbaQdMxRq+XBhp/D
=aMNw
-----END PGP SIGNATURE-----
