Educause Security Discussion mailing list archives

Re: SSN file scanner (C source available)


From: Roger Safian <r-safian () NORTHWESTERN EDU>
Date: Fri, 12 May 2006 08:17:47 -0500

If it's of any use, here's a post I made a while back to our local
user group about looking for SSNs and credit card numbers using
grep.

--

Greetings,

We have been asked to recommend tools for examining the files on a machine
to see if they contain sensitive data.  These requests are typically for
compliance purposes and are often focused on Social Security Numbers
and credit card data.  With the upcoming effective date for Illinois
HB 1633, there is perhaps renewed interest in this topic.

A tool that you can use to examine file contents is available by
default on both Mac OS X and Unix systems.  This tool is grep.
There are versions of grep available for the PC as well.

I examined three of the PC grep tools available for download,
PowerGrep 3.2, Examine32 4.31, and Windows Grep 2.3.  All of
these tools worked fine in my tests on my machine.  My preference
was for Windows Grep, since it nicely displayed the results in
a clickable window of the program.  All are available for download
at download.com, or Google the name for other locations.

Grep can use regular expressions to look for data within a file.
The following strings, when used in grep, will match text in the
format of Social Security and credit card numbers.

SSNs 123-45-6789 or 123 45 6789

[0-9][0-9][0-9]\-[0-9][0-9]\-[0-9][0-9][0-9][0-9]|[0-9][0-9][0-9]\ [0-9][0-9]\ [0-9][0-9][0-9][0-9]
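
As a rough illustration, the same SSN pattern can be written more
compactly with repetition counts.  The sketch below uses Python's re
module rather than grep, and the names SSN_PATTERN and
find_ssn_candidates are made up for this example.

```python
import re

# Same two alternatives as the grep string above:
# dash-separated (123-45-6789) or space-separated (123 45 6789).
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}|[0-9]{3} [0-9]{2} [0-9]{4}")

def find_ssn_candidates(text):
    """Return every substring of text that has the SSN shape."""
    return SSN_PATTERN.findall(text)

print(find_ssn_candidates("id 123-45-6789, alt 123 45 6789, phone 491-4058"))
# prints ['123-45-6789', '123 45 6789']
```

Like the grep string, this pattern is not anchored, so it will also
match an SSN-shaped run of digits inside a longer number.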

Visa/MasterCard/Discover 1234-5678-1234-5678 or 1234 5678 1234 5678

[0-9][0-9][0-9][0-9]\-[0-9][0-9][0-9][0-9]\-[0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]|[0-9][0-9][0-9][0-9]\ [0-9][0-9][0-9][0-9]\ [0-9][0-9][0-9][0-9]\ [0-9][0-9][0-9][0-9]

American Express 1234-567890-12345 or 1234 567890 12345

[0-9][0-9][0-9][0-9]\-[0-9][0-9][0-9][0-9][0-9][0-9]\-[0-9][0-9][0-9][0-9][0-9]|[0-9][0-9][0-9][0-9]\ [0-9][0-9][0-9][0-9][0-9][0-9]\ [0-9][0-9][0-9][0-9][0-9]
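
The two credit card patterns can be sketched the same way.  This is
only an illustration in Python, not the grep syntax itself, and the
names CARD16, AMEX and find_card_candidates are hypothetical.

```python
import re

# 16-digit cards in 4-4-4-4 groups, separated by dashes or by spaces.
CARD16 = re.compile(
    r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{4}|[0-9]{4} [0-9]{4} [0-9]{4} [0-9]{4}"
)
# American Express: 15 digits in 4-6-5 groups, dashes or spaces.
AMEX = re.compile(r"[0-9]{4}-[0-9]{6}-[0-9]{5}|[0-9]{4} [0-9]{6} [0-9]{5}")

def find_card_candidates(text):
    """Return 16-digit matches first, then American Express matches."""
    return CARD16.findall(text) + AMEX.findall(text)

sample = "a 1234-5678-1234-5678 b 1234 567890 12345"
print(find_card_candidates(sample))
# prints ['1234-5678-1234-5678', '1234 567890 12345']
```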

Please note that each of the above search strings must be entered on
a single line.  You may need to modify them slightly for your particular
OS.  In particular, you may need to replace the "|" with "\|"
for some versions of Unix, including Mac OS.  You should also
look at all files, including binary files, and examine the
contents of zip files if possible.  On a Unix system, the -a
option makes grep examine binary files as text.
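
For anyone who would rather script the search, here is a minimal
sketch of the same idea in Python: walk a directory, read every file
as raw bytes (similar in spirit to grep -a), and look inside zip
files as well.  The names scan_bytes and scan_tree are invented for
this example, and it reads each file fully into memory, so treat it
as a starting point only.

```python
import os
import re
import zipfile

# Bytes pattern, so binary files can be searched without decoding them.
SSN_BYTES = re.compile(rb"[0-9]{3}-[0-9]{2}-[0-9]{4}|[0-9]{3} [0-9]{2} [0-9]{4}")

def scan_bytes(data):
    """Return the SSN-shaped byte strings found in data."""
    return SSN_BYTES.findall(data)

def scan_tree(root):
    """Walk root and return (location, match) pairs, opening zip files too."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if zipfile.is_zipfile(path):
                    with zipfile.ZipFile(path) as archive:
                        for member in archive.namelist():
                            for match in scan_bytes(archive.read(member)):
                                hits.append((path + "!" + member, match))
                else:
                    with open(path, "rb") as handle:
                        for match in scan_bytes(handle.read()):
                            hits.append((path, match))
            except (OSError, RuntimeError, zipfile.BadZipFile):
                continue  # unreadable or encrypted file: skipped in this sketch
    return hits
```

A call such as scan_tree("/some/directory") returns a list you can
then review by hand; the path shown is only a placeholder.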

Please examine the contents of any matching files carefully.  On my own
system, for example, I found a file of flow data that matched the Social
Security number format.  Just because you get a particular hit does not
automatically mean the data is of concern.
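
One way to weed out some false hits on the credit card patterns,
which is an addition here and not part of the grep approach above, is
the Luhn checksum that payment card numbers are designed to satisfy.
The function name luhn_ok is hypothetical.  A passing checksum still
does not prove a hit is a real card number, so manual review remains
necessary.

```python
def luhn_ok(candidate):
    """Return True if the digits in candidate pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch in "0123456789"]
    if not digits:
        return False
    total = 0
    # Starting from the rightmost digit, double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

print(luhn_ok("1234-5678-1234-5678"))  # prints False
print(luhn_ok("4111 1111 1111 1111"))  # prints True
```

The first value is the sample format string from the patterns above,
which fails the check; the second is a commonly used test number.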

Please contact me with any questions or concerns.

--
Roger A. Safian
r-safian () northwestern edu (email) public key available on many key servers.
(847) 491-4058   (voice)
(847) 467-6500   (Fax) "You're never too old to have a great childhood!"
