Educause Security Discussion mailing list archives

Re: Sensitive data detection


From: Wyman Miles <wm63 () CORNELL EDU>
Date: Fri, 20 Apr 2007 17:56:34 -0400

Runs of 9 digits are an extremely difficult problem.  You can bracket them
with a \D (nondigit) or \b (word boundary), which sometimes helps.
Validation against the SSA area/group data helps a little.  If your
institution draws heavily from a predictable population, you can use the
approach Colorado employs and write geographically dependent regexes.  But
no, there is no silver bullet.  SINs are a far easier problem as they're
Luhn-derived, like CC#s.
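
A rough sketch of the bracketing and area/group ideas above, in Python.
This is not how Spider or any other tool mentioned in this thread does it;
the names SSN_CANDIDATE, plausible_ssn and find_ssn_candidates are made up
for illustration. It only applies the fixed structural rules (area 000,
666 and 900-999, group 00 and serial 0000 are never issued) and assumes
you would layer the SSA's published area/group tables on top of that.

```python
import re

# Nine digits, either unhyphenated or hyphenated as 3-2-4 (the \2
# backreference forces both separators to match), and not touching any
# other digit. The lookarounds play the role of \D bracketing without
# consuming the neighbouring character.
SSN_CANDIDATE = re.compile(r"(?<!\d)(\d{3})(-?)(\d{2})\2(\d{4})(?!\d)")


def plausible_ssn(area, group, serial):
    """Structural rules only; a pass here does NOT mean the number is real."""
    if area == "000" or area == "666" or area >= "900":
        return False
    if group == "00" or serial == "0000":
        return False
    return True


def find_ssn_candidates(text):
    """Return the nine-digit strings in text that survive the structural rules."""
    hits = []
    for match in SSN_CANDIDATE.finditer(text):
        area, _sep, group, serial = match.groups()
        if plausible_ssn(area, group, serial):
            hits.append(area + group + serial)
    return hits
```

Expect this to trim the false positives, not remove them: most arbitrary
nine-digit numbers still pass these rules.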



I'd be interested in hearing people's feedback about the issues with
high false positive rates and 9-digit SSNs in evaluating these
tools.  Most of the datastores I come across here store SSNs without
hyphens, and creating regexes for any combination of 9-digit numbers
has always returned high false positives, so much so that it's borderline
useless.  There are some special rules for SSNs, but nothing like
credit card Luhn checks.
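
For comparison, a minimal sketch of the Luhn check referred to in this
thread, in Python. The function name luhn_valid is made up for
illustration, and stripping spaces and hyphens before checking is an
assumption about how the candidate strings arrive.

```python
def luhn_valid(number):
    """Return True if the digit string passes the Luhn checksum."""
    cleaned = number.replace(" ", "").replace("-", "")
    if len(cleaned) < 2 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit;
    # a doubled value above 9 has 9 subtracted (same as summing its digits).
    for i, ch in enumerate(reversed(cleaned)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0
```

Only about one in ten random digit strings passes a checksum like this,
which is why card numbers and SINs are so much easier to filter than SSNs.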

At 11:15 AM 4/20/2007, Harold Winshel wrote:
We're also looking to use Cornell's Spider program for
Rutgers-Camden Arts & Sciences faculty and staff.

At 01:52 PM 4/20/2007, you wrote:
On 4/20/07, Curt Wilson <curtw () siu edu> wrote:
Dear Educause security community,

For those that are currently working on a project involving the
identification of sensitive data across campus, I have some items of
potential interest. I know that Tenable (Nessus) recently announced a
module that can check (with host credentials) a host for the presence
of selected types of sensitive data, but what we have chosen is
Proventsure's Asarium software. We are in the early stages of testing,
but it looks to be a tremendously helpful tool for such a large task
(depending upon the size of your institution).

Thanks, Curt.  A freeware package that works in this same area is
the Cornell Spider:

http://www.cit.cornell.edu/computer/security/tools/
http://www.cit.cornell.edu/computer/security/tools/spider-cap.html

--
Peter N. Wan (peter.wan () oit gatech edu)     258 Fourth Street, Rich 244
Senior Information Security Engineer        Atlanta, Georgia 30332-0700 USA
OIT, Information Security                   +1 (404) 894-7766 AIM: oitispnw
Georgia Institute of Technology             GT FIRST Team Representative

Harold Winshel
Computing and Instructional Technologies
Faculty of Arts & Sciences
Rutgers University, Camden Campus
311 N. 5th Street, Room B10 Armitage Hall
Camden NJ 08102
(856) 225-6669 (O)

---------------------------------------------------------------------------------------------------

Josh Drummond
Security Architect
Administrative Computing Services, University of California - Irvine
jdrummon () uci edu
949.824.9574



Wyman Miles
Senior Security Engineer
Cornell University
