Educause Security Discussion mailing list archives

Re: Identity Finder


From: "Peterman, Martin (mdp4s)" <mdp4s () VIRGINIA EDU>
Date: Fri, 18 Dec 2009 10:46:49 -0500

I have sample data that I would like to share; please note that it came from Identity Finder.  Is there an EDUCAUSE 
drop zone where we can access files?

Along the same lines, it would be great if we could build a big repository of test files of every ilk!

Please be advised that SSNs conform to certain rules, so not all 9-digit numbers will show up in a scan.
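
As a rough illustration of the kind of rules involved, here is a minimal Python sketch. It assumes the publicly documented SSA constraints: the area number (first three digits) is never 000, 666, or 900-999, the group number (middle two digits) is never 00, and the serial number (last four digits) is never 0000. The function name `looks_like_ssn` and the accepted input formats are illustrative assumptions, not a description of how Identity Finder or any other scanner works internally.

```python
import re

# Nine digits, optionally separated by hyphens or spaces (an assumed input format).
SSN_PATTERN = re.compile(r"(\d{3})[- ]?(\d{2})[- ]?(\d{4})")

def looks_like_ssn(candidate: str) -> bool:
    """Return True if the string passes basic SSN structure rules."""
    match = SSN_PATTERN.fullmatch(candidate.strip())
    if match is None:
        return False
    area, group, serial = match.groups()
    # Area numbers 000, 666, and 900-999 are never issued.
    if area in ("000", "666") or int(area) >= 900:
        return False
    # Group number 00 is never issued.
    if group == "00":
        return False
    # Serial number 0000 is never issued.
    if serial == "0000":
        return False
    return True
```

A scanner that applies checks like these will skip a string such as 000-12-3456 even though it has nine digits, which is why random 9-digit test data can produce fewer hits than expected.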

Thanks,
Marty


Marty Peterman, CISSP                                       
peterman () virginia edu
Information Security Analyst
Information Security, Policy, and Records Office (ISPRO)
Office of the Vice President/CIO
University of Virginia, 2400 Old Ivy Rd.                 Phone  434.243.4909
Box 400898, Charlottesville, VA 22904-4898               Fax    434.243.9197
http://www.itc.virginia.edu/security/    


-----Original Message-----
From: The EDUCAUSE Security Constituent Group Listserv [mailto:SECURITY () LISTSERV EDUCAUSE EDU] On Behalf Of Brad Judy
Sent: Friday, December 18, 2009 10:28 AM
To: SECURITY () LISTSERV EDUCAUSE EDU
Subject: Re: [SECURITY] Identity Finder

I want to echo Randy's statement about testing the tools with sample data
from your environment.  At a minimum, pull together a representative sample
of files you encounter.  Ideally, pull together two testing data sets: one
set of known positives in a variety of file formats and another very large
set of a wide variety of files.  The first will allow you to test false
negatives and the second will allow you to test false positives.  The second
will also allow you to performance benchmark the tools, which can vary in
speed by a factor of 10 or more.
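
To make the comparison between tools concrete, the two data sets can be scored along these lines. This is only a sketch: it assumes you can export each tool's results as a set of flagged file paths, and that the large second set has been verified clean so that anything flagged there counts as a false positive. The function and parameter names are hypothetical.

```python
def error_rates(flagged, known_positives, known_negatives):
    """Compute (false-negative rate, false-positive rate) for one scan.

    flagged:          set of file paths the tool reported
    known_positives:  set of paths known to contain sensitive data
    known_negatives:  set of paths assumed to be clean
    """
    missed = known_positives - flagged        # sensitive files the tool did not report
    false_alarms = known_negatives & flagged  # clean files the tool reported anyway
    fn_rate = len(missed) / len(known_positives) if known_positives else 0.0
    fp_rate = len(false_alarms) / len(known_negatives) if known_negatives else 0.0
    return fn_rate, fp_rate
```

Running the same function over each tool's output gives a pair of numbers per tool that can be set against your institution's tolerance.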

When putting together test data, remember to include old file formats.  A
large portion of the sensitive data found is in older, forgotten documents
often from when student ID might have been the SSN.  Think about what
spreadsheet or database formats might have been popular on your campus 5-10
years ago.  

Regardless of the tool you select, use the information from your testing to
tune it to match your institution's tolerance for FN/FP.  In general,
commercial tools err toward false negatives and open source ones err
toward false positives, but most can be tuned to meet your needs.  It's
unlikely that any tool will exactly match your institution's risk/usability
balance goals out of the box.  

One can also consider a multi-phased approach to data searching.  Perhaps
you start with the high risk items and only flag files with more than X
number of hits, then retune later to catch the smaller fish.  Or, you can
target high risk systems (internet-facing, portable, file shares, etc.)
before moving to broader groups.  This can ensure that your effort starts
with big wins and lower end-user pain.  
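
The "more than X hits" phase can be sketched as a simple filter over scan results. This assumes the tool's output can be reduced to a mapping of file path to hit count; the names `files_to_flag`, `hit_counts`, and `min_hits` are made up for illustration.

```python
def files_to_flag(hit_counts, min_hits):
    """Return paths with more than min_hits hits, highest count first."""
    flagged = [(path, hits) for path, hits in hit_counts.items() if hits > min_hits]
    flagged.sort(key=lambda item: item[1], reverse=True)
    return [path for path, _ in flagged]
```

A first pass might call this with a high threshold to surface only the worst files, then a later pass lowers `min_hits` to catch the smaller fish.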

Brad Judy

Emory University


-----Original Message-----
From: The EDUCAUSE Security Constituent Group Listserv
[mailto:SECURITY () LISTSERV EDUCAUSE EDU] On Behalf Of randy marchany
Sent: Friday, December 18, 2009 9:37 AM
To: SECURITY () LISTSERV EDUCAUSE EDU
Subject: Re: [SECURITY] Identity Finder

We wrote one of the freeware tools (Find_SSN, Find_CCN) and use
IdentityFinder as well. IdentityFinder has the ability to be run on
remote machines and some of our dept admins like that feature. The
other tools don't have that ability. IdentityFinder does NOT run on
Unix systems, and since most of our database servers run on Unix/Linux
systems, IdentityFinder doesn't help us there. The Windows version is
excellent but I'm disappointed in the Mac version. Someone else
mentioned the Mac version is a work in progress and I would agree with
that assessment. It's still a very good product. Our Find_SSN/CCN tool
runs on all platforms (Mac, Windows, Linux/Unix).

As far as false positives go, our tool is the best at reducing the
number of false positives. The biggest complaint you will get from
your users is "do I have to look at ALL of those files to see if
there's sensitive data?". The answer is either a) yes, or b) move all of those
files into a folder, encrypt it, and look at it later. All of the
tools including ours will generate false positives. The key is having
a sensitive data policy or standard in place. This will help you with
users who don't want to look through all of them.

The other problem with these tools is that none of them play well with
Outlook/Exchange .pst files, which is probably where most of the
sensitive data would be found, in email attachments. I believe
IdentityFinder requires you to log into Exchange first and that's
their hook into .pst type files. My info may be dated but I believe
it's still correct.

This is the biggest issue with upper mgt.

I would suggest building a test folder (regular files, Microsoft
Office files (.xls, .doc, Project, Visio, etc.), PDF files, .pst
files, binaries, a small database table), running all of the tools
against that folder, and comparing the results. The advantages of the
commercial tools include the report format (auditors will like it),
whereas the freeware tools will simply generate a list of hyperlinks that
point to the files in question.

Randy Marchany
VA Tech IT Security Office
