Educause Security Discussion mailing list archives

Re: Locating Personally Identifiable Information


From: Brad Judy <Brad.Judy () COLORADO EDU>
Date: Tue, 12 Feb 2008 09:30:05 -0700

We spent a bit of time looking into the tools and processes.  We came up
with a multi-phase and multi-pronged approach.

First, I'll say we had a couple of data exposure incidents in the past,
which generated some motivation and support for pursuing the subject of
private data identification and protection.  We had existing private
data protection standards, but people are often unaware of the data they
have (usually old, long-forgotten data), so we have been focusing on
identifying that data.  

Our multi-campus system office created a policy that defines the
university data sensitivity classifications
(https://www.cu.edu/policies/General/IT-Sec_InfoClassification_P.pdf) as
well as one that discusses IT asset inventories
(https://www.cu.edu/policies/General/IT-Sec_UnivOps.pdf).

We tested various tools and tuned them to best work with the types of
data we were seeing on campus (the incidents gave us some good
experience in this area).  For the Windows platform, we have focused on
Cornell's Spider 2.x and for other platforms, we have mainly been
working with U Texas' Java-based SENF.  Virginia Tech also makes a pair
of tools (Find_SSN and Find_CCN) written in Python.  There are several
commercial options (Proventsure, IdentityFinder, Tablus, DBdatafinder,
Vontu, etc.), but all were Windows-only and most were focused on
centralized reporting.  Given the distributed nature of our campus, we
wanted tools with localized reporting.  In the end, we distribute a
version of Spider with custom searches/configuration and have been
directing people to use the stock version of SENF for other platforms.
We developed a more complete Spider manual for more advanced users (IT
admins) and held a Spider training class.  

We also developed our own scripts for some command-line Unix searching,
primarily for searching central Solaris hosts.  This is one of our other
prongs: we scan our primary web server once a month.  We've done various
forms of Google hacking as well, which helps with department-managed web
servers.  
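
For illustration only (this is not the scripts mentioned above), a
minimal sketch of a command-line search for SSN-formatted strings might
look like the following in Python.  The pattern, the function names
scan_text and scan_tree, and the max_chars limit are all assumptions
made for the example.

```python
import os
import re

# Assumed pattern: NNN-NN-NNNN that is not part of a longer run of
# digits or dashes. Real tools use broader, tuned patterns.
SSN_PATTERN = re.compile(r"(?<![\d-])\d{3}-\d{2}-\d{4}(?![\d-])")


def scan_text(text):
    """Return every SSN-formatted candidate string found in text."""
    return SSN_PATTERN.findall(text)


def scan_tree(root, max_chars=1_000_000):
    """Walk root and return {path: candidate_count} for files with hits.

    Only the first max_chars characters of each file are examined, and
    undecodable bytes are ignored, so binary formats are handled poorly.
    """
    hits = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="ignore") as handle:
                    text = handle.read(max_chars)
            except OSError:
                continue  # unreadable file; skip it
            matches = scan_text(text)
            if matches:
                hits[path] = len(matches)
    return hits
```

Calling scan_tree on a directory returns a per-file count of candidates
that can be reviewed locally, which fits the localized-reporting goal
described earlier.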

The search tools are tricky and we get some complaints about the false
positives, but we've taken the tuning as far as we are comfortable given
the file and data types we have observed during our incidents.  For
desktops we manage, our staff can help run the tools, but all decisions
about what to do with data are left to the user.
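
As a hedged example of the kind of check that can cut false positives
(not necessarily the tuning we applied), the sketch below filters
candidates with the Luhn checksum for card numbers and with the
published SSN structure rules (no area 000, 666, or 900-999; no group
00; no serial 0000).  The function names luhn_valid and plausible_ssn
are hypothetical.

```python
import re


def luhn_valid(number):
    """Return True if the digits in number pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False  # outside the usual card-number lengths
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def plausible_ssn(candidate):
    """Return True if candidate is NNN-NN-NNNN with an issuable layout."""
    match = re.fullmatch(r"(\d{3})-(\d{2})-(\d{4})", candidate)
    if not match:
        return False
    area, group, serial = match.groups()
    if area in ("000", "666") or area.startswith("9"):
        return False
    return group != "00" and serial != "0000"
```

Checks like these only discard strings that cannot be valid; anything
that passes still needs a person to decide whether it is real private
data.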

We developed some quick guidelines on what to do with identified private
data (http://www.colorado.edu/its/security/PrData_QkRef_Table.pdf) to
help guide end-users who do their own searching.  

For the process, we are approaching it as an IT asset inventory issue.
We developed some guidelines for asset inventory
(http://www.colorado.edu/its/security/assetinventory/) and divided the
campus departments into two phases.  The first phase consisted of
departments that were more likely to be systemically storing private
data.  Part of this group were departments known to have databases of
SSNs in the past (a few years ago we converted from SSN to a student
ID) and the other part were departments with credit card merchant IDs.
They were asked to scan their systems, remove private data that was no
longer needed, report which systems would continue to store private
data, and attest to those systems meeting campus security standards for
private data storage
(http://www.colorado.edu/its/docs/policies/Requirements_for_Private_Data_Systems_2007.pdf).

The second phase (which I'm just getting rolling) is making the same
request of the remaining departments on campus.  In the near future
we'll be adding some more documents to
http://www.colorado.edu/its/security/assetinventory/ so you'll be able
to see copies of the assertion form and a flowchart of the recommended
process for departments.

That turned out to be a long e-mail; hopefully there was some useful
stuff in there.

Brad Judy

IT Security Office
University of Colorado at Boulder

-----Original Message-----
From: David, Elaine [mailto:elaine.david () UCONN EDU] 
Sent: Tuesday, February 12, 2008 8:51 AM
To: SECURITY () LISTSERV EDUCAUSE EDU
Subject: [SECURITY] Locating Personally Identifiable Information

At the University of Connecticut we are looking to deploy software for
locating personally identifiable information such as social security
numbers, credit card numbers, etc. in our efforts to help us manage and
protect sensitive data.

We have identified several products that we have tested for
functionality, among them: Cornell's Spider Forensic Tool, Velosecure's
Identity Finder, and Proventsure's Self PII Detection. 

I am interested in learning whether other institutions have implemented
a tool for identifying/locating sensitive information, and if so:
(1) Which tool are they using?
(2) How is the tool being deployed? E.g. Do you just make it available
for use by your staff? Do you have support staff who run the tool for
individuals who request it or can individuals run it themselves?  Is it
mandatory or voluntary to use the tool? 
(3) Any other information that might be useful.

Thank you in advance for any information that you can provide to us.

- Elaine

Elaine David
Assistant Vice President for Information Services
Director of Information Technology Security, Policy & Quality Assurance
University of Connecticut
Storrs, Connecticut 06269-3138 
Phone: (860) 486-1362
Fax: (860) 486-5744
Email: Elaine.David () uconn edu

 

