PaulDotCom mailing list archives

Office 2007 Metadata Parser


From: alif016 at gmail.com (Andrew)
Date: Sun, 21 Dec 2008 13:21:57 -0600

I've written a Python script that parses users and other metadata out
of docx files; it fits Larry's wish list fairly well.
http://www.bitbucket.org/alif/office-2007-metadata-parser/src/

The script is office2007_meta.py. I've put a few examples on the
wiki:
http://www.bitbucket.org/alif/office-2007-metadata-parser/wiki/Home

It unzips each docx file on the fly, reads core.xml, parses it, and
prints one username per line. It accepts any number of files and has
two flags: -e (exclusive) prints each user only once, and -v (verbose)
shows which file each user came from.
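
As a rough illustration of the approach described above (this is not the
actual office2007_meta.py code), the sketch below opens a docx as a zip
archive, reads the core properties part, and pulls out the creator and
last-modified-by names. The function names, the `unique` parameter
(standing in for -e), and the assumption that the part lives at
docProps/core.xml are all illustrative choices, not taken from the script.

```python
import zipfile
import xml.etree.ElementTree as ET

# Assumed location of the core properties part inside the archive.
CORE_PATH = "docProps/core.xml"

# XML namespaces used by the core properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def users_from_core_xml(xml_bytes):
    """Return the creator and last-modified-by names found in core.xml."""
    root = ET.fromstring(xml_bytes)
    users = []
    for tag in ("dc:creator", "cp:lastModifiedBy"):
        elem = root.find(tag, NS)
        if elem is not None and elem.text:
            users.append(elem.text.strip())
    return users


def users_from_docx(path):
    """Read core.xml straight out of the zip without extracting to disk."""
    with zipfile.ZipFile(path) as archive:
        return users_from_core_xml(archive.read(CORE_PATH))


def collect_users(paths, unique=False):
    """Return (file, user) pairs; with unique=True each user appears once."""
    seen = set()
    results = []
    for path in paths:
        for user in users_from_docx(path):
            if unique and user in seen:
                continue
            seen.add(user)
            results.append((path, user))
    return results
```

Printing only the user from each pair gives one username per line;
printing both fields gives something like the verbose view.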

It can also easily be extended by using its OfficeXML class.
--
Andrew
<http://www.bitbucket.org/alif/office-2007-metadata-parser/src/tip/office2007_meta.py>


