Interesting People mailing list archives

A new Mosaic site for patent searching (fwd)


From: David Farber <farber () central cis upenn edu>
Date: Wed, 12 Oct 1994 19:02:10 -0400

From: srctran () world std com (Gregory Aharonian)
To: patents () world std com




     I have created a Mosaic site with 500 MB of PTO & patenting information,
including the beginning of an Internet site that provides full searching
capabilities of the PTO's patent text databases for free.  Beyond lots of
documents, the current Mosaic site allows people to retrieve patent titles
in any class/subclass by clicking through a few screens. The patent title
data runs from patent number 3,500,000 through December 1993.


  The Mosaic site is at:


                http://sunsite.unc.edu/patents/intropat.html




(the Sun servers are going through special upgrades from Sun and crash from
time to time, so if your HTTP request fails, try again a little later in the
day) with the following top level menu items:




         Determine patent class/subclass using Manual of Classification
             Master list of all 400+ patent classes
             Design patent groups
             ELECTRONIC patent groups
             MECHANICAL patent groups
             ENGINEERING patent groups
             CHEMICAL patent groups


         Determine patent class/subclass using Index to Classification


         RETRIEVE patent titles using class/subclass code


         Patent documents from the PTO, PCT, EPO, etc.
             Phone numbers for various PTO offices
             PTO Examining Groups - key personnel and contact points
             Special PTO P.O. boxes for sending materials to the PTO
             Crystal City Public Patent Searching Room
             PTO depository libraries across the country
             US Patent filing fees
             Massachusetts roster of attorneys
             Preparation of Patent Drawings - PTO guide
             37 CFR 1.84 appendices to Patent Drawings guide
             Drawing examples appendix 4 of Patent Drawing guide
             Current PCT countries and future expansion
             Paris Convention for int. property protection


         US Code Section 35 - federal patent laws


         IPNS - Internet Patent News Service


         Archive of stories from the IPNS, etc.


                              ====================


    A FEW NOTES


    Some of the patent titles being retrieved are truncated in length at 157
characters.  This is a quirk of the C code I hacked together to provide quick
retrievals of hundreds of patent titles for each request.  This will be fixed
at some point in the future.  Also, many of the pages of information you will
be requesting range in size from 20K to 60K, so make sure you save what you
retrieve so you don't have to rerequest the information.  Many of the text
files are in plain ASCII format. If any of you want to volunteer to convert
them to HTML format, please let me know.  Finally, if you encounter any errors
in the patent data being sent out, please let me know.
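
For readers who save retrieved titles, the truncation described above can be
flagged after the fact.  What follows is a minimal Python sketch, not the
server's C code.  It assumes that a truncated title comes back exactly 157
characters long; the function name and the sample titles are hypothetical.

```python
# Assumption: the server cuts titles at 157 characters, so any saved title
# of exactly that length may have lost its tail.
TRUNCATION_LIMIT = 157


def possibly_truncated(titles):
    """Return the titles whose length equals the truncation limit."""
    return [t for t in titles if len(t) == TRUNCATION_LIMIT]


if __name__ == "__main__":
    sample = [
        "Short hypothetical title",
        "X" * 157,  # hypothetical title cut at the limit
    ]
    for title in possibly_truncated(sample):
        print("check against the PTO record:", title[:40] + "...")
```

A title that happens to be exactly 157 characters long in the PTO record
would also be flagged, so the result is a list of candidates to check rather
than a list of confirmed truncations.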


    HOW GOOD IS IT?


    The above said, here is my first review:
        "I saw you at the MIT Entrepreneur-Club meeting today, and so the
         first thing I did upon getting back to my office was doing a simple
         patent search.  So I would like to thank you: Your system is great!
         It took me about fifteen minutes to complete a search that had taken
         me a couple of hours and more than $100 to do on the CompuServe
         (using Dialog, really) last week."


                              ====================


                                 WHAT'S NEXT ??


    SHORT TERM


    In the short term, I plan to update the patent titles to June 1994, as
soon as I can borrow the June 1994 CASSIS CDROMs (which hopefully will be in
the next few weeks - if you have a set I can borrow, let me know). I have a
copy of MPEP as well as the International Patent Classification scheme, both
of which I will break into many small files (and maybe map the International
Patent Scheme into the USPTO scheme to allow patent title retrieval).


    FALL/WINTER TERM


    The next step in putting patent data onto the Internet is to make all of
the patent abstracts since 1970 Mosaic/Lynx accessible, with WAIS and other
search servers to allow keyword searching.  Given the expected load such a
Mosaic site will experience, I probably will wear out my welcome on UNC's Sun
system.  Thus a dedicated line to the Internet, a dedicated (and redundant)
file server, 12 gigabytes of disk space, plus lots of patent data will all
have to be acquired, plus overhead.


    I estimate it will cost $100,000 to acquire all of this equipment and
data, and prepare the system for Mosaic access over the Internet, and take
about a month to get up and running.  If possible, I will also include the
first claim of each patent.  These gigabyte file sizes somewhat stretch
Mosaic/bandwidth and retrieval tools, so some software will have to be written
to optimize multi-user retrieval.  In particular, these additional files will
include tons
of ASSIGNEE information which everyone has been requesting.


    There are about 2000 direct subscribing sites for my Internet Patent News
Service, so if the majority of the individuals who have been receiving the
service could donate $50, and companies $500, then most if not all of the
funds can be raised.   Since for the $50/$500, you are getting the equivalent
of the $200/$300-a-year CASSIS CDROMs, the $600 Official Gazette (minus the
pictures), and the $50/hr and up online patent abstract searching fees, in
addition to my Internet Patent News Service, the requested donations don't
seem too out of whack. And of course, larger donations will be warmly received
and help speed up putting patent data onto the Internet.
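
As a rough check on the figures above, the arithmetic can be sketched in
Python.  The donation amounts ($50 and $500), the roughly 2000 subscribing
sites, and the $100,000 target come from the text; the split between
individual and company donors is not given, so the mixes below are purely
hypothetical.

```python
INDIVIDUAL_DONATION = 50    # dollars, from the text
COMPANY_DONATION = 500      # dollars, from the text
TARGET = 100_000            # dollars, the estimated cost from the text
SUBSCRIBING_SITES = 2000    # approximate count, from the text


def total_raised(individuals, companies):
    """Dollars raised by a given number of individual and company donors."""
    return individuals * INDIVIDUAL_DONATION + companies * COMPANY_DONATION


if __name__ == "__main__":
    # Hypothetical donor mixes; the real split is unknown.
    for individuals, companies in [(2000, 0), (1000, 100), (500, 150)]:
        assert individuals + companies <= SUBSCRIBING_SITES
        raised = total_raised(individuals, companies)
        print(individuals, companies, raised, raised >= TARGET)
```

Each of these mixes stays within the 2000 subscribing sites and reaches the
target, which shows how much a modest number of company donations reduces the
number of individual donations needed.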


    So check out the Mosaic site, and if you like what you see, and want more,
please consider making a donation.  Please send any donations to me at:
Greg Aharonian, Internet Patent News Service, P.O. Box 404, Belmont, MA 02178
and/or call me at 617-489-3727.  I will be glad to provide invoices to any
companies that want to have a bill for their records (call it patent search
services rendered).


    LONG TERM


    The next stage after patent abstracts is that of putting up the full text
of patent claims, a 20 gigabyte project costing $100,000, followed by the
next stage of putting up the full text of patent specifications, a 120
gigabyte project costing $200,000.  For the time being, all I do is dream
about such things.  Beyond these stages are stages for foreign patent data,
and development of patent preparation and analysis tools.   But first, let's
get the abstracts onto the Internet.


    GOVERNMENT CONTACTS


    Putting tens of gigabytes of patent data onto the Internet for free access
would make it one of the largest, if not the largest, such databases on the
Internet.  It will be a great boon not only to patent activities, but also to
general R&D in the United States.  As such, there should be one or more of the
technology government agencies interested in supporting this endeavor. If any
of you have
contacts at the DOE/NASA/ARPA/NSF/NII/NIH/DOC/UN, please let me know or
contact such people directly.  Given the international use of the Internet,
I expect many people around the world to access the US patent data, and hope
that one of the UN educational/science agencies could lend support.


    LOCAL SERVERS


    For a modest fee, I will be glad to set up local servers that mirror the
Mosaic site at corporations whose employees plan to make great use of the
patent information I am making available.  In the interest of decreasing the
load on the Internet (and for those corporations not wanting their employees'
searches floating over the Internet), consider having a mirror site set up on
your local WANs/LANs.   The current configuration requires 500 megabytes of
space.  In particular, there are many Patent News subscribers at IBM, HP,
Motorola and ATT where mirror systems seem prudent.


    DATA ACCURACY


    All of the US patent data I send out and store on the Mosaic server comes
from the Patent and Trademark Office itself, either from the CASSIS CDROMs,
or downloads from APS.  For the most part, I do not alter the data in any way,
so any errors that might occur in the data do not originate with me.  For
example, about 5000 patents around number 4,790,000 have their titles truncated beyond
what I have to do for the server software.  This is because the CASSIS CDROMs
I used to acquire the patent titles had these truncated records.  I plan to
download data from APS to correct the problem.  But beyond incidents like
this, my data is as accurate as that of the PTO.




    FINAL APPEAL


    (Well not really). The technology existed last year to put up all of the
US patent data onto the Internet for free public access at a very reasonable
(in Beltway terms) cost, and it can be done by year-end with sufficient support.
But to do all of this, I am going to need your support and help.  I have
taken this effort as far as I can on my own, and would like to go further.
So think about the quality and quantity of patent information you have
received to date, and the additional patent information such an expanded
Mosaic site will provide you, and then consider making a donation to this
effort.  A little bit from a lot of people can go a long way, and the patent
data is too important to wait for the government to figure out what the real
value of the Internet is.




Greg Aharonian
Internet Patent News Service





--
http://www.eff.org/~mech/mech.html       Stanton McCandlish
mailto:mech () eff org              mech () eff org

http://www.eff.org/               Electronic Frontier Fndtn.

http://www.eff.org/~mech/a.html   Online Activist


