Interesting People mailing list archives

IP: Defining the line between hacking and web surfing... -- from


From: Dave Farber <farber () cis upenn edu>
Date: Wed, 08 Jul 1998 07:39:18 -0500

Date: Wed, 1 Jul 1998 19:49:14 -0700
From: Eli Goldberg <eli () prometheus-music com>
Subject: Defining the line between hacking and web surfing...


I've recently been faced with a very curious intellectual dilemma: at what
point is the web browsing that we do potentially --- and unknowingly ---
crossing the line into illegal hacking?


RISKS has explored this topic before (for example, alternate uses of 
robots.txt files, such as finding interesting stuff like 
http://www.cnn.com/webmaster_logs/). 
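
To make the robots.txt point concrete: the file is itself publicly readable, so every path it asks crawlers to skip is also advertised to any human who reads it. The sketch below is only an illustration under stated assumptions. SAMPLE_ROBOTS_TXT is made-up content (it borrows the /webmaster_logs/ path from the URL above, and /private/ is hypothetical), and disallowed_paths is a hypothetical helper name, not part of any library.

```python
# Sketch: list the paths a robots.txt file asks crawlers to avoid.
# SAMPLE_ROBOTS_TXT is invented for illustration; it is not any real site's file.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /webmaster_logs/
Disallow: /private/   # hypothetical path
Allow: /public/
"""


def disallowed_paths(robots_txt):
    """Return the non-empty Disallow paths, in order of appearance."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


print(disallowed_paths(SAMPLE_ROBOTS_TXT))
```

A webmaster could run something like this against their own robots.txt to see what the file reveals to a curious reader.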


Here are two recent encounters that have left me rather perplexed:


Case #1: A lot of directories on AFS (a network file system popularized 
by CMU in the 1980s) have been starting to appear in recent months as 
publicly viewable HTTP directories, without the knowledge of their 
owners. (In many cases, the directory owners have since graduated or 
moved to a staff position, leaving countless long-forgotten files and 
E-mail archives in their home directory.)


On two occasions in the past month (one at MIT, and one at CMU), 
I've performed ordinary web searches using ordinary search engines, and 
ended up finding private documents belonging to friends, with personal 
and confidential information. 


In each case, I alerted the friend right away, and they promptly had 
the permissions changed and the offending material removed. 
(Now, removing the summaries from a dozen search engines for hundreds of 
pages will be another matter. ;)


Could a tenuous argument perhaps be constructed that an individual 
reading these private documents --- after realizing that they were not 
meant to be publicly posted --- was hacking?




Case #2: A *lot* of webmasters omit index.html files in critical 
directories, or perhaps forget to configure their servers to deny 
directory listings for HTTP directories that lack index.html files. 
This renders any casual web surfer trivially able to surf the actual 
directory tree of their web site --- including their CGI directory --- 
and associated private data files.
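
For a webmaster who wants to check their own site for this, a rough heuristic is to fetch a directory URL and see whether the page that comes back looks like a server-generated listing. The sketch below assumes the listing carries an "Index of /path" title, as Apache-style auto-generated listings commonly do; other servers may format theirs differently. The function name and the two sample pages are hypothetical, and the check works on HTML text you have already fetched.

```python
import re

# Heuristic only: assumes the auto-generated listing uses an
# "Index of /path" title. A server configured differently will not match.
_INDEX_TITLE = re.compile(r"<title>\s*Index of\s+/", re.IGNORECASE)


def looks_like_directory_listing(html):
    """Return True if the HTML resembles an auto-generated directory index."""
    return bool(_INDEX_TITLE.search(html))


# Made-up sample pages for illustration.
listing = "<html><head><title>Index of /cgi-bin</title></head><body>...</body></html>"
homepage = "<html><head><title>Welcome</title></head><body>Hello</body></html>"

print(looks_like_directory_listing(listing))
print(looks_like_directory_listing(homepage))
```

A match only suggests that listings are enabled for that directory; the real fix is on the server side, either an index.html in the directory or a configuration that denies listings.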


I've encountered this twice tonight --- once while attempting to 
post a housing vacancy on a local university's housing list (the system was 
down, and I was curious why ;), and a second time while browsing a web 
site of a music publisher whose works I have enjoyed in the past.


In the latter case, I immediately stumbled upon full archives of 
this company's (unprotected) customer orders, web logs, & associated 
information, and other information that I believe any company should 
reasonably consider private. Ouch! 


Let's say I went ahead and read those files. 


Say I was curious to learn more about the company's customers' 
buying habits, and had no malicious or criminal intent. Would 
this be breaking the law? 


On one hand, the webmaster *probably* didn't intend for the 
information to be public. Does a difference truly exist between 
exploiting known configuration errors in web sites, and exploiting known 
configuration errors in networked UNIX systems to access information not 
meant to be public?


On the other hand, it doesn't matter what they intend. They *have* 
made it public, and they've just placed it on a server where any bozo 
with a web browser can get to it just by typing a regular URL; how could 
one be breaking the law by viewing what they've already placed in a 
public area for viewing? 


(Certainly, I never signed an agreement to limit my use of the web site to
merely clicking on links, and have every right to type whatever I'd like
into the URL field!)


Now, let's say a competitor to the company in question happened to stumble
upon the same URL and data. What, then?

