Politech mailing list archives
FC: Microsoft blames outage on router misconfiguration, not attack
From: Declan McCullagh <declan () well com>
Date: Thu, 25 Jan 2001 11:10:13 -0500
*******
Microsoft's statement:
http://www.microsoft.com/info/siteaccess.htm
*******

http://www.wired.com/news/technology/0,1282,41412,00.html

How, Why Microsoft Went Down
by Declan McCullagh (declan () wired com)
6:00 a.m. Jan. 25, 2001 PST

Microsoft's websites were offline for up to 23 hours -- the most dramatic snafu to date on the Internet -- because of an equipment misconfiguration, the company says.

A series of problems centering around its collection of routers in Canyon Park, Wash. -- near the company's headquarters -- is what the company blames for knocking out dozens of Microsoft (MSFT) properties including hotmail.com and msn.com, frustrating millions of users and providing acute embarrassment for a company that is offering the promise of unprecedented reliability in marketing its Internet products.

"We screwed up. (Tuesday) night at around 6:30 p.m. Pacific time we made a configuration change to the routers on the DNS network," spokesman Adam Sohn said Wednesday evening.

The company said in a statement that it took nearly a day to determine what was wrong and undo the changes.

Microsoft's sites -- including microsoft.com, slate.com, expedia.com and msnbc.com -- started to work properly again at about 4:30 p.m., PST, Wednesday. Media Metrix reports that the combined properties, not including news sites, received 54 million unique visitors in December.

Technical experts blame Microsoft's design decisions for exacerbating its woes.

All the affected Microsoft sites rely on just four Windows servers, located in the company's Canyon Park data center, to forward users to the right destination via the Domain Name System (DNS).

Because all four DNS servers -- which translate names like microsoft.com into numeric addresses such as 207.46.230.218 -- share the same routers, all are vulnerable to hardware glitches or a technician's error.
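
The shared-router weakness described above can be approximated with a rough check on a domain's name server addresses. What follows is only a minimal sketch under stated assumptions: the addresses are hypothetical values from the reserved documentation ranges, not Microsoft's actual servers, and a shared /24 prefix is used as a crude stand-in for "behind the same routers," which it cannot actually prove.

```python
import ipaddress
from collections import defaultdict


def group_by_prefix(addresses, prefix_len=24):
    """Group IPv4 addresses by the network they fall in at prefix_len.

    Assumption: sharing a prefix is only a hint that hosts sit behind the
    same routers; real topology cannot be read from addresses alone.
    """
    groups = defaultdict(list)
    for addr in addresses:
        # strict=False masks off the host bits instead of raising an error
        network = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[str(network)].append(addr)
    return dict(groups)


def shares_single_network(addresses, prefix_len=24):
    """Return True when every address lands in one network."""
    return len(group_by_prefix(addresses, prefix_len)) == 1


# Hypothetical name server addresses (reserved documentation ranges).
colocated = ["192.0.2.10", "192.0.2.11", "192.0.2.12", "192.0.2.13"]
spread = ["192.0.2.10", "198.51.100.10", "203.0.113.10", "192.0.2.11"]

if __name__ == "__main__":
    for label, servers in (("colocated", colocated), ("spread", spread)):
        print(label, shares_single_network(servers), group_by_prefix(servers))
```

A set of servers that all fall in one group, like the `colocated` list, is the arrangement the experts quoted here warn against for large organizations.
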
"Sure, small organizations have their DNS servers located together and there's nothing wrong with that," says Rich Kulawiec, a consultant with 20 years of networking experience. "But national or global organizations should, as standard operating procedure, have their DNS servers on different networks served by different ISPs and running on different operating systems -- Solaris and FreeBSD, or Linux and HPUX -- so as to minimize the threats for DoS attacks, known OS vulnerabilities, and connectivity issues." Some companies already offer supra-reliable DNS to nervous customers worried about downtime. Nominium, a Redwood City, Calif. startup, boasts its has many collections of DNS servers, each with at least two different hardware and OS platforms, and each connected to two different ISPs. "If an entire (Nominium) site fails, the other sites around the world would continue to serve customers' domain data," the company's white paper says. Ultradns.com offers a similar service. "The problem that Microsoft is experiencing once again illustrates the fact that even if you are a technically competent organization, your business is at significant risk without a highly reliable DNS infrastructure," said William Thomas, president and CEO of Nominum. Making matters worse for Microsoft's frantic technicians was that they were racing against time: For efficiency's sake, ISPs, corporations and universities keep caches of the numeric IP addresses of frequently-visited sites. But caches began to expire at different times across the Internet yesterday, which meant Microsoft's properties began to fade, gradually, from public view. [...] ------------------------------------------------------------------------- POLITECH -- Declan McCullagh's politics and technology mailing list You may redistribute this message freely if it remains intact. 
To subscribe, visit http://www.politechbot.com/info/subscribe.html This message is archived at http://www.politechbot.com/ -------------------------------------------------------------------------