IDS mailing list archives
Re: IPS Reliability/Availability
From: Mike Smith <geek_brigades () yahoo com>
Date: Sat, 18 Feb 2006 09:45:36 -0800 (PST)
My original message was prompted by a similar thread posted on the TippingPoint list. Please read the original thread, which I compiled from the TippingPoint user list, at the end of this message.

Indeed, ASICs and FPGAs may be able to sustain more throughput, but I honestly think that for lower-bandwidth/high-latency networks, a mature, well-tested *NIX/Intel/RISC product can be more stable than ASIC/FPGA technologies that are still working to iron out their problems. And TippingPoint is an example.

Thanks,
Mike

############# ORIGINAL MESSAGE

Inquiring the TippingPoint User list about hardware problems:

I would like to query the community about any experiences with hardware issues seen with the 2400 series IPSs. We have had 2 of them go bad on us weeks apart. One set of ports on a segment was giving us problems, and the other went into Layer 2 Fallback with a MZDM error. Just trying to get a feel for how the hardware is holding up.

Thanks,
Sr. Network Security Engineer
Enterprise Security
T. Rowe Price

----------------------------------------------------

We have changed our 2400 twice already. First time a HD problem, second time a port problem. The second time, one segment stopped working and another would not forward layer 3 traffic in a SX/LX GBIC combination after the latest TOS upgrade.

XXX XXX
Lead Information Security Officer/Engineer
Super Computing Technology Coordinator
Norfolk State University

----------------------------------------------------

We have 4 of the 2400s at one location and have had 3 RMA'd (one of them replaced twice). All have been within the past 6 months. Problems ranging from:

- device becomes unmanageable but still passes traffic (no https, ssh, console or LCD functionality)
- suspended task errors, which cause about 80% packet loss on all segments
- bad disk

Devices affected are on both copper and fiber (LX and SX) segments.

Another major problem we face is that after a TOS upgrade, the segment interfaces revert to auto/auto speed/duplex settings. This causes issues with our copper segments and makes it very risky to upgrade remote sites with no tech support on-site. It's in the release notes as an upgrade issue but has been outstanding for months now.

Systems Administrator
Avid Technology

----------------------------------------------------

We have 4 of the 400s and 3 RMA occurrences in two years. The units did not fail, but went into a degraded performance state due to a thermal alarm threshold trigger.

We have the same issue with the segments reverting to auto/auto during a TOS upgrade. Hopefully this will finally be corrected with the next release.

Sr Network Analyst
Cooper Cameron

----------------------------------------------------

out of several 2400s and 2+ years we've seen a couple of cpu fans and one disk go bad, but these came at pretty long intervals and weren't particularly surprising. the rmas went fine, even over weekends... it seems each one was better than the one before, so progress in the right direction. i imagine the 5000e's (with their no moving parts) will be even better, and i'd love to see the same non-moving parts spread out to the other platforms like our beloved 2400s

Manager of Security Resources
UNC Chapel Hill

----------------------------------------------------

Our 2400 IPS suffered a problem whereby the unit would drop packets in long TCP sessions. It was not noticed during short file transfers, e.g. most web traffic. But if you tried to FTP lots of data, like our Physics department was doing, it would drop a packet here and there and the TCP retransmission logic was unable to recover. Our only workaround was to drop the box back into Layer 2 Fallback mode and just do bridging.

Unfortunately, when we sent the box back, we never found out the root cause of the device failure. Tsk. Tsk.

Thankfully, our new 5000 IPS has been running smoothly without a hitch.

College of William and Mary
Information Technology - Network Engineering

----------------------------------------------------

We lost one 2400 a few months ago due to thermal failure. The RMA box was DOA. The second RMA worked, but we lost a power supply on that one a few weeks ago. The 2400 does not log events for PS issues.

Senior Network Engineer
WakeMed

----------------------------------------------------

We have had 2 of the 200s for a little over a year, and I'm in the process of an RMA for one of them for a bad disk right now.

Does anybody else find it unacceptable that TippingPoint is unable to process an RMA on a weekend? I confirmed the disk errors on Saturday and was unable to request that a new unit be shipped until Monday, and it looks like the unit won't be in house until tomorrow. Luckily I have a secondary unit, but that's a long time to run with downgraded redundancy.

Network Security Admin III
http://www.elementk.com

#############################

Answer Message from Don Ward, TippingPoint VP Engineering:

From: don_ward () 3com com [mailto:don_ward () 3com com]
Sent: Tuesday, November 08, 2005 6:00 PM
To: Tipping Point Users Group
Cc: Tipping Point Users Group
Subject: RE: [tippingpoint] Issues with 2400 Series

All--

Based on the great feedback to this thread (all honest, all healthy to bring forth), I must share with everyone what TippingPoint has been doing for the past 8 months to address both hardware and software quality issues.

The majority of RMAs in the past 2 years have primarily been the result of faulty HDDs and CPU fans. HDDs have failed for two primary reasons:

1. High levels of ad hoc read/write cycles (wearing drives out)
2. Overwriting non-protected areas of the drive, leading to file corruption, non-bootable drives, and/or drive failure

We have addressed both of the above issues to date as follows:

1. Invoked a scheduled process for writes (via the RAMDISK function, starting in TOS 1.4.2 and beyond) so drives do not wear out as quickly
2. Invoked a HDD patch in 2.1.3.6321 that protects the HDD from file corruption/overwrites, so drives neither report superfluous error messages nor die due to file/boot sector corruption

We have started the process of addressing CPU fan failures (thermal event issues) by phasing out CPU fans (which are prone to fail throughout the industry) and moving to both solid-state drives (removing spinning parts) and heat pipes (removing the need for fans). This change of materials has been instituted in the 5000E product and will phase into other IPS hardware models over the next couple of months.

The variety of software bugs around memory corruption (i.e. non-responsive management access, page faults) that have very frequently resulted in RMAs have been resolved with TOS 2.1.3.6321 as well. Over 700 customers have upgraded to 2.1.3.6321 in the past 5 weeks, and the code has been running very stably to date. We have witnessed a large reduction in the number of RMAs as well.

Best Regards,
/dw
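The scheduled-write fix described in the vendor reply (buffer writes in RAM, then commit them to disk on a schedule rather than ad hoc) can be illustrated with a minimal sketch. This is not TippingPoint's implementation; the class name BatchedLog, the flush_interval_s parameter, and the disk_writes counter are all hypothetical names chosen for illustration, and a plain in-memory list stands in for the RAMDISK.

```python
import time


class BatchedLog:
    """Buffer log records in memory and write them to disk in scheduled batches.

    Hypothetical illustration only: an in-memory list stands in for a RAM disk.
    """

    def __init__(self, path, flush_interval_s=60.0, clock=time.monotonic):
        self.path = path
        self.flush_interval_s = flush_interval_s
        self._clock = clock          # injectable so the schedule can be tested
        self._buffer = []            # records held in RAM between flushes
        self._last_flush = clock()
        self.disk_writes = 0         # number of batched writes issued so far

    def log(self, record):
        """Queue a record; touch the disk only when the interval has elapsed."""
        self._buffer.append(record)
        if self._clock() - self._last_flush >= self.flush_interval_s:
            self.flush()

    def flush(self):
        """Write all buffered records in one append, then reset the timer."""
        if self._buffer:
            with open(self.path, "a", encoding="utf-8") as f:
                f.write("".join(r + "\n" for r in self._buffer))
            self.disk_writes += 1
            self._buffer.clear()
        self._last_flush = self._clock()
```

With a 60-second interval, a burst of a hundred events costs one disk write instead of a hundred. The trade-off is that records still sitting in the buffer are lost if power fails before the next flush.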