Security Basics mailing list archives

Invisible dilemma - ARP flush


From: WALI <hkhasgiwale () gmail com>
Date: Sat, 10 Mar 2007 22:54:54 +0400


We have a 100 Mbps EoATM link between two office buildings, A and B. The servers and the majority of users are in Building A, while a few (about 150) are in Building B. The router at the Building B end is configured for QoS, as voice traffic also crosses the link.

The connection between the two buildings was recently upgraded to 100 Mbps from the initial 10 Mbps.

Once every 2-3 days, users in Building B start to complain about slow network connections to the servers in Building A. The usual ping from B to A, which takes <1 ms, increases to 30-40 ms. Ethereal shows no broadcast traffic. Building A users report no such problems. The 100 Mbps link between the two buildings remains underutilised. To me, it seems to be a problem local to Building B. We have four 48-port L3 switches cascaded with gigabit uplinks to each other, with 2 VLANs and spanning tree enabled on all of them.
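To pin down exactly when the round-trip time jumps from under 1 ms to the 30-40 ms range, the output of a continuously running ping could be logged and classified. The following is only a minimal sketch: the function names, the 10 ms threshold and the 80% fraction are illustrative assumptions, and the regular expression assumes the common "time=0.43 ms" (Linux) and "time<1ms" (Windows) reply formats.

```python
import re
from typing import Iterable, Optional

# Assumed reply formats: "time=0.43 ms" (Linux) or "time<1ms" / "time=35ms" (Windows).
_RTT_RE = re.compile(r"time\s*[=<]\s*(\d+(?:\.\d+)?)\s*ms", re.IGNORECASE)


def parse_rtt_ms(line: str) -> Optional[float]:
    """Return the round-trip time in ms from one ping reply line, or None.

    A "time<1ms" reply is reported as 1.0, i.e. as an upper bound.
    """
    match = _RTT_RE.search(line)
    if match is None:
        return None  # timeout, header or summary line
    return float(match.group(1))


def degraded(rtts_ms: Iterable[Optional[float]],
             threshold_ms: float = 10.0,
             min_fraction: float = 0.8) -> bool:
    """True if at least min_fraction of the valid samples reach threshold_ms.

    threshold_ms and min_fraction are hypothetical defaults chosen to separate
    the normal <1 ms case from the 30-40 ms case described above.
    """
    samples = [r for r in rtts_ms if r is not None]
    if not samples:
        return False
    slow = sum(1 for r in samples if r >= threshold_ms)
    return slow / len(samples) >= min_fraction
```

Feeding each minute's worth of reply lines through `parse_rtt_ms` and `degraded`, and writing a timestamp whenever the result changes, would give a record of onset times that can be compared against switch logs.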

Crazy solution: if I take out any patch cable and re-insert it, the problem gets resolved. If I reset any switch, the problem gets resolved. If I disconnect any uplink cable between the four switches or do an ARP reset through the command line, the problem gets resolved for a couple of hours or even days.

But where could the problem lie?

I have run Nessus and did find quite a few unpatched Windows machines in Building B that had lost their connection with WSUS, so I patched them. I made sure that all the machines are running the latest anti-virus definitions. I sent a mail to all users asking them to get their laptops checked for the latest updates (though few have come back so far).

What else can I do the next time the problem recurs? It's a mystery so far. The switch support provider has upgraded the IOS and says there is nothing wrong with the switch. The VoIP provider maintains their instruments are fine. Is there any free bandwidth monitoring software? What else can help me here apart from routine Wireshark/Ethereal captures?
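On the bandwidth monitoring question, link utilisation can be estimated from two readings of an interface octet counter (for example the standard IF-MIB `ifInOctets`, however it is collected). The sketch below only does the arithmetic; the function name and defaults are illustrative assumptions, and it assumes the counter wrapped at most once between the two readings. A 32-bit octet counter on a fully loaded 100 Mbps link wraps after roughly 343 seconds, so readings would need to be taken more often than that.

```python
def utilisation_percent(octets_start: int,
                        octets_end: int,
                        interval_s: float,
                        link_bps: int = 100_000_000,
                        counter_bits: int = 32) -> float:
    """Estimate link utilisation (%) from two octet-counter readings.

    Assumes the counter wrapped at most once during interval_s.
    link_bps defaults to the 100 Mbps inter-building link.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    modulus = 1 << counter_bits
    # Modular subtraction handles a single counter wrap-around.
    delta_octets = (octets_end - octets_start) % modulus
    return delta_octets * 8 * 100 / (interval_s * link_bps)
```

For example, 12,500,000 octets in 10 seconds on the 100 Mbps link works out to 10% utilisation. Logging this per uplink port alongside the ping times would show whether the slow periods coincide with a traffic spike on any one port.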

Where else could the problem lie?

