nanog mailing list archives

Re: TCP Window Scaling issue


From: Zach Hill <zach.reborn () gmail com>
Date: Thu, 24 Jul 2014 14:33:56 -0400

*First round of packet captures*

Here are the snippets from a packet capture.

First is the SYN from Server A to Server B: http://i.imgur.com/E5cu4ev.png
Here is the SYN from Server B back: http://i.imgur.com/RRSAl8G.png

Second test, from Server C to Server B. First is the SYN from Server C to
Server B: http://i.imgur.com/Jc2K6bT.png and the SYN from Server B to Server
C: http://i.imgur.com/pbvx9jJ.png

I guess I'm at a loss as to why in scenario 1 neither side is sending the
window scaling option at all. Is it because Server A isn't offering it in
the first place?
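
For anyone who wants to check this from raw bytes rather than screenshots,
here is a minimal sketch of pulling the window scale option (kind 3) out of
a TCP header. As far as I understand RFC 7323, scaling is only used when
both SYNs carry the option, so a SYN without it from Server A would explain
Server B not sending it either. The build_syn helper, the ports, and the
option bytes are made up purely for the demonstration; they are not taken
from the captures above.

```python
import struct


def tcp_window_scale(tcp_header: bytes):
    """Return the window scale shift count in a TCP header, or None if absent."""
    data_offset = (tcp_header[12] >> 4) * 4  # header length in bytes
    options = tcp_header[20:data_offset]
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:  # End of Option List
            break
        if kind == 1:  # NOP is a single byte with no length field
            i += 1
            continue
        if i + 1 >= len(options):
            break  # truncated option
        length = options[i + 1]
        if length < 2:
            break  # malformed option, stop parsing
        if kind == 3 and length == 3 and i + 2 < len(options):
            return options[i + 2]
        i += length
    return None


def build_syn(options: bytes) -> bytes:
    """Build a synthetic SYN header (hypothetical ports and sequence number)."""
    padded = options + b"\x00" * (-len(options) % 4)
    offset_words = (20 + len(padded)) // 4
    fixed = struct.pack("!HHIIBBHHH", 40000, 445, 1, 0,
                        offset_words << 4, 0x02, 8192, 0, 0)
    return fixed + padded


# MSS 1460, NOP, window scale shift 8, NOP, NOP, SACK permitted
with_ws = build_syn(bytes([2, 4, 0x05, 0xB4, 1, 3, 3, 8, 1, 1, 4, 2]))
# Same SYN but without the window scale option
without_ws = build_syn(bytes([2, 4, 0x05, 0xB4, 1, 1, 4, 2]))

print(tcp_window_scale(with_ws))
print(tcp_window_scale(without_ws))
```

Feeding it the TCP header bytes of each SYN and SYN-ACK from both capture
points should show quickly which side never offers the option.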

I'm in the process of setting up a VM that I can SPAN for a capture from
the source of Server A. This will allow me to compare packets at each side.

*Second round of packet captures*

Now I just don't even know what is going on...

Is this quantum physics now? Did the state just change by me looking at it?
Here are some new screencaps. The only change that's been made is a SPAN
port enabled on the Nexus 7k, with Server A as the source and my new
tcpdump capture server as the destination.

Site 1 captures: 1 http://i.imgur.com/K5r7FaG.png 2
http://i.imgur.com/wfnfLyi.png

Site 2 capture: 1 http://i.imgur.com/vpY2lnh.png 2
http://i.imgur.com/UyL3V6L.png

Now they are both communicating a window size. Speed is still slow, at
400-450 KBps.
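
As a sanity check on that number: a single TCP flow cannot move more than
one window of data per round trip, so throughput is bounded by window / RTT.
The sketch below is only back-of-the-envelope arithmetic. The 65535-byte
window (the largest possible without scaling) and 1 KB = 1000 bytes are my
assumptions, and I have not measured the RTT between the sites.

```python
def max_throughput_bytes_per_s(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on a single flow's throughput: one window per round trip."""
    return window_bytes / rtt_s


def implied_rtt_s(window_bytes: float, throughput_bytes_per_s: float) -> float:
    """RTT at which the given window would cap the flow at the given rate."""
    return window_bytes / throughput_bytes_per_s


UNSCALED_MAX_WINDOW = 65535   # assumption: largest window without scaling
OBSERVED_RATE = 425_000       # assumption: midpoint of 400-450 KBps

rtt = implied_rtt_s(UNSCALED_MAX_WINDOW, OBSERVED_RATE)
print(f"implied RTT: {rtt * 1000:.1f} ms")
```

If the real RTT across the MPLS path is close to the implied figure, the
transfer would still be behaving as if it were limited to an unscaled
window, whatever the SYNs now advertise.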


On Thu, Jul 24, 2014 at 1:23 PM, Matthew Petach <mpetach () netflight com>
wrote:




On Thu, Jul 24, 2014 at 9:51 AM, Zach Hill <zach.reborn () gmail com> wrote:

Also just to reiterate I would lean more heavily on something fishy in
the WAN cloud if all traffic from Site 1 to Site 2 were not seeing tcp
window scaling properly, however it's only for Server A that is seeing
this. Server A is able to properly TCP window scale for any local traffic.


Remember, the WAN cloud is just that, a cloud;
it's not likely to be a single link underneath it all;
so one bad link/bad port/bad device in the cloud
can affect just a sub-portion of the traffic, depending
on the 5-tuple hashing that takes place.

An interesting test would be to give server A
a different address (secondary address should be
fine, all you need to do is source packets from a
different source address) and see if your scaling
suddenly reappears.  If it does, it's definitely down
to the 5-tuple hashing happening within The Cloud(tm).
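
To illustrate why a different source address can land on a different path,
here is a toy sketch of 5-tuple hashing across parallel links. Real gear
uses vendor-specific hash functions and inputs, so the SHA-256 hash, the
link count, and the 192.0.2.x / 198.51.100.x addresses here are all
assumptions for illustration only.

```python
import hashlib


def pick_link(src_ip, dst_ip, proto, src_port, dst_port, n_links):
    """Map a flow's 5-tuple to one of n_links parallel paths (toy hash)."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_links


def links_by_source(sources, dst_ip, proto, src_port, dst_port, n_links):
    """Show which link each candidate source address would hash onto."""
    return {s: pick_link(s, dst_ip, proto, src_port, dst_port, n_links)
            for s in sources}


# Hypothetical addresses from the documentation ranges, 4 parallel links
sources = [f"192.0.2.{i}" for i in range(1, 65)]
mapping = links_by_source(sources, "198.51.100.10", 6, 40000, 445, 4)

for src in sources[:4]:
    print(src, "-> link", mapping[src])
```

The same 5-tuple always hashes to the same link, which is why one flow can
be consistently stuck on a bad path while its neighbours are fine.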

Matt



On Thu, Jul 24, 2014 at 12:47 PM, Zach Hill <zach.reborn () gmail com>
wrote:

Hi Michael,

Let me set up another packet capture at each side to see if the initial
packets are being modified at all.

Thanks,


On Thu, Jul 24, 2014 at 12:39 PM, Michael Brown <michael () supermathie net>
wrote:

On 14-07-24 12:30 PM, Zach Hill wrote:
Hi Tony. No firewall in the way.

Physical flow is as below.

Server A -> Nexus 7k -> 3845 router -> Sprint MPLS -> 3845 router ->
Cisco 3750x stack -> Server B

I blame the cloud.

Dump the actual packets as they leave Server A and arrive at Server B
(and vice-versa!). Does it get modified en route?
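
A rough sketch of that comparison, assuming you have extracted the raw TCP
option bytes of the same SYN at both capture points. The sample bytes are
invented to show a window scale option overwritten with NOPs in transit;
they are not from the actual captures.

```python
def parse_options(raw: bytes) -> dict:
    """Parse raw TCP option bytes into {kind: value_bytes}."""
    opts = {}
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == 0:  # End of Option List
            break
        if kind == 1:  # NOP, single byte
            i += 1
            continue
        if i + 1 >= len(raw) or raw[i + 1] < 2:
            break  # truncated or malformed option
        length = raw[i + 1]
        opts[kind] = raw[i + 2:i + length]
        i += length
    return opts


def diff_options(sent: bytes, received: bytes) -> dict:
    """Return {kind: (value_as_sent, value_as_received)} for changed options."""
    a, b = parse_options(sent), parse_options(received)
    return {kind: (a.get(kind), b.get(kind))
            for kind in sorted(set(a) | set(b))
            if a.get(kind) != b.get(kind)}


# Hypothetical: MSS 1460, NOP, window scale 8, NOP, NOP, SACK permitted
sent = bytes([2, 4, 0x05, 0xB4, 1, 3, 3, 8, 1, 1, 4, 2])
# Hypothetical: same SYN on arrival, window scale replaced by NOPs
received = bytes([2, 4, 0x05, 0xB4, 1, 1, 1, 1, 1, 1, 4, 2])

print(diff_options(sent, received))
```

An empty result means the options arrived untouched; an entry for kind 3
means the window scale option was changed or dropped somewhere in between.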

M.

--
Michael Brown            | The true sysadmin does not adjust his behaviour
Systems Administrator    | to fit the machine.  He adjusts the machine
michael () supermathie net  | until it behaves properly.  With a hammer,
                         | if necessary.  - Brian







