Firewall Wizards mailing list archives

FW-1 load balancing


From: Neil Ratzlaff <Neil.Ratzlaff () ucop edu>
Date: Thu, 06 Nov 1997 09:57:14 -0800

PROBLEM:

I am trying to use FireWall-1 (FW 3.0b on Solaris 2.5.1, Sparc20)
to load balance a group of https web servers (type = Other, method =
Round Robin).  I am not using address translation, other than the IP
changes FW-1 makes in the 'Other' method.  The certificate is for a virtual
machine on the same subnet as the servers.  I started with 2 servers and
things worked well.  When I added a third server (H) to the server group,
it began to get the majority of the hits.  When H is not in the group
but still available directly on the WWW, it gets almost no hits.  This is
reproducible: I can move the server in and out of the group and see
the same result every time.  Many of the hits were not coming in via the
load balancing rule.
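
For reference, here is a minimal Python sketch of what plain round robin
should produce with three servers.  It is not FW-1's implementation; the
function name round_robin_hits and the connection count of 300 are
assumptions made for illustration only.

```python
from collections import Counter
from itertools import cycle


def round_robin_hits(servers, n_connections):
    """Hand each new connection to the next server in turn and count hits."""
    picker = cycle(servers)
    return Counter(next(picker) for _ in range(n_connections))


# Hypothetical run: 300 connections over the three addresses in this post.
hits = round_robin_hits(["A.B.C.4", "A.B.C.11", "A.B.C.1"], 300)
for server, count in sorted(hits.items()):
    print(server, count)
```

With an even rotation each server should see one third of the
connections, which is why H taking the majority stands out.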

I have run snoop on the external interface of the firewall, and see nothing
coming in addressed to the third server.  All inbound packets are addressed
to the virtual server for load balancing, yet some connections bypass the
load balancing rule even though they hit the virtual server.

A.B.C.4 and A.B.C.11 are the first two addresses, and A.B.C.1 is the third.
I began to wonder whether FW-1 treats A.B.C.1 as a special address for
load balancing.


SOLUTION:

After trying many things, I changed the name and IP address of the third
server.  When I placed it back into the server group, everything seemed to
work fine, and I decided that the IP address was indeed the problem.

But investigation by the application people showed that the cgi script on
the third server had retained the original machine name, not the logical
server name the other two servers used.  If this name was returned to
clients and cached, it could explain some of the behavior we saw.  Once no
machine of that name was still around, the load problems disappeared.  This
does not explain why snoop showed no incoming packets on the external
interface addressed to the third server during the times it was getting
more than its share of hits.  If I change the address back and disrupt
their service, I get lynched, so I can't test this.
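
The cached-name hypothesis can be sketched as a toy simulation.  This is
only an illustration under stated assumptions: the function simulate, the
server labels S1/S2/H, and the client and request counts are all
hypothetical, and the model says nothing about how FW-1 itself behaves.
It assumes each client starts at the virtual server, and that once a
reply comes from the server that leaks its real name, the client sends
every later request straight to that machine.

```python
from collections import Counter
from itertools import cycle


def simulate(n_clients, requests_per_client, servers, leaky=None):
    """Count hits per server.

    Requests to the virtual server are balanced round robin.  If a reply
    comes from `leaky`, that client caches the real host name and sends
    its remaining requests directly to that host (assumption).
    """
    picker = cycle(servers)
    hits = Counter()
    for _ in range(n_clients):
        pinned = None  # real host name this client has cached, if any
        for _ in range(requests_per_client):
            target = pinned if pinned else next(picker)
            hits[target] += 1
            if target == leaky:
                pinned = leaky
    return hits


servers = ["S1", "S2", "H"]
baseline = simulate(30, 10, servers)            # no name leak
skewed = simulate(30, 10, servers, leaky="H")   # H leaks its own name
print("baseline:", dict(baseline))
print("skewed:  ", dict(skewed))
```

In this toy setup the baseline gives each server 100 of 300 requests,
while the leaky case gives H 240 of 300.  That matches the skew in
direction only.  It does not account for the snoop result, since direct
requests to H would have shown up on the external interface.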

An additional item I noticed is that load balancing appeared to work (H got
1/3 of the hits) after I put the third server into the server group and
installed the database.  But H was never hit under the load balancing
rule, only under a later, more general accept rule, although it kept
the 1/3 ratio it should have had under load balancing.  After I installed
the policy, H was hit 1/3 of the time under the load balancing rule.

At least it now works, and I hope eventually to test the components again
and determine exactly why.  I thank the people who offered suggestions.

Neil





