nanog mailing list archives

Re: Question about EX - SRX redundancy


From: Anurag Bhatia <me () anuragbhatia com>
Date: Sun, 14 Jun 2015 01:09:44 +0530

Hello everyone


Just thought I'd post an update here: I was able to get it working as
needed. Some quick points on building redundancy between Juniper EX and
SRX devices:


   1. Virtual chassis in EX is very different from clustering in SRX, and I
   did not realize that initially.
   2. The key difference is that in a virtual chassis both devices run in a
   stacked config and act as a single device, using the routing engine of
   the primary/master EX.
   3. In the case of SRX, only one device forwards traffic at a time, and
   the ports of the other SRX (slave) do not accept traffic at all as long
   as it sees the master is up via heartbeat.



So keeping the above points in mind, I ran 6 cables between the EX and
SRX. On the SRX side all 6 ports belong to the same reth bundle (3 on SRX-0
and 3 on SRX-1). On the EX side the configuration uses two ae bundles. If
we use the same ae bundle for all 6 ports, a problem comes up: a percentage
of traffic will hit SRX-1 (slave/secondary) and be dropped, which is not
desired. Hence we need to make two ae bundles, say ae1 and ae2. One bundle
goes towards one SRX (say master SRX-0) and the other bundle goes towards
the other, SRX-1. Ports in ae1 and ae2 can be distributed across multiple
EX members to ensure redundancy.
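
To illustrate why a single ae bundle across both SRX nodes loses traffic,
here is a rough Python sketch. It is only a model, not how the EX actually
hashes: the port labels, the use of crc32 as a stand-in hash, and the 1000
synthetic flows are all assumptions made for illustration. It also assumes
that, with two bundles, the switch learns the reth MAC only on the bundle
facing the primary node.

```python
import zlib

# Hypothetical cabling: (EX port label, SRX node the cable lands on).
LINKS = [
    ("ex0-a", "SRX-0"), ("ex0-b", "SRX-0"), ("ex1-a", "SRX-0"),
    ("ex1-b", "SRX-1"), ("ex1-c", "SRX-1"), ("ex0-c", "SRX-1"),
]


def pick_member(flow_id, members):
    """Stand-in for LAG hashing: map a flow onto one member link."""
    return members[zlib.crc32(flow_id.encode()) % len(members)]


def dropped_fraction(flows, members, primary="SRX-0"):
    """Fraction of flows whose chosen link lands on the non-primary node."""
    dropped = sum(1 for f in flows if pick_member(f, members)[1] != primary)
    return dropped / len(flows)


flows = [f"flow-{i}" for i in range(1000)]

# Case 1: one ae bundle holding all six ports.
single_ae = dropped_fraction(flows, LINKS)

# Case 2: two bundles; only the bundle facing the primary carries traffic
# (assumption: the reth MAC is learned only on that bundle).
ae1 = [link for link in LINKS if link[1] == "SRX-0"]
split_ae = dropped_fraction(flows, ae1)

print(f"single ae: {single_ae:.0%} dropped, split ae: {split_ae:.0%} dropped")
```

In this model the single-bundle case sends some share of flows to the
secondary node, while the split-bundle case sends none there.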





So the setup can work as:


EX-0 >> 2 patches >> SRX-0
EX-1 >> 2 patches >> SRX-1

EX-0 >> 1 patch >> SRX-1
EX-1 >> 1 patch >> SRX-0


Now all ports on the EX towards SRX-0 go in ae1, and all ports towards
SRX-1 go in ae2.
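
As a quick sanity check of that cabling, the following Python sketch counts
how many member links each ae bundle keeps when one EX member fails. It is
just a model of the patch list above; the function name and data layout are
made up for illustration.

```python
from collections import Counter

# Cabling from the list above: (EX member, SRX node), one tuple per patch.
PATCHES = (
    [("EX-0", "SRX-0")] * 2 + [("EX-1", "SRX-1")] * 2
    + [("EX-0", "SRX-1")] + [("EX-1", "SRX-0")]
)

# Everything facing SRX-0 is in ae1, everything facing SRX-1 is in ae2.
AE_FOR_SRX = {"SRX-0": "ae1", "SRX-1": "ae2"}


def surviving_members(failed_ex=None):
    """Count the links left in each ae bundle after one EX member fails."""
    counts = Counter({ae: 0 for ae in AE_FOR_SRX.values()})
    for ex, srx in PATCHES:
        if ex != failed_ex:
            counts[AE_FOR_SRX[srx]] += 1
    return dict(counts)


for failed in (None, "EX-0", "EX-1"):
    print(failed, surviving_members(failed))
```

In this model either EX can fail and both bundles keep at least one member,
so both SRX nodes stay reachable. The bundle that had two patches on the
failed EX drops to a single link.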



Thanks everyone for help and inputs. Have a good weekend ahead!



On Fri, Apr 3, 2015 at 3:52 AM, Hugo Slabbert <hugo () slabnet com> wrote:

I just want to confirm your setup.

The "criss-cross" setup you were describing is different from what I
described.

You listed:

> EX0 (ae1) >> Two Patches to SRX0 (reth1)
> EX0 (ae1) >> One patch to SRX1 (reth1)
>
> EX1 (ae2) >> Two Patches to SRX1 (reth1)
> EX1 (ae2) >> One patch to SRX0 (reth1)


...meaning that your AEs cannot survive losing either one of the EX VC
members, and you're splitting each AE's connectivity across the two SRX
chassis cluster members.  You need to dedicate an AE to an SRX chassis
cluster member.

IOW: ae18 should have one LAG member on EX0 and one member on EX1, and
both of those physical ports go to SRX0.  Likewise, ae20 should have one
LAG member on EX0 and one member on EX1, and both of those physical ports
go to SRX1.

When you shut one of the AEs (e.g. ae18) in the setup I describe, you
*will* lose connectivity to its corresponding SRX, as those are
fate-sharing.  You would need to configure interface monitoring on the
chassis cluster to flip over the primary to 2nd SRX in order to survive
that, since the second AE (ae20) that is tied to the 2nd SRX is still up.

Your failure modes are:

e.g. 1: lose an EX, you lose the throughput that's being contributed to
the AE by that VC member's ports, but both SRXs remain available and the
primary shouldn't flip (provided your node priorities and
interface-monitoring weights are set accordingly).

e.g. 2: shut an AE (which spans both EX VC members), one SRX goes dark
since you've killed the AE that's dedicated to it, and the primary will
need to flip (either through interface monitoring or manually) in order for
the setup to remain online.
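
The two failure modes above can be sketched in Python. This is a toy model
under stated assumptions, not a Junos configuration: it assumes interface
monitoring works by starting each redundancy group at a threshold of 255 and
subtracting the weight of every failed monitored interface, with failover
once the threshold is exhausted. The weight of 128 per link and all names
here are illustrative.

```python
# Assumption: a redundancy group starts with a threshold of 255 and each
# failed monitored interface subtracts its configured weight.
THRESHOLD = 255

# Hypothetical monitored links per SRX node: (EX VC member, weight).
MONITORED = {
    "SRX0": [("EX0", 128), ("EX1", 128)],  # members of ae18
    "SRX1": [("EX0", 128), ("EX1", 128)],  # members of ae20
}


def node_fails_over(node, failed_ex=None, shut_node_ae=None):
    """True if the links lost on `node` exhaust the threshold.

    failed_ex: EX VC member that has gone down, if any.
    shut_node_ae: SRX node whose dedicated AE has been shut, if any.
    """
    lost = sum(
        weight
        for ex, weight in MONITORED[node]
        if ex == failed_ex or node == shut_node_ae
    )
    return THRESHOLD - lost <= 0


scenarios = {
    "lose EX0": {"failed_ex": "EX0"},
    "shut ae18": {"shut_node_ae": "SRX0"},
}
for name, kwargs in scenarios.items():
    print(name, {n: node_fails_over(n, **kwargs) for n in MONITORED})
```

With these weights, losing one EX removes only one link per SRX and neither
node trips, while shutting ae18 removes both SRX0 links and trips SRX0 only.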

--
Hugo


On Fri 2015-Apr-03 02:41:35 +0530, Anurag Bhatia <me () anuragbhatia com>
wrote:

 Hi


Tried exactly that. Note: it's ae18 and ae20 on the EX side and reth4 on
the SRX side.


It worked initially, but when I took down ae18 (i.e. ae18 is disabled), on
ae20 I am now getting:

show interfaces ae20
Physical interface: ae20, Enabled, Physical link is Up
 Interface index: 533, SNMP ifIndex: 924
 Link-level type: Ethernet, MTU: 1514, Speed: 2Gbps, BPDU Error: None, MAC-REWRITE Error: None, Loopback: Disabled,
 Source filtering: Disabled, Flow control: Disabled, Minimum links needed: 1, Minimum bandwidth needed: 0



on reth4 on SRX I am getting:

show interfaces reth4
Physical interface: reth4, Enabled, Physical link is Down
 Interface index: 132, SNMP ifIndex: 696



Any idea why? All physical ports are up (none is shut) and the only thing
I shut is one of the ae bundles. Also, if I disable the associated physical
ports rather than disabling ae18, the behavior is just the same, i.e. reth4
goes down.




Thanks for your time and help!



On Fri, Apr 3, 2015 at 12:25 AM, Hugo Slabbert <hugo () slabnet com> wrote:

 Putting the EXs in a VC and splitting your AEs across the 2x VC members
takes care of that.

EXVC  (ae1)  >> Two Patches to SRX0 (reth1)
EXVC  (ae2)  >> Two Patches to SRX1 (reth1)

...where EXVC is a VC composed of EX0 and EX1, and ae1 and ae2 both have
one member interface from each VC member.

In a failure of EX0 or EX1, your throughput on ae1 and ae2 halves as they
each lose a LAG member, but both SRX0 and SRX1 are still reachable.

--
Hugo


On Thu 2015-Apr-02 23:50:46 +0530, Anurag Bhatia <me () anuragbhatia com>
wrote:

 Hi




Yes,


SRX0 is connected to EX0 and SRX1 is connected to EX1 (only). Thus either
pair 0 will work or pair 1 will work. I wish criss-crossing worked, since
then failure of one EX would still have left both SRXs available.


In the current worst-case scenario, failure of EX0 and SRX1 can cause a
full outage.



Thanks.

On Thu, Apr 2, 2015 at 9:21 PM, Hugo Slabbert <hugo () slabnet com> wrote:

In:

> EX0 (ae1) >> Two Patches to SRX0 (reth1)
> EX1 (ae2) >> Two Patches to SRX1 (reth1)

with:

> that if one EX goes down then I cannot make use of other corresponding
> SRX.

Do you mean that e.g. if SRX0 is the chassis cluster primary and EX0 goes
down, then you can't use SRX0, but you would like to be able to survive EX0
going down *without* failing over the SRX chassis cluster to SRX1?

--
Hugo


On Thu 2015-Apr-02 20:47:03 +0530, Anurag Bhatia <me () anuragbhatia com>
wrote:

 Hi



I thought cross-chassis LAG is supported by the use of reth bundles at the
SRX end. I read this is basically the major difference between reth and ae
bundles on SRX.


The interesting factor here is that ae bundles can spread across multiple
EX chassis in a virtual chassis environment, but this cannot be the case
with ae bundles on SRX.




Thanks.

On Thu, Apr 2, 2015 at 7:59 PM, Bill Blackford <bblackford () gmail com>
wrote:

It's my understanding that a cross chassis LAG is not supported. If there
is a way, I'm not aware of it. I'm running the same set up as your working
example in my locations and for now, this suits my requirements.

Sent from my iPhone

On Apr 2, 2015, at 07:12, Anurag Bhatia <me () anuragbhatia com>
wrote:

Hello everyone!




I have two Juniper EX series switches (in a virtual chassis) and two SRX
devices in native clustering.


I am trying to build highly available redundancy between them with at
least 2Gbps capacity at all times, but I am kind of failing. I followed
Juniper's official page here
<http://kb.juniper.net/InfoCenter/index?page=content&id=KB22474>
as well as this detailed forum link here
<http://forums.juniper.net/t5/SRX-Services-Gateway/Best-way-of-redundancy-between-SRX-and-EX/td-p/181365>.


I wish to have a setup where the devices are connected criss-cross.
Following the documentation, I get two ae bundles on the EX side and one
single reth bundle on the SRX side. Both ae bundles on the EX side have
identical configuration, and both ae interfaces are members of the VLAN.


If I do not go for criss cross connectivity like this:



EX0  (ae1) >> Two Patches to SRX0 (reth1)
EX1   (ae2)  >> Two Patches to SRX1 (reth1)


Then it all works well and redundancy works fine. In this case, as long as
1 out of 4 patches is connected, connectivity stays live, but this has the
trade-off that if one EX goes down then I cannot make use of the
corresponding SRX.

If I do criss-cross connectivity, something like:


EX0 (ae1) >> Two Patches to SRX0 (reth1)
EX0 (ae1) >> One patch to SRX1 (reth1)

EX1 (ae2)  >> Two Patches to SRX1 (reth1)
EX1 (ae2)  >> One patch to SRX0 (reth1)


In this config the system behaves very oddly, with one ae bundle (and its
corresponding physical ports) working well while failover to the other ae
bundle fails completely.



I was wondering if someone can point me in the right direction here.




Appreciate your time and help!





--


Anurag Bhatia
anuragbhatia.com

Linkedin <http://in.linkedin.com/in/anuragbhatia21> | Twitter
<https://twitter.com/anurag_bhatia>
Skype: anuragbhatia.com

PGP Key Fingerprint: 3115 677D 2E94 B696 651B 870C C06D D524 245E 58E2





