Bug #3138

Load balancing / multi wan / fail over not working anymore

Added by Mastah Naleh over 6 years ago. Updated over 6 years ago.

Status: Closed
Priority: High
Assignee: -
Category: Multi-WAN
Target version:
Start date: 08/10/2013
Due date:
% Done: 0%
Estimated time:
Affected Version: 2.1
Affected Architecture:

Description

Summary
Since RC1, my failover doesn't switch back to tier1; it stays on tier2 even when tier1 is back online and has better ping.

Here is my configuration
Gateways
  • WAN (with monitor IP)
  • Opt1 (with monitor IP)

Group
  • Failover (WAN & Opt1) with rules: WAN tier1 (higher priority) / Opt1 tier2 (lower priority) / Packet Loss or High Latency
  • Expected behavior: if WAN fails, switch to Opt1; if WAN comes back, switch back to WAN
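
The expected behavior of this group can be sketched as a small selection rule: use the online gateway with the lowest tier number. This is only an illustration of what the configuration above is supposed to do, not pfSense's actual code; the function name `pick_active_gateway` and the dictionary layout are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical model of a gateway group (not pfSense code):
# gateway name -> (tier, is_online). A lower tier number means higher priority.
GatewayGroup = Dict[str, Tuple[int, bool]]


def pick_active_gateway(group: GatewayGroup) -> Optional[str]:
    """Return the online gateway with the lowest tier, or None if all are down."""
    online = [(tier, name) for name, (tier, up) in group.items() if up]
    if not online:
        return None
    # min() compares tuples element by element, so the lowest tier wins.
    return min(online)[1]


if __name__ == "__main__":
    # WAN down: traffic should move to Opt1.
    print(pick_active_gateway({"WAN": (1, False), "Opt1": (2, True)}))
    # WAN back online: traffic should return to WAN.
    print(pick_active_gateway({"WAN": (1, True), "Opt1": (2, True)}))
```

With this rule, the second call returning WAN is the "switch back" step that is not happening on my install.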

What is really occurring on pfSense
When WAN fails, traffic switches to Opt1, but when WAN comes back online, traffic doesn't switch back to WAN; it stays locked on Opt1.
Also, even when WAN comes back online, I need to restart the apinger service multiple times before the console shows WAN as Online.

History

#1 Updated by Mastah Naleh over 6 years ago

Somehow the message got messed up; here is a better presentation.

Summary

Since RC1, my failover doesn't switch back to tier1; it stays on tier2 even when tier1 is back online and has better ping.

Here is my configuration

Gateways
  • WAN (with monitor IP)
  • Opt1 (with monitor IP)
Group
  • Failover (WAN & Opt1) with rules: WAN tier1 (higher priority) / Opt1 tier2 (lower priority) / Packet Loss or High Latency
  • Expected behavior: if WAN fails, switch to Opt1; if WAN comes back, switch back to WAN

What is really occurring on pfSense

When WAN fails, traffic switches to Opt1, but when WAN comes back online, traffic doesn't switch back to WAN; it stays locked on Opt1.
Also, even when WAN comes back online, I need to restart the apinger service multiple times before the console shows WAN as Online.

#2 Updated by Mastah Naleh over 6 years ago

Might be related to, and an extension of: https://redmine.pfsense.org/issues/3134

#3 Updated by Mastah Naleh over 6 years ago

I'll add a clarification:

The failover works properly when it's the monitor IP that stops responding (i.e. it switches to tier2, then switches back to tier1).

But when the interface goes down (a power surge on the gateway, for example, or a power cycle of the gateway) and then comes back up, the interface is put in "pending". Even once the interface is back online, you need to restart apinger manually to see the interface in a state other than "pending", and even then the group won't switch back to tier1 (WAN in my case); it sticks to tier2 (Opt1 in my case).

Event history n°1:
1) Group tier1 (WAN) / tier2 (Opt1) is using tier1 (WAN) for traffic
2) tier1 (WAN) goes offline (power cycle)
3) tier1 is declared offline by pfSense (the monitor no longer answers pings)
4) The group doesn't switch to tier2 (it should have switched to tier2)
5) tier1 comes back online, but apinger doesn't ping it; tier1 stays in pending until apinger is manually restarted

Event history n°2:
1) pfSense reboots
2) While pfSense is rebooting, tier1 goes offline
3) pfSense is up and running (tier1 is offline)
4) tier1 (WAN) comes back online
5) tier1 is declared as pending
6) The group is using tier2 for traffic
7) Manual restart of apinger
8) tier1 is marked as online
9) The group does not switch to tier1 but sticks to tier2 for traffic (even though the tier1 monitor is responding); traffic should now be passing through tier1 but is not
10) Taking tier2 (Opt1) offline and bringing it back online switches traffic back to tier1 (WAN)

These two scenarios were working properly on RC0 (one of the latest RC0 builds) until I switched to RC1 recently. So something got broken in between.
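
To make the mismatch in event history n°2 explicit, here is a rough sketch that compares the gateway I would expect the group to use with the one that actually carried traffic at steps 6, 9 and 10. The status values are transcribed from the steps above as pfSense reported them; the names `expected_gateway`, `HISTORY_2` and `mismatches` are hypothetical and this is not pfSense code.

```python
from typing import List, Optional, Tuple


def expected_gateway(wan_online: bool, opt1_online: bool) -> Optional[str]:
    """Expected choice for a WAN tier1 / Opt1 tier2 failover group."""
    if wan_online:
        return "WAN"
    if opt1_online:
        return "Opt1"
    return None


# (step number, WAN seen as online?, Opt1 seen as online?, gateway carrying traffic)
# Step 6: WAN is still "pending", so it does not count as online.
# Step 9: WAN is marked online after the manual apinger restart.
# Step 10: Opt1 has been taken offline and brought back online.
HISTORY_2: List[Tuple[int, bool, bool, str]] = [
    (6, False, True, "Opt1"),
    (9, True, True, "Opt1"),
    (10, True, True, "WAN"),
]


def mismatches(history: List[Tuple[int, bool, bool, str]]) -> List[int]:
    """Return the step numbers where the reported gateway differs from the expected one."""
    return [
        step
        for step, wan, opt1, reported in history
        if expected_gateway(wan, opt1) != reported
    ]


if __name__ == "__main__":
    print(mismatches(HISTORY_2))
```

Only step 9 is flagged: WAN is marked online but the group still routes through Opt1.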

#4 Updated by Chris Buechler over 6 years ago

  • Status changed from New to Closed

The cause of this is the same as #3134 (related to the apinger changes in general that we're working on).

#5 Updated by Mastah Naleh over 6 years ago

Thanks.

I had a strong feeling it was related to apinger.
