Feature #855

open

Ability to selectively kill states on gateways recovery

Added by Chris Buechler over 11 years ago. Updated 2 months ago.

Status: New
Priority: Normal
Assignee: -
Category: Multi-WAN
Target version:
Start date: 08/27/2010
Due date:
% Done: 0%
Estimated time:
Plus Target Version: Plus-Next
Release Notes: Default

Description

The current practice of killing all states on a connection when that connection goes down is fine for the majority of scenarios, but some users would like to see additional options. First, the ability to optionally kill states so that traffic fails back once the original connection recovers. I suspect there may be other desired scenarios as well, which can be added here as they're encountered.


Related issues

Related to Feature #12807: Clear Active Secondary WAN Connections | Duplicate

Related to Feature #12092: Utilize new ``pfctl`` abilities to kill states | Feedback | Jim Pingle | 06/29/2021

Actions #1

Updated by xavier Lemaire about 6 years ago

Hi Chris,

As I am crazy, I am testing this change in /etc/rc.gateway_alarm.
It's not very clean, but I hope it's going to do the job.

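# $1 is the gateway name; bail out if it is missing.
GW="$1"

if [ -z "$GW" ]; then
    exit 1
fi

# $3 = 0 is treated here as "alarm cleared", i.e. the gateway is back up.
if [ "$3" = "0" ]; then
    # List the dpinger processes for every gateway except this one and take
    # field 18 of the ps output, which is expected to hold an address from the
    # dpinger command line. The field position is fragile and depends on the
    # exact dpinger arguments.
    for i in $(ps -aux | grep dpinger | grep -v grep | grep -v "$GW" | awk '{print $18}'); do
        # Kill all states originating from that address.
        /sbin/pfctl -k "${i}"
    done
fi

/usr/local/sbin/pfSctl \
    -c "service reload dyndns ${GW}" \
    -c "service reload ipsecdns" \
    -c "service reload openvpn ${GW}" \
    -c "filter reload" >/dev/null 2>&1

exit $?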

Actions #2

Updated by Julien REVERT almost 6 years ago

Is it still planned to have "states killing" on gateway failback?

I have the issue that UDP connections of IP phones or OpenVPN clients remain on the backup WAN when the master WAN is back.

The main issue is at pfSense startup: if the master WAN comes up after the backup WAN, all IP phones and OpenVPN clients are registered on the backup WAN and keep this configuration until I flush the states manually.

How can I work around this issue until there is an option like "flush states on gateway back"?

Thanks.

Actions #3

Updated by James M almost 6 years ago

I agree with Julien, something like this is needed for state failback after a connection is down.

Actions #4

Updated by → luckman212 almost 6 years ago

This would be especially useful for VOIP, where there are often frequent registrations or other SIP traffic that keeps the states locked to the failover WAN even after the primary has come back online. This results in excess usage charges and also poor quality calls where e.g. the failover line is a 4G metered connection. So I would love to see this as well.

I just noticed that this feature request is 6 years old. :/

Actions #5

Updated by Travis McMurry almost 5 years ago

As echoed by others, I'm seeing the same thing for VOIP and other devices which auto negotiate VPN tunnels which maintain constant connectivity - Femtocells/Microcells, Meraki branded equipment...

It's also a cost concern, as the failover options I use tend to be OOB/4G/LTE. If devices in a failover situation stay connected to a metered connection, that incurs extra cost for unnecessarily consumed bandwidth.

As of 8/3/2017, it's now a 7 year old feature request. Nudge. :-)

Actions #6

Updated by Jim Pingle over 3 years ago

  • Target version changed from Future to 48

See also: #7605

Actions #7

Updated by Andrew Bucklin over 3 years ago

+1 I'm surprised this isn't already a feature. I noticed this today when our primary connection came back online, but our off-site data backups (which traverse an OpenVPN client connection) were still going over the secondary WAN link, which is 500x slower than the primary WAN. Thank you!

Actions #8

Updated by Jim Pingle about 3 years ago

  • Target version changed from 48 to 2.5.0
Actions #9

Updated by Marc H almost 2 years ago

+1 - this is a badly needed feature with multi WAN where one connection is truly a "backup only" connection. High cost metered LTE, etc... We need an option to force states to fail back to the primary WAN when it is available. Thanks.

Actions #10

Updated by Raffi T over 1 year ago

+1 I haven't really been hurt by this until recently while performing a big backup job to the cloud. Failover occurred briefly but there was still a significant amount of data usage on the metered 4G backup connection well after the event. I had to disable the gateway monitoring action while performing this backup. It says this was requested 10 years ago? Ouch, not enough people requesting it?

Actions #11

Updated by Anonymous over 1 year ago

  • Assignee set to Renato Botelho
Actions #12

Updated by Anonymous over 1 year ago

  • Target version changed from 2.5.0 to CE-Next
Actions #13

Updated by aptalca aptalca about 1 year ago

I just hit this issue with a failover LTE connection (metered).

I route almost everything out over a WireGuard tunnel on 2.5.0.

After a main WAN connection loss, everything successfully switched over to the failover LTE gateway. However, after the main WAN came back online and once again became the default gateway, the wireguard tunnel remained going over the backup LTE gateway indefinitely (until I manually intervened).

Hopefully 10 years is a charm? :-)

Thanks

Actions #14

Updated by Viktor Gurov 3 months ago

  • Related to Feature #12807: Clear Active Secondary WAN Connections added
Actions #15

Updated by Jim Pingle 2 months ago

  • Related to Feature #12092: Utilize new ``pfctl`` abilities to kill states added
Actions #16

Updated by Jim Pingle 2 months ago

  • Subject changed from More flexible options for state killing based on WAN status to Ability to selectively kill states on lower tier gateways when higher tier recovers
  • Assignee deleted (Renato Botelho)
  • Plus Target Version set to Plus-Next

Updating subject. Many scenarios are now possible with #12092 and also some more will be covered by #12931 so this can be reduced in scope to the single scenario of killing states for lower tier gateways.

Thanks to recent changes in pfctl this is closer to reality. There is now an ability to kill by the gateway information in a state (pfctl -k gateway -k x.x.x.x). This can be leveraged here to make it much easier to clear these states.
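As a rough illustration of how that kill command might be wrapped, here is a minimal sketch. The function name `kill_gateway_states` and the `DRY_RUN` variable are hypothetical and exist only for this example; the only part taken from the note above is the `pfctl -k gateway -k <address>` form.

```shell
#!/bin/sh
# Hypothetical helper: kill states whose recorded gateway is the given address.
# Only the `pfctl -k gateway -k <address>` form comes from the comment above.
# With DRY_RUN=1 (an illustrative variable) the command is printed, not run.
kill_gateway_states() {
    gw_addr="$1"
    if [ -z "$gw_addr" ]; then
        echo "usage: kill_gateway_states <gateway-address>" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "/sbin/pfctl -k gateway -k ${gw_addr}"
    else
        /sbin/pfctl -k gateway -k "${gw_addr}"
    fi
}
```

Running it with `DRY_RUN=1` first makes it possible to check which address would be targeted before any states are actually removed.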

Now there are only a few items to take care of:

  • Needs a new option to selectively activate this feature somewhere, either globally or on a gateway group.
  • This only makes logical sense to activate on a single gateway group. There isn't a way to only kill states from a gateway address while also restricting that to states created by a specific gateway group.
    • If the option is global, it should include a way to select the failover group for which it applies.
    • If the option is on a gateway group, it should only be possible to activate it on a single group, similar to changing the default gateway. Either deactivate the other or throw an input error saying it can't be enabled on more than one per address family.
  • Must note that it only works for states matching policy routing rules, as it will not work for traffic matching rules that rely on default gateway switching directly. Those states have 0.0.0.0/:: in their data.
  • The actual action will have to be careful to ONLY activate when a gateway recovers from down to up state, not on every filter reload or page load that notices a gateway is up. The gateway alarm script may be the optimal place to handle this, but needs testing to ensure it doesn't activate too often.
  • Similar to #12092 it might be nice to have a new per-gateway option to override this behavior
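
The requirement to act only on a down-to-up transition could be sketched roughly as follows. This is not the pfSense implementation: `STATE_DIR`, the state-file layout, and the function name `gateway_recovered` are all hypothetical, and the alarm-flag convention (1 = down, 0 = up) is assumed from the script posted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical sketch: succeed only when a gateway goes from down to up.
# STATE_DIR and the function name are illustrative, not pfSense paths or APIs.
# Assumed flag convention: 1 = gateway in alarm (down), 0 = alarm cleared (up).
STATE_DIR="${STATE_DIR:-/tmp/gw_state_sketch}"

gateway_recovered() {
    gw_name="$1"
    alarm_flag="$2"
    mkdir -p "$STATE_DIR" || return 2
    state_file="${STATE_DIR}/${gw_name}.flag"

    # A gateway with no recorded history is treated as previously up,
    # so the first "up" report does not count as a recovery.
    prev_flag=0
    if [ -f "$state_file" ]; then
        prev_flag=$(cat "$state_file")
    fi

    # Remember the current flag for the next invocation.
    printf '%s\n' "$alarm_flag" > "$state_file"

    # Recovery means: previously down (1), now up (0).
    [ "$prev_flag" = "1" ] && [ "$alarm_flag" = "0" ]
}
```

A caller could then run `pfctl -k gateway -k <address>` only when `gateway_recovered` succeeds, so repeated "up" reports from filter reloads or page loads would not trigger another kill. Treating an unknown gateway as previously up is a design choice of this sketch; it deliberately does not cover the boot-time case described earlier in the thread.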
Actions #17

Updated by Jim Pingle 2 months ago

  • Subject changed from Ability to selectively kill states on lower tier gateways when higher tier recovers to Ability to selectively kill states on gateways recovery