Regression #11570

Gateway monitoring service is not always restarted on interface events, which may prevent a WAN from recovering to an online state

Added by M L about 3 years ago. Updated 5 months ago.

Status: Closed
Priority: Normal
Assignee: -
Category: Gateway Monitoring
Target version: -
Start date:
Due date:
% Done: 100%
Estimated time:
Plus Target Version:
Release Notes: Force Exclusion
Affected Version: 2.5.x
Affected Architecture: All

Description

Good evening. This seems to be a new bug in 2.5; it was not a problem in 2.4. In a gateway group configured for main/failover (tier 1 and tier 2), the switch from main to failover works perfectly. But when the main is restored, the firewall fails to even notice and doesn't fail back. This has been reported by numerous users in the subreddit. My post on reddit: https://www.reddit.com/r/PFSENSE/comments/lnuolf/failover_back_to_main_wan_not_switching_without/

This is actually a very expensive and troubling bug. Many people use an LTE modem with metered data, paying by the MB or GB for data. This bug keeps racking up dollars until you go in to manually change it back.

Main to failover switching:
  1. Unplug WAN1
  2. WAN1 interface status shows link down. Check.
  3. Gateway monitor detects loss and marks as offline. Check.
  4. Default gateway changes to WAN2. Check.
  5. Traffic begins flowing properly on WAN2 (only 30 seconds downtime). Check.
  6. Dynamic DNS clients (5) all get updated. Check.
  7. OpenVPN clients (3) all go down and come back up on WAN2. Check.
  8. All systems normal, no meltdowns, smoke contained in devices.
Failing back to main, not so great:
  1. Plug in WAN1
  2. WAN1 interface status shows link up with the IP. Check.
  3. Gateway monitor shows pending/unknown.
  4. The end. Default gateway fails to switch back to main, and obviously nothing else after that happens either.
If I go to System > Routing and click Save/Apply (with no changes), that seems to kick the gateway monitor, and the default gateway switches back to main:
  1. Traffic begins flowing on the main virtually uninterrupted. Check.
  2. Dynamic DNS clients all update back to the main. Check.
  3. OpenVPN clients fail to change back to the main. The OpenVPN clients all remain on WAN2. I have to restart the OpenVPN service for each client, and then they come back up on the main.
  4. All systems back to normal. Yay.
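The sequence above can be illustrated with a toy model of tiered gateway selection. This is only a sketch of the behavior described in this report, not pfSense's actual code: the `Gateway` class, the `pick_default` function, and the status strings are all hypothetical names, and the assumption that a "pending" gateway is treated as not usable is inferred from the observed symptoms.

```python
from dataclasses import dataclass

# Hypothetical status strings, mirroring the wording used in this report.
ONLINE, OFFLINE, PENDING = "online", "offline", "pending"


@dataclass
class Gateway:
    name: str
    tier: int      # lower tier number = higher priority
    status: str


def pick_default(gateways):
    """Return the name of the online gateway with the lowest tier, or None."""
    online = [g for g in gateways if g.status == ONLINE]
    if not online:
        return None
    return min(online, key=lambda g: g.tier).name


wan1 = Gateway("WAN1", tier=1, status=ONLINE)
wan2 = Gateway("WAN2", tier=2, status=ONLINE)
group = [wan1, wan2]

normal = pick_default(group)        # both links up

wan1.status = OFFLINE               # WAN1 unplugged, monitor marks it offline
failed_over = pick_default(group)

wan1.status = PENDING               # WAN1 plugged back in, monitor not restarted
stuck = pick_default(group)

wan1.status = ONLINE                # monitor kicked (e.g. via Save/Apply)
failed_back = pick_default(group)

print(normal, failed_over, stuck, failed_back)
```

In this model the group stays on WAN2 for as long as WAN1 sits in the pending state, which matches the symptom: the link is up, but nothing restarts the monitor that would move the gateway back to online.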

I understand that OpenVPN not cycling back may be an issue that has existed for many years and that people solve with a cron job. But the rest of this problem is new with 2.5.


Files

clipboard-202206071059-njc9h.png (324 KB), luckman212, 06/07/2022 09:59 AM
11570test.diff (1.36 KB), Marcos M, 06/07/2022 08:54 PM

Related issues

Related to Bug #11142: rc.newwanip restarts VPN services when the IP matches (Resolved, Viktor Gurov, 12/08/2020)
Related to Regression #12215: OpenVPN does not resync when running on a gateway group (Closed)
Related to Bug #12771: Automatic filter reload with OpenVPN client gateway uplink happens too soon or not at all (Resolved, Viktor Gurov)
Related to Bug #12613: DNS Resolver does not restart during link up/down events on a static IP address interface (Resolved, Viktor Gurov)
Related to Bug #12811: Services are not restarted when PPP interfaces connect (Resolved, Jim Pingle)
Related to Regression #14616: dpinger does not start after renewing DHCP (Resolved, Marcos M)
Related to Bug #12920: Gateway behavior differs when the gateway does not exist in the configuration (Feedback, Marcos M)
Related to Bug #14725: Primary IPv6 interface address may be incorrect when a ULA is set (Resolved, Marcos M)
Related to Bug #12947: DHCP6 client does not take any action if the interface IPv6 address changes during renewal (Feedback)
