Regression #11570

open

Gateway monitoring services is not always restarted on interface events, which may prevent a WAN from recovering back to an online state

Added by M L about 1 year ago. Updated 13 days ago.

Status:
Feedback
Priority:
Normal
Assignee:
Category:
Gateways
Target version:
Start date:
02/27/2021
Due date:
% Done:

100%

Estimated time:
Plus Target Version:
22.05
Release Notes:
Default
Affected Version:
2.5.x
Affected Architecture:
All

Description

Good evening. This seems to be a new bug in 2.5, and was not a problem in 2.4. In a gateway group configured for main/failover (tier 1 and tier 2), the switch from main to failover worked perfectly. But when the main is restored, it fails to even notice and doesn't fail back. This has been reported by numerous users in the subreddit. My post on reddit: https://www.reddit.com/r/PFSENSE/comments/lnuolf/failover_back_to_main_wan_not_switching_without/

This is actually a very expensive and troubling bug. Many people use an LTE modem with metered data, paying by the MB or GB for data. This bug keeps racking up dollars until you go in to manually change it back.

Main to failover switching:
  1. Unplug WAN1
  2. WAN1 interface status shows link down. Check.
  3. Gateway monitor detects loss and marks as offline. Check.
  4. Default gateway changes to WAN2. Check.
  5. Traffic begins flowing properly on WAN2 (only 30 seconds downtime). Check.
  6. Dynamic DNS clients (5) all get updated. Check.
  7. OpenVPN clients (3) all go down and come back up on WAN2. Check.
  8. All systems normal, no meltdowns, smoke contained in devices.
Failover back to main, not so great:
  1. Plug in WAN1
  2. WAN1 interface status shows link up with the IP. Check.
  3. Gateway monitor shows pending/unknown.
  4. The end. Default gateway fails to switch back to main, and obviously nothing else after that happens either.
I can go into System > Routing > Click Save/Apply (no changes), and that seems to kick the gateway monitor. The default gateway switches back to main.
  1. Traffic begins flowing on the main virtually uninterrupted. Check.
  2. Dynamic DNS clients all update back to the main. Check.
  3. OpenVPN clients fail to change back to the main. The OpenVPN clients all remain on WAN2. I have to restart the OpenVPN service for each client, and then they come back up on the main.
  4. All systems back to normal. Yay.
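
The failback behaviour described above can be modelled with a small sketch. It assumes a gateway group simply picks the lowest-tier member whose monitor reports "online". The names `Gateway` and `pick_default` and the status labels are made up for illustration and are not pfSense functions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gateway:
    name: str
    tier: int    # 1 = preferred, 2 = failover
    status: str  # "online", "down", or "pending" (assumed labels)


def pick_default(gateways: List[Gateway]) -> Optional[str]:
    """Return the name of the lowest-tier gateway reported as online."""
    online = [g for g in gateways if g.status == "online"]
    if not online:
        return None
    return min(online, key=lambda g: g.tier).name


# WAN1 link is back up, but its monitor is stuck at "pending".
stuck = [Gateway("WAN1", 1, "pending"), Gateway("WAN2", 2, "online")]
print(pick_default(stuck))      # WAN2: no failback happens

# After the monitor has been kicked (for example by Save/Apply in System > Routing).
recovered = [Gateway("WAN1", 1, "online"), Gateway("WAN2", 2, "online")]
print(pick_default(recovered))  # WAN1
```

Under this model the group logic itself is fine; the failback never happens only because the tier 1 gateway is never reported as online.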

I understand the OpenVPN not cycling back may be an existing issue for many years that people solve with a cron job. But the rest of this problem is new with 2.5.


Related issues

Related to Bug #11142: rc.newwanip restarts VPN services when the IP matches (Resolved, Viktor Gurov, 12/08/2020)
Related to Regression #12215: OpenVPN does not resync when running on a gateway group (Feedback)
Related to Bug #12771: Automatic filter reload with OpenVPN client gateway uplink happens too soon or not at all (Feedback, Viktor Gurov)
Related to Bug #12613: DNS Resolver does not restart during link up/down events on a static IP address interface (Feedback, Viktor Gurov)
Related to Bug #12811: Services are not restarted when PPP interfaces connect (Feedback, Viktor Gurov, 02/16/2022)

Actions #1

Updated by M L about 1 year ago

I forgot to mention... this problem only seems to occur when you fail the main by way of unplugging the WAN interface, or powering off the modem, where the link goes down. If you fail the main by, for example, unplugging the coax to the cable modem, or the ISP goes down, something other than the actual link going down, everything works fine in both directions.

Actions #2

Updated by Viktor Gurov about 1 year ago

related to #10716 and #11298 (?)

Actions #3

Updated by Viktor Gurov about 1 year ago

M L wrote:

Failover back to main, not so great:
  1. Plug in WAN1
  2. WAN1 interface status shows link up with the IP. Check.
  3. Gateway monitor shows pending/unknown.
  4. The end. Default gateway fails to switch back to main, and obviously nothing else after that happens either.

Unable to reproduce this part - after a while the Gateway monitor shows "Online" and successfully restarts the filter/ovpn/ipsec on WAN1.

Maybe there is some kind of race condition

Actions #4

Updated by James Blanton about 1 year ago

Viktor Gurov wrote:

M L wrote:

Failover back to main, not so great:
  1. Plug in WAN1
  2. WAN1 interface status shows link up with the IP. Check.
  3. Gateway monitor shows pending/unknown.
  4. The end. Default gateway fails to switch back to main, and obviously nothing else after that happens either.

Unable to reproduce this part - after a while the Gateway monitor shows "Online" and successfully restarts the filter/ovpn/ipsec on WAN1.

Maybe there is some kind of race condition

This sounds similar to my issue on Bug #11630.

Actions #5

Updated by Fred Latke about 1 year ago

I can reproduce exactly the same behavior. If I lose connectivity to the ISP or disconnect the coaxial cable from my modem, the main WAN gateway gets placed as default just fine after the outage. If I disconnect the UTP cable or turn off the router, after everything's back up the interface status will show as up, but the gateways widget will show the interface as "offline, packet loss".

Going into System > Routing and clicking save/apply without any changes fixes everything.

Actions #6

Updated by Marcos Mendoza 12 months ago

It would seem this is fixed on 2.5.1/2.6 according to the comment on #11805

Hi, just want to report its working fine now for me using the latest dev CE version 2.6.0.a.20210524.0100
More details: Running in Hyper-V, Gateway group Load balancing with 3 Tier 1 Openvpn Gateways.
For me, 2.5.0-dev broke the Gateway Group. 2.5.1 broke Port forward and fixed Gateway Groups, 2.6.0.a fixed them both.

If you were/are having this issue, please test on either of these versions.

Actions #7

Updated by Jim Pingle 12 months ago

  • Status changed from New to Feedback
Actions #8

Updated by Lars Möller 8 months ago

We are having the same problem on SG-3100, XG-7100, SG-5100. It occurs on 21.* up to 21.05.1. On 2.4.5 everything was fine.

The problem occurs if the main WAN is DHCP. In another setup where the main WAN is PPPoE, everything is working fine.

Here are 2 example setups:

Not working, it never switches back to main:
Main WAN: DHCP (LTE-Hybrid Router) (Interface is not going down, but has packet loss)
Backup WAN: DHCP (DSL-Router, very slow)
Gateway Group: "Packet Loss" or "Packet Loss or High Latency"

Working fine in case of main WAN down (could not test packet loss case, main WAN is very reliable):
Main WAN: PPPOE (Fiber-Modem)
Backup WAN: fixed IPv4 (VDSL Lancom Router)
Gateway Group: "Packet Loss" or "Packet Loss or High Latency"

The only work around we could find is to manually switch WANs. Our customers are getting more and more frustrated. When can we expect a solution?

Actions #9

Updated by Chris B 6 months ago

I'm seeing this on 21.05.2-RELEASE too. Once failover from WAN to WAN2 happens it will never fail back. The WAN gets a DHCP address but the gateway stays Pending. Even pulling out WAN2 completely just causes the default to go away and you end up with nothing. WAN never comes out of Pending until you bounce WAN.
WAN is Tier1 and WAN2 is Tier2.

Actions #10

Updated by Marcos Mendoza 6 months ago

Tested this on 22.01.a.20211013.0500 - it worked correctly (as in the default gateway did change under Diagnostics / Routes). The logging is somewhat inconsistent however:

Statically assigned:

Nov 2 20:47:24     rc.gateway_alarm     62185     >>> Gateway alarm: WAN1GW (Addr:192.0.2.1 Alarm:1 RTT:.383ms RTTsd:.133ms Loss:22%)
Nov 2 20:47:24     check_reload_status     384     updating dyndns WAN1GW
Nov 2 20:47:24     check_reload_status     384     Restarting IPsec tunnels
Nov 2 20:47:24     check_reload_status     384     Restarting OpenVPN tunnels/interfaces
Nov 2 20:47:24     check_reload_status     384     Reloading filter
Nov 2 20:47:25     php-fpm     40189     /rc.dyndns.update: MONITOR: WAN1GW has packet loss, omitting from routing group WANGWGROUP
Nov 2 20:47:25     php-fpm     40189     192.0.2.1|192.0.2.2|WAN1GW|0.385ms|0.134ms|24%|down|highloss
Nov 2 20:47:25     php-fpm     40189     /rc.dyndns.update: Gateway, switch to: WAN2GW
Nov 2 20:47:25     php-fpm     40189     /rc.dyndns.update: Default gateway setting WAN2GW as default.
Nov 2 20:47:25     php-fpm     14272     /rc.openvpn: Gateway, switch to: WAN2GW
Nov 2 20:47:25     php-fpm     14272     /rc.openvpn: Default gateway setting WAN2GW as default.
Nov 2 20:47:25     php-fpm     14272     /rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. ''
Nov 2 20:47:26     php-fpm     40189     /rc.dyndns.update: phpDynDNS: updating cache file /conf/dyndns_WANGWGROUP_rfc2136_'sitea.dyndns.lab.arpa'_ns1.lab.arpa.cache: 192.0.2.244
Nov 2 20:47:40     php-fpm     97321     /rc.ipsec: IPSEC: One or more IPsec tunnel gateways have changed. Refreshing.
Nov 2 20:47:40     check_reload_status     384     Reloading filter
Nov 2 20:47:41     php-fpm     97321     /rc.ipsec: Gateway, none 'available' for inet6, use the first one configured. ''
Nov 2 20:49:26     rc.gateway_alarm     4482     >>> Gateway alarm: WAN1GW (Addr:192.0.2.1 Alarm:0 RTT:.394ms RTTsd:.196ms Loss:5%)
Nov 2 20:49:26     check_reload_status     384     updating dyndns WAN1GW
Nov 2 20:49:26     check_reload_status     384     Restarting IPsec tunnels
Nov 2 20:49:26     check_reload_status     384     Restarting OpenVPN tunnels/interfaces
Nov 2 20:49:26     check_reload_status     384     Reloading filter
Nov 2 20:49:27     php-fpm     13321     /rc.dyndns.update: MONITOR: WAN1GW is available now, adding to routing group WANGWGROUP
Nov 2 20:49:27     php-fpm     13321     192.0.2.1|192.0.2.2|WAN1GW|0.394ms|0.195ms|4%|online|none
Nov 2 20:49:27     php-fpm     13321     /rc.dyndns.update: Gateway, switch to: WAN1GW
Nov 2 20:49:27     php-fpm     13321     /rc.dyndns.update: Default gateway setting WAN1GW as default.
Nov 2 20:49:27     php-fpm     38053     /rc.openvpn: Gateway, switch to: WAN1GW
Nov 2 20:49:27     php-fpm     38053     /rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. ''
Nov 2 20:49:28     php-fpm     13321     /rc.dyndns.update: phpDynDNS: updating cache file /conf/dyndns_WANGWGROUP_rfc2136_'sitea.dyndns.lab.arpa'_ns1.lab.arpa.cache: 192.0.2.4
Nov 2 20:49:42     check_reload_status     384     Reloading filter 

DHCP:

Nov 2 21:37:09     rc.gateway_alarm     82217     >>> Gateway alarm: WAN1_DHCP (Addr:192.0.2.1 Alarm:1 RTT:.855ms RTTsd:4.492ms Loss:21%)
Nov 2 21:37:09     check_reload_status     384     updating dyndns WAN1_DHCP
Nov 2 21:37:09     check_reload_status     384     Restarting IPsec tunnels
Nov 2 21:37:09     check_reload_status     384     Restarting OpenVPN tunnels/interfaces
Nov 2 21:37:09     check_reload_status     384     Reloading filter
Nov 2 21:37:10     php-fpm     45785     /rc.openvpn: MONITOR: WAN1_DHCP has packet loss, omitting from routing group WANGWGROUP
Nov 2 21:37:10     php-fpm     45785     192.0.2.1|192.0.2.2|WAN1_DHCP|0.875ms|4.566ms|23%|down|highloss
Nov 2 21:37:10     php-fpm     45785     /rc.openvpn: Gateway, switch to: WAN2_DHCP
Nov 2 21:37:10     php-fpm     45785     /rc.openvpn: Default gateway setting Interface WAN2_DHCP Gateway as default.
Nov 2 21:37:10     php-fpm     45785     /rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. ''
Nov 2 21:37:10     php-fpm     45785     /rc.openvpn: route_add_or_change: Invalid gateway and/or network interface ipsec1
Nov 2 21:37:25     check_reload_status     384     Reloading filter
Nov 2 21:39:15     rc.gateway_alarm     94172     >>> Gateway alarm: WAN1_DHCP (Addr:192.0.2.1 Alarm:0 RTT:.408ms RTTsd:.142ms Loss:5%)
Nov 2 21:39:15     check_reload_status     384     updating dyndns WAN1_DHCP
Nov 2 21:39:15     check_reload_status     384     Restarting IPsec tunnels
Nov 2 21:39:15     check_reload_status     384     Restarting OpenVPN tunnels/interfaces
Nov 2 21:39:15     check_reload_status     384     Reloading filter
Nov 2 21:39:31     php-fpm     19377     /rc.ipsec: IPSEC: One or more IPsec tunnel gateways have changed. Refreshing.
Nov 2 21:39:31     check_reload_status     384     Reloading filter
Nov 2 21:39:32     php-fpm     19377     /rc.ipsec: Gateway, none 'available' for inet6, use the first one configured. '' 

Another try using DHCP:

Nov 2 21:58:51     rc.gateway_alarm     2969     >>> Gateway alarm: WAN1_DHCP (Addr:192.0.2.1 Alarm:1 RTT:.447ms RTTsd:.242ms Loss:22%)
Nov 2 21:58:51     check_reload_status     384     updating dyndns WAN1_DHCP
Nov 2 21:58:51     check_reload_status     384     Restarting IPsec tunnels
Nov 2 21:58:51     check_reload_status     384     Restarting OpenVPN tunnels/interfaces
Nov 2 21:58:51     check_reload_status     384     Reloading filter
Nov 2 21:58:53     php-fpm     45785     /rc.dyndns.update: phpDynDNS: updating cache file /conf/dyndns_WANGWGROUP_rfc2136_'sitea.dyndns.lab.arpa'_ns1.lab.arpa.cache: 192.0.2.242
Nov 2 21:58:54     php-fpm     45785     /rc.dyndns.update: phpDynDNS: Not updating sitea.dyndns.lab.arpa A record because the IP address has not changed.
Nov 2 21:59:07     check_reload_status     384     Reloading filter
Nov 2 22:00:13     rc.gateway_alarm     16897     >>> Gateway alarm: WAN1_DHCP (Addr:192.0.2.1 Alarm:0 RTT:.699ms RTTsd:3.371ms Loss:6%)
Nov 2 22:00:13     check_reload_status     384     updating dyndns WAN1_DHCP
Nov 2 22:00:13     check_reload_status     384     Restarting IPsec tunnels
Nov 2 22:00:13     check_reload_status     384     Restarting OpenVPN tunnels/interfaces
Nov 2 22:00:13     check_reload_status     384     Reloading filter
Nov 2 22:00:15     php-fpm     19377     /rc.openvpn: MONITOR: WAN1_DHCP is available now, adding to routing group WANGWGROUP
Nov 2 22:00:15     php-fpm     19377     192.0.2.1|192.0.2.2|WAN1_DHCP|0.688ms|3.327ms|4%|online|none
Nov 2 22:00:15     php-fpm     19377     /rc.openvpn: Gateway, switch to: WAN1_DHCP
Nov 2 22:00:15     php-fpm     19377     /rc.openvpn: Default gateway setting Interface WAN1_DHCP Gateway as default.
Nov 2 22:00:15     php-fpm     45785     /rc.dyndns.update: Gateway, switch to: WAN1_DHCP
Nov 2 22:00:15     php-fpm     19377     /rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. ''
Nov 2 22:00:15     php-fpm     19377     /rc.openvpn: route_add_or_change: Invalid gateway and/or network interface ipsec1
Nov 2 22:00:15     php-fpm     45785     /rc.dyndns.update: phpDynDNS: updating cache file /conf/dyndns_WANGWGROUP_rfc2136_'sitea.dyndns.lab.arpa'_ns1.lab.arpa.cache: 192.0.2.2
Nov 2 22:00:16     php-fpm     45785     /rc.dyndns.update: phpDynDNS: Not updating sitea.dyndns.lab.arpa A record because the IP address has not changed. 
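
For anyone comparing runs like the ones above, here is a rough sketch of a parser for the two record shapes that appear in these logs: the `Gateway alarm:` line and the pipe-delimited status line. The patterns are inferred only from the samples in this comment. The field names in `STATUS_FIELDS` are assumptions: the first address matches the `Addr:` value in the alarm line, and the second is assumed to be the source address.

```python
import re
from typing import Dict, Optional

# Pattern inferred from the sample log lines; not taken from pfSense source.
ALARM_RE = re.compile(
    r"Gateway alarm: (?P<name>\S+) \(Addr:(?P<addr>\S+) Alarm:(?P<alarm>[01]) "
    r"RTT:(?P<rtt>[\d.]+)ms RTTsd:(?P<rttsd>[\d.]+)ms Loss:(?P<loss>\d+)%\)"
)

# Assumed names for the eight pipe-separated fields.
STATUS_FIELDS = (
    "monitor_addr", "source_addr", "name",
    "rtt", "rttsd", "loss", "status", "substatus",
)


def parse_alarm(line: str) -> Optional[Dict[str, object]]:
    """Extract the gateway alarm fields from a log line, or None if absent."""
    m = ALARM_RE.search(line)
    if m is None:
        return None
    return {
        "name": m["name"],
        "addr": m["addr"],
        "alarm": m["alarm"] == "1",  # 1 = alarm raised, 0 = alarm cleared
        "rtt_ms": float(m["rtt"]),
        "rttsd_ms": float(m["rttsd"]),
        "loss_pct": int(m["loss"]),
    }


def parse_status(record: str) -> Dict[str, str]:
    """Split a pipe-delimited status record into named fields."""
    parts = record.strip().split("|")
    if len(parts) != len(STATUS_FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(STATUS_FIELDS), len(parts)))
    return dict(zip(STATUS_FIELDS, parts))


sample_alarm = (
    "Nov 2 20:47:24     rc.gateway_alarm     62185     >>> Gateway alarm: "
    "WAN1GW (Addr:192.0.2.1 Alarm:1 RTT:.383ms RTTsd:.133ms Loss:22%)"
)
sample_status = "192.0.2.1|192.0.2.2|WAN1GW|0.385ms|0.134ms|24%|down|highloss"
print(parse_alarm(sample_alarm))
print(parse_status(sample_status))
```

Pairing each `Alarm:0` event with a later "is available now, adding to routing group" message would make the inconsistency between the runs easier to spot.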

Actions #11

Updated by Viktor Gurov 6 months ago

  • Status changed from Feedback to New

same issue on 22.01.a.20211029.0500 - once failover from WAN to LTE happens it will never fail back until I manually click 'apply' on the System / Routing / Gateways page.

Actions #12

Updated by dave wilson 5 months ago

Does anyone have a good automated workaround? I have Starlink (DHCP) as primary WAN and LTE modem w/ethernet as backup. Should I try assigning static IPs for primary? The manual 'click apply' isn't ideal if I'm not available to execute it.
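
One possible shape for an automated workaround is a watchdog that notices when a gateway stays "pending" while its link is up and then triggers whatever action currently fixes it by hand. The sketch below covers only the decision logic. `PendingWatchdog`, its parameters, and the `reapply` callback are hypothetical; nothing in this thread names a supported command for the restart itself, so that part is left to the caller.

```python
import time
from typing import Callable, Optional


class PendingWatchdog:
    """Fire a caller-supplied action when a gateway stays pending too long."""

    def __init__(self, grace_seconds: float, reapply: Callable[[], None]):
        self.grace_seconds = grace_seconds
        self.reapply = reapply  # hypothetical hook, e.g. "re-apply routing"
        self._pending_since: Optional[float] = None

    def observe(self, link_up: bool, status: str,
                now: Optional[float] = None) -> bool:
        """Feed one observation; return True if the action was triggered."""
        now = time.monotonic() if now is None else now
        if not (link_up and status == "pending"):
            # Healthy or link down: nothing to do, reset the timer.
            self._pending_since = None
            return False
        if self._pending_since is None:
            self._pending_since = now
            return False
        if now - self._pending_since >= self.grace_seconds:
            self.reapply()
            self._pending_since = None
            return True
        return False
```

The grace period is there so that a gateway that is still legitimately coming up is not restarted over and over.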

Actions #13

Updated by Scott Silver 5 months ago

I think I may have tracked down one of the problems here. It seems that pfSense is forgetting to reset the gateway monitor when the WAN interface comes back up in certain cases. In my case, the WAN comes back up with the same IP address it had previously. So rc.newwanip, the script that runs when a WAN gets a new IP, seems to not reset the gateway monitor (because it checks for this case, possibly as an optimization, possibly for other reasons I don't understand).

Here are the details:

  • One of my interfaces goes away, so pfSense loses one of its WANs.
  • When it comes back pfSense requests a new IP via DHCP.
  • Subsequently there is the script rc.newwanip that is supposed to run when a WAN interface gets a new IP.
  • rc.newwanip guards this code with "isSameAsLastWANAddress()" and since my ISP issues the same address, pfSense does not run this code.
  • This code, in particular, would reset the gateway monitor. Since pfSense does not reset it, the old instance of the gateway monitor (dpinger) will continue to run. However, it can never send out any new ICMP/ping messages because the socket refers to a dead interface and not the new one so no pings come back.
  • Thus, dpinger never thinks the interface comes back.
  • So why does running dpinger from the command line work, even when the gateway monitor instance doesn't? When we run dpinger from the command line, dpinger gets a working socket for the new interface.
  • The "quick but probably wrong" fix is to make this code on line 204 always run. See that I OR'd in 1 into the conditional below.
if (/*added so we do this all the time*/ 1 || !is_ipaddr($oldip) || ($curwanip != $oldip) ||
    (!is_ipaddrv4($config['interfaces'][$interface]['ipaddr']) && ($config['interfaces'][$interface]['ipaddr'] != 'dhcp'))) {
    /*
     * Some services (e.g. dyndns, see ticket #4066) depend on
     * filter_configure() to be called before, otherwise pass out
     * route-to rules have the old ip set in 'from' and connections
     * do not go through the correct link
     */
    filter_configure_sync();

    /* reconfigure our gateway monitor, dpinger results need to be 
     * available when configuring the default gateway */
    setup_gateways_monitor();
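
To make the effect of that guard easier to see, here is a small Python model of the conditional above. It is only a sketch: `is_ipaddr` and `is_ipaddrv4` are rough stand-ins for the PHP helpers of the same name, `monitor_is_reconfigured` is a made-up name, and the addresses are documentation examples.

```python
import ipaddress


def is_ipaddr(value) -> bool:
    """Rough stand-in for the PHP helper: True for a valid IPv4/IPv6 string."""
    if not isinstance(value, str):
        return False
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def is_ipaddrv4(value) -> bool:
    """Rough stand-in for the PHP helper: True for a valid IPv4 string."""
    if not isinstance(value, str):
        return False
    try:
        return isinstance(ipaddress.ip_address(value), ipaddress.IPv4Address)
    except ValueError:
        return False


def monitor_is_reconfigured(oldip, curwanip, configured_ipaddr, always=False):
    """Mirror the guard: does the block that resets the monitor run?"""
    return (
        always  # models the "1 ||" added in the quick fix
        or not is_ipaddr(oldip)
        or curwanip != oldip
        or (not is_ipaddrv4(configured_ipaddr) and configured_ipaddr != "dhcp")
    )


# DHCP WAN that gets the same lease back: the block is skipped.
print(monitor_is_reconfigured("198.51.100.7", "198.51.100.7", "dhcp"))
# Same situation with the quick fix applied: the block runs.
print(monitor_is_reconfigured("198.51.100.7", "198.51.100.7", "dhcp", always=True))
```

In this model the only DHCP case that skips the monitor reset is the one where the old and new addresses are identical, which matches what I see with my ISP.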
Actions #14

Updated by Scott Silver 5 months ago

Note that https://redmine.pfsense.org/issues/11142 was the bug that someone fixed that tries to solve some other problem.

I suspect the correct fix will not touch the VPN and will only reset gateway_monitor.

Actions #15

Updated by Viktor Gurov 5 months ago

  • Related to Bug #11142: rc.newwanip restarts VPN services when the IP matches added
Actions #16

Updated by Viktor Gurov 5 months ago

  • Tracker changed from Bug to Regression
Actions #18

Updated by Jim Pingle 5 months ago

  • Assignee set to Viktor Gurov
  • Priority changed from High to Normal
  • Target version set to CE-Next
  • Plus Target Version set to 22.05
Actions #19

Updated by Jim Pingle 5 months ago

  • Status changed from New to Pull Request Review
Actions #20

Updated by Viktor Gurov 5 months ago

  • Related to Regression #12215: OpenVPN does not resync when running on a gateway group added
Actions #21

Updated by Viktor Gurov 3 months ago

  • Related to Bug #12771: Automatic filter reload with OpenVPN client gateway uplink happens too soon or not at all added
Actions #22

Updated by Viktor Gurov 3 months ago

  • Status changed from Pull Request Review to Feedback
  • % Done changed from 0 to 100
Actions #23

Updated by Viktor Gurov 3 months ago

  • Related to Bug #12613: DNS Resolver does not restart during link up/down events on a static IP address interface added
Actions #24

Updated by → luckman212 3 months ago

Did this make it into 2.6 / 22.01 or do we need to use System Patches to get it? - edit nevermind, I see it's targeted at 22.05

Actions #25

Updated by Viktor Gurov 3 months ago

  • Related to Bug #12811: Services are not restarted when PPP interfaces connect added
Actions #26

Updated by Jim Pingle 3 months ago

  • Target version changed from CE-Next to 2.7.0
Actions #27

Updated by Wayne Sherman about 2 months ago

Setup:
2.6.0-RELEASE (amd64), dual WAN with both WANs on DHCP, and failover via Gateway groups. (default gateway = PreferWAN1)

Test:
Unplugging one of the WAN network cables, wait for a few minutes, and then plug back in

Problems:
1) dpinger does not monitor a WAN port after the port comes back up
2) If I manually restart dpinger, both gateways show as online, but the default gateway does not switch back to WAN1.

Fixed by patch:
After applying the patch, both problems above are fixed.
( https://redmine.pfsense.org/projects/pfsense/repository/1/revisions/ec73bb89489d830ec21c4e04ffa3ec401791b55d )

New problem after patching:
After applying the patch referenced above, a new problem shows up in the logs with an error trying to restart unbound:
pfSense php-fpm[373]: /rc.newwanip: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1648663263] unbound[14890:0] error: bind: address already in use [1648663263] unbound[14890:0] fatal error: could not open ports'

Unbound error in context:
Mar 30 11:00:53 pfSense php-fpm[372]: /rc.linkup: DEVD Ethernet attached event for opt1
Mar 30 11:00:53 pfSense php-fpm[372]: /rc.linkup: HOTPLUG: Configuring interface opt1
Mar 30 11:01:00 pfSense check_reload_status[411]: rc.newwanip starting igb1
Mar 30 11:01:00 pfSense check_reload_status[411]: Restarting IPsec tunnels
Mar 30 11:01:01 pfSense php-fpm[373]: /rc.newwanip: rc.newwanip: Info: starting on igb1.
Mar 30 11:01:01 pfSense php-fpm[373]: /rc.newwanip: rc.newwanip: on (IP address: 192.168.12.150) (interface: WAN2[opt1]) (real interface: igb1).
Mar 30 11:01:03 pfSense php-fpm[373]: /rc.newwanip: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1648663263] unbound[14890:0] error: bind: address already in use [1648663263] unbound[14890:0] fatal error: could not open ports'
Mar 30 11:01:03 pfSense check_reload_status[411]: updating dyndns opt1
Mar 30 11:01:04 pfSense php-fpm[373]: /rc.newwanip: Resyncing OpenVPN instances for interface WAN2.
Mar 30 11:01:04 pfSense php-fpm[373]: /rc.newwanip: Creating rrd update script
Mar 30 11:01:07 pfSense php-fpm[373]: /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 192.168.12.150 -> 192.168.12.150 - Restarting packages.
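
For reference, "bind: address already in use" is the generic error a process gets when it tries to bind a port that another socket already holds. The log above does not show what was holding the port; one plausible case is an earlier unbound instance that had not exited yet, but that is an assumption. The sketch below only reproduces the generic error with two UDP sockets on the loopback address; `second_bind_errno` is a made-up helper name.

```python
import errno
import socket


def second_bind_errno() -> int:
    """Bind two UDP sockets to the same address; return the second bind's errno."""
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        first.bind(("127.0.0.1", 0))   # let the OS pick a free port
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))  # same port, still held by `first`
        except OSError as exc:
            return exc.errno
        return 0
    finally:
        first.close()
        second.close()


print(second_bind_errno() == errno.EADDRINUSE)
```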

Actions #28

Updated by Jim Pingle about 2 months ago

  • Subject changed from Gateway group doesn't failback from tier 2 to tier 1, worked properly in 2.4 to Gateway monitoring services is not always restarted on interface events, which may prevent a WAN from recovering back to an online state
Actions #29

Updated by Viktor Gurov about 1 month ago

Wayne Sherman wrote in #note-27:

Setup:
2.6.0-RELEASE (amd64), dual WAN with both WANs on DHCP, and failover via Gateway groups. (default gateway = PreferWAN1)

Test:
Unplugging one of the WAN network cables, wait for a few minutes, and then plug back in

Problems:
1) dpinger does not monitor a WAN port after the port comes back up
2) If I manually restart dpinger, both gateways show as online, but the default gateway does not switch back to WAN1.

Fixed by patch:
After applying the patch, both problems above are fixed.
( https://redmine.pfsense.org/projects/pfsense/repository/1/revisions/ec73bb89489d830ec21c4e04ffa3ec401791b55d )

New problem after patching:
After applying the patch referenced above, a new problem shows up in the logs with an error trying to restart unbound:
pfSense php-fpm[373]: /rc.newwanip: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1648663263] unbound[14890:0] error: bind: address already in use [1648663263] unbound[14890:0] fatal error: could not open ports'

Unable to reproduce on pfSense-22.05.a.20220407.0600 - everything works fine, without unbound errors.
Please test on the latest snapshots, and if it happens again, provide unbound configuration.

Actions #30

Updated by Jürgen Echter 13 days ago

Viktor Gurov wrote in #note-29:

Wayne Sherman wrote in #note-27:

Setup:
2.6.0-RELEASE (amd64), dual WAN with both WANs on DHCP, and failover via Gateway groups. (default gateway = PreferWAN1)

Test:
Unplugging one of the WAN network cables, wait for a few minutes, and then plug back in

Problems:
1) dpinger does not monitor a WAN port after the port comes back up
2) If I manually restart dpinger, both gateways show as online, but the default gateway does not switch back to WAN1.

Fixed by patch:
After applying the patch, both problems above are fixed.
( https://redmine.pfsense.org/projects/pfsense/repository/1/revisions/ec73bb89489d830ec21c4e04ffa3ec401791b55d )

New problem after patching:
After applying the patch referenced above, a new problem shows up in the logs with an error trying to restart unbound:
pfSense php-fpm[373]: /rc.newwanip: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1648663263] unbound[14890:0] error: bind: address already in use [1648663263] unbound[14890:0] fatal error: could not open ports'

Unable to reproduce on pfSense-22.05.a.20220407.0600 - everything works fine, without unbound errors.
Please test on the latest snapshots, and if it happens again, provide unbound configuration.

I also added the patch, but I still have the same problem. If I disable monitoring in the routing tab and re-enable it, it works again; otherwise it stays on pending on the dashboard and doesn't switch back to online.

If you need any information, just tell me. This is on pfSense 2.6.0.

Actions #31

Updated by Marcos Mendoza 13 days ago

What interface(s) does unbound have assigned? Is this a VM?
