Bug #3815

closed

Gateway monitoring broken

Added by Tobias Wolter over 9 years ago. Updated about 8 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: Gateway Monitoring
Target version:
Start date: 08/19/2014
Due date:
% Done: 0%
Estimated time:
Plus Target Version:
Release Notes:
Affected Version: All
Affected Architecture:

Description

Cheers,

Gateway monitoring seems utterly broken ATM. We get barrages of log messages along these lines:

Aug 19 15:12:51     php: rc.newipsecdns: Gateways status could not be determined, considering all as up/active. (Group: WANGW_FAILOVER)
Aug 19 15:12:51     php: rc.newipsecdns: MONITOR: WANGW1 is down, removing from routing group WANGW_FAILOVER
Aug 19 15:12:51     php: rc.newipsecdns: Default gateway down setting WANGW2 as default!
Aug 19 15:12:51     php: rc.newipsecdns: MONITOR: WANGW1 is down, removing from routing group WANGW_FAILOVER
Aug 19 15:12:51     php: rc.newipsecdns: Default gateway down setting WANGW2 as default!
Aug 19 15:12:51     php: rc.newipsecdns: MONITOR: WANGW1 is down, removing from routing group WANGW_FAILOVER
Aug 19 15:12:51     php: rc.newipsecdns: Default gateway down setting WANGW2 as default!
Aug 19 15:12:51     php: rc.newipsecdns: MONITOR: WANGW1 is down, removing from routing group WANGW_FAILOVER
Aug 19 15:12:51     php: rc.newipsecdns: Default gateway down setting WANGW2 as default!
Aug 19 15:12:51     php: rc.newipsecdns: MONITOR: WANGW1 is down, removing from routing group WANGW_FAILOVER
Aug 19 15:12:51     php: rc.newipsecdns: Default gateway down setting WANGW2 as default!
Aug 19 15:12:51     php: rc.newipsecdns: Forcefully reloading IPsec racoon daemon
Aug 19 15:12:46     php: rc.newipsecdns: Gateways status could not be determined, considering all as up/active. (Group: WANGW_FAILOVER)
Aug 19 15:12:46     php: rc.newipsecdns: MONITOR: WANGW1 is down, removing from routing group WANGW_FAILOVER
Aug 19 15:12:46     php: rc.newipsecdns: Default gateway down setting WANGW2 as default!
Aug 19 15:12:46     php: rc.newipsecdns: MONITOR: WANGW1 is down, removing from routing group WANGW_FAILOVER
Aug 19 15:12:46     php: rc.newipsecdns: Default gateway down setting WANGW2 as default!
Aug 19 15:12:46     php: rc.newipsecdns: MONITOR: WANGW1 is down, removing from routing group WANGW_FAILOVER
Aug 19 15:12:46     php: rc.newipsecdns: Default gateway down setting WANGW2 as default!
Aug 19 15:12:46     php: rc.newipsecdns: MONITOR: WANGW1 is down, removing from routing group WANGW_FAILOVER
Aug 19 15:12:46     php: rc.newipsecdns: Default gateway down setting WANGW2 as default!
Aug 19 15:12:46     php: rc.newipsecdns: MONITOR: WANGW1 is down, removing from routing group WANGW_FAILOVER
Aug 19 15:12:46     php: rc.newipsecdns: Default gateway down setting WANGW2 as default!
Aug 19 15:12:46     php: rc.newipsecdns: IPSEC: One or more IPsec tunnel endpoints has changed its IP. Refreshing.
Aug 19 15:12:33     php: rc.filter_configure_sync: MONITOR: WANGW1 is down, removing from routing group WANGW_FAILOVER
Aug 19 15:12:33     php: rc.filter_configure_sync: Default gateway down setting WANGW2 as default!
Aug 19 15:12:31     php: rc.dyndns.update: MONITOR: WANGW1 is down, removing from routing group WANGW_FAILOVER
Aug 19 15:12:31     php: rc.dyndns.update: Default gateway down setting WANGW2 as default!
Aug 19 15:12:31     php: rc.openvpn: OpenVPN: One or more OpenVPN tunnel endpoints may have changed its IP. Reloading endpoints that may use WANGW1.

Yes, WANGW1 is actually down as a testing measure. But that should not mean that pfSense/apinger gets to start acting crazy and reassign the default gateway every other millisecond.
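To put a number on the barrage, the log lines can be tallied per timestamp, script, and message. The following is only a rough sketch: the regular expression is an assumption based on the line layout shown above, and `tally` and `sample` are hypothetical names, not part of pfSense.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
# "Aug 19 15:12:51     php: rc.newipsecdns: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+php:\s+"
    r"(?P<script>rc\.[\w.]+):\s+(?P<msg>.*)$"
)

def tally(lines):
    """Count identical (timestamp, script, message) triples."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:  # lines that do not fit the assumed layout are skipped
            counts[(m["ts"], m["script"], m["msg"])] += 1
    return counts

# Three lines copied from the excerpt above.
sample = [
    "Aug 19 15:12:51     php: rc.newipsecdns: Default gateway down setting WANGW2 as default!",
    "Aug 19 15:12:51     php: rc.newipsecdns: Default gateway down setting WANGW2 as default!",
    "Aug 19 15:12:31     php: rc.dyndns.update: Default gateway down setting WANGW2 as default!",
]

for (ts, script, msg), n in tally(sample).most_common():
    print(n, ts, script, msg)
```

Any triple with a count above one means the same script logged the same decision more than once within a single second.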

Because we're trying to test a failover solution for implementation, we have the following options set:

  • Reinitiate IPsec on gateway state change
  • Allow changing of default gateway

The OpenVPN connection is pretty relaxed concerning this issue - probably because it's not directly running on the failover group, but rather using NAT via the LAN interface - but the IPsec connection (which goes the direct route) is utterly unusable, as it restarts every other minute at best.

Which rc.* scripts do the complaining changes by the minute, but most of it seems to come from the IPsec subsystem.

Especially the "considering all as active" bit is wildly irritating, as it seems to just give up after five attempts at deciding WANGW1 is down and then marks it as up again.
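For illustration, the behaviour I would expect is one reaction per actual state change, not one per check. The sketch below is not how pfSense or apinger is implemented; `transitions` and its parameters are hypothetical names, and it assumes every gateway starts out as "up".

```python
def transitions(observations, initial="up"):
    """Yield (gateway, state) only when a gateway's state actually changes.

    `observations` is an iterable of (gateway, state) pairs, one per check.
    """
    last = {}
    for gateway, state in observations:
        if last.get(gateway, initial) != state:
            last[gateway] = state
            yield gateway, state

# Five identical "down" checks followed by one "up" check.
checks = [("WANGW1", "down")] * 5 + [("WANGW1", "up")]
events = list(transitions(checks))
print(events)
```

With this filter the five repeated "down" checks collapse into a single event, so the default gateway would be switched once instead of five times.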
