Bug #8914
closedGateway switch events cause a huge amount of log spew
0%
Description
When a gateway failure/defgw switch event occurs, there seem to be some code paths that get iterated over dozens of times in a very short span of time. This floods system.log making it hard to read. Specifically, these 2 phrases are repeated over and over:
Gateways status could not be determined, considering all as up/active
Keep current gateway, its already part of the group members
This is part of an effort to try to make the logs readable again, while still allowing verbose logging if needed via a new hidden config option from the developer shell:
$config['system']['gw-debug'] = true;
write_config("Enable Developer Spew");
exec
exit
Updated by Renato Botelho about 6 years ago
- Status changed from New to Feedback
- Assignee set to Renato Botelho
- Target version set to 2.4.4-GS
PR merged
Updated by Paighton Bisconer about 6 years ago
Replicated in 2.4.4-RELEASE
Steps to replicate:
Configure Gateway Group with 2 WANs
Start Tier 1 down, Tier 2 up
Connect Tier 1, wait for link to configure, disconnect Tier 1.
Oct 3 00:47:41 php-fpm 97158 /rc.openvpn: MONITOR: WAN_DHCP is down, omitting from routing group df 192.168.34.194|192.168.34.204|WAN_DHCP|0.224ms|0.035ms|25%|down Oct 3 00:47:41 php-fpm 97158 /rc.openvpn: Gateways status could not be determined, considering all as up/active. (Group: df) Oct 3 00:47:42 php-fpm 97158 /rc.openvpn: Gateway, switch to: Oct 3 00:47:42 php-fpm 337 /rc.dyndns.update: Gateways status could not be determined, considering all as up/active. (Group: df) Oct 3 00:47:42 php-fpm 61881 /rc.filter_configure_sync: Gateways status could not be determined, considering all as up/active. (Group: df) Oct 3 00:47:42 php-fpm 337 /rc.dyndns.update: Gateway, switch to: Oct 3 00:47:42 php-fpm 337 /rc.dyndns.update: Gateways status could not be determined, considering all as up/active. (Group: df) Oct 3 00:47:42 php-fpm 61881 /rc.filter_configure_sync: Gateway, switch to: Oct 3 00:47:42 php-fpm 61881 /rc.filter_configure_sync: Gateways status could not be determined, considering all as up/active. (Group: df)
Had different results each time, in each of 3 tests there were 2 entries, 5 entries, and 7 entries for that message.
Oct 3 00:47:04 php-fpm 73667 /rc.newwanip: rc.newwanip: Info: starting on igb0. Oct 3 00:47:04 php-fpm 73667 /rc.newwanip: rc.newwanip: on (IP address: 192.168.34.204) (interface: WAN[wan]) (real interface: igb0). Oct 3 00:47:04 php-fpm 73667 /rc.newwanip: Gateways status could not be determined, considering all as up/active. (Group: df) Oct 3 00:47:04 php-fpm 73667 /rc.newwanip: Keep current gateway, its already part of the group members. Oct 3 00:47:04 php-fpm 73667 /rc.newwanip: Gateways status could not be determined, considering all as up/active. (Group: df) Oct 3 00:47:05 php-fpm 73667 /rc.newwanip: 73667MONITOR: WAN_DHCP is available now, adding to routing group df 192.168.34.194|192.168.34.204|WAN_DHCP|0.263ms|0.028ms|0.0%|none Oct 3 00:47:05 php-fpm 73667 /rc.newwanip: Keep current gateway, its already part of the group members. Oct 3 00:47:05 php-fpm 97158 /rc.linkup: Keep current gateway, its already part of the group members.
Also seeing keep current gateway
Updated by Anonymous about 6 years ago
- Target version changed from 2.4.4-GS to 2.4.4-p1
Updated by Anonymous about 6 years ago
- Target version changed from 2.4.4-p1 to 2.4.4-GS
Updated by Danilo Zrenjanin about 6 years ago
Tried to replicate the issue in 2.4.4-RELEASE.
Under logs, I was getting only "Keep current gateway, its already part of the group members" but only a couple of such logs. There wasn't any flood.
Oct 5 22:43:00 php-fpm 353 /rc.openvpn: Keep current gateway, its already part of the group members. Oct 5 22:43:00 php-fpm 353 /rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. '' Oct 5 22:43:00 php-fpm 90203 /rc.dyndns.update: Keep current gateway, its already part of the group members. Oct 5 22:43:00 php-fpm 30383 /rc.filter_configure_sync: Keep current gateway, its already part of the group members. Oct 5 22:43:00 php-fpm 40961 /rc.newwanip: rc.newwanip: Info: starting on ix0. Oct 5 22:43:00 php-fpm 40961 /rc.newwanip: rc.newwanip: on (IP address: 192.168.1.222) (interface: WAN2[opt1]) (real interface: ix0).
Updated by Flole Systems about 6 years ago
Could you please check if you are also experiencing latency spikes on LAN Interface during a reconnect (run a ping to pfsense LAN IP and reconnect? You might be having an issue I am trying to troubleshoot in the Forums for a few days.
Updated by Luiz Souza about 6 years ago
- Target version changed from 2.4.4-GS to 2.4.4-p1
Updated by Chris Macmahon about 6 years ago
Not sure what the acceptable level of log spam is:
Nov 22 09:56:56 check_reload_status Reloading filter
Nov 22 09:56:55 php-fpm 333 /system_gateways.php: Default gateway setting Interface WAN_DHCP Gateway as default.
Nov 22 09:56:53 php-fpm 31885 /system_gateways.php: 31885MONITOR: WAN_DHCP is available now, adding to routing group break_the_internet 192.168.91.1|192.168.91.195|WAN_DHCP|0.581ms|1.056ms|0.0%|none
Nov 22 09:56:52 check_reload_status Syncing firewall
Nov 22 09:56:04 check_reload_status Updating all dyndns
Nov 22 09:56:04 check_reload_status Reloading filter
Nov 22 09:56:03 php-fpm 333 /system_gateways.php: Default gateway setting Interface OPT1_DHCP Gateway as default.
Nov 22 09:56:02 php-fpm 31885 /system_gateways.php: MONITOR: WAN_DHCP is down, omitting from routing group break_the_internet 192.168.91.1|192.168.91.195|WAN_DHCP|0.409ms|0.59ms|0.0%|force_down
That is a Gateway down and up event cycle.
Updated by Vladimir Lind about 6 years ago
Tested on VM HA cluster - CE 2.4.5-DEVELOPMENT (amd64) built on Tue Nov 20 16:55:31 EST 2018:
No "Gateways status could not be determined" events in syslog on gateway failure's (disabled tier1 WAN on primary node then enabled it back) - ran icmp from host on lan behind HA cluster - no latency to LAN or 8.8.8.8 during gw up/down events. Just 1 lost icmp packet on HA failover to secondary as it should be.
Not seeing any issues.
Updated by Renato Botelho about 6 years ago
- Status changed from Feedback to Resolved