Project

General

Profile

Actions

Bug #8914

closed

Gateway switch events cause a huge amount of log spew

Added by → luckman212 over 5 years ago. Updated over 5 years ago.

Status:
Resolved
Priority:
Normal
Category:
Logging
Target version:
Start date:
09/18/2018
Due date:
% Done:

0%

Estimated time:
Plus Target Version:
Release Notes:
Affected Version:
2.4.4
Affected Architecture:

Description

When a gateway failure/defgw switch event occurs, there seem to be some code paths that get iterated over dozens of times in a very short span of time. This floods system.log making it hard to read. Specifically, these 2 phrases are repeated over and over:

Gateways status could not be determined, considering all as up/active
Keep current gateway, its already part of the group members

This is part of an effort to try to make the logs readable again, while still allowing verbose logging if needed via a new hidden config option from the developer shell:

$config['system']['gw-debug'] = true;
write_config("Enable Developer Spew");
exec
exit
Actions #1

Updated by Renato Botelho over 5 years ago

  • Status changed from New to Feedback
  • Assignee set to Renato Botelho
  • Target version set to 2.4.4-GS

PR merged

Actions #2

Updated by Paighton Bisconer over 5 years ago

Replicated in 2.4.4-RELEASE

Steps to replicate:
Configure Gateway Group with 2 WANs
Start Tier 1 down, Tier 2 up
Connect Tier 1, wait for link to configure, disconnect Tier 1.

Oct 3 00:47:41    php-fpm    97158    /rc.openvpn: MONITOR: WAN_DHCP is down, omitting from routing group df 192.168.34.194|192.168.34.204|WAN_DHCP|0.224ms|0.035ms|25%|down
Oct 3 00:47:41    php-fpm    97158    /rc.openvpn: Gateways status could not be determined, considering all as up/active. (Group: df)
Oct 3 00:47:42    php-fpm    97158    /rc.openvpn: Gateway, switch to:
Oct 3 00:47:42    php-fpm    337    /rc.dyndns.update: Gateways status could not be determined, considering all as up/active. (Group: df)
Oct 3 00:47:42    php-fpm    61881    /rc.filter_configure_sync: Gateways status could not be determined, considering all as up/active. (Group: df)
Oct 3 00:47:42    php-fpm    337    /rc.dyndns.update: Gateway, switch to:
Oct 3 00:47:42    php-fpm    337    /rc.dyndns.update: Gateways status could not be determined, considering all as up/active. (Group: df)
Oct 3 00:47:42    php-fpm    61881    /rc.filter_configure_sync: Gateway, switch to:
Oct 3 00:47:42    php-fpm    61881    /rc.filter_configure_sync: Gateways status could not be determined, considering all as up/active. (Group: df)

Had different results each time, in each of 3 tests there were 2 entries, 5 entries, and 7 entries for that message.

Oct 3 00:47:04    php-fpm    73667    /rc.newwanip: rc.newwanip: Info: starting on igb0.
Oct 3 00:47:04    php-fpm    73667    /rc.newwanip: rc.newwanip: on (IP address: 192.168.34.204) (interface: WAN[wan]) (real interface: igb0).
Oct 3 00:47:04    php-fpm    73667    /rc.newwanip: Gateways status could not be determined, considering all as up/active. (Group: df)
Oct 3 00:47:04    php-fpm    73667    /rc.newwanip: Keep current gateway, its already part of the group members.
Oct 3 00:47:04    php-fpm    73667    /rc.newwanip: Gateways status could not be determined, considering all as up/active. (Group: df)
Oct 3 00:47:05    php-fpm    73667    /rc.newwanip: 73667MONITOR: WAN_DHCP is available now, adding to routing group df 192.168.34.194|192.168.34.204|WAN_DHCP|0.263ms|0.028ms|0.0%|none
Oct 3 00:47:05    php-fpm    73667    /rc.newwanip: Keep current gateway, its already part of the group members.
Oct 3 00:47:05    php-fpm    97158    /rc.linkup: Keep current gateway, its already part of the group members.

Also seeing keep current gateway

Actions #3

Updated by Anonymous over 5 years ago

  • Target version changed from 2.4.4-GS to 2.4.4-p1
Actions #4

Updated by Anonymous over 5 years ago

  • Target version changed from 2.4.4-p1 to 2.4.4-GS
Actions #5

Updated by Danilo Zrenjanin over 5 years ago

Tried to replicate the issue in 2.4.4-RELEASE.

Under logs, I was getting only "Keep current gateway, its already part of the group members" but only a couple of such logs. There wasn't any flood.

Oct 5 22:43:00     php-fpm     353     /rc.openvpn: Keep current gateway, its already part of the group members.
Oct 5 22:43:00     php-fpm     353     /rc.openvpn: Gateway, none 'available' for inet6, use the first one configured. ''
Oct 5 22:43:00     php-fpm     90203     /rc.dyndns.update: Keep current gateway, its already part of the group members.
Oct 5 22:43:00     php-fpm     30383     /rc.filter_configure_sync: Keep current gateway, its already part of the group members.
Oct 5 22:43:00     php-fpm     40961     /rc.newwanip: rc.newwanip: Info: starting on ix0.
Oct 5 22:43:00     php-fpm     40961     /rc.newwanip: rc.newwanip: on (IP address: 192.168.1.222) (interface: WAN2[opt1]) (real interface: ix0).
Actions #6

Updated by Flole Systems over 5 years ago

Could you please check if you are also experiencing latency spikes on LAN Interface during a reconnect (run a ping to pfsense LAN IP and reconnect? You might be having an issue I am trying to troubleshoot in the Forums for a few days.

Actions #7

Updated by Luiz Souza over 5 years ago

  • Target version changed from 2.4.4-GS to 2.4.4-p1
Actions #8

Updated by Chris Macmahon over 5 years ago

Not sure what the acceptable level of log spam is:

Nov 22 09:56:56 check_reload_status Reloading filter
Nov 22 09:56:55 php-fpm 333 /system_gateways.php: Default gateway setting Interface WAN_DHCP Gateway as default.
Nov 22 09:56:53 php-fpm 31885 /system_gateways.php: 31885MONITOR: WAN_DHCP is available now, adding to routing group break_the_internet 192.168.91.1|192.168.91.195|WAN_DHCP|0.581ms|1.056ms|0.0%|none

Nov 22 09:56:52 check_reload_status Syncing firewall
Nov 22 09:56:04 check_reload_status Updating all dyndns
Nov 22 09:56:04 check_reload_status Reloading filter
Nov 22 09:56:03 php-fpm 333 /system_gateways.php: Default gateway setting Interface OPT1_DHCP Gateway as default.
Nov 22 09:56:02 php-fpm 31885 /system_gateways.php: MONITOR: WAN_DHCP is down, omitting from routing group break_the_internet 192.168.91.1|192.168.91.195|WAN_DHCP|0.409ms|0.59ms|0.0%|force_down

That is a Gateway down and up event cycle.

Actions #9

Updated by Vladimir Lind over 5 years ago

Tested on VM HA cluster - CE 2.4.5-DEVELOPMENT (amd64) built on Tue Nov 20 16:55:31 EST 2018:

No "Gateways status could not be determined" events in syslog on gateway failure's (disabled tier1 WAN on primary node then enabled it back) - ran icmp from host on lan behind HA cluster - no latency to LAN or 8.8.8.8 during gw up/down events. Just 1 lost icmp packet on HA failover to secondary as it should be.

Not seeing any issues.

Actions #10

Updated by Renato Botelho over 5 years ago

  • Status changed from Feedback to Resolved
Actions

Also available in: Atom PDF