Project

General

Profile

Actions

Regression #12961

closed

CARP event storm when leaving persistent CARP maintenance mode

Added by Florian Apolloner about 2 years ago. Updated 11 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Viktor Gurov
Category:
CARP
Target version:
Start date:
Due date:
% Done:

100%

Estimated time:
Plus Target Version:
22.05
Release Notes:
Default
Affected Version:
2.6.0
Affected Architecture:

Description

Hi,

this is a very weird issue so I will try my best to describe it. I think this is a regression that we are seeing after converting our box to pfSense-Plus but I am not to sure (could also be that it is simply a bug and not a regression).

We run a HA setup with many CARP IPs (basically one for every interface and we have 17 interfaces. Onto those CARP interfaces we do stack virtual IPs, around O(10) per CARP IP. I have verified that the CARP and VIP subnet masks do match the parent interface as per https://docs.netgate.com/pfsense/en/latest/firewall/virtual-ip-address-comparison.html .

When I tried to leave persistent CARP maintenance mode on the primary instance (which would then take over master status again) the system log started to fill with:

Mar 18 15:37:41 pfSense1 check_reload_status[408]: Carp backup event
Mar 18 15:37:41 pfSense1 check_reload_status[408]: Carp backup event
Mar 18 15:37:41 pfSense1 kernel: carp: 8@vtnet2.511: BACKUP -> INIT (hardware interface up)
Mar 18 15:37:41 pfSense1 kernel: carp: 8@vtnet2.511: INIT -> BACKUP (initialization complete)
Mar 18 15:37:41 pfSense1 check_reload_status[408]: Carp backup event
Mar 18 15:37:41 pfSense1 check_reload_status[408]: Carp backup event
Mar 18 15:37:41 pfSense1 check_reload_status[408]: Carp backup event
Mar 18 15:37:41 pfSense1 check_reload_status[408]: Carp backup event
Mar 18 15:37:41 pfSense1 check_reload_status[408]: Carp backup event
Mar 18 15:37:41 pfSense1 kernel: carp: 18@vtnet2.711: BACKUP -> INIT (hardware interface up)
Mar 18 15:37:41 pfSense1 kernel: carp: 18@vtnet2.711: INIT -> BACKUP (initialization complete)
... long list of carp backup events and multiple message for the same "8@vtnet2.511" identifier (probably for the VIPs as well)
Mar 18 15:37:41 pfSense1 kernel: carp: 13@vtnet2.720: BACKUP -> MASTER (preempting a slower master)
Mar 18 15:37:41 pfSense1 kernel: carp: 200@vtnet2.708: BACKUP -> MASTER (preempting a slower master)
... list of all interfaces becoming master
Mar 18 15:37:41 pfSense1 check_reload_status[408]: Carp master event
Mar 18 15:37:41 pfSense1 check_reload_status[408]: Carp master event
... many carp master events

All in all this log shows:

➜  ~ rg 'Carp backup event' carp3.txt|wc
    148    1184   10212
➜  ~ rg 'Carp master event' carp3.txt|wc
     17     136    1173

Here the 17 master events match the number of "physical" interfaces we have. I am not sure about the backup events. Taking as an example which is a CARP interface with 4 IPs I get 24 entries. So it went 6 times for each VIP between BACKUP -> INIT -> BACKUP before it actually become MASTER. But the counts are not consistent…

Continuing with the logs:

... Interleaved with the carp master events it seems that carp handling kicks in:
Mar 18 15:37:43 pfSense1 php-fpm[31225]: /rc.carpbackup: HA cluster member "(10.7.11.1@vtnet2.711): (INTERN)" has resumed CARP state "BACKUP" for vhid 18
Mar 18 15:37:43 pfSense1 php-fpm[87138]: /rc.carpbackup: HA cluster member "(10.7.11.1@vtnet2.711): (INTERN)" has resumed CARP state "BACKUP" for vhid 18
Mar 18 15:37:43 pfSense1 php-fpm[87138]: /rc.carpbackup: HA cluster member "(10.7.11.129@vtnet2.711): (INTERN)" has resumed CARP state "BACKUP" for vhid 18
Mar 18 15:37:43 pfSense1 php-fpm[31225]: /rc.carpbackup: HA cluster member "(10.7.11.129@vtnet2.711): (INTERN)" has resumed CARP state "BACKUP" for vhid 18
... we see pages of those (275 entries), the last one roughly 2 minutes after the first one!

At this point we also see entries like this:
Mar 18 15:38:08 pfSense1 php-fpm[87138]: /rc.carpbackup: Stopping OpenVPN server instance on DMZ because of transition to CARP backup.
Mar 18 15:38:08 pfSense1 php-fpm[31225]: /rc.carpbackup: Stopping OpenVPN server instance on DMZ because of transition to CARP backup.
Mar 18 15:38:08 pfSense1 check_reload_status[408]: Reloading filter
Mar 18 15:38:08 pfSense1 kernel: ovpns1: link state changed to UP
Mar 18 15:38:08 pfSense1 check_reload_status[408]: rc.newwanip starting ovpns1
Mar 18 15:38:08 pfSense1 check_reload_status[408]: Reloading filter
Mar 18 15:38:08 pfSense1 kernel: ovpns3: link state changed to UP
Mar 18 15:38:08 pfSense1 check_reload_status[408]: rc.newwanip starting ovpns3
Mar 18 15:38:10 pfSense1 kernel: ovpns1: link state changed to DOWN
Mar 18 15:38:10 pfSense1 check_reload_status[408]: Reloading filter
Mar 18 15:38:10 pfSense1 kernel: ovpns1: link state changed to UP
Mar 18 15:38:10 pfSense1 check_reload_status[408]: rc.newwanip starting ovpns1
Mar 18 15:38:13 pfSense1 kernel: ovpns3: link state changed to DOWN
Mar 18 15:38:13 pfSense1 kernel: ovpns1: link state changed to DOWN
Mar 18 15:38:13 pfSense1 kernel: ovpns3: link state changed to UP
Mar 18 15:38:13 pfSense1 check_reload_status[408]: rc.newwanip starting ovpns3

At roughly 4 minutes after leaving persistent carp maintenance mode the system log becomes quiet again and the web interface is reachable again. According to the hypervisor and pfsense itself the CPU usage was relatively high due to all the restarts of services (?). While CARP had transitioned successfully to MASTER (as confirmed by the second firewall) pfsense did require minutes to handle the events. Routing etc during that time worked without any issues but the webiface was down etc…

I'll happily provide more information; just let me know what you need.


Related issues

Related to Bug #12227: Changing VHID on CARP VIP does not update VHID of related IP Alias VIPsResolvedViktor Gurov

Actions
Actions #1

Updated by Steffen Wagner about 2 years ago

I can confirm to see the absolute same behaviour in pfSense 2.6.0 CE with a very similar setup!

Actions #2

Updated by Florian Apolloner about 2 years ago

Okay, this can be nicely reproduced by making the secondary enter & leave persistent carp maintenance mode. I added a debug print to rc.carpbackup saying something along the lines of:

Got triggered for 8@vtnet2.511

When I entered maintenance mode I got Got triggered for 40 times for an interface with 14 ips. The whole cycle roughly takes a minute to finish for all interfaces:

Apr 21 11:52:44 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:52:44 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:52:44 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:52:44 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:52:44 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:52:45 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:52:45 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:52:45 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:52:45 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:52:45 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 18@vtnet2.711
Apr 21 11:52:45 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 18@vtnet2.711
Apr 21 11:52:46 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 18@vtnet2.711
Apr 21 11:52:46 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:46 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:46 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:46 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:46 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:47 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:48 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:49 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:49 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:49 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:50 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:51 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:52 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:52 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:52 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:52 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:54 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:54 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:55 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:55 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:55 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:56 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:57 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:57 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:59 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:59 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:52:59 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:53:00 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:53:00 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:53:01 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:53:02 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:53:02 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:53:03 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:53:03 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:53:03 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:53:04 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:53:05 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:53:06 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:53:06 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:53:06 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 2@vtnet2.192
Apr 21 11:53:07 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 7@vtnet2.707
Apr 21 11:53:07 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 7@vtnet2.707
Apr 21 11:53:09 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 7@vtnet2.707
Apr 21 11:53:09 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 7@vtnet2.707
Apr 21 11:53:09 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 7@vtnet2.707
Apr 21 11:53:09 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 7@vtnet2.707
Apr 21 11:53:09 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 7@vtnet2.707
Apr 21 11:53:10 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 7@vtnet2.707
Apr 21 11:53:10 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 7@vtnet2.707
Apr 21 11:53:10 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 7@vtnet2.707
Apr 21 11:53:10 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 7@vtnet2.707
Apr 21 11:53:11 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 7@vtnet2.707
Apr 21 11:53:11 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 7@vtnet2.707
Apr 21 11:53:11 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 7@vtnet2.707
Apr 21 11:53:11 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:11 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:12 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:12 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:12 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:13 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:14 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:14 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:14 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:15 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:15 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:15 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:16 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:17 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:17 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:17 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:18 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:18 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:19 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:19 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:19 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:20 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:20 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:20 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:21 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:22 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:22 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:22 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:23 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:23 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:24 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:24 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:25 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:25 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:25 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:25 pfSense2 php-fpm[84243]: /rc.carpbackup: Got triggered for 195@vtnet0
Apr 21 11:53:26 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:53:27 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:53:27 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:53:27 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:53:27 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:53:27 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:53:27 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:53:27 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:53:28 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:53:28 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:53:28 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:53:28 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 8@vtnet2.511
Apr 21 11:53:28 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 106@vtnet2.706
Apr 21 11:53:28 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 106@vtnet2.706
Apr 21 11:53:28 pfSense2 php-fpm[14568]: /rc.carpbackup: Got triggered for 106@vtnet2.706
Apr 21 11:53:28 pfSense2 php-fpm[50696]: /rc.carpbackup: Got triggered for 106@vtnet2.706

So the main question is: why do we get this many events.

What I see from the logs: If I do leave persistent carp maint. I get:

Apr 21 11:55:17 pfSense2 kernel: carp: 18@vtnet2.711: BACKUP -> INIT (hardware interface up)
Apr 21 11:55:17 pfSense2 kernel: carp: 18@vtnet2.711: INIT -> BACKUP (initialization complete)
Apr 21 11:55:17 pfSense2 check_reload_status[408]: Carp backup event
Apr 21 11:55:17 pfSense2 kernel: carp: 18@vtnet2.711: BACKUP -> INIT (hardware interface up)
Apr 21 11:55:17 pfSense2 kernel: carp: 18@vtnet2.711: INIT -> BACKUP (initialization complete)
Apr 21 11:55:17 pfSense2 check_reload_status[408]: Carp backup event
Apr 21 11:55:17 pfSense2 check_reload_status[408]: Carp backup event
Apr 21 11:55:17 pfSense2 check_reload_status[408]: Carp backup event
Apr 21 11:55:17 pfSense2 check_reload_status[408]: Carp backup event

Apr 21 11:55:19 pfSense2 php-fpm[37401]: /rc.carpbackup: Got triggered for 18@vtnet2.711
Apr 21 11:55:20 pfSense2 php-fpm[60048]: /rc.carpbackup: Got triggered for 18@vtnet2.711
Apr 21 11:55:20 pfSense2 php-fpm[29213]: /rc.carpbackup: Got triggered for 18@vtnet2.711
Apr 21 11:55:20 pfSense2 php-fpm[66744]: /rc.carpbackup: Got triggered for 18@vtnet2.711

From the looks of it I will get a backup event when switching from BACKUP -> INIT which is weird (I guess?). Then I also seem to get an extra event for every IP alias on the interface which results in this massive duplication.

Oddly enough I see no events at all for interfaces which only have a CARP IP and no aliases on them…

Actions #3

Updated by Florian Apolloner about 2 years ago

Ok, this gets all triggered via https://github.com/pfsense/pfsense/blob/48cf54f850c5bf4fe26a8e33deb449807e71c204/src/etc/pfSense-devd.conf#L32-L50 -- I wonder if it would make sense to dedup those events somehow.

Actions #4

Updated by Florian Apolloner about 2 years ago

Okay, I do have found the cause for this issue: https://github.com/pfsense/pfsense/commit/6514012d33705dda99d0def4421f5560ad969af5 -- commenting out this code block leads back to normal behavior. Original issue: https://redmine.pfsense.org/issues/12227

Actions #5

Updated by Viktor Gurov about 2 years ago

  • Project changed from pfSense Plus to pfSense
  • Category changed from CARP to CARP
  • Affected Plus Version deleted (22.01)
  • Affected Version set to 2.6.0
Actions #6

Updated by Viktor Gurov about 2 years ago

  • Related to Bug #12227: Changing VHID on CARP VIP does not update VHID of related IP Alias VIPs added
Actions #7

Updated by Viktor Gurov about 2 years ago

  • Assignee set to Viktor Gurov
  • Target version set to 2.7.0
  • Plus Target Version set to 22.05
Actions #8

Updated by Jim Pingle about 2 years ago

  • Status changed from New to Pull Request Review
Actions #9

Updated by Florian Apolloner about 2 years ago

Any chance of sharing the patch here for a community review? I think I have a good idea about what is going wrong and might have opinions on possible fixes…

Actions #10

Updated by Viktor Gurov about 2 years ago

  • Status changed from Pull Request Review to Feedback
  • % Done changed from 0 to 100
Actions #11

Updated by Florian Apolloner about 2 years ago

While this most certainly fixes the reported issue I feel like this change is still somewhat fragile. The main problem here imo is that the kernel will send CARP events for changes of the VIPs and we have no good way of detecting that because the carp event handlers only get vhid@iface. Deduping/aggregating CARP events seems like a dangerous option so the next best option would be to reconfigure the IP alias if (and only if) the VHID did change. While this would still cause an event storm (at least on a single interface) if a CARP IP is changed it would limit the fallout and reduce the chances of locking up the firewall.

Actions #12

Updated by Viktor Gurov about 2 years ago

  • Status changed from Feedback to New

Florian Apolloner wrote in #note-11:

While this most certainly fixes the reported issue I feel like this change is still somewhat fragile. The main problem here imo is that the kernel will send CARP events for changes of the VIPs and we have no good way of detecting that because the carp event handlers only get vhid@iface. Deduping/aggregating CARP events seems like a dangerous option so the next best option would be to reconfigure the IP alias if (and only if) the VHID did change. While this would still cause an event storm (at least on a single interface) if a CARP IP is changed it would limit the fallout and reduce the chances of locking up the firewall.

optimization:
https://gitlab.netgate.com/pfSense/pfSense/-/merge_requests/739

Actions #13

Updated by Jim Pingle about 2 years ago

  • Status changed from New to Pull Request Review
Actions #14

Updated by Viktor Gurov about 2 years ago

  • Status changed from Pull Request Review to Feedback
Actions #15

Updated by Florian Apolloner about 2 years ago

I only looked over the code because I am heading out into the weekend but the code looks good. Thanks for that Viktor!

Actions #16

Updated by Bill Hughes almost 2 years ago

Florian Apolloner wrote in #note-15:

I only looked over the code because I am heading out into the weekend but the code looks good. Thanks for that Viktor!

We are seeing the exact same behaviour in our setup (22.01).

Unclear as to the meaning of the status "Feedback" -- does this mean it will likely be in the next release (22.05)?

Note: I looked for instructions on when/why/how to post or indicate our interest. Sorry if this is not the correct place. Thank you for all the work this team does.

Actions #17

Updated by Jim Pingle almost 2 years ago

  • Status changed from Feedback to Resolved

I'm only seeing one event per VIP now as expected.

Actions #18

Updated by Jim Pingle 11 months ago

  • Subject changed from CARP event storm when leaving persistent CARP maintenance mode. to CARP event storm when leaving persistent CARP maintenance mode
Actions

Also available in: Atom PDF