Bug #16454
IPv6 CARP events initiated by HA/pfsync
Status: Open
Description
Hello,
I've just upgraded from a very old version of pfSense (2.4.5p1) all the way up to pfSense 2.8.1.
On pfSense 2.8.1, I am getting IPv6 CARP alerts any time a configuration change is made on the primary or when pfSense does its hourly sync. I have plenty of CARP virtual IPs (IPv4 plus IPv6), but when the sync event happens, only the IPv6 VIPs generate alerts.
Below is a portion of the system.log output (from the secondary first, then the primary) that shows this happening. I've redacted any public IPs and email addresses from the output.
Secondary node system.log snippet:
Sep 29 09:00:00 deal-edge2 php[41376]: [pfBlockerNG] Starting cron process.
Sep 29 09:00:00 deal-edge2 php[41376]: /usr/local/www/pfblockerng/pfblockerng.php: Configuration Change: (system): pfBlockerNG: saving DNSBL changes
Sep 29 09:00:00 deal-edge2 check_reload_status[502]: Syncing firewall
Sep 29 09:00:00 deal-edge2 php-fpm[17161]: /xmlrpc.php: Configuration Change: (system)@192.168.230.251: Merged in config (pfblockerng, pfblockerngipsettings, pfblockerngdnsblsettings, pfblockernglistsv4, pfblockerngdnsbl, pfblockerngsafesearch sections) from XMLRPC client.
Sep 29 09:00:00 deal-edge2 check_reload_status[502]: Syncing firewall
Sep 29 09:00:02 deal-edge2 php-fpm[63508]: /xmlrpc.php: Configuration Change: (system)@192.168.230.251: Merged in config (staticroutes, gateways, virtualip, system, aliases, ca, cert, crl, dhcpd, dnshaper, filter, ipsec, nat, openvpn, schedules, shaper, unbound, wol sections) from XMLRPC client.
Sep 29 09:00:02 deal-edge2 check_reload_status[502]: Syncing firewall
Sep 29 09:00:02 deal-edge2 check_reload_status[502]: Carp backup event
Sep 29 09:00:02 deal-edge2 check_reload_status[502]: Carp backup event
Sep 29 09:00:02 deal-edge2 check_reload_status[502]: Carp backup event
Sep 29 09:00:02 deal-edge2 check_reload_status[502]: Carp backup event
Sep 29 09:00:02 deal-edge2 php-fpm[63508]: /xmlrpc.php: waiting for pfsync...
Sep 29 09:00:02 deal-edge2 kernel: carp: 31@igb5: BACKUP -> INIT (hardware interface up)
Sep 29 09:00:02 deal-edge2 kernel: carp: 31@igb5: INIT -> BACKUP (initialization complete)
Sep 29 09:00:02 deal-edge2 kernel: carp: 33@igb5: BACKUP -> INIT (hardware interface up)
Sep 29 09:00:02 deal-edge2 kernel: carp: 33@igb5: INIT -> BACKUP (initialization complete)
Sep 29 09:00:02 deal-edge2 kernel: carp: demoted by 0 to 0 (pfsync bulk start)
Sep 29 09:00:03 deal-edge2 php-fpm[63508]: /xmlrpc.php: pfsync done in 1 seconds.
Sep 29 09:00:03 deal-edge2 php-fpm[63508]: /xmlrpc.php: Configuring CARP settings finalize...
Sep 29 09:00:03 deal-edge2 check_reload_status[502]: Reloading filter
Sep 29 09:00:03 deal-edge2 php-fpm[63508]: /xmlrpc.php: Default gateway setting as default.
Sep 29 09:00:03 deal-edge2 php-fpm[63508]: /xmlrpc.php: Removing static route for monitor 8.8.8.8 and adding a new route through xxx.xxx.xxx.xxx
Sep 29 09:00:03 deal-edge2 php-fpm[63508]: /xmlrpc.php: Removing static route for monitor 1.1.1.1 and adding a new route through yyy.yyy.yyy.yyy
Sep 29 09:00:03 deal-edge2 php-cgi[74266]: notify_monitor.php: Message sent to <redacted@example.com> OK
Sep 29 09:00:04 deal-edge2 php-fpm[41797]: /rc.carpbackup: HA cluster member "(XXXX:XXXX:XXXX:dea1::1@igb5): (WANLT)" has resumed CARP state "BACKUP" for vhid 33
Sep 29 09:00:05 deal-edge2 php-fpm[16650]: /rc.carpbackup: HA cluster member "(XXXX:XXXX:XXXX::2@igb5): (WANLT)" has resumed CARP state "BACKUP" for vhid 31
Sep 29 09:00:05 deal-edge2 php-fpm[64393]: /rc.carpbackup: HA cluster member "(XXXX:XXXX:XXXX:dea1::1@igb5): (WANLT)" has resumed CARP state "BACKUP" for vhid 33
Sep 29 09:00:05 deal-edge2 php-fpm[17161]: /rc.carpbackup: HA cluster member "(XXXX:XXXX:XXXX::2@igb5): (WANLT)" has resumed CARP state "BACKUP" for vhid 3
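For anyone trying to quantify how many VIPs flap on each sync event, a rough sketch like the following tallies the kernel CARP state transitions per vhid@interface from a system.log stream (it assumes the stock FreeBSD log format shown above; the function name is illustrative):

```shell
#!/bin/sh
# Sketch: count kernel CARP state transitions per vhid@interface from
# system.log text on stdin, e.g. lines like:
#   Sep 29 09:00:02 deal-edge2 kernel: carp: 31@igb5: BACKUP -> INIT (...)
count_carp_transitions() {
    awk '/kernel: carp: [0-9]+@/ {
        # the field after "carp:" is "31@igb5:"; strip the trailing colon
        for (i = 1; i < NF; i++)
            if ($i == "carp:") { v = $(i + 1); sub(/:$/, "", v) }
        n[v]++
    }
    END { for (k in n) print k, n[k] }'
}

# Example usage on the firewall:
#   count_carp_transitions < /var/log/system.log
```

In the secondary snippet above, this would report two transitions each for 31@igb5 and 33@igb5 at the moment the XMLRPC sync lands.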
Primary node system.log output snippet:
Sep 29 09:00:00 deal-edge1 php[75712]: [pfBlockerNG] Starting cron process.
Sep 29 09:00:00 deal-edge1 php[75712]: /usr/local/www/pfblockerng/pfblockerng.php: Configuration Change: (system): pfBlockerNG: saving DNSBL changes
Sep 29 09:00:00 deal-edge1 check_reload_status[574]: Syncing firewall
Sep 29 09:00:00 deal-edge1 php[75712]: [pfBlockerNG] XMLRPC sync is starting.
Sep 29 09:00:00 deal-edge1 php[75712]: /usr/local/www/pfblockerng/pfblockerng.php: Beginning XMLRPC sync data to https://192.168.230.252:10443/xmlrpc.php.
Sep 29 09:00:00 deal-edge1 php[75712]: /usr/local/www/pfblockerng/pfblockerng.php: XMLRPC reload data success with https://192.168.230.252:10443/xmlrpc.php (pfsense.merge_installedpackages_section).
Sep 29 09:00:00 deal-edge1 php[75712]: [pfBlockerNG] XMLRPC sync to [ 192.168.230.252:{port} ] completed successfully.
Sep 29 09:00:01 deal-edge1 php-fpm[29035]: /rc.filter_synchronize: Beginning XMLRPC sync data to https://192.168.230.252:10443/xmlrpc.php.
Sep 29 09:00:01 deal-edge1 php-fpm[29035]: /rc.filter_synchronize: XMLRPC reload data success with https://192.168.230.252:10443/xmlrpc.php (pfsense.host_firmware_version).
Sep 29 09:00:01 deal-edge1 php-fpm[29035]: /rc.filter_synchronize: XMLRPC versioncheck: 24.0 -- 24.0
Sep 29 09:00:01 deal-edge1 php-fpm[29035]: /rc.filter_synchronize: Beginning XMLRPC sync data to https://192.168.230.252:10443/xmlrpc.php.
Sep 29 09:00:13 deal-edge1 php-fpm[29035]: /rc.filter_synchronize: XMLRPC reload data success with https://192.168.230.252:10443/xmlrpc.php (pfsense.restore_config_section).
Updated by Jim Pingle 6 months ago
- Related to Bug #16508: CARP VIPs for IPv6 switch to INIT status and then back to BACKUP on all XMLRPC sync events added
Updated by Levi Grizzle 10 days ago
I'm also noticing an issue with IPv6 and failover between a pair of 8300 Max units running Plus 26.03. After investigating and being directed to this Redmine report, I was advised to add my information here as well.
Upon firing off 'Temporarily Disable CARP' from the primary, failover appears to do its job accordingly; however, a short while later the WAN interface's IPv6 CARP entry appears to die. The upstream routers (a VRRP setup) lose IPv6 connectivity to the WAN interface's CARP entry only; IPv4 continues to work without issue. While the primary unit is still in temporary CARP maintenance mode, nothing stands out in either unit's logs at the moment IPv6 drops. I'm also observing that IPv6 on my DMZ interface does NOT drop at this time, but because of the drop on the WAN interface, routing to the DMZ IPv6 subnet is cut off: the upstream default route points at the WAN's IPv6 CARP entry, which is unavailable.
To restore IPv6 on the WAN interface, a simple edit to the CARP assignment, initiated from the primary unit, restores this individual CARP entry and routing resumes. After that, packet loss ensues downstream on the DMZ interface, where an IPv6 IP Alias is tied to the DMZ CARP IPv6 entry. Packet loss is also seen directly to the CARP entry assigned to the DMZ interface, from a remote node directly connected to that segment.
I've also observed that when CARP entries are edited from the primary unit, those interfaces take on MASTER status under CARP. The IPv6 CARP entry on the DMZ interface of both units is currently MASTER, while for the WAN interface the IPv6 CARP entry is MASTER on the primary unit and BACKUP on the secondary. Editing a CARP entry from the master while it is in temporary CARP disable mode appears to re-enable CARP only on the edited interfaces.
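When comparing what the two units believe about each vhid, a quick sketch like this can reduce ifconfig output to "vhid state" pairs for a side-by-side diff (it assumes the stock FreeBSD ifconfig format, where CARP status lines look like `carp: MASTER vhid 33 advbase 1 advskew 0`; the function name is illustrative):

```shell
#!/bin/sh
# Sketch: extract "vhid state" pairs from ifconfig output on stdin,
# e.g. "    carp: MASTER vhid 33 advbase 1 advskew 0"  ->  "33 MASTER"
carp_states() {
    awk '$1 == "carp:" { print $4, $2 }'
}

# Example usage on each unit:
#   ifconfig igb5 | carp_states
```

Seeing the same vhid report MASTER on both units at once, as described above, would confirm a split-brain condition rather than a display issue.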
Furthermore, after placing the primary unit in 'Persistent CARP Maintenance Mode', the IPv6 CARP entry for the WAN interface on the secondary unit resumed MASTER status, restoring proper operation of the associated CARP entries (WAN/DMZ IPv6) and apparently clearing the CARP-related issues on the secondary unit altogether, but only for a period of time before IPv6 on the WAN interface (CARP only) drops again. It should also be noted that when the primary unit was put through the 'Enter Persistent CARP Maintenance Mode' process, an error message popped up reporting the following:
"CARP has detected a problem and this unit has a non-zero demotion status.
Check the link status on all interfaces configured with CARP VIPs and search the System Log for CARP demotion-related events."
Upon executing 'Leave Persistent CARP Maintenance Mode' on the primary unit, failback does not appear to take place until 'Reset CARP Demotion Status' is executed. At that point, both IPv4 and IPv6 continue to work without any issues.
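For reference, the demotion status that the reset button clears is the kernel's net.inet.carp.demotion counter, and per FreeBSD's carp(4) a write to that sysctl adds the supplied delta rather than setting an absolute value. A minimal sketch of clearing it from a shell (the sysctl name is standard FreeBSD; the helper function is illustrative):

```shell
#!/bin/sh
# Sketch: per carp(4), net.inet.carp.demotion is adjusted by a DELTA on
# write, so clearing a non-zero counter means writing its negation.
demotion_reset_delta() {
    echo $(( 0 - $1 ))
}

# On the firewall this would be used roughly as:
#   cur=$(sysctl -n net.inet.carp.demotion)
#   sysctl net.inet.carp.demotion=$(demotion_reset_delta "$cur")
```

A non-zero counter suppresses CARP preemption, which would explain why failback stalls until the status is reset.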
The end result: HA failover with IPv6 is holding us back from pushing these units into production.
Updated by Kris Phillips 7 days ago
- Status changed from New to Feedback
Tested this on 26.03 with Multicast CARP and static IPv6; I'm no longer able to recreate it. I was previously able to on 25.07.1 and earlier, so it seems this was fixed along the way, either intentionally or inadvertently.