Bug #4446

closed

IP Alias with CARP VIP parent is not removed from OS on secondary node when deleted

Added by Andreas Pflug about 9 years ago. Updated about 9 years ago.

Status:
Resolved
Priority:
Normal
Category:
CARP
Target version:
Start date:
02/19/2015
Due date:
% Done:

100%

Estimated time:
Plus Target Version:
Release Notes:
Affected Version:
2.2
Affected Architecture:

Description

I used to have a CARP VIP for any IP address my pf cluster has to handle since that used to be the only way (changed in 2.1 I believe?).

With 2.2, I've been defining a single IP Alias on the WAN's CARP VIP, and had trouble with the backup router flipping its WAN VIP to master, resulting in a dual-master situation and intermittent problems.
Redefining the VIP to CARP made the problem disappear immediately.

Actions #1

Updated by Chris Buechler about 9 years ago

  • Status changed from New to Feedback

This definitely works in general. You end up in dual master when the alias doesn't sync across for some reason, or, in one case I've seen, when that IP was manually added on the secondary as an IP alias of an interface: it then fails to add it as an alias of the CARP VIP, which gives you dual master.

likely this is a misconfiguration along those lines. if it's not, there is some circumstance more specific than described that's triggering it.

Actions #2

Updated by Andreas Pflug about 9 years ago

Chris Buechler wrote:

This definitely works in general. You end up in dual master when the alias doesn't sync across for some reason, or, in one case I've seen, when that IP was manually added on the secondary as an IP alias of an interface: it then fails to add it as an alias of the CARP VIP, which gives you dual master.

While trying to solve the problem, I manually removed all VIFs (carp as well as alias) from the secondary's config.xml and rebooted. After that, I re-created the VIFs by triggering a sync from the master, so I made sure there's no backup specific config.

To make it clear: the dual-master situation is on the CARP interface that's used for IP alias; removing the IP alias restored normal master-backup CARP.

What info do you need to investigate "some circumstances more specific" which seem to apply here?

Actions #3

Updated by Chris Buechler about 9 years ago

what does the output of ifconfig show on the secondary?

Actions #4

Updated by Andreas Pflug about 9 years ago

Ok, I know how to reproduce this.

On the master, I have this config:

xn0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        inet xx.xx.xx.91 netmask 0xfffffff8 broadcast xx.xx.xx.95
        inet xx.xx.xx.90 netmask 0xfffffff8 broadcast xx.xx.xx.95 vhid 90
        inet xx.xx.xx.94 netmask 0xfffffff8 broadcast xx.xx.xx.95 vhid 94
        carp: MASTER vhid 90 advbase 1 advskew 0
        carp: MASTER vhid 94 advbase 1 advskew 0

No config is performed on the backup, just syncing over.

xn0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        inet xx.xx.xx.92 netmask 0xfffffff8 broadcast xx.xx.xx.95
        inet xx.xx.xx.90 netmask 0xfffffff8 broadcast xx.xx.xx.95 vhid 90
        inet xx.xx.xx.94 netmask 0xfffffff8 broadcast xx.xx.xx.95 vhid 94
        carp: BACKUP vhid 90 advbase 1 advskew 100
        carp: BACKUP vhid 94 advbase 1 advskew 100

Looks as expected.

Now I create an IP alias on vhid94, and apply. Things still work ok.

I delete the IP alias. The page won't show "Apply settings now" so I don't do it; the alias is gone immediately, but now CARP on vhid94 is double-master.

I'm triggering a sync by saving CARP settings, no improvement; the backup will show this:

xn0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        inet xx.xx.xx.92 netmask 0xfffffff8 broadcast xx.xx.xx.95
        inet xx.xx.xx.90 netmask 0xfffffff8 broadcast xx.xx.xx.95 vhid 90
        inet xx.xx.xx.94 netmask 0xfffffff8 broadcast xx.xx.xx.95 vhid 94
        inet xx.xx.xx.93 netmask 0xfffffff8 broadcast xx.xx.xx.95 vhid 94
        carp: BACKUP vhid 90 advbase 1 advskew 100
        carp: MASTER vhid 94 advbase 1 advskew 100

Apparently, the deletion of the IP alias wasn't propagated correctly to the secondary.

Actions #5

Updated by Jim Pingle about 9 years ago

  • Subject changed from double CARP master if used for IP alias to IP Alias with CARP VIP parent is not removed from OS on secondary node when deleted
  • Status changed from Feedback to Confirmed

Updated the description to be more accurate. The actual problem appears to be that deleting an IP Alias VIP with a CARP parent from the primary does not also delete that VIP from the actual interface on the secondary. The GUI entry is gone but if you check ifconfig the entry is still present on the secondary.

Because the secondary has an IP address using the VHID that the master does not, it believes it should be master instead (at least for that VIP).
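
A rough illustration of why a differing address set can produce two masters. It assumes, as I understand FreeBSD's CARP and not as stated in this ticket, that each advertisement carries a keyed SHA-1 digest covering the vhid and the addresses configured on it, so a peer with a different address list rejects the advertisement and promotes itself. The function name, password and byte layout are hypothetical and do not reproduce the kernel's exact format.

```python
import hashlib
import hmac
import ipaddress


def advertisement_digest(password, vhid, addresses):
    """Simplified stand-in for the keyed digest in a CARP advertisement.

    Assumption: the digest covers the vhid and the sorted address list.
    This is NOT the kernel's exact byte layout.
    """
    mac = hmac.new(password, digestmod=hashlib.sha1)
    mac.update(bytes([vhid]))
    # Sort so that configuration order does not influence the digest.
    for addr in sorted(ipaddress.IPv4Address(a) for a in addresses):
        mac.update(addr.packed)
    return mac.hexdigest()


password = b"example-password"  # hypothetical shared CARP password

# Primary after the alias was deleted; secondary still holds the stale alias.
primary = advertisement_digest(password, 72, ["192.168.71.1"])
secondary = advertisement_digest(password, 72, ["192.168.71.1", "192.168.71.5"])
print(primary == secondary)  # False: each node ignores the other's advertisements

# Once the stale alias is removed from the secondary, the digests agree again.
cleaned = advertisement_digest(password, 72, ["192.168.71.1"])
print(primary == cleaned)  # True
```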

In the example below, I added 192.168.71.5 as an IP alias VIP using the CARP VIP as the parent, and then removed it.

root@primary: ifconfig em1
em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
    ether 00:0c:29:9b:d3:d5
    inet 192.168.71.2 netmask 0xffffff00 broadcast 192.168.71.255 
    inet6 fe80::20c:29ff:fe9b:d3d5%em1 prefixlen 64 scopeid 0x2 
    inet 192.168.71.1 netmask 0xffffff00 broadcast 192.168.71.255 vhid 72 
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
    carp: MASTER vhid 72 advbase 1 advskew 0

root@secondary: ifconfig em1
em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
    ether 00:0c:29:f5:96:9f
    inet 192.168.71.3 netmask 0xffffff00 broadcast 192.168.71.255 
    inet6 fe80::20c:29ff:fef5:969f%em1 prefixlen 64 scopeid 0x2 
    inet 192.168.71.1 netmask 0xffffff00 broadcast 192.168.71.255 vhid 72 
    inet 192.168.71.5 netmask 0xffffff00 broadcast 192.168.71.255 vhid 72 
    nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active
    carp: MASTER vhid 72 advbase 1 advskew 100
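
The comparison of the two outputs above can be automated. This is a minimal sketch, assuming the ifconfig text from each node has already been captured; the helper names are hypothetical and only the inet/carp lines from the example are included.

```python
import re

# Matches IPv4 lines that carry a vhid, e.g.
# "inet 192.168.71.5 netmask 0xffffff00 broadcast 192.168.71.255 vhid 72"
INET_VHID = re.compile(r"^\s*inet\s+(\S+)\s.*\bvhid\s+(\d+)\s*$")


def vhid_addresses(ifconfig_text):
    """Return {vhid: set of IPv4 addresses} parsed from ifconfig output."""
    result = {}
    for line in ifconfig_text.splitlines():
        m = INET_VHID.match(line)
        if m:
            result.setdefault(int(m.group(2)), set()).add(m.group(1))
    return result


def stale_on_secondary(primary_text, secondary_text):
    """Addresses the secondary holds per vhid that the primary does not."""
    primary = vhid_addresses(primary_text)
    stale = {}
    for vhid, addrs in vhid_addresses(secondary_text).items():
        extra = addrs - primary.get(vhid, set())
        if extra:
            stale[vhid] = extra
    return stale


PRIMARY = """\
em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    inet 192.168.71.2 netmask 0xffffff00 broadcast 192.168.71.255
    inet 192.168.71.1 netmask 0xffffff00 broadcast 192.168.71.255 vhid 72
    carp: MASTER vhid 72 advbase 1 advskew 0
"""

SECONDARY = """\
em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
    inet 192.168.71.3 netmask 0xffffff00 broadcast 192.168.71.255
    inet 192.168.71.1 netmask 0xffffff00 broadcast 192.168.71.255 vhid 72
    inet 192.168.71.5 netmask 0xffffff00 broadcast 192.168.71.255 vhid 72
    carp: MASTER vhid 72 advbase 1 advskew 100
"""

print(stale_on_secondary(PRIMARY, SECONDARY))  # {72: {'192.168.71.5'}}
```

An empty result means both nodes carry the same addresses on every vhid; anything else points at an alias that was not removed from the OS on the secondary.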

Actions #6

Updated by Chris Buechler about 9 years ago

  • Assignee set to Ermal Luçi
Actions #7

Updated by Ermal Luçi about 9 years ago

  • Status changed from Confirmed to Feedback
  • % Done changed from 0 to 100
Actions #8

Updated by Ermal Luçi about 9 years ago

Actions #9

Updated by Chris Buechler about 9 years ago

  • Status changed from Feedback to Confirmed

that didn't change the behavior. That whole code path there seems to have issues, like what I fixed in my last commit on this ticket. For IP aliases with a CARP parent, the oldvips array ends up containing something like:

array(1) {
  ["192.0.2.205"]=>
  string(23) "wan_vip200192.0.2.20532" 
}

Then under "Cleanup remaining old carps", as soon as it starts that foreach, $oldvipar['interface'] is only a single letter. 'w' in the above case.
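
A small Python emulation of the symptom, for readers who don't have the PHP code at hand. It assumes PHP 5 semantics, where a non-numeric offset on a string is coerced to 0, so indexing the string with 'interface' returns its first character. Reading the 23-character value as parent interface, address and prefix length concatenated is an inference from the dump above, and php5_offset() is a hypothetical helper, not pfSense code.

```python
def php5_offset(value, key):
    """Rough emulation of PHP 5 indexing for this one case (assumption)."""
    if isinstance(value, dict):
        return value[key]
    # On a string, a non-numeric key such as 'interface' is coerced to offset 0.
    text = str(key)
    index = int(text) if text.lstrip("-").isdigit() else 0
    return value[index]


# What the dump shows: a flat string stored per address.
fingerprint = "wan_vip200" + "192.0.2.205" + "32"
assert len(fingerprint) == 23  # matches string(23) in the dump

oldvips = {"192.0.2.205": fingerprint}
for subnet, oldvipar in oldvips.items():
    print(php5_offset(oldvipar, "interface"))  # w

# What the cleanup loop appears to expect: a structured record per address.
record = {"interface": "wan_vip200", "subnet": "192.0.2.205", "subnet_bits": "32"}
print(php5_offset(record, "interface"))  # wan_vip200
```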

Actions #10

Updated by Phillip Davis about 9 years ago

This looks related to commit https://github.com/pfsense/pfsense/commit/89f171b052fbe72aed654d2a1c3d5a24e9bf9902
That was applied only to master back on 16 Jan 2015. Among other things it moved function get_possible_listen_ips() from system.inc to interfaces.inc and changed the format of the array returned by that function.
I noticed this when testing with some stuff from master - see pull request https://github.com/pfsense/pfsense/pull/1549 which fixes up a use of get_possible_listen_ips() by services_snmp.php
So it is a bit tricky to make changes in this area - using get_possible_listen_ips() on master is different to using it on 2.2 branch.
Also, the code in master, interfaces.inc, function get_possible_traffic_source_addresses() looks wrong. It calls get_possible_listen_ips(), which returns an array in the "new" format. get_possible_traffic_source_addresses() calls the returned array $sourceips, then proceeds to add VPN server/client entries to the $sourceips array, but in the "old" format.
I have no idea why the commit was only applied to master - but it will cause confusion for getting related code into 2.2 branch and it has bugs that I can see.

Actions #11

Updated by Renato Botelho about 9 years ago

  • Assignee changed from Ermal Luçi to Renato Botelho
Actions #12

Updated by Renato Botelho about 9 years ago

  • Status changed from Confirmed to Feedback
Actions #15

Updated by Chris Buechler about 9 years ago

  • Status changed from Feedback to Resolved

fixed
