Bug #975
closed
CARP / vip interface disappears on slave after interface change
Description
While testing 2.0 (build Mon Oct 25 02:28:25 EDT 2010), I think I have found an issue when
multiple CARP virtual interfaces are configured.
After changing the configuration on an interface (for example, changing the subnet mask),
the system seems to forget about the (first?) vip on the slave box:
ifconfig -a before:
master:
vip200: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
inet 172.20.0.15 netmask 0xffffffff
carp: MASTER vhid 200 advbase 1 advskew 0
vip201: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
inet 94.143.111.70 netmask 0xffffffff
carp: MASTER vhid 201 advbase 1 advskew 0
backup:
vip200: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
inet 172.20.0.15 netmask 0xffffffff
carp: BACKUP vhid 200 advbase 1 advskew 100
vip201: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
inet xx.xx.xx.70 netmask 0xffffffff
carp: BACKUP vhid 201 advbase 1 advskew 100
(Now, on the MASTER, change the interface subnet mask, click apply changes)
Afterwards, on the slave, vip200 is gone:
vip201: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
inet xx.xx.xx.70 netmask 0xffffffff
carp: BACKUP vhid 201 advbase 1 advskew 100
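The before/after comparison above can be scripted when reproducing this. The following is a minimal sketch, assuming ifconfig output in the shape pasted in this report (an interface line followed by its inet and carp lines). The helper names parse_carp and missing_vips are hypothetical and not part of pfSense.

```python
import re

# Interface header, e.g. "vip200: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500"
IFACE_RE = re.compile(r"^(\w+): flags=")
# CARP status line, e.g. "carp: BACKUP vhid 200 advbase 1 advskew 100"
CARP_RE = re.compile(r"^\s*carp: (\w+) vhid (\d+) advbase (\d+) advskew (\d+)")


def parse_carp(ifconfig_output):
    """Return {interface: (state, vhid)} for every interface with a carp line."""
    result = {}
    current = None
    for line in ifconfig_output.splitlines():
        header = IFACE_RE.match(line)
        if header:
            current = header.group(1)
            continue
        carp = CARP_RE.match(line)
        if carp and current is not None:
            result[current] = (carp.group(1), int(carp.group(2)))
    return result


def missing_vips(before, after):
    """Names of CARP interfaces present in `before` but absent from `after`."""
    return sorted(set(parse_carp(before)) - set(parse_carp(after)))


# Slave output copied from this report (address redacted as in the original).
BEFORE = """\
vip200: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
inet 172.20.0.15 netmask 0xffffffff
carp: BACKUP vhid 200 advbase 1 advskew 100
vip201: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
inet xx.xx.xx.70 netmask 0xffffffff
carp: BACKUP vhid 201 advbase 1 advskew 100
"""
AFTER = """\
vip201: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
inet xx.xx.xx.70 netmask 0xffffffff
carp: BACKUP vhid 201 advbase 1 advskew 100
"""

print(missing_vips(BEFORE, AFTER))  # ['vip200']
```

Saving `ifconfig -a` to a file on the slave before and after the change on the master, then feeding both files to missing_vips, gives a quick yes/no on whether a vip was dropped.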
If, on the slave, I go to interfaces -> LAN -> SAVE, without making any
changes, it restores the vips.
(At this point, on the slave, it sometimes jumps out to a wrong, old style stylesheet and thinks
the interface is not enabled. Click to home page to get it back again.)
vip201: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
inet xx.xx.xx.70 netmask 0xffffffff
carp: BACKUP vhid 201 advbase 1 advskew 100
vip200: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
inet 172.20.0.15 netmask 0xffffffff
carp: BACKUP vhid 200 advbase 1 advskew 100
Also, I have observed a problem when doing ifconfig xxx down on the primary:
the slave takes over as it should, but on doing ifconfig xxx up on the primary,
the primary gets stuck in INIT state and does not take over. (Or is there some sort
of preempt delay before it comes back? I tried waiting, but it seemed to stay like that.)
The web interface shows both as INIT on the primary; ifconfig actually shows
that vip200 is stuck in INIT, but vip201 claims to be MASTER:
vip200: flags=8<LOOPBACK> metric 0 mtu 1500
inet 172.20.0.15 netmask 0xffffffff
carp: INIT vhid 200 advbase 1 advskew 0
vip201: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
inet xx.xx.xx.70 netmask 0xffffffff
carp: MASTER vhid 201 advbase 1 advskew 0
Meanwhile, the secondary also claims to be the master, but for vip200:
vip201: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
inet xx.xx.xx.70 netmask 0xffffffff
carp: BACKUP vhid 201 advbase 1 advskew 100
vip200: flags=49<UP,LOOPBACK,RUNNING> metric 0 mtu 1500
inet 172.20.0.15 netmask 0xffffffff
carp: MASTER vhid 200 advbase 1 advskew 100
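To make the inconsistency in the two listings explicit, here is a small sketch that compares the per-vhid CARP states of both boxes. It assumes the states have already been collected into plain dictionaries; the function name carp_problems and the problem labels are hypothetical. Whether MASTER roles split across nodes count as a fault depends on the setup, so that check is an assumption about a cluster where one box is expected to hold all vips.

```python
def carp_problems(primary, secondary):
    """Compare {vhid: state} maps from two nodes and list anything suspicious.

    Returns a list of (vhid, problem, primary_state, secondary_state) tuples.
    """
    problems = []
    master_nodes = set()
    for vhid in sorted(set(primary) | set(secondary)):
        p = primary.get(vhid, "MISSING")
        s = secondary.get(vhid, "MISSING")
        if p == "MASTER":
            master_nodes.add("primary")
        if s == "MASTER":
            master_nodes.add("secondary")
        # Each vhid should have exactly one MASTER across the pair.
        if (p, s).count("MASTER") != 1:
            problems.append((vhid, "expected exactly one MASTER", p, s))
        # INIT means the vip is not participating; MISSING means it is gone.
        if p in ("INIT", "MISSING") or s in ("INIT", "MISSING"):
            problems.append((vhid, "interface down or absent", p, s))
    # Assumption: a healthy pair keeps all MASTER roles on a single node.
    if len(master_nodes) > 1:
        problems.append((None, "MASTER roles split across nodes", None, None))
    return problems


# States taken from the two ifconfig listings above.
primary = {200: "INIT", 201: "MASTER"}
secondary = {200: "MASTER", 201: "BACKUP"}

for problem in carp_problems(primary, secondary):
    print(problem)
# (200, 'interface down or absent', 'INIT', 'MASTER')
# (None, 'MASTER roles split across nodes', None, None)
```

For the states in this report, each vhid still has exactly one MASTER, so the visible faults are vip200 sitting in INIT on the primary and the two vips being mastered by different boxes.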
On the master, clicking on the interface config page and then SAVE restores
it to being the master, but then deletes the first VIP from the slave as described
above.
(see attached screenshots)
Regards,
Rob
Updated by Chris Buechler about 14 years ago
- Project changed from pfSense Packages to pfSense
Updated by Chris Buechler about 14 years ago
- Category set to Virtual IP Addresses
- Target version set to 2.0
Updated by Rob Lister about 14 years ago
I think this is possibly related to Bug #959
Will wait and see if that is corrected first and test again on a newer version.
Rob
Updated by Ermal Luçi about 14 years ago
- Status changed from New to Feedback
Please try latest snapshot.
Updated by Rob Lister about 14 years ago
Have updated both boxes to snapshot built on Wed Oct 27 18:59:53 EDT 2010 and the problem
still seems to be there.
Rob
Updated by Rob Lister about 14 years ago
Maybe that snapshot doesn't include the patch, as the date on the patch is
20:56 on the 27th. The latest snapshot is only from Oct 27 18:59:53.
Updated by Chris Buechler about 14 years ago
Rob, is this fixed on the latest snapshot?
Updated by Rob Lister about 14 years ago
Yes. I had been unable to update because of problems with the amd64 build, and met with a disaster that
meant I had to rebuild it all from the last known bootable version.
(I hit the problem described in http://redmine.pfsense.org/issues/995)
We have to use this version in production, as the Broadcom NIC is not properly supported in the stable version.
(It only comes up at 10/full and we need 1000/full!)
But I got brave enough to try again, and as of build: Wed Nov 17 05:45:58 UTC 2010, it has been running
for about 9 days and this problem has gone.
Thanks,
Rob
Updated by Chris Buechler about 14 years ago
- Status changed from Feedback to Resolved