Bug #9453 (closed)
Reconfiguring a parent LAGG interface breaks its VLANs
% Done: 100%
Description
Environment: SG-1000
Not sure whether this applies to other environments.
Upon boot, all the VLANs get orphaned.
The SG-1000 was working previously with a very similar config. I am not sure what caused the issue (the upgrade from 2.4.4 to 2.4.4-p1 or a config upgrade).
I tend to rule out the config; see the tests listed at the end of this report.
Seems similar to:
https://redmine.pfsense.org/issues/3976
https://redmine.pfsense.org/issues/8527
dmesg:
Trying to mount root from ufs:/dev/ufsid/5af4c96aa287b62c [rw,noatime]...
Warning: no time-of-day clock registered, system time will not be set accurately
random: unblocking device.
cpsw0: link state changed to UP
lagg0: IPv6 addresses on cpsw0 have been removed before adding it as a member to prevent IPv6 address scope violation.
lagg0: link state changed to UP
cpsw1: link state changed to UP
lagg0: IPv6 addresses on cpsw1 have been removed before adding it as a member to prevent IPv6 address scope violation.
vlan0: changing name to 'lagg0.7'
vlan1: changing name to 'lagg0.9'
vlan2: changing name to 'lagg0.10'
vlan3: changing name to 'lagg0.11'
vlan4: changing name to 'lagg0.12'
vlan5: changing name to 'lagg0.13'
vlan6: changing name to 'lagg0.8'
lagg0: link state changed to DOWN
lagg0.7: link state changed to DOWN
lagg0.8: link state changed to DOWN
lagg0.9: link state changed to DOWN
lagg0.10: link state changed to DOWN
lagg0.11: link state changed to DOWN
lagg0.12: link state changed to DOWN
lagg0.13: link state changed to DOWN
lagg0: link state changed to UP
cpsw0: promiscuous mode enabled
cpsw1: promiscuous mode enabled
lagg0: promiscuous mode enabled
carp: 1@lagg0: INIT -> BACKUP (initialization complete)
lagg0.11: promiscuous mode enabled
carp: demoted by 240 to 240 (interface down)
lagg0.12: promiscuous mode enabled
carp: demoted by 240 to 480 (interface down)
lagg0.10: promiscuous mode enabled
carp: demoted by 240 to 720 (interface down)
lagg0.9: promiscuous mode enabled
carp: demoted by 240 to 960 (interface down)
lagg0.7: promiscuous mode enabled
carp: demoted by 240 to 1200 (interface down)
carp: 7@lagg0: INIT -> BACKUP (initialization complete)
carp: 7@lagg0: BACKUP -> MASTER (master timed out)
pflog0: promiscuous mode enabled
carp: 7@lagg0: MASTER -> BACKUP (more frequent advertisement received)
ifa_maintain_loopback_route: deletion failed for interface lagg0: 3
Example NIC output.
Please note the following line:
vlan: 0 vlanpcp: 0 parent interface: <none>
- ifconfig lagg0.7
lagg0.7: flags=8903<UP,BROADCAST,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=80000<LINKSTATE>
ether c8:df:84:c1:16:37
inet6 fe80::cadf:84ff:fec1:1637%lagg0.7 prefixlen 64 tentative scopeid 0x8
inet 172.16.77.242 netmask 0xffffff00 broadcast 172.16.77.255
inet 172.16.77.240 netmask 0xffffffff broadcast 172.16.77.240 vhid 6
groups: vlan
carp: INIT vhid 6 advbase 1 advskew 100
vlan: 0 vlanpcp: 0 parent interface: <none>
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
[root@pf2-tos ~]# ifconfig lagg0.8
lagg0.8: flags=8803<UP,BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=80000<LINKSTATE>
ether c8:df:84:c1:16:37
inet6 fe80::cadf:84ff:fec1:1637%lagg0.8 prefixlen 64 tentative scopeid 0xe
inet 172.16.78.242 netmask 0xffffff00 broadcast 172.16.78.255
groups: vlan
vlan: 0 vlanpcp: 0 parent interface: <none>
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
- ifconfig lagg0
lagg0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1492
options=8000b<RXCSUM,TXCSUM,VLAN_MTU,LINKSTATE>
ether c8:df:84:c1:16:37
inet6 fe80::cadf:84ff:fec1:1637%lagg0 prefixlen 64 scopeid 0x7
inet 172.16.8.242 netmask 0xffffff00 broadcast 172.16.8.255
inet 172.16.8.240 netmask 0xffffffff broadcast 172.16.8.240 vhid 1
inet 172.16.8.251 netmask 0xffffffff broadcast 172.16.8.251 vhid 7
laggproto lacp lagghash l2,l3,l4
laggport: cpsw0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
laggport: cpsw1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
groups: lagg
carp: BACKUP vhid 1 advbase 7 advskew 101
carp: BACKUP vhid 7 advbase 1 advskew 100
media: Ethernet autoselect
status: active
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
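The symptom in the output above is the `parent interface: <none>` line on each VLAN. A minimal sketch of how saved `ifconfig` output could be scanned for it follows; the function name `orphaned_vlans` and the trimmed sample text are illustrative assumptions, not part of pfSense.

```python
import re

def orphaned_vlans(ifconfig_text):
    """Return names of interfaces whose vlan line reports no parent."""
    orphans = []
    current = None
    for line in ifconfig_text.splitlines():
        # An interface block starts with "<name>: flags=..."
        header = re.match(r"^(\S+): flags=", line)
        if header:
            current = header.group(1)
        elif current and "parent interface: <none>" in line:
            orphans.append(current)
    return orphans

# Trimmed sample modeled on the output in this report.
sample = """\
lagg0.7: flags=8903<UP,BROADCAST,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
groups: vlan
vlan: 0 vlanpcp: 0 parent interface: <none>
lagg0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1492
groups: lagg
status: active
"""

print(orphaned_vlans(sample))
```

With the trimmed sample, only `lagg0.7` is reported, since `lagg0` itself has no vlan line.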
Tests done:
1) Remove all but one VLAN (11). Reboot. Issue still present.
2) Remove all firewall rules except the anti-lockout rule. Reboot. Issue still present.
3) Factory reset, then restore a previous configuration (just in case).
4) Again remove all VLANs but a couple, change the VLAN for WAN, and remove all firewall rules except the anti-lockout rule. Reboot. Issue still present.
5) Remove all CARP. Reboot. Issue still present.
6) Remove cpsw0 from the LAGG. The VLANs come up (I would bet because of the network restart). Reboot. Issue still persists.
7) Remove the one package that appeared to be installed (nrpe), editing the config file directly because the web UI was not listing the installed packages.
8) Add cpsw0 back to the LAGG and remove cpsw1. The VLANs come up (I would bet because of the network restart). Reboot. Issue still persists.
Updated by Jim Pingle about 5 years ago
- Category changed from Interfaces to LAGG Interfaces
Updated by Marcos M 10 months ago
- Has duplicate Bug #12926: Changing LAGG type on CARP interfaces makes VIPs go to an "init" State added
Updated by Marcos M 10 months ago
- Has duplicate Bug #13344: Vlan loses parent interface when changing LAGG mtu to jumbo frames added
Updated by Marcos M 10 months ago
- Has duplicate Bug #14603: LAGG VLAN Interfaces report parent no longer exists added
Updated by Marcos M 10 months ago
- Has duplicate Bug #14083: Adding MSS and MTU values on a LAGG VLAN interface breaks connectivity added
Updated by Marcos M 10 months ago
- Has duplicate Bug #13473: No IPv6 address acquired after reboot/dhcp6c not starting added
Updated by Marcos M 10 months ago
- Subject changed from VLAN Interfaces on LAGG get orphaned at boot to Reconfiguring the parent LAGG interface does not handle its child VLANs
- Status changed from New to In Progress
- Assignee set to Marcos M
- Target version set to 2.8.0
- % Done changed from 0 to 50
- Plus Target Version set to 24.03
- Release Notes set to Default
Updated by Marcos M 9 months ago
- Status changed from In Progress to Feedback
- % Done changed from 50 to 100
Fixed in 88674cdb01ba38adc71f12be73e0305bb6f57ccd.
Updated by Mike Moore 9 months ago
Could the fix resolve https://redmine.pfsense.org/issues/14659 or https://redmine.pfsense.org/issues/14483?
Updated by Marcos M 9 months ago
- Subject changed from Reconfiguring the parent LAGG interface does not handle its child VLANs to Reconfiguring a parent LAGG interface breaks its VLANs
Mike Moore wrote in #note-10:
Could the fix resolve https://redmine.pfsense.org/issues/14659 or https://redmine.pfsense.org/issues/14483
Nope. Those are essentially the same issue: interfaces are reconfigured rather than "updated" when a change is made.
Updated by Jordan G 5 months ago
- Status changed from Resolved to Confirmed
Changing anything regarding the parent interface stops all communication.
lagg0.4091: flags=8803<UP,BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
description: LAN
options=4000000<MEXTPG>
ether
inet 192.168.71.1 netmask 0xffffff00 broadcast 192.168.71.255
inet6 fe80::208:a2ff:fe10:1176%lagg0.4091 prefixlen 64 tentative scopeid 0x15
groups: vlan
vlan: 0 vlanproto: 0x0000 vlanpcp: 0 parent interface: <none>
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
Tested with a 7100 on the 24.03 release.
Updated by Jim Pingle 5 months ago
- Plus Target Version changed from 24.03 to 24.07
Updated by Steve Wheeler 5 months ago
I can't replicate that in 24.03. Setting the lagg0 interface MTU (after assigning it) on a 7100 results in a ~30s outage while the lagg re-establishes. But after that it starts passing traffic again without intervention.
Updated by Marcos M 5 months ago
- Status changed from Confirmed to Resolved
To reproduce the issue, the parent interface (lagg0) needs to be added to the configuration as disabled. When an interface is configured as disabled in the GUI, the interface is not added to the system, hence the child interfaces (e.g. lagg0.4091) have no parent and do not work. I believe this is the expected behavior; any inconsistencies that lead to lagg0 existing while disabled in the GUI should be detailed on a separate redmine.
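The condition described above (a VLAN whose parent is assigned but disabled) could be checked in a configuration backup roughly as sketched below. This is only an illustration: the element names (`interfaces`, `enable`, `if`, `vlans`, `vlan`, `vlanif`) and the sample document are assumptions loosely modeled on a pfSense config.xml and should be verified against a real backup; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed, simplified config layout; not copied from a real system.
SAMPLE = """<pfsense>
  <interfaces>
    <lan><enable/><if>lagg0.4091</if></lan>
    <opt1><if>lagg0</if></opt1>
  </interfaces>
  <vlans>
    <vlan><if>lagg0</if><tag>4091</tag><vlanif>lagg0.4091</vlanif></vlan>
  </vlans>
</pfsense>"""

def vlans_with_disabled_parent(xml_text):
    """List VLAN interfaces whose parent port is assigned but not enabled."""
    root = ET.fromstring(xml_text)
    # Map each assigned port to whether its assignment carries <enable/>.
    assigned = {}
    for iface in root.find("interfaces"):
        assigned[iface.findtext("if")] = iface.find("enable") is not None
    flagged = []
    for vlan in root.iter("vlan"):
        parent = vlan.findtext("if")
        # An unassigned parent is a different case and is not flagged here.
        if assigned.get(parent) is False:
            flagged.append(vlan.findtext("vlanif"))
    return flagged

print(vlans_with_disabled_parent(SAMPLE))
```

In the sample, lagg0 is assigned as opt1 without an enable element, so lagg0.4091 is flagged. A parent that is not assigned at all is deliberately left out, since the comment above only concerns parents that are assigned and disabled.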
Updated by Steve N 5 months ago
- File 1000020721.jpg added
Steve Wheeler wrote in #note-16:
I can't replicate that in 24.03. Setting the lagg0 interface MTU (after assigning it) in a 7100 results in a ~30s outage while the lagg re-establishes. But after that it starts passing traffic again without intervention.
Reboot the device. In my case, this is a surefire way to "break" it.
In fact, I recently updated to 24.03 after seeing this status as "Resolved", and it's worse on this version. My unit gets stuck in a loop at "Configuring VLAN interfaces" and never finishes coming up. I had to boot into single user mode and manually restore a configuration without the MTU/MSS configuration in order to get it to boot. Previously (23.09) it would boot but act dead, no traffic on LAN/WAN (lagg0.4091/lagg0.4090) but would at least boot into the normal serial console menu so I could easily reset config from there.
Steps to reproduce:
Configure MTU=1428 and MSS=1388 on the WAN (lagg0.4090) interface. Reboot.
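The two values in the steps above are consistent with the usual IPv4 relationship of MSS = MTU minus 40 bytes (a 20-byte IPv4 header plus a 20-byte TCP header, both without options). A small sketch of that arithmetic, with a hypothetical helper name:

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
TCP_HEADER = 20   # bytes, TCP header without options

def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one IPv4 packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER

print(mss_for_mtu(1428))  # 1388
```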
Updated by Marcos M 5 months ago
@Steve N
Do you have the parent lagg interface assigned and disabled? See:
https://redmine.pfsense.org/issues/15452
Updated by Marcos M 5 months ago
- Related to Bug #15452: Unexpected/Undefined behaviour of disabled interfaces added
Updated by Steve N 5 months ago
I don't even know how I would assign and disable the interface. My bug was actually https://redmine.pfsense.org/issues/14083, but it was marked as a duplicate of this one, so I responded here. The LAGG0 interface is not assigned to anything in the Assignments section of the web UI, if that answers the question.
Updated by Jordan G 3 months ago
The patch from https://redmine.pfsense.org/issues/14083 works to prevent the connectivity issues experienced as a result of changing MTU values, tested on xg-7100 and 41/6100. Looks resolved currently.