Bug #14083
closed
Adding MSS and MTU values on a LAGG VLAN interface breaks connectivity
Added by Danilo Zrenjanin over 1 year ago. Updated about 1 month ago.
100%
Description
Steps to reproduce:
- Under Interfaces/WAN, define MTU 1480 and MSS 1440. Save and Apply the changes.
- Reboot the firewall.
- Following the boot logs, I found nothing pointing to the issue. However, after the boot process finished, the WAN interface didn't receive an IP address from the DHCP server, and the DHCP server didn't provide addresses to the clients connected to the LAN interface. I tried manually defining an IP address on the client machine, but I couldn't ping the 192.168.1.1 (XG-7100) LAN address or access the GUI.
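For reference, the MTU and MSS values in the steps above are related by plain header arithmetic: for TCP over IPv4 with no IP or TCP options, the MSS is the MTU minus a 20-byte IP header and a 20-byte TCP header. A minimal sketch follows; the function name is illustrative only, and it makes no claim about how pfSense applies the configured values internally.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
TCP_HEADER = 20   # bytes, TCP header without options


def mss_for_mtu(mtu: int) -> int:
    """Return the largest TCP payload that fits in one IPv4 packet of size mtu."""
    if mtu <= IPV4_HEADER + TCP_HEADER:
        raise ValueError("MTU too small to carry a TCP segment")
    return mtu - IPV4_HEADER - TCP_HEADER


print(mss_for_mtu(1480))  # 1440, the pair of values used in this report
```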
Updated by Jordan G over 1 year ago
Confirmed on a 7100 running 23.01. After setting MTU/MSS and rebooting, the system receives and displays an IP on WAN in the console, but the GUI cannot be reached and a ping test from the console reports "sendto: network is down". Trying to ping from a different host returns "destination host unreachable".
Updated by Jim Pingle over 1 year ago
- Project changed from pfSense to pfSense Plus
- Category changed from Interfaces to Interfaces
- Status changed from New to Duplicate
Updated by Danilo Zrenjanin over 1 year ago
We had a customer complaining about similar behavior on a Netgate 2100. However, I couldn't reproduce this behavior on a Netgate 2100. I defined MTU/MSS on mvneta0 and mvneta1, and everything worked fine. It seems that only the XG-7100 is affected.
Updated by Lukas Macura over 1 year ago
Is there any progress here?
This is a serious bug that affects path MTU discovery on all XG-7100s.
Is there any workaround for this please?
Updated by Danilo Zrenjanin over 1 year ago
- Priority changed from Normal to High
I tested against:
23.05-RC (amd64) built on Mon May 15 22:17:39 UTC 2023 FreeBSD 14.0-CURRENT
The problem persists.
Updated by Joakim Plate over 1 year ago
I think I may be affected by this on a Netgate 3100. I had an MTU of 1480 set on the WAN interface, which had seemingly been working properly for ages on the 22 series. I then upgraded to 23.01 yesterday and started having really strange intermittent connection issues (some sites only). Once I found this issue, I tried removing the MTU setting on the WAN interface, and things went back to normal. It should be said that I'm not sure why I had it set in the first place.
Updated by Kris Phillips over 1 year ago
Just ran into this with another customer running 23.05.1 on a 7100. Adding an <mtu> value to any interface on the switchports will trigger this.
Updated by Kris Phillips over 1 year ago
Other behavior notes:
If you run ifconfig lagg0 from the shell, the lagg will show up and both of the ix interfaces will show ACTIVE just fine. However, the "Assign Interfaces" option from the VGA/Serial console will not show lagg0 as an assignable interface with this bug. Additionally, the VLAN subinterfaces will show this for their VLAN config:
groups: vlan
vlan: 0 vlanproto: 0x0000 vlanpcp: 0 parent interface: <none>
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
All of the VLANs will show <none> for the parent interface and 0 for the VLAN tag, but the interface will still be lagg0.#### as expected.
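A quick way to spot this state across many subinterfaces is to scan the ifconfig output for the `vlan:` line. The sketch below is only an illustration: the regular expression is written against the line format quoted above, and the function name is hypothetical.

```python
import re

# Matches the "vlan:" status line as quoted in this report.
VLAN_LINE = re.compile(
    r"vlan:\s*(?P<tag>\d+)\s+.*parent interface:\s*(?P<parent>\S+)"
)


def vlan_is_detached(ifconfig_output: str) -> bool:
    """Return True if the output shows VLAN tag 0 or no parent interface."""
    for line in ifconfig_output.splitlines():
        match = VLAN_LINE.search(line)
        if match:
            return match.group("tag") == "0" or match.group("parent") == "<none>"
    return False  # no "vlan:" line found, so there is nothing to flag


sample = "vlan: 0 vlanproto: 0x0000 vlanpcp: 0 parent interface: <none>"
print(vlan_is_detached(sample))  # True
```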
If you manage to get into the webConfigurator post-boot and save/apply the VLANs one at a time, they will all come up including the ones with an MTU/MSS set, so it doesn't seem to be a capability issue. Probably something getting "caught up" during the process on boot.
Updated by Kris Phillips over 1 year ago
Tested this on the Netgate 3100 and it appears to be isolated to only the 7100. Setting an MTU on LAN while using or not using 802.1q VLAN tagging does not cause any link issues on the switchports.
Updated by Jordan G about 1 year ago
I also seem to be able to reproduce this behavior on the cordoba platform running 23.05.1: using the ix interfaces to create a LAGG (LACP), setting its MTU to 9000, and then trying to adjust the MTU of any of the child VLANs that are also on the LAGG.
Updated by Daniel Hoffend about 1 year ago
I can confirm the issue with pfSense 2.7. We're using multiple VLAN interfaces on a lagg1 interface (lagg1.40, lagg1.41, ...) and set the MTU to 1440 and the MSS to 1400 (due to VPN tunnels and unknown provider links).
When I create the lagg1 interface and VLAN subinterfaces and change the interface assignments, everything seems to work until I reboot pfSense (a VM via libvirtd). Using the same MTU/MSS settings on a non-lagg interface (vtnet0, for example) works as expected.
Updated by Kris Phillips about 1 year ago
Daniel Hoffend wrote in #note-12:
I can confirm the issue with pfSense 2.7. We're using multiple VLAN interfaces on a lagg1 interface (lagg1.40, lagg1.41, ...) and set the MTU to 1440 and the MSS to 1400 (due to VPN tunnels and unknown provider links).
When I create the lagg1 interface and VLAN subinterfaces and change the interface assignments, everything seems to work until I reboot pfSense (a VM via libvirtd). Using the same MTU/MSS settings on a non-lagg interface (vtnet0, for example) works as expected.
Hello Daniel,
Do you see the same issue as I mentioned earlier regarding the sub-interfaces showing <none> for the parent interface? Please advise.
Updated by Marcos M 7 months ago
- File lagg.patch added
- Before upgrading to 24.03 where you hit the loop, remove the MTU/MSS values from the lagg interfaces
- Upgrade to 24.03
- Apply the patch using the System Patches package (copy/paste contents)
- Set the MTU/MSS values
- Reboot
Edit: We've reproduced the issue - no need to test.
Updated by Marcos M 6 months ago
- File diff.txt added
- Subject changed from Adding MSS and MTU values on XG-7100 WAN interface breaks the network connectivity on the firewall to Adding MSS and MTU values on a LAGG VLAN interface breaks connectivity
- Assignee set to Marcos M
- Target version set to 2.8.0
- Plus Target Version set to 24.07
- Affected Architecture All added
- Affected Architecture deleted (7100)
The looping issue seems to be triggered when there are at least two assigned VLAN interfaces with a LAGG parent, and one of the VLANs has the MTU/MSS set. The issue is not specific to the 7100. A patch is attached for testing.
https://gitlab.netgate.com/pfSense/pfSense/-/merge_requests/1149
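To check whether a given setup matches the trigger condition described above, the rule can be written as a small predicate. This is a sketch under stated assumptions: the `AssignedVlan` record is hypothetical and not pfSense's configuration schema, and grouping the VLANs by the same LAGG parent is one reading of the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AssignedVlan:
    # Hypothetical record for illustration; not pfSense's config schema.
    name: str                  # e.g. "lagg1.40"
    parent: str                # e.g. "lagg1"
    mtu: Optional[int] = None  # None means the field is left empty
    mss: Optional[int] = None


def matches_trigger(vlans: list[AssignedVlan]) -> bool:
    """True if at least two assigned VLANs share a LAGG parent and one sets MTU or MSS."""
    by_parent: dict[str, list[AssignedVlan]] = {}
    for vlan in vlans:
        if vlan.parent.startswith("lagg"):
            by_parent.setdefault(vlan.parent, []).append(vlan)
    return any(
        len(group) >= 2
        and any(v.mtu is not None or v.mss is not None for v in group)
        for group in by_parent.values()
    )


# Values taken from the lagg1.40 / lagg1.41 report earlier in this thread.
setup = [
    AssignedVlan("lagg1.40", "lagg1", mtu=1440, mss=1400),
    AssignedVlan("lagg1.41", "lagg1"),
]
print(matches_trigger(setup))  # True
```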
Updated by Lev Prokofev 6 months ago
The client confirmed that the patch solves the issue #2754566672
Updated by Jim Pingle 6 months ago
- Plus Target Version changed from 24.07 to 24.08
Updated by Chris Linstruth 6 months ago
This patch resolved an issue I was having as well. lagg0 assigned, enabled, and unnumbered, MTU 9000 set on it, and several VLANs.
Updated by Jim Pingle about 1 month ago
- Plus Target Version changed from 24.08 to 24.11