Bug #14083

closed

Adding MSS and MTU values on a LAGG VLAN interface breaks connectivity

Added by Danilo Zrenjanin over 1 year ago. Updated about 1 month ago.

Status: Resolved
Priority: High
Assignee:
Category: Interfaces
Target version:
Start date:
Due date:
% Done: 100%
Estimated time:
Plus Target Version: 24.11
Release Notes: Default
Affected Version:
Affected Architecture: All

Description

Steps to reproduce:

  1. Under Interfaces/WAN, define MTU 1480 and MSS 1440. Save and Apply the changes.
  2. Reboot the firewall.
  3. Following the boot logs, I found nothing pointing to the issue. However, after the boot process finished, the WAN interface didn't receive an IP address from the DHCP server, and the DHCP server didn't provide addresses to the clients connected to the LAN interface. I tried manually defining an IP address on the client machine, but I couldn't ping the LAN address 192.168.1.1 (XG-7100) or access the GUI.
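
For context, the two values in step 1 are related by simple arithmetic: an MSS of 1440 is what fits in a 1480-byte MTU after a 20-byte IPv4 header and a 20-byte TCP header (both without options). The sketch below only shows that calculation. The function name is illustrative, and it does not model how pfSense itself interprets the MSS field.

```python
# Minimum header sizes in bytes, without IP or TCP options.
IPV4_HEADER = 20
IPV6_HEADER = 40
TCP_HEADER = 20


def max_tcp_payload(mtu: int, ipv6: bool = False) -> int:
    """Largest TCP payload that fits in a single packet of the given MTU."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - TCP_HEADER


print(max_tcp_payload(1480))             # 1440
print(max_tcp_payload(1480, ipv6=True))  # 1420
```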

Files

diff.txt (3.97 KB), patch, Marcos M, 05/11/2024 12:11 AM

Related issues

Is duplicate of Bug #9453: Reconfiguring a parent LAGG interface breaks its VLANs (Resolved, Marcos M, 04/04/2019)

Actions #1

Updated by Jordan G over 1 year ago

Confirmed on a 7100 running 23.01. After setting MTU/MSS and rebooting, the system receives and displays an IP address on WAN in the console, but the GUI cannot be reached and a ping test from the console reports "sendto: network is down". Pinging from a different host returns "destination host unreachable".

Actions #2

Updated by Jim Pingle over 1 year ago

  • Project changed from pfSense to pfSense Plus
  • Category changed from Interfaces to Interfaces
  • Status changed from New to Duplicate
Actions #3

Updated by Jim Pingle over 1 year ago

  • Status changed from Duplicate to New
Actions #4

Updated by Danilo Zrenjanin over 1 year ago

We had a customer complaining about similar behavior on a Netgate 2100. However, I couldn't reproduce it on a Netgate 2100: I defined MTU/MSS on mvneta0 and mvneta1, and everything worked fine. It seems that only the XG-7100 is affected.

Actions #5

Updated by Lukas Macura over 1 year ago

Is there any progress here?
This is a serious bug that affects path MTU discovery on all XG-7100s.
Is there a workaround, please?

Actions #6

Updated by Danilo Zrenjanin over 1 year ago

  • Priority changed from Normal to High

I tested against:

23.05-RC (amd64)
built on Mon May 15 22:17:39 UTC 2023
FreeBSD 14.0-CURRENT

The problem persists.

Actions #7

Updated by Joakim Plate over 1 year ago

I think I may be affected by this on a Netgate 3100. I had an MTU of 1480 set on the WAN interface, which had seemingly been working properly for ages on the 22 series. I upgraded to 23.01 yesterday and started having really strange intermittent connection issues with some sites. Once I found this issue, I tried removing the MTU setting on the WAN interface, and things went back to normal. I should add that I'm not sure why I had it set in the first place.

Actions #8

Updated by Kris Phillips over 1 year ago

Just ran into this with another customer running 23.05.1 on a 7100. Adding an <mtu> value to any interface on the switchports will trigger this.

Actions #9

Updated by Kris Phillips over 1 year ago

Other behavior notes:

If you run ifconfig lagg0 from a shell, the lagg shows up and both of the ix interfaces show ACTIVE just fine. However, the "Assign Interfaces" option in the VGA/serial console does not list lagg0 as an assignable interface when this bug is present. Additionally, the VLAN subinterfaces show this for their VLAN config:

groups: vlan
vlan: 0 vlanproto: 0x0000 vlanpcp: 0 parent interface: <none>
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

All of the VLANs show <none> for the parent interface and 0 for the VLAN tag, but the interface is still named lagg0.#### as expected.

If you manage to get into the webConfigurator post-boot and save/apply the VLANs one at a time, they will all come up including the ones with an MTU/MSS set, so it doesn't seem to be a capability issue. Probably something getting "caught up" during the process on boot.
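
The symptom described above (a vlan line reporting parent interface: <none>) can be checked for in saved ifconfig output. The following is a rough sketch, not a supported tool: the function name is hypothetical, the stanza header lines and the healthy lagg0.41 entry in the sample are invented for illustration, and only the "vlan: ... parent interface: <none>" line shape is taken from this thread.

```python
import re


def vlans_without_parent(ifconfig_output: str) -> list[str]:
    """Return names of interfaces whose 'vlan:' line reports parent <none>."""
    broken = []
    current = None
    for line in ifconfig_output.splitlines():
        # An interface stanza starts at column 0, e.g. "lagg0.40: flags=..."
        header = re.match(r"^(\S+): flags=", line)
        if header:
            current = header.group(1)
            continue
        # Detail lines are indented; look for the orphaned-VLAN pattern.
        if current and re.search(r"\bvlan: \d+ .*parent interface: <none>", line):
            broken.append(current)
    return broken


# Invented sample: one orphaned VLAN and one healthy VLAN.
sample = """\
lagg0.40: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1480
\tgroups: vlan
\tvlan: 0 vlanproto: 0x0000 vlanpcp: 0 parent interface: <none>
\tnd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
lagg0.41: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
\tgroups: vlan
\tvlan: 41 vlanproto: 0x8100 vlanpcp: 0 parent interface: lagg0
"""

print(vlans_without_parent(sample))  # ['lagg0.40']
```

On an affected system, the input would be the captured output of ifconfig rather than the sample string.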

Actions #10

Updated by Kris Phillips over 1 year ago

Tested this on the Netgate 3100 and it appears to be isolated to only the 7100. Setting an MTU on LAN while using or not using 802.1q VLAN tagging does not cause any link issues on the switchports.

Actions #11

Updated by Jordan G about 1 year ago

I also seem to be able to reproduce this behavior on the cordoba platform running 23.05.1: use the ix interfaces to create a LAGG (LACP), set its MTU to 9000, then try to adjust the MTU of any child VLAN that is also on the LAGG.

Actions #12

Updated by Daniel Hoffend about 1 year ago

I can confirm the issue with pfSense 2.7. We're using multiple VLAN interfaces on a lagg1 interface (lagg1.40, lagg1.41, ...) and set the MTU to 1440 and the MSS to 1400 (due to VPN tunnels and unknown provider links).

When I create the lagg1 interface and VLAN subinterfaces and change the interface assignments, everything seems to work until I reboot pfSense (a VM under libvirtd). Using the same MTU/MSS settings on a non-lagg interface (vtnet0, for example) works as expected.

Actions #13

Updated by Kris Phillips about 1 year ago

Daniel Hoffend wrote in #note-12:

I can confirm the issue with pfSense 2.7. We're using multiple VLAN interfaces on a lagg1 interface (lagg1.40, lagg1.41, ...) and set the MTU to 1440 and the MSS to 1400 (due to VPN tunnels and unknown provider links).

When I create the lagg1 interface and VLAN subinterfaces and change the interface assignments, everything seems to work until I reboot pfSense (a VM under libvirtd). Using the same MTU/MSS settings on a non-lagg interface (vtnet0, for example) works as expected.

Hello Daniel,

Do you see the same issue as I mentioned earlier regarding the sub-interfaces showing <none> for the parent interface? Please advise.

Actions #14

Updated by Marcos M 11 months ago

  • Project changed from pfSense Plus to pfSense
  • Category changed from Interfaces to Interfaces
  • Status changed from New to Duplicate
Actions #15

Updated by Marcos M 11 months ago

  • Is duplicate of Bug #9453: Reconfiguring a parent LAGG interface breaks its VLANs added
Actions #16

Updated by Marcos M 7 months ago

  • Status changed from Duplicate to Feedback

Part of the issue here has been solved with #9453. Some situations remain where things can break - see: https://redmine.pfsense.org/issues/9453#note-22

Actions #17

Updated by Steve N 7 months ago

This behavior started for me when I moved to 23.05 and persists through 24.03, and is actually worse on 24.03 than it was on 23.09, as now it loops "Configuring VLAN interfaces" in the console and never comes up to the regular console menu.

Actions #18

Updated by Marcos M 7 months ago

  • File lagg.patch added
Thanks for the feedback - hopefully we'll have some better luck reproducing the issue now. In the meantime if it's not too much trouble, I'd be interested to see if the attached patch works. Apply it like so:
  1. Before upgrading to 24.03 where you hit the loop, remove the MTU/MSS values from the lagg interfaces
  2. Upgrade to 24.03
  3. Apply the patch using the System Patches package (copy/paste contents)
  4. Set the MTU/MSS values
  5. Reboot

Edit: We've reproduced the issue - no need to test.

Actions #19

Updated by Marcos M 7 months ago

  • File deleted (lagg.patch)
Actions #20

Updated by Steve N 7 months ago

Sorry, I didn't get notified of your latest post. I take it the patch did NOT resolve the issue then, but you've identified the problem. Please let me know if there's something else I can do to help!

Actions #21

Updated by Marcos M 6 months ago

  • File diff.txt added
  • Subject changed from Adding MSS and MTU values on XG-7100 WAN interface breaks the network connectivity on the firewall to Adding MSS and MTU values on a LAGG VLAN interface breaks connectivity
  • Assignee set to Marcos M
  • Target version set to 2.8.0
  • Plus Target Version set to 24.07
  • Affected Architecture All added
  • Affected Architecture deleted (7100)

The looping issue seems to be triggered when there are at least two assigned VLAN interfaces with a LAGG parent, and one of the VLANs has the MTU/MSS set. The issue is not specific to the 7100. A patch is attached for testing.
https://gitlab.netgate.com/pfSense/pfSense/-/merge_requests/1149
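
The trigger condition described in this note can be written down as a small predicate. This is only a sketch: the AssignedInterface type and its fields are hypothetical and do not reflect pfSense's configuration schema, and it assumes the VLANs are counted across all LAGG parents together, which the note does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AssignedInterface:
    # Hypothetical, simplified view of an assigned interface.
    name: str                   # e.g. "lagg0.40"
    parent: Optional[str]       # parent of a VLAN, or None for non-VLAN interfaces
    mtu: Optional[int] = None   # None means the field is left empty
    mss: Optional[int] = None


def matches_trigger(interfaces: list[AssignedInterface]) -> bool:
    """True if at least two VLANs sit on a LAGG and one of them has MTU/MSS set."""
    lagg_vlans = [i for i in interfaces if i.parent and i.parent.startswith("lagg")]
    has_mtu_or_mss = any(i.mtu is not None or i.mss is not None for i in lagg_vlans)
    return len(lagg_vlans) >= 2 and has_mtu_or_mss


example = [
    AssignedInterface("lagg0.40", "lagg0", mtu=1480, mss=1440),
    AssignedInterface("lagg0.41", "lagg0"),
]
print(matches_trigger(example))  # True
```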

Actions #22

Updated by Marcos M 6 months ago

  • Status changed from Feedback to Pull Request Review
Actions #23

Updated by Steve N 6 months ago

That patch appears to have done the trick, we have successfully booted completely with MTU/MSS values in place.

Actions #24

Updated by Lev Prokofev 6 months ago

The client confirmed that the patch solves the issue #2754566672

Actions #25

Updated by Marcos M 6 months ago

  • Status changed from Pull Request Review to Resolved
  • % Done changed from 0 to 100
Actions #26

Updated by Jim Pingle 6 months ago

  • Plus Target Version changed from 24.07 to 24.08
Actions #27

Updated by Chris Linstruth 6 months ago

This patch resolved an issue I was having as well. My setup: lagg0 assigned, enabled, and unnumbered, with MTU 9000 set on it, plus several VLANs.

Actions #28

Updated by Jordan G 5 months ago

Patch is working as tested on 4100/6100/7100 hardware.

Actions #29

Updated by Jim Pingle about 1 month ago

  • Plus Target Version changed from 24.08 to 24.11