Bug #8507

FreeBSD 11.2-BETA dhclient always uses server MTU value

Added by Jim Pingle 7 months ago. Updated 2 months ago.

Status: Resolved
Priority: Normal
Category: Interfaces
Target version:
Start date: 05/11/2018
Due date:
% Done: 0%
Estimated time:
Affected Version: 2.4.4
Affected Architecture: All

Description

I hit this while looking into #8506. It may not be related, since it happens on other hardware. It also started around the switch to 11-stable.

Something is setting the MTU too low, and it's not in the configuration.

May 11 07:26:34 blooper kernel: nd6_setmtu0: new link MTU on mvneta0 (576) is too small for IPv6

That message happens on both an igb box and SG-3100. The routes for this interface only have that same low MTU:

: netstat -4rnW | egrep '(Mtu|mvneta0)'
Destination        Gateway            Flags       Use    Mtu      Netif Expire
default            216.252.41.1       UGS       21213    576    mvneta0
8.8.8.8            216.252.41.1       UGHS      55825    576    mvneta0
208.123.73.7       216.252.41.1       UGHS     182429    576    mvneta0
209.51.181.2       216.252.41.1       UGHS     489319    576    mvneta0
216.252.41.0/24    link#1             U             0    576    mvneta0
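
A quick way to spot the same symptom on another box is to scan the routing table for routes with an unusually low MTU. The following is a minimal sketch, not part of pfSense: the function name `low_mtu_routes` is hypothetical, the sample text is abbreviated from the output above, and the default threshold of 1280 is the IPv6 minimum link MTU that the `nd6_setmtu0` message refers to.

```python
# Sketch: flag routes whose MTU is below a chosen threshold.
# SAMPLE is abbreviated from the `netstat -4rnW` output in this ticket.
SAMPLE = """\
Destination        Gateway            Flags       Use    Mtu      Netif Expire
default            216.252.41.1       UGS       21213    576    mvneta0
8.8.8.8            216.252.41.1       UGHS      55825    576    mvneta0
216.252.41.0/24    link#1             U             0    576    mvneta0
"""


def low_mtu_routes(netstat_text, min_mtu=1280):
    """Return (destination, netif, mtu) for each route with MTU < min_mtu.

    Assumes the first non-empty line is the header row containing
    the "Mtu" and "Netif" column names.
    """
    rows = [line.split() for line in netstat_text.splitlines() if line.strip()]
    header, routes = rows[0], rows[1:]
    mtu_col = header.index("Mtu")
    netif_col = header.index("Netif")
    found = []
    for fields in routes:
        # Skip rows that are too short or have a non-numeric MTU field.
        if len(fields) <= netif_col or not fields[mtu_col].isdigit():
            continue
        mtu = int(fields[mtu_col])
        if mtu < min_mtu:
            found.append((fields[0], fields[netif_col], mtu))
    return found


if __name__ == "__main__":
    for dest, netif, mtu in low_mtu_routes(SAMPLE):
        print(f"{dest} via {netif}: MTU {mtu}")
```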

The interface configuration is a DHCP WAN:

        <opt1>
            <descr><![CDATA[Cable]]></descr>
            <if>igb2</if>
            <enable></enable>
            <alias-address></alias-address>
            <alias-subnet>32</alias-subnet>
            <spoofmac>00:xx:xx:xx:xx:xx</spoofmac>
            <monitorip>x.x.x.1</monitorip>
            <ipaddr>dhcp</ipaddr>
            <dhcphostname></dhcphostname>
            <dhcprejectfrom>192.168.100.1</dhcprejectfrom>
            <adv_dhcp_pt_timeout></adv_dhcp_pt_timeout>
            <adv_dhcp_pt_retry></adv_dhcp_pt_retry>
            <adv_dhcp_pt_select_timeout></adv_dhcp_pt_select_timeout>
            <adv_dhcp_pt_reboot></adv_dhcp_pt_reboot>
            <adv_dhcp_pt_backoff_cutoff></adv_dhcp_pt_backoff_cutoff>
            <adv_dhcp_pt_initial_interval></adv_dhcp_pt_initial_interval>
            <adv_dhcp_pt_values>SavedCfg</adv_dhcp_pt_values>
            <adv_dhcp_send_options></adv_dhcp_send_options>
            <adv_dhcp_request_options></adv_dhcp_request_options>
            <adv_dhcp_required_options></adv_dhcp_required_options>
            <adv_dhcp_option_modifiers></adv_dhcp_option_modifiers>
            <adv_dhcp_config_advanced></adv_dhcp_config_advanced>
            <adv_dhcp_config_file_override></adv_dhcp_config_file_override>
            <adv_dhcp_config_file_override_path></adv_dhcp_config_file_override_path>
        </opt1>

It does have a spoofed MAC, and there is a GIF interface on top of that interface as well as an OpenVPN and IPsec, but attempting to configure those on a fresh lab install doesn't trigger the issue, and removing them or disabling them here doesn't seem to affect it.

Other interfaces (PPPoE WAN, multiple local LANs) are unaffected.

I commented out every call to pfSense_interface_mtu() in interfaces.inc and the low MTU still happened, so it doesn't appear to be related to that function at least.

Attachment: supersede-advanced.diff (531 Bytes), Jim Pingle, 10/02/2018 02:38 PM

Associated revisions

Revision 5fed4bf2 (diff)
Added by Jim Pingle 7 months ago

Supersede the DHCP server MTU to avoid setting it improperly and/or causing a link state loop. Ticket #8507 Ticket #8506

This requires a patch from https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206721#c12 which garga has imported into the tree.

Revision 5a703552 (diff)
Added by Jim Pingle 2 months ago

Supersede the DHCP MTU when advanced options are present. Issue #8507

Revision 3a8836a1 (diff)
Added by Jim Pingle 2 months ago

Supersede the DHCP MTU when advanced options are present. Issue #8507

(cherry picked from commit 5a7035523e9f70fa568d688915bf4aed2f0aac41)

History

#1 Updated by Jim Pingle 7 months ago

Same thing happens on a factory default configuration. Looking deeper at packet captures of the DHCP packets shows that the ISP is sending an MTU of 576. This was not requested on 11.1, but is being requested and applied now.

/usr/local/sbin/pfSense-dhclient-script doesn't have any code to process the MTU, so it may be getting set by dhclient directly.

Comparing 2.4.3-p1 and 2.4.4, the only difference in the request is that the 2.4.4 client requests option 26 (Interface MTU) in addition to the other parameters.
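
For anyone repeating the packet capture comparison, the check comes down to two DHCP options defined in RFC 2132: option 55 (Parameter Request List) in the client request, and option 26 (Interface MTU, a 2-byte unsigned value) in the server reply. The sketch below is illustrative only. The helper names and the example byte strings are made up for this note and are not taken from the actual captures.

```python
import struct

OPT_PAD = 0
OPT_END = 255
OPT_INTERFACE_MTU = 26       # RFC 2132: 2-byte unsigned MTU
OPT_PARAM_REQUEST_LIST = 55  # RFC 2132: list of requested option codes


def parse_options(data):
    """Parse a DHCP options field (the bytes after the magic cookie).

    Returns a dict mapping option code to raw value bytes.
    """
    options = {}
    i = 0
    while i < len(data):
        code = data[i]
        if code == OPT_PAD:          # single-byte padding, no length field
            i += 1
            continue
        if code == OPT_END:          # end of options
            break
        if i + 1 >= len(data):
            raise ValueError("truncated option header")
        length = data[i + 1]
        value = data[i + 2:i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated option value")
        options[code] = value
        i += 2 + length
    return options


def requests_mtu(request_options):
    """True if the client's parameter request list includes option 26."""
    return OPT_INTERFACE_MTU in request_options.get(OPT_PARAM_REQUEST_LIST, b"")


def offered_mtu(reply_options):
    """Return the MTU from option 26, or None if absent or malformed."""
    raw = reply_options.get(OPT_INTERFACE_MTU)
    if raw is None or len(raw) != 2:
        return None
    return struct.unpack("!H", raw)[0]


if __name__ == "__main__":
    # Hypothetical option bytes: a request asking for options 1, 3, 6, 26
    # and a reply carrying an MTU of 576 (0x0240).
    request = bytes([53, 1, 3, 55, 4, 1, 3, 6, 26, 255])
    reply = bytes([53, 1, 5, 26, 2, 0x02, 0x40, 255])
    print(requests_mtu(parse_options(request)), offered_mtu(parse_options(reply)))
```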

#2 Updated by Jim Pingle 7 months ago

Looks like this is a recent change in FreeBSD dhclient to add support for the MTU:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206721
https://svnweb.freebsd.org/base?view=revision&revision=331179
https://reviews.freebsd.org/D5675
https://svnweb.freebsd.org/base?view=revision&revision=239564

It seems they were aware it could cause link bounces, and changed dhclient so it doesn't exit when the link drops, but it appears to still trip up our link handling scripts.

Based on that, this could end up being the same root cause as #8506 but it manifested differently depending on the hardware.

#3 Updated by Jim Pingle 7 months ago

I tried setting an explicit request list in the generated dhclient configuration which does not send a request for the MTU. That didn't help: if the server sends the MTU, it is still taken and used.

I also tried setting a supersede statement for interface-mtu with 0, 1, and other values (e.g. 1402), all of which appear to be ignored in favor of the server-side MTU value every time.

It looks like, given the way it was coded into dhclient in https://reviews.freebsd.org/D5675, the server MTU can't be ignored or overridden, which is rather awful. We may need to back that patch out or convince FreeBSD that it needs to be fixed to at least have a way to turn that off or respect the supersede. Preferably disable it entirely.

#4 Updated by Jim Pingle 7 months ago

  • Subject changed from Interface MTU being set incorrectly in some cases to FreeBSD 11.2-BETA dhclient always uses server MTU value

Updated the subject to be more accurate.

I also dropped a note on https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206721#c10

#5 Updated by Jim Pingle 7 months ago

  • Status changed from New to Feedback

Renato committed a patch that was added to the FreeBSD PR that should let supersede work; the next snapshots should be better.

#6 Updated by Jim Pingle 7 months ago

The patch looks good. Setting a supersede of 0 in the dhclient config now allows the MTU change to be ignored. The test system I could replicate the MTU issue on is OK now, and there are multiple confirmations that the latest snapshot is working as expected from forum users.
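
To confirm that a given box actually has the statement in place, the generated dhclient configuration can be checked for it. This is a rough sketch under stated assumptions: the function name `superseded_mtu` and the sample configuration text are hypothetical (only the script path comes from this ticket), and the comment handling is naive, so a `#` inside a quoted string would be misread.

```python
import re

# Matches e.g. "supersede interface-mtu 0;" anywhere on a line.
STATEMENT = re.compile(r"\bsupersede\s+interface-mtu\s+(\d+)\s*;")

# Hypothetical example of a generated dhclient configuration.
SAMPLE_CONF = """\
interface "igb2" {
    timeout 60;
    retry 15;
    supersede interface-mtu 0;
    script "/usr/local/sbin/pfSense-dhclient-script";
}
"""


def superseded_mtu(conf_text):
    """Return the superseded interface-mtu value, or None if not present."""
    for line in conf_text.splitlines():
        code = line.split("#", 1)[0]   # drop trailing comments (naive)
        match = STATEMENT.search(code)
        if match:
            return int(match.group(1))
    return None


if __name__ == "__main__":
    value = superseded_mtu(SAMPLE_CONF)
    if value is None:
        print("no supersede for interface-mtu; a server-sent MTU may be applied")
    else:
        print(f"interface-mtu superseded with {value}")
```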

We need to keep an eye on https://reviews.freebsd.org/D15484 as it evolves and is eventually committed to 11.2. Once it makes it into the tree, the manually added patch can be removed.

#7 Updated by Jim Pingle 6 months ago

  • Status changed from Feedback to Assigned
  • Assignee set to Renato Botelho
  • Priority changed from Very High to Normal
  • Affected Architecture set to All

The supersede change was committed and now has been MFC'd as well:

https://svnweb.freebsd.org/base?view=revision&revision=334787

We should be able to remove the patch we added manually in favor of the version now in the tree.

#8 Updated by Jim Pingle 5 months ago

  • Status changed from Assigned to Resolved

We're on 11.2-RELEASE now with stock patches, working as expected.

#9 Updated by Bennett Feitell 2 months ago

This is still causing problems in pfSense 2.4.4-RELEASE.

I believe that the upstream patches to FreeBSD introduce the ability to supersede the option 26 interface-mtu value being returned by the DHCP server. The user must affirmatively add "supersede interface-mtu 0", or the system will set the bad MTU in the DHCP server response. Option 26 information is still being requested, and if it is returned by the server with the lease it will be enforced to the point of overriding an explicitly set MTU. If the MTU given in the lease is bad, things break.

Please see these threads:
https://forum.netgate.com/topic/136089/solved-and-revised-2-4-4-release-arpresolve-can-t-allocate-llinfo-for-gateway-on-interface0-dhcp-mtu-576
https://forum.netgate.com/topic/136121/wan-interfaces-fail-to-return-after-power-outage

#10 Updated by Jim Pingle 2 months ago

We already set that (see the linked commit above).

My edge firewall failed horribly because my ISP sent the MTU even when not requested, but it is working now. So there is something else happening here that isn't necessarily related to the dhclient config, but perhaps the presence of the proper code/patch in the binary.

#11 Updated by Bennett Feitell 2 months ago

All I know is that I needed to affirmatively set "supersede interface-mtu 0" in the option modifiers for the WAN config. The incantation does not seem to be issued by default. It takes user action from what I have seen.

#12 Updated by Jim Pingle 2 months ago

https://github.com/pfsense/pfsense/blob/master/src/etc/inc/interfaces.inc#L4990

The line should always be there by default, unless there is some other customization that prevented it from working. Check your copy of /etc/inc/interfaces.inc and make sure it matches the above line.

If you have any other custom settings in the WAN DHCP client settings, I'd be interested to know what they are.

#13 Updated by Bennett Feitell 2 months ago

I suspect that a prior hard setting of MTU on the interface may be interfering with the propagation of the fix in dhclient.conf on upgrades. I know that I have routinely hard set an MTU of 1500 on the boxes I administer. Is it possible that the hard setting of MTU is preventing dhclient.conf from picking up the change on upgrade? If so, it would explain the behavior I have seen.

#14 Updated by Bennett Feitell 2 months ago

My copy of /etc/inc/interfaces.inc matches, and contains the line.

#15 Updated by Jim Pingle 2 months ago

Looks like there may be two ways this could happen:

#1: If you used advanced DHCP client options that clobber the default values. This case would be fixed by the attached patch.

#2: A custom dhclient configuration file, which would need to be fixed manually. Injecting automatic edits into that seems wrong, though. I'll add a note to the upgrade guide about it. Someone might conceivably need this behavior and not letting them override it seems like a bad idea.

#16 Updated by Bennett Feitell 2 months ago

I think I see a typo in your diff. "tsupersede".

#17 Updated by Jim Pingle 2 months ago

It's \t for a tab, then supersede.

#18 Updated by Bennett Feitell 2 months ago

Got it, thank you.

By advanced DHCP client options, do you mean specifying an MTU for the interface? If that would break the patch, I will guess that is what has been happening. The specified MTU is ignored, and the bad value in the lease is enforced.

#19 Updated by Jim Pingle 2 months ago

No, I mean settings in the DHCP Client Configuration box on the WAN when Advanced Configuration is set, like Protocol Timings, Presets, or Lease Requirements and Requests.

#20 Updated by Bennett Feitell 2 months ago

So a setting to reject leases from the cable modem itself in DHCP Client Configuration might cause this too, as would any other setting in that box. The machine in question definitely had reject leases from 192.168.100.1 set in that section.

#21 Updated by Jim Pingle 2 months ago

No, that setting would not do it on its own, as it is handled in a different way.

It only happens when adv_dhcp_config_advanced is set, which is the Advanced Configuration checkbox, or when Configuration Override is set pointing to a custom file, which is a different case.
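
For checking a saved configuration against those two conditions, something like the following could be run over an interface section of config.xml. It is only a sketch: the function name is hypothetical, and it assumes that a checked box or a set override is stored as non-empty text in the corresponding tag, which should be verified against a real configuration.

```python
import xml.etree.ElementTree as ET

# Tags named in this ticket's interface configuration.
ADVANCED_TAG = "adv_dhcp_config_advanced"
OVERRIDE_TAG = "adv_dhcp_config_file_override"


def bypasses_default_dhcp_options(interface_xml):
    """True if the interface uses Advanced Configuration or Configuration Override.

    Assumption: an enabled setting appears as non-empty text in its tag;
    an empty or missing tag is treated as disabled.
    """
    node = ET.fromstring(interface_xml)

    def is_set(tag):
        return bool((node.findtext(tag) or "").strip())

    return is_set(ADVANCED_TAG) or is_set(OVERRIDE_TAG)


if __name__ == "__main__":
    # Trimmed from the interface configuration quoted in the description.
    sample = """
    <opt1>
        <descr><![CDATA[Cable]]></descr>
        <if>igb2</if>
        <ipaddr>dhcp</ipaddr>
        <dhcprejectfrom>192.168.100.1</dhcprejectfrom>
        <adv_dhcp_config_advanced></adv_dhcp_config_advanced>
        <adv_dhcp_config_file_override></adv_dhcp_config_file_override>
    </opt1>
    """
    print(bypasses_default_dhcp_options(sample))
```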

#22 Updated by Bennett Feitell 2 months ago

Is it possible that just toggling the visibility of the advanced settings box to on and then saving and applying would interfere with the fix?

#23 Updated by Jim Pingle 2 months ago

If you check the box and save, it will overwrite the default DHCP values with what is in the boxes for Protocol Timings, Presets, or Lease Requirements and Requests, even if they are empty. The other settings do not require checking that box. So if someone did check it, they would lose the automatic supersede without my last patch, even if they made no other changes.
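
The behavior described here can be modeled in a few lines. This is a simplified illustration of the logic as explained in this thread, not the actual code in interfaces.inc. The function name, the `patched` flag, the comma-separated modifier format, and the choice to leave a user-supplied interface-mtu statement alone are all assumptions made for the example.

```python
DEFAULT_MTU_SUPERSEDE = "supersede interface-mtu 0"


def option_modifiers(advanced_checked, user_modifiers="", patched=True):
    """Model which option modifier statements reach the generated config.

    advanced_checked: the Advanced Configuration checkbox state.
    user_modifiers:   comma-separated statements from the GUI (assumed format).
    patched:          whether the supersede is re-added when the box is checked.
    """
    if not advanced_checked:
        # Defaults apply, including the automatic MTU supersede.
        return [DEFAULT_MTU_SUPERSEDE]

    # Checking the box replaces the defaults with the form contents,
    # even when the form is empty.
    statements = [s.strip() for s in user_modifiers.split(",") if s.strip()]

    # With the fix, add the supersede back unless the user set their own.
    if patched and not any("interface-mtu" in s for s in statements):
        statements.append(DEFAULT_MTU_SUPERSEDE)
    return statements


if __name__ == "__main__":
    print(option_modifiers(True, "", patched=False))  # box checked, no fix
    print(option_modifiers(True, "", patched=True))   # box checked, with fix
```

In this model the unpatched case with the box checked and empty fields yields no statements at all, which matches the report that the automatic supersede is lost even when no other changes are made.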

#24 Updated by Bennett Feitell 2 months ago

I suspect that might be happening to people. Thank you for being so attentive on this.

#25 Updated by Jim Pingle 2 months ago

I added a note to the upgrade guide pointing back here and offering a workaround. Thanks for confirming the details!
