Bug #3941
closed
adding a DHCP client interface results in missing default gateway on 2.2
100%
Description
Take a simple WAN and LAN setup, WAN on DHCP with its dynamic gateway marked as default, LAN static. Add a third NIC as OPT1, configured as a DHCP client. Save and apply changes on OPT1. Your default gateway is now gone.
Not just the first time either: any time you save and apply changes on OPT1 in that scenario, the default gateway is gone. On boot it's fine, and any other gateway-related operation seems to handle that just fine.
Updated by Chris Buechler almost 10 years ago
The subject doesn't quite cover all the breakage this causes; there are various times when the default gateway is removed in the described circumstance.
Updated by Chris Buechler almost 10 years ago
- Assignee changed from Renato Botelho to Chris Buechler
I'll take this one
Updated by Chris Buechler almost 10 years ago
The most I've found thus far is that it still happens after removing all the "route delete default" commands from dhclient-script. I'm not seeing an obvious place where it's happening. It's easy to replicate and re-replicate: set the interface back to "none", save, apply. Set it back to DHCP, save, apply. Default gateway gone.
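When repeating the replicate steps above, it helps to have a quick pass/fail check for the default route after each apply. This is a minimal sketch, not something from the ticket: the helper name has_default_route is made up for illustration, and it assumes the routing table is printed in the usual FreeBSD "netstat -rn" layout, where the default route appears with "default" in the first column.

```shell
#!/bin/sh
# Hypothetical helper (not part of pfSense or dhclient-script).
# Reads a routing table on stdin and returns 0 only if an IPv4 default
# route is listed, i.e. a row whose first column is "default" (or the
# equivalent "0.0.0.0/0" form some tools print).
has_default_route() {
    awk '$1 == "default" || $1 == "0.0.0.0/0" { found = 1 } END { exit !found }'
}

# Intended use on the firewall after each save/apply on OPT1:
#   netstat -rn -f inet | has_default_route || echo "default route is gone"
```

Running the commented pipeline after each "save, apply" cycle gives a yes/no answer without reading the whole table.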
Updated by Ermal Luçi almost 10 years ago
- Status changed from Confirmed to Feedback
Updated by Ermal Luçi almost 10 years ago
- % Done changed from 0 to 100
Applied in changeset d35dfaaecb5eabedade43738ba4f76967a7425a3.
Updated by Ermal Luçi almost 10 years ago
Applied in changeset 935fcedbca2dbe8c3d9eb41bc5739b511a9ec19a.
Updated by Chris Buechler almost 10 years ago
- Status changed from Feedback to Confirmed
- Assignee changed from Chris Buechler to Ermal Luçi
that didn't fix the issue described here
Updated by Chris Buechler almost 10 years ago
The earlier fix in rc.linkup didn't have any effect here. Dug through this more tonight. The best I can definitively say right now, beyond the above, is that none of the 'route' commands in system.inc are to blame.
Updated by Renato Botelho almost 10 years ago
- Assignee changed from Ermal Luçi to Renato Botelho
I'll take it.
Updated by Chris Buechler almost 10 years ago
- Assignee changed from Renato Botelho to Chris Buechler
- % Done changed from 100 to 50
getting close to finding this, back to me as I'm working on it now.
Updated by Chris Buechler almost 10 years ago
- Assignee changed from Chris Buechler to Ermal Luçi
found the exact spot where the issue happens. /sbin/dhclient-script, line 325.
$IFCONFIG $interface inet 0.0.0.0 netmask 0.0.0.0 broadcast 255.255.255.255 up
When you ifconfig an interface with 0.0.0.0/0.0.0.0, FreeBSD overrides the default route with that interface. Then later in dhclient-script, it blows away the 0.0.0.0 IP and leaves the system with no default route.
The interface that's assigned with 0.0.0.0/0.0.0.0 seems to be how dhclient finds its interface though. Take out the above line, or just change it to:
$IFCONFIG $interface up
and dhclient spits out:
dhclient: PREINIT em2: not found exiting.
So I'm stuck there in a catch-22. If you put that 0.0.0.0 on the interface, it blows up the routing table. If you don't, dhclient can't function.
Over to Ermal.
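The collision described in this comment can be seen by working out which connected network an address and netmask imply. The sketch below is illustrative only: the function names are made up and it assumes a contiguous dotted-quad netmask. With address 0.0.0.0 and netmask 0.0.0.0 the connected network comes out as 0.0.0.0/0, which is the same destination as the default route.

```shell
#!/bin/sh
# Illustrative helpers, not taken from dhclient-script.

# mask_to_prefixlen 255.255.255.0 prints 24 (assumes a contiguous netmask).
mask_to_prefixlen() {
    bits=0
    old_ifs=$IFS; IFS=.
    set -- $1            # split the mask into its four octets
    IFS=$old_ifs
    for octet in "$@"; do
        case $octet in
            255) bits=$((bits + 8)) ;;
            254) bits=$((bits + 7)) ;;
            252) bits=$((bits + 6)) ;;
            248) bits=$((bits + 5)) ;;
            240) bits=$((bits + 4)) ;;
            224) bits=$((bits + 3)) ;;
            192) bits=$((bits + 2)) ;;
            128) bits=$((bits + 1)) ;;
            0) ;;
            *) echo "invalid netmask octet: $octet" >&2; return 1 ;;
        esac
    done
    echo "$bits"
}

# connected_route ADDR MASK prints the network/prefixlen the pair implies.
connected_route() {
    addr=$1
    mask=$2
    old_ifs=$IFS; IFS=.
    set -- $addr $mask   # $1..$4 = address octets, $5..$8 = mask octets
    IFS=$old_ifs
    net="$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
    echo "$net/$(mask_to_prefixlen "$mask")"
}

connected_route 192.168.1.10 255.255.255.0   # prints 192.168.1.0/24
connected_route 0.0.0.0 0.0.0.0              # prints 0.0.0.0/0, i.e. "default"
```

Any netmask longer than /0 keeps the placeholder address from covering the whole address space.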
Updated by Phillip Davis almost 10 years ago
Just a thought - perhaps the interface can be set to all/part of the link-local address space 169.254.0.0/255.255.255.0 for example. DHCP client should be able to do its broadcast DHCP request and get the broadcast response (it should not be blocked by the link-local address blocks that were added recently), and when DHCP client succeeds the interface will get its proper address.
Nobody will be using link-local address space for real on their router.
If that works on single WAN, would also need to test it running in parallel on multiple WANs to see if having multiple WANs temporarily with the same link-local subnet specified causes any issues.
Updated by Chris Buechler almost 10 years ago
- Status changed from Confirmed to Feedback
- Assignee changed from Ermal Luçi to Chris Buechler
Thanks for the comment Phil, that thought process brought to mind an idea. Using a /32 mask instead of 0.0.0.0 fixes the original issue here, and seems to work in all circumstances. I just committed that change after it tested out fine on my test systems. Leaving for feedback, hopefully that'll resolve it without breaking anything.
I suspect that if the request isn't sourced from 0.0.0.0, at least some DHCP servers won't answer. There may also be bridge devices that will only pass DHCP requests sourced from 0.0.0.0 or a known IP (for renewals). So a source of anything other than 0.0.0.0 at that stage is potentially problematic.
Updated by Chris Buechler almost 10 years ago
dhclient-script in 2.1.x used the same 0.0.0.0/0.0.0.0, so that's a change in behavior between FreeBSD 8.3 and 10.1. Checking the dhclient-script in stock FreeBSD 10.1, they use a /8 mask there. /32 seems to work fine as well, but I changed it to /8 to stay consistent with FreeBSD. Both /8 and /32 appear to work fine.
Also reviewed a diff between our dhclient-script and stock FreeBSD's to see if there were other changes in behavior from 8.3 to 10.1. Didn't see anything relevant there.
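For reference, here is a dry-run sketch of how the line quoted earlier from /sbin/dhclient-script would read with the /8 and /32 masks discussed above. The exact committed line is not quoted in this ticket, so treat this as an assumption built from the original line with only the netmask swapped. IFCONFIG is pointed at "echo" so nothing is actually configured, em2 is the interface name from the dhclient error message above, and preinit_address is a made-up wrapper.

```shell
#!/bin/sh
# Dry run only: "echo ifconfig" prints the command instead of executing it.
IFCONFIG="echo ifconfig"
interface="em2"

# Hypothetical wrapper around the quoted line; $1 is the netmask to pair
# with the 0.0.0.0 placeholder address.
preinit_address() {
    $IFCONFIG "$interface" inet 0.0.0.0 netmask "$1" broadcast 255.255.255.255 up
}

preinit_address 0.0.0.0           # original form that clobbers the default route
preinit_address 255.0.0.0         # /8, the mask described for stock FreeBSD 10.1
preinit_address 255.255.255.255   # /32, the first fix tried in this ticket
```

The source address stays 0.0.0.0 in all three cases, which matters given the concern above about DHCP servers or bridges that only accept requests sourced from 0.0.0.0.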
Updated by Chris Buechler almost 10 years ago
- Status changed from Feedback to Resolved
works in every scenario I can find