Bug #13671

closed

DHCP client can fail permanently if an interface is down at boot

Added by Steve Wheeler about 2 years ago. Updated over 1 year ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Interfaces
Target version:
Start date:
Due date:
% Done: 100%
Estimated time:
Plus Target Version: 23.05
Release Notes: Default
Affected Version: 2.6.0
Affected Architecture: All

Description

If the interface has no link when dhclient is launched for the WAN at boot, dhclient fails and exits:

pfSense php-cgi: rc.bootup: The command '/sbin/dhclient -c /var/etc/dhclient_wan.conf igb0 > /tmp/igb0_output 2> /tmp/igb0_error_output' returned exit code '1', the output was ''

The output there being:

igb0: no link .............. giving up

If the WAN link then comes up before bootup has completed, it does not trigger dhclient to re-run, leaving the WAN permanently down after boot.

This typically happens when the WAN is connected to a modem and a power outage causes both devices to boot at the same time.

See:
https://forum.netgate.com/topic/171450/dhclient-exiting-on-wan
https://forum.netgate.com/topic/175937/bug-gateways-power-loss
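
As a rough way to check whether a system hit the failure described above, the sketch below looks for the "giving up" message in the stderr file that the rc.bootup command line writes. The helper name dhclient_gave_up is hypothetical, and the default path /tmp/<interface>_error_output is an assumption generalized from the igb0 example in the log line; it is not a pfSense tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of pfSense): returns 0 if the saved
# dhclient stderr for an interface contains the "giving up" message.
# The default path is an assumption based on the igb0 log line above.
dhclient_gave_up() {
    iface="$1"
    errfile="${2:-/tmp/${iface}_error_output}"
    [ -f "$errfile" ] && grep -q 'giving up' "$errfile"
}

# Example use on the firewall itself:
#   dhclient_gave_up igb0 && echo "dhclient gave up on igb0 at boot"
```

This only reports what dhclient wrote at boot; it does not confirm whether a dhclient process is currently running for the interface.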


Related issues

Related to Bug #9484: With proper timing on boot dhclient won't be started for WAN without manual intervention (Closed, 04/25/2019)

Actions #1

Updated by Steve Wheeler about 2 years ago

  • Related to Bug #9484: With proper timing on boot dhclient won't be started for WAN without manual intervention added
Actions #2

Updated by Steve Wheeler about 2 years ago

A workaround for this issue is to delay pfSense booting to allow an upstream device time to bring up the link.
This can be done by creating the file /boot/loader.conf.local and adding to it the line:

autoboot_delay="30" 

30s is usually more than required and can be set lower.
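
The workaround above can be applied from a shell, for example with a small helper like the one sketched below. The function name set_autoboot_delay is hypothetical; it replaces any existing autoboot_delay line in the file instead of appending a duplicate, and it takes the file path as an optional second argument so it can be tried on a scratch file first.

```shell
#!/bin/sh
# Hypothetical helper (not part of pfSense): set autoboot_delay in a
# loader.conf.local-style file, replacing any earlier setting.
set_autoboot_delay() {
    delay="$1"
    file="${2:-/boot/loader.conf.local}"
    tmp="${file}.tmp.$$"
    if [ -f "$file" ]; then
        # Keep every existing line except a previous autoboot_delay entry.
        grep -v '^autoboot_delay=' "$file" > "$tmp" || true
    else
        : > "$tmp"
    fi
    printf 'autoboot_delay="%s"\n' "$delay" >> "$tmp"
    mv "$tmp" "$file"
}

# Example use (as root on the firewall):
#   set_autoboot_delay 30
```

The new value only takes effect on the next boot.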

Actions #3

Updated by Jim Pingle about 2 years ago

/etc/rc.linkup explicitly exits if it detects the platform is booting. We might be able to insert a test there to ensure dhclient is running.

Actions #4

Updated by Jim Pingle about 2 years ago

  • Status changed from New to Feedback
  • Assignee set to Jim Pingle

Try this change, for example:

diff --git a/src/etc/rc.linkup b/src/etc/rc.linkup
index 1eb0f0e342..7c81e49a83 100755
--- a/src/etc/rc.linkup
+++ b/src/etc/rc.linkup
@@ -29,10 +29,6 @@ require_once("filter.inc");
 require_once("shaper.inc");
 require_once("interfaces.inc");

-if (platform_booting()) {
-       return;
-}
-
 function handle_argument_group($iface, $action) {
        global $g, $config;

@@ -162,6 +158,21 @@ if (!in_array($action, ['start', 'stop'])) {
                return;
 }

+if (platform_booting()) {
+       if (!empty($realiface) &&
+           (config_get_path("interfaces/{$realiface}/ipaddr", '') == 'dhcp')) {
+               if (find_dhclient_process($realiface) == 0) {
+                       /* dhclient is not running */
+                       $interface = convert_real_interface_to_friendly_interface_name($realiface);
+                       log_error("DHCP Client not running on {$interface} ($realiface) during boot, reconfiguring dhclient.");
+                       interface_dhcp_configure($interface);
+               }
+       } else {
+               log_error("Ignoring link event during boot sequence.");
+       }
+       return;
+}
+
 if (!empty($realiface)) {
        if (substr($realiface, 0, 4) == 'ovpn') {
                log_error("Ignoring link event for OpenVPN interface");

Untested, but sound in theory.

If that works I can check it in, but I'm hesitant to make that change without testing it first and I can't seem to trigger it here.

Actions #6

Updated by Jim Pingle about 2 years ago

  • Subject changed from dhclient can fail permanently if WAN is down at boot. to DHCP client can fail permanently if an interface is down at boot
  • Status changed from Feedback to Ready To Test

Updating subject for release notes.

Actions #7

Updated by Jim Pingle almost 2 years ago

  • Plus Target Version changed from 23.01 to 23.05

Moving to the next release so we have more time to reproduce and test.

Actions #8

Updated by Jim Pingle almost 2 years ago

  • Status changed from Ready To Test to In Progress

I was able to reproduce this in a VM finally. The key is to boot with the interface detected and then reconnect it just after the boot log shows the WAN configuration is finished.

The patch above doesn't fix it, but I have a commit coming that does.

Actions #9

Updated by Jim Pingle almost 2 years ago

  • Status changed from In Progress to Feedback
  • % Done changed from 0 to 100
Actions #10

Updated by Azamat Khakimyanov over 1 year ago

  • Status changed from Feedback to Resolved

Tested on 23.01

I was able to reproduce this bug under KVM by turning the WAN (DHCP) interface off at a certain moment during the boot process and then turning it on again about 30-40 seconds later, before the boot process had finished. The WAN got no IP address via DHCP, I saw no DHCP requests leaving the WAN, and the WAN port was stuck in the 'no IP' state.
After applying the patch JimP mentioned in his last comment, the WAN obtained an IP address in every case.

I'll mark this Bug as Resolved.
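
For anyone repeating the reproduction described above on a libvirt-managed KVM guest, the link toggle might be scripted roughly as follows. This is only a sketch: the comment does not say which tooling was used, so the use of virsh is an assumption, flap_link is a hypothetical helper name, and the domain and interface names in the example are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: take a guest NIC link down, wait, bring it back up.
# Assumes a libvirt-managed guest; virsh domif-setlink changes the link
# state of the named guest interface.
flap_link() {
    domain="$1"            # libvirt domain name (placeholder)
    nic="$2"               # guest interface device or MAC address
    down_secs="${3:-30}"   # how long to keep the link down
    virsh domif-setlink "$domain" "$nic" down || return 1
    sleep "$down_secs"
    virsh domif-setlink "$domain" "$nic" up
}

# Example use, started by hand while the guest is still booting:
#   flap_link pfsense-test vnet0 35
```

The timing still has to be chosen by hand so that the link returns before the boot process finishes.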
