dhclient does not handle protocol timeouts or script failures correctly
pfSense-dhclient-script fails to return a nonzero exit status when a DHCP timeout occurs and the cached gateway address is not pingable. In that case the cached IP is removed from the interface, but the exit status of 0 tells dhclient that the IP was added successfully. As a result, the affected interface remains without an IPv4 address until the DHCP lease expires, the link flaps, or the lease is renewed manually. The expected behavior is for the DHCP protocol to be restarted after the defined retry interval.
After addressing this with the patch 'pfsense-dhclient-script-patch.txt', I uncovered another apparent issue: the function 'priv_script_go' in dhclient.c does not correctly extract the child process exit code from the status value reported by the wait() call.
I confirmed this by checking the current implementation in dhclient.c, where the expression
(wstatus & 0xff)
is not functionally equivalent to the definition of the WEXITSTATUS macro, which is defined as
((x) >> 8)
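To illustrate the difference, here is a minimal sketch (not the code from dhclient.c or from either patch). The helper names raw_wait_status, exit_code_masked and exit_code_decoded are hypothetical and exist only for this example. It assumes the common wait status layout used on FreeBSD and Linux, where the exit code sits in the second byte and the low byte carries signal information.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits immediately with `code` and return the raw
 * status word filled in by waitpid(), or -1 if fork/waitpid fails.
 */
static int
raw_wait_status(int code)
{
	int wstatus = 0;
	pid_t pid = fork();

	if (pid == -1)
		return (-1);
	if (pid == 0)
		_exit(code);	/* child: exit without running atexit handlers */
	if (waitpid(pid, &wstatus, 0) == -1)
		return (-1);
	return (wstatus);
}

/*
 * The masking expression quoted above: it keeps only the low byte,
 * which does not contain the exit code of a normally exited child.
 */
static int
exit_code_masked(int wstatus)
{
	return (wstatus & 0xff);
}

/*
 * Decoding with the standard macros: check that the child exited
 * normally, then extract its exit code. Returns -1 otherwise.
 */
static int
exit_code_decoded(int wstatus)
{
	if (WIFEXITED(wstatus))
		return (WEXITSTATUS(wstatus));
	return (-1);
}
```

Under the assumption above, a child that exits with status 1 produces a status word for which exit_code_masked() yields 0 and exit_code_decoded() yields 1, which matches the symptom of a failing script being treated as a success.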
To address this, I have applied patch 'dhclient-patch.txt' to the FreeBSD 11.2 source tree, rebuilt dhclient, and installed the new binary on the pfSense appliance.
With both patches applied, dhclient and the associated script now behave as expected when a protocol timeout occurs and the cached gateway IP is not pingable: a timeout is indicated in the relevant logs and the DHCP protocol is restarted after the defined retry interval.
#2 Updated by Steve Wheeler about 2 months ago
This appears to have been committed in 12 stable:
But it did not make the 12.0 release.