Bug #10824

BIND shutdown - dynamic zones, unclean shutdown causes startup to not load zones

Added by Dave Tickem about 1 month ago. Updated 27 days ago.

Status: Resolved
Priority: Normal
Category: BIND
Target version: -
Start date: 08/09/2020
Due date:
% Done: 100%
Estimated time:
Affected Version:
Affected Architecture: All

Description

BIND is killed quite TERMinally in the /usr/local/etc/rc.d/named.sh script, with a SIGTERM, which stops the server uncleanly.

The bug is that journal files holding dynamic zone updates (e.g. from a DHCP server updating named zones) are then left out of sync with their zone files, and those zones fail to load on the next start with errors such as:

journal rollforward failed: journal out of sync with zone

This could be fixed by running "rndc sync -clean" before shutdown, or by some other more graceful way of stopping named.
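A rough sketch of that idea, for illustration only and not the script shipped with the package. The function name graceful_stop and the overridable RNDC variable are made up here; the default rndc path is assumed from the location of the rc.d script.

```shell
#!/bin/sh
# Sketch only. RNDC can be overridden, so the function can be tried
# without a live server. The default path is an assumption.
RNDC="${RNDC:-/usr/local/sbin/rndc}"

# graceful_stop (hypothetical name): flush journals, then let named exit itself.
graceful_stop() {
    # Write pending dynamic updates into the zone files and remove the journals.
    "$RNDC" sync -clean || return 1
    # Ask named to save any remaining changes and shut down.
    "$RNDC" stop
}
```

If the sync step fails (for example because named is not running), the function returns early and the caller can decide whether to fall back to a signal.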

History

#1 Updated by Dave Tickem about 1 month ago

Sorry, very poor bug report. Affected versions are pfSense 2.4.5p1 and BIND 9.14_7.

#3 Updated by Dave Tickem about 1 month ago

Viktor Gurov wrote:

Fix:
https://github.com/pfsense/FreeBSD-ports/pull/917

    $rc['stop'] = <<<EOD
        {$BIND_LOCALBASE}/sbin/rndc -q -c "{$BIND_LOCALBASE}/etc/rndc.conf" sync -clean 2>/dev/null
        /usr/bin/killall -TERM named 2>/dev/null
        sleep 2

The proposed fix is a good starting point. However, rndc operations are asynchronous, so a sleep of about 2 seconds is needed between the "rndc sync" and the following "killall" command. Otherwise the BIND instance can be killed before the sync has a chance to complete.
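As an alternative to a fixed sleep, the script could poll until the process has actually gone and only fall back to a signal after a timeout. This is a minimal sketch of that approach, not what the PR does. The helper name wait_for_exit and the 10-second default are assumptions.

```shell
#!/bin/sh
# wait_for_exit PID [TIMEOUT_SECONDS]   (hypothetical helper)
# Polls once per second until the process is gone.
# Returns 0 if it exited, 1 if the timeout expired first.
wait_for_exit() {
    pid=$1
    timeout=${2:-10}
    waited=0
    # kill -0 sends no signal; it only checks that the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Assuming a single named process, it might be used as wait_for_exit "$(pgrep -x named)" 10 || killall -TERM named, so the wait is only as long as named needs.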

#4 Updated by Jim Pingle about 1 month ago

  • Status changed from New to Pull Request Review

#5 Updated by Dave Tickem about 1 month ago

Viktor Gurov wrote:

Fix:
https://github.com/pfsense/FreeBSD-ports/pull/917

This does not fix the issue. I have now tested with a smallish journal (~50 entries): bind9 needs time to complete the sync before it is hard-terminated with SIGTERM.

Suggest:

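rc_stop() {
        # Write pending dynamic updates to the zone files and remove the journals
        /usr/local/sbin/rndc sync -clean
        # Ask named to shut down cleanly
        /usr/local/sbin/rndc stop
        # Give named time to finish before falling back to SIGTERM
        /bin/sleep 5
        /usr/bin/killall -TERM named 2>/dev/null
}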

This will slow down system shutdown and service restart. However, in nearly all cases that is preferable to having bind9 appear to start properly while not loading all zones, and only complaining in the log files about some zones (a subset) with "zone ignored because journal not in sync...". This is not an easy fault to troubleshoot: BIND appears to be running, but is only partially configured.
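To make that fault easier to spot, a small check like the following could scan a named log for the journal errors quoted in the description. It is a sketch only: the function name journal_errors is made up, and the log file location varies by setup, so it is passed as an argument.

```shell
#!/bin/sh
# journal_errors LOGFILE   (hypothetical helper)
# Prints log lines that report a journal/zone mismatch.
# Returns 0 if at least one such line was found, non-zero otherwise.
journal_errors() {
    grep -E 'journal rollforward failed|journal out of sync with zone' "$1"
}
```

A non-empty result after a restart would suggest that some zones were skipped even though named itself is running.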

#6 Updated by Viktor Gurov about 1 month ago

Thanks, updated

#7 Updated by Renato Botelho about 1 month ago

  • Status changed from Pull Request Review to Feedback
  • Assignee set to Renato Botelho
  • % Done changed from 0 to 100

PR has been merged. Thanks!

#8 Updated by Dave Tickem 27 days ago

Renato Botelho wrote:

PR has been merged. Thanks!

Updated to BIND 9.14_8, which includes this fix. The rc.d script now cleanly stops bind/named.

Great fix - thanks!

#9 Updated by Jim Pingle 27 days ago

  • Status changed from Feedback to Resolved
