Project

General

Profile

Actions

Bug #7592

closed

SG-1000: Unbound not always restarting properly after changes in /etc/hosts

Added by Manuel Frischknecht almost 7 years ago. Updated over 6 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
DNS Resolver
Target version:
Start date:
05/21/2017
Due date:
% Done:

0%

Estimated time:
Plus Target Version:
Release Notes:
Affected Version:
2.4
Affected Architecture:
SG-1000

Description

I'm observing an issue on my SG-1000 (came preloaded with 2.4b, currently running v. 2.4.0.b.20170430.0444) which causes unbound restarts to fail after the dhcpleases system sends it a SIGHUP due to changes in /etc/hosts.
As you'll see, in those cases unbound usually takes about ~10 seconds to restart (which is not all that surprising given the SG-1000's limited processing power). Also, the system log entry in the first case seems to be logged ~2 seconds before the "old" unbound has properly shut down (according to its logs). That makes me wonder if it's just the limited CPU speed of the SG-1000 that makes this bug observable.
The kqueue error from the first logs puzzles me a bit (since its not properly explained) - I can't tell if it's a symtpom or part of the cause. The error will also appear after a failed restart (when unbound isn't running anymore).

Failing case (changes in /etc/hosts)

unbound log:

May 21 13:49:10    unbound    22926:0    info: start of service (unbound 1.6.1).
May 21 13:57:36    unbound    22926:0    info: service stopped (unbound 1.6.1).
May 21 13:57:36    unbound    22926:0    info: server stats for thread 0: 6 queries, 4 answers from cache, 2 recursions, 0 prefetch, 0 rejected by ip ratelimiting
May 21 13:57:36    unbound    22926:0    info: server stats for thread 0: requestlist max 0 avg 0 exceeded 0 jostled 0
May 21 13:57:36    unbound    22926:0    info: average recursion processing time 0.308737 sec
May 21 13:57:36    unbound    22926:0    info: histogram of recursion processing times
May 21 13:57:36    unbound    22926:0    info: [25%]=0 median[50%]=0 [75%]=0
May 21 13:57:36    unbound    22926:0    info: lower(secs) upper(secs) recursions
May 21 13:57:36    unbound    22926:0    info: 0.065536 0.131072 1
May 21 13:57:36    unbound    22926:0    info: 0.262144 0.524288 1
May 21 13:57:36    unbound    22926:0    notice: Restart of unbound 1.6.1.
May 21 13:57:54    unbound    22926:0    notice: init module 0: validator
May 21 13:57:54    unbound    22926:0    notice: init module 1: iterator
May 21 13:57:55    unbound    22926:0    info: start of service (unbound 1.6.1).
May 21 13:57:55    unbound    22926:0    info: service stopped (unbound 1.6.1).
May 21 13:57:55    unbound    22926:0    info: server stats for thread 0: 0 queries, 0 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
May 21 13:57:55    unbound    22926:0    info: server stats for thread 0: requestlist max 0 avg 0 exceeded 0 jostled 0

dhcp log:

May 21 13:57:36    dhcpleases        /etc/hosts changed size from original!
May 21 13:57:36    dhcpleases        Sending HUP signal to dns daemon(22926)
May 21 13:57:48    dhcpleases        /etc/hosts changed size from original!
May 21 13:57:49    dhcpleases        Sending HUP signal to dns daemon(22926)
May 21 13:57:53    dhcpleases        kqueue error: unkown
May 21 13:57:53    dhcpleases        Sending HUP signal to dns daemon(22926)
May 21 13:57:55    dhcpd        Internet Systems Consortium DHCP Server 4.3.5
May 21 13:57:55    dhcpd        Copyright 2004-2016 Internet Systems Consortium.
May 21 13:57:55    dhcpd        All rights reserved.
May 21 13:57:55    dhcpd        For info, please visit https://www.isc.org/software/dhcp/
May 21 13:57:55    dhcpd        Config file: /etc/dhcpd.conf
May 21 13:57:55    dhcpd        Database file: /var/db/dhcpd.leases
May 21 13:57:55    dhcpd        PID file: /var/run/dhcpd.pid
May 21 13:57:55    dhcpd        Internet Systems Consortium DHCP Server 4.3.5
May 21 13:57:55    dhcpd        Copyright 2004-2016 Internet Systems Consortium.
May 21 13:57:55    dhcpd        All rights reserved.
May 21 13:57:55    dhcpd        For info, please visit https://www.isc.org/software/dhcp/
May 21 13:57:55    dhcpd        Wrote 0 deleted host decls to leases file.
May 21 13:57:55    dhcpd        Wrote 0 new dynamic host decls to leases file.
May 21 13:57:55    dhcpd        Wrote 14 leases to leases file.
May 21 13:57:55    dhcpd        Listening on BPF/cpsw1/[MAC address]/[IP address]/[subnet mask]
May 21 13:57:55    dhcpd        Sending on BPF/cpsw1/[MAC address]/[IP address]/[subnet mask]
May 21 13:57:55    dhcpd        Sending on Socket/fallback/fallback-net
May 21 13:57:55    dhcpd        Server starting service.
May 21 13:57:55    dhcpleases        Sending HUP signal to dns daemon(22926)
May 21 13:57:55    dhcpleases        Sending HUP signal to dns daemon(22926)

System > General log:

May 21 13:57:53    php-fpm    62381    /rc.linkup: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1495367873] unbound[85962:0] error: bind: address already in use [1495367873] unbound[85962:0] fatal error: could not open ports'

Successfull case (changed DHCP leases)

In contrast to the /etc/hosts case, restarts due to changed dhcp leases seem to work fine:

unbound log:

May 21 13:38:30    unbound    22926:0    info: start of service (unbound 1.6.1).
May 21 13:48:58    unbound    22926:0    info: service stopped (unbound 1.6.1).
May 21 13:48:58    unbound    22926:0    info: server stats for thread 0: 8 queries, 6 answers from cache, 2 recursions, 0 prefetch, 0 rejected by ip ratelimiting
May 21 13:48:58    unbound    22926:0    info: server stats for thread 0: requestlist max 0 avg 0 exceeded 0 jostled 0
May 21 13:48:58    unbound    22926:0    info: average recursion processing time 0.409663 sec
May 21 13:48:58    unbound    22926:0    info: histogram of recursion processing times
May 21 13:48:58    unbound    22926:0    info: [25%]=0 median[50%]=0 [75%]=0
May 21 13:48:58    unbound    22926:0    info: lower(secs) upper(secs) recursions
May 21 13:48:58    unbound    22926:0    info: 0.262144 0.524288 2
May 21 13:48:58    unbound    22926:0    notice: Restart of unbound 1.6.1.
May 21 13:49:09    unbound    22926:0    notice: init module 0: validator
May 21 13:49:09    unbound    22926:0    notice: init module 1: iterator
May 21 13:49:10    unbound    22926:0    info: start of service (unbound 1.6.1).

dhcp log:

May 21 13:48:58    dhcpd        DHCPREQUEST for 192.168.1.104 from [MAC address] ([hostname]) via cpsw1
May 21 13:48:58    dhcpd        DHCPACK on 192.168.1.104 to [MAC address] ([hostname]) via cpsw1
May 21 13:48:58    dhcpleases        Sending HUP signal to dns daemon(22926)

Changes in /etc/hosts after failed restart

dhcp log:

May 21 15:30:51    dhcpleases        /etc/hosts changed size from original!
May 21 15:30:51    dhcpleases        Sending HUP signal to dns daemon(22926)
May 21 15:30:51    dhcpleases        Could not deliver signal HUP to process because its pidfile (/var/run/unbound.pid) does not exist, No such process.
May 21 15:30:53    dhcpleases        kqueue error: unkown
May 21 15:30:53    dhcpleases        Sending HUP signal to dns daemon(79438)
Actions #1

Updated by Manuel Frischknecht almost 7 years ago

I just noticed I overlooked a part after rewording the issue:

"Those cases" on line 2 refers to the successfull ones due to normal DHCP lease changes (case number 2). Sadly, I'm not able to edit the issue to correct that mistake. Sorry.

Actions #2

Updated by Jim Pingle over 6 years ago

  • Status changed from New to Feedback

Can you try this again? There have been several changes to unbound service handling and other areas over the last few months that may have also addressed this problem.

Actions #3

Updated by Jim Thompson over 6 years ago

  • Assignee set to Jim Pingle
Actions #4

Updated by rub man over 6 years ago

Sep 2 02:01:45 dhcpleases kqueue error: unkown
Sep 2 02:01:45 dhcpleases Could not deliver signal HUP to process because its pidfile (/var/run/unbound.pid) does not exist, No such process.
Sep 2 02:01:45 dhcpleases /var/etc/hosts changed size from original!

I get the same errors except for the unbound.pid file not existing when I restart the router or make a change or restart the dhcp server.
edit: sorry not in 2.4.0 but in latest stable version.

Actions #5

Updated by Constantine Kormashev over 6 years ago

I found there are error messages in logs, but I can see unbound works fine on latest FW on sg1000

Error messages appear when DHCP settings are changed:
In case I just added static entry without IP only MAC and hostname, in this case hostname was not added to /etc/hosts

Sep 29 14:11:02 dhcpleases kqueue error: unkown
Sep 29 14:11:01 dhcpleases Could not deliver signal HUP to process because its pidfile (/var/run/unbound.pid) does not exist, No such process.
Sep 29 14:11:01 dhcpleases /etc/hosts changed size from original!
Sep 29 14:10:56 php-fpm 307 /services_dhcp_edit.php: The command '/usr/sbin/arp -d ''' returned exit code '1', the output was 'arp: : Unknown server error'
Sep 29 14:10:56 check_reload_status Syncing firewall

In case I added added static entry with IP, MAC and hostname and hostname was added to /etc/hosts
Sep 29 14:15:09 dhcpleases kqueue error: unkown
Sep 29 14:15:08 dhcpleases Could not deliver signal HUP to process because its pidfile (/var/run/unbound.pid) does not exist, No such process.
Sep 29 14:15:08 dhcpleases /etc/hosts changed size from original!
Sep 29 14:15:02 php-fpm 305 /services_dhcp_edit.php: The command '/usr/sbin/arp -d '192.168.1.99'' returned exit code '1', the output was 'arp: writing to routing socket: No such file or directory'
Sep 29 14:15:02 check_reload_status Syncing firewall

Lets looked at ps output
ps aux | grep unbound
unbound 51925 0.0 2.7 19896 13608 - Is 14:15 0:00.99 /usr/local/sbin/u

Actions #6

Updated by Anonymous over 6 years ago

  • Status changed from Feedback to Resolved
Actions

Also available in: Atom PDF