Project

General

Profile

Actions

Bug #11720

closed

Unbound crashing and can't stay up

Added by Xemanth - about 3 years ago. Updated about 3 years ago.

Status:
Rejected
Priority:
Normal
Assignee:
-
Category:
DNS Resolver
Target version:
-
Start date:
03/23/2021
Due date:
% Done:

0%

Estimated time:
Plus Target Version:
Release Notes:
Default
Affected Version:
2.5.0
Affected Architecture:
All

Description

Hi,

Last Sunday when I changed the verbosity level in OpenVPN configuration, Unbound started to crash like... a lot... and can't stay running.

I instantly noticed that I can't reconnect with OpenVPN as I get error in the client logs:
TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)
TLS Error: TLS handshake failed

To get OpenVPN working I unticked in the Unbound configuration:
"OpenVPN Clients: [] Register connected OpenVPN clients in the DNS Resolver".

I needed to change the DHCP configuration to bypass Unbound with server 1.1.1.1 to get DNS connectivity. By using Unbound DNS service users get some error about bad DNS config and no DNS resolving.

Any ideas what is causing this and how to resolve it?
Is this a bug in Unbound code?

System logs are filled with this. The Unbound log has not collected anything since Sunday and seems to be down all the time.

Mar 20 13:49:31 php-fpm 78828 /rc.newwanip: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1616240971] unbound[20604:0] error: bind: address already in use [1616240971] unbound[20604:0] fatal error: could not open ports'

Mar 20 15:00:03 php-fpm 78828 /services_dhcp.php: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1616245203] unbound[98416:0] error: bind: address already in use [1616245203] unbound[98416:0] fatal error: could not open ports'

Mar 21 20:46:26 php-fpm 348 /status_services.php: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1616352386] unbound[98903:0] error: bind: address already in use [1616352386] unbound[98903:0] fatal error: could not open ports'

Mar 21 21:03:06 php-fpm 348 /services_unbound.php: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1616353386] unbound[18941:0] error: bind: address already in use [1616353386] unbound[18941:0] fatal error: could not open ports'

Mar 21 21:32:54 php-fpm 78828 /interfaces.php: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1616355174] unbound[11674:0] error: bind: address already in use [1616355174] unbound[11674:0] fatal error: could not open ports'

Mar 21 21:42:53 php 56829 pfSsh.php: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1616355773] unbound[38718:0] error: bind: address already in use [1616355773] unbound[38718:0] fatal error: could not open ports'

In OpenVPN logs I get this:
Mar 21 21:57:53 check_reload_status 390 Starting packages
Mar 21 21:57:54 php-fpm 78828 /rc.start_packages: Restarting/Starting all packages.
Mar 21 21:59:07 check_reload_status 390 Syncing firewall
Mar 21 21:59:41 php-fpm 30394 /services_unbound.php: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1616356781] unbound[33893:0] error: bind: address already in use [1616356781] unbound[33893:0] fatal error: could not open ports'
Mar 21 22:01:24 php-fpm 30394 OpenVPN terminate old pid: 20907
Mar 21 22:01:27 php-fpm 30394 /status_services.php: OpenVPN ID server1 PID 20907 still running, killing.
Mar 21 22:01:27 kernel ovpns1: link state changed to DOWN
Mar 21 22:01:28 kernel ovpns1: link state changed to UP
Mar 21 22:01:28 php-fpm 30394 OpenVPN PID written: 84125
Mar 21 22:01:28 check_reload_status 390 Reloading filter
Mar 21 22:01:28 check_reload_status 390 rc.newwanip starting ovpns1
Mar 21 22:01:29 php-fpm 80155 /rc.newwanip: rc.newwanip: Info: starting on ovpns1.
Mar 21 22:01:29 php-fpm 80155 /rc.newwanip: rc.newwanip: on (IP address: 172.20.200.1) (interface: []) (real interface: ovpns1).
Mar 21 22:01:29 php-fpm 80155 /rc.newwanip: rc.newwanip called with empty interface.
Mar 21 22:01:29 php-fpm 80155 /rc.newwanip: Netgate pfSense Plus package system has detected an IP change or dynamic WAN reconnection - -> 172.20.200.1 - Restarting packages.
Mar 21 22:01:29 check_reload_status 390 Reloading filter

Actions #1

Updated by BBcan177 . about 3 years ago

If you set the Resolver Inbound/Outbound Interfaces to "All", does it still cause these errors?

I saw this FreeBSD Bug report:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=250984

So it could be related to IPv6 and the requirement to append "::" to each IPv6 address in /etc/inc/unbound.inc (Lines 223-259)

Actions #2

Updated by BBcan177 . about 3 years ago

Also seeing this in your logs:

Mar 21 22:01:29 php-fpm 80155 /rc.newwanip: rc.newwanip: on (IP address: 172.20.200.1) (interface: []) (real interface: ovpns1).
Mar 21 22:01:29 php-fpm 80155 /rc.newwanip: rc.newwanip called with empty interface.
Actions #3

Updated by Xemanth - about 3 years ago

BBcan177 . wrote:

If you set the Resolver Inbound/Outbound Interfaces to "All", does it still cause these errors?

I saw this FreeBSD Bug report:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=250984

So it could be related to IPv6 and the requirement to append "::" to each IPv6 address in /etc/inc/unbound.inc (Lines 223-259)

Hi,

It's now All. Do you want me to try something else?

Actions #4

Updated by Xemanth - about 3 years ago

Hmm that ipv6 is interesting. Do you think my system is affected even through I don't have any ipv6 interfaces?

Actions #5

Updated by Xemanth - about 3 years ago

I fixed this but not sure what fixed it.
1) I changed listen interfaces to listen production and guest traffic and outgoing interface to wan
And then I restarted unbound service -> logs showed instantly error
/services_unbound.php: The command '/usr/local/sbin/unbound -c /var/unbound/unbound.conf' returned exit code '1', the output was '[1616565354] unbound[70616:0] error: can't bind socket: Address already in use for 127.0.0.1 port 953 [1616565354] unbound[70616:0] error: cannot open control interface 127.0.0.1 953 [1616565354] unbound[70616:0] fatal error: could not open ports'

2) Then I disabled whole Unbound and saved and applied.
Then I re-enabled it.
-> ZADAM, unbound started!

3) I ticked Register DHCP leases & Register DHCP static mappings -
Saved and applied -> still works.

4) But next, I should tick register connected OpenVPN clients... but I'm a bit scared to do that as I work remotely by OpenVPN connection and can't lose connectivity to the server.

I still don't figure out what caused this.
Is it possible that Unbound service restart is not enough for pfsense to release listen sockets?
Unbound should have more verbosity levels for logging. There seems to be only one level.

Actions #6

Updated by Jim Pingle about 3 years ago

  • Status changed from New to Rejected

Since there isn't any clear information here that points to an actionable bug, I'm closing it out for now.

If you can find a way to reliably reproduce the original condition, please post on the Netgate Forum to discuss the problem and if a bug can be identified in pfSense code, then we can open a fresh issue with more accurate details.

Actions

Also available in: Atom PDF