Regression #11634
closed
bind hangs when pfsense is reconnecting as an openvpn client to a TUN openvpn server
Description
I have had a problem with bind since 2.5.0: it stops responding to queries each time an OpenVPN TUN client disconnects from and reconnects to another OpenVPN server (the pfSense here is an OpenVPN client to another pfSense)...
Here is what appears in the log when named stops responding:
filterdns9294: merge_config: configuration reload
[...]
named53473: network: error: creating IPv4 interface ovpnc2 failed; interface ignored
filterdns9294: merge_config: configuration reload
[...]
The only way to restart bind without rebooting pfSense is to kill the named process from the console, then start it again via the web interface.
To work around the problem:
- I changed the OpenVPN connection from TUN to TAP, so that the interface stays up even when reconnecting (it has now been working for 40 hours without a problem)
- I also removed the need for filterdns by using plain IP addresses in aliases instead of FQDNs
By the way:
- bind is set to "Listen on ALL interfaces" in its config
- responding to queries and transferring zones through the VPN is required (and the ACLs are set)
- I upgraded to bind 9.16.12, which did not fix the problem
- unbound is totally disabled; only bind is used (there is no unbound process involved here)
Updated by Jim Pingle over 3 years ago
- Project changed from pfSense to pfSense Packages
- Category changed from DNS Resolver to BIND
Updated by Stéphane BARBARAY over 3 years ago
The problem seems worse than I thought: as soon as you restart an OpenVPN service, even as a server, or as soon as a network interface reappears, named immediately hangs...
Updated by itfabrica Tech over 3 years ago
Good day! I can confirm the problem. I created a ticket, but I was told that this is not an error:
https://redmine.pfsense.org/issues/11542#change-51602
Updated by Stéphane BARBARAY over 3 years ago
That problem is maybe not directly related, but I encountered it too. If you wait 5 minutes before trying to reconnect, without restarting the OpenVPN service, then it works again; but if you restart the OpenVPN service, bind stops processing queries because an interface disappeared and then reappeared... So the two problems combined are really annoying (to put it politely).
Updated by Azamat Khakimyanov 5 months ago
- Status changed from New to Resolved
I was able to reproduce this issue on 2.5.0 CE (Bind 9.16_10).
With Bind active and working (I was able to resolve FQDNs using the dig @127.0.0.1 <domain name> command), I created a new OpenVPN client to another pfSense OpenVPN server. Immediately after I created this OpenVPN tunnel, the following was logged:
Jun 17 09:39:58 kernel tun1: changing name to 'ovpnc1'
Jun 17 09:39:58 php-fpm 360 OpenVPN PID written: 44957
Jun 17 09:39:58 check_reload_status 389 Reloading filter
Jun 17 09:39:59 kernel ovpnc1: link state changed to UP
Jun 17 09:39:59 check_reload_status 389 rc.newwanip starting ovpnc1
Jun 17 09:40:00 php-fpm 360 /rc.newwanip: rc.newwanip: Info: starting on ovpnc1.
Jun 17 09:40:00 php-fpm 360 /rc.newwanip: rc.newwanip: on (IP address: 10.205.200.2) (interface: []) (real interface: ovpnc1).
Jun 17 09:40:00 php-fpm 360 /rc.newwanip: rc.newwanip called with empty interface.
Jun 17 09:40:00 check_reload_status 389 Reloading filter
Jun 17 09:40:00 php-fpm 360 /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - -> 10.205.200.2 - Restarting packages.
Jun 17 09:40:00 check_reload_status 389 Starting packages
Jun 17 09:40:01 php-fpm 360 /rc.start_packages: Restarting/Starting all packages.
Bind stopped listening on all interfaces. In the DNS log I got:
Jun 17 09:39:59 named 14375 listening on IPv4 interface ovpnc1, 10.205.200.2#53
Jun 17 09:39:59 named 14375 creating IPv4 interface ovpnc1 failed; interface ignored
Jun 17 09:40:01 named 14375 received control channel command 'sync -clean'
Jun 17 09:40:01 named 14375 dumping all zones, removing journal files: success
Jun 17 09:40:01 named 14375 received control channel command 'stop -clean'
Jun 17 09:40:01 named 14375 no longer listening on 192.168.122.200#53
Jun 17 09:40:01 named 14375 no longer listening on 172.31.200.1#53
Jun 17 09:40:01 named 14375 no longer listening on 127.0.0.1#53
Jun 17 09:40:01 named 14375 shutting down: flushing changes
Jun 17 09:40:01 named 14375 stopping command channel on 127.0.0.1#8953
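For anyone trying to spot this failure in their own logs, the "creating IPv4 interface ... failed; interface ignored" message quoted in this report can be searched for with a short script. This is only a rough sketch: the function name failed_interfaces is made up for illustration, and matching an IPv6 variant of the message is an assumption, since only the IPv4 form appears here.

```python
import re

# Matches the named message quoted in this report.
# The IPv6 alternative is an assumption; only the IPv4 form was observed here.
FAILED_IFACE = re.compile(
    r"creating IPv[46] interface (\S+) failed; interface ignored"
)


def failed_interfaces(log_lines):
    """Return interface names, in order of first appearance, for which
    named logged a failed listener creation."""
    seen = []
    for line in log_lines:
        match = FAILED_IFACE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

Feeding it the lines of the DNS log (for example, the lines of an open log file) returns the affected interface names, which can then be compared with the OpenVPN interfaces that were just restarted.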
Status > Services showed that named was still active.
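Since the services page can report named as active while it no longer answers, an actual query is a more reliable check, similar to the dig @127.0.0.1 test above. The following is a minimal sketch using only the Python standard library; the function names, the default name example.com, and the 2-second timeout are all illustrative assumptions, not part of pfSense or BIND.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed with its length.
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    # Root label terminator, then QTYPE=A (1) and QCLASS=IN (1).
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def named_answers(server: str = "127.0.0.1", name: str = "example.com",
                  port: int = 53, timeout: float = 2.0) -> bool:
    """Return True if the server sends back any reply carrying our query ID."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(512)
        except OSError:
            # Covers timeouts as well as refused or unreachable ports.
            return False
    return len(reply) >= 2 and reply[:2] == query[:2]
```

Calling named_answers() after an OpenVPN reconnect returns False when nothing replies on 127.0.0.1 port 53 within the timeout. It only checks that some reply arrives, not that the answer is correct.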
Restarting the named service changed nothing, and I wasn't able to stop it because of this:
Jun 17 10:11:19 php-fpm 360 /status_services.php: The command '/usr/local/etc/rc.d/named.sh stop' returned exit code '126', the output was 'sh: /usr/local/etc/rc.d/named.sh: Permission denied'
After a reboot, named worked as usual, BUT after reconnecting OpenVPN it failed again.
I retested this config on 24.03 (Bind 9.17): creating, restarting, and disabling/enabling OpenVPN had no effect on Bind. It continued working.
I marked this Regression as resolved.