Bug #13542
closed
Boot delay caused when OpenVPN config uses alias list that relies on DNS
Description
pfSense+ 22.05 in Azure
I use OpenVPN with an alias list that includes 76 (and growing) FQDNs.
When the system is set to internal DNS with public fallback, the system hangs for 10+ minutes at boot at "Syncing OpenVPN settings". I assume this is because each record lookup fails and has to time out before it is resolved via public DNS.
Changing this option to public DNS only works around the issue, but there are some cases where I need the firewall to use internal DNS to work with domain overrides.
Perhaps the resolver could be brought online just after WAN is established, or the fallback behavior could be tweaked so that it falls back for an entire alias list instead of each individual entry (since tables are refreshed periodically anyway).
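To illustrate the difference between falling back per entry and falling back once per alias list, here is a minimal simulation. It does no real DNS; the 10-second timeout, the `resolve_alias` function, and the `sticky_fallback` flag are assumptions made up for this sketch and are not pfSense settings or measured values.

```python
from typing import Dict, List, Optional, Tuple

# Assumed per-query timeout in seconds; the report does not state the real value.
PRIMARY_TIMEOUT_S = 10.0


def resolve_alias(
    hosts: List[str],
    primary: Optional[Dict[str, str]],
    fallback: Dict[str, str],
    sticky_fallback: bool,
) -> Tuple[Dict[str, Optional[str]], float]:
    """Resolve every host and return (answers, simulated seconds spent waiting).

    primary=None models a local resolver that is not running yet.
    sticky_fallback=True means: after the first timeout, skip the local
    resolver for the rest of the list.
    """
    waited = 0.0
    primary_given_up = False
    answers: Dict[str, Optional[str]] = {}
    for host in hosts:
        if primary is not None:
            # Local resolver is up: answer immediately, no waiting.
            answers[host] = primary.get(host)
            continue
        if not primary_given_up:
            # The query to the unreachable resolver has to time out first.
            waited += PRIMARY_TIMEOUT_S
            primary_given_up = sticky_fallback
        answers[host] = fallback.get(host)
    return answers, waited


# 76 placeholder hostnames, matching the alias size mentioned in the report.
hosts = [f"server{i}.example.com" for i in range(76)]
public = {h: "192.0.2.1" for h in hosts}

_, per_entry = resolve_alias(hosts, None, public, sticky_fallback=False)
_, per_list = resolve_alias(hosts, None, public, sticky_fallback=True)
print(per_entry, per_list)  # 760.0 10.0
```

Under these assumed numbers, per-entry fallback costs about 12.7 minutes of waiting while a single fallback for the whole list costs 10 seconds, which would be in line with the 10+ minute hang if the real timeout is of that order.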
Updated by Kris Phillips about 2 years ago
This doesn't sound like a bug, as the issue is not present when using different DNS servers based on the original report. This sounds like an issue with DNS Rebinding or something similar.
Please clarify what you mean by "FQDNs" in OpenVPN.
Updated by Adrien Carlyle about 2 years ago
In the OpenVPN server configuration option "IPv4 Local network(s)" I use an alias that contains FQDN hostnames like server.domain.com, server2.domain.com, etc.
I worked with Netgate support to narrow this issue down. I asked them if I should create a bug for this and was told to proceed.
This is a bug because OpenVPN is started before the resolver service, which causes the lookup failures/fallbacks when using pfSense's default DNS resolution behavior. If the resolver is online earlier in the bootup sequence, OpenVPN comes online instantly (which I have proven by manually starting the resolver while OpenVPN is hung).
If the startup order can't be changed so that the resolver is online before OpenVPN, then the DNS fallback to public servers should be modified during bootup so that it only falls back once during the entire startup sequence.
I haven't tested what happens if I tell the firewall to use internal-only DNS and reboot it. My guess right now is that the table would be missing all DNS-based entries at boot and would fill in the missing entries when the alias list is refreshed.
Updated by Adrien Carlyle about 2 years ago
I just realized you were confused by what I was referring to in my workaround.
I meant that if I change the setting:
System -> General -> DNS Resolution behavior
From: Use Local DNS, fall back to remote DNS
To: Use Remote DNS servers, ignore local DNS
Everything boots up normally because pfSense uses remote DNS and doesn't depend on the resolver.
Updated by Chris W over 1 year ago
I'm unable to reproduce this to any noteworthy degree on 23.05.1. Steps taken:
1. Made an alias "mint" to mint.home.arpa, a Linux Mint VM behind the pfSense LAN that has a DHCP lease from pfSense.
2. Made another alias "red" to redmine.pfsense.org, just to give the resolver something external to try biting into.
3. Created a new remote access VPN server using the wizard. During initial setup I added "mint,red" to the IPv4 Local Network(s) area of the server config.
- DNS resolver is default settings, so in Resolver Mode.
- System > General Setup > DNS Resolution Behavior is set to 'Use local, fall back to remote' (the default).
4. Rebooted and watched for the Syncing OpenVPN Settings line.
The result was that it didn't sit on that line for more than a few seconds. If I set Unbound to Forwarding Mode and resolution behavior to 'use remote, ignore local', then it took longer, but only about 30 seconds. Adding mint.home.arpa as a static DNS entry made no difference (meaning that with that entry, Diagnostics > DNS Lookup returned the record from pfSense instead of the upstream DNS server, which is still Unbound).
Please upgrade your Azure machine image to 23.05.1 and let us know if you still see the same behavior.
Updated by Adrien Carlyle over 1 year ago
I no longer work for the company that operates this instance but I might be able to get access and retest this after an upgrade if they haven't deleted the VM.
In my experience I didn't have any issues during testing with a couple entries, it was only after adding many entries to the alias that the issue was severe enough to make me take notice.
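Since the delay seems to depend on the number of entries, a retest could time each lookup individually. The following is a rough sketch using only the Python standard library; `time_lookups` and `system_lookup` are names invented here, and the script would have to be run on a host that uses the same resolver settings as the firewall for the numbers to mean anything.

```python
import socket
import time
from typing import Callable, Iterable, List, Tuple


def system_lookup(host: str) -> None:
    """Resolve host through the OS resolver; raises OSError on failure."""
    socket.getaddrinfo(host, None)


def time_lookups(
    hosts: Iterable[str],
    lookup: Callable[[str], None] = system_lookup,
) -> List[Tuple[str, float, bool]]:
    """Return (host, seconds, succeeded) for each host, in input order."""
    results = []
    for host in hosts:
        start = time.perf_counter()
        try:
            lookup(host)
            ok = True
        except OSError:  # socket.gaierror is a subclass of OSError
            ok = False
        results.append((host, time.perf_counter() - start, ok))
    return results


def total_seconds(results: List[Tuple[str, float, bool]]) -> float:
    """Sum of the time spent across all lookups."""
    return sum(seconds for _, seconds, _ in results)
```

Calling `time_lookups(["server.domain.com", "server2.domain.com"])` with the real alias contents and comparing `total_seconds` for a few entries versus the full list would show whether the delay grows roughly linearly with the entry count.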
Updated by Adrien Carlyle about 1 year ago
I have access to the instance, will attempt to upgrade it and re-test.
Updated by Marcos M 8 days ago
- Project changed from pfSense Plus to pfSense
- Category changed from DNS Resolver to DNS Resolver
- Status changed from New to Closed
- Affected Plus Version deleted (22.05)
- Affected Architecture deleted (amd64)
In more recent versions, there are checks for the availability of DNS before requests are sent which should help with this issue. The issue can be reopened if it's reproducible in 24.11+.
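For readers wondering what a DNS availability check before sending requests might look like in general, here is a hedged sketch. It is not pfSense's implementation, which is not shown in this issue; `resolver_reachable` and `choose_dns_order` are hypothetical helpers, and probing TCP port 53 on 127.0.0.1 is just one assumed way of testing whether a local resolver is listening.

```python
import socket
from typing import List


def resolver_reachable(
    host: str = "127.0.0.1", port: int = 53, timeout_s: float = 1.0
) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


def choose_dns_order(local_ready: bool) -> List[str]:
    """Skip the local resolver entirely when it is known to be down."""
    return ["local", "remote"] if local_ready else ["remote"]
```

With a check like this done once up front, `choose_dns_order(resolver_reachable())` avoids paying a timeout for every alias entry while the local resolver has not started yet.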