Bug #14056

closed

DNS Resolver experiences intermittent resolution failures with SSL over TLS due to ASLR

Added by Todd Adams almost 2 years ago. Updated about 1 year ago.

Status: Closed
Priority: Normal
Category: DNS Resolver
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Plus Target Version: 23.09
Release Notes: Default
Affected Version:
Affected Architecture:

Description

DNS is completely broken on pfSense 23.01 with SSL enabled and using Quad9. Users on Reddit have reported the same bug: https://www.reddit.com/r/PFSENSE/comments/115ba2d/

DNS is intermittent, then just stops resolving. It might start resolving again later, but there is no rhyme or reason to the behavior. Quad9 and SSL appear to be the common factors.


Related issues

Has duplicate Regression #14368: Intermittent DNS failures (Duplicate)

Actions #1

Updated by Jim Pingle almost 2 years ago

  • Subject changed from Major DNS Bug 23.01 with Quad9 on SSL to Intermittent DNS resolution failures with SSL over TLS to Quad9
  • Status changed from New to Feedback

There isn't nearly enough detail here to definitely say it's a bug and not a settings issue somewhere. For example, on the forum some people noticed that they had enabled DNSSEC for forwarding servers (such as this case) which is not a recommended combination and is known to cause issues.

I also have a couple systems in my lab set to use DNS over TLS to quad9 and others and see no failures there.

Here is a forum thread with some other similar experiences:

https://forum.netgate.com/topic/178042/23-01-upgrade-unbound-issue

If disabling DNSSEC doesn't help then we'll still need more information to determine what is happening in your specific case, but this site is not for support or diagnostic discussion. Please start a new thread on the Netgate Forum since it may be a different issue if your symptoms don't match exactly.

Actions #2

Updated by Todd Adams almost 2 years ago

Jim Pingle wrote in #note-1:

There isn't nearly enough detail here to definitely say it's a bug and not a settings issue somewhere. For example, on the forum some people noticed that they had enabled DNSSEC for forwarding servers (such as this case) which is not a recommended combination and is known to cause issues.

I also have a couple systems in my lab set to use DNS over TLS to quad9 and others and see no failures there.

Here is a forum thread with some other similar experiences:

https://forum.netgate.com/topic/178042/23-01-upgrade-unbound-issue

If disabling DNSSEC doesn't help then we'll still need more information to determine what is happening in your specific case, but this site is not for support or diagnostic discussion. Please start a new thread on the Netgate Forum since it may be a different issue if your symptoms don't match exactly.

Thank you for that forum link. According to that thread, another commenter notes that they still had issues even with DNSSEC disabled. I see the same thing: while it was more reliable for me, unbound would still 'stop' working until restarted (with no rhyme or reason, and nothing helpful in the logs). Disabling SSL has been the only long-term fix so far.

Actions #3

Updated by Jordan G almost 2 years ago

I have successfully been using DNSoTLS with 1.1.1.2/security.cloudflare-dns.com for some time and have temporarily switched to 9.9.9.9/dns.quad9.net and its secondary. All other settings are the same, and I haven't noticed any issues since the switch.

Increase log levels and/or log queries so you have more details to work with: https://docs.netgate.com/pfsense/en/latest/troubleshooting/dns-queries.html#troubleshooting-dns-queries

Actions #4

Updated by Jim Pingle over 1 year ago

Actions #5

Updated by Doug Miles over 1 year ago

This is a regression, I believe, and definitely does not just affect 9.9.9.9. No settings changes occurred when I first began experiencing this immediately after the 23.01 update. I'm using 1.1.1.2/security.cloudflare-dns.com and experiencing this issue, but in the threads below, others are also experiencing it with 9.9.9.9. Some people in the discussions below had DNSSEC enabled, which is a separate, different issue, but I disabled that months before upgrading to 23.01. (But I did have it enabled at one time.) Enabling/disabling DNS over TLS reliably reproduces/resolves the issue for me, although it's always intermittent. It's most noticeable on iOS devices in my experience - I theorize this may be because I've found macOS (and presumably iOS) to stubbornly cache DNS resolution failures.

Here are a couple of additional discussions I found while researching my issue:

https://forum.netgate.com/topic/177979/23-01-breaks-dns-resolver-and-pfblocker/23

https://forum.netgate.com/topic/178413/major-dns-bug-23-01-with-quad9-on-ssl

I'll see about getting some log detail soon.

Actions #6

Updated by Marcos M over 1 year ago

https://forum.netgate.com/post/1104001
This issue is not unique to pfSense. We do have a workaround:
  1. Stop the Unbound service
  2. Run elfctl -e +noaslr /usr/local/sbin/unbound
  3. Start the Unbound service

Ref: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270912
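
The workaround steps above could be scripted roughly as follows. This is a minimal sketch, not an official procedure: the `elfctl` invocation and the binary path come from the comment above, while the `pfSsh.php playback svc` commands for stopping and starting the service, the `DRY_RUN` switch, and the function names `run` and `disable_unbound_aslr` are assumptions added for illustration. The service can equally be stopped and started from the GUI.

```shell
#!/bin/sh
# Path to the Unbound binary, as given in the workaround.
UNBOUND_BIN="/usr/local/sbin/unbound"

# Hypothetical helper: print the command when DRY_RUN is non-empty,
# otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop Unbound, set the noaslr ELF feature flag on its binary, then start
# the service again even if elfctl failed, so DNS is not left down.
# Assumption: "pfSsh.php playback svc" is used to control the service.
disable_unbound_aslr() {
    run pfSsh.php playback svc stop unbound || return 1
    run elfctl -e +noaslr "$UNBOUND_BIN"
    rc=$?
    run pfSsh.php playback svc start unbound || return 1
    return "$rc"
}
```

Setting `DRY_RUN=1` before calling `disable_unbound_aslr` prints the three commands without executing them, which allows the sequence to be reviewed before it is run on a live firewall.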

Actions #7

Updated by Jim Pingle over 1 year ago

  • Assignee set to Christian McDonald
  • Target version set to 2.7.0
  • Plus Target Version set to 23.09

Actions #8

Updated by Christian McDonald over 1 year ago

  • Status changed from Feedback to Confirmed

Actions #9

Updated by Jim Pingle over 1 year ago

  • Subject changed from Intermittent DNS resolution failures with SSL over TLS to Quad9 to DNS Resolver experiences intermittent resolution failures with SSL over TLS due to ASLR

Updating subject to reflect current knowledge.

Christian added an option to the Unbound port to disable ASLR for now until the bug is addressed upstream.

On the latest 23.05 RC snapshot, it's now disabled by default:

: pkg info unbound | grep ASL
    NOASLR         : on
: elfctl `which unbound` | grep -i aslr
noaslr          'Disable ASLR' is set.
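
The check shown above can be wrapped in a small helper for use in scripts. This is a minimal sketch that assumes the `elfctl` output format shown in this ticket; the function name `noaslr_is_set` is hypothetical, and the helper assumes that any other state of the flag is not reported with a line ending in "is set.".

```shell
#!/bin/sh
# Reads `elfctl <binary>` output on stdin and succeeds (exit status 0)
# only if the noaslr line ends with "is set.".
noaslr_is_set() {
    grep -Eq '^noaslr[[:space:]].*is set\.$'
}

# Example use on a system that has elfctl (path taken from this thread):
#   elfctl /usr/local/sbin/unbound | noaslr_is_set && echo "ASLR is disabled for unbound"
```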

Actions #10

Updated by Christian McDonald over 1 year ago

  • Status changed from Confirmed to Closed

We are disabling ASLR on Unbound until a proper fix lands upstream.

Actions #11

Updated by Michael Vincent about 1 year ago

This ticket has a target version of 23.09, but I'm pretty sure it was fixed in 23.05. I came across it in the 23.09 release notes.

I'm currently running 23.05 and see:

$ elfctl /usr/local/sbin/unbound
noaslr          'Disable ASLR' is set.

Actions #12

Updated by Marcos M about 1 year ago

Disabling ASLR was a workaround until it was fixed upstream in unbound (which is now the case). In 23.09, unbound is built with ASLR again, hence the target.
