Bug #9296

Rule / Alias FQDN-Resolution broken

Added by Ph. T over 1 year ago. Updated 3 days ago.

Status: New
Priority: High
Assignee:
Category: Aliases / Tables
Target version:
Start date: 01/30/2019
Due date:
% Done: 100%
Estimated time:
Affected Version: 2.4.4_2
Affected Architecture: All

Description

If you are using FQDN aliases, each FQDN can only be used once; if
you use the same FQDN in a second alias, the generated tables are incomplete.

No DNS-Server/Resolver on the firewall is used. External DNS
resolvers are configured.

Example:
alias1 : fqdn1, fqdn2, fqdn3
alias2 : fqdn4, fqdn2, fqdn3

Generated tables are incomplete
alias1 : fqdn1, fqdn2, fqdn3
alias2 : fqdn4 (the others are missing)

alias2 contains only fqdn4; fqdn2 and fqdn3 are missing.

This bug seems to have appeared with 2.4.4_p1 and still exists in 2.4.4_p2;
I am not sure whether this behavior is present in 2.4.4.

I am working on a minimal example, which I will provide.

Rule_Set.PNG (37.4 KB), Ruleset, Ph. T, 01/31/2019 05:29 AM
Alias_Configuration.PNG (24.2 KB), Alias-Configuration, Ph. T, 01/31/2019 05:29 AM
table_fqdn1.PNG (35.8 KB), table fqdn1, Ph. T, 01/31/2019 05:29 AM
table_fqdn2.PNG (49.6 KB), table fqdn2, Ph. T, 01/31/2019 05:29 AM
191011_Tnk_config-pfSense.localdomain-20191011143458.xml (15.9 KB), Ph. T, 10/11/2019 09:40 AM
pfsense.png (39 KB), Art Manion, 10/31/2019 11:48 PM
filterdns-2.0_3.txz (17.5 KB), filterdns pkg built by me on a FreeBSD 11.2 VM, Eduard Rozenberg, 02/01/2020 10:19 AM

History

#1 Updated by Ph. T over 1 year ago

I have now prepared a minimal example:

As you can see fqdn1 is missing the entry for one.one.one.one

Please FIX

#2 Updated by Eduard Rozenberg over 1 year ago

I believe my issues may be related to this. We updated to 2.4.4 p2 on Jan 9, but only in the past few days have seen the problems.

The firewalls and sites at several locations are configured to allow remote access based on firewall aliases and rules using those aliases. The aliases sometimes contain a mix of IP addresses (/32), IP networks (/29, for example), and DNS names (something.company.com).

Since the past few days, the pfSense firewalls at the various sites all reject my remote connection attempts, and the rejections are visible in the firewall logs.

It's...a problem.

#3 Updated by Jim Pingle over 1 year ago

  • Category set to Rules / NAT
  • Assignee set to Luiz Souza
  • Target version set to 48
  • Affected Architecture All added
  • Affected Architecture deleted ()

#4 Updated by Robert Gijsen over 1 year ago

2.4.4-RELEASE-p2, I've had this multiple times. At the moment I can even sort of reproduce it.
When adding hosts to an alias my AD DNS server logs:

2/18/2019 12:39:54 PM 1B40 PACKET 000001A857BE1DC0 UDP Rcv <pfsense IP> a463 Q [0001 D NOERROR] AAAA (8)host(7)i'm(2)resolving(0)

2/18/2019 12:39:54 PM 1B3C PACKET 000001A858859CC0 UDP Rcv <pfsense IP> 519a Q [0001 D NOERROR] AAAA (8)host(7)i'm(2)resolving(0)

2/18/2019 12:39:54 PM 1B40 PACKET 000001A857BE1DC0 UDP Snd <pfsense IP> a463 R Q [8081 DR NOERROR] AAAA (8)host(7)i'm(2)resolving(0)

2/18/2019 12:39:54 PM 1B3C PACKET 000001A858859CC0 UDP Snd <pfsense IP> 519a R Q [8085 A DR NOERROR] AAAA (8)host(7)i'm(2)resolving(0)

This is an external host, i.e. a DNS name that has to be resolved externally by our DNS servers. That seems to work fine, and the result gets sent back to pfSense. However, the host does NOT end up in the table for that alias. When I add another DNS name in the same domain (so hosted on the same DNS servers on the internet), that works fine. I also tried others such as www.tweakers.net, www.nos.nl and bbc.co.uk: I see the same successful entries in my DNS debug log, and they DO end up in the alias table as well.

pfSense Resolver log:
Feb 18 12:47:14 filterdns Adding host <Host that gets added to the alias> (I just added that one to the alias)
Feb 18 12:47:14 filterdns Adding Action: pf table: B_it_webserver host: <Host that gets added to the alias>
Feb 18 12:47:14 filterdns Adding Action: pf table: B_it_webserver host: <host that does NOT end up in table> (I just added that one in the alias as well)
Feb 18 12:47:14 filterdns Adding Action: pf table: B_it_webserver host: <existing host, which was already in the alias>

The host that does NOT end up in the table here is, by the way, successfully added to some other aliases, where it works just as expected. But for this alias the 'Adding host' line is missing from the pfSense log.

I tried creating a new alias with the same three hosts as in the alias I used above. Here NONE of them ended up in the table, even after waiting for about 20 minutes, while in the alias used above two out of three (and the same two every time, no matter what order I put them in) work. Then I added www.tweakers.net as another try, and that one got in there immediately.
I again killed filterdns, restarted it and poof - the tables immediately got filled as they should. So it seems filterdns is partially functional - some hosts get added, some don't. It could indeed be related to hosts that already exist in a table somewhere; however, restarting filterdns at least populates them for a while.

Tell me which logs you need. As it seems I can now reproduce this at will (also on my second CARP / HA node, by the way), I can probably provide all the logs needed.

#5 Updated by Robert Gijsen over 1 year ago

I've just downgraded a test-machine to 2.4.4 release, and that works fine. Keeping it there for a while.

#6 Updated by Eduard Rozenberg over 1 year ago

Shortly after I posted my problem above 20 days ago, it started working again on its own.

Then today, it is again not working.

So it may be a sporadic issue with the alias resolution, that doesn't happen consistently. Have not been able to pin down the issue at all.

#7 Updated by Eduard Rozenberg over 1 year ago

I can confirm my issue is the same as described by the other posters on this bug.

Logs show that filterdns claims to be doing the right thing - all expected alias entries (FQDNs, IPs, networks) show up in:
$ clog /var/log/resolver.log | grep "Adding Action"

But the alias table is incomplete, some IP addresses are missing:
$ pfctl -T show -t my_alias_name

There is no DNS resolution issue with any of the FQDNs - if I ping the FQDNs from the firewall, their IP addresses are resolved.

Neither restarting the filter nor re-saving the alias helps.
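
The log side of this comparison can be narrowed to a single alias. The following is a minimal sketch, assuming the "Adding Action: pf table: <table> host: <host>" line format quoted in this report; the function name logged_hosts is made up for illustration.

```shell
#!/bin/sh
# logged_hosts TABLE: read filterdns log lines on stdin and print every host
# that filterdns announced for TABLE in an "Adding Action" line.
# Hypothetical helper; it assumes the log format quoted in this report.
logged_hosts() {
    sed -n "s/.*Adding Action: pf table: $1 host: \(.*\)\$/\1/p" | sort -u
}
```

On the firewall this could be fed with `clog /var/log/resolver.log | logged_hosts my_alias_name`. The output still has to be compared by hand with `pfctl -T show -t my_alias_name`, because the table holds resolved addresses rather than names.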

#8 Updated by Eduard Rozenberg over 1 year ago

I've also ruled out some other possibilities below -

Not the issue:
https://docs.netgate.com/pfsense/en/latest/firewall/thread-error-using-many-hostname-in-aliases.html
(I don't have a threads error in logs, and setting this tunable did not help)

Not the issue:
Mixing FQDNs and IPs - I tried creating a new alias with only a single FQDN, one of those that don't work in the original alias. Still no luck.

#9 Updated by Jim Pingle over 1 year ago

  • Target version changed from 48 to 2.5.0

#10 Updated by Azamat Khakimyanov over 1 year ago

I see this behavior on 2.4.4_p2, on 2.4.5-dev and on 2.5.0-dev.
As a workaround:
- run 'pkill filterdns' in the console
- then use Status / Filter Reload to start the 'filterdns' service again

#11 Updated by Gavin Stewart over 1 year ago

As a workaround I have installed the Cron package with the following additional entries:

*/15 * * * * root killall -9 filterdns; sleep 2; /usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 300 -c /var/etc/filterdns.conf -d 1     
@reboot      root sleep 10; killall -9 filterdns; sleep 2; /usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 300 -c /var/etc/filterdns.conf -d 1

#12 Updated by Robert Gijsen over 1 year ago

I know it's targeted for 2.5.0, but still I'd like to inform people here that 2.4.4_3 does indeed NOT fix this, making it yet another update that kills main functionality. Be aware, and be extremely reluctant to update!

#13 Updated by Rudolf Mayerhofer over 1 year ago

Setting "Aliases Hostnames Resolve Interval" to 30 seconds (which should be the minimum value) in System/Advanced/Firewall&NAT seems to work around the issue for me (which could be some kind of race condition in filterdns but that's just guessing on my side).

#14 Updated by Christoforos Tsoukaris over 1 year ago

Rudolf Mayerhofer wrote:

Setting "Aliases Hostnames Resolve Interval" to 30 seconds (which should be the minimum value) in System/Advanced/Firewall&NAT seems to work around the issue for me (which could be some kind of race condition in filterdns but that's just guessing on my side).

Based on what Rudolf wrote, I changed the value of "Aliases Hostnames Resolve Interval" from empty to 300 seconds. That setting, combined with a filter reload from the "Status/Filter Reload" menu, made my rules work as expected again.

I will continue testing and update this post if I find anything else.

#15 Updated by Gavin Stewart over 1 year ago

Christoforos Tsoukaris wrote:

Based on what Rudolf wrote, I changed the value of "Aliases Hostnames Resolve Interval" from empty to 300 seconds. That setting, combined with a filter reload from the "Status/Filter Reload" menu, made my rules work as expected again.

I will continue testing and update this post if I find anything else.

If you look at the cron entries I have mentioned earlier (#9296#note-11), you will see that I have the interval (-i) set to 300. I am still seeing missing entries in the alias tables on occasion, which do get corrected when cron kills and restarts filterdns within the next 15 mins.

#16 Updated by Mark Monaghan over 1 year ago

The crontab entries mentioned in #11 didn't work for me: they just kept adding new filterdns processes, eventually causing the firewall to trigger CARP/HA, add high latency to VPN and internet traffic, and finally stop passing traffic altogether. I installed the cron GUI package (but you could just as easily edit /etc/crontab directly) and changed the lines to:

*/15 * * * *  root    /usr/bin/pkill -f "/usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 30 -c /var/etc/filterdns.conf -d 1"; sleep 2; /usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 30 -c /var/etc/filterdns.conf -d 1
@reboot       root    sleep 10; /usr/bin/pkill -f "/usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 30 -c /var/etc/filterdns.conf -d 1"; sleep 2; /usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 30 -c /var/etc/filterdns.conf -d 1

I've checked that these cron jobs now correctly kill the old processes before starting new ones, and that the filterdns process correctly does its job as long as the restarts are in place. (As noted in other comments on this bug, it still stops resolving after a while; I've not had time to monitor the firewalls to find out exactly how long it takes before the process stops, so apologies for that.)

#17 Updated by Gavin Stewart over 1 year ago

Mark Monaghan wrote:

The crontab entries mentioned in #11 didn't work for me: they just kept adding new filterdns processes,

Very interesting. I'm not seeing that occur at all (and it was something I was monitoring closely when I set it up). I wonder what makes this operation different on your system? Could you possibly have forgotten the "-9" argument to "killall"?

#18 Updated by Mark Monaghan over 1 year ago

Gavin Stewart wrote:

Very interesting. I'm not seeing that occur at all (and it was something I was monitoring closely when I set it up). I wonder what makes this operation different on your system? Could you possibly have forgotten the "-9" argument to "killall"?

No, sorry, I copied and pasted the commands verbatim from here to ensure that I didn't make any errors when implementing them. The -9 was definitely in there. What I found was that if I ran them individually, or even as a grouped command set, from the CLI, they worked perfectly, but they failed to kill any processes when run via cron. This was all done and tested on 2.4.4-p3. I cannot comment on how this performed prior to this version, as it wasn't implemented on 2.4.4-p2 or lower, only after the system was upgraded to the latest stable version.

This is the reason I switched the cron job to pkill: nothing I tried would get killall or even kill to terminate the filterdns process via cron, but pkill worked, for reasons known only to the OS. However, that also presented its own challenges. Unless I used the exact filter of -f "/usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 30 -c /var/etc/filterdns.conf -d 1" rather than pkill filterdns or pkill -f "filterdns", it was seeing filterdns in its own cron job and killing itself before it killed any filterdns processes. So until I put the long filter in place, I wasn't any further forward and still had filterdns processes stacking up, causing the system to fall over eventually. (I put this down to the system running out of open file handles, as it certainly never ran out of memory before the crashes that I experienced started to happen.)
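
One way to avoid matching on the command line at all is to stop the process through the PID file that the filterdns invocations in this thread pass with -p. The following is a minimal sketch, assuming that file holds the PID of the running daemon. The name stop_by_pidfile is made up for illustration, and whether this behaves any better under cron on an affected system is untested.

```shell
#!/bin/sh
# stop_by_pidfile PIDFILE: send SIGTERM to the process whose PID is stored in
# PIDFILE. Returns non-zero if the file is missing or empty, if its first line
# is not a plain number, or if the signal cannot be delivered.
# Hypothetical helper; assumes the daemon writes its PID to the file (-p).
stop_by_pidfile() {
    [ -s "$1" ] || return 1
    pid=$(head -n 1 "$1")
    case $pid in
        ''|*[!0-9]*) return 1 ;;   # refuse anything that is not a plain number
    esac
    kill -TERM "$pid"
}
```

A cron line could then call `stop_by_pidfile /var/run/filterdns.pid` before starting filterdns again. Since the pattern never appears on the cron job's own command line, the job cannot match itself.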

#19 Updated by Rudolf Mayerhofer over 1 year ago

Rudolf Mayerhofer wrote:

Setting "Aliases Hostnames Resolve Interval" to 30 seconds (which should be the minimum value) in System/Advanced/Firewall&NAT seems to work around the issue for me (which could be some kind of race condition in filterdns but that's just guessing on my side).

As a follow up: With 30 seconds resolve interval things are still working fine one month later without killing/restarting filterdns.

#20 Updated by Eduard Rozenberg over 1 year ago

Rudolf Mayerhofer wrote:

As a follow up: With 30 seconds resolve interval things are still working fine one month later without killing/restarting filterdns.

Unfortunately this doesn't solve my problem. Same issues, regardless of the refresh value. Continues to make life difficult on a regular basis.

#21 Updated by Peter van der Kleij over 1 year ago

I think I have a similar problem.
My inbound rule did not work with an FQDN in the Alias. (Whitelist for source addresses) Weird thing was that only one ip-address in the Alias (not the FQDN) did not work, restarting servers/pfsense and such did not give any result.

When I set 'Aliases Hostnames Resolve Interval' to 300 seconds, it WORKS; when I leave it empty, it FAILS immediately.

2.4.4-RELEASE-p3, FreeBSD 11.2-RELEASE-p10

#22 Updated by Art Manion about 1 year ago

Netgate SG-4860 running 2.4.4-RELEASE-p3 (amd64). At least twice I've experienced issues, I assume involving filterdns, where aliases were updated via the GUI but did not make it into the pf table (and I believe the updates didn't land in /var/etc/filterdns.conf either, but I can't confirm).

The alias/table is 58 entries, mostly DNS names, some IP addresses, and some other pfSense aliases. The last time this problem happened neither DNS name, IP address, nor pfSense aliases were added to the pf table. filterdns was running. killall -9 filterdns && /etc/rc.filter_configure fixed it.

#23 Updated by Justin J about 1 year ago

Also experiencing this issue on 2.4.4-p2 and now 2.4.4-p3. If FQDNs are removed, the table updates correctly. Because many cloud service provider clusters move and reallocate IPs, it is impractical to use individual host or CIDR lists, as they need updating every time something changes on the remote server. In my case I first noticed the issue after one entry was updated and became a duplicate of an IP in a CIDR that was already in the table. I've tried killing and restarting filterdns and changing the resolve interval to 30 seconds, but with no improvement. Sometimes the table would partially generate, other times it was completely empty.
Can this be escalated for a resolution before 2.5 as it breaks the core firewall functionality for those of us using a cloud or hybrid cloud setup?

#24 Updated by Robert Gijsen about 1 year ago

I second Justin's message / question. pfSense is completely unusable after the initial 2.4.4 release. With filterdns not working properly, it fails as a firewall completely. That leaves us with the choice of updating pfSense to a completely useless, non-working version, or leaving it at 2.4.4 and vulnerable to now-known security issues. Both are unacceptable.

This should certainly get higher priority.

#25 Updated by Tom Hebert about 1 year ago

Most of you are more experienced at this than me so please be tolerant if this is a dumb question.

I added a Firewall Alias containing a single FQDN, xxx.mydomain.com. A Diagnostics=>DNS Lookup for xxx.mydomain.com returns three addresses, one IPv4 and two IPv6 (received from DHCPv6 and RA). However when I inspect the alias's table using Diagnostics=>Tables, only the IPv4 address is listed. It follows that the rule referencing the alias isn't passing traffic to the IPv6 addresses. I would expect to see all three addresses in the table. Is that expectation correct or am I missing something?

For background, I am using resolver and there is a domain override in place for mydomain.com (TypeTransparent).

Am I experiencing this bug?

#26 Updated by Justin J about 1 year ago

That sounds like it might be something else Tom. Check your output from the CLI with: pfctl -T show -t ALIASNAME
If it's not there try making a forum post as the discussion here should be directly about the bug, not diagnosing the possibility of your setup having it.

#27 Updated by Tom Hebert about 1 year ago

Justin J: I took your advice and posted on the forum and was promptly referred back here. Here's the link in case you are inclined to read it. https://forum.netgate.com/topic/145553/firewall-alias-not-updating-table-correctly

#28 Updated by Jim Pingle about 1 year ago

  • Category changed from Rules / NAT to Aliases / Tables

#29 Updated by Robert Gijsen about 1 year ago

It's been about 8 months now that we are unable to update / patch our firewalls because of this. Yeah I know, open source, contribute yourself if you don't like it, and so on. But still no update on this from pfSense team? By now we are seriously considering moving away. It's unacceptable that we have to run a firewall that's by now simply not secure anymore because of the now public security flaws.

Can we get an official status update on this? And why this doesn't get higher priority? Is there an official ETA for 2.5?

#30 Updated by John K about 1 year ago

Robert Gijsen wrote:

It's been about 8 months now that we are unable to update / patch our firewalls because of this. [...] And why this doesn't get higher priority? Is there an official ETA for 2.5?

This issue is becoming a show stopper for us as well.

#31 Updated by Angel Briceño about 1 year ago

Ph. T wrote:

If you are using FQDN aliases, each FQDN can only be used once; if
you use the same FQDN in a second alias, the generated tables are incomplete.

No DNS-Server/Resolver on the firewall is used. External DNS
resolvers are configured.

Example:
alias1 : fqdn1, fqdn2, fqdn3
alias2 : fqdn4, fqdn2, fqdn3

Generated tables are incomplete
alias1 : fqdn1, fqdn2, fqdn3
alias2 : fqdn4 (the others are missing)

alias2 contains only fqdn4; fqdn2 and fqdn3 are missing.

This bug seems to have appeared with 2.4.4_p1 and still exists in 2.4.4_p2;
I am not sure whether this behavior is present in 2.4.4.

I am working on a minimal example, which I will provide.

I had the same problem. The rules stack has a limit and this means that domain names cannot be resolved. For example:

Alias-1 => Network => "10.10.0.0/24" is not equivalent to "10.10.0.0-10.10.0.254".

10.10.0.0/24 is valid notation for a rule, but 10.10.0.0-10.10.0.254 means that the entire range is listed address by address:
10.10.0.0
10.10.0.1
10.10.0.2
10.10.0.3
10.10.0.4
...
10.10.0.254

This range of IPs causes a problem for the aliases that come "next" in the system, and it is quite possible that they cannot be resolved.

I have removed all gigantic ranges of IPs and the problem is solved.

This problem has already affected several cloud providers, so most of them do not accept FQDN aliases in their routing rules or ACLs.

#32 Updated by Gavin Stewart about 1 year ago

Angel Briceño wrote:

I have removed all gigantic ranges of IPs and the problem is solved.

I have no ranges of IP addresses (only networks defined in CIDR notation), and the problem persists.

#33 Updated by Ph. T about 1 year ago

I am very, very unhappy with the time it is taking to deal with and fix this problem.
Is there any way to speed up the process? I can provide any additional info.

I think Angel has a completely different problem. The limit on table entries
was an issue some time ago because of big bogon tables. A fix might be to
increase the maximum table entries, or not to use the bogon table.

    System > Advanced > Firewall & NAT
    Firewall Maximum Table Entries

We have set this value to 400000.

#34 Updated by Luiz Souza about 1 year ago

Ph. T wrote:

I am very, very unhappy with the time it is taking to deal with and fix this problem.
Is there any way to speed up the process? I can provide any additional info.

I think Angel has a completely different problem. The limit on table entries
was an issue some time ago because of big bogon tables. A fix might be to
increase the maximum table entries, or not to use the bogon table.
[...]
We have set this value to 400000.

Please provide the filterdns logs for your case, with debug enabled (-d20).

As strange as it may seem, this is proving to be difficult to reproduce reliably.

If you want to send the logs privately, please send them to luiz at netgate.com

Thanks.

#35 Updated by Jim Pingle about 1 year ago

If anyone can come up with simple cases that reliably reproduce the problem, that would definitely help. That is, the smallest possible configuration that results in the problem happening. For example, an alias and firewall rule which exhibit the problem (or multiple aliases+rules) along with whatever other conditions are necessary, such as waiting specific amounts of time, or having invalid hostnames in the alias, etc. Please also include the log data mentioned above and the contents of /var/etc/filterdns.conf.

#36 Updated by Ph. T about 1 year ago

I will provide the data / config.xml. I could also provide a VirtualBox pfSense installation
which shows this problem. I hope I can provide this today.

#37 Updated by Ph. T about 1 year ago

I have tried to reproduce the issue. Unfortunately that was not possible; now I just get completely empty tables.
I waited for the timeout to pass.

I used a 2.4.4_p3 image as the base. I turned off the DNS Forwarder and Resolver and am using an external resolver.

Steps to reproduce:

Start the machine:

Delete entry using_one_alias
Delete entry using_one_alias2

Add entry
using_one_alias host, fqdn_alias2
using_one_alias2 host, fqdn_alias1

Without killing the filterdns process, those tables remain empty.

[2.4.4-RELEASE][]/var/log: cat /var/etc/filterdns.conf
pf dns.google fqdn_alias1
pf one.one.one.one fqdn_alias1
pf one.one.one.one fqdn_alias2
pf one.one.one.one using_one_alias
pf dns.google using_one_alias2
pf one.one.one.one using_one_alias2

If you do a reboot the tables are populated.

But if you delete the aliases after a reboot and recreate them the tables are not filled until you restart filterdns.
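
Since the suspected trigger is a hostname that is listed under more than one table, it may help to pull those entries out of filterdns.conf. The following is a minimal sketch, assuming the three-field "pf <host> <table>" line format shown above; the name shared_hosts is made up for illustration.

```shell
#!/bin/sh
# shared_hosts CONF: print each host that CONF assigns to more than one pf
# table, followed by those tables, as "host: table1 table2 ...".
# Hypothetical helper; assumes lines of the form "pf <host> <table>".
shared_hosts() {
    awk '$1 == "pf" && NF == 3 {
             count[$2]++
             tables[$2] = (tables[$2] == "" ? $3 : tables[$2] " " $3)
         }
         END {
             for (h in count)
                 if (count[h] > 1) print h ": " tables[h]
         }' "$1" | sort
}
```

Run as `shared_hosts /var/etc/filterdns.conf`, this would list both dns.google and one.one.one.one for the configuration above, since each of them appears under several tables.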

#38 Updated by Ph. T about 1 year ago

I see similar effects with the old config which i attached in January.

#39 Updated by John K about 1 year ago

Jim Pingle wrote:

If anyone can come up with simple cases that reliably reproduce the problem [...]

What's the status here? Has Netgate been able to reproduce this issue?

#40 Updated by Jim Pingle about 1 year ago

John K wrote:

What's the status here? Has Netgate been able to reproduce this issue?

Not that I have seen yet. We still need to find a combination of settings that reliably and repeatedly reproduces the issue.

#41 Updated by Vinicius DellAglio about 1 year ago

Jim Pingle wrote:

John K wrote:

What's the status here? Has Netgate been able to reproduce this issue?

Not that I have seen yet. We still need to find a combination of settings that reliably and repeatedly reproduces the issue.

I just installed a brand-new pfSense box, and as soon as I created an alias with an FQDN it didn't work. When I checked it under Tables, I only saw the entries listed above the FQDN entry; the FQDN itself and everything else were missing.

Once I deleted the FQDN entry and hit filter reload, the alias was correct.

#42 Updated by John K 12 months ago

Vinicius DellAglio wrote:

I just installed a brand-new pfSense box, and as soon as I created an alias with an FQDN it didn't work. When I checked it under Tables, I only saw the entries listed above the FQDN entry; the FQDN itself and everything else were missing.

Once I deleted the FQDN entry and hit filter reload, the alias was correct.

Is the FQDN a CNAME, A record, etc?

#43 Updated by Art Manion 12 months ago

Art Manion wrote:

Netgate SG-4860 running 2.4.4-RELEASE-p3 (amd64). At least twice I've experienced issues, I assume involving filterdns, where aliases were updated via the GUI but did not make it into the pf table (and I believe the updates didn't land in /var/etc/filterdns.conf either, but I can't confirm).

The alias/table is 58 entries, mostly DNS names, some IP addresses, and some other pfSense aliases. The last time this problem happened neither DNS name, IP address, nor pfSense aliases were added to the pf table. filterdns was running. killall -9 filterdns && /etc/rc.filter_configure fixed it.

Update:

In the GUI, there is an alias named yoke11 that contains the IP address 192.168.1.211. There is another alias, internet_allowed_LAN, that contains yoke11. Weeks after this configuration was made (including multiple save/apply actions from the GUI), 192.168.1.211 is not in the internet_allowed_LAN pf table, although it is in /var/etc/filterdns.conf. This vaguely points to a problem between filterdns.conf and the actual pf table.

I tried once to reproduce this behavior with clean/new/test aliases and I did not observe the problem.

pfctl -t internet_allowed_LAN -T test 192.168.1.211
0/1 addresses match.

grep '192.168.1.211' /var/etc/filterdns.conf
pf 192.168.1.211 internet_allowed_LAN

killall -9 filterdns
/etc/rc.filter_configure

pfctl -t internet_allowed_LAN -T test 192.168.1.211
1/1 addresses match.

I also observe that aliases do not become pf tables until the alias is used in a firewall rule. I believe this is expected behavior, just noting it.
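
The manual check above can be repeated for every IPv4 literal that filterdns.conf lists for a table. The following is a minimal sketch, assuming the "pf <entry> <table>" line format and the "N/M addresses match." output shown in this comment. The names conf_entries and missing_addresses are made up for illustration, and the second function needs root and pfctl.

```shell
#!/bin/sh
# conf_entries TABLE CONF: print the hosts/addresses that CONF lists for TABLE.
# Hypothetical helper; assumes lines of the form "pf <entry> <table>".
conf_entries() {
    awk -v t="$1" '$1 == "pf" && $3 == t { print $2 }' "$2"
}

# missing_addresses TABLE CONF: for each IPv4 literal listed for TABLE, run
# "pfctl -t TABLE -T test ADDR" and print the address unless the output
# reports "1/1 addresses match". Hostnames are skipped.
missing_addresses() {
    conf_entries "$1" "$2" | grep -E '^[0-9]+(\.[0-9]+){3}$' |
    while read -r addr; do
        pfctl -t "$1" -T test "$addr" 2>&1 |
            grep -q '^1/1 addresses match' || echo "$addr"
    done
}
```

For example, `missing_addresses internet_allowed_LAN /var/etc/filterdns.conf` would be expected to print 192.168.1.211 in the broken state shown above and nothing after filterdns has been restarted.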

#44 Updated by Gavin Stewart 12 months ago

Jim Pingle wrote:

John K wrote:

What's the status here? Has Netgate been able to reproduce this issue?

Not that I have seen yet. We still need to find a combination of settings that reliably and repeatedly reproduces the issue.

I now have a minimal and repeatable set of steps to reproduce this.

This has been verified in a VirtualBox VM with the following configuration:
  • New VM for FreeBSD 64-bit, accepting all defaults
  • First network adapter stays NAT (used as WAN in pfSense)
  • Add second network adapter as host-only (used as LAN in pfSense) to access web interface from host
  • pfSense-CE-2.4.4-RELEASE-p3-amd64.iso.gz, accepting all defaults
  • configure LAN IP address as needed at pfSense console to match host-only network settings
Procedure to reproduce FQDN name resolution failure in Alias tables:
  1. Firewall -> Aliases -> Import
  2. Alias Name "TEST1", import entire alias list below, Save
  3. Diagnostics -> Tables -> "TEST1"
    + Note approx 59 entries, specifically the 192.168.x.x addresses.
    + (It may take a couple of page reloads for all the addresses to resolve.)
  4. Firewall -> Aliases -> Import
  5. Alias Name "TEST2", import entire alias list below, Save
  6. Diagnostics -> Tables -> "TEST2"
    + Note that table is never fully populated (even if waiting longer than the 5 minute filterdns interval).
    + Note that 192.168.x.x addresses do not all appear.
  7. Kill the filterdns process at the console, and restart manually:
    /usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 300 -c /var/etc/filterdns.conf -d 1
    + Note that "TEST2" table is now properly populated.

List of aliases to import:

mail.bigpond.com
speedtest.telstra.com
mirror.internode.on.net
192.168.1.1
mirror.aarnet.edu.au
account.mojang.com
authserver.mojang.com
192.168.2.2
mc.hypixel.net
sessionserver.mojang.com
launchermeta.mojang.com
192.168.3.3
apps.yourtown.com.au
obook4.oxforddigital.com.au
www.oxforddigital.com.au
192.168.4.4
www.distance.vic.edu.au
lms.decvonline.vic.edu.au
connect.vic.edu.au
192.168.5.5
us.mineplex.com
stileapp.com
stileeducation.com
192.168.6.6

#45 Updated by John K 12 months ago

Jim Pingle wrote:

John K wrote:

What's the status here? Has Netgate been able to reproduce this issue?

Not that I have seen yet. We still need to find a combination of settings that reliably and repeatedly reproduces the issue.

Given Gavin Stewart's steps above, has Netgate finally been able to reproduce this issue now?

#46 Updated by Gavin Stewart 11 months ago

Gavin Stewart wrote:

I now have a minimal and repeatable set of steps to reproduce this.

Actually, I have revised the list of aliases to just one single host, still repeatable:

mail.bigpond.com

As per my previous instructions, with a clean install, and adding two host aliases for mail.bigpond.com (TEST1 and TEST2, applying changes between them), this is the entire clog output from /var/log/resolver.log (filterdns with -d 20):

Nov 27 06:18:13 pfSense filterdns:  Adding Action: pf table: TEST1 host: mail.bigpond.com
Nov 27 06:18:13 pfSense filterdns:      Adding host mail.bigpond.com
Nov 27 06:18:13 pfSense filterdns: Creating a new thread for action type: pf table: TEST1 hostname: mail.bigpond.com
Nov 27 06:18:13 pfSense filterdns: Creating a new thread for host mail.bigpond.com
Nov 27 06:18:20 pfSense filterdns:      found address 203.36.137.241 for host mail.bigpond.com
Nov 27 06:18:20 pfSense filterdns:          adding address 203.36.137.241 for host mail.bigpond.com
Nov 27 06:18:20 pfSense filterdns: Change detected on host: mail.bigpond.com
Nov 27 06:18:20 pfSense filterdns:  Awaking from the sleep for type: pf table: TEST1 hostname: mail.bigpond.com
Nov 27 06:18:20 pfSense filterdns:      Added pf address, table: TEST1 host: mail.bigpond.com address: 203.36.137.241
Nov 27 06:18:20 pfSense filterdns:  Updated pf table TEST1 host: mail.bigpond.com error: 0
Nov 27 06:18:24 pfSense filterdns: Received signal Hangup(1).
Nov 27 06:18:24 pfSense filterdns: merge_config: configuration reload
Nov 27 06:18:24 pfSense filterdns: Copied 1 actions to old
Nov 27 06:18:24 pfSense filterdns:  Adding Action: pf table: TEST1 host: mail.bigpond.com
Nov 27 06:18:24 pfSense filterdns:  Adding Action: pf table: TEST2 host: mail.bigpond.com
Nov 27 06:18:24 pfSense filterdns: Copied 2 actions to new
Nov 27 06:18:24 pfSense filterdns: Cleaning up action type: pf table: TEST1 hostname: mail.bigpond.com
Nov 27 06:18:24 pfSense filterdns: Loaded actions: 1 old and 1 new = 2 total
Nov 27 06:18:24 pfSense filterdns: Cleaning up previous actions
Nov 27 06:18:24 pfSense filterdns: Creating a new thread for action type: pf table: TEST2 hostname: mail.bigpond.com
Nov 27 06:18:24 pfSense filterdns:  Awaking from the sleep for hostname mail.bigpond.com (2)
Nov 27 06:18:24 pfSense filterdns:      found address 203.36.137.241 for host mail.bigpond.com
Nov 27 06:23:23 pfSense filterdns:  Awaking from the sleep for hostname mail.bigpond.com (2)
Nov 27 06:23:24 pfSense filterdns:      found address 203.36.137.241 for host mail.bigpond.com

[2.4.4-RELEASE][root@pfSense.localdomain]/root: cat /var/etc/filterdns.conf
pf mail.bigpond.com TEST1
pf mail.bigpond.com TEST2
[2.4.4-RELEASE][root@pfSense.localdomain]/root: pfctl -T show -t TEST1
   203.36.137.241
[2.4.4-RELEASE][root@pfSense.localdomain]/root: pfctl -T show -t TEST2
[2.4.4-RELEASE][root@pfSense.localdomain]/root: 
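
The symptom in the log above (an "Adding Action" line for a table, but never an "Added pf address" line for it) can also be searched for mechanically. The following is a minimal sketch, assuming the -d 20 log format shown here. The name unpopulated_tables is made up for illustration, and older entries left in the circular log from before a restart could mask the problem.

```shell
#!/bin/sh
# unpopulated_tables: read a filterdns debug log on stdin and print every
# "table host" pair that has an "Adding Action" line but no
# "Added pf address" line anywhere in the input.
# Hypothetical helper; assumes the log format quoted in this comment.
unpopulated_tables() {
    awk '
        # Return "<table> <host>" from a line containing
        # "table: <table> host: <host>", or "" if the fields are absent.
        function pair(   i) {
            for (i = 1; i + 3 <= NF; i++)
                if ($i == "table:" && $(i + 2) == "host:")
                    return $(i + 1) " " $(i + 3)
            return ""
        }
        /Adding Action: pf table: / { k = pair(); if (k != "") announced[k] = 1 }
        /Added pf address, table: / { k = pair(); if (k != "") added[k] = 1 }
        END { for (k in announced) if (!(k in added)) print k }
    ' | sort
}
```

Fed with `clog /var/log/resolver.log | unpopulated_tables`, the first log excerpt above would be expected to yield "TEST2 mail.bigpond.com".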

As can be seen, filterdns doesn't ever add the resolved address to the second table, unless filterdns is killed and restarted, resulting in this addition to the log:

Nov 27 06:26:10 pfSense filterdns: Received signal Terminated(15).
Nov 27 06:26:10 pfSense filterdns: Cleaning up action type: pf table: TEST2 hostname: mail.bigpond.com
Nov 27 06:26:10 pfSense filterdns: Cleaning up action type: pf table: TEST1 hostname: mail.bigpond.com
Nov 27 06:26:10 pfSense filterdns: Waiting 2 seconds for threads to finish
Nov 27 06:26:10 pfSense filterdns:  Awaking from the sleep for hostname mail.bigpond.com (0)
Nov 27 06:26:10 pfSense filterdns: Cleaning up hostname mail.bigpond.com
Nov 27 06:26:10 pfSense filterdns:          removing address 203.36.137.241 from host mail.bigpond.com
Nov 27 06:26:35 pfSense filterdns:  Adding Action: pf table: TEST1 host: mail.bigpond.com
Nov 27 06:26:35 pfSense filterdns:      Adding host mail.bigpond.com
Nov 27 06:26:35 pfSense filterdns:  Adding Action: pf table: TEST2 host: mail.bigpond.com
Nov 27 06:26:35 pfSense filterdns: Creating a new thread for action type: pf table: TEST1 hostname: mail.bigpond.com
Nov 27 06:26:35 pfSense filterdns: Creating a new thread for action type: pf table: TEST2 hostname: mail.bigpond.com
Nov 27 06:26:35 pfSense filterdns: Creating a new thread for host mail.bigpond.com
Nov 27 06:26:36 pfSense filterdns:      found address 203.36.137.241 for host mail.bigpond.com
Nov 27 06:26:36 pfSense filterdns:          adding address 203.36.137.241 for host mail.bigpond.com
Nov 27 06:26:36 pfSense filterdns: Change detected on host: mail.bigpond.com
Nov 27 06:26:36 pfSense filterdns:  Awaking from the sleep for type: pf table: TEST1 hostname: mail.bigpond.com
Nov 27 06:26:36 pfSense filterdns:      Added pf address, table: TEST1 host: mail.bigpond.com address: 203.36.137.241
Nov 27 06:26:36 pfSense filterdns:  Updated pf table TEST1 host: mail.bigpond.com error: 0
Nov 27 06:26:36 pfSense filterdns:  Awaking from the sleep for type: pf table: TEST2 hostname: mail.bigpond.com
Nov 27 06:26:36 pfSense filterdns:      Added pf address, table: TEST2 host: mail.bigpond.com address: 203.36.137.241
Nov 27 06:26:36 pfSense filterdns:  Updated pf table TEST2 host: mail.bigpond.com error: 0
[2.4.4-RELEASE][root@pfSense.localdomain]/root: cat /var/etc/filterdns.conf
pf mail.bigpond.com TEST1
pf mail.bigpond.com TEST2
[2.4.4-RELEASE][root@pfSense.localdomain]/root: pfctl -T show -t TEST1
   203.36.137.241
[2.4.4-RELEASE][root@pfSense.localdomain]/root: pfctl -T show -t TEST2
   203.36.137.241
[2.4.4-RELEASE][root@pfSense.localdomain]/root: 
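
A quick way to see which hostnames are exposed to this is to list every host that filterdns.conf assigns to more than one table. The following is only a rough sketch: the function name duplicated_hosts is made up for illustration, and it assumes every relevant line has the "pf <hostname> <table>" form shown in the listings above.

```shell
#!/bin/sh
# Hypothetical helper: print hostnames that are assigned to more than
# one pf table. Assumes lines of the form "pf <hostname> <table>", as
# in the filterdns.conf listings above. Reads the named files, or
# stdin when no file is given.
duplicated_hosts() {
    awk '$1 == "pf" && !seen[$2 SUBSEP $3]++ { count[$2]++ }
         END { for (h in count) if (count[h] > 1) print h }' "$@" | sort
}
```

On the firewall this would be called as "duplicated_hosts /var/etc/filterdns.conf"; any hostname it prints is one whose second table may stay empty until filterdns is restarted.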

#47 Updated by Gavin Stewart 11 months ago

I have a fix for this, and have created a pull request.

https://github.com/pfsense/FreeBSD-ports/pull/714

#48 Updated by Jim Pingle 11 months ago

  • Status changed from New to Pull Request Review

#49 Updated by Luiz Souza 11 months ago

  • Status changed from Pull Request Review to Feedback
  • % Done changed from 0 to 100

A fix based on Gavin's PR was committed, please let me know if the problem persists.

Thanks

#50 Updated by Robert Gijsen 11 months ago

Luiz Souza wrote:

A fix based on Gavin's PR was committed, please let me know if the problem persists.

Thanks

Maybe a stupid question, but as I obviously don't have any git or build tools available within pfSense, how can we test this? Does that mean we'd need to install the 2.5 nightly? Side question: is there any ETA on 2.5 RTM?

#51 Updated by Christian Ullrich 11 months ago

  • Robert Gijsen wrote:

Maybe a stupid question, but as I obviously don't have any git or build tools available within pfSense, how can we test this? Does that mean we'd need to install the 2.5 nightly? Side question: is there any ETA on 2.5 RTM?

In a (very large) nutshell:

$ git clone https://github.com/pfsense/freebsd-ports pfsense-ports
$ git clone -b RELENG_2_4_4 https://github.com/pfsense/freebsd-src pfsense-src
# poudriere ports -cp pfsense -m null -M $PWD/pfsense-ports
# poudriere jail -cj pfsense244 -m src=$PWD/pfsense-src -b
# echo ALLOW_UNSUPPORTED_SYSTEM=1 >> /usr/local/etc/poudriere.d/pfsense-make.conf
# poudriere bulk -j pfsense244 -p pfsense -z default net/filterdns
$ scp /usr/local/poudriere/data/packages/pfsense244-pfsense-default/All/filterdns-2.0_3.txz $WHEREVER
$ ssh $WHEREVER pkg install -f filterdns-2.0_3.txz

The ALLOW_UNSUPPORTED_SYSTEM line is necessary if the next line fails (on a FreeBSD 12 build system).

#52 Updated by Christian Ullrich 11 months ago

  • Luiz Souza wrote:

A fix based on Gavin's PR was committed, please let me know if the problem persists.

Confirmed. With rebuilt filterdns-2.0_3 on pfSense 2.4.4-p3, the tables are now populated correctly.

#53 Updated by Luiz Souza 11 months ago

  • Status changed from Feedback to Resolved

#54 Updated by Jim Pingle 11 months ago

  • Target version changed from 2.5.0 to 2.4.5

#55 Updated by Robert Gijsen 11 months ago

Luiz Souza wrote:

A fix based on Gavin's PR was committed, please let me know if the problem persists.

Thanks

I have compiled the package for our test environment (huge thanks to Christian Ullrich for the info, I couldn't have done that without his help) and so far the tables are populated as they should be.

#56 Updated by Jim Pingle 11 months ago

  • Status changed from Resolved to Feedback

Needs checked and/or tested again on 2.4.5 snapshots

#57 Updated by Viktor Gurov 10 months ago

Jim Pingle wrote:

Needs checked and/or tested again on 2.4.5 snapshots

tested on pfSense 2.4.5.a.20191220.1407

works as expected,
Resolved

#58 Updated by Jim Pingle 10 months ago

  • Status changed from Feedback to Resolved

#59 Updated by Eduard Rozenberg 9 months ago

Great to hear about the fix! Would have loved to see a 2.4.4 update with this fixed package, or even just a fixed filterdns package I could download and install on all my 2.4.4 setups, rather than waiting an unknown and growing number of months for 2.4.5 to be released.

As it is I'm spending a number of hours downloading and setting up a FreeBSD 11.2 virtual machine, installing git/poudriere, wastefully downloading multiple GB's of ports and source, and going through the filterdns package rebuild process. And hoping I'll have a fixed filterdns package at the end of this process.

This issue has been an ongoing pain for over a year and even trying to apply the fix is no carnival. But thanks again to the folks who actually dug into and resolved the issue.

#60 Updated by Eduard Rozenberg 9 months ago

Christian Ullrich wrote:

  • Robert Gijsen wrote:

Maybe a stupid question, but as I obviously don't have any git or build tools available within pfSense, how can we test this? Does that mean we'd need to install the 2.5 nightly? Side question: is there any ETA on 2.5 RTM?

In a (very large) nutshell:
...

Thanks Christian for these instructions in the earlier post above. I'm not the only pfSense user who's not a FreeBSD and pfSense package build expert, so without his help I don't think any normal person would attempt this.

The following steps were required on a fresh FreeBSD 11.2 system/VM before following Christian's instructions.

# pkg update -f
# pkg install git
# pkg install poudriere
# mkdir /usr/local/poudriere
# mkdir /usr/ports/distfiles

After doing these initial steps I ran through Christian's instructions and it worked fine to generate the package filterdns-2.0_3.txz which I installed on my various firewalls and things seem to be working fine. The poudriere jail command took a bunch of time (not sure how many hours on my underpowered VM), but it was done when I checked it the next day :).

Note that I did need to also run Christian's line containing ALLOW_UNSUPPORTED_SYSTEM even on my FreeBSD 11.2 VM.

I went to Status -> Filter Reload and clicked Reload Filter on each firewall after installing the new filterdns package.

I'm attaching the filterdns package I generated but of course you should never use someone else's binary package, and especially not on any security sensitive equipment :).

#61 Updated by Eduard Rozenberg 9 months ago

It appears a reboot was required on each firewall after updating the filterdns package to my custom built one (2.0_3). Without the reboot, the alias table was still not fully and correctly populated (i.e. there were still ip addresses missing from the list when running $ pfctl -T show -t my_alias_name). Hopefully the problem is well and truly solved now dammit :).

#62 Updated by Eduard Rozenberg 9 months ago

Still not working properly, at least a couple of IP's are still not populating in the table. Giving up for now, will wait for 2.4.5 and see if that somehow fixes things. Maybe my filterdns build didn't pick up the latest code fix or something, no idea.

#63 Updated by Fabián Burbano 9 months ago

Eduard Rozenberg wrote:

Still not working properly, at least a couple of IP's are still not populating in the table. Giving up for now, will wait for 2.4.5 and see if that somehow fixes things. Maybe my filterdns build didn't pick up the latest code fix or something, no idea.

Version 2.4.5 already has several RCs. I think it is safer to upgrade to the RC than to do such a process. At least in this case, due to the urgency. The final version should come out in a few days or maybe hours. I have three days using the RC and it works very well.

#64 Updated by Eduard Rozenberg 9 months ago

Fabián Burbano wrote:

Version 2.4.5 already has several RCs. I think it is safer to upgrade to the RC than to do such a process. At least in this case, due to the urgency. The final version should come out in a few days or maybe hours. I have three days using the RC and it works very well.

Thanks! My problem is definitely NOT resolved by my filterdns custom build. Having exactly the same issue. Will upgrade to 2.4.5 when it releases, hopefully soon. Got burned by at least 2-3 pfSense version upgrades in the past, so would like to minimize risk by waiting for 2.4.5 official release.

#65 Updated by Robert Gijsen 8 months ago

I also have to retract my conclusion that it was OK with the rebuilt filterdns. While it works better than before, tables are still not filled correctly when an FQDN is used multiple times in the aliases.

#66 Updated by Chris Poillion 8 months ago

This bug still persists in Build 2.4.5.r.20200307.0900.

#67 Updated by Gabriel Ribeiro 7 months ago

This bug still persists in Build 2.4.5 date:2020.04.09

#68 Updated by Luiz Souza 7 months ago

How was it tested? What was the result? How did it fail?

#69 Updated by Gabriel Ribeiro 7 months ago

This bug still persists in Build 2.4.5 date:2020.04.10

I can confirm my issue is the same as described by the other posters on this bug.

Logs show that filterdns claims to be doing the right thing - all expected alias entries (FQDN's, IP's, networks) show up in:
$ clog /var/log/resolver.log | grep "Adding Action"

But the alias table is incomplete, some (FQDN) addresses are missing:
$ pfctl -T show -t my_alias_name

There is no DNS resolution issue with any of the FQDN's - if I ping the FQDN's from the firewall their IP addresses are resolved.

Restarting the filter or re-saving the alias does not help.
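
For anyone who wants to repeat this comparison without eyeballing the lists, something like the following could work. It is a sketch only: missing_addresses is a made-up name, the expected list has to be collected by hand (one address per line), and the table file is assumed to be the saved output of the pfctl command above.

```shell
#!/bin/sh
# Hypothetical helper: missing_addresses EXPECTED_FILE TABLE_FILE
# Prints every address listed in EXPECTED_FILE (one per line) that does
# not appear in TABLE_FILE. TABLE_FILE is assumed to hold the output of
# "pfctl -T show -t <alias>", which indents each address.
missing_addresses() {
    awk -v table="$2" '
        BEGIN {
            # Load the table contents first, trimming the indentation.
            while ((getline line < table) > 0) {
                gsub(/^[ \t]+|[ \t]+$/, "", line)
                if (line != "") have[line] = 1
            }
            close(table)
        }
        {
            gsub(/^[ \t]+|[ \t]+$/, "", $0)
            if ($0 != "" && !($0 in have)) print
        }' "$1"
}
```

Usage would be along the lines of "pfctl -T show -t my_alias_name > /tmp/table.txt" followed by "missing_addresses /tmp/expected.txt /tmp/table.txt", where both /tmp paths are placeholders. An empty table file is handled, so a completely unpopulated alias reports every expected address as missing.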

#70 Updated by Eduard Rozenberg 6 months ago

Yep. Same issue. Today I got locked out of all our sites again. My workaround is to use a personal VPN to force my own IP to change, which normally lets me back in once the firewall re-evaluates the DNS name -> IP mapping in the hostnames alias. As others have speculated, the issue may be linked to cases where multiple hostnames inside a pfSense alias resolve to the same IP. But I don't really have a firm idea or clue.

#71 Updated by Donn Lasher 5 months ago

Same problem here - 2.4.5-RELEASE (amd64)

pfctl -T show -t my_remote
   71.237.XX.XX
   97.120.XX.XX
   184.100.XX.XX
cat /var/etc/filterdns.conf 
pf connection1.dyndns.org my_remote
pf connection2.dyndns.org my_remote
pf connection3.dyndns.org my_remote

My favorite part is the firewall log, showing the blocks.

    May 30 13:42:37    COMCAST      71.237.XX.XX:30614      96.65.XX.XX:21443    TCP:S

Wish I knew how to fix it.. the "Aliases Hostnames Resolve Interval" didn't help :(

#72 Updated by Gavin Stewart 5 months ago

Donn Lasher wrote:

Same problem here - 2.4.5-RELEASE (amd64)

This is confirmed.

I am able to replicate the failure in a test VM, using my instructions in #9296#note-44

I will point out that I cannot replicate the failure with just one host alias as described in #9296#note-46

#73 Updated by Gavin Stewart 5 months ago

Gavin Stewart wrote:

This is confirmed.

I am able to replicate the failure in a test VM, using my instructions in #9296#note-44

Just revising this statement, because the error isn't immediately reproducible with the instructions linked above. There is a distinct change in the symptoms, and it now appears to only relate to IP addresses in aliases, and not FQDNs.

I now have the following steps to reproducibly demonstrate failure:

  1. Firewall -> Aliases -> Import
  2. Alias Name "TEST1", import entire alias list below, Save, Apply Changes
  3. Diagnostics -> Tables -> "TEST1"
    + Note 2 entries, specifically the 192.168.1.1 address.
  4. Firewall -> Aliases -> Import
  5. Alias Name "TEST2", import entire alias list below, Save, Apply Changes
  6. Diagnostics -> Tables -> "TEST2"
    + Note table is the same as TEST1
  7. Firewall -> Aliases -> Import, Alias Name "TEST3", import entire alias list below, Save, Apply Changes
  8. Diagnostics -> Tables -> "TEST3"
    + Note that the 192.168.1.1 address does not appear.
  9. Kill the filterdns process at the console, and restart manually:
    /usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 300 -c /var/etc/filterdns.conf -d 1
    + Note that "TEST3" table is now properly populated.

List of aliases to import:

mail.bigpond.com
192.168.1.1

Notes
  • Test may be replicated over again by deleting all aliases and starting again.
  • The failure appears to always require a third alias before it occurs.
  • This case also fails in the patched filterdns I have been using with 2.4.4-RELEASE-p3, so it would appear there is still an undiagnosed bug in filterdns, this ticket should probably be reopened.
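
The kill-and-restart in step 9 can be wrapped in a small function for repeated test runs. This is a sketch under assumptions, not a supported procedure: restart_filterdns, RUN and PIDFILE are names invented here, the pid file is assumed to hold the daemon's PID (it is the path passed via -p in step 9), and the 3 second pause is a guess based on the "Waiting 2 seconds for threads to finish" log line earlier in this ticket.

```shell
#!/bin/sh
# Hypothetical wrapper around step 9: stop the running filterdns and
# start it again with the command line quoted in that step.
# Set RUN=echo to print the commands instead of executing them.
restart_filterdns() {
    pidfile=${PIDFILE:-/var/run/filterdns.pid}
    if [ -s "$pidfile" ]; then
        # Assumption: the pid file contains the PID of the daemon.
        ${RUN:-} kill "$(cat "$pidfile")"
        # Assumption: give the old process time to finish its threads.
        ${RUN:-} sleep 3
    fi
    ${RUN:-} /usr/local/sbin/filterdns -p "$pidfile" -i 300 \
        -c /var/etc/filterdns.conf -d 1
}
```

Running "RUN=echo restart_filterdns" first shows what would be executed without touching the running daemon.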

#74 Updated by Rob Shiras 4 months ago

I just ran into this today. I was using IP addresses for the bookkeeper. She finally got a hostname with DynDNS.
So I went to Firewall>Aliases and changed her IP address into the hostname. I verified that the hostname resolves to her IP address. She connected once (with RDP) with no problems.
Subsequent tries fail. She gets the RDC dialog that says Remote Desktop can't connect to the remote computer for one of these reasons....
I add the IP back into the alias and she is again able to connect via RDP.
Interesting development here...My hostname works without issue. My hostname is a CNAME I added to my GoDaddy account, pointing to my public static IP.
Her hostname does not work. Her public IP is dynamic, and she uses DynDNS. What's the difference? They both resolve to a public IP correctly.

#75 Updated by Luiz Souza 4 months ago

Thanks for the detailed instructions Gavin.

I pushed a fix which should do the right thing in this case.

Please test the new version (filterdns-2.0_4) and let me know if the problem persists.

  • The new package is available with the 2.5 snapshots.

#76 Updated by Gavin Stewart 3 months ago

Luiz Souza wrote:

Please test the new version (filterdns-2.0_4) and let me know if the problem persists.

This problem persists unfortunately.

I have recompiled filterdns from a pull of the latest devel branch (55691996), and the problem remains while following the instructions in #9296#note-73

#77 Updated by Brendon Baumgartner 3 months ago

Should the status on this be changed? It says resolved.

#78 Updated by Eduard Rozenberg 3 days ago

Brendon Baumgartner wrote:

Should the status on this be changed? It says resolved.

Definitely not resolved. It's an unpredictably occurring issue that continues to terrorize admins.

There's related strange behavior: with an IP address that was previously part of an alias, I still cannot access the network when I use that IP directly as the source in an allow rule.

The only way I can deal with the issue is to use a VPN to change my own IP, and use the VPN IP in the dynamic FQDN specified in the alias.

#79 Updated by Renato Botelho 3 days ago

  • Status changed from Resolved to New
  • Target version changed from 2.4.5 to 2.5.0

Luiz, can you please take a look?
