Project

General

Profile

Bug #8758

filterdns stops working on a regular basis.

Added by Pi Ba 7 months ago. Updated 25 days ago.

Status:
Closed
Priority:
High
Assignee:
Category:
FilterDNS
Target version:
Start date:
08/06/2018
Due date:
% Done:

100%

Estimated time:
Affected Version:
2.4.3_1
Affected Architecture:
All

Description

filterdns stops working on a regular basis.

Ive found that filterdns stops working after a short while (a few days..)

Yes ive got a few dns names that dont properly resolve in there.. but that shouldn't matter for the names that do resolve right?
The process does keep 'running', but doesn't log anymore to the dnsresolver log..
Is there any way to query it directly for what it might be doing / waiting for?

What would help find the cause of such a issue?

resolver.txt (500 KB) resolver.txt khaled osama, 10/24/2018 01:31 AM

History

#1 Updated by Jim Pingle 6 months ago

  • Priority changed from Normal to High

I had this happen on my edge firewall which is running a snapshot from earlier this week. The pf tables with entries managed by filterdns were empty when it failed, even entries that did not require DNS resolution. There were no log messages from filterdns until it was killed and restarted, at which point it was functioning as expected.

The filterdns process was idle and not doing anything. Running truss -fp on its PID showed no activity.

When it was stuck:

root    91126   0.0  0.2 101592  8936  -  Is   14:10      0:09.41 /usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 30 -c /var/etc/filterdns.conf -d 1

After killing it manually and restarting:
root     6915   0.0  0.2 33880  8060  -  Ss   09:47      0:00.06 /usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 30 -c /var/etc/filterdns.conf -d 1

Checking it randomly after, the main process was always in the "sleep" state (Ss), I didn't see the main process in the idle state (Is) again, but if you show all threads (ps uxHaww | grep filterdns) the individual threads are in the idle state.

#2 Updated by Steve Beaver 6 months ago

  • Assignee set to Renato Botelho

#3 Updated by Steve Beaver 6 months ago

  • Status changed from New to This Sprint

#4 Updated by Luke Hamburg 6 months ago

Would this affect IP alias lists getting "stuck" and not updating? I ran into this last week - editing an Alias, added a few IPs and everything saved and looked good in the GUI, but when checking the actual pf tables they still had the old IPs. Only a reboot fixed it.

Was just wondering if this was the same issue...

#5 Updated by Jim Pingle 6 months ago

That is a possible side effect. You can kill the filterdns daemon and then trigger a filter reload from Status > Filter Reload to restart it. Much faster than a reboot.

#6 Updated by Luke Hamburg 6 months ago

Hmm. In that case, how about we add some tests to the alias edit php functions to query pfctl directly after a save and if it finds that the tables are not updating, then kill and restart the daemon automatically?

#7 Updated by Jim Pingle 6 months ago

That's a rather ugly hack/kludge and it would be better to find out why it's failing and fix it directly.

#8 Updated by Luke Hamburg 6 months ago

Agree it's less than ideal. But until a proper fix can be found a kludge might be safer than nothing, since aliases not updating could leave large security holes open.

I'm not sure where the source code for filterdns.c even is. It looks like it used to be in pfsense-tools but that's been archived and made private. Where to even look?

#9 Updated by Jim Pingle 6 months ago

It is at https://github.com/pfsense/FreeBSD-ports/blob/devel/net/filterdns/files/filterdns.c but Renato is already looking at this currently so it will be fixed properly in this release.

#10 Updated by Luke Hamburg 6 months ago

Sounds good!

#11 Updated by Renato Botelho 6 months ago

Jim Pingle wrote:

I had this happen on my edge firewall which is running a snapshot from earlier this week. The pf tables with entries managed by filterdns were empty when it failed, even entries that did not require DNS resolution. There were no log messages from filterdns until it was killed and restarted, at which point it was functioning as expected.

The filterdns process was idle and not doing anything. Running truss -fp on its PID showed no activity.

When it was stuck:
[...]
After killing it manually and restarting:
[...]

Checking it randomly after, the main process was always in the "sleep" state (Ss), I didn't see the main process in the idle state (Is) again, but if you show all threads (ps uxHaww | grep filterdns) the individual threads are in the idle state.

The I state indicates it's sleeping for over 20 seconds and per-se is not the problem because filterdns threads sleep for 1 minute so it will stay as S in the first 20 seconds and then move to I

#12 Updated by Renato Botelho 6 months ago

  • Status changed from This Sprint to In Progress

#13 Updated by Steve Beaver 6 months ago

  • Assignee deleted (Renato Botelho)
  • Target version changed from 2.4.4 to 2.4.5

Currently unable to reproduce

#14 Updated by Luke Hamburg 6 months ago

I have definitely hit this one - yes it is hard to reproduce. But, if I hit it again, is it worth sending any sort of truss output or anything else to help track it down?

#15 Updated by Renato Botelho 6 months ago

Luke Hamburg wrote:

I have definitely hit this one - yes it is hard to reproduce. But, if I hit it again, is it worth sending any sort of truss output or anything else to help track it down?

Can you patch your local system the places that start filterdns and replace '-d 1' by '-d 10' to increase debug level? then if it fails it would be good to check the logs from /var/log/resolver.log

#16 Updated by Luke Hamburg 5 months ago

Ok, I've done that. For anyone else who wants an easy way, I made a patch that you can add via System Patches:
https://github.com/luckman212/pfsense/commit/72834bf677bdbd1cf78f6772b79abe4b3eaa8235

Don't forget to kill & restart filterdns after applying, or simply reboot.
To follow the resolver logs in realtime, you can use:

clog -f /var/log/resolver.log

Also, I don't see it linked here so I want to add, this seems very related to https://redmine.pfsense.org/issues/8001 which was reported on the forums here: https://forum.netgate.com/topic/124467/filterdns-stops-working

Might literally be the same issue.

#17 Updated by khaled osama 4 months ago

Dear All

i am affected with same problem
it happens every day approx.
i must kill filterdns service and restart to make worked again
i increased the debug level for filterdns
and i attached the resolver log file
best regards,

i linked the same post from netgate forum
https://forum.netgate.com/topic/124467/filterdns-stops-working/36

#18 Updated by Luiz Souza 3 months ago

  • Assignee set to Luiz Souza

#19 Updated by Luiz Souza 3 months ago

  • Status changed from In Progress to Feedback

This issue was one of fixes included in the new filterdns (version 2.0).

If you still have issues, please let us know.

Thanks!

#20 Updated by Luiz Souza 3 months ago

  • % Done changed from 0 to 100

#21 Updated by Renato Botelho 3 months ago

  • Target version changed from 2.4.5 to 2.4.4_1

#22 Updated by Luiz Souza 3 months ago

  • Status changed from Feedback to Closed

#23 Updated by Rudolf Mayerhofer 25 days ago

Just ran into this on 2.4.4-p2 with a not updating alias table:

[2.4.4-RELEASE][root@fw2]/root: ps aux | grep filterdns
root 96457 0.0 0.0 49776 9336 - Is 10Jan19 0:10.67 /usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 300 -c /var/etc/filterdns.conf -d 1
root 2975 0.0 0.0 6564 2468 0 S+ 12:54 0:00.00 grep filterdns

After killall -9 filterdns and applying changes to an alias filterdns started working properly again:

[2.4.4-RELEASE][root@fw2]/root: ps aux | grep filterdns
root 36300 67.0 0.0 47984 8016 - Ss 12:54 0:00.01 /usr/local/sbin/filterdns -p /var/run/filterdns.pid -i 300 -c /var/etc/filterdns.conf -d 1
root 39464 0.0 0.0 6564 2468 0 S+ 12:54 0:00.00 grep filterdns

Unfortunately, at this point in time I do not have any more debug output available. I will continue to monitor this and add more information as it will be available to me.

Also available in: Atom PDF