DNS forwarder domain override queries time out if destination server is on a different subnet
I'm running 2.0-RC1 (i386) built on Mon Mar 7 12:37:11 EST 2011. This is a complicated one to explain but I'll do my best. Feel free to request more information.
I have a router with LAN IP address 172.16.4.1. I have another router with LAN IP address 172.16.1.1. I have an OpenVPN tunnel between them, with tunnel network 10.9.4.0/24.
In the DNS forwarder on 172.16.4.1, I have a domain override that forwards requests for domain "internal.foofoofoofoo.com" to the IP address 172.16.1.144, which is reachable from 172.16.4.1 over the OpenVPN tunnel.
Unfortunately, if I try a lookup for myhost1.internal.foofoofoofoo.com from 172.16.4.1 (or any other host on 172.16.4.0/24), the request times out (e.g. dig @localhost myhost1.internal.foofoofoofoo.com +short). If I run "tcpdump -n -i ovpnc1 host 172.16.1.144" on 172.16.4.1, I can see the request packet going out to 172.16.1.144 but there is no reply packet:
[2.0-RC1][email@example.com]/root(6): tcpdump -n -i ovpnc1 host 172.16.1.144
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ovpnc1, link-type NULL (BSD loopback), capture size 96 bytes
14:28:37.471416 IP 10.9.4.2.36192 > 172.16.1.144.53: 26073+ A? myhost1.internal.foofoofoofoo.com. (51)
14:28:42.500633 IP 10.9.4.2.36192 > 172.16.1.144.53: 26073+ A? myhost1.internal.foofoofoofoo.com. (51)
14:28:47.513229 IP 10.9.4.2.36192 > 172.16.1.144.53: 26073+ A? myhost1.internal.foofoofoofoo.com. (51)
I think the problem is the default source address of the request, 10.9.4.2. If 172.16.4.1 is used as the source, I get a reply. The prescribed way to set the source address for domain override queries in dnsmasq is described in the dnsmasq man page:
-S, --local, --server=[/[<domain>]/[domain/]][<ipaddr>[#<port>][@<source-ip>|<interface>[#<port>]]
...
"The optional string after the @ character tells dnsmasq how to set the source of the queries to this nameserver."
If we were able to configure advanced settings for dnsmasq in the pfSense GUI, I think a server setting for the domain override with the source address appended (@172.16.4.1) would do the trick.
However, AFAIK, the GUI only generates the server setting without a source address.
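Going by the man page syntax quoted above, the two forms can be sketched as follows. This is only an illustration: dnsmasq_server_line is a hypothetical helper (not part of pfSense or dnsmasq), and the exact line the GUI writes is an assumption. The lines are shown in config-file form; on the command line the same option is given as --server=...

```python
from typing import Optional


def dnsmasq_server_line(domain: str, server_ip: str,
                        source_ip: Optional[str] = None) -> str:
    """Build a dnsmasq 'server=' line for a domain override.

    Hypothetical helper for illustration only. If source_ip is given,
    it is appended after '@', as described in the dnsmasq man page.
    """
    line = f"server=/{domain}/{server_ip}"
    if source_ip:
        line += f"@{source_ip}"
    return line


# Assumed form of what the GUI generates today (no source address).
gui_line = dnsmasq_server_line("internal.foofoofoofoo.com", "172.16.1.144")

# Form with the LAN address of the local router as the query source.
wanted_line = dnsmasq_server_line(
    "internal.foofoofoofoo.com", "172.16.1.144", source_ip="172.16.4.1"
)

print(gui_line)
print(wanted_line)
```

The only difference between the two lines is the @172.16.4.1 suffix, which is the part the GUI currently gives no way to enter.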
Ideally, the GUI should be changed to allow an optional source address to be entered for each domain override. Or, better yet, the GUI should have a free text box for adding advanced dnsmasq settings. Then we could also add useful settings like rebind-domain-ok, srv-host, dhcp-option, etc. You can do this in the Tomato firmware for WRT54G routers. You should be able to do it for pfSense also. If you are worried about users getting into trouble with advanced settings, just put a big fat warning in the GUI.
My very ugly workaround for now is to create an outbound NAT rule that sets the source IP to 172.16.4.1 for DNS queries destined for 172.16.1.144 over the OpenVPN interface. It works (i.e. I get back a reply). Here is what the tcpdump output looks like in this case:
[2.0-RC1][firstname.lastname@example.org]/root(22): tcpdump -n -i ovpnc1 host 172.16.1.144
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ovpnc1, link-type NULL (BSD loopback), capture size 96 bytes
14:49:05.036710 IP 172.16.4.1.13737 > 172.16.1.144.53: 23067+ A? myhost1.internal.foofoofoofoo.com. (51)
14:49:05.041864 IP 172.16.1.144.53 > 172.16.4.1.13737: 23067* 1/0/0 A[|domain]
If the above sounds really complicated, here is an easier-to-understand situation. When you try to ping 172.16.1.144 from 172.16.4.1, you don't get a reply unless you add -S 172.16.4.1 to your ping command, like this:
ping -S 172.16.4.1 172.16.1.144
Anyway, just to reiterate, the dnsmasq GUI should allow for advanced settings entry and/or the addition of a source IP address to each domain override.
#3 Updated by Joe Kelly over 9 years ago
Jim P wrote:
The easiest way around this is to add a route on the remote side so that 10.9.4.x goes across the tunnel. No need for any fancy dnsmasq settings or complex logic to determine the proper source address.
Jimp, thanks for the suggestion. I'll give that a try and report back here. Silly me.
#4 Updated by Joe Kelly over 9 years ago
Chris Buechler wrote:
not a bug, you're either blocking that traffic or have a routing issue of some sort. could be moved to a feature request if you'd like to implement what you described and attach a patch.
Chris, I'll try Jimp's suggestion to add a route (don't know why I didn't think to try that). Still would like to see an "advanced settings" text box on that screen. I like your suggestion about sending a patch. I just need to set up a development environment for pfSense. Any links to a how-to for this?
#5 Updated by Joe Kelly over 9 years ago
Jim P wrote:
The easiest way around this is to add a route on the remote side so that 10.9.4.x goes across the tunnel.
Chris Buechler wrote:
not a bug, you're either blocking that traffic or have a routing issue of some sort.
Chris wins the Christmas turkey (sorry Jim). Chris, you sir are a steely-eyed missile man! :-)
It was not a routing issue at all. OpenVPN automatically created all the necessary routes on both routers to route traffic to and from the tunnel networks. The problem was that the firewall on the DNS server at 172.16.1.144 (which, coincidentally, was also running pfSense just like the two routers) was blocking all LAN traffic that did not come from its LAN (172.16.1.0/24). All I had to do was add another rule on the LAN tab of the DNS server to allow traffic from the OpenVPN tunnel network, 10.9.4.0/24.
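A minimal sketch of why the queries sourced from 10.9.4.2 were dropped before the extra rule and accepted after it. This only models "is the source address inside one of the allowed networks"; it is not how pfSense evaluates rules internally, and source_allowed is a hypothetical name used for illustration.

```python
import ipaddress


def source_allowed(source_ip, allowed_networks):
    """Return True if source_ip falls inside any of the allowed networks.

    Simplified model of a pass rule that matches on source network only.
    """
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(net) for net in allowed_networks)


# Source address of the DNS queries seen in the first tcpdump capture.
query_source = "10.9.4.2"

# Before: only the DNS server's own LAN was allowed.
before = source_allowed(query_source, ["172.16.1.0/24"])

# After: the OpenVPN tunnel network is allowed as well.
after = source_allowed(query_source, ["172.16.1.0/24", "10.9.4.0/24"])

print(before, after)
```

The tunnel address 10.9.4.2 is outside 172.16.1.0/24, so the first check fails; once 10.9.4.0/24 is added, the same source matches.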
I'll still look at creating that patch for you Chris, but no rush now.
Thanks again fellas!