Bug #2749
Closed
gateway groups - when tier 1 gateway fails, routes traffic via gateway set to "never"
Description
pfSense 2.0.2 running as a VM inside ESXi 5.0
Summary: a gateway group does not appear to respect the "never" setting for a gateway. When the higher "tier 1" gateway fails, leaving no available gateways, traffic is routed through the "never" gateway instead of not being routed at all through this gateway group (the expected and desired result).
Setup:
- I have an OpenVPN client in pfSense connecting to an external OpenVPN server with a gateway "VPS" mapped to this link.
- I have a gateway group "VPS_Servers" with this OpenVPN client gateway "VPS" set to "tier 1" and my default gateway "WAN" (over which the OpenVPN client connects to the OpenVPN server) set as "never" since I want the traffic destined to this gateway group to fail entirely when the OpenVPN client is down versus trying to go out through my default gateway. There are currently no other gateways in this group (and no other gateways in this instance of pfSense).
- I have a firewall rule under "LAN" that routes traffic from a single IP address on my LAN through this gateway group "VPS_Servers" and it is at the top of my "LAN" firewall rules under the anti-lockout rules.
- My LAN address space is 192.168.1.0/24 and my WAN has a DHCP-assigned public IP address.
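The expectation behind this setup can be written down as a small model. This is only a sketch of the group semantics I am assuming (the up members of the lowest-numbered tier are used, and a "never" gateway is not a member at all); it is not pfSense's code, and `active_members`, `NEVER` and the dict layout are hypothetical names made up for the illustration.

```python
from typing import Dict, List, Optional

# Hypothetical marker for a gateway whose tier is set to "never".
NEVER = None


def active_members(tiers: Dict[str, Optional[int]], up: Dict[str, bool]) -> List[str]:
    """Return the gateways a group would use under the assumed semantics:
    the up members of the lowest-numbered tier that has any up member."""
    candidates: Dict[int, List[str]] = {}
    for name, tier in tiers.items():
        # "never" gateways and down gateways are not eligible.
        if tier is NEVER or not up.get(name, False):
            continue
        candidates.setdefault(tier, []).append(name)
    if not candidates:
        return []  # nothing left: the group has no usable gateway
    return sorted(candidates[min(candidates)])


# The group from the report: VPS at tier 1, WAN set to "never".
group = {"VPS": 1, "WAN": NEVER}
print(active_members(group, {"VPS": True, "WAN": True}))
print(active_members(group, {"VPS": False, "WAN": True}))
```

Under this model the group resolves to no gateway at all once VPS is down. What the firewall does with a rule whose gateway group is empty is a separate question, and that is where the behaviour in this report comes from.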
When the OpenVPN client is up (i.e. the "VPS" gateway and the "VPS_Servers" gateway group are both up), everything works as intended and my traffic is routed via the external OpenVPN server. When the "tier 1" gateway dies (OpenVPN client shut down or failed), the traffic is routed through the default gateway set to "never". I do not recall this behaviour occurring in pfSense 2.0.1: in the past, when my external OpenVPN server was down, I would often find that I had no connectivity and would have to go and reboot the remote server.
Investigations:
I have also tried creating an additional gateway that cannot route traffic (associated with an invalid interface) and adding it as a "tier 2" gateway to my "VPS_Servers" gateway group, with monitoring for that gateway set to "always up". My expectation was that when the tier 1 gateway failed (OpenVPN client disconnected), the group would fall back to the dead tier 2 gateway listed as "up" and no traffic would leave the network. Instead, it still switches to the gateway set to "never" and traffic flows.
I also tried creating a reject rule under "LAN", immediately below my rule that routes traffic to the gateway group, matching all traffic from that IP address (the exact same rule as the one above, but set to reject). Traffic still flows through the "WAN" gateway set to "never", which confirms that it is the gateway group rule above it, and not some later rule, that sends the traffic out over the "WAN" gateway.
Thanks for all of your hard work on this amazing software!
Colin
Updated by Jim Pingle over 12 years ago
There may be a bug here but the way you're using gateway groups is unnecessary.
Just choose the actual gateway in the firewall rule directly, no need to make a group for that. If only the one gateway is selected, not a group, it can never use any other gateway.
For further reference about the underlying issue though, do you have "Allow default gateway switching" enabled under System > Advanced on the Misc tab?
Updated by Colin Sinclair over 12 years ago
Hi Jim, thanks for your response, I'll go and check that setting now...
The reason I'm using a gateway group is that I plan on adding a number of additional gateways to this group (each one an additional OpenVPN client/server pair linked to a new gateway in pfSense) and I want to have load balancing and/or failover while always ensuring that the default gateway is never used (I don't want my Hulu Plus account to get flagged as being connected to from a non-US IP address! :) ).
Thanks,
Colin
Updated by Jim Pingle over 12 years ago
It would also help to get a copy of /tmp/rules.debug from when it's running normally, and again when the VPN is down and it's not routing traffic as you expect.
Updated by Colin Sinclair over 12 years ago
"Allow default gateway switching" was/is UNchecked.
Updated by Colin Sinclair over 12 years ago
I can't do that now (not at home) but will post /tmp/rules.debug tonight, thanks!
Updated by Colin Sinclair over 12 years ago
Here are the relevant parts that changed in each one, thanks!
##############################################
##### when all gateways are up
##############################################

# Gateways
GWWAN = " route-to ( vxn0 <REMOVED MY PUBLIC IP GATEWAY HERE> ) "
GWALIENVPS = " route-to ( ovpnc4 10.9.0.5 ) "
GWCHICAGOVPS = " route-to ( ovpnc1 10.10.0.5 ) "
GWUS_Gateways = " route-to { ( ovpnc4 10.9.0.5 ) } "

anchor "userrules/*"
pass in quick on $LAN proto tcp from 192.168.1.0/24 to <negate_networks> flags S/SA keep state label "NEGATE_ROUTE: Negate policy routing for destination"
pass in quick on $LAN $GWUS_Gateways proto tcp from 192.168.1.0/24 to any port 25 flags S/SA keep state label "USER_RULE: SMTP TCP 25 to any via US Gateways"
pass in quick on $LAN from $Colins_Dell_XPS_8300 to <negate_networks> keep state label "NEGATE_ROUTE: Negate policy routing for destination"
pass in quick on $LAN $GWUS_Gateways from $Colins_Dell_XPS_8300 to any keep state label "USER_RULE: Colins_Dell_XPS_8300 to any via US_Gateways"
pass in quick on $LAN from $Apple_TV to <negate_networks> keep state label "NEGATE_ROUTE: Negate policy routing for destination"
pass in quick on $LAN $GWUS_Gateways from $Apple_TV to any keep state label "USER_RULE: Apple_TV to any via US_Gateways"
pass in quick on $LAN from 192.168.1.0/24 to <negate_networks> keep state label "NEGATE_ROUTE: Negate policy routing for destination"
pass in quick on $LAN $GWWAN from 192.168.1.0/24 to any keep state label "USER_RULE: LAN to any via WAN"
block in quick on $IPsec proto tcp from any to 192.168.1.0/24 label "USER_RULE: Block IPsec client (TCP) to LAN (UDP/VoIP passes)"
pass in quick on $IPsec from any to <negate_networks> keep state label "NEGATE_ROUTE: Negate policy routing for destination"
pass in quick on $IPsec $GWUS_Gateways from any to any keep state label "USER_RULE: IPsec client to any via US_Gateways "
# WANLANCHICAGOVPSALIENVPSIPsecOpenVPN l2tp array key does not exist for label "USER_RULE"
pass in quick on $OpenVPN from any to 192.168.1.0/24 keep state label "USER_RULE: OpenVPN client to LAN"
pass in quick on $OpenVPN from any to <negate_networks> keep state label "NEGATE_ROUTE: Negate policy routing for destination"
pass in quick on $OpenVPN $GWUS_Gateways from any to any keep state label "USER_RULE: OpenVPN client to any via US_Gateways"
# WANLANCHICAGOVPSALIENVPSIPsecOpenVPN pptp array key does not exist for label "USER_RULE"

##############################################
##### when ALIENVPS (part of US_Gateways) is down, should switch to CHICAGOVPS gateway but switching to WAN instead...I see US_Gateways disappears entirely!
##############################################

# Gateways
GWWAN = " route-to ( vxn0 <REMOVED MY PUBLIC IP GATEWAY HERE> ) "
GWALIENVPS = " "
GWCHICAGOVPS = " route-to ( ovpnc1 10.10.0.5 ) "

anchor "userrules/*"
pass in quick on $LAN proto tcp from 192.168.1.0/24 to any port 25 flags S/SA keep state label "USER_RULE: SMTP TCP 25 to any via US Gateways"
pass in quick on $LAN from $Colins_Dell_XPS_8300 to any keep state label "USER_RULE: Colins_Dell_XPS_8300 to any via US_Gateways"
pass in quick on $LAN from $Apple_TV to any keep state label "USER_RULE: Apple_TV to any via US_Gateways"
pass in quick on $LAN from 192.168.1.0/24 to <negate_networks> keep state label "NEGATE_ROUTE: Negate policy routing for destination"
pass in quick on $LAN $GWWAN from 192.168.1.0/24 to any keep state label "USER_RULE: LAN to any via WAN"
block in quick on $IPsec proto tcp from any to 192.168.1.0/24 label "USER_RULE: Block IPsec client (TCP) to LAN (UDP/VoIP passes)"
pass in quick on $IPsec from any to any keep state label "USER_RULE: IPsec client to any via US_Gateways "
# WANLANCHICAGOVPSALIENVPSIPsecOpenVPN l2tp array key does not exist for label "USER_RULE"
pass in quick on $OpenVPN from any to 192.168.1.0/24 keep state label "USER_RULE: OpenVPN client to LAN"
pass in quick on $OpenVPN from any to any keep state label "USER_RULE: OpenVPN client to any via US_Gateways"
# WANLANCHICAGOVPSALIENVPSIPsecOpenVPN pptp array key does not exist for label "USER_RULE"

Jim P wrote:
> It would also help to get a copy of /tmp/rules.debug from when it's running normally, and again when the VPN is down and it's not routing traffic as you expect.
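For anyone comparing their own two copies of /tmp/rules.debug, the difference that matters in the dumps above is that some USER_RULE lines lose their `$GW...` macro. A rough way to spot that is sketched below. It assumes one rule per line, unique labels, and that only gateway macros start with `$GW` (an alias with that prefix would be misread); `gateways_by_label` and `lost_gateways` are hypothetical helper names, not part of pfSense.

```python
import re
from typing import Dict, Optional

# A pass/block line with its first label, and any gateway macro such as $GWUS_Gateways.
RULE_RE = re.compile(r'^(?:pass|block)\b.*?label "(?P<label>[^"]*)"')
GW_RE = re.compile(r'\$(GW\w+)')


def gateways_by_label(rules_text: str) -> Dict[str, Optional[str]]:
    """Map each USER_RULE label to the gateway macro on its rule (None if absent)."""
    result: Dict[str, Optional[str]] = {}
    for line in rules_text.splitlines():
        match = RULE_RE.match(line.strip())
        if not match or not match.group("label").startswith("USER_RULE"):
            continue
        gw = GW_RE.search(line)
        result[match.group("label").strip()] = gw.group(1) if gw else None
    return result


def lost_gateways(before: str, after: str) -> Dict[str, str]:
    """Labels whose rule had a gateway macro in `before` but has none in `after`."""
    old, new = gateways_by_label(before), gateways_by_label(after)
    return {
        label: gw
        for label, gw in old.items()
        if gw is not None and label in new and new[label] is None
    }


# Two rules taken from the dumps above, before and after ALIENVPS went down.
BEFORE = '''
pass in quick on $LAN $GWUS_Gateways from $Apple_TV to any keep state label "USER_RULE: Apple_TV to any via US_Gateways"
pass in quick on $LAN $GWWAN from 192.168.1.0/24 to any keep state label "USER_RULE: LAN to any via WAN"
'''
AFTER = '''
pass in quick on $LAN from $Apple_TV to any keep state label "USER_RULE: Apple_TV to any via US_Gateways"
pass in quick on $LAN $GWWAN from 192.168.1.0/24 to any keep state label "USER_RULE: LAN to any via WAN"
'''

for label, gw in lost_gateways(BEFORE, AFTER).items():
    print(f"{label!r} lost {gw}")
```

A rule that loses its macro is still a pass rule, so the traffic it matches simply follows the routing table, which here means the default WAN.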
Updated by Colin Sinclair over 12 years ago
I should have added: all I did was disconnect the OpenVPN client behind the ALIENVPS gateway and then wait for the gateway monitoring to detect it as down. I assume this replicates the link going down for other reasons.
Updated by Colin Sinclair over 12 years ago
Also, I tried setting the gateway in my LAN firewall rules to a standalone gateway (i.e. not a group) and then took that gateway down. It shows exactly the same behaviour as a gateway group: it starts using the default WAN instead of not routing any traffic at all.
I also created a gateway group with three gateways (my default WAN and two OpenVPN-based VPN clients), with one VPN client as tier 1, the other as tier 2, and the default WAN as "never". When the tier 1 VPN client is down, traffic goes immediately to the default WAN, bypassing the tier 2 VPN client. I also tried putting the default WAN at tier 3 behind the other two, with the same result: tier 1 goes down and traffic goes to the WAN, bypassing tier 2.
Updated by Doug Dimick over 9 years ago
I'm experiencing the same issue on 2.2.5. For policy purposes I require redundant OpenVPN tunnels, and I also need the system to fail closed rather than fail open.
Updated by Doug Dimick over 9 years ago
Still an issue on 2.3 alpha. A somewhat ugly and not terribly secure workaround is to add an outbound NAT rule that disables NAT on the WAN interface for the specific host(s) you want to keep VPN-only. At the very least, the UI should make clear that setting WAN to "never" in the gateway group doesn't produce the expected behavior; it shouldn't be an option if it doesn't actually work.
Updated by Chris Buechler about 9 years ago
- Status changed from New to Not a Bug
"Never" on a gateway group means that the gateway is never a member of the group. That works fine.
See System>Advanced, Misc, "By default, when a rule has a gateway specified and this gateway is down, the rule is created omitting the gateway. This option overrides that behavior by omitting the entire rule instead." That's exactly how it works.
Enable that option, add a block rule beneath the pass rule that specifies the VPN gateway, and you have the desired end result.
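To make the difference concrete, here is a toy model of the behaviour quoted above. It is a sketch only, not pfSense's rule generator: `render_rule`, `build_ruleset`, the `skip_rules_gw_down` parameter name, the `$vpn_host` alias and the `VPS` gateway name are all assumptions made for the illustration.

```python
from typing import List, Optional


def render_rule(action: str, iface: str, src: str, gateway: Optional[str],
                gateway_up: bool, skip_rules_gw_down: bool) -> Optional[str]:
    """Return a simplified rule line, or None when the whole rule is omitted."""
    if gateway is None:
        # Rule without policy routing: always emitted as-is.
        return f"{action} in quick on ${iface} from {src} to any"
    if gateway_up:
        return f"{action} in quick on ${iface} $GW{gateway} from {src} to any"
    if skip_rules_gw_down:
        # Option enabled: the entire rule is omitted while the gateway is down.
        return None
    # Default: the rule is kept but the gateway is dropped from it.
    return f"{action} in quick on ${iface} from {src} to any"


def build_ruleset(vpn_up: bool, skip_rules_gw_down: bool) -> List[str]:
    """Pass rule via the VPN gateway, followed by a block rule for the same host."""
    rules = [
        render_rule("pass", "LAN", "$vpn_host", "VPS", vpn_up, skip_rules_gw_down),
        render_rule("block", "LAN", "$vpn_host", None, True, skip_rules_gw_down),
    ]
    return [rule for rule in rules if rule is not None]


for option in (False, True):
    print(f"VPN down, option enabled = {option}:")
    for rule in build_ruleset(vpn_up=False, skip_rules_gw_down=option):
        print("  " + rule)
```

With the option off, the first line left is a plain pass rule, so the host's traffic follows the default route out the WAN. With the option on, only the block rule remains and the host fails closed.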