Project

General

Profile

Bug #8555

Selectively killing states on WAN failure

Added by Steven Brown over 1 year ago. Updated about 1 year ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Multi-WAN
Target version:
-
Start date:
06/06/2018
Due date:
% Done:

0%

Estimated time:
Affected Version:
All
Affected Architecture:
All

Description

The current options on a WAN failure is to kill all states, or none at all. In a scenario such as having a wireless link is installed as a backup, this leaves all your connections being dropped if the wireless backup link goes offline or not dropping connection states and having devices that don't fail over to the backup link properly if your main link goes offline. With something like VOIP this can result in dropped calls when the backup connection fails or phones going dead and not failing over if the main link fails.

Killing states was looked at in Bug #3181, and there is a comment "Wiping the entire state table is overkill, but will have to suffice for 2.1", but the code doesn't look to have been changed since then.

There is code in /etc/rc.kill_states that attempts to selectively kill states based on the states found on a failed interface. I have taken this, modified it and added it in to /etc/inc/filter.inc to try to handle these situations so connections will fail over to a backup gateway without the need to kill all active states on non-failed gateways.

I have attached two patches. One takes the code from /etc/rc.kill_states and only kills the connections based on IPs which match associated NAT states, along with all connections on the interface. The other expands this code and finds and kills all connections based on IPs which match any connection state on that interface, NAT or not, IPv4 and IPv6.

There is a situation where if certain IP pairs have connections out two different gateways, for example if different connections from the same source to the same destination were routed out two different gateways, it will drop the connections which were on going through the non-failed gateway as well, but this is still less of an impact compared to killing all states in the state table.

Possible improvements to these patches:
  • Moving this code into its own function if the logic can be shared by these two areas.
  • Fix the code path such that routing fails to a backup gateway before the states are killed. The code to kill states seems to be called multiple times (some in different threads) on gateway failover. I've noted that after the first call to kill states, connection attempts directly after this may still attempt to go out the failed gateway. Further calls to kill states happen subsequently and the connections will eventually fail over, but this seems to take extra time than may be necessary.

On a side note, I also discovered that the original code in /etc/rc.kill_states has a bug preventing it from working as expected - Bug #8554

patch.nat_only (1.32 KB) patch.nat_only Kills only NAT connections on other interfaces associated with the failed interface Steven Brown, 06/06/2018 08:41 PM
patch.all_connections (2.26 KB) patch.all_connections Kills all connections on other interfaces associated with the failed interface Steven Brown, 06/06/2018 08:41 PM
FreeBSD-src.patch (12.2 KB) FreeBSD-src.patch Steven Brown, 07/15/2018 09:44 PM
pfsense.patch (1.46 KB) pfsense.patch Steven Brown, 07/15/2018 09:44 PM

History

#1 Updated by Jim Pingle over 1 year ago

  • Category set to Multi-WAN
  • Affected Version set to All
  • Affected Architecture set to All

The reason we have not taken these approaches is primarily because they do not scale. Some people have state tables with hundreds of thousands or millions of entries. PHP code like this will quickly fail with that volume. The processing would have to be handled in a significantly different manner for it to have any chance of working, and would still be slow on affected systems.

If you or someone else can come up with a method that scales well into the millions of states, we'd be interested in having a solution.

#2 Updated by Steven Brown over 1 year ago

I've been working on doing this better and I believe I've come up a solution. I'd appreciate if someone could review these patches to make sure it is all on the right track. My C programming could be rusty and I don't really have the resources to do extensive testing on them, including with the amount of states you'd like to scale to. If I've understood the code correctly, I don't think the performance should be too different from a "pfctl -Fs" call.

The problem I see it is two main parts:
- killing a state would often leave behind a matching state in the opposite direction that is not killed off
- there was no easy way to match states to a gateway (you kill states matched to an interface, but an interface may have multiple gateways)

From reading through the pf code, I've found that a state contains a rt_addr, which seems to correspond to the gateway a state will route through as defined by a policy based routing rule. There is also a function pf_find_state_all, which matches a key (address family, protocol, src & dst addr, src & dst port) and returns a matching state. From what I can tell this uses a hash structure so it should be very quick.

What my kernel patches try to do is:
- add a search parameter to be able to kill states based off the rt_addr
- add a flag which will tell the system to look for and kill a state in the opposite direction based off the specified key and kill it off along with the original state being killed

From there we can patch filter.inc to:
- search for a down gateway and then kill states based off the gateway IP (which will usually be an IN LAN interface state) and also kill matching (OUT WAN interface) states. Killing by gateway here only really applies to states created using policy based routing. Otherwise, states which route using the internal routing table don't define a rt_addr address (more accurately it shows as 0.0.0.0, ::).

The kernel patches also allow us to kill states based of interface and their matching states in the opposite direction with a command like:

 # /sbin/pfctl -M -i interface -Fs

What I'd appreciate feedback or review on is some of the kernel code logic to make sure I have understood it correctly.

In particular:

if (s->direction == PF_OUT) {
  dir = PF_IN;
  idx = PF_SK_STACK;
} else {
  dir = PF_OUT;
  idx = PF_SK_WIRE;
}

This is supposed to select which key from the original state we look for a match from. As I understand, an inbound packet should use PF_SK_WIRE (packets as they come in off the wire) and outbound PF_SK_STACK (packets as they are represented in the kernel). This seems to work for my crude test but if someone could verify that is correct. Basically this key is what I take, swap the src, dst and direction around and then search for a match.

Also, I'm not 100% sure about:

match = pf_find_state_all(&key, dir, &more);
if (match && !more) {
  pf_unlink_state(match, 0);
  killed++;
}

This looks for a calls the find function and this function will return if there are more states available than the one matched. I'm only expecting one matched state so if more != 0 I skip killing the matched state. I'm not sure if this is correct or if maybe I should kill the state returned and then loop around until there are no more matching states? The other reference to pf_find_state_all under the DIOCNATLOOK ioctl call treats a more value > 1 as an error. The more variable looked to be returned with a 0 value in my tests.

Also, as I read it the pf_find_state_all function will not return with the hash row locked, so I am passing 0 as the second argument to pf_unlink_state. If someone could confirm that is correct because almost everywhere else I could find locks the hash row and calls pf_unlink_state with PF_ENTER_LOCKED.

#3 Updated by Steven Brown over 1 year ago

I'm not sure if I need to specify, these patches are for FreeBSD-src 'origin/RELENG_2_4' and pfsense 'origin/RELENG_2_4_3'.

#4 Updated by Luke Hamburg about 1 year ago

Steven, pretty impressive work you've done there. How have these patches been working for you? Have you gotten any other feedback on them?

#5 Updated by Steven Brown about 1 year ago

Unfortunately, I never really had the opportunity to create a proper complete build or run this outside a virtual environment. My tests involved compiling the kernel or pfctl components on a development machine and then copying them over into a pfSense virtual machine where I had edited some PHP files and then running my tests.

I made it as far as setting up some git repositories and putting my patches into it but I never quite nailed the process to build a full image. I don't code in C very often and have never ventured down to the kernel level before so I wasn't confident to submit them as pull requests in to the main code line as I really couldn't be sure I haven't broken something with these changes. In the end time became a factor and I reverted to using a PHP only based patch as it was good enough to fit our scenario and I was happy I could deploy it fairly simply using the System Patches package.

As far as the performance I was able to achieve in selectively killing states using the kernel patches it was much better than the PHP only solution. Quite similar to just using a standard pfctl -Fs and no limitation on the number of states it can handle.

#6 Updated by Luke Hamburg about 1 year ago

Well it still could be worth submitting the PR to get some other eyes on it.

Also, having it up on Github would make it easier for other people to test the patches (myself included)

#7 Updated by Steven Brown about 1 year ago

I've created a pull request for the system part at https://github.com/pfsense/FreeBSD-src/pull/11

I'm guessing that isn't the right branch for new updates, but I'm hoping someone might be able to take it from there to wherever it needs to go.

Also available in: Atom PDF