Bug #10513


State issues with policy routing and HA failover

Added by Jim Pingle almost 4 years ago. Updated almost 2 years ago.

Rules / NAT


Seeing some odd behavior on HA pairs which have multiple WANs and use policy routing. In some cases, the states for a client disappear when failing over. In others, the state is present but the traffic may be egressing the wrong interface.

Consider this scenario:

WAN1 is the default, and some clients are policy routed out WAN2. In this example, the test client is one of the clients policy routed out WAN2.

1. Start a TCP connection from the client to an Internet host. The states on both nodes and a packet capture on the primary show the traffic entering LAN and exiting WAN2 (OK).

2. Put the primary node into CARP maintenance mode. The state is OK on the primary, but the state that was present moments ago is no longer in the state table on the secondary. Traffic from the client stops entirely.

3. Take the primary node out of CARP maintenance mode. The states and a packet capture on the primary still show the traffic entering LAN and exiting WAN2 (OK).

4. Wait a bit, and the state eventually re-syncs to the secondary node.

5. Put the primary node back into CARP maintenance mode. The states on the secondary still show the traffic entering LAN and exiting WAN2 (OK), but the packet capture shows the packets actually leaving WAN1 with the address of WAN2 as their source.

Note that if this is tested with ICMP, the second step will be different, because ICMP results in a new state being created to replace the missing state. That case appears to show the problem on the first fail back instead of requiring a second failover.
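
For anyone repeating the test, the "is the state present on both nodes" check can be scripted. The following is only a rough sketch: it assumes the text of `pfctl -ss` has already been saved from each node and that each state line starts with the interface name. The function names, interface names, and addresses are hypothetical and not taken from this report.

```python
import re


def states_for_client(pfctl_output, client_ip):
    """Return (interface, line) pairs for state lines mentioning client_ip.

    Assumes each line of `pfctl -ss` output begins with the interface name
    (an assumption about the output format, IPv4 addresses only).
    """
    # Match the address as a whole host so 10.0.0.5 does not match 10.0.0.50.
    pattern = re.compile(r"(?<![\d.])" + re.escape(client_ip) + r"(?![\d.])")
    matches = []
    for line in pfctl_output.splitlines():
        fields = line.split()
        if len(fields) < 2 or not pattern.search(line):
            continue
        matches.append((fields[0], line.strip()))
    return matches


def interfaces_missing_on_secondary(primary_output, secondary_output, client_ip):
    """Return interfaces with a client state on the primary but not the secondary."""
    primary = {iface for iface, _ in states_for_client(primary_output, client_ip)}
    secondary = {iface for iface, _ in states_for_client(secondary_output, client_ip)}
    return primary - secondary


if __name__ == "__main__":
    # Hypothetical sample data: em1 = LAN, em2 = WAN2.
    primary = (
        "em1 tcp 203.0.113.10:443 <- 10.0.0.5:51234       ESTABLISHED:ESTABLISHED\n"
        "em2 tcp 198.51.100.2:40111 (10.0.0.5:51234) -> 203.0.113.10:443       ESTABLISHED:ESTABLISHED\n"
    )
    secondary = ""  # state table on the secondary has no entry for the client
    print(sorted(interfaces_missing_on_secondary(primary, secondary, "10.0.0.5")))
```

This only covers the first symptom (state missing on the secondary). It cannot detect the second symptom, where the state looks correct but the packets leave WAN1; that still needs a packet capture on each WAN.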

Tested on 2.5.0.a.20200430.0741 (12.1-STABLE), but a customer has also reported seeing this on 2.4.5-RELEASE.

