Bug #6863

closed

pf states reset by CARP neighbor

Added by Alex Kolesnik over 7 years ago. Updated over 7 years ago.

Status: Rejected
Priority: Normal
Assignee: -
Category: CARP
Target version: -
Start date: 10/18/2016
% Done: 0%
Affected Architecture: i386

Description

There are two pfSense routers in HA mode (version 2.3.2-RELEASE-p1, although I first ran into this issue on 2.2.5/2.2.6). Sometimes one of the routers starts dropping traffic because its firewall states are being reset. Most of the time it has happened on the MASTER node, but today it happened on the BACKUP node: I opened an SSH connection to the BACKUP node and it stalled after a few seconds. tcpdump showed incoming TCP packets to port 22, but there were no replies. The state table had no entry for my SSH connection.
After some investigation I found that the issue goes away when I switch the CARP interface off (by clearing the "Enable" checkbox in the web interface). Re-enabling the interface brings the issue back. Temporarily disabling CARP (on the BACKUP node) instead of switching the interface off doesn't help.
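
The "no state entry" check described above can be repeated against saved `pfctl -ss` output. The following is only a rough sketch: it assumes IPv4 addresses and the usual one-line-per-state layout of `pfctl -ss`, and the function name, sample addresses, and sample lines are hypothetical, not taken from the affected routers.

```python
# Hypothetical sample in the usual `pfctl -ss` layout (documentation addresses only).
SAMPLE = """\
all tcp 192.0.2.10:22 <- 203.0.113.7:51515       ESTABLISHED:ESTABLISHED
all udp 192.0.2.10:123 -> 198.51.100.4:123       MULTIPLE:MULTIPLE
all tcp 192.0.2.10:443 <- 203.0.113.7:51600       FIN_WAIT_2:FIN_WAIT_2
"""


def matching_states(pfctl_output, client_ip, port=22):
    """Return state lines that involve client_ip and an endpoint on `port`.

    Assumes IPv4 "address:port" tokens, optionally wrapped in parentheses
    (as pf prints for translated addresses).
    """
    suffix = ":%d" % port
    hits = []
    for line in pfctl_output.splitlines():
        tokens = [t.strip("()") for t in line.split()]
        has_client = any(t.split(":")[0] == client_ip for t in tokens)
        has_port = any(t.endswith(suffix) for t in tokens)
        if has_client and has_port:
            hits.append(line)
    return hits


if __name__ == "__main__":
    # To check a live system, save the output first, e.g. `pfctl -ss > states.txt`,
    # and pass the file contents instead of SAMPLE.
    for state in matching_states(SAMPLE, "203.0.113.7", 22):
        print(state)
```

An empty result while tcpdump still shows packets for the connection would match the symptom described here: traffic arrives, but the state has been removed.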

Let me know if you need any other info.

Please help me figure this out!

Actions #1

Updated by Jim Pingle over 7 years ago

  • Status changed from New to Rejected
  • Priority changed from Very High to Normal
  • Affected Version deleted (2.3.2)

That is normal and expected when the two units are properly synchronizing states. Find what is clearing the states and fix that problem (a down gateway with state killing enabled, for example).

When you fix the network/config problem, the undesirable aspect of this correct behavior will go away.

Actions #2

Updated by Alex Kolesnik over 7 years ago

Jim, thanks for your explanation! This is exactly what I'm trying to find out: what is clearing the states. I know Redmine is not the best place for such questions, but I would really appreciate a hint on where to look. I've already tried a lot of things, including checking the Schedule States (it's ticked) and Killing States (it's cleared) checkboxes, pfctl -xl, stopping daemons, reconfiguring sync from scratch using LAGG instead of plain Ethernet, and asking people on IRC. Absolutely nothing helps. I'm completely lost.
