Bug #8056
Bridge + CARP crashes/freezes pfSense
100%
Description
Same behavior as the linked bug below: running CARP on a bridge interface and sending any non-trivial amount of traffic to the CARP IP causes pfSense to freeze.
Older issue: https://redmine.pfsense.org/issues/4607
On a VirtualBox VM, the VM simply freezes. On real hardware, the machine did not freeze completely (e.g. the serial console was somewhat usable), but various processes ended up in a locked state, matching the symptoms described in https://forum.pfsense.org/index.php?topic=139030.0
The problem is mitigated when traffic is sent to the pfSense interface IP instead of the CARP IP.
Updated by Anonymous almost 7 years ago
More context: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200319
This configuration works well on 2.3.3+ (didn't test any previous releases), but fails on 2.4.1.
Updated by Anonymous almost 7 years ago
Re-tested a few days ago on 2.4.2 and I can observe the same crash.
Can anyone move this report to status Confirmed, since several people have reported the same issue in the linked forum thread?
Updated by Harry Coin almost 7 years ago
Confirmed. For details, see:
https://redmine.pfsense.org/issues/8145
Updated by Harry Coin almost 7 years ago
PF deadlocks roughly once every 3 hours. A process appears to be holding a lock (the CARP lock or the bridge lock?), and I think it then fires off an ifconfig which, in part, wants to display CARP status, and there it sits. Interestingly, VGA/keyboard operations are locked, but it is still possible to run any non-network-related process via the serial connection. There is a lot of detail in the report referenced above.
Updated by Harry Coin almost 7 years ago
This is observed with pfSense running as a guest on a QEMU/KVM host running Ubuntu "artful".
Updated by Harry Coin almost 7 years ago
Happens with both the e1000 and virtio drivers.
Updated by James Freeman almost 7 years ago
Confirmed - I can also replicate this easily. CARP on a bridged interface, tested on 2.4.2 and 2.4.2_1 with no change. pfSense running on VMware ESXi 6.5 and VMware Workstation 14, e1000 emulated NICs, fully repeatable on both platforms. Happy to help with any testing required on this.
Updated by Adam Boyhan over 6 years ago
Confirmed - We have 2 Netgate 8860 1U appliances set up with CARP + Bridge, and when upgrading from 2.3.4 to 2.4.2_1 we hit this bug on both firewalls. Sometimes we would be OK for 10-15 minutes; other times we would make it past an hour of uptime. We ended up having to go back to 2.3.4, which is simply rock solid.
Updated by Jim Pingle over 6 years ago
- Category set to CARP
- Priority changed from High to Normal
- Affected Version changed from 2.4.1 to 2.4.x
- Affected Architecture All added
- Affected Architecture deleted ()
The underlying FreeBSD bug is still open:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200319
The previous patch that was on 2.2.x and 2.3.x had some issues and was not accepted by FreeBSD:
https://reviews.freebsd.org/D3133
Updated by Luiz Souza over 6 years ago
- Status changed from New to Confirmed
- Assignee set to Luiz Souza
Updated by Anonymous over 6 years ago
The previous patch works well on 2.3.x. Is it possible to apply the same patch for 2.4.x while FreeBSD folks decide what to do next?
Updated by Luiz Souza over 6 years ago
- Priority changed from Normal to High
- Affected Version changed from 2.4.x to 2.4.3
Updated by Luiz Souza over 6 years ago
- Target version set to 2.4.3
- Affected Version changed from 2.4.3 to 2.4.x
Set target.
Updated by Scott Maxwell over 6 years ago
I have exactly the same issue with my pfSense setup on a Netgate physical appliance. Is there an ETA for when this will be resolved?
Updated by Andreas Kaindl over 6 years ago
I also have exactly the same issue on Netgate 8860 appliances. I first thought it was a hardware problem and migrated the config to a pair of SG4860s. After 10 minutes, the same problem occurred again.
Updated by Simon Kristensen over 6 years ago
I just upgraded my pfSense from 2.3.4-p1 to 2.4.2-Release-p1.
Now I also have the same issue.
Any news on this, Luiz :-) ?
Thanks
Updated by Simon Kristensen over 6 years ago
Simon Kristensen wrote:
I just upgraded my pfSense from 2.3.4-p1 to 2.4.2-Release-p1.
Now I also have the same issue.
Any news on this, Luiz :-) ?
Thanks
Back to 2.3.5-p1 and it works again.
Updated by Luiz Souza over 6 years ago
- Status changed from Confirmed to Feedback
- % Done changed from 0 to 100
This issue seems to be fixed (again) in my local tests.
Please check with tomorrow's snapshot.
Updated by Steve Wheeler over 6 years ago
I have tested this. I could easily trigger it in 2.4.2_1 but could not in current snaps. It looks to be solved.
Anyone who was hitting this and is able to, please test the current 2.4.3 snapshots.
Updated by Jim Pingle over 6 years ago
- Status changed from Feedback to Resolved
Tested and resolved.