Bug #8056
closed
Bridge + CARP crashes/freezes pfSense
Added by Anonymous about 7 years ago.
Updated over 6 years ago.
Affected Architecture:
All
Description
Same behavior as the linked bug below: running CARP on a bridge interface and sending any non-trivial amount of traffic to the CARP IP freezes pfSense.
Older issue: https://redmine.pfsense.org/issues/4607
On a VirtualBox VM, the VM simply freezes. On real hardware the machine did not freeze completely (e.g. the serial console remained somewhat usable), but various processes ended up in a locked state, matching the symptoms described in https://forum.pfsense.org/index.php?topic=139030.0
The problem is mitigated when traffic is sent to the pfSense interface IP instead of the CARP IP.
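For anyone trying to reproduce this, a minimal sketch of a traffic generator that can be pointed first at the CARP IP and then at the interface IP for comparison. The report does not say which protocol or volume triggers the freeze, so UDP, the port, and the datagram counts here are assumptions, and the function name `send_udp_burst` is purely illustrative.

```python
import socket


def send_udp_burst(host, port, count=1000, payload_size=512):
    """Send `count` UDP datagrams to (host, port).

    Returns the total number of payload bytes handed to the OS.
    UDP is an assumption: the report only says "non-trivial traffic".
    """
    payload = b"x" * payload_size
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for _ in range(count):
            # sendto() returns the number of bytes accepted for this datagram
            sent += sock.sendto(payload, (host, port))
    return sent


# Hypothetical usage (addresses are placeholders, not from the report):
#   send_udp_burst("192.0.2.10", 9)   # CARP virtual IP
#   send_udp_burst("192.0.2.11", 9)   # interface IP, for comparison
```

Run it from a client on the bridged segment, not from the firewall itself, so the traffic actually crosses the bridge.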
Re-tested a few days ago on 2.4.2 and I can observe the same crash.
Can anyone move this report to status Confirmed, since several people have reported the same issue in the linked forum thread?
PF deadlocks once every 3 hours or so. A process appears to be holding a lock (the CARP lock or the bridge lock?). I think it then fires off an ifconfig, which among other things wants to display CARP status, and there it sits. Interestingly, VGA/keyboard operations are locked up, but it is still possible to run anything that is not network-related over the serial connection. There is a lot of detail in the report referenced above.
This was observed with pfSense running as a guest on a QEMU/KVM host running Ubuntu "artful".
It happens with both the e1000 and the virtio drivers.
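The hang described above, where ifconfig blocks while the rest of the system stays partly usable, could be detected with a small probe. This is only a sketch, assuming a Python 3 interpreter is available on the firewall (not confirmed by this report). The function name `command_responds` and the timeout values are hypothetical. It polls instead of waiting on the child, because a process blocked inside the kernel may not exit even after being killed.

```python
import subprocess
import time


def command_responds(cmd, timeout_s=5.0, poll_s=0.1):
    """Return True if `cmd` exits within `timeout_s` seconds, else False."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if proc.poll() is not None:
            return True  # the command finished, so it was not stuck
        time.sleep(poll_s)
    # Best effort only: do not wait here, since a process stuck on a
    # kernel lock may never go away and waiting would hang the probe too.
    proc.kill()
    return False


# Hypothetical usage on the firewall, e.g. from the serial console:
#   if not command_responds(["ifconfig"]):
#       print("ifconfig did not return; possible CARP/bridge deadlock")
```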
Confirmed - I can also replicate this easily. CARP on a bridged interface, tested on 2.4.2 and 2.4.2_1 with no change. pfSense running on VMware ESXi 6.5 and VMware Workstation 14, e1000 emulated NICs, fully repeatable on both platforms. Happy to help with any testing required on this.
Confirmed - We have 2 Netgate 8860 1U appliances set up with CARP + Bridge, and when upgrading from 2.3.4 to 2.4.2_1 we hit this bug on both firewalls. Sometimes we would be OK for 10-15 minutes, other times we would make it past an hour of uptime. We ended up having to go back to 2.3.4, which is simply rock solid.
- Category set to CARP
- Priority changed from High to Normal
- Affected Version changed from 2.4.1 to 2.4.x
- Affected Architecture All added
- Affected Architecture deleted ( )
- Status changed from New to Confirmed
- Assignee set to Luiz Souza
The previous patch works well on 2.3.x. Is it possible to apply the same patch for 2.4.x while FreeBSD folks decide what to do next?
- Priority changed from Normal to High
- Affected Version changed from 2.4.x to 2.4.3
- Target version set to 2.4.3
- Affected Version changed from 2.4.3 to 2.4.x
I have exactly the same issue with my pfSense setup on a physical Netgate appliance. Is there any ETA for when this will be resolved?
I also have exactly the same issue on Netgate 8860 appliances. I first thought it was a hardware problem and migrated the config to a pair of SG-4860s. After 10 minutes the same problem occurred again.
I just upgraded my pfSense from 2.3.4-p1 to 2.4.2-Release-p1.
Now I also have the same issue.
Any news on this, Luiz :-) ?
Thanks
Simon Kristensen wrote:
I just upgraded my pfSense from 2.3.4-p1 to 2.4.2-Release-p1.
Now I also have the same issue.
Any news on this, Luiz :-) ?
Thanks
Back to 2.3.5-p1 and it works again.
- Status changed from Confirmed to Feedback
- % Done changed from 0 to 100
This issue seems to be fixed (again) in my local tests.
Please check with tomorrow's snapshot.
I have tested this. I could easily trigger it in 2.4.2_1 but could not in current snaps. It looks to be solved.
Anyone who was hitting this and is able to, please test current 2.4.3 snapshots.
- Status changed from Feedback to Resolved