Bug #16836

IPsec daemon can crash if a peer initiates two rekeys for the same child SA

Added by David Hiebert 1 day ago. Updated about 14 hours ago.

Status:
Feedback
Priority:
Normal
Category:
IPsec
Target version:
Start date:
Due date:
% Done:

0%

Estimated time:
Plus Target Version:
26.03.1
Release Notes:
Default
Affected Version:
Affected Architecture:

Description

  1. Product / version
    - pfSense Plus 25.11.1-RELEASE
    - strongSwan version on 25.11.1: `strongswan-6.0.3` (confirmed via `pkg info strongswan`)
    - strongSwan version on 26.03: `strongswan-6.0.3_1` (confirmed by launching the Netgate pfSense Plus 26.03 AWS Marketplace AMI and querying `pkg info strongswan`)
    - The `_1` is a FreeBSD port revision bump, not an upstream update; the package's CPE string still identifies it as `strongswan:6.0.3` (with `port_checkout_unclean: no`), and the upstream fix is not present.
  1. Summary
    Reproducible pattern of charon crashes on a pfSense Plus 25.11.1 IPsec concentrator. The crash signature matches upstream strongSwan issue strongswan/strongswan#2945 ("Crash caused if confused peer initiates two rekeyings for the same Child SA"), which was fixed in strongSwan 6.0.4 (released 2025-12-12). The crash has now been observed at least twice on the same host.

We have independently confirmed that pfSense Plus 26.03 still bundles strongSwan 6.0.3 (port revision `_1`, with no relevant patches). The request is that strongSwan >= 6.0.4 ship in a future pfSense Plus release or be backported to the 25.11.x train.
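
For clarity on why the `_1` suffix does not help: FreeBSD's PORTREVISION suffix orders package builds for upgrades but does not advance the upstream version. A minimal sketch of that ordering (illustrative only; pkg's real version-comparison rules are richer):

```python
def parse_pkg_version(v: str):
    """Split a FreeBSD package version like '6.0.3_1' into the
    upstream version tuple and the port-revision number.
    (Illustrative only; pkg's real comparison handles more forms.)"""
    base, _, rev = v.partition("_")
    return tuple(int(x) for x in base.split(".")), int(rev or 0)

fixed = parse_pkg_version("6.0.4")      # first upstream release with the fix
shipped = parse_pkg_version("6.0.3_1")  # what 26.03 bundles

# The revision bump does not advance the upstream version:
assert shipped < fixed
print(shipped, "<", fixed)  # ((6, 0, 3), 1) < ((6, 0, 4), 0)
```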

  1. Evidence
    1. Kernel-level exit
      ```
      kernel: pid <pid> (charon), jid 0, uid 0: exited on signal 6 (core dumped)
      ```
      Signal 6 (SIGABRT) comes from charon's own abort() call, issued by its internal signal handler after it catches a critical signal (here SIGBUS, signal 10 on FreeBSD).
    1. charon in-process stack (from ipsec.log immediately before the abort)
      Fatal frame chain on the crashing worker thread:
      ```
      child_delete_create+0x31a
      <- task_manager_v2_create+0x2b22
      <- delete_child_sa_job_create_id+0x103
      <- processor_create
      <- thread_create
      ```

A coredump was preserved on the host but will not be shared (the process memory of an IPsec daemon contains session key material). A sanitized symbolic backtrace can be provided on request.
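
The signal chain behind that kernel log line is generic POSIX behavior and can be reproduced outside charon. A minimal sketch (nothing charon-specific; it only mimics the handler-calls-abort pattern): a handler for a critical signal calls abort(), so the kernel records the exit as signal 6 even though the underlying fault was SIGBUS.

```python
import os
import signal

# Toy reproduction of the exit signature (not charon itself): install a
# handler for a critical signal that calls abort(), the way strongSwan's
# internal handler does, and observe the child die on SIGABRT (6).

def critical_handler(signum, frame):
    # Analogue of "killing ourself, received critical signal"
    os.abort()

pid = os.fork()
if pid == 0:
    signal.signal(signal.SIGBUS, critical_handler)
    os.kill(os.getpid(), signal.SIGBUS)   # simulate the bus error
    os._exit(0)                           # never reached

_, status = os.waitpid(pid, 0)
print("child exited on signal", os.WTERMSIG(status))  # -> 6 (SIGABRT)
```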

  1. Sequence at time of crash
    1. ~7 minutes before the crash: CHILD_SA on a site-to-site tunnel completed a rekey cycle cleanly (SPI A → SPI B; old SA transitioned REKEYED → DELETING → DELETED).
    2. A second rekey cycle on the same tunnel entered REKEYED → DELETED state.
    3. A CHILD_DELETE job was dispatched on the already-rekeyed CHILD_SA.
    4. Worker thread faulted inside `child_delete_create`.
    5. strongSwan's signal handler caught SIGBUS, logged "killing ourself, received critical signal", dumped the stack, and called abort().

This matches the mechanism described in strongswan/strongswan#2944: a peer drives two sequential rekeys on the same CHILD_SA, leaving the original SA destroyed while a delete job still references it.
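
That sequence can be sketched as a toy model (our reading of the upstream reports, not strongSwan's actual data structures; the `guarded` flag only stands in conceptually for the 6.0.4 fix):

```python
# Toy model of the race described above: each rekey retires the old
# CHILD_SA, but a CHILD_DELETE job queued for it may still hold the
# retired SA's SPI when it finally runs.

class ChildSA:
    def __init__(self, spi):
        self.spi = spi
        self.state = "INSTALLED"

class IkeSA:
    def __init__(self):
        self.children = {}   # spi -> ChildSA
        self.jobs = []       # queued (job type, spi) pairs

    def rekey(self, old_spi, new_spi):
        old = self.children.pop(old_spi)   # old SA destroyed
        old.state = "DELETED"
        self.children[new_spi] = ChildSA(new_spi)
        self.jobs.append(("CHILD_DELETE", old_spi))  # delete job still queued

    def run_jobs(self, guarded):
        for _, spi in self.jobs:
            sa = self.children.get(spi)
            if sa is None:
                if guarded:
                    continue   # fixed shape: stale job is ignored
                # pre-6.0.4 shape: analogue of touching a destroyed SA
                raise RuntimeError(f"dereferenced destroyed CHILD_SA {spi:#x}")

ike = IkeSA()
ike.children[0xA] = ChildSA(0xA)
ike.rekey(0xA, 0xB)   # first rekey: SPI A -> B
ike.rekey(0xB, 0xC)   # second rekey before the delete jobs drain
try:
    ike.run_jobs(guarded=False)
except RuntimeError as e:
    print("crash:", e)
ike.run_jobs(guarded=True)   # stale jobs skipped, no crash
```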

  1. Upstream references
    - https://github.com/strongswan/strongswan/issues/2945 — fixed in 6.0.4 ("Prevent a crash if a confused peer rekeys a Child SA twice before sending a delete")
    - https://github.com/strongswan/strongswan/discussions/2944 — mechanism description
    - 6.0.5 adds a defensive follow-on fix: "Avoid an incorrect down event if deleting a rekeyed Child SA fails"
    - 6.0.6 (2026-04-22) includes several unrelated CVE fixes
  1. Requests

1. Ship strongSwan >= 6.0.4 in a future pfSense Plus release. 6.0.5 preferred for the follow-on fix; 6.0.6 adds CVE fixes worth having.
2. Backport consideration: a targeted backport of the 6.0.4 child-rekey fix to a 25.11.x package update would let deployments on the current train avoid a major version upgrade. Is this feasible?
3. Interim mitigation: are there `charon.strongswan.conf` tuning options (rekey margins, `delete_rekeyed` behavior, related options) that would reduce exposure while awaiting a fixed version?
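
On the mitigation question, the knob we had in mind (option name from stock strongSwan documentation; the value and its effectiveness against this crash are unverified assumptions, and where pfSense's generated config allows such overrides is part of the question):

```
# strongswan.conf fragment (pfSense generates this file, so a
# GUI-safe placement would be needed)
charon {
    # Seconds to retain the inbound state of a rekeyed CHILD_SA before
    # deleting it (stock default 5). A larger value is a guess at
    # keeping the old SA around until stale delete jobs drain.
    delete_rekeyed_delay = 30
}
```

Widening per-connection rekey jitter (`rand_time` in swanctl terms) on the affected tunnel might also reduce the chance of colliding rekeys; that too is untested.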

  1. Impact
    Production IPsec concentrator serving site-to-site VPN tunnels. A charon crash drops all tunnels on the host until the daemon is restarted, causing service interruption for every tunnel on the concentrator.
  1. What can be provided on request
    - Sanitized backtrace (`thread apply all bt`, `info locals` on the failing frame) — can be shared via a non-public channel if needed
    - Timing of prior occurrence
    - Peer IKE implementation / vendor (we have identified the specific peer driving the double-rekey pattern)