Bug #14898

closed

Suricata core dumps with signal 11

Added by Marcos M about 1 year ago. Updated about 1 year ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Suricata
Target version: -
Start date:
Due date:
% Done: 100%
Estimated time:
Plus Target Version:
Affected Version:
Affected Plus Version: 23.09
Affected Architecture: amd64

Description

I installed Suricata on a system with a previous configuration that used Legacy Mode and Enable/Disable/Drop SID lists. When I attempted to start it without performing any other actions, it crashed:

Oct 19 10:38:12 kernel pid 18065 (suricata), jid 0, uid 0: exited on signal 11 (core dumped)

I then went to Services > Suricata > SID Management, checked Rebuild, and saved. That caused it to rebuild, but it crashed again (log reversed):

Oct 19 10:44:20 kernel pid 38878 (suricata), jid 0, uid 0: exited on signal 11 (core dumped)
Oct 19 10:44:20 php 31813 [Suricata] Suricata START for WAN...
Oct 19 10:44:19 php 31813 [Suricata] Building new sid-msg.map file for ISP1...
Oct 19 10:44:19 php 31813 [Suricata] Enabling any flowbit-required rules for: ISP1...
Oct 19 10:44:17 php 31813 [Suricata] Updating rules configuration for: ISP1 ...
Oct 19 10:44:16 php-fpm 410 Starting Suricata on ISP1 per user request...
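
For anyone checking their own system log for the same failure, here is a minimal sketch that picks out the kernel "exited on signal" lines and returns them in chronological order. The regular expression is derived only from the log lines quoted above, and the function name `find_crashes` is hypothetical; other log formats may need a different pattern.

```python
import re

# Pattern modelled on the kernel crash line quoted in this report (an assumption,
# not an official log format specification).
CRASH_RE = re.compile(
    r"kernel pid (?P<pid>\d+) \((?P<proc>[^)]+)\).*exited on signal (?P<sig>\d+)"
)


def find_crashes(log_text, process="suricata"):
    """Return (pid, signal) tuples for crash lines of `process`, oldest first."""
    crashes = []
    # The log in this report is newest-first, so reverse it for chronological order.
    for line in reversed(log_text.strip().splitlines()):
        match = CRASH_RE.search(line)
        if match and match.group("proc") == process:
            crashes.append((int(match.group("pid")), int(match.group("sig"))))
    return crashes


sample = """\
Oct 19 10:44:20 kernel pid 38878 (suricata), jid 0, uid 0: exited on signal 11 (core dumped)
Oct 19 10:44:20 php 31813 [Suricata] Suricata START for WAN...
Oct 19 10:44:16 php-fpm 410 Starting Suricata on ISP1 per user request...
"""

print(find_crashes(sample))  # [(38878, 11)]
```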

Manually starting it again then succeeded (and continued to work after rebooting):

root 58585 0.2 7.2 651116 596984 - Ss 10:47 1:04.36 |-- /usr/local/bin/suricata -i vmx1 -D -c /usr/local/etc/suricata/suricata_41734_vmx1/suricata.yaml --pidfile /var/run/suricata_vmx141734.pid


Files

coredump.7z (644 KB), Marcos M, 10/19/2023 04:55 PM
suricata.zip (1.8 MB), Marcos M, 10/31/2023 11:33 PM
Actions #1

Updated by Marcos M about 1 year ago

This time it continued to crash after an update to the latest 23.09 snapshot. It seems to be related to the existence of a VIP.

With the IP Alias VIP 2001:db8:db8:ccc::a/128 configured, it crashes.

After removing the VIP, it then starts.

New core dump is attached.

Actions #2

Updated by Bill Meeks about 1 year ago

Thank you Marcos for the hint about the VIP. I am investigating. The crash is happening within a portion of the custom Legacy Blocking Mode plugin used on pfSense (via a custom patch) where a thread is created to subscribe to and monitor system routing messages so Suricata is aware of any firewall interface IP changes and therefore will not block them.

Actions #3

Updated by Bill Meeks about 1 year ago

I have not been able to reliably reproduce this crash, but I am testing on pfSense 2.7.0 CE with the latest Suricata 7.0.2 binary from upstream, not on pfSense Plus 23.09 as Marcos did. I can add and remove a virtual IP from pfSense without producing a crash during the operation. I tested adding the exact same IPv6 address as noted above to the localhost interface. I can see the kernel routing socket messages logged in the suricata.log file as expected when I add or delete the virtual IP on the interface. So I am no longer sure my original hypothesis is valid.

Here is a log snippet showing the automatic firewall interface IP address monitoring thread sensing the addition of a virtual IP to a firewall interface from the option under FIREWALL > VIRTUAL IPs and then updating the internal automatic interface Pass List within the custom blocking module:

[100389 - ] 2023-11-09 14:18:47 Info: alert-pf: Received notification of IP address change on firewall interface lo0.
[100389 - ] 2023-11-09 14:18:47 Info: alert-pf: Added address 2001:0db8:0db8:0ccc:0000:0000:0000:000a to automatic firewall interface IP Pass List.
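
The address in the log is written in fully expanded form, while the VIP in the earlier note uses the compressed form. As a quick check that they are the same address, here is a small sketch using Python's standard `ipaddress` module; it only compares the two strings quoted in this issue and says nothing about how the plugin formats addresses internally.

```python
import ipaddress

# VIP as entered in the earlier note (compressed form with prefix length).
vip = ipaddress.ip_interface("2001:db8:db8:ccc::a/128")

# Address as it appears in the suricata.log snippet (expanded form).
logged = ipaddress.ip_address("2001:0db8:0db8:0ccc:0000:0000:0000:000a")

print(vip.ip.exploded)   # 2001:0db8:0db8:0ccc:0000:0000:0000:000a
print(vip.ip == logged)  # True
```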

During another unrelated operation, my running Suricata instance on the test firewall did crash with a Signal 11 core dump. That dump indicated the crash occurred in the DatasetsInit() function, but the binary I was using was not built with debugging enabled and so there was no further helpful information. I've since compiled a 7.0.2 binary with debugging enabled and am letting it run hoping for another crash so I can examine the core file for a clue. This certainly appears to be somewhat random. At least two or three other users have reported similar Signal 11 Suricata crashes with the latest update.

Actions #4

Updated by Bill Meeks about 1 year ago

I may have found the culprit here (quite by accident I will admit). I think this commit by Kristof Provost might have fixed the Signal 11:

https://github.com/pfsense/FreeBSD-ports/commit/811780ff506a008abdc045d89c941529de38118a#diff-62fe346c471b44798a5f35942dfd9804014315c0f27b2e8ce690497f2d3c24db

I stumbled across it while trying to figure out why my diff patch for some changes I was making to the Legacy Blocking Mode custom plugin was seemingly omitting some code I had never seen when creating a Pull Request on the FreeBSD-ports repo of pfSense. Turns out Kristof made the change last week. I was not aware of the change and continued to work with my own private copy of the full source file for the custom blocking plugin. I have synced my full source file with Kristof's changes so we are good going forward.

Please retest this scenario after I post the Suricata 7.0.2 package to the DEVEL snapshots branch.

Actions #5

Updated by Kris Phillips about 1 year ago

  • Status changed from New to Confirmed

Users on the forums seem to have worked around the issue and seem to believe it's a Hyperscan issue.

https://forum.netgate.com/topic/183878/after-upgrade-to-pf-23-09-surricata-says-it-s-starting-but

Seems we have enough information to confirm this is an issue, so I'm marking this as Confirmed.

Actions #6

Updated by Bill Meeks about 1 year ago

Kris Phillips wrote in #note-5:

Users on the forums seem to have worked around the issue and seem to believe it's a Hyperscan issue.

https://forum.netgate.com/topic/183878/after-upgrade-to-pf-23-09-surricata-says-it-s-starting-but

Seems we have enough information to confirm this is an issue, so I'm marking this as Confirmed.

No, let's be careful here not to confuse the two. There are two separate bugs that produce two different types of failure.

The Hyperscan bug is an erroneous Fatal Error call made by the Suricata binary when the Hyperscan library fails to compile certain pattern matcher patterns. That bug was fixed upstream in Suricata binary version 7.0.1. With the upstream fix, instead of calling its internal FatalErrorExit() function, Suricata simply logs the compile failure in suricata.log and continues, ignoring only the particular pattern that failed to compile in Hyperscan. The 7.0.2 Suricata binary containing this upstream fix is in the pfSense 23.09 repo and awaiting package build and deployment.
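
The behavior change described above (log and skip a pattern that fails to compile, rather than aborting) can be illustrated with a minimal sketch. This is not Suricata or Hyperscan code: Python's `re` module stands in for the pattern compiler, and `compile_patterns` and the logger name are hypothetical names chosen for the example.

```python
import logging
import re

log = logging.getLogger("pattern-loader")  # hypothetical logger name


def compile_patterns(patterns):
    """Compile each pattern; log and skip failures instead of exiting."""
    compiled = []
    for pattern in patterns:
        try:
            compiled.append(re.compile(pattern))
        except re.error as exc:
            # Old behavior would be a fatal exit here; the fixed behavior
            # records the failure and carries on with the remaining patterns.
            log.warning("skipping pattern %r: %s", pattern, exc)
    return compiled


good = compile_patterns(["abc", "(unclosed", "x+y"])
print([c.pattern for c in good])  # ['abc', 'x+y']
```

Only the one bad pattern is dropped, so the rest of the rule set still loads.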

The other problem, the one causing the Signal 11, is potentially an issue around recent changes in the custom blocking plugin to accommodate libpfctl enhancements in FreeBSD. This bug is also manifesting in the Snort package because the custom blocking plugin used in the two packages shares much common code. The Signal 11 bug is still being actively investigated. I have experienced the bug three times now over the last 2 days testing on a CE 2.7.0-RELEASE machine. Unfortunately the actual cause has not been identified yet, but I'm searching. It takes a while for the bug to trigger (an hour and sometimes many hours). That's why I initially had difficulty reproducing it.

Actions #7

Updated by Bill Meeks about 1 year ago

This bug has likely been traced to the particular version of the libpfctl library bundled with pfSense CE 2.7.0, 2.7.1, and pfSense Plus 23.09. A fix for the libpfctl library package was submitted by its maintainer here: https://github.com/pfsense/FreeBSD-ports/commit/36019faf7b771be00808b184eda565f346c5ed5b.

Some additional code cleanup was done in the Suricata custom output plugin used on pfSense to implement Legacy Blocking Mode. The pull request containing those code fixes is awaiting review and merge here: https://github.com/pfsense/FreeBSD-ports/pull/1325.

After these changes are all merged and new packages are built, final confirmation testing can be performed.

Actions #8

Updated by Bill Meeks about 1 year ago

Pull request 1333 for the RELENG_2_7_2 branch of FreeBSD-ports has been submitted to address this issue.

https://github.com/pfsense/FreeBSD-ports/pull/1333

Actions #9

Updated by Jim Pingle about 1 year ago

  • Status changed from Confirmed to Resolved
  • % Done changed from 0 to 100

PRs merged, thanks!

Actions #10

Updated by Bill Meeks about 1 year ago

Additional update for this issue for a complete history:

Two additional heap buffer overflow bugs were recently discovered in the custom Legacy Blocking Module code used with Suricata on pfSense. Those overflows were found during testing with the LLVM AddressSanitizer (ASAN) tool enabled. It is highly likely these buffer overflows contributed to the Hyperscan bug and to other Signal 11 segfault bugs (including the one described in this issue) experienced when using Legacy Blocking Mode with Suricata 7.x. The newly identified bugs were fixed in this pull request: https://github.com/pfsense/FreeBSD-ports/pull/1337.

Actions #11

Updated by Jim Pingle about 1 year ago

PR merged, thanks!
