Bug #8449

closed

FRR 4.0 zebra daemon crashes

Added by Jim Pingle almost 6 years ago. Updated over 5 years ago.

Status: Resolved
Priority: High
Assignee:
Category: FRR
Target version:
Start date: 04/09/2018
Due date:
% Done: 0%
Estimated time:
Plus Target Version:
Affected Version: 2.4.4
Affected Plus Version:
Affected Architecture: All

Description

The zebra daemon in FRR 4.0 does not stay running when BGP is configured; it crashes on startup. An OSPF-only configuration appears to be unaffected.

Crash log from an SG-3100 that hits it (no backtrace could be captured):

: cat /var/tmp/quagga.zebra.crashlog
2017/12/21 14:13:18 ZEBRA: Assertion `node->lock > 0' failed in file table.c, line 201, function route_unlock_node
2017/12/21 14:13:18 ZEBRA: Cannot get backtrace, returned invalid # of frames -1 (valid range is between 1 and 20)
2017/12/21 14:13:18 ZEBRA: Current thread not known/applicable

Crash backtrace from an amd64 system:

2017/08/09 13:44:43 ZEBRA: Assertion `node->lock > 0' failed in file table.c, line 201, function route_unlock_node
2017/08/09 13:44:43 ZEBRA: Backtrace for 11 stack frames:
2017/08/09 13:44:43 ZEBRA: [bt 0] 0x8008cb958 <zlog_backtrace+0x28> at /usr/local/lib/libfrr.so.0
2017/08/09 13:44:43 ZEBRA: [bt 1] 0x8008cbed0 <_zlog_assert_failed+0xa0> at /usr/local/lib/libfrr.so.0
2017/08/09 13:44:43 ZEBRA: [bt 2] 0x8008bdd8e <route_unlock_node+0xfe> at /usr/local/lib/libfrr.so.0
2017/08/09 13:44:43 ZEBRA: [bt 3] 0x8008bcc78 <if_terminate+0x48> at /usr/local/lib/libfrr.so.0
2017/08/09 13:44:43 ZEBRA: [bt 4] 0x8008df063 <vrf_delete+0xa3> at /usr/local/lib/libfrr.so.0
2017/08/09 13:44:43 ZEBRA: [bt 5] 0x8008df795 <vrf_terminate+0x35> at /usr/local/lib/libfrr.so.0
2017/08/09 13:44:43 ZEBRA: [bt 6] 0x417f71 <zebra_zserv_socket_init+0x2731> at /usr/local/sbin/zebra
2017/08/09 13:44:43 ZEBRA: [bt 7] 0x8008d9007 <quagga_sigevent_process+0x47> at /usr/local/lib/libfrr.so.0
2017/08/09 13:44:43 ZEBRA: [bt 8] 0x8008ba06f <thread_fetch+0x7af> at /usr/local/lib/libfrr.so.0
2017/08/09 13:44:43 ZEBRA: [bt 9] 0x4183f7 <main+0x3e7> at /usr/local/sbin/zebra
2017/08/09 13:44:43 ZEBRA: [bt 10] 0x4135cf <_start+0x17f> at /usr/local/sbin/zebra
2017/08/09 13:44:43 ZEBRA: Current thread not known/applicable
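To compare call chains between crash logs, the `[bt N]` lines can be reduced to their function names. This is a rough sketch that assumes the line format shown in the amd64 log above; the regular expression and the `parse_frames` helper are illustrative only and are not part of FRR or pfSense.

```python
import re

# Matches backtrace lines in the format seen in the crash log, e.g.:
#   ... ZEBRA: [bt 2] 0x8008bdd8e <route_unlock_node+0xfe> at /usr/local/lib/libfrr.so.0
FRAME_RE = re.compile(
    r"\[bt (?P<index>\d+)\]\s+(?P<addr>0x[0-9a-fA-F]+)\s+"
    r"<(?P<symbol>[^+>]+)\+(?P<offset>0x[0-9a-fA-F]+)>\s+at\s+(?P<module>\S+)"
)


def parse_frames(log_text):
    """Return (index, symbol, module) tuples for each backtrace frame found."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((int(match["index"]), match["symbol"], match["module"]))
    return frames


# Three lines copied from the amd64 crash log in this report.
SAMPLE = """\
2017/08/09 13:44:43 ZEBRA: [bt 2] 0x8008bdd8e <route_unlock_node+0xfe> at /usr/local/lib/libfrr.so.0
2017/08/09 13:44:43 ZEBRA: [bt 3] 0x8008bcc78 <if_terminate+0x48> at /usr/local/lib/libfrr.so.0
2017/08/09 13:44:43 ZEBRA: [bt 4] 0x8008df063 <vrf_delete+0xa3> at /usr/local/lib/libfrr.so.0
"""

if __name__ == "__main__":
    for index, symbol, module in parse_frames(SAMPLE):
        print(f"{index:2d} {symbol:20s} {module}")
```

Lines without a `[bt N]` frame, such as the assertion message itself, are skipped.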

The system log message is the same in both cases: the process exits on signal 6.

Apr  9 11:50:16 river kernel: pid 40583 (zebra), uid 168: exited on signal 6
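For context, signal 6 on FreeBSD is SIGABRT, which is what a process raises when it aborts after a failed assertion. The failed check, `node->lock > 0`, guards a per-node lock (reference) count. The toy model below shows one way such a guard can trip, namely releasing a node more often than it was locked. It is written in Python for brevity, is not FRR's C code, and does not claim to identify the actual cause here; the `RouteNode` class and its method names are hypothetical.

```python
class RouteNode:
    """Toy stand-in for a routing-table node with a lock (reference) count."""

    def __init__(self):
        self.lock = 0

    def lock_node(self):
        """Take one reference on the node."""
        self.lock += 1
        return self

    def unlock_node(self):
        """Release one reference; refuse if none are held."""
        # Mirrors the failed check from the crash log: node->lock > 0.
        # Raised explicitly so the guard also works under `python -O`.
        if self.lock <= 0:
            raise AssertionError("node->lock > 0")
        self.lock -= 1


def unbalanced_unlock():
    """Take one reference, release it twice; return True if the guard fires."""
    node = RouteNode().lock_node()
    node.unlock_node()  # balanced release: lock goes 1 -> 0
    try:
        node.unlock_node()  # one release too many
    except AssertionError:
        return True
    return False
```

In the real daemon the equivalent failure is not caught; the process aborts, which matches the "exited on signal 6" kernel message.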

A forum user reports seeing signal 11 instead; still waiting to hear what hardware/setup they use:
https://forum.pfsense.org/index.php?topic=146410.0

cat /var/tmp/quagga.zebra.crashlog
ZEBRA: Received signal 11 at 1523178483 (si_addr 0x20); aborting...
Backtrace for 5 stack frames:
0x8008c9650 <zlog_backtrace_sigsafe+0x40> at /usr/local/lib/libfrr.so.0
0x8008c8e98 <zlog_signal+0x558> at /usr/local/lib/libfrr.so.0
0x8008dd344 <signal_init+0x244> at /usr/local/lib/libfrr.so.0
0x801596904 <pthread_sigmask+0x544> at /lib/libthr.so.3
0x801595e9f <pthread_getspecific+0xe2f> at /lib/libthr.so.3
no thread information available

It might be better to stay on FRR 3.0.x for the moment (perhaps create a net/frr3 for the GUI package to use, and keep frr at 4.0 so we can still work with it).

I didn't see any open issues on FRR's GitHub that matched.
