Bug #8022

closed

radvd receives SIGBUS on SG-3100 (ARM)

Added by Leif Huhn over 6 years ago. Updated over 6 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 10/27/2017
Due date:
% Done: 100%
Estimated time:
Plus Target Version:
Release Notes:
Affected Version: 2.4.1
Affected Architecture: SG-3100

Description

Hi,

I just received my first pfSense box, the SG-3100. I tried to set up IPv6 on the LAN and advertise the network under Services / DHCPv6 Server & RA / LAN / Router Advertisements. After saving the configuration on this page, dmesg shows many lines like these:

pid 31338 (radvd), uid 0: exited on signal 10 (core dumped)
pid 80493 (radvd), uid 0: exited on signal 10 (core dumped)
pid 83450 (radvd), uid 0: exited on signal 10 (core dumped)
...
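
For readers less familiar with FreeBSD signal numbering: signal 10 is SIGBUS on FreeBSD (on most Linux architectures, 10 is SIGUSR1 instead). A minimal sketch for decoding such dmesg lines follows; the name `parse_exit_line` and the small signal table are illustrative assumptions and are not part of pfSense or radvd.

```python
import re

# A few FreeBSD signal numbers; deliberately hardcoded, because Python's
# signal module would report the numbering of the machine running the script.
FREEBSD_SIGNALS = {6: "SIGABRT", 10: "SIGBUS", 11: "SIGSEGV"}

LINE_RE = re.compile(
    r"pid (\d+) \((\S+?)\), uid (\d+): exited on signal (\d+)( \(core dumped\))?"
)


def parse_exit_line(line):
    """Parse one kernel 'exited on signal' line; return None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    pid, name, uid, signo, core = match.groups()
    return {
        "pid": int(pid),
        "name": name,
        "uid": int(uid),
        "signal": FREEBSD_SIGNALS.get(int(signo), "signal " + signo),
        "core_dumped": core is not None,
    }


print(parse_exit_line("pid 31338 (radvd), uid 0: exited on signal 10 (core dumped)"))
```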

Here is one config I tried among many:

# Automatically Generated, do not edit
# Generated for DHCPv6 Server lan
interface mvneta1 {
        AdvSendAdvert on;
        MinRtrAdvInterval 5;
        MaxRtrAdvInterval 20;
        AdvLinkMTU 1500;
        AdvDefaultPreference medium;
        AdvManagedFlag on;
        AdvOtherConfigFlag on;
        prefix 2001:470:c279::/64 {
                DeprecatePrefix on;
                AdvOnLink on;
                AdvAutonomous off;
                AdvRouterAddr on;
                AdvValidLifetime 86400;
                AdvPreferredLifetime 14400;
        };
        route ::/0 {
                RemoveRoute on;
        };
        RDNSS 2001:470:c279::1 { };
        DNSSL localdomain  { };
};
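
As a quick sanity check on the timer values in this config: RFC 4861 requires MaxRtrAdvInterval to be between 4 and 1800 seconds, MinRtrAdvInterval to be between 3 seconds and 0.75 times MaxRtrAdvInterval, and the preferred lifetime not to exceed the valid lifetime. The sketch below encodes just those three rules. `check_ra_timers` is a hypothetical helper, not radvd's own validation code.

```python
def check_ra_timers(min_interval, max_interval, valid_lifetime, preferred_lifetime):
    """Return a list of problems; an empty list means the values are consistent."""
    problems = []
    if not 4 <= max_interval <= 1800:
        problems.append("MaxRtrAdvInterval must be between 4 and 1800 seconds")
    if not 3 <= min_interval <= 0.75 * max_interval:
        problems.append(
            "MinRtrAdvInterval must be between 3 seconds and 0.75 * MaxRtrAdvInterval"
        )
    if preferred_lifetime > valid_lifetime:
        problems.append("AdvPreferredLifetime must not exceed AdvValidLifetime")
    return problems


# Values taken from the config above.
print(check_ra_timers(5, 20, 86400, 14400))  # prints []
```

The values in the config pass all three checks, so out-of-range timers are unlikely to be what triggers the crash.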

It may be that radvd is misbehaving on the ARM platform: a process receives SIGBUS when it makes an unaligned memory access that the CPU does not support.
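
To illustrate what "unaligned" means here: a 32-bit field in a packed layout (as in many network packet formats) can land at an address that is not a multiple of 4, whereas a natively aligned layout inserts padding to avoid that. On ARM cores of this generation, multi-word load/store instructions require a 4-byte-aligned address. The sketch below uses Python's `struct` module only to show the offsets; `is_aligned` and `field_offset` are illustrative names, and this is not radvd code.

```python
import struct


def is_aligned(address, alignment=4):
    """True if address is a multiple of alignment."""
    return address % alignment == 0


def field_offset(prefix_format, field_code, byte_order):
    """Offset of a field of type field_code that follows prefix_format.

    A repeat count of zero makes struct pad up to that type's alignment;
    this only has an effect in native mode ("@").
    """
    return struct.calcsize(byte_order + prefix_format + "0" + field_code)


# One byte followed by a 32-bit unsigned integer.
packed = field_offset("B", "I", "<")  # "<" means packed, no padding
native = field_offset("B", "I", "@")  # "@" means native alignment

print(packed, is_aligned(packed))
print(native, is_aligned(native))
```

Reading the 32-bit field through a pointer at the packed offset is the kind of access that can fault on a strict-alignment CPU; copying the bytes into an aligned variable first avoids it.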

Actions #1

Updated by Jim Thompson over 6 years ago

  • Assignee set to Luiz Souza
  • Target version set to 2.4.2
  • Affected Version set to 2.4.1

Actions #2

Updated by Leif Huhn over 6 years ago

I'm trying to install gdb to debug this but when I run:

pkg add http://pkg.freebsd.org/FreeBSD:11:armv6/release_1/All/gdb-7.12.1_2.txz

and use gdb I get something like this:

Starting program: /usr/local/sbin/radvd -n -m logfile -l file
During startup program exited normally.

It never actually runs the program. Something must be wrong with how I'm installing or using gdb.

Actions #3

Updated by Jim Thompson over 6 years ago

Leif Huhn wrote:

I'm trying to install gdb to debug this but when I run:

pkg add http://pkg.freebsd.org/FreeBSD:11:armv6/release_1/All/gdb-7.12.1_2.txz

and use gdb I get something like this:

Starting program: /usr/local/sbin/radvd -n -m logfile -l file
During startup program exited normally.

What's happening is that radvd is forking a process and the parent is returning.

You'll need to set "-d #" to prevent the fork() call.

It never actually runs the program. Something must be wrong with how I'm installing or using gdb.

No, you just need a minor tweak to what you're trying.

try:

gdb radvd
(gdb) run -d 5 -n -m logfile -l file

Actions #4

Updated by Leif Huhn over 6 years ago

That doesn't seem to be it. -n is actually the option to prevent forking, and in fact gdb is unable to debug any programs for me. The message "During startup program exited normally" indicates that main was never reached.

Actions #5

Updated by Leif Huhn over 6 years ago

I don't know what is happening with gdb but I'm working around it by starting the process before entering gdb.

[2.4.1-RELEASE][admin@pfSense.localdomain]/root: /usr/local/bin/bash
[root@pfSense ~]# echo $BASHPID; kill -STOP $BASHPID; exec /usr/local/sbin/radvd -n -d 5 -C /var/etc/radvd.conf 
2331

Suspended (signal)
[2.4.1-RELEASE][admin@pfSense.localdomain]/root: gdb -p 2331
GNU gdb (GDB) 7.12.1 [GDB v7.12.1 for FreeBSD]
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying" 
and "show warranty" for details.
This GDB was configured as "armv6-portbld-freebsd11.0".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 2331
Reading symbols from /usr/local/bin/bash...(no debugging symbols found)...done.
Reading symbols from /lib/libncurses.so.8...Reading symbols from /usr/lib/debug//lib/libncurses.so.8.debug...done.
done.
Reading symbols from /usr/local/lib/libintl.so.8...(no debugging symbols found)...done.
Reading symbols from /lib/libc.so.7...Reading symbols from /usr/lib/debug//lib/libc.so.7.debug...done.
done.
Reading symbols from /libexec/ld-elf.so.1...Reading symbols from /usr/lib/debug//libexec/ld-elf.so.1.debug...done.
done.
[Switching to LWP 100149 of process 2331]
0x202634d4 in kill () from /lib/libc.so.7
(gdb) c
Continuing.
process 2331 is executing new program: /usr/local/sbin/radvd

Program received signal SIGBUS, Bus error.
0x000121c0 in ?? ()
(gdb) display/i $pc
2: x/i $pc
=> 0x121c0:     stm     r0, {r3, r4, r6}
(gdb) where
#0  0x000121c0 in ?? ()
#1  0x00000318 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) 
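
The workaround in the session above relies on two facts: a process that sends itself SIGSTOP waits until it is continued, and `exec` keeps the same PID, so a debugger attached during the stop follows the process into the new program. A minimal Python sketch of the same pattern is below. The function names are illustrative, and the `demo` helper uses SIGCONT from a parent process as a stand-in for gdb's `c` command.

```python
import os
import signal
import sys


def stop_then_exec(argv, announce=True):
    """Stop this process, then replace it with argv once it is resumed."""
    if announce:
        # Print the PID so a debugger can attach to it (gdb -p PID).
        print(os.getpid(), flush=True)
    # The process stays stopped here until gdb attaches and continues it,
    # or until something else sends SIGCONT.
    os.kill(os.getpid(), signal.SIGSTOP)
    # exec keeps the same PID, so an attached debugger follows the new program.
    os.execv(argv[0], argv)


def demo():
    """Fork a child that stops and execs; resume it from the parent."""
    pid = os.fork()
    if pid == 0:
        try:
            stop_then_exec([sys.executable, "-c", "pass"], announce=False)
        finally:
            os._exit(127)  # only reached if exec failed
    _, status = os.waitpid(pid, os.WUNTRACED)  # returns once the child has stopped
    was_stopped = os.WIFSTOPPED(status)
    os.kill(pid, signal.SIGCONT)  # stands in for gdb's "c"
    _, status = os.waitpid(pid, 0)  # returns once the exec'd program exits
    return was_stopped, os.WEXITSTATUS(status)
```

In interactive use, one would call `stop_then_exec` with the real command line (for example the radvd invocation from the session) and attach to the printed PID from a second shell.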

Actions #6

Updated by Leif Huhn over 6 years ago

Admittedly the above isn't terribly useful without symbols.

Actions #7

Updated by Leif Huhn over 6 years ago

Signs are pointing to a corrupt stack. From https://stackoverflow.com/questions/27577179/signal-sigbus-on-a-line-with-no-memory-access

"If you've somehow got nonsense into the SP, e.g. by popping a corrupted stack frame previously, circumstances could unfold thus:

The nonsense value in SP is not 4-byte aligned, so since the architecture doesn't allow unaligned load/store multiple, the STM results in an alignment fault (alignment faults are higher-priority than any other MMU fault)."

Actions #8

Updated by Leif Huhn over 6 years ago

It looks like the radvd version is fairly old:

: radvd -v
Version: 1.9.1

Compiled in settings:
  default config file           "/usr/local/etc/radvd.conf" 
  default pidfile               "/var/run/radvd.pid" 
  default logfile               "/var/log/radvd.log" 
  default syslog facility       24
Please send bug reports or suggestions to Pekka Savola <pekkas@netcore.fi>.

From the CHANGES file in the port:

2012/06/19      Fixing bashism '==' in configure
                Updating to 1.9.1 and releasing

Actions #9

Updated by Leif Huhn over 6 years ago

I compiled 2.17 from ports on raspi2 and it runs on the SG-3100 without SIGBUS.

Actions #10

Updated by Jim Thompson over 6 years ago

Leif Huhn wrote:

I compiled 2.17 from ports on raspi2 and it runs on the SG-3100 without SIGBUS.

We know the port is old. There are CARP patches, and we're still working out what to do with them in 2.17.

Actions #11

Updated by Dave Pugh over 6 years ago

Leif Huhn wrote:

I compiled 2.17 from ports on raspi2 and it runs on the SG-3100 without SIGBUS.

This issue is affecting me as well. Any chance you could post the re-compiled version of radvd somewhere that I could download it?

Actions #12

Updated by Leif Huhn over 6 years ago

Dave, I formatted over the memory card, but I bet this would work for you:

http://pkg.freebsd.org/FreeBSD:11:armv6/release_1/All/radvd-2.16.txz

Actions #13

Updated by Dave Pugh over 6 years ago

Leif Huhn wrote:

Dave, I formatted over the memory card, but I bet this would work for you:

http://pkg.freebsd.org/FreeBSD:11:armv6/release_1/All/radvd-2.16.txz

Thank you - the radvd binary no longer crashes for me, though it didn't actually let me do IPv6 on my LAN. Is everything else working for you now?

(I'm not sure it makes sense to hijack this bug report thread that relates specifically to the SIGBUS - if I should move this conversation elsewhere, say so. Though this does appear to be an issue specific to the SG-3100)

On my SG-3100 with the binary you provided, I now get a "can't join ipv6-allrouters on mvneta1" error a few times a minute, and IPv6 is not being provided to clients on my LAN (though IPv6 works fine from the SG-3100 itself when pinging external IPv6 addresses from the Diagnostics menu). This is with an out-of-the-box factory configuration, with nothing set beyond walking through the initial wizard. The same default factory configuration works on an older pfSense box I have, so I know IPv6 works with my ISP.

Here are the messages radvd prints at debug level 5 (with my IPv6 address obfuscated):
[Nov 07 23:52:13] radvd (51820): timer_handler called for mvneta1
[Nov 07 23:52:13] radvd (51820): ioctl(SIOCGIFFLAGS) succeeded on mvneta1
[Nov 07 23:52:13] radvd (51820): mvneta1 is up
[Nov 07 23:52:13] radvd (51820): mvneta1 is running
[Nov 07 23:52:13] radvd (51820): mvneta1 supports multicast
[Nov 07 23:52:13] radvd (51820): mtu for mvneta1 is 1500
[Nov 07 23:52:13] radvd (51820): link layer token length for mvneta1 is 48
[Nov 07 23:52:13] radvd (51820): prefix length for mvneta1 is 64
[Nov 07 23:52:13] radvd (51820): mvneta1 linklocal address: fe80::1:1
[Nov 07 23:52:13] radvd (51820): mvneta1 address: 2600:XXXX:XXXX:XXXX:XXXX:XXXX:XXXX:c242
[Nov 07 23:52:13] radvd (51820): mvneta1 address: fe80::1:1
[Nov 07 23:52:13] radvd (51820): can't join ipv6-allrouters on mvneta1
[Nov 07 23:52:13] radvd (51820): not sending RA for mvneta1, interface is not ready
[Nov 07 23:52:13] radvd (51820): send_ra_forall failed on interface mvneta1
[Nov 07 23:52:13] radvd (51820): mvneta1 next scheduled RA in 16 second(s)
[Nov 07 23:52:13] radvd (51820): polling for 16 second(s), next iface is mvneta1
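
For context on the "can't join ipv6-allrouters" line: a router advertisement daemon needs the interface to be a member of the all-routers multicast group (ff02::2) in order to receive router solicitations. The sketch below shows how such a join is typically requested through the sockets API, by passing an `ipv6_mreq` (16-byte group address plus interface index) to `IPV6_JOIN_GROUP`. It is not radvd's implementation; `build_ipv6_mreq` and `join_all_routers` are illustrative names.

```python
import socket
import struct

ALL_ROUTERS = "ff02::2"  # link-local all-routers multicast address


def build_ipv6_mreq(group, ifindex):
    """Pack struct ipv6_mreq: 16-byte group address, then the interface index."""
    return socket.inet_pton(socket.AF_INET6, group) + struct.pack("@I", ifindex)


def join_all_routers(sock, ifname):
    """Ask the kernel to join ff02::2 on ifname; raises OSError on failure."""
    ifindex = socket.if_nametoindex(ifname)
    mreq = build_ipv6_mreq(ALL_ROUTERS, ifindex)
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_JOIN_GROUP, mreq)


# Building the request needs no privileges or network access.
print(len(build_ipv6_mreq(ALL_ROUTERS, 1)))  # prints 20
```

Calling `join_all_routers` with an IPv6 socket and the interface name (for example "mvneta1") would surface the underlying OSError, which could help narrow down why the join fails on a given box.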

Actions #14

Updated by Leif Huhn over 6 years ago

Dave, I experienced exactly the same thing you did. I don't know how to fix it. I was only setting up IPv6 for fun, so I haven't pursued it further.

Actions #15

Updated by Luiz Souza over 6 years ago

  • Status changed from New to Feedback
  • % Done changed from 0 to 100

The package was upgraded to the recent upstream version (2.17), only for us to discover that the original bug was still present.

The bug was tracked down and fixed; the new (and working) version will be available in the next 2.4.2 snapshot.

Thanks.

Actions #16

Updated by Random User over 6 years ago

Luiz Souza wrote:

the new (and working) version will be available in the next 2.4.2 snapshot.

That commit rendered RADVD completely no-op. You might want to undo that urgently.

https://github.com/pfsense/FreeBSD-ports/commit/276e39e102a57ed209043c7ff4f69b5519e1f717
https://forum.pfsense.org/index.php?topic=139828.msg764163#msg764163
https://forum.pfsense.org/index.php?topic=139797.0

Actions #17

Updated by Daryl Morse over 6 years ago

Random User wrote:

Luiz Souza wrote:

the new (and working) version will be available in the next 2.4.2 snapshot.

That commit rendered RADVD completely no-op. You might want to undo that urgently.

https://github.com/pfsense/FreeBSD-ports/commit/276e39e102a57ed209043c7ff4f69b5519e1f717
https://forum.pfsense.org/index.php?topic=139828.msg764163#msg764163
https://forum.pfsense.org/index.php?topic=139797.0

I second this. RADVD is dead in the water.

Actions #18

Updated by Luiz Souza over 6 years ago

The regression was fixed.

Thanks for reporting.

Actions #19

Updated by Daryl Morse over 6 years ago

Luiz Souza wrote:

The regression was fixed.

Thanks for reporting.

Updated, looks good.

Actions #20

Updated by Renato Botelho over 6 years ago

  • Status changed from Feedback to Resolved

Actions #21

Updated by Dave Pugh over 6 years ago

I also just updated to 2.4.2.a.20171116.0841 and IPv6 looks to be working on my SG-3100. My client machines are getting IPv6 addresses now.
