Bug #5715

Rare lighttpd crash, core dump, when loading login page or on initial visit

Added by Jim Pingle over 3 years ago. Updated over 3 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: Web Interface
Target version:
Start date: 12/30/2015
Due date:
% Done: 100%
Estimated time:
Affected Version: 2.3
Affected Architecture: amd64

Description

Hitting a rare crash in lighttpd; details are still sketchy. After the system has been idle for a while, the GUI fails to load the login page. The crash happens when the browser attempts to load the login page.

Three examples:

Dec 23 08:13:03 clara kernel: pid 46023 (lighttpd), uid 0: exited on signal 10 (core dumped)

Dec 23 19:45:06 fw1 kernel: pid 25778 (lighttpd), uid 0: exited on signal 10 (core dumped)

Dec 29 08:38:40 jack kernel: pid 77860 (lighttpd), uid 0: exited on signal 10 (core dumped)

Restarting lighttpd brought it back to a working state in each case. No log entries before or after that appear to be connected.

All three systems were running 2.3 snapshots that were current at the time, on amd64, using HTTPS.

History

#1 Updated by Chris Buechler over 3 years ago

probably don't need to do anything with this one because of #5719

#2 Updated by Renato Botelho over 3 years ago

  • Status changed from Confirmed to Closed
  • % Done changed from 0 to 100

lighttpd is out of the game, it's nginx era now

#3 Updated by Orion Poplawski over 3 years ago

I'm seeing this pretty regularly trying to configure a brand new SG-4860 with 2.2.6. So I don't think we're quite into the nginx era yet...

#4 Updated by Phillip Davis over 3 years ago

The issue is resolved in the new 2.3-BETA - which is in BETA :) so that is why it is marked resolved here.

#5 Updated by Chris Buechler over 3 years ago

Orion: this is a very unusual circumstance. If you're hitting something repeatedly, it's probably something different. Please get in touch with us via support. We're not fixing this because it's so rare, and lighttpd doesn't exist in the next release.

#6 Updated by Jim Thompson over 3 years ago

Signal 10 is "Bus Error".

Causes include: invalid address alignment (accessing a multi-byte value at an odd address), accessing a physical address that does not correspond to any device, or some other device-specific hardware error. A bus error triggers a processor-level exception which Unix translates into a "SIGBUS" signal which, if not caught, will terminate the current process.

This can be caused by hardware error, or memory problems.

IIRC, lighttpd can use mmap and that's the source of a number of issues. Google for "lighttpd bus error" and see.

I think we should ensure that a) your hardware is OK, and b) the image on disk is OK. As Chris states, best to get in touch with support.

#7 Updated by Guy Baconniere over 3 years ago

Since we updated our pfSense from 2.1.x to 2.2.x (currently 2.2.6),
lighttpd is crashing randomly when I try to edit an alias with a lot
of networks (like 5 pages) and Save, Apply Changes, Edit it again a
couple of times.

I don't think it's hardware or memory related as it worked perfectly
with 2.1.x.

The workaround to re-enable the webConfigurator GUI is to SSH to the pfSense
and choose 11 (Restart webConfigurator) OR 8 (Shell) on the main menu and type
/usr/local/sbin/lighttpd -f /var/etc/lighty-webConfigurator.conf
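
The manual restart above could be wrapped in a small check, sketched here in Python. This is only an illustration, not something shipped with pfSense: the helper names `is_running` and `ensure_running` are hypothetical, the sketch assumes `pgrep` is available, and it reuses the lighttpd command line quoted above.

```python
import subprocess

# Command taken verbatim from the workaround above.
LIGHTTPD_CMD = [
    "/usr/local/sbin/lighttpd",
    "-f",
    "/var/etc/lighty-webConfigurator.conf",
]


def is_running(name="lighttpd", run=subprocess.run):
    """Return True if a process with exactly this name exists.

    Assumption: pgrep is installed; it exits 0 when at least one process matches.
    """
    result = run(
        ["pgrep", "-x", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def ensure_running(check=is_running, start=None):
    """Start lighttpd only if it is not running.

    Returns True if a start was attempted, False if nothing was done.
    """
    if check():
        return False
    if start is None:
        # Default action: run the same command as the manual workaround.
        def start():
            subprocess.run(LIGHTTPD_CMD, check=True)
    start()
    return True
```

Calling `ensure_running()` from a root shell would then restart the web server only when it has died. The `check` and `start` parameters exist so the logic can be exercised without touching a real lighttpd process.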

lighttpd crashes over a couple of hours of editing (a pain):

pid 86866 (lighttpd), uid 0: exited on signal 10 (core dumped)
pid 71868 (lighttpd), uid 0: exited on signal 10 (core dumped)
pid 73118 (lighttpd), uid 0: exited on signal 10 (core dumped)
pid 20066 (lighttpd), uid 0: exited on signal 10 (core dumped)
pid 71011 (lighttpd), uid 0: exited on signal 6 (core dumped)
pid 4237 (lighttpd), uid 0: exited on signal 10 (core dumped)
pid 68590 (lighttpd), uid 0: exited on signal 10 (core dumped)
pid 24707 (lighttpd), uid 0: exited on signal 10 (core dumped)
pid 29717 (lighttpd), uid 0: exited on signal 10 (core dumped)
pid 49046 (lighttpd), uid 0: exited on signal 10 (core dumped)
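
To see at a glance which signals are involved, exit lines like the ones above can be tallied with a short script. The following is a minimal sketch: the function name `tally_signals` and the sample list are illustrative, and the signal table covers only the FreeBSD numbers relevant here (6 is SIGABRT, 10 is SIGBUS, 11 is SIGSEGV).

```python
import re
from collections import Counter

# FreeBSD signal numbers relevant to this ticket (not a complete table).
FREEBSD_SIGNALS = {6: "SIGABRT", 10: "SIGBUS", 11: "SIGSEGV"}

# Matches kernel messages such as:
#   pid 86866 (lighttpd), uid 0: exited on signal 10 (core dumped)
EXIT_RE = re.compile(r"pid (\d+) \(([^)]+)\), uid (\d+): exited on signal (\d+)")


def tally_signals(lines, process="lighttpd"):
    """Count exit signals for the given process name across log lines."""
    counts = Counter()
    for line in lines:
        match = EXIT_RE.search(line)
        if match and match.group(2) == process:
            number = int(match.group(4))
            # Fall back to the raw number for signals not in the table.
            counts[FREEBSD_SIGNALS.get(number, f"signal {number}")] += 1
    return counts


# Sample lines copied from this ticket, with and without a syslog prefix.
sample = [
    "pid 86866 (lighttpd), uid 0: exited on signal 10 (core dumped)",
    "pid 71011 (lighttpd), uid 0: exited on signal 6 (core dumped)",
    "Dec 23 08:13:03 clara kernel: pid 46023 (lighttpd), uid 0: exited on signal 10 (core dumped)",
]
print(tally_signals(sample))
```

In practice the lines would come from the system log rather than a hard-coded list; the regular expression uses `search`, so a leading timestamp and hostname do not matter.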

Why not use nginx with php5-fpm and FastCGI?

#8 Updated by Jim Pingle over 3 years ago

Please read the ticket updates before yours. On pfSense 2.3 we are using nginx. That work isn't being backported to 2.2.x, and there will be no more 2.2.x releases. pfSense 2.3 is in RC right now. Given your situation, it is likely more stable than what you have with 2.2.x, unless your hardware really does have an issue.
