Project

General

Profile

Bug #7256

syslogd is not running after installing or uninstalling a package with logging (e.g. tinc, haproxy)

Added by Kill Bill about 1 month ago. Updated 16 days ago.

Status:
Confirmed
Priority:
Normal
Category:
Logging
Target version:
Start date:
02/13/2017
Due date:
% Done:

0%

Affected version:
2.3.x
Affected Architecture:
All

Description

Suspect it's related to Feature #4898. Last log entry:

Feb 12 14:17:54 gw php: /etc/rc.packages: Successfully installed package: haproxy.
Feb 12 14:17:54 gw kernel: done.

Reproduced on multiple boxes.

Associated revisions

Revision 8724b1ad
Added by Jim Pingle 16 days ago

Some small improvements to help with ticket #7256

They don't fix the problem but avoid some pitfalls that could contribute to it.

History

#1 Updated by Kill Bill about 1 month ago

Here's some more context:

Feb 12 14:16:27 gw php: /etc/rc.packages: The command '/usr/local/etc/rc.d/haproxy.sh stop' returned exit code '1', the output was 'Stopping haproxy. Stopping haproxy. No matching processes were found'
Feb 12 14:16:27 gw kernel: done.
Feb 12 14:16:36 gw check_reload_status: Syncing firewall
Feb 12 14:17:08 gw syslogd: exiting on signal 15
Feb 12 14:17:50 gw syslogd: kernel boot file is /boot/kernel/kernel
Feb 12 14:17:54 gw check_reload_status: Syncing firewall
Feb 12 14:17:54 gw php: /etc/rc.packages: Successfully installed package: haproxy.
Feb 12 14:17:54 gw kernel: done.

So, the first line would be the old haproxy version uninstall. Then a firewall is resynced, then syslogd gets a SIGTERM, firewall is resynced again on package new version install, and then the package install completes and syslogd is gone. Hmmmm... not exactly good.

EDIT: Again the same thing with upgrade to 0.52_6:

$ ps ax | grep 'syslogd'
72518  -  Ss       0:37.46 /usr/sbin/syslogd -s -c -c -l /var/dhcpd/var/run/log -l /tmp/haproxy_chroot/var/run/log -P /var/run/syslog.pid -f /var/etc/syslog.conf -b 10.0.0.254

After upgrade:

$ ps ax | grep 'syslogd'

System log:

Feb 14 14:58:58 gw php: /etc/rc.packages: The command '/usr/local/etc/rc.d/haproxy.sh stop' returned exit code '1', the output was 'Stopping haproxy. Stopping haproxy. No matching processes were found'
Feb 14 14:59:05 gw check_reload_status: Syncing firewall
Feb 14 14:59:34 gw syslogd: exiting on signal 15
Feb 14 14:59:34 gw syslogd: kernel boot file is /boot/kernel/kernel
Feb 14 15:00:07 gw syslogd: kernel boot file is /boot/kernel/kernel
Feb 14 15:00:12 gw check_reload_status: Syncing firewall
Feb 14 15:00:12 gw php: /etc/rc.packages: Successfully installed package: haproxy.

#2 Updated by Jim Thompson about 1 month ago

  • Assignee set to Jim Pingle
  • Priority changed from High to Normal

#3 Updated by Jim Pingle 21 days ago

  • Subject changed from (Re)installing haproxy kills syslogd to syslogd is not running after installing or uninstalling a package with logging (e.g. tinc, haproxy)
  • Target version set to 2.4.0

Was this on 2.3.2 or 2.3.3?

It also affects tinc, happens every time it is added or removed on 2.4

#5 Updated by Jim Pingle 21 days ago

  • Assignee changed from Jim Pingle to Renato Botelho

I thought it might be due to it running twice in some cases but even only being run once, for some reason it's not starting it back up. I changed the code to write out the command and it works when run by hand but not when installing or uninstalling these packages.

tinc does have a command in its install script to create/clear the log file which is unnecessary but it doesn't appear to be affecting this problem.

syslogd is being restarted in both the deinstall and post-deinstall stage of package removal but it should only be run during post-deinstall as at least with haproxy the settings are not fully removed during the deinstall step so for example it still tries to use the haproxy log socket at that point.

I can take another stab at it tomorrow but the more eyes on it the better.

It happens from the GUI or the shell so it's easy to see the problem by running commands in the shell:

ps uxaww | grep '[s]yslogd'
pkg install -y pfSense-pkg-tinc
ps uxaww | grep '[s]yslogd'
pkg delete -y tinc pfSense-pkg-tinc
ps uxaww | grep '[s]yslogd'

#6 Updated by Jim Pingle 21 days ago

  • Status changed from New to Confirmed

#7 Updated by Jim Pingle 16 days ago

Worked on this a bit more but still haven't nailed down the reason. It is executing the command to run syslog, it's just not staying up/running. No errors, debug mode or foreground mode for syslog shows nothing at all in the output. If I run the same command by hand, it works.

This seems vaguely familiar, similar to what we saw in years past with sigprocmask and ntp but toggling that made no difference either.

A snippet of the truss output just after syslogd was run:

58519: read(9,"<118>Mar  7 12:56:28 snmpd[58486"...,1023) = 237 (0xed)
58519: sigprocmask(SIG_BLOCK,{ SIGHUP|SIGALRM },{ }) = 0 (0x0)
58519: sigprocmask(SIG_SETMASK,{ },{ SIGHUP|SIGALRM }) = 0 (0x0)
58519: sigprocmask(SIG_BLOCK,{ SIGHUP|SIGALRM },{ }) = 0 (0x0)
58519: sigprocmask(SIG_SETMASK,{ },{ SIGHUP|SIGALRM }) = 0 (0x0)
58519: sigprocmask(SIG_BLOCK,{ SIGHUP|SIGALRM },{ }) = 0 (0x0)
58519: sigprocmask(SIG_SETMASK,{ },{ SIGHUP|SIGALRM }) = 0 (0x0)
58519: read(9,0x7fffffffe150,1023)         ERR#35 'Resource temporarily unavailable'
58519: select(10,{ 4 5 6 9 },0x0,0x0,{ 0.000000 }) = 0 (0x0)
58519: kill(58406,SIGALRM)             = 0 (0x0)
58406: fork()                     = 58519 (0xe497)
58406: SIGNAL 14 (SIGALRM)
58406: process killed, signal = 14
58273: fork()                     = 58406 (0xe426)
58273: write(1,"58406\n",6)             = 6 (0x6)
58273: exit(0x0)                
58273: process exit, rval = 0
56273: read(9,"58406\n",8192)             = 6 (0x6)
56273: read(9,0x80eeef810,8192)             = 0 (0x0)

Some of that is nohup finishing up running the command and passing back the PID.

Also available in: Atom PDF