Bug #196: remote syslog does not work after reboot - pfSense - pfSense bugtracker

Bug #196

closed

remote syslog does not work after reboot

Added by Dan Swartzendruber over 15 years ago. Updated over 15 years ago.

Status: Resolved
Priority: Normal
Target version: 1.2.3
Start date: 11/29/2009
Affected Version: 1.2.3

Description

If you enable remote syslogging, it works fine until you reboot pfSense, at which point no messages are ever sent to the remote server. Doing anything that restarts syslog (for example, clicking the 'save' button on the syslog settings page) gets it working again. What I believe is happening is this: syslog is started before the LAN interface is brought up, so when it tries to create and bind the UDP socket, it gets "Network unreachable" and gives up. This is easily reproducible.

Updated by Chris Buechler over 15 years ago

Status changed from New to Feedback

This isn't universally true; I have multiple boxes that use remote syslog on 1.2.3 and work fine after rebooting: a couple logging to a LAN host and several to a host across IPsec, all without touching the box after booting. The can't-bind theory isn't legit: there is no binding to a remote UDP service. It's stateless, and the firewall has no clue whether or not it can actually reach that syslog server; it just keeps pushing bits. This is especially true in the case of IPsec, where you have to start logging well before you start IPsec.

There may be an edge case in combination with something else, but I can't replicate any problems. We'll need more specific details on how to replicate.

Updated by Dan Swartzendruber over 15 years ago

That is odd, then. As for binding, yes and no. I reread the log message, and the complaint is about sendto(), not bind(). Notwithstanding that, it still does call bind(), since that is how it sets the source port to 514. As for how to repro, I confirmed this on a VirtualBox VM: I installed a recent 1.2.3RC3, set it to log to a host on my LAN, and saw the messages. I rebooted and saw no messages. I then looked at /var/log/system.log and saw the network unreachable error.
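
To make the bind()/sendto() distinction concrete, here is a minimal sketch (not taken from syslogd) of a UDP sender that binds a local source port before sending. The names `open_bound_udp_socket` and `send_datagram` are made up for illustration. Binding to port 514 normally requires root, so the port is a parameter; passing 0 lets the kernel pick one. The bind is purely local, so a missing route to the destination would only show up later, as the errno from sendto().

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a UDP socket bound to the given local port on all addresses.
 * Returns the descriptor, or -1 with errno set. */
int open_bound_udp_socket(uint16_t port)
{
    struct sockaddr_in local;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    local.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0) {
        int e = errno;          /* preserve errno across close() */
        close(fd);
        errno = e;
        return -1;
    }
    return fd;
}

/* Send one datagram. Returns 0 on success, or the errno of the failure
 * (for example ENETUNREACH when there is no route to the destination). */
int send_datagram(int fd, const char *ip, uint16_t port, const char *msg)
{
    struct sockaddr_in dst;
    size_t len = strlen(msg);

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
        return EINVAL;          /* not a valid dotted-quad address */
    if (sendto(fd, msg, len, 0, (struct sockaddr *)&dst, sizeof(dst)) != (ssize_t)len)
        return errno;
    return 0;
}
```

Calling `send_datagram(fd, "10.0.0.1", 514, "<14>test")` on a box with no route to that address would be one way to see which errno the kernel actually reports; the address and message here are only examples.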

Updated by Chris Buechler over 15 years ago

Sounds like syslog fails binding to the local address, as if something else were already bound to 514. What's the exact message? Any packages or anything else non-default installed?

Updated by Dan Swartzendruber over 15 years ago

Yes, I have a number of packages loaded (squid, havp, etc.). I have been looking at the source on cvsweb, and I think I see the bug (not sure why it doesn't always happen). Here is the code that runs after we get an error from sendto:

            dprintf("lsent/l: %d/%d\n", lsent, l);
            if (lsent != l) {
                int e = errno;
                logerror("sendto");
                errno = e;
                switch (errno) {
                case ENOBUFS:
                case ENETDOWN:
                case EHOSTUNREACH:
                case EHOSTDOWN:
                    break;
                /* case EBADF: */
                /* case EACCES: */
                /* case ENOTSOCK: */
                /* case EFAULT: */
                /* case EMSGSIZE: */
                /* case EAGAIN: */
                /* case ENOBUFS: */
                /* case ECONNREFUSED: */
                default:
                    dprintf("removing entry\n");
                    f->f_type = F_UNUSED;
                    break;
                }

Note that ENETDOWN, EHOSTUNREACH and EHOSTDOWN are all non-fatal, whereas the default case is fatal and removes this destination from the list. Unfortunately, ENETUNREACH is not listed, and therefore falls into the default case. If this is wrong, it is in FreeBSD code, and not anything we can really fix, no? I guess I could work around this if there were some file, like /etc/rc.local in Linux, in which to put customizations?
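
As a rough illustration of the classification described above (this is not the syslogd source, and not necessarily the fix that was eventually applied), the check can be pulled out into a small helper. The function name `is_transient_send_error` is hypothetical; the only change relative to the quoted switch is the assumption that `ENETUNREACH` should join the non-fatal group:

```c
#include <errno.h>

/*
 * Hypothetical helper (not part of syslogd): returns 1 if a sendto()
 * failure should be treated as temporary, so the forwarding entry is kept,
 * and 0 if the entry should be dropped, as in the quoted default case.
 */
int is_transient_send_error(int err)
{
    switch (err) {
    case ENOBUFS:
    case ENETDOWN:
    case ENETUNREACH:   /* assumed addition: the case missing from the quoted code */
    case EHOSTUNREACH:
    case EHOSTDOWN:
        return 1;
    default:
        return 0;
    }
}
```

With a helper like this, the quoted branch would mark the entry `F_UNUSED` only when the helper returns 0, so a send attempted before the network is up would not permanently remove the remote destination.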

Updated by Dan Swartzendruber over 15 years ago

Hmmm, on the other hand, I don't think it is package related: my VirtualBox repro is a barebones "quick install" with nothing else done before I reproduced this.

Updated by Chris Buechler over 15 years ago

Status changed from Feedback to New
Target version set to 1.2.3

Confirmed issue, after upgrading to a snapshot from today. This is something that's changed in the past month, except virtually nothing has changed in RELENG_1_2 in that time. Looking into it.

Updated by Dan Swartzendruber over 15 years ago

Definitely something odd here. I booted the VM and, once it was up, shut down the WAN Ethernet interface from the console (for the VM, that is how it reaches the syslog server on my LAN). This removes the default gateway, so it can no longer reach 10.0.0.1 (my syslog server). I then ran 'ping 10.0.0.1' and saw "Network is down", which is ENETDOWN, which syslog treats as transient. It did NOT get ENETUNREACH.

Updated by Jim Pingle over 15 years ago

I can reproduce this in a VM and on a real test box. Adding to the weirdness, from my VM I actually do get log messages through just after the network error (re: tcpdump starting on pflog0) but nothing before or after that point. I get nothing from my test box. Both the VM and test box were updated this morning.

It does start to log again if I send a HUP to syslogd.
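
For anyone wanting to script that workaround, a minimal sketch of sending the HUP programmatically follows. It assumes syslogd records its PID in a pid file; the helper names `read_pid_file` and `hup_from_pid_file` are made up for this example, and the pid file location is passed in by the caller rather than hard-coded.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

/* Read a PID from a pid file. Returns -1 if the file is unreadable
 * or does not start with a positive number. */
long read_pid_file(const char *path)
{
    long pid = -1;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%ld", &pid) != 1 || pid <= 0)
        pid = -1;
    fclose(fp);
    return pid;
}

/* Send SIGHUP to the process named in the pid file.
 * Returns 0 on success, -1 on failure. */
int hup_from_pid_file(const char *path)
{
    long pid = read_pid_file(path);

    if (pid <= 0)
        return -1;
    return kill((pid_t)pid, SIGHUP);
}
```

On a stock FreeBSD-based system the call would probably be `hup_from_pid_file("/var/run/syslog.pid")`, but that path is an assumption and should be checked on the box in question.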

I see that Scott adjusted some patches to see if one may have been the cause, hopefully the next snapshot will give us some hints.
