Bug #196
closed
remote syslog does not work after reboot
Description
If you enable remote syslogging, it works fine until you reboot pfSense, at which point no messages are ever sent to the remote server. If you do anything to restart syslog (for example, clicking the 'save' button on the syslog settings page), it starts working again. What I believe is happening is this: syslog is started before the LAN interface is brought up, so when it tries to create and bind the UDP socket, it gets "Network unreachable" and gives up. This is easily reproducible.
Updated by Chris Buechler almost 15 years ago
- Status changed from New to Feedback
This isn't universally true. I have multiple boxes that use remote syslog on 1.2.3 and work fine after rebooting: a couple log to a LAN host, several to a host across IPsec, and they all work fine without touching the box after booting. The can't-bind theory isn't legit. There is no binding to a remote UDP service; it's stateless, and the firewall has no clue whether or not it can actually reach that syslog server, it just keeps pushing bits. This is especially true in the case of IPsec, since you have to start logging well before you start IPsec.
There may be an edge case in combination with something else, but I can't replicate any problems. We'll need more specific details on how to replicate.
Updated by Dan Swartzendruber almost 15 years ago
That is odd then. As far as binding goes, yes and no. I reread the log message, and the complaint is about sendto(), not bind(). Notwithstanding that, it still does call bind(), since that is how it causes the source port to be 514. As for how to repro, I confirmed this on a VirtualBox VM. I installed a recent 1.2.3RC3, set it to log to a host on my LAN, and saw the messages. Rebooted, and saw no messages. Looked at /var/log/system.log and saw the network unreachable error.
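To make the bind()/sendto() distinction concrete, here is a minimal sketch in C. It is not taken from syslogd; the helper name send_one_datagram and its parameters are hypothetical. It binds a UDP socket to a local source port, sends one datagram, and returns the errno of whichever call failed. The point is that bind() only fixes the local address and port, while an unreachable destination is reported by sendto().

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from syslogd: bind a UDP socket to a
 * local source port, send one datagram, and return 0 on success or the
 * errno value of the call that failed.
 */
int
send_one_datagram(const char *dst_ip, unsigned short dst_port,
    unsigned short src_port, const char *msg)
{
	struct sockaddr_in local, remote;
	ssize_t sent;
	int fd, err = 0;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return (errno);

	/* bind() only fixes the local address and source port. */
	memset(&local, 0, sizeof(local));
	local.sin_family = AF_INET;
	local.sin_addr.s_addr = htonl(INADDR_ANY);
	local.sin_port = htons(src_port);
	if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0) {
		err = errno;
		close(fd);
		return (err);
	}

	memset(&remote, 0, sizeof(remote));
	remote.sin_family = AF_INET;
	remote.sin_port = htons(dst_port);
	if (inet_pton(AF_INET, dst_ip, &remote.sin_addr) != 1) {
		close(fd);
		return (EINVAL);
	}

	/* Reachability problems, if any, are reported by this call. */
	sent = sendto(fd, msg, strlen(msg), 0,
	    (struct sockaddr *)&remote, sizeof(remote));
	if (sent < 0)
		err = errno;
	close(fd);
	return (err);
}
```

Passing 0 as src_port lets the kernel pick an ephemeral port; binding to 514 as syslogd does normally requires root. Calling this early in boot and again once the interfaces are up would be one way to see which errno the kernel actually returns.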
Updated by Chris Buechler almost 15 years ago
Sounds like syslog fails binding to the local address, as if something else were already bound to 514. What's the exact message? Any packages or anything else non-default installed?
Updated by Dan Swartzendruber almost 15 years ago
Yes, I have a number of packages loaded (squid, havp, etc.). I have been looking at the source on cvsweb, and I think I see the bug (not sure why it doesn't always happen). Here is the code that runs after we get an error from sendto:
    dprintf("lsent/l: %d/%d\n", lsent, l);
    if (lsent != l) {
        int e = errno;
        logerror("sendto");
        errno = e;
        switch (errno) {
        case ENOBUFS:
        case ENETDOWN:
        case EHOSTUNREACH:
        case EHOSTDOWN:
            break;
        /* case EBADF: */
        /* case EACCES: */
        /* case ENOTSOCK: */
        /* case EFAULT: */
        /* case EMSGSIZE: */
        /* case EAGAIN: */
        /* case ENOBUFS: */
        /* case ECONNREFUSED: */
        default:
            dprintf("removing entry\n");
            f->f_type = F_UNUSED;
            break;
        }
    }
Note that ENETDOWN, EHOSTUNREACH, and EHOSTDOWN are all non-fatal, whereas the default case is fatal and removes this destination from the list. Unfortunately, ENETUNREACH is not listed and therefore falls into the default case. If this is wrong, though, it is in FreeBSD code and not anything we can really fix, no? I guess I could work around this if there were some file like /etc/rc.local in Linux where I could put customizations?
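As an illustration of the classification described above, here is a small sketch of the same switch written as a standalone function, with ENETUNREACH added to the non-fatal group. The function name sendto_error_is_transient is hypothetical, and the extra case is an assumption made for illustration only; nothing in this ticket says that this is how the problem was eventually fixed.

```c
#include <errno.h>

/*
 * Hypothetical classifier, modeled on the switch quoted above.
 * Returns 1 if the destination should be kept after a failed sendto(),
 * 0 if it should be dropped (the "removing entry" path).
 */
int
sendto_error_is_transient(int err)
{
	switch (err) {
	case ENOBUFS:
	case ENETDOWN:
	case EHOSTUNREACH:
	case EHOSTDOWN:
	case ENETUNREACH:	/* assumption: added for illustration only */
		return (1);
	default:
		return (0);
	}
}
```

With this grouping, a "network unreachable" failure at boot would leave the destination in place, so later messages would still be attempted once the network is up.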
Updated by Dan Swartzendruber almost 15 years ago
Hmmm, on the other hand, I don't think it is package related. My VirtualBox repro is a barebones "quick install" with nothing else done before I reproduce this.
Updated by Chris Buechler almost 15 years ago
- Status changed from Feedback to New
- Target version set to 1.2.3
Confirmed issue, after upgrading to a snapshot from today. This is something that's changed in the past month, except virtually nothing has changed in RELENG_1_2 in that time. Looking into it.
Updated by Dan Swartzendruber almost 15 years ago
Definitely something odd here. I booted the VM and, once it was up, shut down the WAN Ethernet interface from the console (for the VM, that is how it reaches the syslog server on my LAN). This gets rid of the default gateway, so it can no longer reach 10.0.0.1 (my syslog server). I then ran 'ping 10.0.0.1' and saw "Network is down", which is ENETDOWN, an error that syslogd treats as transient. It did NOT get ENETUNREACH.
Updated by Jim Pingle almost 15 years ago
I can reproduce this in a VM and on a real test box. Adding to the weirdness, from my VM I actually do get log messages through just after the network error (re: tcpdump starting on pflog0) but nothing before or after that point. I get nothing from my test box. Both the VM and test box were updated this morning.
It does start to log again if I send a HUP to syslogd.
I see that Scott adjusted some patches to see if one may have been the cause, hopefully the next snapshot will give us some hints.
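The HUP workaround mentioned above could be scripted along these lines. This is only a sketch: the helper name hup_from_pidfile is hypothetical, and the pidfile location is an assumption that should be checked on the system in question (the FreeBSD syslogd default is believed to be /var/run/syslog.pid).

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: read a PID from pidfile and send it SIGHUP.
 * Returns 0 on success, -1 if the file cannot be read, the PID is
 * implausible, or kill() fails.
 */
int
hup_from_pidfile(const char *pidfile)
{
	FILE *fp;
	long pid;
	int ok;

	fp = fopen(pidfile, "r");
	if (fp == NULL)
		return (-1);
	ok = (fscanf(fp, "%ld", &pid) == 1);
	fclose(fp);

	/* Refuse to signal PID 0, PID 1, or negative process groups. */
	if (!ok || pid <= 1)
		return (-1);
	return (kill((pid_t)pid, SIGHUP));
}
```

Called as hup_from_pidfile("/var/run/syslog.pid") with root privileges once the interfaces are up, this would make syslogd reread its configuration, which per the report above is enough to get remote logging going again.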
Updated by Jim Pingle almost 15 years ago
This still happens even with a snapshot from Dec 1st, so it doesn't appear the patch changes made a difference.
Updated by Scott Ullrich almost 15 years ago
- Status changed from New to Feedback
Updated by Chris Buechler almost 15 years ago
- Status changed from Feedback to Resolved
Fixed in 1.2.3, verified it also works in 2.0.