Bug #1397
Closed
ntpdate sync not functioning properly
100%
Description
Hi folks,
I installed a fresh copy of your pfSense 2.0 RC1 image. A few days later I got some alerts on my BIND server indicating the disk was over 80% full. I investigated and found my named.log was almost 1GB. Further investigation found the following requests coming about every 3 seconds...
29-Mar-2011 19:37:01.349 client 172.25.1.1#14161: view internal-in: query: 0.pool.ntp.org IN A + (172.25.1.2)
29-Mar-2011 19:37:01.350 client 172.25.1.1#53511: view internal-in: query: 0.pool.ntp.org IN AAAA + (172.25.1.2)
29-Mar-2011 19:37:01.350 createfetch: 0.pool.ntp.org AAAA
29-Mar-2011 19:37:03.409 client 172.25.1.1#17710: view internal-in: query: 0.pool.ntp.org IN A + (172.25.1.2)
29-Mar-2011 19:37:03.410 client 172.25.1.1#3475: view internal-in: query: 0.pool.ntp.org IN AAAA + (172.25.1.2)
29-Mar-2011 19:37:03.473 createfetch: 0.pool.ntp.org AAAA
29-Mar-2011 19:37:06.390 client 172.25.1.1#47988: view internal-in: query: 0.pool.ntp.org IN A + (172.25.1.2)
29-Mar-2011 19:37:06.390 client 172.25.1.1#46124: view internal-in: query: 0.pool.ntp.org IN AAAA + (172.25.1.2)
29-Mar-2011 19:37:06.391 createfetch: 0.pool.ntp.org AAAA
29-Mar-2011 19:37:08.509 client 172.25.1.1#57622: view internal-in: query: 0.pool.ntp.org IN A + (172.25.1.2)
29-Mar-2011 19:37:08.510 client 172.25.1.1#32958: view internal-in: query: 0.pool.ntp.org IN AAAA + (172.25.1.2)
29-Mar-2011 19:37:08.560 createfetch: 0.pool.ntp.org AAAA
Note: 172.25.1.1 is my pfsense box and 172.25.1.2 is my internal BIND server. I also changed my default NTP server to be 0.pool.ntp.org.
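For anyone wanting to confirm a similar pattern in their own logs, here is a rough sketch that tallies queries for one name per client. It assumes the query log format shown above (the client as `addr#port:` in the fourth field, the queried name right after `query:`). `count_queries` is just an illustrative helper name, not something BIND or pfSense ships.

```shell
#!/bin/sh
# count_queries NAME [LOGFILE]
# Prints "<count> <client-ip>" for each client that queried NAME.
# Reads standard input when LOGFILE is omitted.
# Assumes lines shaped like:
#   29-Mar-2011 19:37:01.349 client 172.25.1.1#14161: view internal-in: query: 0.pool.ntp.org IN A + (172.25.1.2)
count_queries() {
    name="$1"
    logfile="${2:--}"   # "-" tells awk to read standard input
    awk -v name="$name" '
        $3 == "client" {
            # Look for the token that follows "query:" and compare it to NAME.
            for (i = 4; i < NF; i++) {
                if ($i == "query:" && $(i + 1) == name) {
                    split($4, parts, "#")   # "172.25.1.1#14161:" -> "172.25.1.1"
                    count[parts[1]]++
                    break
                }
            }
        }
        END {
            for (ip in count) print count[ip], ip
        }
    ' "$logfile"
}
```

Something like `count_queries 0.pool.ntp.org named.log` would then show which client is responsible for the bulk of the lookups; lines such as `createfetch:` are skipped because their third field is not `client`.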
Then I looked at running processes on the pfsense box and found these suspects.
root 30429 0.0 0.1 3656 1276 ?? IN 7:35PM 0:00.01 /bin/sh /usr/local/sbin/ntpdate_sync_once.sh
root 31056 0.0 0.1 3504 1332 ?? IN 7:35PM 0:00.02 ntpdate 0.pool.ntp.org
root 43258 0.0 0.1 3656 1360 ?? SN 7:35PM 0:00.02 /bin/sh /usr/local/sbin/ntpdate_sync_once.sh
root 9915 0.0 0.1 3656 1324 v0- S 7:35PM 0:00.02 /bin/sh /usr/local/sbin/ntpdate_sync_once.sh
root 851 0.0 0.1 3524 1232 0 S+ 7:36PM 0:00.01 grep -i ntp
Not sure why, but the ntpdate_sync_once.sh was running multiple times. Once I killed those processes, the DNS queries to 0.pool.ntp.org stopped and things returned to normal. After a reboot though, they come right back and I have to kill them.
So I dug into the ntpdate_sync_once.sh script and it has a loop that looks like this...
-- snip --
#!/bin/sh
NOTSYNCED="true"
SERVER=`cat /cf/conf/config.xml | grep timeservers | cut -d">" -f2 | cut -d"<" -f1`
while [ "$NOTSYNCED" = "true" ]; do
    ntpdate $SERVER
    if [ "$?" = "0" ]; then
        NOTSYNCED="false"
    fi
    sleep 5
done
# Launch -- we have net.
killall ntpd 2>/dev/null
sleep 1
/usr/local/sbin/ntpd -s -f /var/etc/ntpd.conf
-- end snip --
I ran the logic above manually and it appears to work, but when pfsense loads it seems that the ntpdate never exits and there are three copies of that ntpdate_sync_once.sh running without exiting. I didn't see this issue previously reported, so I figured I'd file a bug report.
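One way the loop could be made less harmful, offered only as a sketch and not as the fix that was eventually committed: cap the number of attempts, and refuse to start if another copy already holds a lock. The function names and the lock directory are hypothetical; only `ntpdate` and the server lookup come from the script above.

```shell
#!/bin/sh
# Hypothetical hardening of the retry loop: bounded attempts plus a lock.

# acquire_lock DIR: mkdir fails if DIR already exists, so only one caller wins.
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

release_lock() {
    rmdir "$1" 2>/dev/null
}

# sync_with_retries MAX_TRIES DELAY CMD [ARGS...]
# Runs CMD until it succeeds or MAX_TRIES attempts have failed.
# Returns 0 on success, 1 if every attempt failed.
sync_with_retries() {
    max_tries="$1"
    delay="$2"
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        # Only pause when another attempt is still to come.
        if [ "$attempt" -le "$max_tries" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# run_once LOCKDIR MAX_TRIES DELAY CMD [ARGS...]
# Returns 2 without running CMD if another instance holds the lock.
run_once() {
    lockdir="$1"
    shift
    acquire_lock "$lockdir" || return 2
    sync_with_retries "$@"
    rc=$?
    release_lock "$lockdir"
    return "$rc"
}
```

A call might look like `run_once /tmp/ntpdate_sync.lock 10 5 ntpdate $SERVER` (the lock path is an assumption). With a cap in place, a box without working DNS would stop querying after ten tries instead of looping indefinitely. Note that a lock directory left behind by a killed script would need cleaning up, for example with a `trap`.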
Updated by Jim Pingle about 13 years ago
- Status changed from New to Feedback
- % Done changed from 0 to 100
Updated by Angel Torres about 13 years ago
Jimp, I've applied your commits and I see the service is now running fine at startup and is able to stop and restart normally.
The problem is that hitting save on an OpenVPN config stops ntp cold, and the service not only shows as stopped but doesn't restart at all.
Also, I still can't get my clients to sync. Running ps -ax | grep ntp at startup shows
ntpd: ntp engine (ntpd)
ntpd: [priv] (ntpd)
If I save an OpenVPN config and the ntp service then stops, grep ntp shows
ntpdate -s -t 5 0.pfsense.pool.ntp.org
/bin/sh /usr/local/sbin/ntpdate_sync_once.sh
Running grep ntp at startup should also show the ntpdate 0.pfsense.pool.ntp.org and shell script entries, but they are missing, and clients cannot sync even though the service is running.
Updated by Jim Pingle about 13 years ago
The ntp issue would be separate (it has its own ticket), but I didn't test with openvpn. I suspect that's because OpenVPN connections end up calling rc.newwanip which calls the ntp sync script. That should be working, as long as the system has working dns at the time.
Anyhow, ntpdate is probably going away in favor of just starting ntpd with -s. Ermal pointed it out to me earlier today, and it seems to be the favored way to do things these days.
Updated by Damon Morda about 13 years ago
Hi Jim P,
I've also applied your commits, but they didn't resolve the issue. After applying them, I started the disabled ntp service through the web interface. All seemed to be working, so I rebooted. Upon rebooting, I found that the ntp service was stopped and could not be started. Here are the ntp-related processes that were running.
root 7562 0.0 0.1 3504 1364 ?? SN 7:36PM 0:00.01 ntpdate -s -t 5 0.pool.ntp.org
root 36380 0.0 0.1 3656 1312 ?? SN 7:35PM 0:00.03 /bin/sh /usr/local/sbin/ntpdate_sync_once.sh
root 44595 0.0 0.1 3656 1368 ?? SN 7:35PM 0:00.03 /bin/sh /usr/local/sbin/ntpdate_sync_once.sh
root 8144 0.0 0.1 3524 1240 0 S+ 7:36PM 0:00.01 grep -i ntp
Same issue as before. The pfSense box keeps querying my name server for 0.pool.ntp.org.
Now, if I kill ntpdate and the two ntpdate_sync_once.sh processes, then start ntp through the web interface, all is good.
_ntp 44414 0.0 0.1 3316 1336 ?? S 7:40PM 0:00.00 ntpd: ntp engine (ntpd)
root 44661 0.0 0.1 3316 1348 ?? Ss 7:40PM 0:00.00 ntpd: [priv] (ntpd)
Contents of my /usr/local/sbin/ntpdate_sync_once.sh are as follows:
#!/bin/sh
NOTSYNCED="true"
SERVER=`cat /cf/conf/config.xml | grep timeservers | cut -d">" -f2 | cut -d"<" -f1`
pfkill -f ntpdate_sync_once.sh
while [ "$NOTSYNCED" = "true" ]; do
    # Ensure that ntpd and ntpdate are not running so that the socket we want will be free.
    killall ntpd 2>/dev/null
    killall ntpdate
    sleep 1
    ntpdate -s -t 5 $SERVER
    if [ "$?" = "0" ]; then
        NOTSYNCED="false"
    fi
    sleep 5
done
/usr/local/sbin/ntpd -s -f /var/etc/ntpd.conf
Any ideas?
Updated by Jim Pingle about 13 years ago
I updated this again with cd11a14
ntpdate sync is completely gone, since simply starting ntpd with -s will have the same effect. This makes that separate syncing step and script obsolete.
Updated by Damon Morda about 13 years ago
Hi Jim P,
That change seemed to do the trick. NTP is running just perfectly after making those changes and rebooting. Thanks!
Updated by Jim Pingle about 13 years ago
- Status changed from Feedback to Resolved