Bug #1397
ntpdate sync not functioning properly
Status: closed (100% done)
Description
Hi folks,
I installed a fresh copy of your pfSense 2.0 RC1 image. A few days later I got some alerts on my BIND server indicating the disk was over 80% full. I investigated and found my named.log was almost 1GB. Further investigation found the following requests coming about every 3 seconds...
29-Mar-2011 19:37:01.349 client 172.25.1.1#14161: view internal-in: query: 0.pool.ntp.org IN A + (172.25.1.2)
29-Mar-2011 19:37:01.350 client 172.25.1.1#53511: view internal-in: query: 0.pool.ntp.org IN AAAA + (172.25.1.2)
29-Mar-2011 19:37:01.350 createfetch: 0.pool.ntp.org AAAA
29-Mar-2011 19:37:03.409 client 172.25.1.1#17710: view internal-in: query: 0.pool.ntp.org IN A + (172.25.1.2)
29-Mar-2011 19:37:03.410 client 172.25.1.1#3475: view internal-in: query: 0.pool.ntp.org IN AAAA + (172.25.1.2)
29-Mar-2011 19:37:03.473 createfetch: 0.pool.ntp.org AAAA
29-Mar-2011 19:37:06.390 client 172.25.1.1#47988: view internal-in: query: 0.pool.ntp.org IN A + (172.25.1.2)
29-Mar-2011 19:37:06.390 client 172.25.1.1#46124: view internal-in: query: 0.pool.ntp.org IN AAAA + (172.25.1.2)
29-Mar-2011 19:37:06.391 createfetch: 0.pool.ntp.org AAAA
29-Mar-2011 19:37:08.509 client 172.25.1.1#57622: view internal-in: query: 0.pool.ntp.org IN A + (172.25.1.2)
29-Mar-2011 19:37:08.510 client 172.25.1.1#32958: view internal-in: query: 0.pool.ntp.org IN AAAA + (172.25.1.2)
29-Mar-2011 19:37:08.560 createfetch: 0.pool.ntp.org AAAA
Note: 172.25.1.1 is my pfSense box and 172.25.1.2 is my internal BIND server. I also changed my default NTP server to 0.pool.ntp.org.
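For anyone who wants to check their own BIND query log for the same pattern, here is a rough sketch that counts queries per client, name, and record type. It assumes the exact line layout shown above (including the "view" clause); `summarize_queries` is just a name I made up for illustration.

```sh
#!/bin/sh
# Sketch: summarize BIND query-log lines read from stdin.
# Assumes the layout shown above:
#   DATE TIME client IP#PORT: view NAME: query: QNAME IN TYPE ...
# summarize_queries is a hypothetical helper name, not part of BIND or pfSense.
summarize_queries() {
    awk '$3 == "client" && $7 == "query:" {
        split($4, a, "#")               # drop the source port from "ip#port:"
        count[a[1] " " $8 " " $10]++    # key: client IP, query name, record type
    }
    END {
        for (k in count) print count[k], k
    }' | sort -rn
}
```

Usage would be something like `summarize_queries < named.log | head`, with the path adjusted to wherever your log actually lives. A client that keeps re-resolving the same name shows up at the top of the list.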
Then I looked at the running processes on the pfSense box and found these suspects.
root 30429 0.0 0.1 3656 1276 ?? IN 7:35PM 0:00.01 /bin/sh /usr/local/sbin/ntpdate_sync_once.sh
root 31056 0.0 0.1 3504 1332 ?? IN 7:35PM 0:00.02 ntpdate 0.pool.ntp.org
root 43258 0.0 0.1 3656 1360 ?? SN 7:35PM 0:00.02 /bin/sh /usr/local/sbin/ntpdate_sync_once.sh
root 9915 0.0 0.1 3656 1324 v0- S 7:35PM 0:00.02 /bin/sh /usr/local/sbin/ntpdate_sync_once.sh
root 851 0.0 0.1 3524 1232 0 S+ 7:36PM 0:00.01 grep -i ntp
Not sure why, but the ntpdate_sync_once.sh was running multiple times. Once I killed those processes, the DNS queries to 0.pool.ntp.org stopped and things returned to normal. After a reboot though, they come right back and I have to kill them.
So I dug into the ntpdate_sync_once.sh script, and it has a loop that looks like this...
-- snip --
#!/bin/sh
NOTSYNCED="true"
SERVER=`cat /cf/conf/config.xml | grep timeservers | cut -d">" -f2 | cut -d"<" -f1`
while [ "$NOTSYNCED" = "true" ]; do
    ntpdate $SERVER
    if [ "$?" = "0" ]; then
        NOTSYNCED="false"
    fi
    sleep 5
done
# Launch -- we have net.
killall ntpd 2>/dev/null
sleep 1
/usr/local/sbin/ntpd -s -f /var/etc/ntpd.conf
-- end snip --
I ran the logic above manually and it appears to work, but when pfSense boots it seems that ntpdate never exits, and three copies of ntpdate_sync_once.sh keep running without exiting. I didn't see this issue previously reported, so I figured I'd file a bug report.
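In case it helps, here is a rough sketch of how the retry loop could be guarded so that only one copy runs at a time and it gives up after a fixed number of attempts. This is not the actual pfSense fix, just an illustration: `sync_once`, `LOCKDIR`, `MAX_TRIES`, and `RETRY_DELAY` are names I made up, and the lock path is an assumption.

```sh
#!/bin/sh
# Sketch only: single-instance guard plus bounded retries around a sync command.
# LOCKDIR, MAX_TRIES, RETRY_DELAY and sync_once are hypothetical, not pfSense code.
LOCKDIR="${LOCKDIR:-/tmp/ntpdate_sync_once.lock}"   # assumed lock location
MAX_TRIES="${MAX_TRIES:-10}"
RETRY_DELAY="${RETRY_DELAY:-5}"

# sync_once CMD [ARGS...]: run CMD until it succeeds or MAX_TRIES is reached.
# Returns 0 on success, 1 if every attempt failed, 2 if another copy holds the lock.
sync_once() {
    # mkdir is atomic, so only one caller can create the lock directory.
    if ! mkdir "$LOCKDIR" 2>/dev/null; then
        return 2
    fi
    tries=0
    rc=1
    while [ "$tries" -lt "$MAX_TRIES" ]; do
        if "$@"; then
            rc=0
            break
        fi
        tries=$((tries + 1))
        # Sleep only between failed attempts, never after a success.
        if [ "$tries" -lt "$MAX_TRIES" ]; then
            sleep "$RETRY_DELAY"
        fi
    done
    rmdir "$LOCKDIR"
    return "$rc"
}
```

It would be called as `sync_once ntpdate "$SERVER"`. Two caveats: this does not help if ntpdate itself hangs instead of returning, and a copy that gets killed mid-run leaves the lock directory behind, so a real script would also need some cleanup on exit.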