Bug #1397

closed

ntpdate sync not functioning properly

Added by Damon Morda about 13 years ago. Updated about 13 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: NTPD
Target version: -
Start date: 03/30/2011
Due date:
% Done: 100%
Estimated time:
Plus Target Version:
Release Notes:
Affected Version: 2.0
Affected Architecture:

Description

Hi folks,

I installed a fresh copy of your pfSense 2.0 RC1 image. A few days later I got some alerts on my BIND server indicating the disk was over 80% full. I investigated and found my named.log was almost 1GB. Further investigation found the following requests coming about every 3 seconds...

29-Mar-2011 19:37:01.349 client 172.25.1.1#14161: view internal-in: query: 0.pool.ntp.org IN A + (172.25.1.2)
29-Mar-2011 19:37:01.350 client 172.25.1.1#53511: view internal-in: query: 0.pool.ntp.org IN AAAA + (172.25.1.2)
29-Mar-2011 19:37:01.350 createfetch: 0.pool.ntp.org AAAA
29-Mar-2011 19:37:03.409 client 172.25.1.1#17710: view internal-in: query: 0.pool.ntp.org IN A + (172.25.1.2)
29-Mar-2011 19:37:03.410 client 172.25.1.1#3475: view internal-in: query: 0.pool.ntp.org IN AAAA + (172.25.1.2)
29-Mar-2011 19:37:03.473 createfetch: 0.pool.ntp.org AAAA
29-Mar-2011 19:37:06.390 client 172.25.1.1#47988: view internal-in: query: 0.pool.ntp.org IN A + (172.25.1.2)
29-Mar-2011 19:37:06.390 client 172.25.1.1#46124: view internal-in: query: 0.pool.ntp.org IN AAAA + (172.25.1.2)
29-Mar-2011 19:37:06.391 createfetch: 0.pool.ntp.org AAAA
29-Mar-2011 19:37:08.509 client 172.25.1.1#57622: view internal-in: query: 0.pool.ntp.org IN A + (172.25.1.2)
29-Mar-2011 19:37:08.510 client 172.25.1.1#32958: view internal-in: query: 0.pool.ntp.org IN AAAA + (172.25.1.2)
29-Mar-2011 19:37:08.560 createfetch: 0.pool.ntp.org AAAA

Note: 172.25.1.1 is my pfsense box and 172.25.1.2 is my internal BIND server. I also changed my default NTP server to be 0.pool.ntp.org.
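
To put a number on a query flood like this, the log lines can be tallied per client, name, and record type. The following is only a minimal sketch, assuming the query-log layout shown in the excerpt above; the function name count_queries is made up for illustration and is not part of BIND or pfSense.

```shell
#!/bin/sh
# count_queries [FILE...]
# Reads BIND query-log lines (from files or stdin) and prints
# "COUNT CLIENT NAME TYPE", highest count first.
# Assumes the layout seen above: "... client ADDR#PORT: ... query: NAME IN TYPE ..."
count_queries() {
    awk '
        $3 == "client" {
            for (i = 4; i < NF; i++) {
                if ($i == "query:") {
                    # $4 looks like "172.25.1.1#14161:"; keep only the address.
                    split($4, addr, "#")
                    counts[addr[1] " " $(i + 1) " " $(i + 3)]++
                    break
                }
            }
        }
        END { for (k in counts) print counts[k], k }
    ' "$@" | sort -rn
}

# Example (path is an assumption): count_queries /var/log/named.log | head
```

Lines that are not client queries, such as the createfetch entries, are skipped.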

Then I looked at running processes on the pfsense box and found these suspects.

root 30429 0.0 0.1 3656 1276 ?? IN 7:35PM 0:00.01 /bin/sh /usr/local/sbin/ntpdate_sync_once.sh
root 31056 0.0 0.1 3504 1332 ?? IN 7:35PM 0:00.02 ntpdate 0.pool.ntp.org
root 43258 0.0 0.1 3656 1360 ?? SN 7:35PM 0:00.02 /bin/sh /usr/local/sbin/ntpdate_sync_once.sh
root 9915 0.0 0.1 3656 1324 v0- S 7:35PM 0:00.02 /bin/sh /usr/local/sbin/ntpdate_sync_once.sh
root 851 0.0 0.1 3524 1232 0 S+ 7:36PM 0:00.01 grep -i ntp

Not sure why, but the ntpdate_sync_once.sh was running multiple times. Once I killed those processes, the DNS queries to 0.pool.ntp.org stopped and things returned to normal. After a reboot though, they come right back and I have to kill them.

So I dug into the ntpdate_sync_once.sh script, and it has a loop that looks like this...

-- snip --
#!/bin/sh

NOTSYNCED="true"
SERVER=`cat /cf/conf/config.xml | grep timeservers | cut -d">" -f2 | cut -d"<" -f1`

while [ "$NOTSYNCED" = "true" ]; do
    ntpdate $SERVER
    if [ "$?" = "0" ]; then
        NOTSYNCED="false"
    fi
    sleep 5
done

# Launch -- we have net.
killall ntpd 2>/dev/null
sleep 1
/usr/local/sbin/ntpd -s -f /var/etc/ntpd.conf

-- end snip --
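
The loop quoted above retries every 5 seconds until ntpdate exits with status 0, with no upper bound, so a sync that never succeeds keeps triggering DNS lookups indefinitely. As an illustration only (this is not the fix that was committed), a retry loop can be capped. The helper name retry_with_limit and the limits in the usage comment are assumptions.

```shell
#!/bin/sh
# retry_with_limit MAX DELAY CMD [ARGS...]
# Runs CMD until it succeeds or MAX attempts have been made,
# sleeping DELAY seconds between attempts.
# Returns 0 on success, 1 if every attempt failed.
retry_with_limit() {
    rwl_max=$1
    rwl_delay=$2
    shift 2
    rwl_attempt=1
    while [ "$rwl_attempt" -le "$rwl_max" ]; do
        if "$@"; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$rwl_attempt" -lt "$rwl_max" ]; then
            sleep "$rwl_delay"
        fi
        rwl_attempt=$((rwl_attempt + 1))
    done
    return 1
}

# Example (not run here; the limit of 10 is an arbitrary choice):
#   retry_with_limit 10 5 ntpdate "$SERVER" || echo "time sync failed" >&2
```

With a cap in place, a persistent failure ends the script instead of leaving it running in the background.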

I ran the logic above manually and it appears to work, but when pfsense loads it seems that the ntpdate never exits and there are three copies of that ntpdate_sync_once.sh running without exiting. I didn't see this issue previously reported, so I figured I'd file a bug report.
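
One common way to keep several copies of a script from running at once is a lock directory, since mkdir either creates the directory or fails atomically. The sketch below is hypothetical: the helper names and the lock path in the usage comment are assumptions, and nothing in this ticket indicates pfSense does it this way.

```shell
#!/bin/sh
# acquire_lock DIR: succeeds only for the first caller, because mkdir
# fails if DIR already exists.
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

# release_lock DIR: removes the lock directory so a later run can proceed.
release_lock() {
    rmdir "$1" 2>/dev/null
}

# Example use at the top of a script (lock path is hypothetical):
#   LOCK=/var/run/ntpdate_sync_once.lock
#   acquire_lock "$LOCK" || exit 0    # another copy is already running
#   trap 'release_lock "$LOCK"' EXIT
#   ... sync loop ...
```

A lock left behind by a killed process would block later runs, so a real script would also need some form of stale-lock cleanup.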

Actions #1

Updated by Damon Morda about 13 years ago

Duplicate of #1398. Clicked too fast.

Actions #2

Updated by Jim Pingle about 13 years ago

  • Status changed from New to Feedback
  • % Done changed from 0 to 100

Should be fixed as of edf99ce (See also 2db351a and 54c1859)

Actions #3

Updated by Angel Torres about 13 years ago

Jimp, I've applied your commits and I see the service is now running fine at startup and is able to stop and restart normally.

The problem is that hitting save on an OpenVPN config stops ntp cold, and the service not only shows as stopped but doesn't restart at all.

Also, I still can't get my clients to sync. Running ps -ax | grep ntp at startup shows

ntpd: ntp engine (ntpd)
ntpd: [priv] (ntpd)

If I save an OpenVPN config and the ntp service then stops, grep ntp shows

ntpdate -s -t 5 0.pfsense.pool.ntp.org
/bin/sh /usr/local/sbin/ntpdate_sync_once.sh

Running grep ntp at startup should also show the ntpdate 0.pfsense.pool.ntp.org and shell script entries, but they are missing, and clients cannot sync even though the service is running.

Actions #4

Updated by Jim Pingle about 13 years ago

The ntp issue would be separate (it has its own ticket), but I didn't test with OpenVPN. I suspect that's because OpenVPN connections end up calling rc.newwanip, which calls the ntp sync script. That should be working, as long as the system has working DNS at the time.

Anyhow, ntpdate is probably going away in favor of just starting ntpd with -s. Ermal pointed it out to me earlier today, and it seems to be the favored way to do things these days.

Actions #5

Updated by Damon Morda about 13 years ago

Hi Jim P,

I've also applied your commits, but they didn't resolve the issue. After applying them, I started the disabled ntp service through the web interface, and all seemed to be working, so I rebooted. Upon rebooting, I found that the ntp service was stopped and could not be started. Here are the ntp-related processes that were running.

root 7562 0.0 0.1 3504 1364 ?? SN 7:36PM 0:00.01 ntpdate -s -t 5 0.pool.ntp.org
root 36380 0.0 0.1 3656 1312 ?? SN 7:35PM 0:00.03 /bin/sh /usr/local/sbin/ntpdate_sync_once.sh
root 44595 0.0 0.1 3656 1368 ?? SN 7:35PM 0:00.03 /bin/sh /usr/local/sbin/ntpdate_sync_once.sh
root 8144 0.0 0.1 3524 1240 0 S+ 7:36PM 0:00.01 grep -i ntp

Same issue as before. The pfSense box keeps querying my name server for 0.pool.ntp.org.

Now, if I kill ntpdate and the two ntpdate_sync_once.sh processes and start ntp through the web interface, all is good.

_ntp 44414 0.0 0.1 3316 1336 ?? S 7:40PM 0:00.00 ntpd: ntp engine (ntpd)
root 44661 0.0 0.1 3316 1348 ?? Ss 7:40PM 0:00.00 ntpd: [priv] (ntpd)

Contents of my /usr/local/sbin/ntpdate_sync_once.sh are as follows:

#!/bin/sh

NOTSYNCED="true"
SERVER=`cat /cf/conf/config.xml | grep timeservers | cut -d">" -f2 | cut -d"<" -f1`
pfkill -f ntpdate_sync_once.sh

while [ "$NOTSYNCED" = "true" ]; do
    # Ensure that ntpd and ntpdate are not running so that the socket we want will be free.
    killall ntpd 2>/dev/null
    killall ntpdate
    sleep 1
    ntpdate -s -t 5 $SERVER
    if [ "$?" = "0" ]; then
        NOTSYNCED="false"
    fi
    sleep 5
done

/usr/local/sbin/ntpd -s -f /var/etc/ntpd.conf

Any ideas?

Actions #6

Updated by Jim Pingle about 13 years ago

I updated this again with cd11a14

ntpdate sync is completely gone, since simply starting ntpd with -s will have the same effect. This makes that separate syncing step and script obsolete.
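
For illustration, the simplified startup might look roughly like the sketch below. It reuses the ntpd path, config path, and kill/sleep steps from the script quoted earlier in this ticket. The function name start_ntpd and the DRY_RUN switch are assumptions added here, and the actual change in cd11a14 may differ.

```shell
#!/bin/sh
# Paths taken from the script quoted in this ticket; override via environment.
NTPD="${NTPD:-/usr/local/sbin/ntpd}"
NTPD_CONF="${NTPD_CONF:-/var/etc/ntpd.conf}"

# start_ntpd: stop any running ntpd, then start it with -s so the clock is
# set at startup, which replaces a separate ntpdate step.
# With DRY_RUN=1 (hypothetical switch) it only prints the command.
start_ntpd() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$NTPD -s -f $NTPD_CONF"
        return 0
    fi
    killall ntpd 2>/dev/null
    sleep 1
    "$NTPD" -s -f "$NTPD_CONF"
}

# Example: DRY_RUN=1 start_ntpd
```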

Actions #7

Updated by Damon Morda about 13 years ago

Hi Jim P,

That change seemed to do the trick. NTP is running just perfectly after making those changes and rebooting. Thanks!

Actions #8

Updated by Jim Pingle about 13 years ago

  • Status changed from Feedback to Resolved