Bug #4166
filterdns generates floods of DNS requests when there are significant jumps in system time (closed)
Description
When you have FQDNs in aliases and the system clock jumps significantly (on the order of years), this creates a flood of DNS requests for all the hostnames being monitored. For instance, on systems with a dead (or no) CMOS battery, the system boots after a power loss at some "Jan 1" date years in the past. When NTP then syncs during boot, the clock jumps years forward to the current time. If the jump is large enough, this creates enough states to max out the state table, seemingly regardless of its size.
To replicate, manually set the system date years in the past, restart filterdns, set the date back to the current time, then wait a few minutes.
Updated by Bipin Chandra almost 10 years ago
I guess the easy way to fix this would be to handle $resolve_interval properly: if the time difference is far too large, don't create so many entries, just run once.
I would appreciate it if someone looked into this, as it is turning into a nuisance: on an ALIX, the system does not return to normal even after about 30 minutes.
Updated by Bipin Chandra almost 10 years ago
Line 405 in filter.inc is what I suspect to be the issue (maybe): when filterdns is initially run with a time in the past, it sets the resolve interval to the default of 300, and later, when the time jumps, filterdns resolves once per elapsed interval (time difference / 300 times), and that would create the flood.
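To put a number on the estimate above, here is a minimal C sketch of the arithmetic (time difference / interval). The helper name catchup_resolves is hypothetical and not part of filterdns or filter.inc; it only assumes that one resolve happens per elapsed interval, as suspected in the comment above.

```c
/*
 * Hypothetical helper (not filterdns code): number of resolves needed
 * to "catch up" after a clock jump, assuming one resolve per elapsed
 * interval.
 */
static long
catchup_resolves(long jump_seconds, long interval)
{
    /* Nothing to catch up on for a non-positive jump or interval. */
    if (jump_seconds <= 0 || interval <= 0)
        return (0);

    /* Integer division: only whole intervals trigger a resolve. */
    return (jump_seconds / interval);
}
```

For a one-year jump (31,536,000 seconds) and an interval of 300 seconds, catchup_resolves(31536000L, 300) evaluates to 105120 for each monitored hostname.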
Updated by Chris Buechler almost 10 years ago
- Target version changed from 2.2.1 to 2.2.2
Updated by Chris Buechler over 9 years ago
- Target version changed from 2.2.2 to 2.2.3
Updated by Chris Buechler over 9 years ago
- Target version changed from 2.2.3 to 2.3
Updated by Jim Thompson about 9 years ago
The solution is likely to do something like this (moving the "collect the current time" part inside the loop)
The way the code is structured now, it will always add "interval + (interval % 30)" to the last time used.
There are 31,536,000 seconds in a year. If the default interval in pfSense is 300 seconds, then we will chew through these 31,536,000 seconds "300 at a time".
This will result in 105,120 calls to host_dns() for every year of clock jump.
The below will add "interval + (interval % 30)" to the >current< time. This way, if the clock advances (potentially by several years), the thread wakes out of pthread_cond_timedwait() only once, and then goes back to sleep for interval + (interval % 30) seconds.
void *
check_hostname(void *arg)
{
    struct thread_data *thrd = arg;
    struct timespec ts;
    struct timespec timeToWait;
    struct sockaddr_in in;
    struct sockaddr_in6 in6;
    int forceUpdate, added, error;

    if (!thrd->hostname)
        return (NULL);

    if (debug >= 2)
        syslog(LOG_WARNING, "Found hostname %s with netmask %d.",
            thrd->hostname, thrd->mask);

    if (thrd->type == PF_TYPE)
        get_present_table_entries(thrd);

    added = 0;
    forceUpdate = 0;

    pthread_mutex_lock(&thrd->mtx);
    for (;;) {
        /* Read the clock on every pass so the deadline is relative to the current time */
        clock_gettime(CLOCK_REALTIME, &ts);
        timeToWait.tv_sec = ts.tv_sec + interval;
        timeToWait.tv_sec += (interval % 30);
        timeToWait.tv_nsec = 0UL;

        if (dev < 0) {
            dev = open("/dev/pf", O_RDWR);
            if (dev < 0)
                syslog(LOG_ERR, "firewall device could not be opened for operation...skipping this time");
        }
        if (dev > 0) {
            pthread_rwlock_rdlock(&main_lock);
            if (thrd->exit == 1) {
                pthread_rwlock_unlock(&main_lock);
                break;
            } else if (thrd->exit == 2) {
                forceUpdate = 1;
                added = 0;
                thrd->exit = 0;
            }

            /* Detect if an ip address was passed in */
            if (added == 0 && inet_pton(AF_INET, thrd->hostname, &in.sin_addr) == 1) {
                added = 1;
                in.sin_family = AF_INET;
                in.sin_len = sizeof(in);
                error = add_table_entry(thrd, (struct sockaddr *)&in, 1);
            } else if (added == 0 && is_ipaddrv6(thrd->hostname, &in6) == 1) {
                error = add_table_entry(thrd, (struct sockaddr *)&in6, 1);
                added = 1;
            } else if (added == 0) {
                error = host_dns(thrd, forceUpdate);
            }

            if (error == EAGAIN) {
                /*
                 * Need to retry again due to some issue with
                 * table handling
                 */
                forceUpdate = 1;
            } else
                forceUpdate = 0;
            pthread_rwlock_unlock(&main_lock);
        }

        /* Hack for sleeping a thread */
        pthread_cond_timedwait(&thrd->cond, &thrd->mtx, &timeToWait);
        if (debug >= 6)
            syslog(LOG_WARNING, "\tAwaking from the sleep for hostname %s, table %s",
                thrd->hostname, TABLENAME(thrd->tablename));
    }
    pthread_mutex_unlock(&thrd->mtx);

    if (debug >= 4)
        syslog(LOG_ERR, "Cleaning up hostname %s", thrd->hostname);

    /* Release the thread's synchronization objects and owned strings */
    pthread_mutex_destroy(&thrd->mtx);
    pthread_cond_destroy(&thrd->cond);
    clear_hostname_addresses(thrd);
    if (thrd->hostname != NULL)
        free(thrd->hostname);
    if (thrd->tablename != NULL)
        free(thrd->tablename);
    if (thrd->cmd != NULL)
        free(thrd->cmd);
    free(thrd);

    return (NULL);
}
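The difference between the two ways of computing the deadline can be modelled without threads. The following is a rough sketch, not filterdns code: the function names are hypothetical, time is plain seconds, and a wakeup is counted whenever the deadline has already passed, which is when a timed wait would return immediately.

```c
/* Sleep length used in the thread above: interval + (interval % 30). */
static long
wait_step(long interval)
{
    return (interval + (interval % 30));
}

/*
 * Hypothetical model of the old behaviour: each new deadline is the
 * previous deadline plus the step, so after a clock jump the loop has
 * to walk through every missed step.
 */
static long
wakeups_from_last_deadline(long start, long now, long step)
{
    long deadline, wakeups = 0;

    if (step <= 0)
        return (0);
    for (deadline = start + step; deadline <= now; deadline += step)
        wakeups++;      /* deadline already passed: immediate wakeup */
    return (wakeups);
}

/*
 * Hypothetical model of the proposed behaviour: each new deadline is
 * the current time plus the step, so one wakeup is enough to get the
 * deadline ahead of the clock again.
 */
static long
wakeups_from_current_time(long start, long now, long step)
{
    long deadline, wakeups = 0;

    if (step <= 0)
        return (0);
    deadline = start + step;
    while (deadline <= now) {
        wakeups++;
        deadline = now + step;  /* re-anchor on the current clock */
    }
    return (wakeups);
}
```

With start = 0, now = 31536000 (a one-year jump) and wait_step(300) = 300, the first function returns 105120 and the second returns 1, which matches the figures discussed in this comment.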
Updated by Jim Thompson almost 9 years ago
- Assignee changed from Jim Thompson to Renato Botelho
Re-assigning this to Renato.
Updated by Renato Botelho almost 9 years ago
- Status changed from Confirmed to Feedback
I believe Luiz made changes to it recently. Can you please confirm whether it still happens in 1.0_7?
Updated by Renato Botelho almost 9 years ago
- Status changed from Feedback to Resolved
- Assignee changed from Renato Botelho to Luiz Souza
Works fine.
Assigning it to Luiz since he fixed it.