Bug #4166

filterdns generates floods of DNS requests when there are significant jumps in system time

Added by Chris Buechler over 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Rules/NAT
Target version:
Start date: 12/31/2014
Due date:
% Done: 0%
Estimated time:
Affected Version: All
Affected Architecture:

Description

When aliases contain FQDNs and the system clock jumps significantly (on the order of years), filterdns generates a flood of DNS requests for all the hostnames being monitored. For instance, a system with a dead (or no) CMOS battery boots at some "Jan 1" date years in the past after losing power. Then, when NTP syncs during boot, the clock jumps years forward to the current time. If the jump is large enough, the flood creates enough states to max out the state table, seemingly regardless of its size.

It can be replicated by manually setting the system date years into the past, restarting filterdns, setting the date back to the current time, then waiting a few minutes.

History

#1 Updated by Bipin Chandra over 4 years ago

I guess the easy way to fix this would be to handle $resolve_interval properly: if the time difference is far too large, skip creating so many entries and just run once.

I would appreciate it if someone looked into this, as it is turning out to be a nuisance: on an ALIX, even after about 30 minutes the system won't go back to normal.

#2 Updated by Bipin Chandra over 4 years ago

Line 405 in filter.inc is what I suspect to be the issue (maybe): when filterdns is initially run with a time in the past, it sets the resolve interval to a default of 300, and later, when the time jumps, it makes filterdns resolve that many times (time difference / 300), which would create the flood.
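The guard suggested in comment #1, combined with the "time difference / 300" backlog from comment #2, could be sketched roughly as follows. This is only an illustration: `missed_intervals()` and `next_deadline()` are hypothetical helpers, not functions from filterdns or filter.inc, and the sketch assumes deadlines are tracked as absolute wall-clock seconds.

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper (not part of filterdns): how many scheduled
 * deadlines have already expired at time `now`, given the first pending
 * deadline and the resolve period in seconds.
 */
long
missed_intervals(time_t deadline, time_t now, long period)
{
    assert(period > 0);
    if (now < deadline)
        return (0);
    return ((long)((now - deadline) / period) + 1);
}

/*
 * Hypothetical helper: choose the next deadline. If more than one
 * interval was missed (for example after a large clock jump), resolve
 * once and resynchronize to the current time instead of replaying
 * every missed interval.
 */
time_t
next_deadline(time_t deadline, time_t now, long period)
{
    assert(period > 0);
    if (missed_intervals(deadline, now, period) > 1)
        return (now + period);
    return (deadline + period);
}
```

With a 300 second period and a clock that jumps one year (31,536,000 seconds) past the first deadline, `missed_intervals()` reports 105,120 expired deadlines, while `next_deadline()` schedules a single wakeup 300 seconds after the current time.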

#3 Updated by Chris Buechler over 4 years ago

  • Target version changed from 2.2.1 to 2.2.2

#4 Updated by Chris Buechler about 4 years ago

  • Target version changed from 2.2.2 to 2.2.3

#5 Updated by Chris Buechler almost 4 years ago

  • Target version changed from 2.2.3 to 2.3

#6 Updated by Jim Thompson over 3 years ago

The solution is likely something like the code below (moving the "collect the current time" part inside the loop).

The way the code is structured now, it will always add "interval + (interval % 30)" to the last time used.

There are 31,536,000 seconds in a year. If the default interval in pfSense is 300 seconds, then we will chew through these 31,536,000 seconds "300 at a time".
This will result in 105,120 calls to host_dns() for a one-year jump.

The below will add "interval + (interval % 30)" to the >current< time. This way, if the clock advances (potentially by several years), we will be called once (because the thread will wake out of pthread_cond_timedwait()), and then we will go back to sleep for interval + (interval % 30) seconds.


void *
check_hostname(void *arg)
{
    struct thread_data *thrd = arg;
    struct timespec    ts;
    struct timespec    timeToWait;
    struct sockaddr_in in;
    struct sockaddr_in6 in6;
    int forceUpdate, added, error;

    if (!thrd->hostname)
        return (NULL);

    if (debug >= 2)
        syslog(LOG_WARNING, "Found hostname %s with netmask %d.", thrd->hostname, thrd->mask);

    if (thrd->type == PF_TYPE)
        get_present_table_entries(thrd);

    added = 0;
    forceUpdate = 0;
    pthread_mutex_lock(&thrd->mtx);

    for (;;) {
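        /*
         * Read the clock on every pass so the deadline is relative to
         * the current time, even after a large clock jump. ts is a
         * struct timespec, so use clock_gettime() (from <time.h>)
         * rather than gettimeofday(), which expects a struct timeval.
         */
        clock_gettime(CLOCK_REALTIME, &ts);
        timeToWait.tv_sec = ts.tv_sec + interval;
        timeToWait.tv_sec += (interval % 30);
        timeToWait.tv_nsec = 0;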

        if (dev < 0) {
            dev = open("/dev/pf", O_RDWR);
            if (dev < 0)
                syslog(LOG_ERR, "firewall device could not be opened for operation...skipping this time");
        }
        if (dev > 0) {
            pthread_rwlock_rdlock(&main_lock);

            if (thrd->exit == 1) {
                pthread_rwlock_unlock(&main_lock);
                break;
            } else if (thrd->exit == 2) {
                forceUpdate = 1;
                added = 0;
                thrd->exit = 0;
            }
            /* Detect if an ip address was passed in */
            if (added == 0 && inet_pton(AF_INET, thrd->hostname, &in.sin_addr) == 1) {
                added = 1;
                in.sin_family = AF_INET;
                in.sin_len = sizeof(in);
                error = add_table_entry(thrd, (struct sockaddr *)&in, 1);
            } else if (added == 0 && is_ipaddrv6(thrd->hostname, &in6) == 1) {
                error = add_table_entry(thrd, (struct sockaddr *)&in6, 1);
                added = 1;
            } else if (added == 0) {
                error = host_dns(thrd, forceUpdate);
            }
            if (error == EAGAIN) {
                /*
                 * Need to retry again due to some issue with
                 * table handling
                 */
                forceUpdate = 1;
            } else
                forceUpdate = 0;
            pthread_rwlock_unlock(&main_lock);
        }
        /* Hack for sleeping a thread */
        pthread_cond_timedwait(&thrd->cond, &thrd->mtx, &timeToWait);
        if (debug >= 6)
            syslog(LOG_WARNING, "\tAwaking from the sleep for hostname %s, table %s", thrd->hostname, TABLENAME(thrd->tablename));
    }
    pthread_mutex_unlock(&thrd->mtx);

    if (debug >= 4)
        syslog(LOG_ERR, "Cleaning up hostname %s", thrd->hostname);
    pthread_mutex_destroy(&thrd->mtx);
    pthread_cond_destroy(&thrd->cond);
    clear_hostname_addresses(thrd);
    if (thrd->hostname != NULL)
        free(thrd->hostname);
    if (thrd->tablename != NULL)
        free(thrd->tablename);
    if (thrd->cmd != NULL)
        free(thrd->cmd);
    free(thrd);

    return (NULL);
}
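The arithmetic in comment #6 can be checked with a small model of the two scheduling strategies. This is a simplified sketch, not filterdns code: `wait_period()`, `calls_accumulating()` and `calls_from_now()` are hypothetical names, and the model assumes one lookup per expired deadline with the clock held at `now` after the jump.

```c
#include <assert.h>
#include <time.h>

/* Sleep period as described in comment #6: interval + (interval % 30). */
long
wait_period(long interval)
{
    assert(interval > 0);
    return (interval + (interval % 30));
}

/*
 * Model of the old behaviour: the deadline is advanced from the
 * previous deadline, so every interval skipped by a clock jump
 * triggers its own lookup.
 */
long
calls_accumulating(time_t start, time_t now, long period)
{
    time_t deadline = start + period;
    long calls = 0;

    assert(period > 0);
    while (deadline <= now) {
        calls++;                 /* one host_dns()-style lookup */
        deadline += period;      /* advance from the last deadline */
    }
    return (calls);
}

/*
 * Model of the proposed behaviour: the deadline is recomputed from the
 * current time on every pass, so a jump costs at most one lookup.
 */
long
calls_from_now(time_t start, time_t now, long period)
{
    time_t deadline = start + period;
    long calls = 0;

    assert(period > 0);
    while (deadline <= now) {
        calls++;
        deadline = now + period; /* re-read the clock each pass */
    }
    return (calls);
}
```

For a one-year jump (31,536,000 seconds) and a 300 second interval, `calls_accumulating()` returns 105,120, matching the figure above, while `calls_from_now()` returns 1.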

#7 Updated by Jim Thompson over 3 years ago

  • Assignee set to Jim Thompson

#8 Updated by Jim Thompson over 3 years ago

  • Assignee changed from Jim Thompson to Renato Botelho

Re-assigning this to Renato.

#9 Updated by Renato Botelho over 3 years ago

  • Status changed from Confirmed to Feedback

I believe Luiz made changes to it recently. Can you please confirm whether it still happens in 1.0_7?

#10 Updated by Renato Botelho over 3 years ago

  • Status changed from Feedback to Resolved
  • Assignee changed from Renato Botelho to Luiz Souza

Works fine.

Assigning it to Luiz since he fixed it.
