Bug #4442

Boot sits at "Configuring firewall" for long time with hostnames, URL Tables, where DNS non-functional

Added by Chris Buechler over 4 years ago. Updated over 4 years ago.

Status: Resolved
Priority: High
Category: Rules / NAT
Target version:
Start date: 02/18/2015
Due date:
% Done: 0%
Estimated time:
Affected Version: All
Affected Architecture:

Description

Where aliases contain FQDNs and no DNS servers are reachable, the boot is excessively delayed at "Configuring firewall" while awaiting DNS timeouts. Depending on the configuration, this can delay things by 5-10+ minutes at that point. The timeouts are conservative to avoid giving up and moving along too quickly, as subsequent parts of the boot can depend on those names having been successfully resolved, but that's really excessive.

Associated revisions

Revision eefd7773 (diff)
Added by Chris Buechler over 4 years ago

A number of things block waiting for file download timeouts, sometimes multiple times across multiple files (many URL Table aliases, for instance). The long timeout causes very long boot times (10-20+ minutes) on many configs with pfblocker if booted disconnected from the Internet. This is strictly the timeout for the HTTP/HTTPS connection attempt. Once connected, it can run past that. 5 seconds should be more than enough for any properly-functioning network. Part of Ticket #4442.

Revision a320af18 (diff)
Added by Chris Buechler over 4 years ago

A number of things block waiting for file download timeouts, sometimes multiple times across multiple files (many URL Table aliases, for instance). The long timeout causes very long boot times (10-20+ minutes) on many configs with pfblocker if booted disconnected from the Internet. This is strictly the timeout for the HTTP/HTTPS connection attempt. Once connected, it can run past that. 5 seconds should be more than enough for any properly-functioning network. Part of Ticket #4442.

Conflicts:
etc/inc/pfsense-utils.inc

Revision ec9eb789 (diff)
Added by Ermal Luçi over 4 years ago

Ticket #4442 Do not process URL aliases during bootup but trigger it just after finished booting. This completely solves the bootup delays without lowering the timeout as before. Probably need to increase a bit the timeouts now to be friendly to other connections

Revision 0d44aca6 (diff)
Added by Ermal Luçi over 4 years ago

Ticket #4442 Do not process URL aliases during bootup but trigger it just after finished booting. This completely solves the bootup delays without lowering the timeout as before. Probably need to increase a bit the timeouts now to be friendly to other connections

History

#1 Updated by Rob Turner over 4 years ago

Could this help:

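// $fqdn is the hostname taken from the alias entry; escape it for the shell.
$destination = escapeshellarg($fqdn);
// -W 1 makes host wait at most one second for a DNS reply.
$output = (string) shell_exec("host -W 1 {$destination} 2>&1");
// host reports "connection timed out" when no DNS server answers.
if (preg_match('/connection timed out/i', $output)) {
    die("Host lookup failed\n");
}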

#2 Updated by Chris Buechler over 4 years ago

  • Target version changed from 2.2.2 to 2.2.3

#3 Updated by Kill Bill over 4 years ago

This is not limited to FQDNs. When you have URL aliases that rely on local files that do not exist (such as when restoring a config from a machine with pfBlockerNG onto a freshly installed box), it gets completely ridiculous. People actually think the boot has hung.

https://forum.pfsense.org/index.php?topic=92355.0

#4 Updated by Chris Buechler over 4 years ago

  • Subject changed from Boot sits at "Configuring firewall" for long time if aliases contain hostnames and DNS non-functional to Boot sits at "Configuring firewall" for long time with hostnames, URL Tables, where DNS non-functional
  • Assignee set to Chris Buechler

A big portion of the issue with URL table aliases is file_download can be attempted many times during filter reload when booting, and if that times out, it adds significant delays while awaiting the timeout over and over. CURLOPT_CONNECTTIMEOUT was 60 seconds (down from default 300), which is still way longer than necessary. Dropped that to 5 seconds in what I just pushed, which helps greatly for that part.

I'm still tracing other related delays.
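
To illustrate the distinction the change above relies on, here is a minimal sketch of a download with a short connect timeout and no cap on the transfer itself. It is not the actual file_download code from pfSense; the function name fetch_with_connect_timeout and its return shape are hypothetical. It assumes the PHP curl extension is loaded.

```php
<?php
// Hypothetical helper for illustration; not the pfSense implementation.
function fetch_with_connect_timeout(string $url, int $connect_timeout = 5): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        // Limits only the connection phase (name resolution and handshakes).
        CURLOPT_CONNECTTIMEOUT => $connect_timeout,
        // 0 leaves the total transfer time uncapped once connected.
        CURLOPT_TIMEOUT        => 0,
    ]);
    $body = curl_exec($ch);

    return [
        'ok'    => $body !== false,
        'errno' => curl_errno($ch),
        'body'  => $body === false ? '' : $body,
    ];
}
```

With this split, an unreachable server costs at most the connect timeout per attempt, while a slow but working download is still allowed to finish.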

#5 Updated by Kill Bill over 4 years ago

Much better now... ;)

#6 Updated by Chris Buechler over 4 years ago

Kill Bill: mind sharing any specifics on what you've seen? How long did it take to boot before, and how long does it take now? I have a couple of different configs that replicate it, but having some outside feedback wouldn't hurt. That last part alone shaved 10-15 minutes off a 20-25 minute boot time with one config. That config has other bits that cause other delays with no Internet connectivity though.

#7 Updated by Kill Bill over 4 years ago

Well, I tested the pfBNG case (i.e., restore the config with tons of URL aliases on a new box). Down to under 2 minutes from something like 20 (with 15 aliases).
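
As a rough back-of-the-envelope check on those numbers, the worst-case wait is about aliases × attempts per alias × connect timeout. The sketch below is illustrative only: the function name is hypothetical, and one timed-out attempt per alias is an assumption (the ticket notes that file_download can be attempted many times during boot, which would scale the result further).

```php
<?php
// Hypothetical estimate: total seconds spent waiting on connect timeouts.
function worst_case_boot_delay(int $aliases, int $attempts_per_alias, int $timeout_seconds): int
{
    return $aliases * $attempts_per_alias * $timeout_seconds;
}

// 15 URL aliases, one timed-out attempt each (assumed).
echo worst_case_boot_delay(15, 1, 60), "\n"; // 900 seconds (15 minutes) at the old 60 s timeout
echo worst_case_boot_delay(15, 1, 5), "\n";  // 75 seconds at the new 5 s timeout
```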

#8 Updated by Ermal Luçi over 4 years ago

  • Status changed from Confirmed to Feedback

I pushed a fix that does not process URL aliases until bootup is finished.
This should properly fix the issue.

Maybe we should increase the timeouts on those functions a bit again?

#9 Updated by Ermal Luçi over 4 years ago

Also, one thing to consider here, probably as a separate issue, is that the update of URL aliases should not be done inline during filter_reload.

Done on master with commit 5b2b1f4e57caec234fbc1ed1d61b28a79e67ef8c.

That commit should probably be synced to RELENG_2_2.

#10 Updated by Kill Bill over 4 years ago

Hmmmm. Not exactly convinced this is better. This seems to be blocking all traffic from LANs until the boot is completely finished, all packages reinstalled etc.

#11 Updated by Ermal Luçi over 4 years ago

Your DNS is busted; what gets blocked that was not blocked before?

If your boot takes 1-2 minutes, then this is tolerable, since this is downtime anyhow.

#12 Updated by Chris Buechler over 4 years ago

It's definitely worse to skip it during boot in a variety of cases, and I don't see any circumstances where that helps anything. I reverted that. The change I made fixed all the really excessive timeouts that made boot take 10-20+ minutes when not Internet-connected, depending on config. That timeout is strictly for establishing the initial connection to the web server. If you can't establish a TCP connection in 5 seconds, you have bigger problems. There isn't a properly-functioning Internet connection on earth that'll take >5 seconds to establish an HTTP connection. The download of data can take however long is necessary past that point; the timeout doesn't apply to it.

This should be fine as is. I'll verify again on later snapshots, but this should be safe to close.

#13 Updated by Kill Bill over 4 years ago

Ermal Luçi wrote:

Your DNS is busted; what gets blocked that was not blocked before?

No. That'd make package reinstall fail. But the packages reinstalled fine. In fact, this made the issue happen in circumstances where it was never seen before, merely by upgrading to a newer snapshot on a full install. I saw a popup with an error about failing to load firewall rules once I could finally log in to the pfSense box. As said, all outgoing traffic from LAN was blocked until boot completely finished and all packages were reinstalled.

#14 Updated by Kill Bill over 4 years ago

No more undefined macros and errors when loading the rules on boot with latest snapshot. I'm with Chris here, looks like this can be closed.

#15 Updated by Chris Buechler over 4 years ago

  • Status changed from Feedback to Resolved

This is good.
