Bug #8987

Web GUI main page very slow to load if wan interface is enabled but not connected.

Added by Arnaldo Pirrone 6 months ago. Updated 15 days ago.

Status: New
Priority: Very Low
Assignee: -
Category: Web Interface
Target version:
Start date: 10/01/2018
Due date:
% Done: 0%
Estimated time:
Affected Version:
Affected Architecture:

Description

Hi,
I noticed this annoying bug in pfSense 2.4.4:
when the WAN interface is configured but left disconnected,
the main page of the web GUI becomes very slow to load (you must wait many minutes!), though every other page remains reachable.
You can verify this by enabling and disabling the WAN interface from the Interfaces menu (always available), then clicking on the logo.

History

#1 Updated by Jim Pingle 6 months ago

  • Priority changed from Normal to Very Low
  • Target version set to Future

There are several things on the dashboard that need working DNS and connectivity, like the update check, packages widget, etc.

Without connectivity it has to wait for DNS to time out for various things. There isn't a way around that.

Using the local DNS Resolver/Forwarder can help -- see #1407

There may be something more we can do here in the future, but it's doubtful.

#2 Updated by James Howel 2 months ago

To add to this bug: we've been using pfSense 2.3.5 for an internal project and it's been working brilliantly.

We're using pfSense as a cluster but it has no internet connectivity and 2.3.5 has no issues with this.

I upgraded the first cluster node with 2.4.4_p1 and noticed this laggy behaviour generally when trying to load the dashboard. After spending several hours trying to figure out what was wrong I ended up having to roll back to snapshot and leave the cluster on 2.3.5.

Testing this further, I have found that on top of the dashboard hanging for up to 2 minutes, certain other configuration saves behave the same way, such as DNS Resolver, Advanced Settings and others. Firewall Rules saves do not exhibit the same issue.

Essentially this will preclude pfSense from being deployed where there is no internet connectivity as this issue means that configuration will take 10 times longer to perform and for some people they will see this as a bug and opt to use another platform.

I'm a long-term pfSense user and it's gaining traction on this project, but ultimately it will have to be abandoned without some resolution or workaround, as this problem will be a showstopper for using pfSense here.

#3 Updated by Luke Hamburg 2 months ago

If you remove all widgets from the dashboard does that help at all? It's probably a widget that's causing this delay.

#4 Updated by James Howel 2 months ago

Hi Luke,

Thanks for the suggestion but I've tried that, same issue.

It looks like whatever is timing out due to no internet connectivity affects several areas so it's not just the dashboard...

James

#5 Updated by Joshua Sign 2 months ago

Hi,

I just tested it:

- Loading the dashboard normally takes about 1 second or less.
- Without WAN connectivity, it takes about 30 seconds.
- After adding these options to resolv.conf, it takes only 5 seconds:

options timeout:1
options attempts:1

I don't know if it helps, but when WAN connectivity goes down, we get an alert in the system logs like:

Jan 15 15:51:52        php-fpm                /rc.linkup: Hotplug event detected for WAN(wan) static IP (xxx.xxx.xxx.xxx)
Jan 15 15:51:51        kernel                 em0: link state changed to DOWN
Jan 15 15:51:51        check_reload_status    Linkup starting em0 

So if these options are safe to use when the connection is down (there are many cases to consider),
a workaround could be to add them on the fly to resolv.conf when the WAN goes down and then remove them when the WAN comes back up.

It's a suggestion...
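
As a rough illustration of the add-and-remove idea above, here is a minimal Python sketch that toggles the two option lines in the text of a resolv.conf file. The function name `set_fast_fail` is hypothetical, and nothing here reflects how pfSense actually generates resolv.conf; it only shows that the toggle can be made idempotent so repeated link-down events do not pile up duplicate lines.

```python
# The two lines quoted in the comment above.
FAST_FAIL_OPTIONS = ("options timeout:1", "options attempts:1")


def set_fast_fail(resolv_text, enabled):
    """Return resolv.conf text with the fast-fail options added or removed.

    Hypothetical helper: existing copies of the option lines are always
    stripped first, so calling it twice with the same flag is harmless.
    """
    lines = [
        line for line in resolv_text.splitlines()
        if line.strip() not in FAST_FAIL_OPTIONS
    ]
    if enabled:
        lines.extend(FAST_FAIL_OPTIONS)
    return "\n".join(lines) + "\n" if lines else ""
```

A link-down handler would call `set_fast_fail(text, True)` and a link-up handler `set_fast_fail(text, False)`; writing the result back to disk is left out on purpose.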

#6 Updated by Maverick Phillips 2 months ago

Hello,

One of my two firewalls has developed this issue - I can confirm disabling the WAN adapter resolved this slowness!

#7 Updated by James Howel 2 months ago

Hi Joshua,

Thanks for looking at this.

We don't have a WAN in a down state, it is connected but it has no NAT and is basically just another interface but on a totally isolated network.

It appears that if pfSense has NEVER been connected to the internet, the way it behaves with the timeouts is different to IF it has been connected at least once and then disconnected.

Build scenario: 2.4.4 p1 build, 2 interfaces, vsphere 6.5 VM, no configuration aside from adding ip addresses from console and setting the admin pass from gui. Never internet connected.
  • Dashboard access once logged in: 66 seconds
  • General page save after adding one dns ip: 63 seconds
  • General page save with no config changes: 66 seconds
  • Rules save: instant
  • Post rules save Apply Changes: Instant

Architecturally, there is a significant dependency on internet connectivity in pfSense, which for most people is fine, but if an internal pfSense tier or the entire pfSense implementation can never be connected to the internet then it's very slow to configure and troubleshoot. When compared to an offline hardware firewall from another vendor, pfSense behaves like it's buggy and slow.

I've also discovered that it's also fundamentally impossible to install any packages in an offline state which is another problem but that's a separate issue.

James

#8 Updated by Corey Bock about 2 months ago

Maverick Phillips wrote:

Hello,

One of my two firewalls has developed this issue - I can confirm disabling the WAN adapter resolved this slowness!

I've recently just discovered this issue on configs made with 2.4.4-RELEASE-p2 for both the SG8860 and the XG1541.

The GUI has become almost unusable on both after updating. This is a problem as we're often teching the hardware in a shop offline. Also, in the event a single-WAN setup loses connectivity, right when you'd want to log in and check it out, you're unable to get in for at least 60 seconds.

I can also confirm that disabling the offline WAN interfaces will resolve the problem. However, I noticed that only statically assigned WAN interfaces cause this issue. If I enable a WAN interface set to DHCP, even if it's not online, the problem does not exist as it does with the same interface set to static.

I hope this helps get this resolved!

#9 Updated by Joshua Sign about 2 months ago

James Howel wrote:

It appears that if pfSense has NEVER been connected to the internet, the way it behaves with the timeouts is different to IF it has been connected at least once and then disconnected.
...
Architecturally, there is a significant dependency on internet connectivity in pfSense, which for most people is fine, but if an internal pfSense tier or the entire pfSense implementation can never be connected to the internet then it's very slow to configure and troubleshoot. When compared to an offline hardware firewall from another vendor, pfSense behaves like it's buggy and slow.

I'm just trying to understand; I hope to be able to try to reproduce it next week.
In the case you describe, I understand that internet connectivity is not possible and has never been available.
But is DNS resolution possible?
For local domains and remote domains (like google.com)?

The case you describe sounds like a DNS latency problem.
Have you tried experimenting with it?
Do you use a specific DNS configuration?

And finally, how are you coping for now: do you always wait 1 minute between pages, or have you found a workaround?

Josh_

#10 Updated by Tom Embt about 2 months ago

I have also noticed this issue on my home pfSense. I was able to reproduce it reliably with a VM and it appears to happen when the WAN interface has an IP address assigned (and thus has a related DNS server in resolv.conf) but that connectivity is no longer working.

It happens to my home pfSense when the WAN interface has a DHCP lease and then something breaks for a while at the link level. For the VM reproduction I installed under VirtualBox with the WAN interface bridged to my laptop wifi and the LAN interface on a "host only network". If I allow the VM's WAN to get an IP on my network and then shut off wifi on my laptop, attempts to load the pfSense dashboard take 60+ seconds. It does not necessarily break immediately, which I believe to be because of DNS cache. Going to Services > DNS Resolver and restarting that service will cause it to break if it was not already doing so.

Doing some tcpdump of outbound DNS when this is breaking leads me to the hostname "ews.netgate.com", which is what it's trying and failing to look up. I confirmed that host-filing that IP temporarily makes the problem disappear. I locally munged some code for troubleshooting, and it would seem that in the above case at least, the issue is src/etc/inc/copyget.inc . Looks like that copyright download functionality is a fairly recent addition.

#11 Updated by Jamie Donovan 18 days ago

This is affecting our company's setup as well. Static public IPs /29 (total 5 available IPs) with one hooked up with a virtual IP alias on pfsense.

However, the connection is up and running – no interfaces are disconnected (apart from openvpn servers that have not been assigned interfaces). And even after removing the virtual IP the dashboard is still stuck for what seems like minutes.

With dhcp it works fine, but obviously static assignment is the only option here. I see no workarounds atm.

Edit: I see that I was missing DNS servers after reconfiguring. The workaround for me is to enter DNS servers manually. DNS Forwarder & DNS Resolver services are both turned off; this pfsense works purely as a router/firewall, no DNS or DHCP services.

#12 Updated by Joshua Sign 17 days ago

Could you confirm that adding DNS entries can be a workaround? (If you can, try it for testing purposes.)
How many seconds do you wait for the dashboard with this, and is that acceptable?

Tks
Josh_

#13 Updated by Pieter . 16 days ago

We had the same issue. It's a pfSense 2.4.4p2 installation in an air-gapped environment and has never touched the internet. The home page is unbearably slow to load, but all the other pages load just fine.

We made a network capture and saw repeated DNS requests to ews.netgate.com when loading the home page. When we did a host override in the DNS for ews.netgate.com to localhost, the home page loaded almost instantly, just like the other pages.

A little digging showed that in src/usr/local/www/index.php on line 469 the copyget.inc file is included. In this file there is an attempt to download the file https://ews.netgate.com/copyright every 24 hours or if the local copyright file doesn't exist. In an air-gapped environment, this file is never updated, so there is an attempt to download a fresh copy every time the home page loads.

The attempted download is done in a way that blocks the PHP renderer from doing other things, so it waits for a number of timeouts on the DNS request before it finishes processing the index.php file. Hence the very long loading time.
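
To make the failure mode above concrete: a cache that is refreshed "every 24 hours or if the file is missing" will retry on every page load when the download can never succeed. One common way around that is to also remember the last failed attempt. The Python sketch below is only an illustration of that idea, not the actual copyget.inc logic (which is PHP); `should_fetch`, `record_attempt`, the marker file, and the one-hour `RETRY_AFTER` value are all assumptions.

```python
import os
import time

MAX_AGE = 24 * 60 * 60   # refresh interval described in the comment above
RETRY_AFTER = 60 * 60    # hypothetical back-off after a failed attempt


def _age(path, now):
    """Seconds since the file was last modified, or None if it is missing."""
    try:
        return now - os.path.getmtime(path)
    except OSError:
        return None


def should_fetch(cache_path, attempt_path, now=None):
    """Decide whether a download should be attempted on this page load."""
    now = time.time() if now is None else now
    cache_age = _age(cache_path, now)
    if cache_age is not None and cache_age < MAX_AGE:
        return False  # cached copy is still fresh
    attempt_age = _age(attempt_path, now)
    if attempt_age is not None and attempt_age < RETRY_AFTER:
        return False  # a recent attempt was already made; do not block again
    return True


def record_attempt(attempt_path):
    """Touch the marker file before each download attempt."""
    with open(attempt_path, "a"):
        pass
    os.utime(attempt_path, None)
```

With this shape, an air-gapped box would pay the DNS timeout at most once per `RETRY_AFTER` window instead of on every load of the home page.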

#14 Updated by Luke Hamburg 15 days ago

Hmm, nice find Pieter!

Maybe we need a function like haveWorkingDns() that returns a bool indicating whether DNS is working, and then use that in the include so it only tries to fetch from Netgate if the retval is true...
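
A check like the one proposed above only helps if the check itself cannot hang, so it needs its own hard deadline. The following is a minimal Python sketch of that idea, not a pfSense function: `have_working_dns` mirrors the suggested name, while the injectable `resolver` parameter and the 2-second default are assumptions added for illustration and testing.

```python
import socket
import threading


def have_working_dns(hostname, timeout=2.0, resolver=socket.getaddrinfo):
    """Return True only if `hostname` resolves within `timeout` seconds.

    The lookup runs in a daemon thread because socket.getaddrinfo has no
    timeout argument of its own; if the deadline passes, the caller moves
    on and the abandoned lookup finishes (or fails) in the background.
    """
    result = {"ok": False}

    def worker():
        try:
            resolver(hostname, None)
            result["ok"] = True
        except OSError:
            # socket.gaierror is a subclass of OSError.
            result["ok"] = False

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    return (not thread.is_alive()) and result["ok"]
```

A caller would then skip the remote fetch whenever `have_working_dns("ews.netgate.com")` returns False. Caching the result for a short period would avoid paying even the short deadline on every page load.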

#15 Updated by Tom Embt 15 days ago

Looks like Pieter and I have come to the same conclusion (see comment 10), hopefully a fix isn't too far out.
