pkg update checking with no Internet access kills web GUI
When a system can't reach the Internet and is checking for updates, pkg processes start piling up, eventually killing the GUI (nginx returning 504 because php-fpm doesn't return).
pkg processes like the following start piling up:
root 25840 0.0 1.5 45144 7268 - I 2:43AM 0:00.00 /usr/sbin/pkg search --raw-format json-compact pfSense-base
root 25984 0.0 1.6 45144 7676 - S 2:43AM 0:00.00 /usr/sbin/pkg search --raw-format json-compact pfSense-base
root 63921 0.0 1.5 45144 7328 - S 2:43AM 0:00.00 /usr/sbin/pkg search --raw-format json-compact pfSense-base
root 64216 0.0 1.6 45144 7640 - S 2:43AM 0:00.00 /usr/sbin/pkg search --raw-format json-compact pfSense-base
root 66597 0.0 1.5 45144 7328 - S 2:43AM 0:00.00 /usr/sbin/pkg search --raw-format json-compact pfSense-base
root 66628 0.0 1.6 45144 7640 - S 2:43AM 0:00.00 /usr/sbin/pkg search --raw-format json-compact pfSense-base
It appears these processes never time out.
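As an illustration of bounding such a call, here is a minimal sketch in Python (not pfSense's actual PHP code). The helper name run_with_timeout and the timeout value are hypothetical. It only shows the general idea of killing a child process that does not return in time.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its stdout, or None if it ran longer than timeout_s.

    Hypothetical helper for illustration only.
    """
    try:
        done = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and reaps it before raising,
        # so no stuck process is left behind.
        return None
    return done.stdout


# Example call using the command from the process listing above;
# the 30-second limit is an arbitrary assumption:
# run_with_timeout(
#     ["/usr/sbin/pkg", "search", "--raw-format", "json-compact", "pfSense-base"],
#     30,
# )
```

With a bound like this, a caller gets None instead of waiting indefinitely when the package server cannot be reached.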
Only execute remote search operation on first call of get_pkg_info(), this should fix #6177
#1 Updated by robi robi about 3 years ago
A quick suggestion for the routine which starts re-installing packages after the first reboot following the update:
- There should be a connectivity check against the pfSense package servers before trying to download anything.
- The check should be able to pass through a preconfigured proxy if one is in use, so pinging the server would not be appropriate. Downloading a dummy/test file from the server(s) would be a better indication that a connection to the package servers is available.
- If no connection is available, print something useful to the user and continue booting normally.
- Also put a notice on the TTY console and the Web Dashboard that package re-installation didn't succeed, and give the user the chance to trigger the package reinstall later, when a connection is available.
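The connectivity check suggested above could look roughly like the following sketch, written in Python for illustration (pfSense itself does not use this code). The function name can_reach is hypothetical, and which URL to use as the test file is an assumption. Python's urlopen picks up proxy settings from the http_proxy/https_proxy environment variables by default, which matches the requirement that the check pass through a configured proxy.

```python
import http.client
import urllib.error
import urllib.request


def can_reach(url, timeout_s=5.0):
    """Return True if the first byte of url can be fetched within timeout_s.

    Hypothetical helper: any failure (DNS, refused connection, timeout,
    HTTP error, malformed URL) is reported as "not reachable".
    """
    try:
        # urlopen() honours proxy environment variables by default.
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            resp.read(1)
            return True
    except (urllib.error.URLError, OSError, ValueError,
            http.client.HTTPException):
        return False


# Possible use before a package reinstall; the choice of test file is
# an assumption:
# if not can_reach("https://pkg.pfsense.org/pfSense_v2_3_1_amd64-core/meta.txz"):
#     print("Package servers unreachable, skipping package reinstall")
```

The timeout applies to each blocking socket operation rather than to the whole request, so this is a coarse check, not a strict upper bound on total time.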
#2 Updated by Jan Jurkus about 3 years ago
Well, it's not only the GUI that wasn't working any more. I could not get an IP with DHCP from pfSense. Look at the load average in the attached screenshot.
After a night's sleep it was still broken, so I opted to restart pfSense from the console. It would not complete that, and apparently it kept waiting for something. I did not know how to return to a command line and kill any remaining processes, so I reset the thing. Luckily everything worked afterwards.
As a sidenote: it says 'retrieving overview data' under ipsec on the dashboard. Didn't this just show up instantly in 2.2? The ntp status on the dashboard also showed a large error about a gateway, something I was unable to make a screenshot of.
#4 Updated by Chris Buechler about 3 years ago
- Status changed from Feedback to Confirmed
What's been done so far all works and definitely helps.
But, pkg processes will still pile up and hang the GUI, if you refresh dashboard a few times. Then a 'killall pkg' brings it back immediately.
pkg_update probably should skip the pkg_call if it's already running.
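One common way to skip a call when another instance is already running is a non-blocking lock file. The sketch below is in Python for illustration only and is not how pfSense implements pkg_update. The name single_instance and the lock path are hypothetical.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path):
    """Yield True if this caller got the lock, False if someone else holds it.

    Hypothetical helper using a non-blocking flock(); it never waits.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            # Another update check is already in progress: skip.
            yield False
            return
        try:
            yield True
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


# Possible use (the lock path is an assumption):
# with single_instance("/tmp/pkg_update.lock") as acquired:
#     if acquired:
#         ...  # run the pkg call
#     else:
#         ...  # return cached results instead of starting another pkg
```

Because the lock is released when the holder exits, a crashed update check does not leave a stale lock that blocks later ones.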
#11 Updated by Nicola Bressan about 3 years ago
I've experienced a similar issue.
I have an IPv6 tunnel configured.
The pfSense GUI was checking for updates and resolved pkg.pfsense.org to an IPv6 address (AAAA record), and the update check could not complete.
This caused problems, including 100% CPU usage by the pkg process.
Checking "Prefer IPv4 over IPv6" under System > Advanced on the Networking tab was a good workaround, but this indicates that the 100% CPU bug is still there when pkg.pfsense.org is not responding correctly... right?
Can you have a look into it?
Or maybe fix IPv6 answering on pkg.pfsense.org :)
#13 Updated by Nicola Bressan about 3 years ago
Chris Buechler wrote:
IPv6 works just fine on pkg.pfsense.org. You're not hitting the issue here, please start a forum thread to discuss.
OK, I'll do that, but it is a fact that with IPv6 preference enabled, the package update was not working correctly.
When I don't enable "Prefer to use IPv4 even if IPv6 is available" I get 100% CPU usage for pkg search:
[2.3.1-RELEASE][root@pfSense.nbr.local]/root: ps aux | grep pkg
root 31701 93.0 1.8 45172 8980 - R 2:26PM 0:44.03 /usr/sbin/pkg search --raw-format json-compact pfSense-base
root 7399 0.0 6.7 225160 32624 - S 9:19PM 0:39.61 /usr/local/bin/php -f /usr/local/pkg/pfblockerng/pfblockerng.inc dnsbl
root 31683 0.0 1.6 45172 7624 - I 2:26PM 0:00.01 /usr/sbin/pkg search --raw-format json-compact pfSense-base

last pid: 19104; load averages: 0.91, 0.48, 0.25 up 0+17:10:31 14:28:17
47 processes: 4 running, 43 sleeping
CPU: 68.2% user, 0.0% nice, 31.8% system, 0.0% interrupt, 0.0% idle
Mem: 19M Active, 127M Inact, 103M Wired, 38M Buf, 214M Free
Swap: 1024M Total, 1024M Free

PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
31701 root 1 102 0 45172K 8980K RUN 2:00 100.00% pkg
35155 unbound 1 20 0 38604K 13856K kqread 0:42 0.00% unbound
This is what's happening:
with IPv6 [before hitting Google to show it's working...]
[2.3.1-RELEASE][root@pfSense.nbr.local]/root: fetch -v -6 https://www.google.it
looking up www.google.it
connecting to www.google.it:443
SSL options: 83004bff
Peer verification enabled
Using CA cert file: /usr/local/etc/ssl/cert.pem
Verify hostname
TLSv1.2 connection established using ECDHE-ECDSA-AES128-GCM-SHA256
Certificate subject: /C=US/ST=California/L=Mountain View/O=Google Inc/CN=*.google.com
Certificate issuer: /C=US/O=Google Inc/CN=Google Internet Authority G2
requesting https://www.google.it/
fetch: https://www.google.it: size of remote file is not known
www.google.it 10 kB 863 kBps 00m00s
Now the test:
[2.3.1-RELEASE][root@pfSense.nbr.local]/root: fetch -v -6 https://pkg.pfsense.org/pfSense_v2_3_1_amd64-core/meta.txz
looking up pkg.pfsense.org
connecting to pkg.pfsense.org:443
SSL options: 83004bff
Peer verification enabled
Using CA cert file: /usr/local/etc/ssl/cert.pem
and it hangs here...
over IPv4 it works:
[2.3.1-RELEASE][root@pfSense.nbr.local]/root: fetch -v -4 https://pkg.pfsense.org/pfSense_v2_3_1_amd64-core/meta.txz
looking up pkg.pfsense.org
connecting to pkg.pfsense.org:443
SSL options: 83004bff
Peer verification enabled
Using CA cert file: /usr/local/etc/ssl/cert.pem
Verify hostname
TLSv1.2 connection established using ECDHE-RSA-AES256-GCM-SHA384
Certificate subject: /OU=Domain Control Validated/OU=PositiveSSL Wildcard/CN=*.pfsense.org
Certificate issuer: /C=GB/ST=Greater Manchester/L=Salford/O=COMODO CA Limited/CN=COMODO RSA Domain Validation Secure Server CA
requesting https://pkg.pfsense.org/pfSense_v2_3_1_amd64-core/meta.txz
remote size / mtime: 944 / 1466102812
meta.txz 100% of 944 B 2265 kBps 00m00s
[2.3.1-RELEASE][root@pfSense.nbr.local]/root:
The problem that arises is always the same: when the pkg update/fetch does not complete correctly, the GUI hangs and the pkg process consumes 100% CPU.
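The IPv4 versus IPv6 comparison done with fetch -4 / fetch -6 above can be approximated with a per-address-family TCP probe that has a timeout. This is a minimal sketch in Python for illustration. The function name can_connect is hypothetical, and it only tests the TCP connection, so it would not detect a stall that happens later, during the TLS handshake.

```python
import socket


def can_connect(host, port, family, timeout_s=5.0):
    """Return True if a TCP connection to host:port succeeds over `family`.

    Hypothetical helper; `family` is socket.AF_INET (IPv4) or
    socket.AF_INET6 (IPv6). Never blocks longer than timeout_s per address.
    """
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror:
        # No address of this family could be resolved.
        return False
    for fam, socktype, proto, _canonname, addr in infos:
        try:
            with socket.socket(fam, socktype, proto) as s:
                s.settimeout(timeout_s)
                s.connect(addr)
                return True
        except OSError:
            # Refused, unreachable, or timed out: try the next address.
            continue
    return False


# Possible use, mirroring the two fetch tests:
# can_connect("pkg.pfsense.org", 443, socket.AF_INET6)
# can_connect("pkg.pfsense.org", 443, socket.AF_INET)
```

A probe like this returns within a bounded time for either address family instead of hanging the way the fetch -6 test did.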