Bug #8519

pfSense update from the webGUI fails

Added by Steve Wheeler about 1 year ago. Updated 10 months ago.

Status: Resolved
Priority: Normal
Category: Upgrade
Target version:
Start date: 05/15/2018
Due date:
% Done: 100%
Estimated time:
Affected Version: 2.4.3_1
Affected Architecture: All

Description

When running an update from the web interface, the update can appear to fail and the page reports "System update failed".

In that situation the update may continue in the background and reboot the system some minutes later, or it may need to be run a second time, at which point it succeeds. The GUI page gives no indication of which of those situations you are in.

Selection_398.png (19 KB), Steve Wheeler, 05/15/2018 06:31 AM
Selection_458.png (68.7 KB), Steve Wheeler, 08/06/2018 05:53 PM

Associated revisions

Revision dea792c2 (diff)
Added by Steve Beaver about 1 year ago

Fixed #8519
Added a simple test to ensure the instance of pfSense-upgrade is the instance started by the upgrade GUI page, not some other process

Revision b3cd2eb4 (diff)
Added by Renato Botelho 10 months ago

Fix #8519

- Remove possible leftover sockfile before calling pfSense-upgrade
- Wait until sockfile exists while process is still running
- Make sure to start polling only if process is running and sockfile exists
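
The three steps in this commit message can be sketched roughly as follows. This is a minimal Python illustration of the sequence only, not the actual pfSense GUI code (which is PHP); the function name `start_and_wait_for_sockfile`, its parameters, and the timeout are assumptions made for the example.

```python
import os
import subprocess
import time


def start_and_wait_for_sockfile(cmd, sockfile, timeout=30.0, interval=0.1):
    """Start cmd and wait for it to create sockfile (hypothetical helper).

    Returns (process, ready). ready is True only if the process is still
    running and sockfile exists, i.e. it is safe to start polling.
    """
    # Remove a possible leftover sockfile from an earlier run, so a stale
    # file cannot be mistaken for one created by the new process.
    try:
        os.unlink(sockfile)
    except FileNotFoundError:
        pass

    # Start the process (a stand-in for the real upgrade call).
    proc = subprocess.Popen(cmd)

    # Wait until sockfile exists, but only while the process is still running.
    # The timeout is an assumption of this sketch, not taken from the commit.
    deadline = time.monotonic() + timeout
    while (proc.poll() is None
           and not os.path.exists(sockfile)
           and time.monotonic() < deadline):
        time.sleep(interval)

    # Polling should start only if the process is running and sockfile exists.
    ready = proc.poll() is None and os.path.exists(sockfile)
    return proc, ready
```

A caller would begin reading progress from the socket only when `ready` is true, and otherwise report that the process exited or never became available.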

History

#1 Updated by Steve Beaver about 1 year ago

Chris Linstruth can reproduce the “fails once then succeeds” issue by simply installing 2.4.3 CE and attempting a GUI upgrade. More here: https://netgate.slack.com/files/U12B39VD4/FAP4XGASU/screen_shot_2018-05-15_at_12.04.27_am.png

#2 Updated by Steve Beaver about 1 year ago

Based on the message that we can see in the GUI, it seems that a `pfSense-upgrade -c` call happened to check whether a newer version is available, but the GUI treated it as the real upgrade process. That output is from `pfSense-upgrade -c` for sure. It was reported to happen in the past, but I was never able to reproduce it.

We need to isolate all places that call pfSense-upgrade -c (including a cronjob) and think about a way to make sure the real upgrade call is running.

#3 Updated by Steve Beaver about 1 year ago

  • Assignee set to Renato Botelho

#4 Updated by Steve Beaver about 1 year ago

  • Status changed from New to Feedback
  • % Done changed from 0 to 100

#5 Updated by Jim Pingle about 1 year ago

  • Status changed from Feedback to New
  • Assignee changed from Renato Botelho to Steve Beaver

On an SG-1000 I occasionally get the "The update system is busy. Please try again later" message despite starting the upgrade from the GUI. It isn't consistent, however. There may be a timing issue here on busy/slower platforms.

#6 Updated by Steve Beaver 11 months ago

  • Status changed from New to This Sprint

#7 Updated by Steve Beaver 11 months ago

  • Status changed from This Sprint to New

#8 Updated by Steve Wheeler 11 months ago

I have one test box which hits this on every single update. Always reports failure. Always updates fine in the background.

The machine itself is fast (G1820) but the storage is slow.

#9 Updated by Steve Beaver 11 months ago

  • Assignee changed from Steve Beaver to Renato Botelho

#10 Updated by Steve Beaver 11 months ago

Debugging shows that the PID file used to determine whether the upgrade process is still running goes away unexpectedly.

$pidfile = $g['varrun_path'] . '/' . $g['product_name'] . '-upgrade.pid';
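
To illustrate why a vanishing PID file matters, here is a minimal Python sketch of a PID-file liveness check. It is not the pfSense implementation; the function name `upgrade_running` and the example values for the path components are assumptions made for illustration.

```python
import os


def upgrade_running(pidfile):
    """Return True if pidfile names a live process (hypothetical helper)."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        # A missing or unreadable PID file looks the same as "not running",
        # even if the process itself is still alive.
        return False
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


# Path built the same way as the PHP line above; both values are assumed here.
varrun_path = "/var/run"
product_name = "pfSense"
pidfile = os.path.join(varrun_path, product_name + "-upgrade.pid")
```

With a check of this kind, a PID file that disappears while the upgrade is still in progress makes the caller conclude that the process has ended, which would match the false failure reports described in this issue.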

Reassigning to Renato, who has the upgrade hood open ATM for another issue.

#11 Updated by Renato Botelho 11 months ago

  • Status changed from New to Resolved

System was running a modified version.

#12 Updated by Renato Botelho 11 months ago

  • Status changed from Resolved to Feedback

We were able to reproduce it, and a fix was pushed in 1d8cd2215b2a0131f69d2879f77c01204b7928c5

#13 Updated by Jim Pingle 10 months ago

  • Status changed from Feedback to This Sprint

Since that last commit, systems that were not experiencing problems before now fail to track the updates. They print this and nothing else, but the update continues:

Please wait while the update system initializes

#14 Updated by Renato Botelho 10 months ago

  • Status changed from This Sprint to Feedback

#15 Updated by James Dekker 10 months ago

Can't reproduce in VM from 2.4.3 CE to 2.4.3_1, or 2.4.3_1 to 2.4.4 latest snapshot. Is there a specific device, design or configuration this should be tested with?

#16 Updated by Steve Wheeler 10 months ago

Not seen any update issues for a few snaps now on a number of boxes.

#17 Updated by Renato Botelho 10 months ago

  • Status changed from Feedback to Resolved

#18 Updated by Jim Pingle 10 months ago

I had several hitting this in my lab but am only just now getting them onto snaps which include the latest fix. Let's give it another day / batch of updates to see how they all fare. If any are still broken I'll reopen.

#19 Updated by Jim Pingle 10 months ago

All of my hosts that had issues before appear to be OK when upgrading from snaps from early yesterday to the latest available.
