Bug #6803
openCSRF timeout occurs when it (probably) shouldn't
0%
Description
Expected behaviour and error
A feature/change in release 2.1 was that CSRF timeout was changed to be the same as session timeout [1]. When a CSRF timeout occurs, the user should be offered a "retry" option, in which a new CSRF token is generated; they can then continue without disruption.
But when CSRF fails during an AJAX call, these aren't respected and the result is a page hang or failure.
Example use-case
A file is being worked on using diag_edit.php. The session timeout is lengthy (many hours). However when the "edit file" page doesn't have any server interaction for a few (<< 8) hours, and the "save" button is clicked, a CSRF timeout error occurs which is not recoverable. No error message is produced and the page just "hangs"; however the AJAX "browse" button continues to work and other GUI pages can be opened in new tabs without re-entering username+pw, showing that the user session is also still active.
Work done to replicate and explore
diag_edit.php doesn't contain any handler for AJAX "fail" (a bug in itself), and the main symptom of this issue is hanging when one tries to save, so I made the following change to the AJAX handler in function saveFile(file):
complete: function(req) { var values = req.responseText.split("|"); $("#fileStatus").html(values[1]); }, error: function(req, text,err) { alert('AJAX ERROR DETECTED. RETURN VALUES:\n\nreq:\n ' + JSON.stringify(req) + '\n\ntext:\n ' + JSON.stringify(text) + '\n\nerr:\n ' + JSON.stringify(err)'); }, timeout: 5000
I then set user session timeout to ~ 3 days, loaded a file in diag_edit.php, and left the computer unattended for around 5 hours. Upon return I clicked "save" to save the file, and received an AJAX error notification showing that csrf_callback() had been called, confirming that the previous "hang" on saving was due to csrf timeout/token validation failure during the AJAX POST call. The return value was an entire web page of HTML.
I could still click "browse" in the diag_edit.php GUI and receive a full file listing (surely if one request isn't valid then the other shouldn't be, either?)
I could still right-click the menu/header and open other pages in new tabs (showing that the user session had not timed out and was still active).
Possible issues
1. CSRF TIMEOUT ISN'T SAME AS SESSION TIMEOUT - Although CSRF timeout should match session timeout, it doesn't always, and can time out long before the session ends. In this case the GUI page became non-responsive due to CSRF timeout while my session was still very much active.
2. SOME AJAX BASED PAGES ARE MISSING ERROR: HANDLERS AND TIMEOUTS IN THEIR AJAX CALLS - Some pages that use AJAX don't have error handling, so they'll appear to hang on failure. Presumably all ajax code should have some kind of error: handler, or the GUI code should have a default AJAX error: handler for cases when the page doesn't have one. Also ajax calls should generally have some kind of timeout so they aren't allowed to hang indefinitely if there's a problem.
3. csrf_callback() DOESN'T RETURN A USEFUL ERROR IF IT OCCURS DURING AN AJAX POST CALL - If there is an ajax POST request which fails, the GUI code doesn't handle it or display a useful error message.
4. CSRF "CLICK TO RETRY" MISHANDLES IF IT OCCURS DURING AN AJAX POST CALL - By default, if CSRF times out, the CSRF handler presents an option to click and retry the request. It returns the full html of a web page which has this effect. This works fine if the POST was a form submission. But if the POST was an AJAX request, then the return of a full HTML page offering a retry option is received as meaningless response data (not the expected data/format) so the user doesn't get an option to retry.
I'd like to fix this but not sure what fix is needed. I had tried to fix this previously (PR #2981) but this was rejected. I don't understand the PR comment "You don't want to change the CSRF timeout value based on the session timeout", in light of the statement in 2.1 release notes "Set CSRF Magic token timeout to be the same as the login expiration". Any guidance what would be needed to improve this area would be appreciated, and I would try to produce a patch for it if able, once I understand what's needed.
[1] https://doc.pfsense.org/index.php/2.1_New_Features_and_Changes