Bug #7790

dpinger, or the code using it, falsely reports a down gateway as up after dpinger is restarted

Added by Pi Ba about 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Gateway Monitoring
Target version:
Start date: 08/21/2017
Due date:
% Done: 0%
Estimated time:
Affected Version: 2.4
Affected Architecture: All

Description

dpinger, or the code using it, falsely reports a down gateway as up after dpinger is restarted.
Or maybe it is the code using dpinger that wrongly treats the gateway as up, since dpinger itself only reports that it has measured no loss and no latency so far, while it actually hasn't measured anything in those first few seconds after it starts.

This adds the gateway to gateway groups and removes it again seconds later, while nothing has changed with regard to the gateway it is monitoring.

Anyhow, a pull request that works around this issue is waiting: https://github.com/pfsense/pfsense/pull/3763
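
To illustrate the ambiguity described above, here is a minimal sketch in Python (not the pfSense code, which is PHP, and not necessarily what the pull request does). It assumes a status line of the form "name latency_avg latency_stddev loss_pct" with latencies in microseconds; the names `Report`, `parse_report`, `classify` and the 20% loss threshold are hypothetical. The idea is that a report with zero latency and zero loss most likely means "nothing measured yet", so it is classified as pending rather than up.

```python
from typing import NamedTuple


class Report(NamedTuple):
    """One status line from the monitor (field layout is an assumption)."""
    name: str
    latency_avg_us: int
    latency_stddev_us: int
    loss_pct: int


def parse_report(line: str) -> Report:
    # Assumed format: "<name> <latency_avg> <latency_stddev> <loss_pct>"
    name, avg, stddev, loss = line.split()
    return Report(name, int(avg), int(stddev), int(loss))


def classify(report: Report, loss_down_pct: int = 20) -> str:
    # A real reply never has exactly 0 us latency, so 0 latency together
    # with 0 loss is treated as "no samples yet", not as a healthy gateway.
    if report.latency_avg_us == 0 and report.loss_pct == 0:
        return "pending"
    # Hypothetical threshold: at or above this loss the gateway is down.
    if report.loss_pct >= loss_down_pct:
        return "down"
    return "up"
```

With this sketch, a caller could keep the previous gateway state while the result is "pending" instead of briefly adding the gateway to its groups.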

History

#1 Updated by Renato Botelho about 2 years ago

  • Status changed from New to Feedback

PR has been merged, thanks

#2 Updated by Jim Thompson about 2 years ago

  • Assignee set to Pi Ba

#3 Updated by Pi Ba about 2 years ago

Works for me :). But then again, I made the patch. It would be weird if it didn't 'fix' my reported issue.

The question that remains is whether this is the proper workaround. Or should dpinger be changed to report more clearly that it hasn't gathered enough data yet to give usable results?

Checked with version:
2.4.0-RC (amd64)
built on Mon Sep 11 21:38:52 CDT 2017
FreeBSD 11.0-RELEASE-p12

#4 Updated by Jim Pingle about 2 years ago

  • Status changed from Feedback to Resolved

Seems OK, though I did have to push a fix because this resulted in PHP errors in some cases. See 59104a6ff6c862482eddb9696fd8d22dec89052e

#5 Updated by Pi Ba about 2 years ago

So the 'file exists' check returns true, but then the file doesn't exist. Wondering how that can be ;) Well, I probably don't want to know (can of worms).

#6 Updated by Jim Pingle about 2 years ago

Because during the sleep while it's in the loop, the file can disappear if something happens in the background.
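
The race described here (a file that passes an existence check and then vanishes before it is used) can be sketched in Python as follows. This is an illustration only, assuming a plain status file as a stand-in for whatever dpinger actually exposes; `read_status` is a hypothetical helper. Instead of checking first and reading later, the read itself is attempted and a missing file is handled as a normal outcome.

```python
from typing import Optional


def read_status(path: str) -> Optional[str]:
    """Return the file contents, or None if the file is gone.

    There is no separate existence check: the file may disappear
    between any check and the open, so the open is simply attempted
    and its failure is handled.
    """
    try:
        with open(path, "r") as fh:
            return fh.read().strip()
    except FileNotFoundError:
        # The file was removed in the background (for example because
        # the monitor was restarted); report "no data" instead of failing.
        return None
```

A caller then treats `None` the same way as "no data yet", which avoids an error even if the file is removed during a retry loop.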

#7 Updated by Pi Ba about 2 years ago

That's kind of what I meant with "I don't want to know": even though the check and the usage are now always only microseconds apart, 'something' might still happen in the background.
But yes, the chance of it happening has likely decreased so much that no one will encounter it again (anytime soon).

Adding 'locks' around lots of things to avoid such situations 100% would likely cause more issues than it solves, and even then 'something' might happen that was not planned: a crash of dpinger, running out of memory, a corrupted memory bit, or something else unforeseen :)
I should probably leave it at that and hope that pfSense 3.0 has a different framework that avoids most of these 'unexpected' background changes.

Anyhow thanks for improving it!

#8 Updated by Jim Pingle about 2 years ago

Yes it was definitely weird, I saw it on maybe 3/12 test systems (so ~1/4) but not every boot either. They're all OK with the current change though.
