Bug #6407
Closed
Watchdog Timeout -- Error on VMware 5.5 Virtualized PFsense 2.3 Release
Description
I am seeing an issue where one of my interfaces completely drops offline: em2, to be precise, which is an e1000 interface in VMware.
When this happens, I see the Watchdog Timeout error scrolling over and over in the console, referencing em2.
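For anyone triaging the same symptom, here is a minimal sketch for counting watchdog messages per interface in a saved copy of the console or system log. The message shape (`<ifname>: Watchdog timeout ...`) is an assumption, since the report only says the error references em2, and `count_watchdog_timeouts` is a hypothetical helper name, not part of pfSense.

```python
import re
from collections import Counter

# Assumed message shape: "<ifname>: Watchdog timeout ..." (for example "em2: Watchdog timeout").
# Adjust the pattern if the driver on your system words the message differently.
WATCHDOG_RE = re.compile(r"\b([a-z]+\d+): watchdog timeout", re.IGNORECASE)


def count_watchdog_timeouts(log_text):
    """Return a Counter mapping interface name to the number of watchdog timeout lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = WATCHDOG_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


if __name__ == "__main__":
    # Illustrative sample text, not taken from the affected system.
    sample = (
        "em2: Watchdog timeout -- resetting\n"
        "em0: link state changed to UP\n"
        "em2: Watchdog timeout -- resetting\n"
    )
    for ifname, hits in count_watchdog_timeouts(sample).most_common():
        print(f"{ifname}: {hits}")
```

Running this against logs taken before and after swapping the NIC type would show whether the messages are confined to a single e1000 interface.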
Rebooting seems to fix it, but only temporarily, as it recurs a number of hours later. I even tried disconnecting the interface and then reconnecting it; this causes the error message to stop scrolling, but connectivity is not restored.
Another symptom is that while this is happening, attempting to do anything with the affected interface causes the web GUI to stall out nearly endlessly, and I have to restart the web GUI to even get back into the device's web admin.
I have disabled said interface for the moment, un-assigned its NIC, and assigned the affected interface a new NIC that is a VMXNET3 interface, and so far that interface is working. The interface only has 2 Windows 7 VMs on it that are only used for testing, so it sees little to no traffic; there is no way it's getting overloaded or anything like that.
Config when the issue occurred:
Interfaces:
E1000 = WAN = X.X.X.130/29
E1000 = LAN = 192.168.1.1/24 = DHCP 100-199
E1000 = OPT1 = 10.10.10.1/24 = DHCP 10-50
VMXNET3 = AZR = 172.16.0.1/24 - disabled and no longer used
Config after the change:
Interfaces:
E1000 = WAN = X.X.X.130/29
E1000 = LAN = 192.168.1.1/24 = DHCP 100-199
VMXNET3 = OPT1 = 10.10.10.1/24 = DHCP 10-50
E1000 = the problematic one - un-assigned
I have installed the openvm-tools package on pfSense using the built-in package manager.
NOTE: This pfSense has been in place for 2 years and functioned flawlessly until the recent upgrade from version 2.2.6 to 2.3.
Only basic NAT rules for port forwarding are in use; no CARP, LAGG, load balancing, or VLANs are in use here.
Basic native subnets, DHCP, not even IPv6, and 1 IPsec tunnel that has given us 0 issues. That's literally it.
I'm a 24-time certified network engineer and cyber security expert, certified to the FBI's Tier 1 ranking, so I'm not misstating any of this, btw.
I'm also a 15-year Linux developer and admin, so it's not my first rodeo here. I have about 18 pfSense deployments in place, only 1 of which is not in a production environment, so I'm really hoping this can be fixed before it starts creeping up on the other units. I'm worried about upgrading them at the moment for fear this issue will surface. Most of the production units are on versions between 2.2.4 and 2.2.6. Some are ALIXs, a few are virtualized, and a couple are SG series; the rest are custom builds using Core 2 Duo PCs.
Updated by Jim Pingle over 9 years ago
- Status changed from New to Duplicate
Sounds like #6296 -- Update to 2.3.1 or 2.3.1_1 and it should be fixed.
Updated by Xander Venterus over 9 years ago
Jim Pingle wrote:
Sounds like #6296 -- Update to 2.3.1 or 2.3.1_1 and it should be fixed.
It would seem, according to the comments, that it is not fixed in 2.3.1.
Updated by Jim Pingle over 9 years ago
One person on the ticket claims it wasn't fixed. Many on the forum and elsewhere have stated it's fixed for them. Either way: update and re-test; if it's still happening, post on the other ticket.