pfSense bugtracker https://redmine.pfsense.org/ 2017-01-15T02:18:16Z
pfSense - Bug #7119: Changing LAGG attributes results in a panic/crash https://redmine.pfsense.org/issues/7119?journal_id=30647 2017-01-15T02:18:16Z Rolf Sommerhalder
<ul></ul><p>Jim Pingle wrote:</p>
<blockquote>
<p>On 2.4, when changing attributes of an assigned LAGG such as the mode or membership, the firewall panics and reboots.</p>
<p>Tested on an 8860 and 4860, so it may be specific to igb. In this case, the lagg instance contained igb4,igb5 in LACP mode, and I attempted to change the mode to Failover. bjaffe encountered the same crash when changing member interfaces.</p>
</blockquote>
<p>On a 2.4 amd64 snapshot running on Supermicro SuperServer 5018D-FN8T systems with X10SDV-TP8F motherboards, changing, for example, the IP address of a VLAN on a LAGG that uses LACP (members igb1, igb2, igb3) also causes a panic, and the kernel subsequently hangs.</p>
<p>Recovery requires a manual reset or power cycle, done remotely via BMC/IPMI. Fortunately the system then restarts, and the changes take effect.</p>
<p>For such situations it would be helpful to get the watchdog working; it is available in the BIOS...</p>
pfSense - Bug #7119: Changing LAGG attributes results in a panic/crash https://redmine.pfsense.org/issues/7119?journal_id=30716 2017-01-18T11:03:45Z Renato Botelho renato@netgate.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Feedback</i></li><li><strong>Assignee</strong> set to <i>Renato Botelho</i></li><li><strong>% Done</strong> changed from <i>0</i> to <i>100</i></li></ul><p>I've cherry-picked FreeBSD-src patches that should fix it:</p>
<p><a class="external" href="https://svnweb.freebsd.org/base?view=revision&revision=310180">https://svnweb.freebsd.org/base?view=revision&revision=310180</a><br /><a class="external" href="https://svnweb.freebsd.org/base?view=revision&revision=310327">https://svnweb.freebsd.org/base?view=revision&revision=310327</a></p>
pfSense - Bug #7119: Changing LAGG attributes results in a panic/crash https://redmine.pfsense.org/issues/7119?journal_id=30754 2017-01-19T11:28:58Z Jim Pingle
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Confirmed</i></li></ul><p>Still crashes on the latest factory snapshot: Wed Jan 18 19:49:46 CST 2017</p>
pfSense - Bug #7119: Changing LAGG attributes results in a panic/crash https://redmine.pfsense.org/issues/7119?journal_id=30785 2017-01-20T11:18:17Z Renato Botelho renato@netgate.com
<ul></ul><p>I couldn't reproduce it on a VM using the em driver, so it is probably something specific to igb, as mentioned.</p>
pfSense - Bug #7119: Changing LAGG attributes results in a panic/crash https://redmine.pfsense.org/issues/7119?journal_id=30788 2017-01-20T14:02:32Z Rolf Sommerhalder
<ul></ul><p>Snapshots from this morning still crash with igb hardware NICs.</p>
pfSense - Bug #7119: Changing LAGG attributes results in a panic/crash https://redmine.pfsense.org/issues/7119?journal_id=30809 2017-01-23T01:01:55Z Rolf Sommerhalder
<ul></ul><p>To be more precise: pfSense does not exactly "crash", as it still responds to ping. SSH sessions that were open before the "crash" remain connected and still accept typed commands, but return no output.</p>
<p>Only a reset or power cycle gets it out of this state (I have not managed to get the watchdog working yet). <br />Thereafter, the changes made to the LAGG right before the "crash" take effect.</p>
pfSense - Bug #7119: Changing LAGG attributes results in a panic/crash https://redmine.pfsense.org/issues/7119?journal_id=30820 2017-01-23T09:07:44Z Jim Pingle
<ul></ul><p>Here, it still panics + dumps + reboots same as it did originally.</p>
pfSense - Bug #7119: Changing LAGG attributes results in a panic/crash https://redmine.pfsense.org/issues/7119?journal_id=30872 2017-01-25T07:34:13Z Renato Botelho renato@netgate.com
<ul><li><strong>Assignee</strong> changed from <i>Renato Botelho</i> to <i>Luiz Souza</i></li></ul>
pfSense - Bug #7119: Changing LAGG attributes results in a panic/crash https://redmine.pfsense.org/issues/7119?journal_id=30940 2017-01-27T15:33:37Z Luiz Souza luiz@netgate.com
<ul><li><strong>Status</strong> changed from <i>Confirmed</i> to <i>Feedback</i></li></ul><p>Fixed in latest snapshot.</p>
<p>Relevant commits:</p>
<p><a class="external" href="https://github.com/pfsense/FreeBSD-src/commit/b5996bd8278c710ce6859cfae2208e175e9b1171">https://github.com/pfsense/FreeBSD-src/commit/b5996bd8278c710ce6859cfae2208e175e9b1171</a><br /><a class="external" href="https://github.com/pfsense/FreeBSD-src/commit/a86883d40fbb81454f6e44c6a759c0142408912d">https://github.com/pfsense/FreeBSD-src/commit/a86883d40fbb81454f6e44c6a759c0142408912d</a></p>
pfSense - Bug #7119: Changing LAGG attributes results in a panic/crash https://redmine.pfsense.org/issues/7119?journal_id=30942 2017-01-27T18:53:49Z Jim Pingle
<ul></ul><p>Seems better now, it doesn't crash. Lots of activity in the log, though:<br /><pre>
Jan 27 19:47:40 master snmpd[47102]: SIOCGIFDESCR (lagg0): Device not configured
Jan 27 19:47:40 master kernel: igb4: lagg_port_destroy: lp_ifflags unclean
Jan 27 19:47:40 master kernel: igb5: lagg_port_destroy: lp_ifflags unclean
Jan 27 19:47:40 master kernel: lagg0: promiscuous mode disabled
Jan 27 19:47:40 master check_reload_status: Linkup starting lagg0
Jan 27 19:47:40 master kernel: lagg0: link state changed to DOWN
Jan 27 19:47:40 master check_reload_status: Syncing firewall
Jan 27 19:47:40 master php-fpm[43135]: /interfaces_lagg_edit.php: Beginning https://portal.pfsense.org configuration backup.
Jan 27 19:47:41 master check_reload_status: Reloading filter
Jan 27 19:47:43 master php-fpm[43135]: /interfaces_lagg_edit.php: End of portal.pfsense.org configuration backup (success).
Jan 27 19:47:43 master snmpd[47102]: SIOCGIFDESCR (lagg0_vlan10): Device not configured
Jan 27 19:47:43 master kernel: ifa_maintain_loopback_route: deletion failed for interface lagg0_vlan10: 3
Jan 27 19:47:43 master kernel: ifa_maintain_loopback_route: deletion failed for interface lagg0_vlan10: 3
Jan 27 19:47:43 master kernel: ifa_maintain_loopback_route: deletion failed for interface lagg0_vlan10: 3
Jan 27 19:47:43 master kernel: carp: demoted by -240 to 240 (vhid removed)
Jan 27 19:47:43 master kernel: ifa_maintain_loopback_route: deletion failed for interface lagg0_vlan10: 3
Jan 27 19:47:43 master kernel: ifa_maintain_loopback_route: deletion failed for interface lagg0_vlan10: 3
Jan 27 19:47:43 master kernel: ifa_maintain_loopback_route: deletion failed for interface lagg0_vlan10: 3
Jan 27 19:47:43 master kernel: carp: demoted by -240 to 0 (vhid removed)
Jan 27 19:47:43 master kernel: lagg0_vlan10: promiscuous mode disabled
Jan 27 19:47:43 master kernel: vlan0: changing name to 'lagg0_vlan10'
Jan 27 19:47:43 master snmpd[47102]: SIOCGIFDESCR (lagg0_vlan10): Device not configured
Jan 27 19:47:43 master snmpd[47102]: SIOCGIFDESCR (vlan0): Device not configured
Jan 27 19:47:43 master kernel: lagg0: promiscuous mode enabled
Jan 27 19:47:43 master kernel: lagg0_vlan10: promiscuous mode enabled
Jan 27 19:47:43 master check_reload_status: Restarting ipsec tunnels
Jan 27 19:47:43 master kernel: carp: demoted by 240 to 240 (interface down)
Jan 27 19:47:43 master kernel: carp: demoted by 240 to 480 (interface down)
Jan 27 19:47:45 master check_reload_status: updating dyndns opt2
Jan 27 19:47:45 master kernel: ifa_maintain_loopback_route: deletion failed for interface lagg0_vlan10: 3
Jan 27 19:47:45 master kernel: ifa_maintain_loopback_route: deletion failed for interface lagg0_vlan10: 3
Jan 27 19:47:45 master kernel: ifa_maintain_loopback_route: deletion failed for interface lagg0_vlan10: 3
Jan 27 19:47:45 master kernel: carp: demoted by -240 to 240 (vhid removed)
Jan 27 19:47:45 master kernel: ifa_maintain_loopback_route: deletion failed for interface lagg0_vlan10: 3
Jan 27 19:47:45 master kernel: ifa_maintain_loopback_route: deletion failed for interface lagg0_vlan10: 3
Jan 27 19:47:45 master kernel: ifa_maintain_loopback_route: deletion failed for interface lagg0_vlan10: 3
Jan 27 19:47:45 master kernel: carp: demoted by -240 to 0 (vhid removed)
Jan 27 19:47:45 master kernel: lagg0: promiscuous mode disabled
Jan 27 19:47:45 master kernel: lagg0_vlan10: promiscuous mode disabled
Jan 27 19:47:46 master snmpd[47102]: SIOCGIFDESCR (lagg0_vlan20): Device not configured
Jan 27 19:47:46 master kernel: lagg0: promiscuous mode enabled
Jan 27 19:47:46 master kernel: lagg0_vlan10: promiscuous mode enabled
Jan 27 19:47:46 master kernel: carp: demoted by 240 to 240 (interface down)
Jan 27 19:47:46 master kernel: carp: demoted by 240 to 480 (interface down)
Jan 27 19:47:46 master kernel: vlan1: changing name to 'lagg0_vlan20'
Jan 27 19:47:46 master snmpd[47102]: SIOCGIFDESCR (vlan1): Device not configured
Jan 27 19:47:59 master php-fpm[94047]: /rc.newipsecdns: IPSEC: One or more IPsec tunnel endpoints has changed its IP. Refreshing.
Jan 27 19:47:59 master check_reload_status: Reloading filter
</pre></p>
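<p>Most of the excerpt above consists of a few messages repeated many times. The following is a minimal sketch for tallying them, assuming the classic BSD syslog line layout seen in the log; the names <code>SYSLOG_RE</code>, <code>tally_messages</code> and <code>sample</code> are hypothetical and not part of pfSense.</p>

```python
import re
from collections import Counter

# Assumed layout: "Mon DD HH:MM:SS host tag[pid]: message" (the PID is optional).
SYSLOG_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+"
    r"(?P<tag>[^\s:\[]+)(?:\[\d+\])?:\s+(?P<msg>.*)$"
)


def tally_messages(lines):
    """Count identical (tag, message) pairs, ignoring timestamps and PIDs."""
    counts = Counter()
    for line in lines:
        match = SYSLOG_RE.match(line.strip())
        if match:
            counts[(match.group("tag"), match.group("msg"))] += 1
    return counts


# A few lines copied from the log excerpt above.
sample = [
    "Jan 27 19:47:43 master kernel: ifa_maintain_loopback_route: deletion failed for interface lagg0_vlan10: 3",
    "Jan 27 19:47:43 master kernel: ifa_maintain_loopback_route: deletion failed for interface lagg0_vlan10: 3",
    "Jan 27 19:47:43 master kernel: carp: demoted by -240 to 240 (vhid removed)",
    "Jan 27 19:47:43 master snmpd[47102]: SIOCGIFDESCR (vlan0): Device not configured",
]

# Print the most frequent messages first.
for (tag, msg), n in tally_messages(sample).most_common():
    print(f"{n:3d}  {tag}: {msg}")
```

<p>Feeding the full excerpt through the same function would show which warnings dominate, which may help when deciding what belongs in a follow-up ticket.</p>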
<p>If that is normal/expected then we can close this.</p>
pfSense - Bug #7119: Changing LAGG attributes results in a panic/crash https://redmine.pfsense.org/issues/7119?journal_id=30944 2017-01-27T21:04:50Z Luiz Souza luiz@netgate.com
<ul></ul><p>Yes, the messages do not seem related to the original bug (crash at ifconfig laggX destroy).</p>
<p>Let's open a new ticket to track these warnings.</p>
pfSense - Bug #7119: Changing LAGG attributes results in a panic/crash https://redmine.pfsense.org/issues/7119?journal_id=30976 2017-01-29T21:46:57Z Luiz Souza luiz@netgate.com
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Resolved</i></li></ul>
pfSense - Bug #7119: Changing LAGG attributes results in a panic/crash https://redmine.pfsense.org/issues/7119?journal_id=34731 2017-10-25T13:48:58Z Michael OBrien
<ul></ul><p>Luiz Souza wrote:</p>
<blockquote>
<p>Yes, the messages do not seem related to the original bug (crash at ifconfig laggX destroy).</p>
<p>Let's open a new ticket to track these warnings.</p>
</blockquote>
<p>Was this new ticket opened? When I change LAGG interface settings via the pfSense GUI or a command prompt, my pfSense 2.4.1 box (using igb drivers) cannot ping anything on the LAGG until I completely reboot it.</p>
<p>Nothing interesting in dmesg. Here's what shows up in system.log - you'll see a lot of sync noise, but this happened before HA was configured as well.</p>
<pre>
Oct 25 11:43:26 fw-lvdc-01 check_reload_status: Syncing firewall
Oct 25 11:43:27 fw-lvdc-01 php-fpm[52624]: /rc.filter_synchronize: Beginning XMLRPC sync data to https://172.16.0.2:443/xmlrpc.php.
Oct 25 11:43:27 fw-lvdc-01 php-fpm[52624]: /rc.filter_synchronize: XMLRPC reload data success with https://172.16.0.2:443/xmlrpc.php (pfsense.host_firmware_version).
Oct 25 11:43:27 fw-lvdc-01 php-fpm[52624]: /rc.filter_synchronize: XMLRPC versioncheck: 17.3 -- 17.3
Oct 25 11:43:27 fw-lvdc-01 php-fpm[52624]: /rc.filter_synchronize: Beginning XMLRPC sync data to https://172.16.0.2:443/xmlrpc.php.
Oct 25 11:43:28 fw-lvdc-01 php-fpm[52624]: /rc.filter_synchronize: XMLRPC reload data success with https://172.16.0.2:443/xmlrpc.php (pfsense.restore_config_section).
Oct 25 11:43:28 fw-lvdc-01 php-fpm[52624]: /rc.filter_synchronize: Beginning XMLRPC sync data to https://172.16.0.2:443/xmlrpc.php.
Oct 25 11:43:28 fw-lvdc-01 check_reload_status: Linkup starting igb2
Oct 25 11:43:28 fw-lvdc-01 kernel: igb2: link state changed to DOWN
Oct 25 11:43:28 fw-lvdc-01 kernel: igb3: link state changed to DOWN
Oct 25 11:43:28 fw-lvdc-01 kernel: lagg0: link state changed to DOWN
Oct 25 11:43:28 fw-lvdc-01 check_reload_status: Restarting ipsec tunnels
Oct 25 11:43:28 fw-lvdc-01 check_reload_status: Linkup starting igb3
Oct 25 11:43:28 fw-lvdc-01 check_reload_status: Linkup starting lagg0
Oct 25 11:43:29 fw-lvdc-01 check_reload_status: Reloading filter
Oct 25 11:43:29 fw-lvdc-01 check_reload_status: Reloading filter
Oct 25 11:43:29 fw-lvdc-01 php-fpm[89611]: /rc.linkup: Hotplug event detected for MGMT(lan) static IP (10.50.1.1 )
Oct 25 11:43:30 fw-lvdc-01 check_reload_status: updating dyndns lan
Oct 25 11:43:31 fw-lvdc-01 php-fpm[52624]: /rc.filter_synchronize: XMLRPC reload data success with https://172.16.0.2:443/xmlrpc.php (pfsense.filter_configure).
Oct 25 11:43:32 fw-lvdc-01 check_reload_status: Linkup starting igb2
Oct 25 11:43:32 fw-lvdc-01 kernel: igb2: link state changed to UP
Oct 25 11:43:32 fw-lvdc-01 kernel: lagg0: link state changed to UP
Oct 25 11:43:32 fw-lvdc-01 check_reload_status: Linkup starting lagg0
Oct 25 11:43:32 fw-lvdc-01 check_reload_status: Linkup starting igb3
Oct 25 11:43:32 fw-lvdc-01 kernel: igb3: link state changed to UP
Oct 25 11:43:32 fw-lvdc-01 check_reload_status: Reloading filter
Oct 25 11:43:32 fw-lvdc-01 php-fpm[87800]: /interfaces.php: Creating rrd update script
Oct 25 11:43:33 fw-lvdc-01 php-fpm[87800]: /rc.linkup: Hotplug event detected for MGMT(lan) static IP (10.50.1.1 )
Oct 25 11:43:33 fw-lvdc-01 check_reload_status: Reloading filter
Oct 25 11:43:33 fw-lvdc-01 check_reload_status: rc.newwanip starting lagg0
Oct 25 11:43:34 fw-lvdc-01 php-fpm[32701]: /rc.newwanip: rc.newwanip: Info: starting on lagg0.
Oct 25 11:43:34 fw-lvdc-01 php-fpm[32701]: /rc.newwanip: rc.newwanip: on (IP address: 10.50.1.1) (interface: MGMT[lan]) (real interface: lagg0).
Oct 25 11:43:34 fw-lvdc-01 check_reload_status: Reloading filter
</pre>
pfSense - Bug #7119: Changing LAGG attributes results in a panic/crash https://redmine.pfsense.org/issues/7119?journal_id=34732 2017-10-25T14:17:05Z Michael OBrien
<ul></ul><blockquote>
<p>Was this new ticket opened? When I change LAGG interface settings via the pfSense GUI or a command prompt, my pfSense 2.4.1 box (using igb drivers) cannot ping anything on the LAGG until I completely reboot it.</p>
</blockquote>
<p>I think it's this, testing nightly now: <a class="external" href="https://redmine.pfsense.org/issues/7928">https://redmine.pfsense.org/issues/7928</a></p>
pfSense - Bug #7119: Changing LAGG attributes results in a panic/crash https://redmine.pfsense.org/issues/7119?journal_id=34744 2017-10-25T17:42:38Z Steve Wheeler
<ul></ul><p>If it didn't actually panic it's probably that MAC address issue. That should be fixed in 2.4.2 snaps now. Please report if you're still able to trigger it there.</p>
pfSense - Bug #7119: Changing LAGG attributes results in a panic/crash https://redmine.pfsense.org/issues/7119?journal_id=34757 2017-10-26T16:43:32Z Michael OBrien
<ul></ul><p>Steve Wheeler wrote:</p>
<blockquote>
<p>If it didn't actually panic it's probably that MAC address issue. That should be fixed in 2.4.2 snaps now. Please report if you're still able to trigger it there.</p>
</blockquote>
<p>Nope, 2.4.2 snapshots fixed it right up. Thanks!</p>