Bug #3996

closed

Solarflare NIC panic with LACP

Added by Chris Buechler over 9 years ago. Updated about 7 years ago.

Status: Needs Patch
Priority: Very Low
Assignee: -
Category: Operating System
Target version: -
Start date: 11/07/2014
Due date:
% Done: 0%
Estimated time:
Plus Target Version:
Release Notes:
Affected Version: All
Affected Architecture:

Description

Versions up to and including 2.2 are affected by the bug described here:
https://bugs.freenas.org/issues/4803

There is a trivial patch attached to that ticket which reportedly fixes this problem. We should also find out why this hasn't made it into FreeBSD 10.1, given it's been in FreeNAS for 7 months.

Actions #1

Updated by Ermal Luçi over 9 years ago

  • Status changed from New to Feedback

The patch mentioned here is already part of the sfxge driver shipped with pfSense.

Actions #2

Updated by Chris Buechler over 9 years ago

It wasn't as of 2 weeks ago, and I don't see any relevant changes since then.

Actions #3

Updated by Chris Buechler over 9 years ago

  • Status changed from Feedback to Confirmed

Confirmed the described scenario is an issue, and I can't find that patch's contents anywhere.

Actions #4

Updated by Jim Thompson over 9 years ago

  • Priority changed from Normal to Low

If that "Solarflare patch" is the binary blob driver for sfxge, then we should yank it back out by the roots.

Actions #5

Updated by Jim Thompson over 9 years ago

Ermal is correct.

Check the contents of the patch against https://svnweb.freebsd.org/base/releng/10.1/sys/dev/sfxge/sfxge_port.c?revision=272461&view=markup#l318 (which is what is in FreeBSD 10.1-RELEASE)

diff -r 8dc01b10eb64 sys/dev/sfxge/sfxge_port.c
--- a/sys/dev/sfxge/sfxge_port.c Tue Apr 15 10:32:43 2014 +0100
+++ b/sys/dev/sfxge/sfxge_port.c Sat Apr 19 14:49:46 2014 +0400
@@ -357,10 +357,21 @@
 	struct sfxge_port *port = &sc->port;
 	int rc;

-	KASSERT);

 	mtx_lock(&port->lock);
-	rc = sfxge_mac_filter_set_locked(sc);
+	/*
+	 * The function may be called without softc_lock held in the
+	 * case of SIOCADDMULTI and SIOCDELMULTI ioctls. ioctl handler
+	 * checks IFF_DRV_RUNNING flag which implies port started, but
+	 * it is not guaranteed to remain. softc_lock shared lock can't
+	 * be held in the case of these ioctls processing, since it
+	 * results in failure where kernel complains that non-sleepable
+	 * lock is held in sleeping thread. Both problems are repeatable
+	 * on LAG with LACP proto bring up.
+	 */
+	if (port->init_state == SFXGE_PORT_STARTED)
+		rc = sfxge_mac_filter_set_locked(sc);
+	else
+		rc = 0;
 	mtx_unlock(&port->lock);
 	return rc;
 }

317 int
318 sfxge_mac_filter_set(struct sfxge_softc *sc)
319 {
320 	struct sfxge_port *port = &sc->port;
321 	int rc;
322
323 	mtx_lock(&port->lock);
324 	/*
325 	 * The function may be called without softc_lock held in the
326 	 * case of SIOCADDMULTI and SIOCDELMULTI ioctls. ioctl handler
327 	 * checks IFF_DRV_RUNNING flag which implies port started, but
328 	 * it is not guaranteed to remain. softc_lock shared lock can't
329 	 * be held in the case of these ioctls processing, since it
330 	 * results in failure where kernel complains that non-sleepable
331 	 * lock is held in sleeping thread. Both problems are repeatable
332 	 * on LAG with LACP proto bring up.
333 	 */
334 	if (port->init_state == SFXGE_PORT_STARTED)
335 		rc = sfxge_mac_filter_set_locked(sc);
336 	else
337 		rc = 0;
338 	mtx_unlock(&port->lock);
339 	return rc;
340 }

See also: https://svnweb.freebsd.org/base?view=revision&revision=265884 which reads, in part:

264772:
Check that port is started when MAC filter is set
The MAC filter set may be called without softc_lock held in the case of
SIOCADDMULTI and SIOCDELMULTI ioctls. The ioctl handler checks IFF_DRV_RUNNING
flag which implies port started, but it is not guaranteed to remain.
softc_lock shared lock can't be held in the case of these ioctls processing,
since it results in failure where kernel complains that non-sleepable
lock is held in sleeping thread.
Both problems are repeatable on LAG with LACP proto bring up.

Checked in by gnn on my birthday. Sun May 11 17:18:09 2014 UTC
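
The guard described in that commit message can be illustrated in user space. What follows is a minimal sketch under stated assumptions, not the driver code: `struct port`, `enum port_state`, `filter_updates`, and both functions are hypothetical stand-ins, and a pthread mutex takes the place of the kernel's `mtx_lock`/`mtx_unlock`.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the driver's types; not the real sfxge definitions. */
enum port_state { PORT_INITIALIZED, PORT_STARTED };

struct port {
	pthread_mutex_t lock;
	enum port_state init_state;
	int filter_updates;	/* counts how often the filter was "programmed" */
};

/*
 * Caller must hold port->lock. The state can still be asserted here,
 * because the wrapper below checks it under the same lock.
 */
static int
mac_filter_set_locked(struct port *port)
{
	assert(port->init_state == PORT_STARTED);
	port->filter_updates++;
	return 0;
}

/*
 * Check the port state under the lock instead of asserting it before
 * taking the lock. If the port is not started (for example, it was
 * stopped between the caller's own check and this call), do nothing
 * and report success.
 */
static int
mac_filter_set(struct port *port)
{
	int rc;

	pthread_mutex_lock(&port->lock);
	if (port->init_state == PORT_STARTED)
		rc = mac_filter_set_locked(port);
	else
		rc = 0;
	pthread_mutex_unlock(&port->lock);
	return rc;
}
```

Calling `mac_filter_set()` on a port that is not started returns 0 and leaves `filter_updates` unchanged; once `init_state` is `PORT_STARTED`, the same call increments the counter. In this sketch the state check and the update happen under one lock, so the port cannot change state between them.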

See also: https://svnweb.freebsd.org/base/stable/10/sys/dev/sfxge/sfxge_port.c?r1=265884&r2=265883&pathrev=265884

I'm closing this as resolved.

Actions #6

Updated by Jim Thompson over 9 years ago

  • Status changed from Confirmed to Rejected
  • Priority changed from Low to Very Low

Actions #7

Updated by Chris Buechler over 9 years ago

  • Status changed from Rejected to Feedback
  • Assignee changed from Ermal Luçi to Chris Buechler

Back to me for testing after discussion with Jim. I now have a Solarflare card to test with.

Actions #8

Updated by Chris Buechler over 9 years ago

  • Target version changed from 2.2 to 2.2.1

Not something we'll be able to get fixed in 2.2. It needs testing and reporting upstream, which I can't make a priority pre-2.2.

Actions #9

Updated by Chris Buechler about 9 years ago

  • Target version changed from 2.2.1 to 2.2.2

Actions #10

Updated by Chris Buechler about 9 years ago

  • Target version changed from 2.2.2 to 2.2.3

Actions #11

Updated by Chris Buechler almost 9 years ago

  • Status changed from Feedback to Needs Patch
  • Target version deleted (2.2.3)

Not hardware we sell, so not something we'll deal with. If someone wants to pursue this, report it and get it fixed in FreeBSD, and we can add a patch to our builds.

Actions #12

Updated by Renato Botelho about 7 years ago

  • Assignee deleted (Chris Buechler)