Regression #11550

closed

Segmentation fault when loading ALTQ traffic shaping rules using FAIRQ

Added by Thorsten Zitterell 5 months ago. Updated 5 days ago.

Status: Resolved
Priority: Normal
Category: Traffic Shaper (ALTQ)
Target version:
Start date: 02/26/2021
Due date:
% Done: 0%
Estimated time:
Plus Target Version: 21.05
Release Notes: Default
Affected Version: 2.5.0
Affected Architecture: amd64

Description

I have upgraded from 2.4.5p1 to 21.02/21.02p1 on my SG-4860.

The following traffic shaper rule causes a segmentation fault:


[21.02-RELEASE][admin@firewall]/root: pfctl -vf /tmp/rules.debug
[...]
altq on igb0 fairq bandwidth 1Gb tbrsize 36000 queue { qLink qAck qOthersHigh qVoIP qOthersLow }
Segmentation fault (core dumped)

As a result other rules are not loaded and NAT does not work.


Files

shaper.xml (3.73 KB) Thorsten Zitterell, 02/26/2021 08:08 AM

Actions #1

Updated by Jim Pingle 5 months ago

  • Tracker changed from Bug to Regression
  • Project changed from pfSense Plus to pfSense
  • Subject changed from Segfault when Traffic Shaper is active to Segmentation fault when loading ALTQ traffic shaping rules
  • Category changed from Traffic Shaper (ALTQ) to Traffic Shaper (ALTQ)
  • Status changed from New to Feedback
  • Target version set to CE-Next
  • Affected Version set to 2.5.0

Unlikely that this is specific to Plus.

Can you attach the config.xml entries for the shaper? It would help to see the queue settings and so on to reproduce the issue locally.
Or at the very least, post the specific settings you put into the shaper wizard if that's what you used to create the queues.

Actions #2

Updated by Thorsten Zitterell 5 months ago

Jim Pingle wrote:

Can you attach the config.xml entries for the shaper? It would help to see the queue settings and so on to reproduce the issue locally.

The <shaper> section from config.xml is attached.

Actions #3

Updated by Jim Pingle 5 months ago

Not that it should cause a segfault, but why are you mixing FAIRQ, PRIQ, and HFSC?

Does the crash happen if all your interfaces are using the same scheduler?

Actions #4

Updated by Thorsten Zitterell 5 months ago

Jim Pingle wrote:

Not that it should cause a segfault, but why are you mixing FAIRQ, PRIQ, and HFSC?

I used PRIQ for outgoing WAN interfaces, and FAIRQ for LAN interfaces because I wanted balanced rates to the internal hosts. The interface with HFSC was not enabled.

Does the crash happen if all your interfaces are using the same scheduler?

The crash does not happen if I use PRIQ for all interfaces. So it seems to be related to FAIRQ.
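
To check which scheduler each interface ends up with in the generated ruleset, something like the following sketch could help. It assumes every altq line has the form `altq on <interface> <scheduler> ...`, as in the lines quoted in this issue. The function name `list_altq_schedulers` is hypothetical and not part of pfSense.

```shell
# Hypothetical helper: print "<interface> <scheduler>" for each altq line.
# Assumes the layout "altq on <interface> <scheduler> ..." seen in this issue.
list_altq_schedulers() {
    # $1 is the path to a pf ruleset, e.g. /tmp/rules.debug
    awk '$1 == "altq" && $2 == "on" { print $3, $4 }' "$1"
}
```

Running `list_altq_schedulers /tmp/rules.debug` would then show at a glance whether FAIRQ, PRIQ, or a mix of both is in use.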

Actions #5

Updated by Jim Pingle 5 months ago

Have you tried only using FAIRQ instead of only using PRIQ? It's not clear from the symptom behavior if the problem is from FAIRQ alone or from mixing the two schedulers.

Actions #6

Updated by Thorsten Zitterell 5 months ago

Jim Pingle wrote:

Have you tried only using FAIRQ instead of only using PRIQ? It's not clear from the symptom behavior if the problem is from FAIRQ alone or from mixing the two schedulers.

When I use FAIRQ for all the interfaces, the segfault comes with the first rule.

The last lines of the trace are:


23153 pfctl CALL mmap(0,0x3000,0x3<PROT_READ|PROT_WRITE>,0x1002<MAP_PRIVATE|MAP_ANON>,0xffffffff,0)
23153 pfctl RET mmap 34367680512/0x800793000
23153 pfctl CALL write(0x1,0x800738000,0x6d)
23153 pfctl GIO fd 1 wrote 109 bytes
"altq on pppoe0 fairq bandwidth 1Mb tbrsize 1492 queue { qACK qLink qVoIP qOthersHigh qOthersMid qOthersLow }
"
23153 pfctl RET write 109/0x6d
23153 pfctl PSIG SIGSEGV SIG_DFL code=SEGV_MAPERR
23153 pfctl NAMI "/root/pfctl.core"
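
The records above follow the layout that kdump prints for a ktrace file: PID, command name, record type, then details. As a rough sketch, the signal records can be pulled out of such output with awk. The helper name `kdump_signals` is hypothetical, and the field positions are assumed from the lines quoted above, so a command name containing spaces would break it. SEGV_MAPERR is the code reported when the faulting address is not mapped.

```shell
# Hypothetical helper: read kdump text on stdin and print, for each
# signal record (PSIG), the PID, command, signal name, and last field.
# Field positions are an assumption based on the trace in this issue.
kdump_signals() {
    awk '$3 == "PSIG" { print $1, $2, $4, $NF }'
}
```

For example, assuming the trace was written to ktrace.out (the default file name used by ktrace), `kdump -f ktrace.out | kdump_signals` would print one line per delivered signal.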

Actions #7

Updated by Jim Pingle 5 months ago

  • Subject changed from Segmentation fault when loading ALTQ traffic shaping rules to Segmentation fault when loading ALTQ traffic shaping rules using FAIRQ
  • Status changed from Feedback to New

OK, thanks for checking on that. I've updated the subject to reflect that it's specific to FAIRQ.

Actions #8

Updated by Jim Pingle 2 months ago

  • Plus Target Version set to 21.05

Would be nice to fix soon if we can, but not a blocker at the moment.

Actions #9

Updated by Kristof Provost 2 months ago

This is an upstream FreeBSD bug, and is reproducible with the following pf.conf on a recent FreeBSD/main:

altq on mvneta0 fairq bandwidth 1Gb tbrsize 36000 queue { qLink qAck qOthersHigh qVoIP qOthersLow }
queue qLink fairq(default)

Actions #10

Updated by Jim Pingle 2 months ago

  • Status changed from New to Feedback
  • Assignee set to Kristof Provost
  • Target version changed from CE-Next to 2.6.0

Kristof committed a potential fix for this; it needs testing. If it's still an issue, move the target ahead to 21.09.

Actions #11

Updated by Jim Pingle about 2 months ago

  • Target version changed from 2.6.0 to 2.5.2

Actions #12

Updated by Viktor Gurov about 2 months ago

  • Status changed from Feedback to Resolved

pfSense 2.5.1 test:

# pfctl -vf /tmp/rules.debug
...
set loginterface vtnet0
set skip on { pfsync0 }
altq on vtnet0 fairq bandwidth 10Mb tbrsize 36000 queue { q1 qq2 }
Segmentation fault (core dumped)

pfSense 2.5.2.b.20210601.0300 test:

# pfctl -vf /tmp/rules.debug
...
set loginterface vtnet0
set skip on { pfsync0 }
altq on vtnet0 fairq bandwidth 10Mb tbrsize 36000 queue { q1 qq2 }
queue q1 on vtnet0 bandwidth 5Mb fairq( default ) 
queue qq2 on vtnet0 bandwidth 3Mb priority 2
...
(no segfault)
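
A before/after check like the one above can also be scripted instead of reading the output by eye. The sketch below relies on the convention in common Bourne-style shells that a command killed by signal N yields exit status 128 + N, so SIGSEGV (signal 11) shows up as 139. The function name `died_by_segv` is hypothetical, and shells that encode signals differently would need another check.

```shell
# Hypothetical helper: run a command and succeed only if it was killed
# by SIGSEGV. Assumes the shell reports death by signal N as 128 + N
# (SIGSEGV is 11, hence 139).
died_by_segv() {
    status=0
    "$@" >/dev/null 2>&1 || status=$?
    [ "$status" -eq 139 ]
}
```

For example, `died_by_segv pfctl -vf /tmp/rules.debug && echo "pfctl crashed"` would flag the failure seen on 2.5.1. Note that `pfctl -f` also loads the ruleset when it does not crash.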

Actions #13

Updated by Roman Nik 10 days ago

This looks like a regression in the 2.5.2 release, because everything worked fine in the 2.5.2 beta.

Actions #14

Updated by Jim Pingle 5 days ago

Roman Nik wrote:

This looks like a regression in the 2.5.2 release, because everything worked fine in the 2.5.2 beta.

Are the symptoms identical?
