Bug #11192

open

Using Limiters causes out of order packets within one TCP or UDP flow

Added by Alexey Ab 10 months ago. Updated 10 months ago.

Status: Feedback
Priority: Normal
Assignee: -
Category: Traffic Shaper (Limiters)
Target version: -
Start date: 12/27/2020
Due date:
% Done: 0%
Estimated time:
Plus Target Version:
Release Notes:
Affected Version: 2.4.5-p1
Affected Architecture: amd64

Description

I am using the following limiters:

pipe 1 config bw 85Mb queue 2000 mask all droptail
sched 1 config pipe 1 type qfq
queue 1 config pipe 1 queue 2000 mask all codel target 20ms interval 200ms ecn

pipe 2 config bw 85Mb queue 2000 mask all droptail
sched 2 config pipe 2 type qfq
queue 2 config pipe 2 queue 2000 mask all codel target 20ms interval 200ms ecn

(To get the "mask all" option I patched shaper.inc, but the problem is also reproducible with the default shaper.inc.)

And using this rule:

match in on { ovpns8 } inet from 192.168.8.0/22 to 192.168.8.0/22 tracker 1608854657 dnqueue( 2,1) label "USER_RULE: Shape VPN Traffic"

As a result, I get a lot of out-of-order packets in TCP and UDP streams when this rule is applied, and no reordering when the firewall rule is turned off.

I have tried different types of queue management and schedulers; the result is always the same.

I have also tried limiting the network stack to one thread (net.isr.maxthreads=1), with no success.
Disabling net.inet.ip.dummynet.io_fast (by patching shaper.inc) makes no difference either.

Attaching a Wireshark screenshot showing a typical block of TCP out-of-order packets (for a 50-60 Mbit stream).
(There are hundreds of such blocks in the capture, but the complete Wireshark capture is 40 megabytes for 6 seconds, so I cannot attach it to the ticket.)

Also attaching the iperf3 output with OUT OF ORDER errors for a 30 Mbit UDP stream.
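For readers who want to reproduce the measurement from their own capture, out-of-order arrivals within one flow can be counted from the sequence numbers in arrival order. The following is a minimal sketch of that kind of check. It is not iperf3's code; the names `reorder_stats` and `count_reorder` are hypothetical, and sequence numbers are assumed to start at 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Result of scanning one flow's sequence numbers in arrival order. */
struct reorder_stats {
    size_t out_of_order; /* packets that arrived after a higher-numbered one */
    size_t gaps;         /* sequence numbers skipped and not (yet) seen */
};

/*
 * Hypothetical helper, not iperf3 code: walk the sequence numbers in the
 * order they were received. A packet whose number is at or below the
 * highest number seen so far is counted as out of order; a jump past
 * highest + 1 opens a gap that a late packet can later fill.
 */
static struct reorder_stats
count_reorder(const uint64_t *seq, size_t n)
{
    struct reorder_stats st = {0, 0};
    uint64_t highest = 0; /* assumption: sequence numbers start at 1 */

    for (size_t i = 0; i < n; i++) {
        if (seq[i] > highest) {
            st.gaps += (size_t)(seq[i] - highest - 1);
            highest = seq[i];
        } else {
            st.out_of_order++;
            if (st.gaps > 0)
                st.gaps--; /* the late packet fills an earlier gap */
        }
    }
    return st;
}
```

For the arrival order 1, 2, 4, 3, 5 this reports one out-of-order packet and no remaining gap, which separates reordering from real loss.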

If I raise the pipe bandwidth to 185Mb, the probability of out-of-order packets becomes higher for the same traffic.


Files

Wireshark-out-of-order.PNG (187 KB) Alexey Ab, 12/27/2020 07:19 PM
PFsense-Reorder-Iperf3.txt (46.6 KB) Alexey Ab, 12/27/2020 07:20 PM
Actions #1

Updated by Alexey Ab 10 months ago

Forgot to mention: I am using VMware Workstation 15.5 and a 2-core pfSense VM with em adapters.

Actions #2

Updated by Jim Pingle 10 months ago

  • Status changed from New to Feedback

Have you only tested this on pfSense 2.4.5?

Can you try again on a 2.5.0 development snapshot?

Actions #3

Updated by Alexey Ab 10 months ago

Update:

I've tested different pipe bandwidths with the same 50 Mbit traffic:

85 Mbit pipe - less reorder
185 Mbit pipe - more reorder
600 Mbit pipe - less reorder
1000 Mbit pipe - no reorder

When the 85 Mbit pipe is saturated with traffic, there is no reordering.

No, I have not tested it on 2.5 so far.

Can you try to reproduce this problem and fix it in the release version?

Actions #4

Updated by Alexey Ab 10 months ago

I have tested 2.4.2, 2.4.5-p1, and 2.5; all versions have this problem.

Setting kern.hz=1000 instead of 100 does not fix it either.

Packet reordering makes pfSense shaping unusable because it degrades TCP performance from 85 Mbit to 20-30 Mbit and produces other errors.

If a TCP session reaches full speed, everything works well. But if the speed drops for some reason, TCP cannot recover it because of the out-of-order packets.
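A likely mechanism for the slowdown is standard TCP behavior: a receiver answers every out-of-order segment with a duplicate ACK, and a sender that sees three duplicate ACKs performs a fast retransmit and reduces its congestion window (RFC 5681). The toy model below counts the longest run of duplicate ACKs for a given arrival order. It is only a sketch: `max_dup_acks` and `MAX_SEGS` are hypothetical names, and features such as SACK or delayed ACKs are ignored.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SEGS 64 /* arbitrary bound for this sketch */

/*
 * Toy model: segments are numbered 0..n-1 and arrive in the order given.
 * The receiver sends one cumulative ACK per arrival, naming the next
 * segment it still needs. Returns the longest run of duplicate ACKs,
 * or -1 if the input does not fit the model.
 */
static int
max_dup_acks(const int *arrival, size_t n)
{
    bool got[MAX_SEGS] = {false};
    size_t next = 0; /* next in-order segment the receiver expects */
    int dup = 0, max_dup = 0;

    if (n > MAX_SEGS)
        return -1;
    for (size_t i = 0; i < n; i++) {
        if (arrival[i] < 0 || (size_t)arrival[i] >= n)
            return -1;
        got[arrival[i]] = true;
        size_t prev = next;
        while (next < n && got[next])
            next++;
        if (next == prev) { /* ACK number unchanged: a duplicate */
            if (++dup > max_dup)
                max_dup = dup;
        } else {
            dup = 0;
        }
    }
    return max_dup;
}
```

With the arrival order 0, 2, 3, 4, 1, 5 the function returns 3: segment 1 is late by only three packets, yet that is already enough to trigger a spurious fast retransmit even though nothing was lost.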

Actions #5

Updated by Alexey Ab 10 months ago

Adding a 10 ms delay to the pipe seems to fix the reordering.

Trying to set both kern.hz=1000 and delay=1 ms as a workaround led to crashes under load.

Actions #6

Updated by Thomas Pilgaard 10 months ago

Observed the same on 2.4.5-p1, with out-of-order packets during iperf testing using fq-codel with limiters set to 930 Mb/s. Tested it on a Supermicro X10SDV-4C-TLN2F without any packages installed, and I wasn't seeing any particularly high load on it either.

Actions #7

Updated by Alexey Ab 10 months ago

Since net.inet.ip.dummynet.io_fast splits the packet path depending on whether the pipe is saturated or unsaturated, this setting is likely responsible for the packet reordering. (Traffic is very bursty for TCP without pacing and for the iperf3 UDP test, so the pipe saturates and desaturates several times per second, and it seems we get reordering on every transition.)

But setting net.inet.ip.dummynet.io_fast=0 has no effect; net.inet.ip.dummynet.io_pkt_fast keeps increasing. The explanation is very simple:
the io_fast check is commented out in the dummynet source code:

if (/*dn_cfg.io_fast &&*/ m == *m0 && (dir & PROTO_LAYER2) == 0 ) {

https://github.com/luigirizzo/dummynet/blob/master/sys/netinet/ipfw/ip_dn_io.c
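To make the effect of the commented-out check concrete, here is a small standalone sketch that compares the condition as quoted with the same condition when the io_fast test is restored. It only illustrates why the sysctl has no effect and is not a proposed patch. The function names are hypothetical, `head` stands in for `*m0`, and the value of `PROTO_LAYER2` is assumed for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define PROTO_LAYER2 0x4 /* assumed flag value, for illustration only */

/* Condition as quoted above: the io_fast sysctl is never consulted. */
static bool
fast_path_as_quoted(int io_fast, const void *m, const void *head, int dir)
{
    (void)io_fast; /* ignored, because the check is commented out */
    return m == head && (dir & PROTO_LAYER2) == 0;
}

/* Same condition with the io_fast check restored. */
static bool
fast_path_with_check(int io_fast, const void *m, const void *head, int dir)
{
    return io_fast && m == head && (dir & PROTO_LAYER2) == 0;
}
```

With io_fast set to 0 and a packet that the scheduler hands straight back, `fast_path_as_quoted` still returns true while `fast_path_with_check` returns false, which matches the observation that io_pkt_fast keeps increasing.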

Actions #8

Updated by Alexey Ab 10 months ago

Actions #9

Updated by Alexey Ab 10 months ago

I have tried to disable the whole if (/*dn_cfg.io_fast &&*/ ...) block by patching /boot/kernel/dummynet.ko.

Traffic then goes only to net.inet.ip.dummynet.io_pkt; net.inet.ip.dummynet.io_pkt_fast always stays at zero.

But the whole pfSense system hangs after several seconds.

There is a comment before this if block:
"XXX Don't call dummynet_send() if scheduler return the packet just enqueued. This avoid a lock order reversal."

This seems to be the cause of the hang, and it is unclear how to turn off io_fast correctly.

As I mentioned in a previous post, adding a 10 ms delay to the pipe seems to solve the problem. This can be explained as follows: if a delay is set, it
effectively disables io_fast, because dummynet redirects all packets to the delay queue and does not perform fast I/O, so no reordering occurs.

Now I need advice.

Actions #10

Updated by Alexey Ab 10 months ago

I've successfully used kern.hz=1000 and limiter delay=1ms as a workaround for this problem.
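One possible reason the 1 ms delay needs the higher tick rate: if dummynet converts the configured delay from milliseconds to scheduler ticks with integer division, a 1 ms delay at hz=100 rounds down to zero ticks and would not engage the delay queue at all. The sketch below shows that arithmetic. The formula `delay_ms * hz / 1000` is an assumption about the implementation, not something confirmed in this ticket, and `delay_ms_to_ticks` is a hypothetical name.

```c
/*
 * Assumed conversion from a delay in milliseconds to scheduler ticks,
 * using integer division (the fractional part is dropped).
 */
static int
delay_ms_to_ticks(int delay_ms, int hz)
{
    return (delay_ms * hz) / 1000;
}
```

Under this assumption, 10 ms at hz=100 and 1 ms at hz=1000 both give one tick, while 1 ms at hz=100 gives zero ticks.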

I've also posted a message to the FreeBSD forum: https://forums.freebsd.org/threads/possible-race-condition-bug-in-dummynet-out-of-order-packets.78312/

For now I leave this problem to the pfSense team, the FreeBSD community, or anyone who wants to help create a proper fix. As far as I can see, dummynet has not been maintained by its authors for a long time.
