Bug #11192

open

Using Limiters causes out of order packets within one TCP or UDP flow

Added by Alexey Ab over 3 years ago. Updated about 14 hours ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Traffic Shaper (Limiters)
Target version:
-
Start date:
12/27/2020
Due date:
% Done:

0%

Estimated time:
Plus Target Version:
Release Notes:
Affected Version:
2.4.5-p1
Affected Architecture:
amd64

Description

I am using the following limiters:

pipe 1 config bw 85Mb queue 2000 mask all droptail
sched 1 config pipe 1 type qfq
queue 1 config pipe 1 queue 2000 mask all codel target 20ms interval 200ms ecn

pipe 2 config bw 85Mb queue 2000 mask all droptail
sched 2 config pipe 2 type qfq
queue 2 config pipe 2 queue 2000 mask all codel target 20ms interval 200ms ecn

(To get the "mask all" option I patched shaper.inc, but the problem is also reproducible with the default shaper.inc.)

And using this rule:

match in on { ovpns8 } inet from 192.168.8.0/22 to 192.168.8.0/22 tracker 1608854657 dnqueue( 2,1) label "USER_RULE: Shape VPN Traffic"

As a result, I get a lot of out-of-order packets in TCP / UDP streams when this rule is applied, and no reordering when the firewall rule is turned off.

I have tried different types of queue management and schedulers; the result is always the same.

I have also tried limiting the network stack to one thread (net.isr.maxthreads=1), with no success.
Disabling net.inet.ip.dummynet.io_fast (by patching shaper.inc) has no effect either.

Attaching a Wireshark screenshot showing a typical block of TCP out-of-order packets (for a 50-60 Mbit stream).
(There are hundreds of such blocks in the whole capture, but the complete Wireshark capture is 40 megabytes for 6 seconds, so I cannot attach it to the ticket.)

Also attaching output of iperf3 with OUT OF ORDER errors for UDP 30 Mbit stream.
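
For anyone who wants to quantify reordering from their own capture, here is a minimal sketch of the kind of check a receiver can make: a packet counts as out of order if its sequence number is lower than one already seen. The function name and the sample sequence are hypothetical and are not taken from the attached files.

```python
def count_out_of_order(seqs):
    """Count packets whose sequence number is lower than the highest one already seen."""
    highest = None
    late = 0
    for seq in seqs:
        if highest is not None and seq < highest:
            late += 1          # arrived after a packet that was sent later
        else:
            highest = seq      # in order (or a duplicate of the newest packet)
    return late


# Hypothetical arrival order at the receiver: packets 3 and 7 are late.
arrivals = [1, 2, 4, 5, 3, 6, 8, 7, 9]
print(count_out_of_order(arrivals))  # 2
```

The same function can be fed sequence numbers exported from a capture, one flow at a time.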

If I raise the pipe bandwidth to 185Mb, out-of-order packets become more likely for the same traffic.


Files

Wireshark-out-of-order.PNG (187 KB) Wireshark-out-of-order.PNG Alexey Ab, 12/27/2020 07:19 PM
PFsense-Reorder-Iperf3.txt (46.6 KB) PFsense-Reorder-Iperf3.txt Alexey Ab, 12/27/2020 07:20 PM
Actions #1

Updated by Alexey Ab over 3 years ago

Forgot to mention: I am using VMware Workstation 15.5 and a 2-core pfSense VM with em adapters.

Actions #2

Updated by Jim Pingle over 3 years ago

  • Status changed from New to Feedback

Have you only tested this on pfSense 2.4.5?

Can you try again on a 2.5.0 development snapshot?

Actions #3

Updated by Alexey Ab over 3 years ago

Update:

I've tested different pipe bandwidths with the same 50 Mbit traffic:

85 Mbit pipe - less reorder
185 Mbit pipe - more reorder
600 Mbit pipe - less reorder
1000 Mbit pipe - no reorder

When 85 Mbit pipe is saturated with traffic there is no reorder.

No, I did not test it on 2.5 so far.

Can you try to reproduce this problem and fix it in the release version?

Actions #4

Updated by Alexey Ab over 3 years ago

I have tested 2.4.2, 2.4.5p1, 2.5 - all versions have this problem.

Setting kern.hz=1000 instead of 100 does not fix it either.

Packet reordering makes pfSense shaping unusable because it degrades TCP performance from 85 Mbit to 20-30 Mbit and produces other errors.

If a TCP session reaches full speed, everything works well. But if the speed drops for some reason, TCP cannot recover it because of the out-of-order packets.

Actions #5

Updated by Alexey Ab over 3 years ago

Adding a 10 ms delay to the pipe seems to fix the reordering.

Trying to set both kern.hz=1000 and delay=1 ms as a workaround led to crashes under load.

Actions #6

Updated by Thomas Pilgaard over 3 years ago

Observed the same on 2.4.5 p1 with out of order packets during iperf testing using fq-codel with limiters set to 930 Mb/s. Tested it on a Supermicro X10SDV-4C-TLN2F without any packages installed and wasn't seeing any particular high load on it either.

Actions #7

Updated by Alexey Ab over 3 years ago

Since net.inet.ip.dummynet.io_fast splits the packet path depending on whether the pipe is saturated, this setting is likely responsible for the packet reordering. (Traffic is very bursty for TCP without pacing and for the iperf3 UDP test, so the pipe saturates and desaturates several times per second, and it seems we get reordering on every transition.)

But setting net.inet.ip.dummynet.io_fast=0 has no effect; net.inet.ip.dummynet.io_pkt_fast still increases. The explanation is very simple:
the io_fast check is commented out in the dummynet source code:

if (/*dn_cfg.io_fast &&*/ m == *m0 && (dir & PROTO_LAYER2) == 0 ) {

https://github.com/luigirizzo/dummynet/blob/master/sys/netinet/ipfw/ip_dn_io.c

Actions #8

Updated by Alexey Ab over 3 years ago

Actions #9

Updated by Alexey Ab over 3 years ago

I have tried to disable the whole if (/*dn_cfg.io_fast &&*/ ...) block by patching /boot/kernel/dummynet.ko.

Traffic then goes only to net.inet.ip.dummynet.io_pkt; net.inet.ip.dummynet.io_pkt_fast always stays at zero.

But the whole pfSense system hangs after several seconds.

There is a comment before this if block:
"XXX Don't call dummynet_send() if scheduler return the packet just enqueued. This avoid a lock order reversal."

This seems to be the cause of the hang, and it is unclear how to turn off io_fast correctly.

As I mentioned in a previous post, adding a 10 ms delay to the pipe seems to solve the problem. This can be explained as follows: if a delay is set, it effectively disables io_fast, because dummynet redirects all packets to the delay queue and does not perform fast io, so no reordering occurs.

Now I need advice.

Actions #10

Updated by Alexey Ab over 3 years ago

I've successfully used kern.hz=1000 and limiter delay=1ms as a workaround for this problem.

I've also posted a message to the FreeBSD forum: https://forums.freebsd.org/threads/possible-race-condition-bug-in-dummynet-out-of-order-packets.78312/

For now I leave this problem to the pfSense team, the FreeBSD community, or anyone who wants to help create a proper fix. As far as I can see, dummynet has not been maintained by its authors for a long time.

Actions #11

Updated by Azamat Khakimyanov 7 months ago

  • Status changed from Feedback to Rejected

Tested on 2.5 CE but I wasn't able to reproduce this issue.

I used KVM with em NICs and created an RA OpenVPN server, then connected two Ubuntu VM hosts to it. One host generated constant 50 Mbps traffic with iperf3 (-u -b 50M). The second host generated large (2000-byte) TCP packets with the hping3 utility (hping3 <OpenVPN Server> -d 2000) to check for issues with fragmented packets. I saw no issue.

So with different Limiters applied on OpenVPN (85 Mbps, then 185 Mbps) I didn't see any issue with fragmented packets.

This should be tested on the latest 2.7 CE and 23.05_1. If you still see the issue, please describe how to reproduce it.

Actions #12

Updated by Alexey Ab 7 months ago

There was nothing regarding fragmented packets in my bug report.

Actions #13

Updated by Marcos M 7 months ago

  • Status changed from Rejected to Feedback

It would be useful to know if this is reproducible on CE 2.7 (or preferably 23.09 dev) given the major OS version bump since 2.6.

Actions #14

Updated by Alexey Ab 7 months ago

I spent two weeks of my working time debugging this problem, finding the root cause, finding a workaround, and writing a complete report for you. I've solved my problem and am not able to spend more time on testing, but I think nothing has changed, since dummynet is not maintained.

Coincidentally, there was also a problem with fragmented packets when using floating rules and dummynet to shape traffic. Large packets are not reassembled and are lost (on pfSense 2.4.5). If I turn off the floating rules, everything works fine. It was not reported.

Actions #15

Updated by Marcos M 7 months ago

  • Status changed from Feedback to New

Thank you - it's a good analysis! Since this is more of a FreeBSD issue than a pfSense one, reporting this upstream would be best. If you do, please reference the link here.

To summarize the workaround on pfSense:
  • If on a VM, set kern.hz="1000" in /boot/loader.conf.local, otherwise the delay will round up to 10ms (the default on VMs is 100).
  • Set 1 for the limiter pipe Delay option in the GUI.
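
A minimal sketch of the arithmetic behind the kern.hz recommendation, assuming (as the note above states) that the configured delay is rounded up to a whole number of kernel timer ticks. The function name is hypothetical and this is not dummynet's actual conversion code.

```python
import math


def effective_delay_ms(requested_ms, hz):
    """Delay after rounding up to whole ticks of a timer running at `hz` ticks per second.

    Assumption: rounding is upward, as described in the note above.
    """
    if requested_ms <= 0:
        return 0.0
    tick_ms = 1000.0 / hz                      # length of one timer tick
    ticks = math.ceil(requested_ms / tick_ms)  # round up to whole ticks
    return ticks * tick_ms


print(effective_delay_ms(1, 100))   # 10.0 (kern.hz=100: a 1 ms request becomes 10 ms)
print(effective_delay_ms(1, 1000))  # 1.0  (kern.hz=1000: a 1 ms request stays 1 ms)
```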
Actions #16

Updated by P L 7 months ago

Marcos M wrote in #note-15:

Thank you - it's a good analysis! Since this is more of a FreeBSD issue than a pfSense one, reporting this upstream would be best. If you do, please reference the link here.

To summarize the workaround on pfSense:
  • If on a VM, set kern.hz="1000" in /boot/loader.conf.local; the default on VMs is 100.
  • Set 1 for the limiter pipe Delay option in the GUI.

Setting the delay on the limiter to 1ms and net.inet.ip.dummynet.io_fast="0" forced all traffic into io_pkt instead of io_pkt_fast for me in pfSense.
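
To check which path traffic takes, two snapshots of the counters can be compared. The sketch below parses `sysctl`-style "name: value" text and reports the share of packets that took the fast path between the snapshots. It assumes io_pkt counts every packet handed to dummynet and io_pkt_fast the subset that went through the fast path; the helper names and the sample numbers are made up for illustration.

```python
IO_PKT = "net.inet.ip.dummynet.io_pkt"
IO_PKT_FAST = "net.inet.ip.dummynet.io_pkt_fast"


def parse_sysctl(text):
    """Parse lines of the form 'name: value' into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        try:
            counters[name.strip()] = int(value.strip())
        except ValueError:
            continue  # skip non-numeric values
    return counters


def fast_path_share(before, after):
    """Fraction of packets between two snapshots that took the fast path."""
    total = after[IO_PKT] - before[IO_PKT]
    fast = after[IO_PKT_FAST] - before[IO_PKT_FAST]
    return fast / total if total else 0.0


# Made-up snapshots, e.g. the output of `sysctl net.inet.ip.dummynet` taken twice.
before = parse_sysctl(
    "net.inet.ip.dummynet.io_pkt: 1000\nnet.inet.ip.dummynet.io_pkt_fast: 400\n"
)
after = parse_sysctl(
    "net.inet.ip.dummynet.io_pkt: 3000\nnet.inet.ip.dummynet.io_pkt_fast: 1900\n"
)
print(fast_path_share(before, after))  # 0.75
```

A share of 0.0 under load would match the observation above that all traffic goes into io_pkt.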

I have been troubleshooting fq_codel since early 2.6 pfSense. I'm now using the official layer 2 ethernet firewall rule setup for the AT&T bypass in 23.05.1. I play a lot of Nintendo, and the codel pipe drops or collapses randomly. I've tried the official "one-way" rule here, https://docs.netgate.com/pfsense/en/latest/recipes/codel-limiters.html

I first wanted to try disabling the fast IO because of what I read here.

https://forum.dd-wrt.com/phpBB2/viewtopic.php?t=324422

Also, over time, net.inet.ip.dummynet.tick_delta_sum accumulates to below -900. I find that once it reaches -300, games are more likely to perform worse even with an A+ bufferbloat score. TCP duplicates and out-of-order packets show up in pcaps. Turning dummynet expiration off and on resets this counter. And a delay of 1ms seems to fix the problem.

Actions #17

Updated by P L 3 months ago

Recently I switched to the wpa_supplicant bypass method in pfSense and was still getting out of order packet issues unless I applied delay to fq_codel, which TCP and some videogames don't particularly appreciate. Interestingly, I also saw stability if I used FiFo on a LAN uplink, fq_codel on a LAN downlink, fq_codel on a WAN uplink and FiFo on a WAN Downlink (sometimes I have to pray the traffic doesn't use the fifo with that setup). fq_codel offers marvelous performance in videogames, but I constantly find myself having to set up in-line suricata to block bad TCP window updates, bad UDP checksums and application layer packets going in the wrong direction, and possibly sometimes HTTP requests getting sent to internal devices or right back to the device that sent it, resulting in suricata "HTTP response doesn't match request" errors.

I think all of these issues arise because fq_codel, qdisc and dummynet are designed to be used on a LAN's downlink and a WAN's uplink, similar to ALTQ. If both fq_codel AQMs are placed on the WAN, the traffic still gets jumbled because the LAN interface has no queuing discipline, and vice versa with both on LAN. You cannot just choose a down/inbound limiter in the pfSense GUI; it forces you to choose an uplink on each interface for statefulness reasons (traffic creates states outbound on the LAN and then again outbound on WAN).

Other routing software seems to allow you to place fq_codel down on LAN and up on WAN. Here is a case example I can pull up right now, and there are a few others online: https://www.b1c1l1.com/blog/2020/03/26/linux-home-router-traffic-shaping-with-fq_codel/

"The simplest implementation is to create a single HTB bucket with FQ_CoDel on the "WAN" interface . . . If download shaping is required, you can use a similar configuration on the "LAN" interface connected to your home devices."

To be honest, I never have issues with my upload on fiber, just my download, but cannot choose to only shape my downloads with fq codel in pfSense. Thanks for your time.

Actions #18

Updated by P L 3 months ago

I have also tried FIFO + taildrop on the LAN up+down and fq_codel + taildrop on the WAN up+down, and it seemed to stabilize UDP but not TCP, specifically some application protocols built on UDP.

Actions #19

Updated by Marcos M about 14 hours ago

It may be that due to the way dummynet works, packets will inevitably arrive out of order. Dummynet will let packets through directly until the limit is hit. Limited packets are then handled later, and that may be enough to bring things back under the limit, which lets the packet through directly, so you end up out-of-order.
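
The behaviour described above can be illustrated with a toy model. This is a deliberately simplified sketch, not dummynet's code: credit accrues one unit per time unit, a packet costs `cost` units, packets that find enough credit leave immediately, and everything else waits for a timer that only runs every `tick` time units. All names and numbers are assumptions chosen for illustration.

```python
from collections import deque


def simulate(arrivals, cost=2, tick=10, direct_when_backlogged=True):
    """Return sequence numbers in departure order (toy model, requires tick >= cost).

    arrivals: (time, seq) pairs sorted by time, in integer time units.
    direct_when_backlogged: if True, a packet with enough credit is sent
    directly even while earlier packets are still waiting in the backlog.
    """
    credit, now, next_tick = cost, 0, tick  # start with room for one packet
    backlog, departed = deque(), []

    def drain_until(limit):
        # Run every timer tick up to and including `limit`.
        nonlocal credit, now, next_tick
        while next_tick <= limit:
            credit = min(tick, credit + next_tick - now)
            now = next_tick
            while backlog and credit >= cost:
                departed.append(backlog.popleft())
                credit -= cost
            next_tick += tick

    for time, seq in arrivals:
        drain_until(time)
        credit = min(tick, credit + time - now)
        now = time
        may_go_direct = direct_when_backlogged or not backlog
        if credit >= cost and may_go_direct:
            departed.append(seq)      # direct path: leaves right away
            credit -= cost
        else:
            backlog.append(seq)       # limited: waits for the next timer tick

    while backlog:                    # flush what is left on later ticks
        drain_until(next_tick)
    return departed


arrivals = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(simulate(arrivals))                                # [1, 3, 2, 4]
print(simulate(arrivals, direct_when_backlogged=False))  # [1, 2, 3, 4]
```

In the first run packet 3 arrives just as enough credit has accrued and overtakes packet 2, which is still waiting for the timer. Forcing every packet through the backlog while it is non-empty keeps the order, which is roughly what adding a pipe delay appears to achieve in the workaround.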
