Bug #9024

NAT + a limiter + fq_codel dropping nearly all ping traffic under load

Added by Dave taht 9 months ago. Updated 5 months ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date: 10/07/2018
Due date:
% Done: 0%
Estimated time:
Affected Version: 2.4
Affected Architecture:
Description

I think we have confirmed in https://forum.netgate.com/topic/112527/playing-with-fq_codel-in-2-4/595 that an issue still exists with this.

It's a very long thread.

This bug looks similar but not identical to https://redmine.pfsense.org/issues/4326

History

#1 Updated by Tim Harman 9 months ago

I saw this when only TCP/UDP was being put into the limiter. As soon as I changed it to "all traffic" the loss went away.

#2 Updated by Dave taht 9 months ago

OK, so we just have a configuration guideline then: "Always put all traffic through the limiter". Do you have a conf that works for https://forum.netgate.com/topic/112527/playing-with-fq_codel-in-2-4/570?

#3 Updated by Josh Chilcott 9 months ago

The conf attached to the example at https://forum.netgate.com/topic/112527/playing-with-fq_codel-in-2-4/570 shows that the match rules include all protocols for IPv4. The issue presents itself when outbound limiter match rules are used on interfaces that create NAT states (e.g. WAN). Loading the out limiter to capacity with a match rule on WAN and testing for roughly 60 seconds, including ramp-up and ramp-down, showed an 82% loss of pings to hosts on the WAN side. During heavy saturation of the limiter, almost all pings are lost. Disabling outgoing NAT remedies the situation. Creating in/out limiters on just the LAN side also remedies it; this appears to be the most performant workaround for a single-WAN, single-LAN setup where traffic originates on both the WAN and LAN sides.

#4 Updated by Steven Brown 9 months ago

I can confirm this bug. My testing seemed to show that the behaviour was the same no matter which scheduler I assigned to the limiter when the limiter was applied using floating rules. When I applied the limiter with a LAN interface firewall rule instead, pings were no longer dropped with fq_codel assigned.

I had the rules assigned for "all traffic" so this did not fix the issue for me.

#5 Updated by Josh Chilcott 7 months ago

Using limiters on an interface with outgoing NAT enabled causes all ICMP echo reply packets coming back into WAN to be dropped when the limiter is loaded with flows. I can reproduce this issue with the following configuration:

  • limiters created (any scheduler). One limiter for out and one limiter for in.
  • create a single child queue for the out limiter and one for the in limiter.
  • floating match IPv4 any rule on WAN Out, using the out limiter child queue for in and in limiter child queue for out.
  • floating match IPv4 any rule on WAN In, using the in limiter child queue for in and out limiter child queue for out.
  • load the limiter with traffic. (Most recently I've been using a netperf netserver v2.6.0 on the WAN side and a Flent client on the LAN side running RRUL test)
  • start a constant ping from the client to the server during the RRUL test.

Both the flent.gz output and the constant ping will show a high rate of ICMP packets getting dropped. If a separate floating match rule is created for ICMP, then packets will not be dropped. Pushing less pps through pfSense seems to net fewer dropped echo replies.
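For anyone comparing configurations, the measurement in the steps above can be scripted. What follows is only a minimal sketch, not the procedure used in this report: the function names `ping_loss`, `build_commands` and `run_once` are made up for illustration, and it assumes `flent` and `ping` are installed on the LAN-side client and that the netserver host is reachable at the address you pass in. It starts an RRUL run, pings the server for the same number of seconds, and reports the fraction of echo requests that got no reply.

```python
import re
import subprocess

# Summary line printed by both Linux (iputils) and BSD ping, for example:
#   "N packets transmitted, M received, ..."
#   "N packets transmitted, M packets received, ..."
_SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def ping_loss(ping_output: str) -> float:
    """Return the fraction of echo requests without a reply (0.0 to 1.0)."""
    match = _SUMMARY.search(ping_output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("ping sent no packets")
    return 1.0 - received / sent


def build_commands(server: str, seconds: int = 60):
    """Build (but do not run) the load and ping commands for one pass."""
    # flent: -l is the test length in seconds, -H is the netserver host.
    load = ["flent", "rrul", "-l", str(seconds), "-H", server]
    # ping sends one echo request per second by default, so -c ~ duration.
    ping = ["ping", "-c", str(seconds), server]
    return load, ping


def run_once(server: str, seconds: int = 60) -> float:
    """Run RRUL and ping concurrently; return the observed ping loss."""
    load_cmd, ping_cmd = build_commands(server, seconds)
    load = subprocess.Popen(load_cmd)
    try:
        result = subprocess.run(ping_cmd, capture_output=True, text=True)
    finally:
        load.wait()
    return ping_loss(result.stdout)
```

Calling `run_once("<server address>")` once with the single floating match rule and once with the separate ICMP match rule should make the difference between the two setups directly comparable.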

#6 Updated by Dave taht 7 months ago

I would try to update this bug to make it more specific to limiters, but I don't seem to have the privileges.

#7 Updated by Patrik Hildingsson 5 months ago

I just wanted to chime in that I see exactly the same behaviour on my setup.
Is there any progress on the issue?
