Bug #10189

pfSense calculates wrong IP header checksum when reassembling packets with different MTU

Added by Stefan Mark 3 months ago. Updated 2 months ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: Interfaces
Target version:
Start date: 01/17/2020
Due date:
% Done: 0%
Estimated time:
Affected Version: 2.4.4-p3
Affected Architecture: All

Description

IP packets that are routed through pfSense are reassembled if the incoming packets are fragments and the MTU of the outgoing interface allows refragmentation to a higher MTU. In some cases the IP header checksum is calculated incorrectly:
- The problem occurs if the fragments are reassembled into exactly one packet (e.g. IN: 1400+80 > OUT: 1480; or IN: 600+600+250 > OUT: 1450).
- The problem does not occur if the fragments are reassembled into more than one outgoing packet (e.g. IN: 1400+1400 > OUT: 1500+1300).
(I rounded the packet sizes for simplicity; in reality there is also some overhead for source and destination IP, etc.)

Setup for reproduction:
-----------------------

Machine A <> pfsense <> Machine B

- Machine A has set an MTU of 1400
- pfsense and Machine B use MTU of 1500

Steps for reproduction:
-----------------------
1.) Ping from A to B (via gateway pfSense) using IPv4: ping <ip of B> -s 1450
2 packets (fragments) are received by pfSense (1400+50)
1 packet leaves pfSense (1450) -> IP header checksum wrong; B drops the packet; no ICMP reply message
2.) Ping from A to B (via gateway pfSense) using IPv4: ping <ip of B> -s 1600
2 packets (fragments) are received by pfSense (1400+200)
2 packets (fragments) leave pfSense (1500+100) -> IP header checksums OK; B answers with ICMP reply
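The rounded sizes in the two steps above can be worked out exactly. The following is a minimal Python sketch, assuming a 20-byte IPv4 header without options and the 8-byte ICMP echo header that ping adds; the function name `fragment_sizes` is made up for illustration and is not part of pfSense or FreeBSD.

```python
IP_HEADER = 20    # IPv4 header without options (assumption)
ICMP_HEADER = 8   # ICMP echo header added by ping

def fragment_sizes(icmp_data, mtu):
    """Return the total length of each IPv4 packet needed for `ping -s icmp_data`."""
    remaining = icmp_data + ICMP_HEADER   # IP payload of the whole datagram
    room = mtu - IP_HEADER                # payload space in one packet
    chunk = room // 8 * 8                 # non-final fragments carry a multiple of 8 bytes
    sizes = []
    while remaining > room:
        sizes.append(chunk + IP_HEADER)
        remaining -= chunk
    sizes.append(remaining + IP_HEADER)   # last (or only) packet
    return sizes

for size in (1450, 1600):
    # Left list: what A sends with MTU 1400. Right list: what fits an MTU of 1500.
    print(size, fragment_sizes(size, 1400), fragment_sizes(size, 1500))
```

Under these assumptions, `-s 1450` arrives as two fragments of 1396 and 102 bytes and fits into a single 1478-byte packet at MTU 1500, while `-s 1600` arrives as 1396 and 252 bytes and still needs two packets (1500 and 148 bytes) on the way out. This matches the failing and the working case in the report.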

Some details that may be relevant:
- we use VLANs on all networks
- the problem occurs independently of the underlying hardware (we see it on a Dell server and on original Netgate hardware, an XG-7100)
- if we disable scrub, packets are not reassembled when passing through pfSense and no error occurs

Interesting observation:
- if the header checksum is calculated incorrectly, it is exactly 0x0100 higher than the checksum of the first incoming fragment. This leads me to assume that there is a logic bug in the implementation of the header checksum calculation
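The 0x0100 offset can be explored with a small Python sketch of the standard Internet checksum (RFC 1071). One possible reading, which nothing in this ticket confirms, is that the outgoing packet kept the first fragment's checksum with only the TTL decrement applied, instead of getting a checksum recomputed for the new total length and cleared fragment flags. The addresses, IP ID and TTL values below are hypothetical, and the lengths 1396 and 1478 assume `ping -s 1450` with a 20-byte IP header and an 8-byte ICMP header.

```python
import socket
import struct

def ip_checksum(header):
    """RFC 1071 checksum over an IPv4 header (checksum field set to zero)."""
    words = struct.unpack("!%dH" % (len(header) // 2), header)
    total = sum(words)
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_header(total_len, flags_frag, ttl,
                 ident=0x1234, src="10.0.0.2", dst="10.1.0.2"):
    """20-byte IPv4 header for ICMP with a zero checksum (hypothetical values)."""
    return struct.pack("!BBHHHBBH4s4s",
                       0x45, 0, total_len, ident, flags_frag,
                       ttl, 1, 0,            # protocol 1 = ICMP, checksum = 0
                       socket.inet_aton(src), socket.inet_aton(dst))

MF = 0x2000  # "more fragments" flag in the flags/fragment-offset field

first_fragment_in = ip_checksum(build_header(1396, MF, ttl=64))
ttl_only_update = ip_checksum(build_header(1396, MF, ttl=63))
reassembled_ok = ip_checksum(build_header(1478, 0, ttl=63))

# Decrementing the TTL alone lowers one header word by 0x0100,
# so the checksum rises by 0x0100 (no carry wrap for these values).
print(hex(first_fragment_in), hex(ttl_only_update), hex(reassembled_ok))
print(ttl_only_update - first_fragment_in == 0x0100)
print(ttl_only_update != reassembled_ok)
```

With these illustrative values, the TTL-only checksum is exactly 0x0100 above the first fragment's checksum and differs from the checksum the reassembled header should carry. That matches the pattern described in the observation, though it does not prove that this is the actual cause.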

I think the setup is quite easy, so you should be able to reproduce it, but if you need assistance, don't hesitate to ask me.

History

#1 Updated by Jim Pingle 3 months ago

  • Category changed from Routing to Interfaces
  • Status changed from New to Feedback

You'll need to try reproducing that on bare FreeBSD (and FreeBSD+pf) -- Odds are that isn't caused by anything specific to pfSense, so it needs to be raised upstream. If it works OK on FreeBSD+pf and not an equivalent version of pfSense, then it's something we can look into.

#2 Updated by Stefan Mark 3 months ago

I tried to reproduce this with different FreeBSD versions:
- 13.0 : OK
- 11.2 : Fails
- 9.3 : OK

It seems that a bug was introduced between 9.3 -> 11.2 and fixed between 11.2 -> 13.0.

Here are the steps I ran on the FreeBSD system (live system):
ifconfig em0 10.0.0.1/24
ifconfig em1 10.1.0.1/24
service pf onestart
service pflog onestart
echo "scrub in all" > /tmp/pf
pfctl -e
pfctl -f /tmp/pf
sysctl net.inet.ip.forwarding=1

I hope this will help you find the bug and fix it in the next pfSense release.

#3 Updated by Jim Pingle 3 months ago

Have you also tried on pfSense 2.4.5 and 2.5.0 snapshots to see if it persists there as well?

#4 Updated by Stefan Mark 3 months ago

No, I haven't tried these versions yet and currently don't have time to investigate further.
Once 2.4.5 becomes stable, we'll update our pfSense firewalls and I'll be able to check whether the bug is fixed.

Looking at the release notes of 2.4.5, I don't see anything that might correct the error, but the following two items in the release notes of 2.4.4 (https://docs.netgate.com/pfsense/en/latest/releases/2-4-4-new-features-and-changes.html) may have introduced the bug:
https://www.freebsd.org/security/advisories/FreeBSD-SA-18:08.tcp.asc
https://www.freebsd.org/security/advisories/FreeBSD-SA-18:10.ip.asc

#5 Updated by Jim Pingle 3 months ago

If it's fixed in 13, there is a possibility that the fix was MFCd from 13 to 12-STABLE and back to 11-STABLE. 2.4.5 is built from 11-STABLE at a point after 11.3, so if the fix was brought back that far, it may be included.

#6 Updated by Danilo Zrenjanin 2 months ago

I replicated the issue on SG-1100 2.4.4-p3, following the steps from the description. Ping was failing when the packet size was set to 1450 (ping <ip of B> -s 1450) and host A MTU was set to 1400. Host A and host B were connected to different interfaces (no VLANs).

After upgrading to 2.4.5-DEVELOPMENT, ping started to work using the same setup!

#7 Updated by Jim Pingle 2 months ago

  • Status changed from Feedback to Resolved
  • Target version set to 2.4.5

Great, so it looks like the issue is resolved in FreeBSD. I'll close this for now.
