Bug #729

closed

if_bridge unpredictable filter interface selection

Added by Jonathan Tripathy almost 14 years ago. Updated over 9 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: Rules / NAT
Target version:
Start date: 07/09/2010
Due date:
% Done: 0%
Estimated time:
Plus Target Version:
Release Notes:
Affected Version: All
Affected Architecture:

Description

This bug concerns traffic leaking when using bridged interfaces. I have been able to reproduce it on 2 separate systems. One system was an independent test bed which was not connected to any other network. I was using the 2.0 snapshot from 7th July 2010.

1 ) Set up pfsense with 2 interfaces, WAN and LAN.
2 ) Bridge WAN and LAN together (they will share the same subnet).
3 ) Make sure that the WAN interface is assigned an IP, but the LAN interface is not.
5 ) Connect a host to the LAN interface (we will call this Z) and assign an IP to it. Also connect a host to the WAN interface, and assign an IP to it. We will use the host connected to the WAN interface for accessing the web GUI/shell.
6 ) In the WAN tab on pfsense, make sure that the only rule there is "allow all". I appreciate that you wouldn't do this in production, but this is just here to prove a point.
7 ) In the LAN tab, make sure the only rule listed is a "block all". This is important, as the assumption in this test is that hosts on the LAN interface will not be able to access anything.
8 ) Reset the state table.
9 ) Reset the ARP table (arp -d -a).
10 ) Try to ping the pfsense WAN interface IP from Z. You would expect this to fail, as the only rule on the LAN tab is block all. However, for about 5 minutes, traffic is allowed according to the rules in the WAN tab.

What I feel is happening is that once the ARP table and state table are cleared, pfsense thinks that traffic coming from the LAN interface is actually from the WAN interface. Hence, it evaluates the WAN rules instead of the LAN rules.

I have also tested the above with 3 interfaces (with LAN and WAN bridged together, and OPT1 as a standard subnet). During the initial 5 minutes, provided that the LAN host is using the pfsense WAN IP as its default gateway (!), it is able to access everything behind OPT1.

This is very worrying, as the state and ARP tables are empty when pfsense boots, so this problem would manifest itself then.
If you need any more tests done, or more information, please do ask and I will do whatever I can to help.

Many Thanks


Actions #1

Updated by Jonathan Tripathy almost 14 years ago

I currently don't have access to the test box, but I will later on this evening, and once I do, I will post the output of:

ifconfig -a

and

cat /tmp/rules.debug

Cheers

Actions #2

Updated by Jim Pingle almost 14 years ago

  • Project changed from pfSense Packages to pfSense
Actions #3

Updated by Jim Pingle almost 14 years ago

  • Category set to Rules / NAT
  • Target version set to 2.0
  • Affected Version set to 2.0
Actions #4

Updated by Chris Buechler almost 14 years ago

  • Priority changed from Urgent to Normal

In such a config you must disable the antilockout rule under System > Advanced if you want to block everything out LAN.

I can reproduce with that disabled. If 'arp -a' shows the MAC/IP of the LAN-side host on the WAN interface (same subnet as the host on LAN), then traffic to the interface IP on the WAN interface hits the WAN ruleset. That's not true of any traffic traversing the firewall. It's not a problem in our code: the rules generated are correct, and this looks to just be how pf works in such scenarios. Leaving open for feedback from Ermal on why pf functions this way, and whether that can or should be changed.

Actions #5

Updated by Jonathan Tripathy almost 14 years ago

I see what you mean about the Anti-Lock out rule, however when I first discovered this issue, I was actually using a 3rd OPT1 interface (with WAN and OPT1 bridged, with LAN on a private subnet), and the same problem was happening - i.e. block everything on OPT1 tab, yet WAN rules are evaluated when arp cache is flushed and states reset.

Any ideas/comments?

I really do feel that this is a serious issue

Actions #6

Updated by Jonathan Tripathy almost 14 years ago

Forgot to mention: the LAN side host (or OPT1 side host, depending on which scenario I'm doing) does indeed always appear as being on WAN in the ARP cache for me...

Actions #7

Updated by Jonathan Tripathy almost 14 years ago

Also forgot to mention, that if using the 3 interface setup (with WAN and OPT1 bridged, with LAN on a private subnet), if a host on the OPT1 interface (which should not be allowed to access anything) uses the WAN IP address of pfsense as its default gateway, then it can access hosts on the LAN interface, provided the WAN rules allow for this. I have tested this and it does indeed do this.

IMHO, the WAN rules should never be evaluated for hosts on the OPT1 interface.

This is the major issue

Actions #8

Updated by Jonathan Tripathy over 13 years ago

Does anyone have an update on this? This is a major security issue in pf.

Please keep in mind that I discovered this issue on a production system, as restricted hosts were able to access my internal LAN hosts (not just the router web GUI).

Actions #9

Updated by Ermal Luçi over 13 years ago

Can you please try this command and see if it changes anything:
sysctl net.link.bridge.pfil_local_phys=1

Actions #10

Updated by Ermal Luçi over 13 years ago

  • Status changed from New to Feedback
Actions #11

Updated by Chris Buechler over 13 years ago

This scenario is described in the if_bridge man page:

The packets destined to the bridging host will be seen by the filter on
     the interface with the MAC address equal to the packet's destination MAC.

http://www.freebsd.org/cgi/man.cgi?query=if_bridge&apropos=0&sektion=0&manpath=FreeBSD+8.1-RELEASE&format=html

So it is the expected behavior, such configurations should not be used if that isn't the behavior you desire. If you set the default gateway on hosts to an IP on a different interface of a bridge, all that host's traffic will be destined to the WAN MAC and hence have the WAN rules apply.

net.link.bridge.pfil_local_phys does not change the behavior, it does as described whether that's 0 or 1.
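
The man page passage quoted above can be illustrated with a small model. This is only a sketch, not the if_bridge or pf implementation: the MAC addresses are made up, the `Member` type and `filter_interface` function are hypothetical names, and the single "pass"/"block" action per interface stands in for the one-rule WAN and LAN tabs from the original report. It assumes that a frame not addressed to one of the bridge members is filtered on the member it arrived on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str    # interface label from this ticket, e.g. "WAN" or "LAN"
    mac: str     # made-up MAC address, for illustration only
    action: str  # "pass" or "block": stands in for the tab's single rule


def filter_interface(members, ingress, dst_mac):
    """Return the bridge member whose ruleset sees an inbound frame.

    Frames addressed to the bridging host itself are seen on the member
    whose MAC equals the destination MAC (per the man page quote).
    All other frames are assumed to be seen on the member they arrived on.
    """
    for member in members:
        if member.mac.lower() == dst_mac.lower():
            return member
    return ingress


wan = Member("WAN", "02:00:00:00:00:01", "pass")   # "allow all" tab
lan = Member("LAN", "02:00:00:00:00:02", "block")  # "block all" tab
members = [wan, lan]

# Host Z sits on LAN but addresses its frame to the WAN MAC
# (pinging the WAN IP, or using it as default gateway).
seen_on = filter_interface(members, ingress=lan, dst_mac=wan.mac)
print(seen_on.name, seen_on.action)  # prints "WAN pass"
```

In this model a frame from Z to any MAC that is not a bridge member is still evaluated against LAN and blocked, which matches the observation in note #4 that traffic traversing the firewall is not affected.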

Actions #12

Updated by Jonathan Tripathy over 13 years ago

Hi Chris,

I appreciate what you are saying, but the point of concern here is that the host shouldn't be able to use the WAN interface as its default gateway, as the only rule set on the respective tab is block all (i.e. no traffic should be allowed through). In my test, I was using OPT1 as the "restricted" interface (which had the block all rule).

Actions #13

Updated by Jonathan Tripathy over 13 years ago

And please remember that the problem goes away after you wait for a few minutes. This only happens when the ARP table is flushed and the states are reset (for example at system boot).

Actions #14

Updated by Chris Buechler over 13 years ago

  • Subject changed from Traffic Leaks when using bridged interfaces to if_bridge unpredictable filter interface selection
  • Status changed from Feedback to New

Ermal has a patch to if_bridge that seems to make filtering with if_bridge behave as it logically should (not as it is intended to function, which is as described in the man page above). Setting to New until it's committed. My tests have been fine.

Actions #15

Updated by Ermal Luçi over 13 years ago

Patch integrated in the builds. Please test the latest one.

Actions #16

Updated by Chris Buechler over 13 years ago

  • Status changed from New to Feedback
Actions #17

Updated by Jonathan Tripathy over 13 years ago

I'll test this latest snapshot ASAP.

Very keen to see the results

Actions #18

Updated by Jon Bruce over 13 years ago

Using pfSense-Full-Update-2.0-BETA4-20100808-1004.tgz I have a WAN and LAN and also a PHONE which is bridged to LAN. Upon upgrading to pfSense-Full-Update-2.0-BETA4-20100810-0228.tgz the LAN stops passing traffic. I tested this by upgrading then downgrading, I've also tested every firmware version since. Firewall logs show that the traffic is being blocked by the default rule and one other, I believe it was something about a packet (I can get specifics if needed). The PHONE interface continues to work fine.

The PHONE device has a valid public IP in it, using the ISP gateway. WAN is also a valid public IP statically assigned by my ISP.

Actions #19

Updated by Jim Pingle over 13 years ago

A note on the last comment there:
From that config it looks like "Phones" is bridged to WAN, not LAN, which is more consistent with your comment about the Phone device having a public IP and that it uses the ISP gateway. (It seems that saying it was bridged to LAN was a typo of sorts)

Actions #20

Updated by Jon Bruce over 13 years ago

Yes, my bad. PHONE is bridged to WAN not LAN

Actions #21

Updated by Derek Buttineau over 13 years ago

Seeing something different, but similar with a bridge configuration. It appears now that the arp cache is leaking into the bridge.

In my configuration, I have a PFSense device setup with 3 interfaces. 2 of these interfaces are bridged together into a transparent bridge, the 3rd interface is used for management.

The management interface has an IP of 10.10.15.11 and a gateway of 10.10.15.1

The bridge terminates into a 24 port switch behind it, and has 2 VLANs configured and the firewall connects to port 1. VLAN 1 is the default VLAN and VLAN 100 is the management VLAN. The Management VLAN has an IP of 10.10.15.54 assigned to it, and has only port 23 attached to it.

When we attempted to connect to the switch for management, we noticed a huge amount of packet loss (> 80%), on investigating, we could see that the switch was receiving the arp for 10.10.15.1 from port 1 instead of port 23. If I bypass the firewall, everything works normally and the arp for the gateway is received on port 23.

The odd thing is that the arp 10.10.15.11 (the local IP on the firewall) does not appear to be being leaked, as I'm receiving that normally on port 23.

I've attached a rough drawing of our setup with this device as well as the PFSense configuration.

Actions #22

Updated by Ermal Luçi over 13 years ago

  • Status changed from Feedback to New

This needs revisiting at proper time because now the patch that was added is not in the builds.

Actions #23

Updated by Jon Bruce over 13 years ago

The original issue is sorted, however the problem from #22 could still be an issue.

Actions #24

Updated by Ermal Luçi over 13 years ago

This is possibly too late for 2.0, since there are if_bridge(4) changes involved which might become problematic.
The problem is identified and analyzed, and a half solution was tested, so I think 2.1+ is a better candidate.

Actions #25

Updated by Ermal Luçi about 13 years ago

  • Target version changed from 2.0 to 2.1
Actions #26

Updated by Chris Buechler about 13 years ago

It works exactly as it should per the man page; there are just certain ways you shouldn't configure it, or you should expect these results. This is a documentation issue at this point; if it can be fixed in the future without a lot of effort, we can revisit.

Actions #27

Updated by Derek Buttineau about 13 years ago

Chris, was that in response to the issue I noted or the original one? I could understand the IP of the management interface leaking into the bridge, but the gateway leaking into it seems strange.

Actions #28

Updated by Chris Buechler about 12 years ago

  • Target version changed from 2.1 to 2.2
  • Affected Version changed from 2.0 to All
Actions #29

Updated by Phil Lavin over 11 years ago

Continuing discussion of https://redmine.pfsense.org/issues/2744

Odd that it's just appeared now on an established and tested setup after an upgrade to latest snapshot.

What's the recommended workaround? Presently I've just added a static ARP entry on the device connected to the "Firewall" interface.

Actions #30

Updated by Ermal Luçi over 9 years ago

This should work better on pfSense 2.2 as of 1 week ago!

Actions #31

Updated by Chris Buechler over 9 years ago

  • Status changed from New to Closed

I've been through a good deal of bridging testing in 2.2. It all behaves as expected. The subject-described issue is possible in certain configurations, but we have alternate configuration options today that didn't exist at the creation of this ticket.
