Bug #6220
State mismatch with host-initiated traffic matching binat to IP not locally assigned
Description
When the host itself initiates traffic that binat translates to an IP that isn't locally assigned on the system, the reply traffic results in a state mismatch. The example here is simplified as much as possible.
OS setup:
vtnet0 - 172.30.6.133/24 - acting as WAN
vtnet1 - 192.168.1.1/24 - acting as LAN
172.30.6.1 default gateway
172.30.6.88/32 routed to 172.30.6.133 by 172.30.6.1
pf.conf:
binat on vtnet0 from 192.168.1.1 to any -> 172.30.6.88
pass out quick from any to any
block quick from any to any
Then generate some traffic that matches the binat.
# ping -S 192.168.1.1 172.30.6.1
and you get no reply. Counters show the egress traffic matches the binat and 'pass out' rules, but the replies match the block rule.
# pfctl -vvsn
@0(0) binat on vtnet0 inet from 192.168.1.1 to any -> 172.30.6.88
  [ Evaluations: 38 Packets: 6 Bytes: 504 States: 1 ]
  [ Inserted: pid 20547 State Creations: 1 ]
# pfctl -vvsr
@0(0) pass out quick all flags S/SA keep state
  [ Evaluations: 45 Packets: 32 Bytes: 2824 States: 4 ]
  [ Inserted: pid 20547 State Creations: 6 ]
@1(0) block drop quick all
  [ Evaluations: 39 Packets: 39 Bytes: 3609 States: 0 ]
  [ Inserted: pid 20547 State Creations: 0 ]
The state that's created is correct.
# pfctl -vvss | grep -A 2 icmp
vtnet0 icmp 172.30.6.88:51795 (192.168.1.1:51795) -> 172.30.6.1:51795 0:0
   age 00:00:03, expires in 00:00:10, 4:4 pkts, 336:336 bytes, rule 0
   id: 000000005717dd2c creatorid: 776a8091
If you change the ruleset to allow the reply traffic, you get the first ping response but nothing after it. The state created by the ping reply prevents later packets from matching the binat state, so subsequent echo requests go out without the binat translation.
This pf.conf:
binat on vtnet0 from 192.168.1.1 to any -> 172.30.6.88
pass out quick from any to any
pass in quick from any to any
With the same ping, you get one reply and nothing further.
# ping -S 192.168.1.1 172.30.6.1
PING 172.30.6.1 (172.30.6.1) from 192.168.1.1: 56 data bytes
64 bytes from 172.30.6.1: icmp_seq=0 ttl=64 time=0.515 ms

# pfctl -vvss | grep -A 2 icmp
vtnet0 icmp 172.30.6.88:42562 (192.168.1.1:42562) -> 172.30.6.1:42562 0:0
   age 00:00:14, expires in 00:00:00, 1:1 pkts, 84:84 bytes, rule 0
   id: 010000005717dbe5 creatorid: 5cdfce54
--
vtnet0 icmp 192.168.1.1:42562 <- 172.30.6.1:42562 0:0
   age 00:00:33, expires in 00:00:10, 1:32 pkts, 84:2688 bytes, rule 1
   id: 000000005717de2f creatorid: 5cdfce54
The second state shouldn't be there; the reply should match the first state.
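To spot this symptom without reading the state table by eye, here is a minimal Python sketch that scans `pfctl -vvss` text for an inbound state keyed on the pre-translation address of an existing binat state. The function names (`parse_states`, `find_stray_states`) and the regular expression are hypothetical helpers written for this report, and the parser only assumes the state line layout shown in the output above.

```python
import re

# Matches the first line of a state entry as printed above, e.g.
#   vtnet0 icmp 172.30.6.88:42562 (192.168.1.1:42562) -> 172.30.6.1:42562 0:0
# The parenthesised pre-translation address is optional.
STATE_RE = re.compile(
    r"^(?P<ifname>\S+)\s+(?P<proto>\S+)\s+"
    r"(?P<left>\S+)"
    r"(?:\s+\((?P<orig>[^)]+)\))?"
    r"\s+(?P<dir><-|->)\s+"
    r"(?P<right>\S+)"
)

# State output copied from this report.
SAMPLE = """\
vtnet0 icmp 172.30.6.88:42562 (192.168.1.1:42562) -> 172.30.6.1:42562 0:0
   age 00:00:14, expires in 00:00:00, 1:1 pkts, 84:84 bytes, rule 0
   id: 010000005717dbe5 creatorid: 5cdfce54
--
vtnet0 icmp 192.168.1.1:42562 <- 172.30.6.1:42562 0:0
   age 00:00:33, expires in 00:00:10, 1:32 pkts, 84:2688 bytes, rule 1
   id: 000000005717de2f creatorid: 5cdfce54
"""


def parse_states(text):
    """Return one dict per state header line; detail lines are skipped."""
    states = []
    for line in text.splitlines():
        match = STATE_RE.match(line.strip())
        if match:
            states.append(match.groupdict())
    return states


def find_stray_states(states):
    """Pair each translated outbound state with any inbound state that
    uses its pre-translation address toward the same peer."""
    pairs = []
    for nat in states:
        if nat["orig"] is None or nat["dir"] != "->":
            continue
        for other in states:
            if (
                other["orig"] is None
                and other["dir"] == "<-"
                and other["ifname"] == nat["ifname"]
                and other["proto"] == nat["proto"]
                and other["left"] == nat["orig"]
                and other["right"] == nat["right"]
            ):
                pairs.append((nat, other))
    return pairs


if __name__ == "__main__":
    for nat, stray in find_stray_states(parse_states(SAMPLE)):
        print("stray state:", stray["left"], stray["dir"], stray["right"])
```

On an affected system the same functions could be fed the captured output of `pfctl -vvss` instead of `SAMPLE`; an empty result would suggest replies are matching the binat state as expected.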
Workarounds
Two ways I know of to make this not happen.
1) Add an IP alias for the translated IP.
# ifconfig lo0 inet 172.30.6.88/32 alias
The address just needs to be defined on the system; it has nothing to do with ARP. With the same configs as above, it works after adding that alias to lo0.
2) Discovered by accident that adding a setkey "none" policy matching the translation IP also makes it work.
# setkey -DP
192.168.1.0/24[any] 192.168.1.0/24[any] any
	in none
	created: Apr 20 19:53:07 2016  lastused: Apr 20 20:46:13 2016
	lifetime: 9223372036854775807(s) validtime: 0(s)
	spid=12 seq=9 pid=23348
	refcnt=1
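For workaround 1, it can help to confirm whether the translated IP is actually assigned on the system before and after adding the alias. The sketch below is one rough way to check, assuming the usual socket behavior where binding to an address that is not configured locally fails with EADDRNOTAVAIL (settings that permit non-local binds would defeat it). The function name `is_locally_assigned` is hypothetical.

```python
import errno
import socket


def is_locally_assigned(ip):
    """Return True if a UDP socket can bind to `ip`, which normally
    means the address is configured on some local interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((ip, 0))  # port 0: let the OS pick any free port
        except OSError as exc:
            if exc.errno == errno.EADDRNOTAVAIL:
                return False
            raise  # any other failure is unexpected; surface it
        return True


if __name__ == "__main__":
    # 172.30.6.88 is the binat target from this report.
    print("172.30.6.88 assigned locally:", is_locally_assigned("172.30.6.88"))
```

In the setup described here, this would be expected to report False before the `ifconfig lo0 ... alias` command and True afterwards.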
I'm pretty sure the problem is in tryforward. It doesn't happen on any pre-2.3 versions.