Bug #1629

closed

invalid state table entries after WAN IP change

Added by Eli Hunter almost 13 years ago. Updated about 7 years ago.

Status:
Resolved
Priority:
High
Category:
Rules / NAT
Target version:
Start date:
06/29/2011
Due date:
% Done:

100%

Estimated time:
Plus Target Version:
Release Notes:
Affected Version:
All
Affected Architecture:

Description

We have an asterisk server behind pfsense 2.0-RC3 using a PPPoE DSL connection.
Whenever our WAN IP changes the asterisk server cannot register to the providers.
Flushing the state table allows asterisk to register again.


Files

state_table.txt (25.2 KB) Eli Hunter, 07/05/2011 09:00 PM
log.txt (20.6 KB) Eli Hunter, 08/02/2011 08:32 PM
states-bug.patch (1.22 KB) Daniel Haid, 05/02/2015 10:09 PM
Actions #1

Updated by Evgeny Yurchenko almost 13 years ago

Please provide state-table dump before IP change and after.

Actions #2

Updated by Eli Hunter almost 13 years ago

Hopefully this is what you wanted.

My IP before the address changed was 76.254.18.100 and the new assigned address is 99.179.45.73

Bad entry in state table before resetting states
udp 10.0.4.3:5060 -> 76.254.18.100:13819 -> 67.215.241.250:5060 SINGLE:NO_TRAFFIC

I've attached a txt document with the state table info before and after flushing it.
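
A minimal sketch of how the bad entry quoted above could be picked out of a state-table dump automatically. It assumes the line layout pasted in this comment (proto, source, NAT address, destination, state, with no interface column); the function name find_stale_states is hypothetical and not part of pfSense.

```shell
#!/bin/sh
# find_stale_states OLD_WAN_IP   (hypothetical helper, not a pfSense tool)
# Reads state-table lines on stdin in the layout pasted in this thread:
#   proto SRC:PORT -> NAT:PORT -> DST:PORT STATE
# and prints the lines whose NAT (middle) address equals OLD_WAN_IP.
find_stale_states() {
    awk -v ip="$1" '
        # A translated state has two arrows, in fields 3 and 5.
        $3 == "->" && $5 == "->" {
            split($4, nat, ":")          # nat[1] = address, nat[2] = port
            if (nat[1] == ip) print      # exact match, no regex on the dots
        }'
}
```

On the firewall this would be fed from the live table, for example `pfctl -ss | find_stale_states 76.254.18.100`. Dumps that carry a leading interface column would need the field numbers shifted by one.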

Actions #3

Updated by Chris Buechler almost 13 years ago

  • Category set to PPP Interfaces
  • Priority changed from Normal to High
  • Target version set to 2.0
  • Affected Version set to 2.0

PPPoE is supposed to clear all states on that interface when an IP changes, that's not happening correctly.

Actions #4

Updated by Ermal Luçi almost 13 years ago

  • Status changed from New to Feedback
  • % Done changed from 0 to 100
Actions #5

Updated by Ermal Luçi almost 13 years ago

Actions #6

Updated by Chris Buechler almost 13 years ago

That's expected to happen in 1.2.3 (it has no provisions for dealing with that scenario, only 2.0 does).

Actions #7

Updated by Ermal Luçi almost 13 years ago

Have you tested this on 2.0?

Actions #8

Updated by Eli Hunter almost 13 years ago

I got the update installed last week but haven't had the IP change on me yet (surprisingly). I'll update this once the IP's changed.

Actions #9

Updated by Matt Corallo almost 13 years ago

I have the same problem (after the fixes) with IPv6 tunneling, so this is not resolved.

Actions #10

Updated by Chris Buechler almost 13 years ago

IPv6 is a completely different version, that's 2.1 not 2.0, post info to the IPv6 board on the forum.

Actions #11

Updated by Matt Corallo almost 13 years ago

No, no, I'm not talking about IPv6 in pfSense; I'm talking about IPv6 NAT passthrough in the "System: Advanced: Networking" menu in 2.0, not 2.1. It's the same bug for the same reason.

Actions #12

Updated by Eli Hunter almost 13 years ago

I had it reset again this weekend which took the asterisk server down again. Unfortunately I wasn't near a computer and had to get things up and running for them quickly so I used my phone to reset the state table. This got their phones working again but I wasn't able to get a copy of any logs. I'm going to assume this isn't fixed yet but it's probably good to wait until it happens again so I can make sure it's still a problem.

Actions #13

Updated by Eli Hunter over 12 years ago

It's still happening.
Again here's a relevant section. Our Asterisk server at 10.0.4.3 is still trying to use the old gateway address of 99.58.29.27. Our server at 10.0.4.100 is using the correct gateway at 99.169.80.219

I can attach the full state table again if it helps.

udp 10.0.4.3:5060 -> 99.58.29.27:37488 -> 209.62.1.2:5060 MULTIPLE:MULTIPLE
tcp 10.0.4.100:55507 -> 99.169.80.219:46579 -> 216.52.233.157:443 ESTABLISHED:ESTABLISHED

Actions #14

Updated by Ermal Luçi over 12 years ago

Can you post system log with state table as well?

Actions #15

Updated by Eli Hunter over 12 years ago

I just copied the system log page and state table in attached document.

Would collecting the data with a syslog server help? I can get it setup if it helps track this down.

Actions #16

Updated by Ermal Luçi over 12 years ago

From the attached:
- What is the old gateway?
- What is the new gateway?
- What is the wrong entry?

Actions #17

Updated by Chris Buechler over 12 years ago

  • Target version changed from 2.0 to 2.0.1
Actions #18

Updated by → luckman212 over 12 years ago

Hi- I've just experienced the exact same issue. pfSense 2.0(REL) running nanobsd-2g on a Netgate Hamakua. My WAN DHCP lease expired and when it renewed the IP had changed. My internal Asterisk server lost all trunk registrations and I had to manually reset the states to fix it.

Has there been any update on this problem - or is there a workaround that doesn't require manual intervention?

Actions #19

Updated by Andrea Cutelle' over 12 years ago

Hi, I see the same error in my installation: pfSense 2.0 release running on a Jetway NC9C-550LF. I have a static public IP; when the connection goes down and then comes back up, my Asterisk server loses its connection to the trunk and stays in the "Request Sent" state. Resetting the states makes it work again.

sorry for my english..

Actions #20

Updated by Chris Buechler over 12 years ago

  • Target version deleted (2.0.1)
Actions #21

Updated by Pho Bia over 12 years ago

I also experience this with my SIP device (PAP2T). I thought my provider was to blame as changing the remote server usually got my phones back online.

I have a multiple WAN (3) setup.

Are you still looking for logs on this, or is the fix already known?

Actions #22

Updated by Pho Bia over 12 years ago

This is what my states look like for my affected device from Diagnostics --> States when my VoIP adapter shows offline (filtered for my PAP2T IP only):

udp 64.120.22.242:5060 <- 192.168.0.100:5061 NO_TRAFFIC:SINGLE
udp 64.120.22.242:5060 <- 192.168.0.100:5060 NO_TRAFFIC:SINGLE
udp 192.168.0.100:5060 -> 64.231.21.117:52434 -> 64.120.22.242:5060 SINGLE:NO_TRAFFIC
udp 192.168.0.100:5061 -> 64.231.21.117:57887 -> 64.120.22.242:5060 SINGLE:NO_TRAFFIC

This is what it looks like after I reset states and my service goes back online :

udp 64.120.22.242:5060 <- 192.168.0.100:5061 MULTIPLE:MULTIPLE
udp 192.168.0.100:5061 -> 64.231.160.191:46020 -> 64.120.22.242:5060 MULTIPLE:MULTIPLE
udp 64.120.22.242:5060 <- 192.168.0.100:5060 MULTIPLE:MULTIPLE
udp 192.168.0.100:5060 -> 64.231.160.191:56285 -> 64.120.22.242:5060 MULTIPLE:MULTIPLE

Actions #23

Updated by Christian Schwarz over 12 years ago

Bug still present with 2.0.1.
This is not happening every time the IP changes (the provider disconnects once a day), but after a few days our SIP registration is down. The state table shows an entry for the SIP provider with the old WAN IP.
Flushing the states manually will bring the trunk up again. (Using Telekom Germany with Panasonic PBX)

Actions #24

Updated by Akom Benevolent almost 12 years ago

Same deal, 2.0.1-RELEASE and this happens every so often, but not on every IP change. I can delete the 2 state entries for the old WAN IP, and then asterisk registers fine.

Actions #25

Updated by Grant Emsley almost 12 years ago

I'm seeing the exact same behavior using a PPPOE internet connection, 2.0.1-RELEASE i386, and asterisk.

When my connection goes down, the states remain in the state table and the asterisk server is unable to register.

Actions #26

Updated by Nicklas Blidmo almost 12 years ago

Same issue with WAN DHCP, 2.0.1-RELEASE (i386) (nanobsd)

Actions #27

Updated by Chris Buechler almost 12 years ago

  • Category changed from PPP Interfaces to Rules / NAT
  • Status changed from Feedback to New
  • Target version set to 2.1
  • Affected Version changed from 2.0 to All
Actions #28

Updated by Grant Emsley almost 12 years ago

I don't know if it will help, but I noticed it happens much more frequently when my internet connection is flaky. It happened almost every time when my PPPOE connection was dropping every 2-5 minutes due to a loose cable.

Actions #29

Updated by fos4X fos4X over 11 years ago

I can confirm that this problem still exists in 2.0.1-RELEASE (amd64) built on Mon Dec 12 18:16:13 EST 2011 using PPPoE with static IP (but 24h disconnects)

Actions #30

Updated by Jim Pingle over 11 years ago

  • Status changed from New to Feedback

Some fixes for this have gone into 2.1 over the past few months. Try a 2.1-BETA snapshot and see if it's repeatable there.

Actions #31

Updated by fos4X fos4X over 11 years ago

Confirmed to still be an issue in 2.1-BETA0 (amd64) built on Wed Nov 28 15:23:39 EST 2012

A reconnect of the PPPoE WAN (manual or 24h) leads to Asterisk being unable to register; it hangs at "Request Sent".

Could be related to #2700 but if older revisions are any indication, removing the /32 from $3/32 will not help either.

I would be willing to test any suggested (hot)fixes.

I read elsewhere that a pfctl -b <gw> in ppp_linkup (UP!!!!) helps, have not tried it yet though (every trial means disconnecting my colleagues from the web for a few minutes)

Actions #32

Updated by pierre mayer over 11 years ago

still not working with 2.1Beta0(i386)built pfSense-memstick-2.1-BETA0-i386-20121128-1058.img

need to reset state table to make freepbx working

Actions #33

Updated by Ermal Luçi about 11 years ago

  • Target version changed from 2.1 to 2.2

The only real solution to this is to switch to if-bound states for many reasons.
That is a bit more involved a change than fits in 2.1.

Actions #34

Updated by Chris Buechler about 11 years ago

  • Status changed from Feedback to New
  • Target version changed from 2.2 to 2.1

we at least need the option to wipe the entire state table upon IP change.

Actions #35

Updated by Ermal Luçi about 11 years ago

  • Status changed from New to Feedback

Ok i went and did another implementation fix for this.
Can you please try with later 2.1 snapshots and see if it behaves correctly?

Actions #36

Updated by Tobias Wigand about 11 years ago

Does not work for me.
Correct me if I'm mistaken here, but can
pfctl -i
work without binding states to interfaces?
One of my external interfaces is em1, but
pfctl -i em1 -ss
does not show anything, although I have a working VoIP state in the table going out on that interface.

Actions #37

Updated by Ermal Luçi about 11 years ago

Check with the next snapshot; there was a problem with the patch that has been corrected.

Actions #38

Updated by Tobias Wigand about 11 years ago

Does not work, sorry. Only the "Out" states are flushed. The "In" states persist and seem to remember their gateway. After some time the "Out" states come back with the non-existent / down gateway. You can see this in the long pftop view. The debug.rules are correct, but they are never used because of the persisting "In" states. Or do I need to use floating rules for this? Would they behave differently?

Actions #39

Updated by Matthias Dilbert about 11 years ago

This problem also affects me. I’ve upgraded to Snapshot "built on Sat Feb 9 23:46:16 EST 2013". I will look for the problem to reoccur.

Actions #40

Updated by Matthias Dilbert about 11 years ago

Today the problem occurred again. So it was not fixed yet.

Actions #41

Updated by Ermal Luçi about 11 years ago

I just pushed another change to reset states with certain gateways set.
It should behave even better than previously, since it will send a RST for tcp states getting killed belonging to a certain gateway.

UPDATE for later: Probably a more thorough way of handling chained state dependencies needs to be implemented with the -Fs option.
That is, if you kill a state belonging to an interface, try to find the correlated state on any other interface if present, especially for traffic not originated by pfSense.
That is a bit more involved, and more careful checks against corrupting the table need to be done, but for now this should work correctly.

Actions #42

Updated by Tobias Wigand about 11 years ago

Thanks, it works with my VoIP device now. The states get killed correctly.

Actions #43

Updated by Dim Hatz about 11 years ago

Ermal, testing this feature on a pfsense box with a WAN interface that gets via DHCP an IP in a /24 subnet (i.e. it's not PPPoE), it won't kill pf states that originate from the LAN to any host in that /24 subnet (which includes the gwip). Connections beyond the WAN subnet seem to get killed, but the LAN states don't.

E.g. I establish an ssh connection from 192.168.100.12 to aa.bb.40.155, then manually flushed states using:
pfctl -i em0 -Fs -G gwip

After doing it 3 times, I get:
pfctl -ss | fgrep 40.155
em1 tcp aa.bb.40.155:22 <- 192.168.100.12:3131 ESTABLISHED:ESTABLISHED
em1 tcp aa.bb.40.155:22 <- 192.168.100.12:3590 ESTABLISHED:ESTABLISHED
em1 tcp aa.bb.40.155:22 <- 192.168.100.12:3595 ESTABLISHED:ESTABLISHED
em1 tcp aa.bb.40.155:22 <- 192.168.100.12:3597 ESTABLISHED:ESTABLISHED
em0 tcp 192.168.100.12:3597 -> xxx.yyy.1.201:65161 -> aa.bb.40.155:22 ESTABLISHED:ESTABLISHED

em0 WAN - xxx.yyy.1.201
em1 LAN - 192.168.100.1
remote ssh server - aa.bb.40.155

PS: Running latest 2.1-BETA snapshot (12-Feb 08:58)
MD5 (/sbin/pfctl) = af1a7f62f1ae26958ba050f6c6f418a6

Actions #44

Updated by Dim Hatz about 11 years ago

To followup my previous post, I have verified that the WAN (em0) states are indeed flushed, however their corresponding LAN (em1) states linger on.
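
The lingering LAN-side states described above can be listed mechanically. This is only a sketch, assuming the `pfctl -ss` layout pasted in the previous comment (interface column first, LAN states written as "DST <- SRC", WAN states as "SRC -> NAT -> DST"); the function name find_orphan_lan_states is hypothetical.

```shell
#!/bin/sh
# find_orphan_lan_states LAN_IF WAN_IF   (hypothetical helper)
# Reads state-table lines on stdin and prints every LAN-side inbound state
# that has no matching translated state on the WAN interface.
find_orphan_lan_states() {
    awk -v lan="$1" -v wan="$2" '
        # WAN half:  if proto SRC -> NAT -> DST STATE
        $1 == wan && $4 == "->" && $6 == "->" {
            seen[$2 " " $3 " " $7] = 1       # key: proto, source, destination
        }
        # LAN half:  if proto DST <- SRC STATE
        $1 == lan && $4 == "<-" {
            n++
            line[n] = $0
            key[n] = $2 " " $5 " " $3        # same key order as above
        }
        END {
            for (i = 1; i <= n; i++)
                if (!(key[i] in seen)) print line[i]
        }'
}
```

Run against the dump above (`pfctl -ss | find_orphan_lan_states em1 em0`), this would print the three em1 entries for ports 3131, 3590 and 3595, since only port 3597 still has an em0 counterpart. WAN states without NAT are not considered by this sketch.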

Actions #45

Updated by Renato Botelho about 11 years ago

  • Status changed from Feedback to New
  • % Done changed from 100 to 50
Actions #46

Updated by Matthias Dilbert about 11 years ago

I’ve upgraded to the latest beta, but the problem still persists. Even when the modem is restarted and I don’t get a new IP, the states go wrong.

Actions #47

Updated by Sebastian Chrostek about 11 years ago

Same Problem here with 2.1 Beta (built on Fri Mar 1 21:17:31 EST 2013)

It seems that also states without the old IP in it make problems with SIP.

In my case this two states:

udp 212.227.18.199:5060 <- 172.17.0.1:5060 MULTIPLE:MULTIPLE
udp 172.17.0.1:5060 -> 212.227.18.199:5060 SINGLE:NO_TRAFFIC

for my VoIP connection to "1und1" prevent asterisk from getting a connection.
if i clear only this two states, asterisk gets a connection only a few seconds later.

i use a PPPOE WAN

Actions #48

Updated by Tom De Coninck almost 11 years ago

I also have the same issues and following this issue. Maybe I can provide some extra information, sorry if it's double

I started using pfSense with version 2.0, and am now running the latest 2.0.3 on an ALIX board.

I was using a PPTP VDSL connection, and i am now using a cable WAN connection. There was no difference for asterisk. When the WAN ip changed, the UDP state with old wan ip address stayed alive. With the different WAN connections the bad UDP state stayed alive.

When the wan IP changed into a new address, the state stayed alive with the old WAN IP ADDRESS

udp LOCALASTERISKIP:5060 -> WANIPOLD:17205 -> SIPPROVIDER:5060 MULTIPLE:MULTIPLE
instead of
udp LOCALASTERISKIP:5060 -> WANIPNEW:17205 -> SIPPROVIDER:5060 MULTIPLE:MULTIPLE

I was able to kill the state using
pfctl -k LOCALASTERISKIP -k SIPPROVIDER

So to fix my issue i had to run this command every time the WAN IP Address changes.

I created this script with info i found in the internet

create /usr/local/bin/reset_states.sh

#!/bin/sh
# Kill Udp Sip States after new wan IP
echo "Killing States from ASTERISKIP to SIPPROVIDER" |logger;
/sbin/pfctl -k ASTERISKIP -k SIPPROVIDER

Change file permissions

chmod 755 /usr/local/bin/reset_states.sh

Edit config file /conf/config.xml

<system>
...
<afterfilterchangeshellcmd>/usr/local/bin/reset_states.sh</afterfilterchangeshellcmd>
</system>

Asterisk configuration

#pfctl -st

udp.first                    60s
udp.single                   30s
udp.multiple                 60s

Running the command shows me that the states die after 60s of inactivity.

To keep the state alive, keep the qualify under 60s, in my case 30s (30000)

; SIPPRODER_SIPPHONENUMBER
[SIP-PROVIDER-13764962994f4fdde1430ba]
qualify=30000
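
The rule of thumb above (keepalive interval shorter than the UDP state timeout) can be checked against the `pfctl -st` output. A minimal sketch, assuming the timeout lines look like the excerpt above; both function names are hypothetical, and the interval to compare is whatever the SIP side actually uses as its keepalive period, in seconds.

```shell
#!/bin/sh
# udp_timeout NAME   (hypothetical helper)
# Reads `pfctl -st` style output on stdin and prints the timeout for NAME
# (for example udp.multiple) in seconds, without the trailing "s".
udp_timeout() {
    awk -v name="$1" '$1 == name { sub(/s$/, "", $2); print $2 }'
}

# keepalive_ok INTERVAL_SECONDS TIMEOUT_SECONDS   (hypothetical helper)
# Succeeds when the keepalive fires before the state would expire.
keepalive_ok() {
    [ "$1" -lt "$2" ]
}
```

For example, `keepalive_ok 30 "$(pfctl -st | udp_timeout udp.multiple)"` succeeds with the 60s value shown above, while an interval of 90 would fail.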

This works for me , hope it helps anyone.. and looking forward to a permanent fix

my compliments for the pfsense programmers, i am a very happy user, and i will promote it.

Actions #49

Updated by Tom De Coninck almost 11 years ago

When we have a state like this :

udp LOCALASTERISKIP:5060 -> WANIPOLD:17205 -> SIPPROVIDER:5060 MULTIPLE:MULTIPLE

Is it possible to kill states based on WANIPOLD with pfctl ?

Actions #50

Updated by Martin Oosterheert almost 11 years ago

I am also affected by this bug in 2.0.3.
In my case not a changed ipadres on my WAN, but a dual Wan setup with failover in which i simulate a failed WANlink
PfSense registers in a few seconds that the failed WAN is down and websitebrowsing recontinues after a few seconds on the other WAN, but VoIP can take 5 - 10 minutes with my TCP SIP account or 20 or so minutes with the standard (and preferred) UDP SIP account.

Each time the states table shows entries like:

udp  INTERNALASTERISKIP:5060  <-  PUBLICASTERISKIP:5060  <-  SIPPROVIDERIP:5060  MULTIPLE:MULTIPLE
udp  SIPPROVIDERIP:5060  ->  PUBLICASTERISKIP:5060  MULTIPLE:MULTIPLE
(i have a 1:1 nat between my local asterisk ip (192.168.1.23) and a public ip 109.x.x.x )

Resetting the states resolves the problem at once.
Since this is a case of interface failing a pfctl -i fxp0(or whichever interfacename) seems appropriate and does solve my problem.
However thats not very practical in a business setup...

I hope this info helps !

Actions #51

Updated by Hannes Meer almost 11 years ago

I'm facing the same problem with the latest 2.1-RC. Anything we can do to get a solution?

Actions #52

Updated by Renato Botelho over 10 years ago

  • Target version changed from 2.1 to 2.2

#3181 is a band-aid for 2.1; this will need to wait for 2.2.

Actions #53

Updated by Eric Jacksch about 10 years ago

Still a significant issue - causing random VoIP outages. Would be great to get this fixed.

Actions #54

Updated by Dim Hatz about 10 years ago

It seems that in recent weeks there have been several related commits in 10-STABLE, e.g.

http://lists.freebsd.org/pipermail/svn-src-all/2014-January/079820.html
http://lists.freebsd.org/pipermail/svn-src-all/2014-January/079821.html

as well as several bugfixes, which apparently didn't make it into 10.0 RELEASE ...

Actions #55

Updated by Andy Lawson about 10 years ago

I'm still experiencing this issue with pfsense 2.1 on an ALIX platform and an Cisco SPA112 ATA.
pfsense is configured with a single ADSL/PPPoE WAN, but does not clear the state entry for this device on WAN IP change.
This issue doesn't get mentioned in the release notes for pfsense 2.1.1 (out today) so I assume it's not resolved there.

Actions #56

Updated by Tom De Coninck about 10 years ago

Friends, Developers
i have been doing some extensive testing on this issue yesterday evening.. yes i know ...get a life! :)

My WAN connection is DHCP, but I always get the same IP address. Before, I had a DSL connection with DHCP and a new IP address every 3 days. Even with this fixed IP, the issue with the states unfortunately remains.

i have a theory about this, maybe facts , please correct me where i'm wrong

From what I gathered, these Asterisk parameters are causing the states:

qualifyfreq = 30
When the provider is reachable, Asterisk sends a '102 OPTIONS' message every 30 seconds.
-> My advice: keep this UNDER 60, which is the lifetime of a UDP state in pfSense. This keeps the state alive, so we don't have to open ports in the firewall.

qualify = 5000
When the provider is unreachable, asterisk send '102 OPTIONS' message every 1 second for 5 seconds (5000ms), then it waits 10 seconds to start over(in default compilations of asterisk)
-> I cannot give good advice about this one. This parameter in combination with pfsense gives us troubles :)

defaultexpiry=180
reregisters the provider every 180 seconds
-> My advice, keep this above 60 which is the life of udp state in pfsense

registertimeout=120
When the provider is unreachable, Asterisk tries to register every 120 seconds
-> My advice, keep this above 60 which is the life of udp state in pfsense

When the WAN connection is down, asterisk will start the next qualify after max 30 seconds and asterisk will detect the provider is unreachable
. Then the qualify starts to send the 'OPTIONS' for 5 seconds

Everybody thinks that the states aren't killed, but I think pfSense did kill them. I think that after the WAN failure, it created new states.

After the WAN failure I noticed 2 states in pfSense 2.1.1:

#pfctl -ss | grep 85.119.188.3
vr0_vlan10 udp 85.119.188.3:5060 <- 192.168.150.80:5060       NO_TRAFFIC:SINGLE
lo0 udp 192.168.150.80:5060 -> 85.119.188.3:5060       SINGLE:NO_TRAFFIC

When the WAN interface comes back, the SIP provider stays unreachable. Asterisk keeps sending qualify messages, but I think they stay on the loopback interface.

Is this normal behaviour?

thanks,
Tom

Actions #57

Updated by Tom De Coninck about 10 years ago

This week i have done some more testing on this issue nr #1629.

Everybody in that issue is talking that the states do not get killed. I have been testing this manually, even if the state gets killed, the issue remains.

I did it manually :

1. Kill all states when WAN is down
-> every state is killed
[2.1.2-RELEASE][]/root(3): pfctl -k 192.168.150.80 -k 85.119.188.3
killed 2 states from 1 sources and 1 destinations
[2.1.2-RELEASE][]/root(4): pfctl -ss | grep 85.119.188.3
[2.1.2-RELEASE][]/root(5):

2. After a while, i notice that asterisk creates new states
#pfctl -ss | grep 85.119
vr0_vlan10 udp 85.119.188.3:5060 <- 192.168.150.80:5060 NO_TRAFFIC:SINGLE
lo0 udp 192.168.150.80:5060 -> 85.119.188.3:5060 SINGLE:NO_TRAFFIC

3. When wan comes up again asterisk cannot connect

Please notice that there is a state created on the loopback interface. When i kill that state, asterisk is reconnecting to the provider.

I'm not sure: is it possible to not create states on a loopback interface? Maybe that could be the fix?

Hope you can do something with this info

thanks for the great pfsense software!

Actions #58

Updated by Tom De Coninck almost 10 years ago

sorry for spamming...yet another update..

Today there were troubles with the WAN provider. In pfSense the gateway went down and I received the WAN IP address 192.168.100.10. I have seen this behaviour before with these DOCSIS cable modems.

But when the WAN came back up, these states remained:

vr0_vlan10 udp 85.119.188.3:5060 <- 192.168.150.80:5060       NO_TRAFFIC:SINGLE
vr1 udp 192.168.150.80:5060 -> 192.168.100.10:23920 -> 85.119.188.3:5060       SINGLE:NO_TRAFFIC

I think this proves that pfsense not only needs to kill states on 'WAN DOWN' , but also on 'WAN UP'. I can't see how it could work otherwise

cu

Actions #59

Updated by Jim Thompson almost 10 years ago

  • Assignee set to Ermal Luçi

assigned to Ermal, either fix this or push it to 2.3

Actions #60

Updated by Chris Buechler over 9 years ago

  • Status changed from New to Feedback
  • Assignee changed from Ermal Luçi to Chris Buechler
Actions #61

Updated by Chris Buechler over 9 years ago

I committed a change to add a new option that kills all states upon IP change. That's going to be the answer for those who need to work around long-lived UDP connections like in this thread. This will be in 2.2 snapshots on the 29th and newer.

To enable this option (for now), manually edit your config, and above the </system> line, add:

<ip_change_kill_states/>

After doing that, when your IP changes, you'll see a system log entry like:

php-fpm[288]: /rc.newwanip: Killing all states post-IP change.

and the entire state table will be wiped. It's more excessive than ideally you'd want, but there isn't a good way to kill only states on one particular WAN at this time.

If this tests out OK, we'll add an option under System>Advanced to enable/disable this option.
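
A quick way to confirm the manual edit took effect is to look for the tag in the config file. This is a sketch only: the function name has_kill_states_option is hypothetical, and the pattern assumes the option may be stored either as the empty tag shown above or as an open/close pair.

```shell
#!/bin/sh
# has_kill_states_option CONFIG_FILE   (hypothetical helper)
# Succeeds if the file contains the ip_change_kill_states tag, written either
# as <ip_change_kill_states/> or as <ip_change_kill_states>...</...>.
has_kill_states_option() {
    grep -Eq '<ip_change_kill_states(/>|>)' "$1"
}
```

Usage on the firewall would be along the lines of `has_kill_states_option /conf/config.xml && echo enabled`. The check is purely textual and does not verify that the tag sits inside the system section.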

Actions #62

Updated by Chris Buechler over 9 years ago

  • Status changed from Feedback to Resolved
  • % Done changed from 50 to 100

this is fixed. The states of the former WAN IP are now killed post-IP change, which should resolve nearly all cases where this is an issue.

I left the non-default option <ip_change_kill_states/> in there, so if there are circumstances where the 'pfctl -k $oldip' doesn't suffice, that can be set to wipe the entire state table. It didn't prove necessary in my test setup, where I have a SIP phone registering through the system, and a DHCP WAN with a short lease time, changing the IP it's assigned every few minutes, and it always got a proper new NAT state with the new WAN IP.

Actions #63

Updated by Tobias Wigand over 9 years ago

I'm on

2.2-BETA (amd64) 
built on Sat Nov 01 21:36:28 CDT 2014

something might be wrong with the 'pfctl -k $oldip' mechanism or the discovery of gateway IPs. I found this in my log and the IP 192.168.xxx.28 has never been a gateway. The gateway has always been 192.168.xxx.1

/rc.newwanip: IP has changed, killing states on former IP 192.168.xxx.28

Maybe this is related as pftop quite often shows the wrong gateway on my test machine, mostly when my secondary WAN is used:
https://forum.pfsense.org/index.php?topic=83663.0

Actions #64

Updated by Chris Buechler over 9 years ago

It's not the gateway that needs states killed, it's the old WAN IP.

Actions #65

Updated by → luckman212 over 9 years ago

So is this change going in to 2.2? Will the state killing be triggered in a gateway group failover event that is typical for a 1LAN+2WAN setup?

Actions #66

Updated by saqi b about 9 years ago

I have been hit by this bug as well. so I updated to 2.2 and it didnt take long for the ip to change and my iax2 trunk went into request sent state.

Version 2.2.1-RELEASE (amd64)
built on Fri Mar 13 08:16:49 CDT 2015
FreeBSD 10.1-RELEASE-p6

---------------------------------- system log ------------------------------------
Mar 19 21:54:58 php-fpm78159: /rc.newwanip: IP has changed, killing states on former IP 174.95.68.17.
Mar 19 21:54:58 php-fpm78159: /rc.newwanip: ROUTING: setting default route to 10.11.2.169
Mar 19 21:55:00 php-fpm78159: /rc.newwanip: Forcefully reloading IPsec
Mar 19 21:55:00 php-fpm78159: /rc.newwanip: Resyncing OpenVPN instances for interface WAN.
Mar 19 21:55:00 kernel: ovpns1: link state changed to DOWN
Mar 19 21:55:00 check_reload_status: Reloading filter
Mar 19 21:55:00 check_reload_status: Reloading filter
Mar 19 21:55:00 php-fpm78159: /rc.newwanip: Creating rrd update script
Mar 19 21:55:00 kernel: ovpns1: link state changed to UP
Mar 19 21:55:00 check_reload_status: rc.newwanip starting ovpns1
Mar 19 21:55:02 php-fpm18855: /rc.newwanip: rc.newwanip: Info: starting on ovpns1.
Mar 19 21:55:02 php-fpm18855: /rc.newwanip: rc.newwanip: on (IP address: 10.10.2.1) (interface: []) (real interface: ovpns1).
Mar 19 21:55:02 check_reload_status: Reloading filter
Mar 19 21:55:02 php-fpm18855: /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - -> 10.10.2.1 - Restarting packages.
Mar 19 21:55:02 check_reload_status: Starting packages
Mar 19 21:55:02 php-fpm78159: /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 174.95.68.17 -> 74.12.29.91 - Restarting packages.
Mar 19 21:55:02 check_reload_status: Starting packages
Mar 19 21:55:03 php-fpm18855: /rc.start_packages: Restarting/Starting all packages.
Mar 19 21:55:04 php-fpm18855: /rc.start_packages: Restarting/Starting all packages.

---------------------------------- end of system log ------------------------------------

---------------------------------- states table for port 4569 -----------------------------

LAN udp 209.217.xx.xx:4569 <- 192.168.2.6:4569 MULTIPLE:MULTIPLE
LAN udp 209.217.xx.xx:4569 <- 192.168.2.6:4569 MULTIPLE:MULTIPLE
WAN udp 74.12.81.71:8292 (192.168.2.6:4569) -> 209.217.xx.xx:4569 MULTIPLE:MULTIPLE
WAN udp 74.12.81.71:28476 (192.168.2.6:4569) -> 209.217.xx.xx:4569 MULTIPLE:MULTIPLE

---------------------------------- end of states table for port 4569 -----------------------------

Current WAN IP: 74.12.29.91
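
In the table above the translated address (74.12.81.71) is neither the former IP from the log nor the current WAN IP. A minimal sketch for spotting such entries, assuming the 2.2-style layout pasted here (interface, proto, NAT address, original source in parentheses, arrow, destination); the function name states_off_current_ip is hypothetical.

```shell
#!/bin/sh
# states_off_current_ip CURRENT_WAN_IP   (hypothetical helper)
# Reads state lines on stdin in the layout
#   IF proto NAT:PORT (SRC:PORT) -> DST:PORT STATE
# and prints translated states whose NAT address is not CURRENT_WAN_IP.
states_off_current_ip() {
    awk -v cur="$1" '
        # Translated states carry the original source in parentheses (field 4).
        $4 ~ /^\(.*\)$/ && $5 == "->" {
            split($3, nat, ":")          # nat[1] = translated address
            if (nat[1] != cur) print
        }'
}
```

With the current WAN IP 74.12.29.91, both WAN lines above would be reported; the LAN lines are ignored because they carry no translated address.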

Actions #67

Updated by Chris Buechler about 9 years ago

saqi: where your IP changes multiple times in a very short period, as it did there, it'll miss killing states for some of the old IPs because it's changing so quickly. You'll need to enable the ip_change_kill_states option so all states get wiped on IP change where your IP changes that quickly. Go to Diag>Command, and in the PHP execute box, paste in:

$config['system']['ip_change_kill_states'] = true;
write_config();

If you have any further questions on it, please follow up @ forum.pfsense.org.

Actions #68

Updated by Andy Lawson about 9 years ago

Just got hit by this issue again, in v2.2 on alix.
Are you able to confirm what release will finally kill this bug?

Thanks.

Actions #69

Updated by Daniel Haid almost 9 years ago

I have the same issue with a SIP client. It seems that the SIP client creates a new entry in the short time after the table is flushed but before the firewall is updated. My patch reverses the order of these operations, but I do not know whether this is the correct way to do this.

Even with the patch, for some reason I need $config['system']['ip_change_kill_states'] = true; for it to work!

Actions #70

Updated by Daniel Haid almost 9 years ago

I have checked again: with the patch, it seems to work even without ip_change_kill_states. I do not know whether I saw it wrong last time or whether there may still be some race condition even with the patch.

Actions #71

Updated by Chris Buechler almost 9 years ago

taking out the filter reload doesn't influence this, and will break things in a number of circumstances. There seemingly is a race condition if your IP changes twice in a very short period, where you need to kill all states. That should probably have its own ticket as this is full of unrelated history.

Actions #72

Updated by Kevin Trace almost 9 years ago

I have been hitting this issue for over a year. Finally getting tired of manually killing the stale UDP states. I am using a ALIX.2 on pfsense 2.2.2-RELEASE (i386). It just happened again and noticed the following the in the logs (IP Addresses have been masked):

Jun 15 08:52:54 php: rc.kill_states: rc.kill_states: Removing states for IP 104.247.***.***/32
Jun 15 08:52:55 check_reload_status: Rewriting resolv.conf
Jun 15 08:53:00 php-fpm42090: /rc.newwanipv6: rc.newwanipv6: Failed to update DSL[opt2] IPv6, restarting...
Jun 15 08:53:00 php-fpm44269: /rc.newwanip: IP has changed, killing states on former IP 0.0.0.0.
Jun 15 08:53:01 php-fpm44269: /rc.newwanip: ROUTING: setting default route to 206.248.154.***
Jun 15 08:53:05 php-fpm44269: /rc.newwanip: phpDynDNS: updating cache file /conf/dyndns_opt2custom''0.cache: 69.196.***.***
Jun 15 08:53:05 php-fpm44269: /rc.newwanip: phpDynDNS: (Success) IP Address Updated Successfully!
Jun 15 08:53:09 php-fpm44269: /rc.newwanip: phpDynDNS: updating cache file /conf/dyndns_opt2custom''1.cache: 69.196.***.***
Jun 15 08:53:09 php-fpm44269: /rc.newwanip: phpDynDNS: (Success) IP Address Updated Successfully!
Jun 15 08:53:11 php-fpm44269: /rc.newwanip: phpDynDNS: updating cache file /conf/dyndns_opt2custom''2.cache: 69.196.***.***
Jun 15 08:53:12 php-fpm44269: /rc.newwanip: phpDynDNS: (Success) IP Address Updated Successfully!
Jun 15 08:53:14 php-fpm44269: /rc.newwanip: phpDynDNS: updating cache file /conf/dyndns_opt2custom''3.cache: 69.196.***.***
Jun 15 08:53:15 php-fpm44269: /rc.newwanip: phpDynDNS: (Success) IP Address Updated Successfully!
Jun 15 08:53:17 php-fpm44269: /rc.newwanip: phpDynDNS: updating cache file /conf/dyndns_opt2custom''4.cache: 69.196.***.***
Jun 15 08:53:17 php-fpm44269: /rc.newwanip: phpDynDNS: (Success) IP Address Updated Successfully!
Jun 15 08:53:18 php-fpm44269: /rc.newwanip: Resyncing OpenVPN instances for interface DSL.
Jun 15 08:53:18 php-fpm44269: /rc.newwanip: Creating rrd update script
Jun 15 08:53:21 php-fpm44269: /rc.newwanip: pfSense package system has detected an IP change or dynamic WAN reconnection - 0.0.0.0 -> 69.196.182.116 - Restarting packages.
Jun 15 08:53:21 check_reload_status: Starting packages
Jun 15 08:53:23 php-fpm44269: /rc.start_packages: Restarting/Starting all packages.

The line that seems odd to me is "IP has changed, killing states on former IP 0.0.0.0.". Is this normal? I didn't have the ip_change_kill_states option enabled. I just enabled that now and will see if it helps for next time.
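
To see how often this happens, the system log can be scanned for the "former IP" messages. A sketch under the assumption that the log lines look like the ones pasted above; the function names former_ips and count_unusable_former_ips are hypothetical.

```shell
#!/bin/sh
# former_ips   (hypothetical helper)
# Reads system log lines on stdin and prints the address from every
# "killing states on former IP X" message, without the trailing full stop.
former_ips() {
    awk '/killing states on former IP/ {
        ip = $NF
        sub(/\.$/, "", ip)       # drop the sentence-ending dot
        print ip
    }'
}

# count_unusable_former_ips   (hypothetical helper)
# Counts how many of those messages named 0.0.0.0 as the former IP.
count_unusable_former_ips() {
    former_ips | awk '$0 == "0.0.0.0" { n++ } END { print n + 0 }'
}
```

Feeding the log excerpt above through count_unusable_former_ips would report 1; the sketch only counts such events and says nothing about whether stale states were left behind.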

Actions #73

Updated by Tom De Coninck almost 9 years ago

Hi Kevin,

when the cable modem does weird or reboots i have also seen this behaviour with the 0.0.0.0 address.

Its been a while since my ip address has changed so I'm not the best person to debug .. Anyway this is what i did, and i think it worked.

When you only have 1 Wan internet connection, I think its a good practice to disable the gateway monitoring at System -> Gateways
With this being disabled, pfsense doesnt do strange stuff with the states when the gateway can't be reached.

Another thing that was necessary in my pfsense configuration is not letting to get a 192.168.100 IP address from the cable modem.

Let me know if this works for you

cu
Tom

Actions #74

Updated by frank br almost 9 years ago

I get the same behavior for my ipsec tunnels.
if my GW (cable modem giving dhcp to pfsense) "resets" itself i do not get a new ip address but my ipsec tunnels will stop working until i reset the states.

Actions #75

Updated by frank br almost 9 years ago

I forgot to post that i am using 2.2.3 and using multiple GW's to internet.

Actions #76

Updated by Andy Lawson about 8 years ago

Apologies for becoming hyperbolic, but this is verging towards the absurd. The first post on this issue is over 4 years old, and yet I still need to purge my state table after a WAN IP change to allow my ATA to register with my ISP's SIP proxy.

Any news when this issue is going to be resolved?

Running 2.2.6-RELEASE as a virtual appliance.

Thanks for all your hard work.

Actions #77

Updated by → luckman212 almost 8 years ago

I posted over on the forum but I am not sure who's subscribed so it might have gone unnoticed...

Is the following still necessary on 2.3.1_5 / 2.3.2 / 2.4+ if we want all states killed on failover/failback?

$config['system']['ip_change_kill_states'] = true;
write_config();
Actions #78

Updated by → luckman212 over 7 years ago

I have observed that executing the following code does not seem to actually change anything in config.xml -- so I think wherever/whenever this option used to have an effect, it's been removed?? Can anyone confirm?

global $config;
$config = parse_config(true);
$config['system']['ip_change_kill_states'] = true;
write_config("Enable kill all states");
Actions #79

Updated by Kill Bill about 7 years ago

Luke Hamburg wrote:

it's been removed?? Can anyone confirm?

No, not removed. Adding a relevant PR link:

https://github.com/pfsense/pfsense/pull/3535
