Bug #1629
invalid state table entries after WAN IP change
| Status: | New | Start date: | 06/29/2011 |
|---|---|---|---|
| Priority: | High | Due date: | |
| Assignee: | - | % Done: | 50% |
| Category: | Rules/NAT | | |
| Target version: | 2.1 | | |
| Affected version: | All | Affected Architecture: | |
Description
We have an asterisk server behind pfsense 2.0-RC3 using a PPPoE DSL connection.
Whenever our WAN IP changes the asterisk server cannot register to the providers.
Flushing the state table allows asterisk to register again.
Associated revisions
Kill states from the previous ip the link had on all mpd consumers. Resolves #1629
Kill states from the previous ip the link had on all mpd consumers. Resolves #1629
Try to remove old states when a DHCP IP changes, might be related to ticket #1629 and also "unable to allocate llinfo" messages from states through an old gateway.
Ticket #1629 Another round of fixes related to state clearing
History
#1
Updated by Evgeny Yurchenko almost 2 years ago
Please provide state-table dump before IP change and after.
#2
Updated by Eli Hunter almost 2 years ago
- File state_table.txt added
Hopefully this is what you wanted.
My IP before the address changed was 76.254.18.100, and the newly assigned address is 99.179.45.73.
Bad entry in the state table before resetting states:
udp 10.0.4.3:5060 -> 76.254.18.100:13819 -> 67.215.241.250:5060 SINGLE:NO_TRAFFIC
I've attached a txt document with the state table info before and after flushing it.
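A rough sketch of how entries like the one above could be picked out of a saved state table without flushing everything. The function name `stale_states` is made up for illustration, and the filter assumes plain IPv4 `address:port` fields as shown in this ticket; it only reads text on stdin and does not touch pf itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of pfSense): print every state line in which
# some address field is exactly OLD_IP. IPv4 "addr:port" fields only.
stale_states() {
    awk -v ip="$1" '{
        for (i = 1; i <= NF; i++) {
            # Split "addr:port" into its two parts; compare the address exactly
            # so that 76.254.18.10 does not match 76.254.18.100.
            n = split($i, part, ":")
            if (n == 2 && part[1] == ip) { print; next }
        }
    }'
}

# Example use on a live firewall (assumes pfctl is available):
#   pfctl -ss | stale_states 76.254.18.100
```

Matching on the whole address rather than a substring avoids false hits on neighbouring addresses, which a plain `grep` on the old IP would not.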
#3
Updated by Chris Buechler almost 2 years ago
- Category set to PPP
- Priority changed from Normal to High
- Target version set to 2.0
- Affected version set to 2.0
PPPoE is supposed to clear all states on that interface when an IP changes, that's not happening correctly.
#4
Updated by Ermal Luçi almost 2 years ago
- Status changed from New to Feedback
- % Done changed from 0 to 100
Applied in changeset 0f2826c03d3e3971f6d83041f9322737686846d9.
#5
Updated by Ermal Luçi almost 2 years ago
Applied in changeset 8ed478973f20678568f03f00309a5165aa48a1b3.
#6
Updated by Fábio Pinto Coelho almost 2 years ago
I'm still on 1.2.3 and the same problem happens on it.
As a workaround, you may check http://forum.pfsense.org/index.php?topic=18053.5
I use two VoIP providers, so I have modified the script to clear the states for both of them. If you need my updated script, or any help on implementing it, please let me know.
I can see Ermal Luçi has already released a fix, but I thought I'd just let you know...
#7
Updated by Chris Buechler almost 2 years ago
That's expected to happen in 1.2.3 (it has no provisions for dealing with that scenario, only 2.0 does).
#8
Updated by Ermal Luçi almost 2 years ago
Have you tested this on 2.0?
#9
Updated by Eli Hunter almost 2 years ago
I got the update installed last week but haven't had the IP change on me yet (surprisingly). I'll update this once the IP's changed.
#10
Updated by Matt Corallo almost 2 years ago
I have the same problem (after the fixes) with IPv6 tunneling, so this is not resolved.
#11
Updated by Chris Buechler almost 2 years ago
IPv6 is a completely different version, that's 2.1 not 2.0, post info to the IPv6 board on the forum.
#12
Updated by Matt Corallo almost 2 years ago
No, no, I'm not talking about IPv6 in pfSense, I'm talking about IPv6 NAT passthrough in the "System: Advanced: Networking" menu in 2.0, not 2.1. It's the same bug for the same reason.
#13
Updated by Eli Hunter almost 2 years ago
I had it reset again this weekend which took the asterisk server down again. Unfortunately I wasn't near a computer and had to get things up and running for them quickly so I used my phone to reset the state table. This got their phones working again but I wasn't able to get a copy of any logs. I'm going to assume this isn't fixed yet but it's probably good to wait until it happens again so I can make sure it's still a problem.
#14
Updated by Eli Hunter almost 2 years ago
It's still happening.
Again here's a relevant section. Our Asterisk server at 10.0.4.3 is still trying to use the old gateway address of 99.58.29.27. Our server at 10.0.4.100 is using the correct gateway at 99.169.80.219
I can attach the full state table again if it helps.
udp 10.0.4.3:5060 -> 99.58.29.27:37488 -> 209.62.1.2:5060 MULTIPLE:MULTIPLE
tcp 10.0.4.100:55507 -> 99.169.80.219:46579 -> 216.52.233.157:443 ESTABLISHED:ESTABLISHED
#15
Updated by Ermal Luçi almost 2 years ago
Can you post system log with state table as well?
#16
Updated by Eli Hunter almost 2 years ago
- File log.txt added
I just copied the system log page and state table into the attached document.
Would collecting the data with a syslog server help? I can get it setup if it helps track this down.
#17
Updated by Ermal Luçi almost 2 years ago
From the attached:
- What is the old gateway?
- What is the new gateway?
- What is the wrong entry?
#18
Updated by Chris Buechler over 1 year ago
- Target version changed from 2.0 to 2.0.1
#19
Updated by Luke Hamburg over 1 year ago
Hi- I've just experienced the exact same issue. pfSense 2.0(REL) running nanobsd-2g on a Netgate Hamakua. My WAN DHCP lease expired and when it renewed the IP had changed. My internal Asterisk server lost all trunk registrations and I had to manually reset the states to fix it.
Has there been any update on this problem - or is there a workaround that doesn't require manual intervention?
#20
Updated by Andrea Cutelle' over 1 year ago
Hi, I see the same error in my installation: pfSense 2.0 release running on a Jetway NC9C-550LF. I have a static public IP. When the connection goes down and then comes back up, my Asterisk server loses its connection to the trunk, with the state stuck at "Request Sent". Resetting the states makes it work again.
Sorry for my English.
#21
Updated by Chris Buechler over 1 year ago
- Target version deleted (2.0.1)
#22
Updated by Pho Bia over 1 year ago
I also experience this with my SIP device (PAP2T). I thought my provider was to blame as changing the remote server usually got my phones back online.
I have a multiple WAN (3) setup.
Are you still looking for logs on this, or is the fix already known?
#23
Updated by Pho Bia over 1 year ago
This is what my states look like for my affected device in Diagnostics --> States when my VoIP adapter shows offline (filtered for my PAP2T IP only):
udp 64.120.22.242:5060 <- 192.168.0.100:5061 NO_TRAFFIC:SINGLE
udp 64.120.22.242:5060 <- 192.168.0.100:5060 NO_TRAFFIC:SINGLE
udp 192.168.0.100:5060 -> 64.231.21.117:52434 -> 64.120.22.242:5060 SINGLE:NO_TRAFFIC
udp 192.168.0.100:5061 -> 64.231.21.117:57887 -> 64.120.22.242:5060 SINGLE:NO_TRAFFIC
This is what it looks like after I reset states and my service goes back online :
udp 64.120.22.242:5060 <- 192.168.0.100:5061 MULTIPLE:MULTIPLE
udp 192.168.0.100:5061 -> 64.231.160.191:46020 -> 64.120.22.242:5060 MULTIPLE:MULTIPLE
udp 64.120.22.242:5060 <- 192.168.0.100:5060 MULTIPLE:MULTIPLE
udp 192.168.0.100:5060 -> 64.231.160.191:56285 -> 64.120.22.242:5060 MULTIPLE:MULTIPLE
#24
Updated by Christian Schwarz over 1 year ago
Bug still present with 2.0.1.
This does not happen every time the IP changes (the provider disconnects once a day), but after a few days our SIP registration is down. The state table shows an entry to the SIP provider with the old WAN IP.
Flushing the states manually will bring the trunk up again. (Using Telekom Germany with Panasonic PBX)
#25
Updated by Akom Benevolent 12 months ago
Same deal, 2.0.1-RELEASE and this happens every so often, but not on every IP change. I can delete the 2 state entries for the old WAN IP, and then asterisk registers fine.
#26
Updated by Grant Emsley 11 months ago
I'm seeing the exact same behavior using a PPPOE internet connection, 2.0.1-RELEASE i386, and asterisk.
When my connection goes down, the states remain in the state table and the asterisk server is unable to register.
#27
Updated by Nicklas Blidmo 11 months ago
Same issue with WAN DHCP, 2.0.1-RELEASE (i386) (nanobsd)
#28
Updated by Chris Buechler 11 months ago
- Category changed from PPP to Rules/NAT
- Status changed from Feedback to New
- Target version set to 2.1
- Affected version changed from 2.0 to All
#29
Updated by Grant Emsley 11 months ago
I don't know if it will help, but I noticed it happens much more frequently when my internet connection is flaky. It happened almost every time when my PPPOE connection was dropping every 2-5 minutes due to a loose cable.
#30
Updated by fos4X fos4X 6 months ago
I can confirm that this problem still exists in 2.0.1-RELEASE (amd64) built on Mon Dec 12 18:16:13 EST 2011 using PPPoE with static IP (but 24h disconnects)
#31
Updated by Jim P 6 months ago
- Status changed from New to Feedback
Some fixes for this have gone into 2.1 over the past few months. Try a 2.1-BETA snapshot and see if it's repeatable there.
#32
Updated by fos4X fos4X 6 months ago
Confirmed to still be an issue in 2.1-BETA0 (amd64) built on Wed Nov 28 15:23:39 EST 2012
A reconnect of the PPPoE WAN (manual or 24h) leads to asterisk being unable to register; it hangs at "Request Sent".
Could be related to #2700 but if older revisions are any indication, removing the /32 from $3/32 will not help either.
I would be willing to test any suggested (hot)fixes.
I read elsewhere that a pfctl -b <gw> in ppp_linkup (UP!!!!) helps, have not tried it yet though (every trial means disconnecting my colleagues from the web for a few minutes)
#33
Updated by pierre mayer 6 months ago
Still not working with 2.1-BETA0 (i386), installed from pfSense-memstick-2.1-BETA0-i386-20121128-1058.img.
I need to reset the state table to get FreePBX working.
#34
Updated by Ermal Luçi 4 months ago
- Target version changed from 2.1 to 2.2
The only real solution to this is to switch to if-bound states for many reasons.
That is a bit too involved a change for 2.1.
#35
Updated by Chris Buechler 4 months ago
- Status changed from Feedback to New
- Target version changed from 2.2 to 2.1
we at least need the option to wipe the entire state table upon IP change.
#36
Updated by Ermal Luçi 4 months ago
- Status changed from New to Feedback
OK, I went and did another implementation of the fix for this.
Can you please try with later 2.1 snapshots and see if it behaves correctly?
#37
Updated by Tobias Wigand 4 months ago
Does not work for me.
Correct me if I'm mistaken here, but can pfctl -i work without binding states to interfaces?
One of my external interfaces is em1, but pfctl -i em1 -ss does not show anything, although I have a working VoIP state in the table going out on that interface.
#38
Updated by Ermal Luçi 3 months ago
Check with an upcoming snapshot; there was a problem with the patch that has been corrected.
#39
Updated by Tobias Wigand 3 months ago
Does not work, sorry. Only the "Out" states are flushed. The "In" states persist and seem to remember their gateway. After some time the "Out" states come back with the non-existent / down gateway. You can see this in the long pftop view. The debug.rules are correct, but they are never used because of the persisting "In" states. Or do I need to use floating rules to use this, and will they behave differently?
#40
Updated by Matthias Dilbert 3 months ago
This problem also affects me. I’ve upgraded to Snapshot "built on Sat Feb 9 23:46:16 EST 2013". I will look for the problem to reoccur.
#41
Updated by Matthias Dilbert 3 months ago
Today the problem occurred again. So it was not fixed yet.
#42
Updated by Ermal Luçi 3 months ago
I just pushed another change to reset states with certain gateways set.
It should behave even better than previously, since it will send a RST for tcp states getting killed belonging to a certain gateway.
UPDATE for later: Probably a more thorough way of handling chained state dependencies needs to be implemented with the -Fs option.
That is, if you kill a state belonging to an interface, try to find the correlated state on any other interface if present, especially for non-pfSense originated traffic.
That is a bit more involved, and more careful checks against corrupting the table need to be done, but for now this should work correctly.
#43
Updated by Tobias Wigand 3 months ago
Thanks, it works with my VoIP device now. The states get killed correctly.
#44
Updated by Dim Hatz 3 months ago
Ermal, testing this feature on a pfsense box with a WAN interface that gets via DHCP an IP in a /24 subnet (i.e. it's not PPPoE), it won't kill pf states that originate from the LAN to any host in that /24 subnet (which includes the gwip). Connections beyond the WAN subnet seem to get killed, but the LAN states don't.
E.g. I establish an ssh connection from 192.168.100.12 to aa.bb.40.155, then manually flushed states using:
pfctl -i em0 -Fs -G gwip
After doing it 3 times, I get:
pfctl -ss | fgrep 40.155
em1 tcp aa.bb.40.155:22 <- 192.168.100.12:3131 ESTABLISHED:ESTABLISHED
em1 tcp aa.bb.40.155:22 <- 192.168.100.12:3590 ESTABLISHED:ESTABLISHED
em1 tcp aa.bb.40.155:22 <- 192.168.100.12:3595 ESTABLISHED:ESTABLISHED
em1 tcp aa.bb.40.155:22 <- 192.168.100.12:3597 ESTABLISHED:ESTABLISHED
em0 tcp 192.168.100.12:3597 -> xxx.yyy.1.201:65161 -> aa.bb.40.155:22 ESTABLISHED:ESTABLISHED
em0 WAN - xxx.yyy.1.201
em1 LAN - 192.168.100.1
remote ssh server - aa.bb.40.155
PS: Running latest 2.1-BETA snapshot (12-Feb 08:58)
MD5 (/sbin/pfctl) = af1a7f62f1ae26958ba050f6c6f418a6
#45
Updated by Dim Hatz 3 months ago
To follow up on my previous post, I have verified that the WAN (em0) states are indeed flushed; however, their corresponding LAN (em1) states linger on.
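A minimal sketch of how the lingering LAN-side states could be listed from a state dump. It assumes the interface-prefixed format shown in the previous comment (`em1 tcp dst <- src STATE` for the LAN side, `em0 tcp src -> nat -> dst STATE` for the WAN side); the function name `lingering_states` is hypothetical and nothing here modifies the state table.

```shell
#!/bin/sh
# Hypothetical helper: print inbound ("<-") states that have no matching
# outbound ("->") state for the same source and destination.
lingering_states() {
    awk '
        # Inbound side: field 3 is the destination, field 5 the source.
        $4 == "<-" { inbound[$5 " " $3] = $0; next }
        # Outbound side: with NAT the destination is field 7, without it field 5.
        $4 == "->" { dst = ($6 == "->") ? $7 : $5; paired[$3 " " dst] = 1 }
        END { for (k in inbound) if (!(k in paired)) print inbound[k] }
    ' | sort
}

# Example use on a live firewall (assumes pfctl is available):
#   pfctl -ss | fgrep 40.155 | lingering_states
```

The output is sorted only to make it stable, since awk does not guarantee the order in which array keys are visited.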
#46
Updated by Renato Botelho 3 months ago
- Status changed from Feedback to New
- % Done changed from 100 to 50
#47
Updated by Matthias Dilbert 2 months ago
I’ve upgraded to the latest beta, but the problem still persists. Even when the modem is restarted and I don’t get a new IP, the states go wrong.
#48
Updated by Sebastian Chrostek 2 months ago
Same Problem here with 2.1 Beta (built on Fri Mar 1 21:17:31 EST 2013)
It seems that states without the old IP in them also cause problems with SIP.
In my case these two states:
udp 212.227.18.199:5060 <- 172.17.0.1:5060 MULTIPLE:MULTIPLE
udp 172.17.0.1:5060 -> 212.227.18.199:5060 SINGLE:NO_TRAFFIC
for my VoIP connection to "1und1" prevent asterisk from getting a connection.
If I clear only these two states, asterisk gets a connection a few seconds later.
I use a PPPoE WAN.
#49
Updated by Tom De Coninck 2 days ago
I also have the same issue and am following this ticket. Maybe I can provide some extra information; sorry if it duplicates what is above.
I started using pfSense with version 2.0 and am now running the latest 2.0.3 on an ALIX board.
I was using a PPTP VDSL connection, and I am now using a cable WAN connection. It made no difference for asterisk: with either WAN connection, the bad UDP state stayed alive.
When the WAN IP changed to a new address, the state stayed alive with the old WAN IP address:
udp LOCALASTERISKIP:5060 -> WANIPOLD:17205 -> SIPPROVIDER:5060 MULTIPLE:MULTIPLE
instead of
udp LOCALASTERISKIP:5060 -> WANIPNEW:17205 -> SIPPROVIDER:5060 MULTIPLE:MULTIPLE
I was able to kill the state using
pfctl -k LOCALASTERISKIP -k SIPPROVIDER
So to fix my issue I had to run this command every time the WAN IP address changes.
I created this script with info I found on the internet.
Create /usr/local/bin/reset_states.sh
#!/bin/sh
# Kill Udp Sip States after new wan IP
echo "Killing States from ASTERISKIP to SIPPROVIDER" | logger
/sbin/pfctl -k ASTERISKIP -k SIPPROVIDER
Change file permissions
chmod 755 /usr/local/bin/reset_states.sh
Edit config file /conf/config.xml
<system>
  ...
  <afterfilterchangeshellcmd>/usr/local/bin/reset_states.sh</afterfilterchangeshellcmd>
</system>
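For setups with more than one VoIP provider, as mentioned earlier in this ticket, the same idea could be wrapped in a loop. This is only a sketch: the function name `kill_sip_states` and the overridable `PFCTL` variable are made up for illustration, and the addresses in the usage comment are taken from state entries posted above. It relies on the documented behaviour of two `-k` options killing states from the first host to the second.

```shell
#!/bin/sh
# PFCTL can be overridden (e.g. PFCTL="echo pfctl") to dry-run the loop.
PFCTL="${PFCTL:-/sbin/pfctl}"

# Hypothetical helper: kill_sip_states PBX_IP PROVIDER_IP...
# Kills states from the PBX to each listed provider, one pfctl call per provider.
kill_sip_states() {
    pbx="$1"
    shift
    for provider in "$@"; do
        # Log to syslog when logger is usable; never fail because of logging.
        logger -t reset_states "Killing states from $pbx to $provider" 2>/dev/null || true
        $PFCTL -k "$pbx" -k "$provider"
    done
}

# Example call, to be placed at the end of reset_states.sh:
#   kill_sip_states 10.0.4.3 67.215.241.250 209.62.1.2
```

Keeping the provider list as arguments means a provider can be added without editing the loop itself.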
Asterisk configuration
# pfctl -st
udp.first 60s
udp.single 30s
udp.multiple 60s
Running the command shows me that the states die after 60s of inactivity.
To keep the state alive, keep the qualify under 60s, in my case 30s (30000)
; SIPPROVIDER_SIPPHONENUMBER
[SIP-PROVIDER-13764962994f4fdde1430ba]
qualify=30000
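The timeout reasoning above can be checked with a small sketch like the following. The function names `pf_timeout` and `keepalive_ok` are hypothetical. The keepalive interval is passed in explicitly rather than read from the Asterisk config, because which setting controls it depends on the channel driver; in chan_sip, as far as I know, `qualifyfreq` sets the probe interval in seconds while `qualify` is a latency threshold in milliseconds, so that is worth verifying against your own sip.conf.

```shell
#!/bin/sh
# Hypothetical helper: pf_timeout NAME
# Reads `pfctl -st` style text on stdin and prints NAME's value in seconds.
pf_timeout() {
    awk -v name="$1" '{
        for (i = 1; i < NF; i++) {
            if ($i == name) {
                v = $(i + 1)
                sub(/s$/, "", v)   # strip the trailing "s" unit
                print v
                exit
            }
        }
    }'
}

# Hypothetical helper: keepalive_ok INTERVAL TIMEOUT
# Succeeds when the keepalive interval is strictly shorter than the pf timeout.
keepalive_ok() {
    [ "$1" -lt "$2" ]
}

# Example use on a live firewall (assumes pfctl is available):
#   timeout=$(pfctl -st | pf_timeout udp.multiple)
#   keepalive_ok 30 "$timeout" && echo "keepalive is short enough"
```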
This works for me; I hope it helps someone, and I'm looking forward to a permanent fix.
My compliments to the pfSense programmers. I am a very happy user, and I will promote it.