Bug #1351

closed

Mobile IPsec: no traffic passes through after 2nd connect after 5 minutes

Added by ronald meulendijks over 13 years ago. Updated over 10 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
IPsec
Target version:
-
Start date:
03/14/2011
Due date:
% Done:

0%

Estimated time:
Plus Target Version:
Release Notes:
Affected Version:
2.0
Affected Architecture:

Description

When a mobile tunnel is connected for the first time after configuration in pfSense 2.0-RC1 with the Shrew Soft client, all traffic passes through the tunnel and everything works.
But when the tunnel is disconnected and then, after waiting 10 minutes, connected again, no traffic passes through anymore. This situation persists until the racoon service is restarted.
Then the whole story repeats.


Files

Logs_when_wrong.txt (6.08 KB), ronald meulendijks, 03/15/2011 02:23 PM
TEST_SITUATION.docx (150 KB), Shrewsoft Tracing, ronald meulendijks, 04/05/2011 05:28 AM
Actions #1

Updated by Ermal Luçi over 13 years ago

Can you provide ipsec and system log?

Actions #2

Updated by Jim Pingle over 13 years ago

Also can we get the output of the following two commands:

$ setkey -D
$ setkey -DP

When it is working and when it is not working.

Actions #3

Updated by ronald meulendijks over 13 years ago

$ setkey -D
$ setkey -DP

How can I run those commands? I tried via Command in the GUI, but nothing happens.

I've attached logs from when the problem appears.

Actions #4

Updated by Jim Pingle over 13 years ago

Don't type the $, that was just there as an example prompt. Diagnostics > Command in the shell execute box should be enough, or from ssh in the shell.

Actions #5

Updated by ronald meulendijks over 13 years ago

$ setkey -DP
10.1.1.0/24[any] 10.1.1.1[any] 255
in none
spid=2 seq=1 pid=7857
refcnt=1
10.1.1.1[any] 10.1.1.0/24[any] 255
out none
spid=1 seq=0 pid=7857
refcnt=1

$ setkey -D
No SAD entries.

Actions #6

Updated by Jim Pingle over 13 years ago

Is that from when it was working or when it was broken? (We need to see both states)

Actions #7

Updated by ronald meulendijks over 13 years ago

New test, both logs are here:

When WORKING :

$ setkey -D
95.96.134.40[4500] 91.189.228.158[28909]
esp-udp mode=any spi=1282260169(0x4c6dbcc9) reqid=0(0x00000000)
E: 3des-cbc 91bcdee2 75ee5d3a 6ff5b3fd 5d01f915 63276918 317de7b3
A: hmac-sha1 e67bb7e5 681e2c3e 50369354 fd2d30cf 4794c2f1
seq=0x0000035f replay=4 flags=0x00000000 state=mature
created: Mar 16 09:28:09 2011 current: Mar 16 09:42:09 2011
diff: 840(s) hard: 3600(s) soft: 2880(s)
last: Mar 16 09:42:09 2011 hard: 0(s) soft: 0(s)
current: 97152(bytes) hard: 0(bytes) soft: 0(bytes)
allocated: 863 hard: 0 soft: 0
sadb_seq=1 pid=38414 refcnt=2
91.189.228.158[28909] 95.96.134.40[4500]
esp-udp mode=tunnel spi=194755940(0x0b9bbd64) reqid=0(0x00000000)
E: 3des-cbc a97ecbbe 698ba750 c54197f4 9824879c bff6127f 1876facf
A: hmac-sha1 f156e35d d8df6005 8a073de6 370c931c 4e21b879
seq=0x00000588 replay=4 flags=0x00000000 state=mature
created: Mar 16 09:28:09 2011 current: Mar 16 09:42:09 2011
diff: 840(s) hard: 3600(s) soft: 2880(s)
last: Mar 16 09:42:09 2011 hard: 0(s) soft: 0(s)
current: 104145(bytes) hard: 0(bytes) soft: 0(bytes)
allocated: 1416 hard: 0 soft: 0
sadb_seq=0 pid=38414 refcnt=1

$ setkey -DP
10.1.1.0/24[any] 10.1.1.1[any] 255
in none
spid=2 seq=3 pid=62587
refcnt=1
192.168.78.1[any] 0.0.0.0/0[any] 255
in ipsec
esp/tunnel/91.189.228.158-95.96.134.40/require
created: Mar 16 09:28:09 2011 lastused: Mar 16 09:28:09 2011
lifetime: 3600(s) validtime: 0(s)
spid=9 seq=2 pid=62587
refcnt=1
10.1.1.1[any] 10.1.1.0/24[any] 255
out none
spid=1 seq=1 pid=62587
refcnt=1
0.0.0.0/0[any] 192.168.78.1[any] 255
out ipsec
esp/tunnel/95.96.134.40-91.189.228.158/require
created: Mar 16 09:28:09 2011 lastused: Mar 16 09:42:47 2011
lifetime: 3600(s) validtime: 0(s)
spid=10 seq=0 pid=62587
refcnt=1

WHEN WRONG :

$ setkey -D
91.189.228.158[55570] 95.96.134.40[4500]
esp-udp mode=tunnel spi=128118630(0x07a2ef66) reqid=0(0x00000000)
E: 3des-cbc 3dbaa2bc 7eb8cbb7 b19f5dc5 9bb0ea73 d54873d2 cc4872c4
A: hmac-sha1 8cd2f610 88b452e6 7b71a6d7 b5b0f9fb b1fc0735
seq=0x00000047 replay=4 flags=0x00000000 state=mature
created: Mar 16 10:04:26 2011 current: Mar 16 10:05:05 2011
diff: 39(s) hard: 3600(s) soft: 2880(s)
last: Mar 16 10:05:02 2011 hard: 0(s) soft: 0(s)
current: 5459(bytes) hard: 0(bytes) soft: 0(bytes)
allocated: 71 hard: 0 soft: 0
sadb_seq=1 pid=2777 refcnt=1
95.96.134.40[4500] 91.189.228.158[28909]
esp-udp mode=any spi=2200572355(0x832a11c3) reqid=0(0x00000000)
E: 3des-cbc e36cfc6e a8cc2bab 2c0be568 2b0d1a07 2f5677d1 40f576a0
A: hmac-sha1 232c3f6e 39e16544 b46680f0 78363b4f 820f9e5f
seq=0x00000000 replay=4 flags=0x00000000 state=mature
created: Mar 16 10:04:26 2011 current: Mar 16 10:05:05 2011
diff: 39(s) hard: 3600(s) soft: 2880(s)
last: hard: 0(s) soft: 0(s)
current: 0(bytes) hard: 0(bytes) soft: 0(bytes)
allocated: 0 hard: 0 soft: 0
sadb_seq=0 pid=2777 refcnt=1

$ setkey -DP
10.1.1.0/24[any] 10.1.1.1[any] 255
in none
spid=2 seq=3 pid=12815
refcnt=1
192.168.78.1[any] 0.0.0.0/0[any] 255
in ipsec
esp/tunnel/91.189.228.158-95.96.134.40/require
created: Mar 16 10:04:26 2011 lastused: Mar 16 10:04:26 2011
lifetime: 3600(s) validtime: 0(s)
spid=11 seq=2 pid=12815
refcnt=1
10.1.1.1[any] 10.1.1.0/24[any] 255
out none
spid=1 seq=1 pid=12815
refcnt=1
0.0.0.0/0[any] 192.168.78.1[any] 255
out ipsec
esp/tunnel/95.96.134.40-91.189.228.158/require
created: Mar 16 10:04:26 2011 lastused: Mar 16 10:05:28 2011
lifetime: 3600(s) validtime: 0(s)
spid=12 seq=0 pid=12815
refcnt=1
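
A minimal sketch for comparing the two `setkey -D` dumps above, assuming the header line of each SA has the bracketed `addr[port]` form. The helper names (`parse_sad`, `stale_port_pairs`) are hypothetical, and the sample is a trimmed excerpt of the broken-state dump. It only reports SA pairs whose two directions disagree on the UDP ports; whether that mismatch is the cause of the failure is not established here.

```python
import re

# Header line of an SA in `setkey -D` output, assumed to look like
# "src[port] dst[port]".
SA_HEADER = re.compile(
    r"^(\d+\.\d+\.\d+\.\d+)\[(\d+)\]\s+(\d+\.\d+\.\d+\.\d+)\[(\d+)\]"
)
# Byte counter line, e.g. "current: 5459(bytes) hard: 0(bytes) soft: 0(bytes)".
BYTES = re.compile(r"current:\s+(\d+)\(bytes\)")

# Trimmed excerpt of the broken-state dump from this comment.
WRONG_DUMP = """\
91.189.228.158[55570] 95.96.134.40[4500]
    esp-udp mode=tunnel spi=128118630(0x07a2ef66) reqid=0(0x00000000)
    current: 5459(bytes) hard: 0(bytes) soft: 0(bytes)
95.96.134.40[4500] 91.189.228.158[28909]
    esp-udp mode=any spi=2200572355(0x832a11c3) reqid=0(0x00000000)
    current: 0(bytes) hard: 0(bytes) soft: 0(bytes)
"""


def parse_sad(dump):
    """Return one dict per SA found in a `setkey -D` dump."""
    sas = []
    for raw in dump.splitlines():
        line = raw.strip()
        header = SA_HEADER.match(line)
        if header:
            src, sport, dst, dport = header.groups()
            sas.append({"src": src, "sport": int(sport),
                        "dst": dst, "dport": int(dport), "bytes": None})
            continue
        counter = BYTES.search(line)
        if counter and sas:
            sas[-1]["bytes"] = int(counter.group(1))
    return sas


def stale_port_pairs(sas):
    """Return (a, b) pairs of opposite-direction SAs whose ports disagree."""
    problems = []
    for i, a in enumerate(sas):
        for b in sas[i + 1:]:
            opposite = a["src"] == b["dst"] and a["dst"] == b["src"]
            if opposite and (a["sport"], a["dport"]) != (b["dport"], b["sport"]):
                problems.append((a, b))
    return problems


if __name__ == "__main__":
    for a, b in stale_port_pairs(parse_sad(WRONG_DUMP)):
        print("port mismatch:", a["src"], a["sport"], "vs", b["dst"], b["dport"])
```

For the excerpt, `stale_port_pairs(parse_sad(WRONG_DUMP))` returns one pair: the inbound SA arrives from port 55570, while the outbound SA still targets port 28909 and has carried 0 bytes.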

Actions #8

Updated by Rob Eckel over 13 years ago

I have the same issue as originally described. I'm currently running:

2.0-RC1 (i386)
built on Thu Mar 24 22:33:52 EDT 2011

I haven't tried to get the info you requested from the original bug submitter, but will do the same if it would be helpful. I have this issue on my iPhone when connecting to the IPsec VPN. The first connection after a reset works perfectly, but if it's disconnected and I try to reconnect at a later time, it fails.

Actions #9

Updated by Andy Giles over 13 years ago

I have also seen this issue alongside the problem of not being able to connect more than 1 mobile client.

See http://forum.pfsense.org/index.php/topic,35057.0.html

If it's the same routing issue, directly editing the mode_cfg entry in the racoon.conf file and restarting racoon fixes it.

Actions #10

Updated by ronald meulendijks over 13 years ago

Thanks,

I will try this solution.

I will report back on whether it also fixes this problem.

Actions #11

Updated by ronald meulendijks over 13 years ago

Hi,

I have altered racoon.conf as Andy Giles suggested, but unfortunately no luck.
I changed the subnet settings in racoon.conf to 255.255.255.255, so routing should work, but it doesn't.

I did not mention earlier that the first connection takes a while to establish, but once connected it works.
The second connection is established almost instantly, but no traffic passes.

Anyone?

Actions #12

Updated by Andy Giles over 13 years ago

One thing that is different in my setup is that the other end is racoon as well, not shrewsoft client. With the racoon.conf mod it's now been running fine for me with up to 4 mobile clients connected at the same time.

Actions #13

Updated by ronald meulendijks over 13 years ago

Hi,

Tested again and tried many different situations. They all have the same problem.
It's definitely a routing/filtering issue in pfSense 2.0 RC1.
When the tunnel is connected the second time, pfSense routes no traffic back through the tunnel to the mobile client, which then times out.

the situations tested:
pfsense(VMWARE..Racoon)<--->internet<---->pfsense(physical)<------>Mobile Client(shrewsoft 2.1.7 / 2.2 Beta)

pfsense(VMWARE..Racoon)<--->internet<---->Vodafone Dongle(UMTS)<---->Mobile Client(shrewsoft 2.1.7 / 2.2 Beta)

pfsense(VMWARE..Racoon)<--->internet<---->Cisco ASA<---->Mobile Client(shrewsoft 2.1.7 / 2.2 Beta)

Actions #14

Updated by Rob Eckel over 13 years ago

I solved the problem that I was experiencing today. I noticed that the step of the connection that it was stalling on during failed connections was the step concerning NAT-T. By disabling NAT-T in my phase 1 setting, I am now able to connect reliably from my iPhone. The connection process takes no more than about 3 or 4 seconds over a decent 3G connection.

Actions #15

Updated by Rob Eckel over 13 years ago

Rob Eckel wrote:

I solved the problem that I was experiencing today. I noticed that the step of the connection that it was stalling on during failed connections was the step concerning NAT-T. By disabling NAT-T in my phase 1 setting, I am now able to connect reliably from my iPhone. The connection process takes no more than about 3 or 4 seconds over a decent 3G connection.

I spoke too soon, and didn't fully test that the connection worked. I am able to complete the connection process in IPSec, but not able to pass traffic to/from my home LAN without NAT-T enabled. Re-enabling it verified that I was again able to access my home LAN, but a couple minutes after closing the connection, it's not able to re-establish another one. Restarting racoon does not fix this, however resetting the pfsense box does allow another connection (but not a second one after the first one has been closed for a few minutes).

Apr 5 23:30:55 racoon: [IPHONE_IP] INFO: Selected NAT-T version: RFC 3947
Apr 5 23:30:55 racoon: INFO: Adding remote and local NAT-D payloads.
Apr 5 23:30:55 racoon: [IPHONE_IP] INFO: Hashing IPHONE_IP[7818] with algo #2 (NAT-T forced)
Apr 5 23:30:55 racoon: [Self]: [PFSENSE_IP] INFO: Hashing PFSENSE_IP[500] with algo #2 (NAT-T forced)
Apr 5 23:30:55 racoon: INFO: Adding xauth VID payload.
Apr 5 23:31:13 racoon: ERROR: phase1 negotiation failed due to time up.

That last line of the log is where I'd normally get a line about "INFO: NAT-T: ports changed" when the connection is successful.

Actions #16

Updated by Andy Giles over 13 years ago

Just for a point of reference to my earlier info.
I eventually found that my issue was a problem with the client end and once that was fixed I no longer required the workaround to the racoon.conf file in pfsense. The mobile client was setting up the policy with a netmask equivalent to that defined for the mobile network from the server thus breaking the routing on the server.

Actions #17

Updated by Chris Buechler over 13 years ago

  • Status changed from New to Resolved

thanks

Actions #18

Updated by Jim Pingle over 13 years ago

  • Status changed from Resolved to New

Some people are still hitting this same error, but not this specific circumstance. Two support customers, plus others in forum threads like http://forum.pfsense.org/index.php/topic,34646.msg197636.html#msg197636 so apparently there are still a couple issues with racoon locating an SA for a client's return traffic.

Actions #19

Updated by Chunlin Yao over 13 years ago

My situation may be related to this issue.

Mobile clients connect to pfSense using NAT-T. I think racoon should support multiple clients behind NAT in tunnel mode, but something is wrong.
I use VirtualBox to emulate an environment, with Shrew VPN Client 2.1.7 as the client and pfSense 2.0-RC3 as the server.
At first I wanted to try the multiple-clients-behind-NAT feature. If I connect the 2nd client to pfSense, the SA is established, but the 2nd client cannot ping pfSense's LAN. tcpdump -ni enc0 shows incoming ICMP but no reply. The 1st client can still ping pfSense's LAN. In this state, if I disconnect the 1st client and reconnect, the 1st client cannot ping either. Likewise, tcpdump shows incoming ICMP packets from the 1st and 2nd clients, but no reply.

But if I haven't connected the 2nd client, the 1st client can disconnect and reconnect multiple times and still pass traffic.
I confirmed with setkey -DPp that disconnecting deleted the policy correctly.

Then I did a second test, with only one client behind the NAT.
  1. Connect the client to pfSense; everything works fine (including disconnect and reconnect).
  2. Disconnect the client.
  3. Flush the NAT box's state using pfctl -k ..., so the NAT will use a different UDP port next time.
  4. Connect to pfSense. Now I cannot ping pfSense's LAN; tcpdump shows only incoming ICMP packets.

I must restart racoon to resolve this problem. *Maybe someone can easily reproduce this bug by changing the source UDP port used in the NAT-T connection.*

This is the generated racoon.conf file:

# This file is automatically generated. Do not edit
path pre_shared_key "/var/etc/psk.txt";

path certificate  "/var/etc";

listen
{
    adminsock "/var/db/racoon/racoon.sock" "root" "wheel" 0660;
    isakmp 192.168.56.102 [500];
    isakmp_natt 192.168.56.102 [4500];
}

mode_cfg
{
    auth_source system;
    group_source system;
    pool_size 253;
    network4 172.17.1.2;
    netmask4 255.255.255.0;
    split_network include 192.168.57.0/24;
    banner "/var/etc/racoon.motd";
}

remote anonymous
{
    ph1id 1;
    exchange_mode aggressive;
    my_identifier address 192.168.56.102;

    ike_frag on;
    generate_policy = on;
    initial_contact = on;
    nat_traversal = force;

    dpd_delay = 10;
    dpd_maxfail = 5;
    support_proxy on;
    proposal_check obey;
    passive on;

    proposal
    {
        authentication_method pre_shared_key;
        encryption_algorithm 3des;
        hash_algorithm sha1;
        dh_group 2;
        lifetime time 28800 secs;
    }
}

sainfo   anonymous
{
    remoteid 1;
    encryption_algorithm 3des;
    authentication_algorithm hmac_sha1;

    lifetime time 3600 secs;
    compression_algorithm deflate;
}

Actions #20

Updated by Chunlin Yao over 13 years ago

Jim P wrote:

Some people are still hitting this same error, but not this specific circumstance. Two support customers, plus others in forum threads like http://forum.pfsense.org/index.php/topic,34646.msg197636.html#msg197636 so apparently there are still a couple issues with racoon locating an SA for a client's return traffic.

After reading this post, I changed generate_policy to unique. Now I can connect multiple clients behind the same NAT.

Actions #21

Updated by Chunlin Yao over 13 years ago

ronald meulendijks wrote:

0.0.0.0/0[any] 192.168.78.1[any] 255
out ipsec
esp/tunnel/95.96.134.40-91.189.228.158/require
created: Mar 16 10:04:26 2011 lastused: Mar 16 10:05:28 2011
lifetime: 3600(s) validtime: 0(s)
spid=12 seq=0 pid=12815
refcnt=1

Your policy level is require. Can you change generate_policy to unique and try again?

Actions #22

Updated by Chris Buechler over 13 years ago

  • Target version changed from 2.0 to 2.0.1
Actions #23

Updated by Chris Buechler about 13 years ago

  • Target version deleted (2.0.1)
Actions #24

Updated by Arthur Brownlee IV almost 13 years ago

Looks like we're still having this issue on 2.0.1 (I realize it wasn't marked fixed, just saying as an FYI.)

I've jumped through all the unique/strict etc hoops that most everyone on the forum has, and still wind up having to reboot the firewall to reconnect.

Dec 29 15:36:06 racoon: ERROR: no configuration found for 63.201.xxx.xxx.
Dec 29 15:36:06 racoon: ERROR: failed to begin ipsec sa negotication.

That is what plagues my logs, and yes, negotiation is spelled wrong in the logs as well :)

Let me know how I can be of help!

Actions #25

Updated by Ermal Luçi almost 13 years ago

This is the same as #1970.

Please try with the new ipsec-tools port from pfPorts.

Actions #26

Updated by Arthur Brownlee IV almost 13 years ago

Ermal,

Any chance you could shed some light on how to do that? I've got a few clients who this is affecting quite severely and would like to come up with a good solution for them.

Thank you!

Actions #27

Updated by Ermal Luçi almost 13 years ago

Need to rebuild the ipsec-tools port and use it in pfSense build.

Actions #28

Updated by Darwin Mach over 12 years ago

This issue is still present in the latest 2.1 Development build with a different twist:

1.) Reset racoon
2.) Mobile VPN user connects, everything routes
3.) User disconnects
4.) User reconnects, nothing routed

Actions #29

Updated by Darwin Mach over 12 years ago

The errors that occur upon the reconnection:

racoon: ERROR: failed to begin ipsec sa negotication.
racoon: ERROR: no configuration found for X.X.X.X.

Actions #30

Updated by Jim Pingle almost 12 years ago

  • Status changed from New to Feedback

This should be OK these days, just make sure:

Phase 1 Settings:
Policy Generation: Unique
Proposal Checking: Strict

Uncheck System > Advanced, Misc Tab: Prefer Old IPsec SA.
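
A minimal sketch for checking a generated racoon.conf against the two Phase 1 settings above. It assumes that Policy Generation and Proposal Checking map to the `generate_policy` and `proposal_check` directives, as the racoon.conf and comments earlier in this thread suggest; `check_racoon_conf` is a hypothetical helper name. The Prefer Old IPsec SA option is a system setting and is not covered by this check.

```python
import re

# Suggested values, assuming the GUI options map to these directives.
WANTED = {"generate_policy": "unique", "proposal_check": "strict"}

# Accept both "key value;" and the "key = value;" form seen in the
# generated file posted in this thread.
DIRECTIVE = re.compile(r"^(\w+)(?:\s*=\s*|\s+)(\w+)\s*;$")


def check_racoon_conf(text):
    """Return {directive: True/False} for each suggested setting."""
    found = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and indentation
        match = DIRECTIVE.match(line)
        if match and match.group(1) in WANTED:
            found[match.group(1)] = match.group(2)
    return {key: found.get(key) == value for key, value in WANTED.items()}


if __name__ == "__main__":
    sample = """
    remote anonymous
    {
        generate_policy = on;
        proposal_check obey;
    }
    """
    print(check_racoon_conf(sample))
```

A directive that is missing from the file is reported as False, the same as one with a different value.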

Actions #31

Updated by David Duchscher almost 12 years ago

I am running into this issue on 2.1 BETA. I have tried all of the suggestions above.

Connecting the first time after restart of racoon works fine. After that, things get strange and it looks like the session is not being cleaned up correctly.

For example, if I stick to one device, it seems to be able to connect, disconnect and then reconnect successfully. But if I disconnect that device and connect another, the second device does not have any connectivity. Disconnecting the second device and reconnecting the first device shows that it has lost its connectivity too. All devices get assigned the same IP address. If I connect the first device so that the second device gets a different IP address, the second device works. Disconnecting and switching the order of the devices then breaks the connectivity.

The only way I have found to fix this is to restart racoon.

Actions #32

Updated by Ermal Luçi almost 12 years ago

Did you use the suggestions from Jim, especially disabling Prefer Old IPsec SA?

Actions #33

Updated by Jim Pingle almost 12 years ago

Also, snapshots dated today or later contain ipsec-tools version 0.8.1, so it's worth trying again on a new snapshot after making the other changes I mentioned.

Actions #34

Updated by Luli Dushaj over 11 years ago

I was also using 2.1 BETA and experienced the same issue as David Duchscher. I had to revert to 2.0.2 stable so that I could use the mobile VPN without any issues. I followed all of Jim's steps, but without success. It seems that the tunnel is not brought down properly when disconnecting. I only started experiencing this when I updated to 2.1 BETA. I can provide any logs you need to help troubleshoot this issue, because I really like 2.1 much better.

Actions #35

Updated by Roy Blüthgen over 11 years ago

Running pfSense 2.0.2 stable with same IPsec tunnel issue (no tunnel data on reconnect, racoon restart needed)

I followed instructions by Jim (note 30) and disabled Prefer older IPsec SAs in advanced system settings - and now it works!
(System >> Advanced >> Miscellaneous >> IP Security: disable/uncheck Prefer older IPsec SAs)

Followed this guide for my IPsec server/client setup:
http://doc.pfsense.org/index.php/IPsec_for_road_warriors_in_PfSense_2.0.1_with_PSK_in_stead_of_xauth

Actions #36

Updated by Ignat Esso over 11 years ago

Same problem here running:

2.0.3-RELEASE (amd64)

Client can connect OK for the first session but then after disconnection re-auth is successful but no crypto packets pass through the router/firewall.

The only way to allow packets to pass is to untick the "Enable IPSec" button. This works for the next session, but the same issue comes back after disconnection.

Actions #37

Updated by Peter Borföi over 11 years ago

Having similar issues:

2.1 RC0 (symptoms started from 2.0.3 on, as far as I can remember)

Policy Generation > Unique
Proposal Checking > Strict
Advanced, Misc Tab: Prefer Old IPsec SA > Unchecked (similar symptoms with checked)
Authentication: PSK (a short test with PSK/XAUTH revealed the same issues)

Settings on clients (Ipsecuritas/Mac, Shrewsoft/Win) match.

Reproducibility is weird:
- "Quick reconnect" works most of the time
- Sometimes (possible circumstances: ungraceful disconnect, change of client WAN IP, client sleep, reconnect after 5 minutes, ?) no more traffic is passed. Reconnect works fine, but traffic doesn't pass back from pfSense to the mobile client.

Tried several settings phase1/2, same symptoms. Sticking with the "official" ones, also same symptoms.
diag_ipsec_sad.php then shows some SAD entries with 0 B of traffic.

Actions #38

Updated by David Duchscher over 11 years ago

I backed the following patch out of ipsec-tools and many of my issues went away.

https://github.com/duchscherd/pfsense-tools/blob/master/pfPorts/ipsec-tools-0.8.1/files/patch11-purge_sp_fix.diff

I could then connect clients multiple times without issue. In stress testing, I noticed that an old iPad caused similar issues when it disconnects, as it does not disconnect cleanly. A client connecting and being assigned the IP address of the old iPad connection would hang if the connection happened too quickly. Looking at the logs, I found racoon reusing SA entries. You would see the following log message when this happened:

racoon: INFO: keeping IPsec-SA spi=72168544 - found valid ISAKMP-SA spi=54f3e5c1172fa0ca:33dafbcf2e51a06a:00008ece.

I commented out that ability in racoon and everything has been working. This needs more testing but I am, at least, hopeful.
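
A minimal sketch for picking the racoon messages quoted in this thread out of a log, so that runs before and after a patch can be compared. The pattern labels and the `classify` helper are hypothetical; the message texts are copied from the comments above, including the "negotication" spelling that racoon itself logs.

```python
import re
from collections import Counter

# Message fragments quoted in this thread, keyed by a hypothetical label.
PATTERNS = {
    "phase1_timeout": re.compile(r"phase1 negotiation failed due to time up"),
    "no_configuration": re.compile(r"no configuration found for"),
    "sa_negotiation_failed": re.compile(r"failed to begin ipsec sa negotication"),
    "sa_reused": re.compile(r"keeping IPsec-SA spi=\d+"),
}

SAMPLE_LINES = [
    "Apr 5 23:30:55 racoon: INFO: Adding xauth VID payload.",
    "Apr 5 23:31:13 racoon: ERROR: phase1 negotiation failed due to time up.",
    "Dec 29 15:36:06 racoon: ERROR: no configuration found for 63.201.xxx.xxx.",
    "Dec 29 15:36:06 racoon: ERROR: failed to begin ipsec sa negotication.",
    "racoon: INFO: keeping IPsec-SA spi=72168544 - found valid ISAKMP-SA "
    "spi=54f3e5c1172fa0ca:33dafbcf2e51a06a:00008ece.",
]


def classify(lines):
    """Count how often each known racoon message appears in the given lines."""
    counts = Counter()
    for line in lines:
        if "racoon:" not in line:
            continue  # ignore messages from other daemons
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


if __name__ == "__main__":
    for label, count in sorted(classify(SAMPLE_LINES).items()):
        print(label, count)
```

A nonzero `sa_reused` count after a client reconnects would correspond to the SA reuse described above.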

I will look at submitting the patch back but I am concerned about backing out patch11-purge_sp_fix.diff without being able to test the problem(s) it fixed. I might have missed it but I don't see enough information here to test.

Actions #39

Updated by David Duchscher over 11 years ago

I have placed all the changes I have made to racoon up on Github. You can find them here

These changes fix the following:

  • Dead tunnel if a client connects to an IP address previously used.
  • Dead tunnel if a tunnel times out.
  • Mac OS X authentication prompts after 48 minutes.

This all needs more testing and more eyeballs. I can provide binaries if people want to test. I have tested it on Apple platforms: iPad, iPhone, and Mac OS 10.8. I will test on Windows and Linux/FreeBSD when I have some more free time.

Actions #40

Updated by Peter Borföi over 11 years ago

Thanks.
I will test as soon as it's in a snapshot (I'm currently on 2.1RC0). Backing out the old patch already yielded very good results.

Actions #41

Updated by Peter Borföi over 11 years ago

IPSec with mobile clients on Current 2.1 RC's seems very reliable - various user reports are very positive. Thanks.

Actions #42

Updated by Chris Buechler over 10 years ago

  • Status changed from Feedback to Resolved