Bug #2951

closed

OpenVPN and alternative monitoring IP in 2.1

Added by Pascal Huesch over 11 years ago. Updated over 11 years ago.

Status:
Resolved
Priority:
Normal
Category:
Gateways
Target version:
Start date:
04/16/2013
Due date:
% Done:
0%

Estimated time:
Plus Target Version:
Release Notes:
Affected Version:
2.1
Affected Architecture:

Description

Regarding Revision 4fdd86a3 fix (https://redmine.pfsense.org/projects/pfsense/repository/revisions/4fdd86a37e8ef82298bed1ec684280644f07b61f)

I'm using OpenVPN client connections with dynamic IP assignment and a gateway that doesn't reply to ICMP, so I need to use the alternative monitoring IP. As of today's snapshot, monitoring works fine once the alternative monitoring IP is configured on the gateway.

However, when I reboot the pfSense box and the OpenVPN connection is established at boot, monitoring breaks. I can fix it by going to the OpenVPN gateway and clicking Save again without changing anything.

Actions #1

Updated by Renato Botelho over 11 years ago

  • Status changed from New to Feedback
  • Assignee set to Renato Botelho

I couldn't reproduce it here. What exactly happens after the reboot? Does monitoring show the gateway as offline? Could you share your config.xml (with sensitive parts hidden) with me?

Actions #2

Updated by Pascal Huesch over 11 years ago

It says: RTT: pending/~ Loss: pending/~ Status: unknown.

I've sent the XML to the email address in your profile.

Actions #3

Updated by Renato Botelho over 11 years ago

Please send me the output of the following commands while the monitor is broken:

  1. ifconfig ovpnc1
  2. netstat -nrf inet
  3. route get 8.8.8.8
  4. cat /var/etc/apinger.conf
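
The four outputs requested above can be captured in one pass and attached to the ticket as a single file. A minimal sketch, not part of pfSense: the function name `collect_diag` and the output path `/tmp/gw-diag.txt` are hypothetical; only the four commands come from this thread.

```shell
# Hypothetical helper: run each command passed as an argument and print
# its output under a header line naming the command.
collect_diag() {
    for cmd in "$@"; do
        printf '### %s\n' "$cmd"
        # stderr is kept because the error text is often the useful part;
        # "|| true" lets the loop continue when a command fails.
        sh -c "$cmd" 2>&1 || true
        printf '\n'
    done
}

# Example call (run on the firewall while the monitor is broken):
# collect_diag "ifconfig ovpnc1" "netstat -nrf inet" \
#     "route get 8.8.8.8" "cat /var/etc/apinger.conf" > /tmp/gw-diag.txt
```

Running the commands in one call keeps the four outputs from the same moment, which matters here because the interface state changes several times during boot.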

Actions #4

Updated by Pascal Huesch over 11 years ago

ifconfig

ovpnc1: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
    options=80000<LINKSTATE>
    inet6 fe80::20c:29ff:fee3:15e8%ovpnc1 prefixlen 64 scopeid 0x8 
    inet 10.8.3.102 --> 10.8.3.101 netmask 0xffffffff 
    nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
    Opened by PID 78551

netstat

Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use  Netif Expire
default            10.10.0.1          UGS         0      582    em1
10.8.3.97/32       10.8.3.101         UGS         0        0 ovpnc1
10.8.3.101         link#8             UH          0        0 ovpnc1
10.8.3.102         link#8             UHS         0        0    lo0
10.10.0.0/16       link#2             U           0      196    em1
10.10.68.209       link#2             UHS         0        0    lo0
127.0.0.1          link#6             UH          0        0    lo0
192.168.1.0/24     link#1             U           1     4243    em0
192.168.1.1        link#1             UHS         0        0    lo0

route

route to: google-public-dns-a.google.com
destination: default
       mask: default
    gateway: heaven.office.turtle-entertainment.de
  interface: em1
      flags: <UP,GATEWAY,DONE,STATIC>
 recvpipe  sendpipe  ssthresh  rtt,msec    mtu        weight    expire
       0         0         0         0      1500         1         0 

/var/etc/apinger.conf

# pfSense apinger configuration file. Automatically Generated!

## User and group the pinger should run as
user "root" 
group "wheel" 

## Mailer to use (default: "/usr/lib/sendmail -t")
#mailer "/var/qmail/bin/qmail-inject" 

## Location of the pid-file (default: "/var/run/apinger.pid")
pid_file "/var/run/apinger.pid" 

## Format of timestamp (%s macro) (default: "%b %d %H:%M:%S")
#timestamp_format "%Y%m%d%H%M%S" 

status {
    ## File where the status information should be written to
    file "/var/run/apinger.status" 
    ## Interval between file updates
    ## when 0 or not set, file is written only when SIGUSR1 is received
    interval 5s
}

########################################
# RRDTool status gathering configuration
# Interval between RRD updates
rrd interval 60s;

## These parameters can be overridden in a specific alarm configuration
alarm default { 
    command on "/usr/local/sbin/pfSctl -c 'service reload dyndns %T' -c 'service reload ipsecdns' -c 'service reload openvpn %T' -c 'filter reload' " 
    command off "/usr/local/sbin/pfSctl -c 'service reload dyndnsall' -c 'service reload ipsecdns' -c 'service reload openvpn' -c 'filter reload' " 
    combine 10s
}

## "Down" alarm definition. 
## This alarm will be fired when target doesn't respond for 30 seconds.
alarm down "down" {
    time 10s
}

## "Delay" alarm definition. 
## This alarm will be fired when responses are delayed more than 200ms
## it will be canceled, when the delay drops below 100ms
alarm delay "delay" {
    delay_low 200ms
    delay_high 500ms
}

## "Loss" alarm definition. 
## This alarm will be fired when packet loss goes over 20%
## it will be canceled, when the loss drops below 10%
alarm loss "loss" {
    percent_low 10
    percent_high 20
}

target default {
    ## How often the probe should be sent    
    interval 1s

    ## How many replies should be used to compute average delay 
    ## for controlling "delay" alarms
    avg_delay_samples 10

    ## How many probes should be used to compute average loss
    avg_loss_samples 50

    ## The delay (in samples) after which loss is computed
    ## without this delays larger than interval would be treated as loss
    avg_loss_delay_samples 20

    ## Names of the alarms that may be generated for the target
    alarms "down","delay","loss" 

    ## Location of the RRD
    #rrd file "/var/db/rrd/apinger-%t.rrd" 
}
target "8.8.8.8" {
    description "STRONG_VPNV4" 
    srcip "10.8.3.102" 
    alarms override "loss","delay","down";
    rrd file "/var/db/rrd/STRONG_VPNV4-quality.rrd" 
}

target "10.10.0.1" {
    description "WAN_DHCP" 
    srcip "10.10.68.209" 
    alarms override "loss","delay","down";
    rrd file "/var/db/rrd/WAN_DHCP-quality.rrd" 
}
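
Each `target` block in the file above pairs a monitor IP with the `srcip` that apinger binds to. A minimal sketch for listing those pairs, assuming the layout shown here (at most one `srcip` line per target block); the function name `list_srcips` is hypothetical and not part of pfSense or apinger.

```shell
# Hypothetical helper: print "<monitor IP> <source IP>" for every target
# block in an apinger.conf-style file. The "target default" block is skipped.
list_srcips() {
    awk '
        # Matches a line such as: target "8.8.8.8" {
        $1 == "target" {
            t = $2
            gsub(/"/, "", t)
            target = (t == "default") ? "" : t
        }
        # Matches a line such as: srcip "10.8.3.102"
        $1 == "srcip" && target != "" {
            s = $2
            gsub(/"/, "", s)
            print target, s
            target = ""
        }
    ' "$1"
}

# Example call:
# list_srcips /var/etc/apinger.conf
```

Each source address printed this way can then be compared against the `ifconfig` output to see whether it is currently assigned to an interface.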

Actions #5

Updated by Renato Botelho over 11 years ago

While the monitor is failing, can you ping 8.8.8.8 with the source address set to 10.8.3.102?

  1. ping -S 10.8.3.102 8.8.8.8

Actions #6

Updated by Pascal Huesch over 11 years ago

Yes. The ping goes through while the gateway is shown as offline.

Actions #7

Updated by Pascal Huesch over 11 years ago

I just noticed that killing apinger and restarting it from the shell also brings the gateway back online.

Actions #8

Updated by Renato Botelho over 11 years ago

  • Status changed from Feedback to New

Actions #9

Updated by Renato Botelho over 11 years ago

  • Status changed from New to Feedback

Pascal Huesch wrote:

I just noticed that killing apinger and restarting it from the shell also brings the gateway back online.

Interesting. Did apinger leave anything in the logs?

Actions #10

Updated by Pascal Huesch over 11 years ago

Hi Renato, sorry for my late reply.

When I boot the machine, the gateway log looks like this (monitoring down):

May 13 10:44:11 apinger: ALARM: VPNV4(8.8.8.8) *** down ***
May 13 10:44:01 apinger: Starting Alarm Pinger, apinger(14374)
May 13 10:44:00 apinger: Exiting on signal 15.
May 13 10:43:56 apinger: bind socket: Can't assign requested address
May 13 10:43:56 apinger: Starting Alarm Pinger, apinger(15720)
May 13 10:43:55 apinger: Exiting on signal 15.
May 13 10:43:48 apinger: Starting Alarm Pinger, apinger(28362)
May 13 10:43:47 apinger: Exiting on signal 15.
May 13 10:43:46 apinger: Starting Alarm Pinger, apinger(20424)

After restarting apinger from the console (monitoring working):

May 13 10:46:16 apinger: Starting Alarm Pinger, apinger(64396)
May 13 10:46:13 apinger: Exiting on signal 15.

I also noticed some entries regarding the VPN interface in the general log:

May 13 10:44:21 check_reload_status: Reloading filter
May 13 10:44:21 check_reload_status: Restarting OpenVPN tunnels/interfaces
May 13 10:44:21 check_reload_status: Restarting ipsec tunnels
May 13 10:44:21 check_reload_status: updating dyndns VPNV4
May 13 10:44:11 php: : Restarting/Starting all packages.
May 13 10:44:08 check_reload_status: Reloading filter
May 13 10:44:08 check_reload_status: Starting packages
May 13 10:44:08 php: : pfSense package system has detected an ip change 10.8.3.102 -> 10.8.3.102 ... Restarting packages.
May 13 10:44:06 php: : Creating rrd update script
May 13 10:44:03 php: : pfSense package system has detected an ip change 10.8.3.102 -> 10.8.3.102 ... Restarting packages.
May 13 10:44:01 php: : Creating rrd update script
May 13 10:44:00 php: : Removing static route for monitor 8.8.8.8 and adding a new route through 10.8.3.101
May 13 10:44:00 php: : rc.newwanip: on (IP address: 10.8.3.102) (interface: opt1) (real interface: ovpnc1).
May 13 10:44:00 php: : rc.newwanip: Informational is starting ovpnc1.
May 13 10:44:00 php: : Restarting/Starting all packages.
May 13 10:43:58 check_reload_status: rc.newwanip starting ovpnc1
May 13 10:43:58 kernel: ovpnc1: link state changed to UP
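
The excerpts above are listed newest first. Read oldest first, the gateway log shows an apinger instance reporting `bind socket: Can't assign requested address` at 10:43:56, two seconds before the kernel reports ovpnc1 link UP at 10:43:58. Whether that ordering is the cause is not established in this thread. A minimal sketch for putting such an excerpt into chronological order and picking out bind failures; the function names `oldest_first` and `bind_failures` are hypothetical.

```shell
# Hypothetical helper: reverse the order of the lines read from stdin
# (a portable stand-in for "tail -r" on FreeBSD or "tac" on Linux).
oldest_first() {
    awk '{ line[NR] = $0 } END { for (i = NR; i >= 1; i--) print line[i] }'
}

# Hypothetical helper: keep only apinger lines that report a failed bind.
bind_failures() {
    grep 'apinger: bind socket:'
}

# Example call, assuming the excerpt was saved to a file named gateway.log:
# oldest_first < gateway.log | bind_failures
```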

Actions #11

Updated by Phillip Davis over 11 years ago

I made a few fixes to the logic of when and which OpenVPN instances are restarted after a linkup/linkdown event. The last of these commits was https://github.com/pfsense/pfsense/commit/ddae03adaa76750dc678e62a73de22ccee98757d
It did not directly address messages like "apinger: Exiting on signal 15.", so your problem might still be there. But it would be useful to confirm that you still have the problem on a 2.1 build from 8 May 2013 or later. (I would like to understand and sort out what your problem really is, as I am also interested in ensuring that OpenVPN links come back up in all conditions.)

Actions #12

Updated by Pascal Huesch over 11 years ago

Hi Phillip,

I've updated my installation to: 2.1-BETA1 (amd64) built on Mon May 13 00:54:29 EDT 2013

It is still the same behavior.

Actions #13

Updated by Renato Botelho over 11 years ago

  • Status changed from Feedback to New

Actions #14

Updated by Renato Botelho over 11 years ago

I reread the whole thread since I was not able to reproduce the issue, and I have a suspicion. Could you please show me the config.xml (with sensitive data removed)?

Actions #15

Updated by Ermal Luçi over 11 years ago

  • Status changed from New to Feedback

This should behave better with tomorrow's snapshot due to a fix made to gateway monitoring.
Can you confirm this is the case?

Actions #16

Updated by Chris Buechler over 11 years ago

I can replicate this, but I think fixing #3179 will fix this as well. I'll leave this in Feedback to be tested again once #3179 is fixed.

Actions #17

Updated by Chris Buechler over 11 years ago

  • Status changed from Feedback to Resolved

Confirmed fixed.
