Regression #13943

closed

OpenVPN crashes with Signal 8 with very low fragment size

Added by Marcos M almost 2 years ago. Updated over 1 year ago.

Status:
Resolved
Priority:
Normal
Category:
OpenVPN
Target version:
Start date:
Due date:
% Done:

0%

Estimated time:
Plus Target Version:
23.05
Release Notes:
Default
Affected Version:
2.7.0
Affected Architecture:

Description

OpenVPN crashes after updating from 22.01 to 22.05. The issue also occurs on 23.01-RC. Tested on an XG-1537-M2-32GB.

System logs

Feb  5 15:35:32 raptor kernel: pid 99288 (openvpn), jid 0, uid 0: exited on signal 8 (core dumped)

OpenVPN logs:
Feb  5 15:35:30 raptor openvpn[99288]: MANAGEMENT: Client connected from /var/etc/openvpn/server1/sock
Feb  5 15:35:30 raptor openvpn[99288]: MANAGEMENT: CMD 'status 2'
Feb  5 15:35:30 raptor openvpn[99288]: MANAGEMENT: Client disconnected
Feb  5 15:36:02 raptor openvpn[23782]: event_wait : Interrupted system call (fd=-1,code=4)
Feb  5 15:36:02 raptor openvpn[23782]: /usr/local/sbin/ovpn-linkdown ovpns2 1500 0 192.168.223.1 255.255.255.0 init

Actions #1

Updated by Marcos M almost 2 years ago

  • Description updated (diff)
Actions #2

Updated by Marcos M almost 2 years ago

Signal 8 (SIGFPE) is a floating-point exception:
https://man.freebsd.org/cgi/man.cgi?sektion=3&query=signal

The OpenVPN configuration looks normal except for:

<custom_options>fragment 1300; mssfix</custom_options>
<local_network>10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16</local_network>

Actions #3

Updated by Leon Dang almost 2 years ago

Marcos M wrote in #note-2:

Signal 8 (SIGFPE) is floating-point exception:
https://man.freebsd.org/cgi/man.cgi?sektion=3&query=signal

This is an integer divide-by-zero error.

From a quick look, it appears that the maximum fragment size value in one of the OpenVPN data structures is 0. I'm not sure how that happens.

The crash is at:

(gdb) x/i $rip
=> 0x257837:    idiv   %esi

and %esi = 0.

Mapping back to a debug build of openvpn, we see that the problem is in:

00000000002564f0 <fragment_outgoing>:
  ....
  256545:       41 8b 7f 10             mov    0x10(%r15),%edi         <- frame->max_fragment_size
  ....
  256551:       89 fe                   mov    %edi,%esi
  256553:       83 e6 fc                and    $0xfffffffc,%esi        const int mfs_aligned = (max_frag_size & ~FRAG_SIZE_ROUND_MASK);
  256556:       99                      cltd
  256557:       f7 fe                   idiv   %esi                    <--- esi = 0

Inspecting the contents of "frame" (the %r15 register points to it) shows max_fragment_size = 0 and mss_fix = 0:

0x82815fe30:    0x000006e8      0x00000088      0x00000232      0x00000000  <--- mss_fix = 0
0x82815fe40:    0x00000000      0x000005dc      0x00000640      0x00000000
                ^^^^^^^^^^
                  max_fragment_size
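
To make the reading of the dump explicit, here is a minimal sketch that indexes the same eight words by byte offset. It assumes %r15 equals the dump's start address (0x82815fe30), which matches the annotation above; `frame_words` and `word_at` are names made up for this illustration and are not part of OpenVPN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Raw 32-bit words copied from the dump above, starting at 0x82815fe30.
 * Assumption: %r15 points at the first word. */
static const uint32_t frame_words[] = {
    0x000006e8, 0x00000088, 0x00000232, 0x00000000,
    0x00000000, 0x000005dc, 0x00000640, 0x00000000,
};

/* Return the 32-bit word stored at a byte offset from the dump start. */
static uint32_t
word_at(size_t byte_offset)
{
    const size_t index = byte_offset / sizeof(uint32_t);
    assert(index < sizeof(frame_words) / sizeof(frame_words[0]));
    return frame_words[index];
}
```

`word_at(0x10)` corresponds to the `mov 0x10(%r15),%edi` load and yields 0, and `word_at(0x0c)` is the word annotated as mss_fix, also 0.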

The OpenVPN code, where the divide by zero happens:

void
fragment_outgoing(struct fragment_master *f, struct buffer *buf,
                  const struct frame *frame)
{
            ...
            /*
             * Send the datagram as a series of 2 or more fragments.
             */
            f->outgoing_frag_size = optimal_fragment_size(buf->len, frame->max_fragment_size);    <----- 

optimal_fragment_size(int len, int max_frag_size)
{
    const int mfs_aligned = (max_frag_size & ~FRAG_SIZE_ROUND_MASK);
    const int div = len / mfs_aligned;                                         <-------------- crashes here.
    const int mod = len % mfs_aligned;

~~~~~~~~~~~~
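
A minimal sketch of why the division faults, assuming FRAG_SIZE_ROUND_MASK is 3 (the value implied by the `and $0xfffffffc,%esi` instruction above): any max_frag_size from 0 to 3 rounds down to 0, so the divisor is 0. The helpers `fragment_size_usable` and `fragments_needed` are hypothetical guards written for this illustration; they are not the upstream fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value, implied by "and $0xfffffffc,%esi" in the disassembly. */
#define FRAG_SIZE_ROUND_MASK 3

/* Same rounding step as the first line of optimal_fragment_size(). */
static int
aligned_fragment_size(int max_frag_size)
{
    return max_frag_size & ~FRAG_SIZE_ROUND_MASK;
}

/* Hypothetical guard: true only if the aligned size is a safe divisor. */
static bool
fragment_size_usable(int max_frag_size)
{
    return aligned_fragment_size(max_frag_size) > 0;
}

/* Hypothetical wrapper: number of fragments needed for len bytes,
 * or -1 instead of dividing by zero. */
static int
fragments_needed(int len, int max_frag_size)
{
    assert(len >= 0);
    if (!fragment_size_usable(max_frag_size))
    {
        return -1;
    }
    const int mfs_aligned = aligned_fragment_size(max_frag_size);
    return (len + mfs_aligned - 1) / mfs_aligned;
}
```

With max_frag_size = 0, as seen in the frame dump, `aligned_fragment_size` returns 0 and the guarded wrapper returns -1 rather than executing the division.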

max_fragment_size is set at https://github.com/OpenVPN/openvpn/blob/master/src/openvpn/mss.c#L257:

static void
frame_calculate_fragment(struct frame *frame, struct key_type *kt,
                         const struct options *options,
                         struct link_socket_info *lsi)
{
    ...

    unsigned int target = options->ce.fragment - overhead;
    /* The 4 bytes of header that fragment adds itself. The other extra payload
     * bytes (Ethernet header/compression) are handled by the fragment code
     * just as part of the payload and therefore automatically taken into
     * account if the packet needs to be fragmented */
    frame->max_fragment_size = adjust_payload_max_cbc(kt, target) - 4;

    if (cipher_kt_mode_cbc(kt->cipher))
    {
        /* The packet id gets added to *each* fragment in CBC mode, so we need
         * to account for it */
        frame->max_fragment_size -= calc_packet_id_size_dc(options, kt);
    }

    ...
}
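
One thing worth noting about the `target` line is that it is an unsigned subtraction, so a fragment value smaller than the overhead would wrap around instead of going negative. The sketch below only illustrates that arithmetic; it is not a statement about the root cause of this crash. The parameter types, the overhead value of 100 used in the usage note, and both function names are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors "unsigned int target = options->ce.fragment - overhead;".
 * Assumption: fragment is an int and overhead is unsigned, so the
 * subtraction is carried out in unsigned arithmetic. */
static unsigned int
fragment_target(int fragment, unsigned int overhead)
{
    return (unsigned int)fragment - overhead;
}

/* Hypothetical checked variant: refuses values that leave no payload
 * room instead of letting the subtraction wrap. */
static bool
fragment_target_checked(int fragment, unsigned int overhead,
                        unsigned int *target)
{
    assert(target != NULL);
    if (fragment <= 0 || (unsigned int)fragment <= overhead)
    {
        return false;
    }
    *target = (unsigned int)fragment - overhead;
    return true;
}
```

For example, `fragment_target(1300, 100)` gives 1200, while `fragment_target(50, 100)` wraps to a very large unsigned value; the checked variant rejects the second case.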

Actions #4

Updated by Jim Thompson almost 2 years ago

  • Assignee set to Marcos M
Actions #5

Updated by Leon Dang almost 2 years ago

  • Private changed from No to Yes
Actions #6

Updated by Jim Thompson almost 2 years ago

  • File deleted (openvpn_23.01.zip)
  • File deleted (openvpn_22.05.zip)
  • File deleted (server_config.ovpn)
Actions #7

Updated by Marcos M almost 2 years ago

  • Private changed from Yes to No
Actions #8

Updated by Florian Achleitner almost 2 years ago

We observed this today. OpenVPN crashed with these log lines:
openvpn:

2023-02-16 10:06:43.005245+01:00     openvpn     93320     [username]/[peer_ip]:7492 TLS: Initial packet from [AF_INET][peer_ip]:7492, sid=b614ce7c 1ba237ad

kernel:

2023-02-16 10:06:43.014193+01:00     kernel     -     pid 95368 (openvpn), jid 0, uid 0: exited on signal 8 (core dumped) 

Our config also uses mssfix.

fragment 1200; mssfix;

Unfortunately, there is no mechanism to restart a crashed service automatically. This can cause some trouble if there's no other path to the firewall left.

Actions #9

Updated by Florian Apolloner almost 2 years ago

I wonder if explicitly specifying a value for mssfix would fix this. From the docs:

If --fragment and --mssfix are used together, --mssfix will take its default max parameter from the --fragment max option.

So it could be fine to set mssfix to fragment manually?
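
Written out, that idea might look like the following in the custom options field, using the fragment value from note 8. It is untested and is only a guess at a workaround:

```
fragment 1200;
mssfix 1200;
```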

Actions #10

Updated by Jim Pingle almost 2 years ago

Florian Achleitner wrote in #note-8:

Unfortunately, there is no mechanism to restart a crashed service automatically. This can cause some trouble if there's no other path to the firewall left.

You can leverage the Service Watchdog package for this. It will check once per minute if the service is running, and if it isn't, it will restart it.

Florian Apolloner wrote in #note-9:

I wonder if explicitly specifying a value for mssfix would fix this. From the docs:

It's worth a try if you can reproduce it reliably.

Actions #11

Updated by Patrick Schmid over 1 year ago

OpenVPN has fixed it in version 2.6.1!

When will it be available in pfSense+ 23.01?

Actions #12

Updated by Jim Pingle over 1 year ago

  • Subject changed from OpenVPN crashes with Signal 8 to OpenVPN crashes with Signal 8 with very low fragment size
  • Assignee changed from Marcos M to Kristof Provost
  • Target version set to 2.7.0
  • Plus Target Version set to 23.05
Actions #13

Updated by Kristof Provost over 1 year ago

  • Status changed from New to Ready To Test

Future snapshots will have OpenVPN 2.6.2, which contains the fix.

Actions #14

Updated by Marcos M over 1 year ago

  • Status changed from Ready To Test to Resolved

I could not reproduce the issue on 23.05.r.20230505.1836.

Actions #15

Updated by Jim Pingle over 1 year ago

  • Affected Version set to 2.7.0