Regression #13943
Closed
OpenVPN crashes with Signal 8 with very low fragment size
Added by Marcos M almost 2 years ago. Updated over 1 year ago.
Description
OpenVPN crashes after updating from 22.01 to 22.05. The issue also occurs on 23.01-RC. Tested on an XG-1537-M2-32GB.
System logs:
Feb 5 15:35:32 raptor kernel: pid 99288 (openvpn), jid 0, uid 0: exited on signal 8 (core dumped)
OpenVPN logs:
Feb 5 15:35:30 raptor openvpn[99288]: MANAGEMENT: Client connected from /var/etc/openvpn/server1/sock
Feb 5 15:35:30 raptor openvpn[99288]: MANAGEMENT: CMD 'status 2'
Feb 5 15:35:30 raptor openvpn[99288]: MANAGEMENT: Client disconnected
Feb 5 15:36:02 raptor openvpn[23782]: event_wait : Interrupted system call (fd=-1,code=4)
Feb 5 15:36:02 raptor openvpn[23782]: /usr/local/sbin/ovpn-linkdown ovpns2 1500 0 192.168.223.1 255.255.255.0 init
Updated by Marcos M almost 2 years ago
Signal 8 (SIGFPE) is floating-point exception: https://man.freebsd.org/cgi/man.cgi?sektion=3&query=signal
The OpenVPN configuration looks normal except for:
<custom_options>fragment 1300; mssfix</custom_options>
<local_network>10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16</local_network>
Updated by Leon Dang almost 2 years ago
Marcos M wrote in #note-2:
Signal 8 (SIGFPE) is floating-point exception: https://man.freebsd.org/cgi/man.cgi?sektion=3&query=signal
This is a divide by zero error.
From a quick look, it appears that the maximum fragment size value in one of the OpenVPN data structures is 0. I'm not sure how that happens.
The crash is at:
(gdb) x/i $rip
=> 0x257837: idiv %esi
and %esi = 0.
Mapping back to a debug build of openvpn, we see that the problem is in:
00000000002564f0 <fragment_outgoing>:
....
256545: 41 8b 7f 10    mov 0x10(%r15),%edi    <- frame->max_fragment_size
....
256551: 89 fe          mov %edi,%esi
256553: 83 e6 fc       and $0xfffffffc,%esi
    const int mfs_aligned = (max_frag_size & ~FRAG_SIZE_ROUND_MASK);
256556: 99             cltd
256557: f7 fe          idiv %esi              <--- esi = 0
Inspecting the contents of the "frame" (%r15 register points to it), shows max_fragment_size = 0 and mss_fix = 0:
0x82815fe30: 0x000006e8 0x00000088 0x00000232 0x00000000 <--- mss_fix = 0
0x82815fe40: 0x00000000 0x000005dc 0x00000640 0x00000000
             ^^^^^^^^^^ max_fragment_size
The OpenVPN code, where the divide by zero happens:
void
fragment_outgoing(struct fragment_master *f, struct buffer *buf,
                  const struct frame *frame)
{
    ...
    /*
     * Send the datagram as a series of 2 or more fragments.
     */
    f->outgoing_frag_size = optimal_fragment_size(buf->len, frame->max_fragment_size);  <-----

optimal_fragment_size(int len, int max_frag_size)
{
    const int mfs_aligned = (max_frag_size & ~FRAG_SIZE_ROUND_MASK);
    const int div = len / mfs_aligned;  <-------------- crashes here.
    const int mod = len % mfs_aligned;
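To make the failure mode concrete, here is a minimal standalone sketch of the alignment step. It assumes FRAG_SIZE_ROUND_MASK is 3, which is consistent with the `and $0xfffffffc,%esi` instruction in the disassembly. The helper names aligned_fragment_size and fragment_size_is_usable are hypothetical and not part of OpenVPN; this only illustrates why a max_fragment_size below 4 produces a zero divisor, and it is not the upstream fix.

```c
#include <stdbool.h>

/* Assumption: matches the 0xfffffffc mask seen in the disassembly. */
#define FRAG_SIZE_ROUND_MASK 3

/* Round max_frag_size down to a multiple of 4, mirroring the first
 * statement of optimal_fragment_size(). */
static int
aligned_fragment_size(int max_frag_size)
{
    return max_frag_size & ~FRAG_SIZE_ROUND_MASK;
}

/* Hypothetical guard: true only if the aligned size is safe to use as a
 * divisor, i.e. strictly positive after rounding down. */
static bool
fragment_size_is_usable(int max_frag_size)
{
    return aligned_fragment_size(max_frag_size) > 0;
}
```

With the value observed in the memory dump (max_fragment_size = 0), aligned_fragment_size returns 0, so `len / mfs_aligned` is an integer division by zero, which x86 reports as SIGFPE. Any value from 0 through 3 behaves the same way.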
max_fragment_size is set at https://github.com/OpenVPN/openvpn/blob/master/src/openvpn/mss.c#L257:
static void
frame_calculate_fragment(struct frame *frame, struct key_type *kt,
                         const struct options *options,
                         struct link_socket_info *lsi)
{
    ...
    unsigned int target = options->ce.fragment - overhead;
    /* The 4 bytes of header that fragment adds itself. The other extra payload
     * bytes (Ethernet header/compression) are handled by the fragment code
     * just as part of the payload and therefore automatically taken into
     * account if the packet needs to fragmented */
    frame->max_fragment_size = adjust_payload_max_cbc(kt, target) - 4;

    if (cipher_kt_mode_cbc(kt->cipher))
    {
        /* The packet id gets added to *each* fragment in CBC mode, so we need
         * to account for it */
        frame->max_fragment_size -= calc_packet_id_size_dc(options, kt);
    }
    ...
}
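The calculation above is a chain of subtractions, and `target` is unsigned, so if `overhead` exceeds the configured fragment value the result wraps around to a large number instead of going negative. The sketch below shows one way such a chain could be checked before the result is stored. It is an illustration under stated assumptions: checked_max_fragment_size is a hypothetical helper, it omits adjust_payload_max_cbc entirely, and it takes the packet ID size as a plain parameter. Nothing in this thread establishes that this is how the zero value arose, and this is not the upstream fix.

```c
#include <limits.h>

/* Hypothetical helper, not part of OpenVPN: subtract the per-fragment costs
 * from the configured --fragment value using wide signed arithmetic, and
 * report failure instead of returning a zero, negative, or wrapped size.
 * Returns 0 and stores the size in *out on success, -1 otherwise. */
static int
checked_max_fragment_size(int fragment, int overhead, int packet_id_size,
                          int *out)
{
    const int frag_header = 4;  /* header bytes that fragment adds itself */
    long long result = (long long)fragment - overhead - frag_header
                       - packet_id_size;

    /* Anything below 4 rounds down to 0 and would later be used as a
     * divisor in optimal_fragment_size(). */
    if (result < 4 || result > INT_MAX)
    {
        return -1;
    }
    *out = (int)result;
    return 0;
}
```

For example, with arbitrary illustrative numbers, a fragment value of 1300 with 100 bytes of overhead and an 8-byte packet ID leaves 1188 bytes, while a fragment value of 50 with the same overhead is rejected rather than wrapping.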
Updated by Jim Thompson almost 2 years ago
- File deleted (openvpn_23.01.zip)
- File deleted (openvpn_22.05.zip)
- File deleted (server_config.ovpn)
Updated by Florian Achleitner almost 2 years ago
We observed this today. OpenVPN crashed with these log lines:
openvpn:
2023-02-16 10:06:43.005245+01:00 openvpn 93320 [username]/[peer_ip]:7492 TLS: Initial packet from [AF_INET][peer_ip]:7492, sid=b614ce7c 1ba237ad
kernel:
2023-02-16 10:06:43.014193+01:00 kernel - pid 95368 (openvpn), jid 0, uid 0: exited on signal 8 (core dumped)
Our config also uses mssfix.
fragment 1200; mssfix;
Unfortunately, there is no mechanism to restart a crashed service automatically. This can cause some trouble if there's no other path to the firewall left.
Updated by Florian Apolloner almost 2 years ago
I wonder if explicitly specifying a value for mssfix would fix this. From the docs:
If --fragment and --mssfix are used together, --mssfix will take its default max parameter from the --fragment max option.
So might it help to set mssfix manually to the same value as fragment?
Updated by Jim Pingle almost 2 years ago
Florian Achleitner wrote in #note-8:
Unfortunately, there is no mechanism to restart a crashed service automatically. This can cause some trouble if there's no other path to the firewall left.
You can leverage the Service Watchdog package for this. It will check once per minute if the service is running, and if it isn't, it will restart it.
Florian Apolloner wrote in #note-9:
I wonder if explicitly specifying a value for mssfix would fix this. From the docs:
It's worth a try if you can reproduce it reliably.
Updated by Patrick Schmid over 1 year ago
OpenVPN has fixed it in version 2.6.1!
When will it be available in pfSense Plus 23.01?
Updated by Jim Pingle over 1 year ago
- Subject changed from OpenVPN crashes with Signal 8 to OpenVPN crashes with Signal 8 with very low fragment size
- Assignee changed from Marcos M to Kristof Provost
- Target version set to 2.7.0
- Plus Target Version set to 23.05
For reference, the fix appears to be: https://github.com/OpenVPN/openvpn/commit/b9a9de156bc3ad517bfc6d1042ad0ef0350b638e
Updated by Kristof Provost over 1 year ago
- Status changed from New to Ready To Test
Future snapshots will have OpenVPN 2.6.2, which contains the fix.
Updated by Marcos M over 1 year ago
- Status changed from Ready To Test to Resolved
I could not reproduce the issue on 23.05.r.20230505.1836.