VoIP - Dynamic Pinholes for RTP
The media stream for a SIP call uses dynamically assigned port numbers, and these port numbers can change several times during the course of a call. Because the ports are dynamic, no static policy can control media traffic correctly: any static policy will be either too permissive or too restrictive. Instead the policy needs to be dynamic, hence the term "Dynamic Pinholes." pfSense should read the SIP messages and their SDP content and extract the port-number information it needs to dynamically open pinholes that let the media stream traverse the firewall. An internal table should be maintained, and when the call is signalled to end, the pinhole should be closed, i.e. the dynamic rule created to permit the media stream should be removed.

The mechanism responsible for creating the pinhole, hereafter referred to as d'pinholer, needs to concern itself with SIP packets that contain an SDP body. When a SIP packet is permitted, d'pinholer checks whether it includes an SDP body, and if it does, it should extract and record the IP addresses and port numbers.
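To make the extraction step concrete, here is a minimal sketch in Python of pulling the media address and port out of an SDP body. It is not part of any pfSense code; the names `MediaEndpoint` and `parse_sdp_media` are made up for illustration. It only looks at the `c=` (connection) and `m=` (media) lines, and it assumes a media-level `c=` line overrides the session-level one and that a port of 0 means the stream is disabled.

```python
from typing import List, NamedTuple, Optional


class MediaEndpoint(NamedTuple):
    """One media stream announced in an SDP body (hypothetical helper type)."""
    address: str
    port: int
    transport: str


def parse_sdp_media(sdp: str) -> List[MediaEndpoint]:
    """Return the address/port/transport of every enabled m= line."""
    session_addr: Optional[str] = None
    media = []  # each entry: [address, port, transport]
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("c="):
            # c=<nettype> <addrtype> <connection-address>
            fields = line[2:].split()
            if len(fields) < 3:
                continue
            addr = fields[2].split("/")[0]  # drop any multicast TTL suffix
            if media:
                media[-1][0] = addr  # media-level c= overrides session-level
            else:
                session_addr = addr
        elif line.startswith("m="):
            # m=<media> <port>[/<count>] <proto> <fmt> ...
            fields = line[2:].split()
            if len(fields) < 3:
                continue
            port_text = fields[1].split("/")[0]
            if not port_text.isdigit():
                continue
            media.append([session_addr, int(port_text), fields[2]])
    # Skip streams with no usable address or with port 0 (disabled stream).
    return [MediaEndpoint(a, p, t) for a, p, t in media if a and p != 0]
```

A real d'pinholer would also have to locate the SDP body inside the SIP message and tie the result to a specific call; this sketch covers only the SDP lines themselves.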
#1 Updated by Chris Buechler over 8 years ago
RTP is easy to accommodate without that mess; tons of VoIP providers run as-is with no difficulties. It's most commonly used outbound, from phones on LAN to a PBX on the Internet or across a VPN, where the usual policy (though not ideal) is to allow everything, or at a minimum the RTP range that is in use, usually 10000-20000. Even in the PBX scenario it doesn't buy anything, as you have to keep 5060 open and that's where all your attacks are going. If someone breaks that, it's just going to automatically open whatever RTP they want, so who cares if the RTP range you use is wide open. It really buys you nothing over having the full RTP range open all the time. I don't see where this is anything but a whole lot of work for no real benefit. I'm open to being proven wrong though. :)
#2 Updated by Ken Leland over 8 years ago
The application we intend to use this for is as follows:
Asterisk Cluster -- pfSense -- Public Internet -- VoIP Phones
That is, VoIP Phones are connecting over the public internet to an Asterisk Cluster. We want the Asterisk server to be protected by pfSense as much as possible. Simply opening all RTP ports (10000 to 20000) to the Asterisk Server leaves the Asterisk Server too vulnerable to maliciously crafted RTP packets. Here is such a vulnerability:
With Dynamic Pinholes in pfSense, the Asterisk Server only receives RTP packets on ports and from source IPs that are associated with authenticated calls.
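The internal table from the ticket description could look roughly like the sketch below. This is an illustration under assumptions, not a design anyone in this thread has agreed on: the class name `PinholeTable` is hypothetical, calls are keyed by the SIP Call-ID header, and a pinhole is the tuple (source IP, destination IP, destination port). When a later SDP for the same call carries different ports, `update` reports which pinholes to open and which to close.

```python
from typing import Dict, Iterable, Set, Tuple

# (source IP, destination IP, destination UDP port); an assumed representation
Pinhole = Tuple[str, str, int]


class PinholeTable:
    """Tracks which pinholes are currently open for each call."""

    def __init__(self) -> None:
        self._by_call: Dict[str, Set[Pinhole]] = {}

    def update(self, call_id: str, pinholes: Iterable[Pinhole]):
        """Record the pinholes for a call; return (to_open, to_close)."""
        new = set(pinholes)
        old = self._by_call.get(call_id, set())
        self._by_call[call_id] = new
        return new - old, old - new

    def end_call(self, call_id: str) -> Set[Pinhole]:
        """Forget a call (e.g. on BYE); return the pinholes to close."""
        return self._by_call.pop(call_id, set())

    def permits(self, src_ip: str, dst_ip: str, dst_port: int) -> bool:
        """True if some active call has a matching pinhole."""
        key = (src_ip, dst_ip, dst_port)
        return any(key in holes for holes in self._by_call.values())
```

Usage would be along the lines of `to_open, to_close = table.update(call_id, pinholes)` for every SDP seen, and `table.end_call(call_id)` when the call is signalled to end.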
#3 Updated by Ken Leland over 8 years ago
As for RTP changing ports during a call: in Asterisk terminology it's called re-inviting, and if it is non-standard, any VoIP providers using Asterisk are missing out! ;)
The benefit that we get out of re-inviting is tremendous. It allows us to have one server responsible for setting up calls and a separate cluster of servers responsible for gatewaying calls from the PSTN to IP. The scalability of this architecture has proven very efficient. We are on our Nth "Media Gateway" while our "Feature Server" is still under 5% utilized. This is because our feature server is Asterisk, and after setting up each call, 99% of the time it re-invites the media streams directly to the "Media Gateways."
A colleague and I are interested in working with the pfSense project to contribute this feature.
#4 Updated by Chris Buechler over 8 years ago
Ah, that's the first RTP-only security issue I've noticed; that does indeed make it worthwhile. Re-inviting is apparently rare. I've analyzed at least a thousand pcaps with RTP across numerous VoIP providers, the bulk of which are to/from Asterisk, and never seen that.
If you're willing to contribute it, knock yourself out. Just make sure you're starting with 2.0 as a base. We can either get it added as a package, or to the base system for 2.1, depending on the implementation specifics.
#7 Updated by Ken Leland over 8 years ago
We have concluded that this logic belongs in pf.
Here are a couple of the other options we evaluated and why we concluded they would be insufficient:
1) Add a user space proxy server that runs on the firewall server, e.g. add a siproxd-like package.
The original objectives cannot be met with a user space proxy server. Namely:
* This cannot be transparent to the client and/or server.
* This would require the SIP servers behind the firewall to be NAT'd.
For the above two reasons, this solution is not competitive with the other available firewall options that exist.
The Juniper SSG series, Cisco PIX series, and Linux's netfilter/conntrack module all provide the Dynamic Pinhole feature as described in my initial summary.
2) Create a patch as part of the pfSense distribution that adds this logic to pf.
The maintenance associated with this option makes it unattractive. The FreeBSD developers would be making changes to pf that require us to change the patch.
What do you think, Chris?
I will be mailing the pf developers mailing list to get their input on this.
#9 Updated by Ken Leland over 8 years ago
As I understand your suggestion, we would have a user space daemon running and passively listening (i.e. sniffing with libpcap) to packets* coming in and out on an interface. When we see interesting packets with SDP messages, we use PF's userspace API to add or remove the firewall rules for the media streams (i.e. pinholes).
*An important question with this architecture is what kind of kernel level packet filtering FreeBSD provides to user space. Will this userspace program have to parse through all of the traffic on the interface, or just SIP traffic? I will be investigating this question.
This architecture has some obvious advantages, as writing and debugging code outside of the kernel is easier, and more secure for the system as a whole. However it also has the following disadvantages, both of which are related to process scheduling:
Race Condition: If the program in userspace isn't scheduled before the phone receives the new SDP, the user hears no audio until the userspace program is invoked. This will occur because PF will forward the packet from the SIP proxy to the phone, thus signalling the set-up of the media to the phone. However, until the program in userspace runs, the firewall will drop the media.
Buffer Overflow: If the program in userspace isn't scheduled before the buffer containing the sniffed packets overflows, the kernel will drop the packets, causing the userspace program to fail to open the pinholes, causing one-way or no-way audio.
For these two reasons passive listening isn't theoretically as robust as having this logic in the kernel. However, depending on the answer to the filtering questions it may work all of the time and be just as good in practice.
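For the rule-management half of the userspace approach, here is a rough sketch of generating the pinhole rules as pf rule text for an anchor. It goes through `pfctl -a <anchor> -f -` (load the anchor's ruleset from stdin) rather than the ioctl interface, purely to keep the example short. The anchor name `sip_pinholes`, the function names, and the exact rule shape are assumptions for illustration. Note that loading a ruleset into an anchor replaces whatever rules the anchor held, so the sketch always emits the full set of active pinholes.

```python
import ipaddress
from typing import Iterable, List, Tuple

Pinhole = Tuple[str, str, int]  # (source IP, destination IP, destination port)


def pinhole_rule(src_ip: str, dst_ip: str, dst_port: int) -> str:
    """Build one pf rule passing RTP (UDP) for a single pinhole."""
    # ip_address() raises ValueError on anything that is not a literal
    # address, so untrusted SDP content cannot smuggle in rule syntax.
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    if not 1 <= dst_port <= 65535:
        raise ValueError(f"invalid port: {dst_port}")
    return (f"pass in quick proto udp from {src} to {dst} "
            f"port {dst_port} keep state")


def anchor_ruleset(pinholes: Iterable[Pinhole]) -> str:
    """Render every active pinhole as the complete ruleset for the anchor."""
    return "".join(pinhole_rule(*p) + "\n" for p in sorted(set(pinholes)))


def pfctl_load_command(anchor: str = "sip_pinholes") -> List[str]:
    """argv for loading an anchor's ruleset from stdin."""
    return ["pfctl", "-a", anchor, "-f", "-"]
```

On the firewall, the ruleset text would be fed to that command on stdin, e.g. `subprocess.run(pfctl_load_command(), input=anchor_ruleset(active), text=True, check=True)`, and the main ruleset would need an `anchor "sip_pinholes"` line for the rules to be evaluated.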
#10 Updated by Ken Leland over 8 years ago
Documentation for Berkeley Packet Filter indicates that the requisite filtering exists.
"In addition, it supports "filtering" packets, so that only "interesting" packets can be supplied to the software using BPF; this can avoid copying "uninteresting" packets from the operating system kernel to software running in user mode, reducing the CPU requirement to capture packets and the buffer space required to avoid dropping packets."
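As a small illustration of what that kernel-side filtering would be asked to do, the sketch below builds a pcap-style filter expression that matches only SIP signalling. The function name is hypothetical, and the defaults (UDP and TCP, port 5060) are assumptions; the resulting string is the kind of expression libpcap compiles into a BPF program.

```python
from typing import Iterable


def sip_capture_filter(ports: Iterable[int] = (5060,)) -> str:
    """Build a pcap filter expression matching SIP on the given ports."""
    unique = sorted(set(ports))
    if not unique:
        raise ValueError("at least one port is required")
    for port in unique:
        if not 1 <= port <= 65535:
            raise ValueError(f"invalid port: {port}")
    clause = " or ".join(f"port {p}" for p in unique)
    # Only packets matching this expression are copied to userspace.
    return f"(udp or tcp) and ({clause})"
```

With a filter like this installed, the daemon would not have to parse all traffic on the interface, only packets to or from the SIP ports.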
#11 Updated by Jim Pingle over 8 years ago
Using divert in pf lets you have a userspace daemon that gets only the traffic specified by a given rule sent through it, and that program can process it, take action, and then reinject the packet and send it along. So it could be made only to handle SIP traffic on standard or configurable port(s). No need for it to be in the kernel.
#12 Updated by Chris Buechler over 8 years ago
divert is probably not a good solution for this scenario; it's good where you want to examine individual packets and pass/block them based on some additional analysis. This would be examining traffic to open other ports rather than pass/block that traffic, so there's no need for the overhead of divert. Unless you really need that to make sure the traffic is fully analyzed before anything else can come through, I don't see that as an issue for this specific need. It would guarantee no race conditions, as the packet wouldn't get passed until analysis is finished, but with proper BPF filtering, unless running grossly undersized hardware for the traffic load on the system, that won't be an issue. This (without divert) is how some of the various PF FTP proxies function for some scenarios.
#14 Updated by Ken Leland over 8 years ago
Ermal, I want to implement this myself along with 2 of my colleagues. The purpose of this ticket is to discuss the design with you guys prior to implementation to ensure that the overall architecture that we decide on meets both my requirements as well as the pfSense community requirements.
Already the feedback Chris and Jim have given me has been outstanding!
We have two options on the table: BPF and divert.
I will be performing some timing tests to inform the conversation further. Namely, timing how long it takes to add a rule to an anchor, and how long it takes from when a packet arrives on the network interface to when BPF delivers it to userspace.
In my preliminary tests I haven't seen it take longer than 5ms to add a rule to an anchor. Those tests were performed using pfctl so they also incurred process creation overhead. I will be repeating the tests but only timing the ioctl system call.
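A timing harness for this kind of measurement could be as simple as the sketch below. It is only an illustration: `time_call` is a made-up name, and what gets timed (a `pfctl` invocation, a bare ioctl wrapper, or a no-op process launch used as a baseline for process creation overhead) is supplied by the caller.

```python
import statistics
import time
from typing import Callable, Dict


def time_call(fn: Callable[[], object], repeats: int = 100) -> Dict[str, float]:
    """Run fn repeatedly and report wall-clock latency in milliseconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "min_ms": min(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }
```

Timing the same harness once around a full `pfctl` run and once around a trivial process launch would give an estimate of how much of the 5ms is process creation rather than the rule insertion itself.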