Bug #14390


Squid: SECURITY ALERT: Host header forgery detected

Added by Simon Byrnand 12 months ago. Updated 6 months ago.

Status: New
Priority: High
Assignee: -
Category: Squid
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:
Plus Target Version:
Affected Version: 2.6.0
Affected Plus Version:
Affected Architecture:

Description

Back in 2012, Squid version 3.2 added a "fix" for a potential security vulnerability involving host header forgery; it is described briefly here:

https://wiki.squid-cache.org/KnowledgeBase/HostHeaderForgery

It is also mentioned in the documentation for the host_verify_strict Squid option (linked below), which references CVE-2009-0801:

http://www.squid-cache.org/Doc/config/host_verify_strict/

In essence, when transparent interception is used, Squid obtains the SNI for HTTPS or the Host header for HTTP, does its own DNS lookup on that hostname, enumerates the IP addresses returned, and then double-checks that the destination IP address of the intercepted packet is one of those addresses.

If it's not, the request fails and returns an HTTP/409 error to the client, which shows up as a "NONE/409" in the squid access.log.
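
To illustrate the check, here is a simplified Python sketch of the logic described above (not Squid's actual implementation, which is in C++):

import socket

def passes_host_verification(sni_or_host: str, intercepted_dst_ip: str) -> bool:
    # Resolve the hostname taken from the SNI (HTTPS) or Host header (HTTP)
    # and collect every address returned by the DNS lookup.
    resolved = {info[4][0] for info in socket.getaddrinfo(sni_or_host, None)}
    # The intercepted packet's original destination must be one of those
    # addresses, otherwise the client gets HTTP/409 (NONE/409 in access.log).
    return intercepted_dst_ip in resolved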

While this extra check is well intentioned, probably seemed like the right thing to do in 2012, and no doubt does address CVE-2009-0801, in 2023 it has massive collateral damage: it makes transparent proxying almost unusable with the many large services that use CDNs, because it causes frequent but intermittent connection failures for CDN-based services.

The issue here is one of DNS coherency. As I've discovered while looking into this in detail over the last week, CDN DNS records typically have extremely short TTLs of around 10 or 20 seconds. Once that TTL expires, you will more often than not get back a set of IP addresses completely disjoint from the one you got just 20-30 seconds earlier.

A couple of examples I'm working with, which can be queried multiple times over a couple of minutes with dig and will return different results each time, are:

res.cdn.office.net
uk.ng.msg.teams.microsoft.com

These are both used by Office 365 / Microsoft Teams; however, this pattern of very short TTLs and disjoint IP address response sets seems to be pervasive across large CDN infrastructure as of 2023.
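
If you want to watch the rotation yourself, a small Python wrapper around dig (assuming dig is installed and on the PATH) makes the changing answer sets and TTLs easy to see:

import subprocess
import time

HOSTS = ["res.cdn.office.net", "uk.ng.msg.teams.microsoft.com"]

for _ in range(6):  # roughly three minutes of samples
    for host in HOSTS:
        # +noall +answer prints only the answer section, including the TTL
        answer = subprocess.run(["dig", "+noall", "+answer", host],
                                capture_output=True, text=True).stdout
        print(f"--- {host} ---\n{answer}")
    time.sleep(30)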

The end result is intermittent HTTP/HTTPS request failures returning HTTP/409. In my testing over the last few days, Microsoft Teams on iOS is very susceptible to this problem and is basically unusable: it works for a while, then fails to connect to the servers or start a video call because Squid returns HTTP/409 errors to it.

Ensuring DNS coherency between the clients and the proxy server (by forcing both to use the same DNS server, which I already do) does help with regular websites that have long TTLs or consistent IP address result sets, and it is essential to getting transparent proxying working at all, but it does not help with CDNs. :(

Here is what I think is happening in the specific case of the Teams client; the same basic scenario can apply to other client software as well.

The Teams app performs a DNS lookup on uk.ng.msg.teams.microsoft.com and gets multiple IP addresses returned. It chooses one and makes an HTTPS request to the server at that IP address. Squid intercepts this transparently, reads the SNI, does its own DNS lookup against the same DNS server, gets the same response, verifies that the destination IP address of the packets matches one of the DNS records, and everything works fine.

A couple of minutes later, Teams makes a new HTTPS request to the same hostname using an internally cached (within the application) copy of the IP address. Squid performs another DNS lookup on the hostname because the extremely short TTL has already expired, but this time the IP addresses returned are completely different, so the request fails with an HTTP/409 error. This breaks the client.

The TTL on a DNS record only mandates how long a DNS server or OS-level resolver may cache the result; it cannot prevent an application from choosing to keep using the same IP address for longer than the TTL. It would be perfectly reasonable for an HTTPS connection to stay open for an hour, for example, even if the DNS record has a TTL of 10 seconds. Absent the transparent proxy there is no issue, because the server IPs remain valid for long periods of time (probably days or weeks) even though the DNS records are constantly rotating every 10-60 seconds or so.

From my perspective, transparent mode is mostly useless now due to the combination of this original 2012 patch (which, to be fair, is a Squid change and nothing specific to PFSense) and the ubiquitous use of CDNs with extremely low DNS record TTLs and disjoint result sets that are constantly cycling IP addresses in and out of the responses.

This issue has been discussed in the PFSense forum before:

https://forum.netgate.com/topic/159364/squid-squidguard-none-409-and-dns-issue

However, the only outcomes were workarounds such as explicit client proxy settings or use of WPAD, both of which amount to admitting that transparent proxying is broken. I make use of both where possible, but there are still many cases where that is not possible.

The only true solution to get transparent mode working reliably again seems to be to walk back the "fix" applied in Squid 3.2 and accept that the theoretical exploit could be used; without doing so, we basically have to give up on transparent proxying altogether.

Patching this out has been discussed in a couple of places that I've found, here is one:

https://github.com/NethServer/dev/issues/5348

Reproducing and monitoring this issue is fairly easy:

Set up Squid in PFSense with transparent proxying enabled. Enable additional debug logging by adding debug_options ALL,1 rotate=7 to Advanced Features -> Custom Options (Before Auth) (see the snippet below), then try using apps that make heavy use of CDNs, such as the Microsoft Office apps, making sure that the client has no proxy setting.
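
For clarity (underscores are easily lost in forum formatting; see the clarification in #note-1 below), the exact directive to paste into Custom Options (Before Auth) is:

debug_options ALL,1 rotate=7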

If the problem occurs, you will see entries like these in /var/squid/logs/cache.log:


2023/05/16 10:16:08 kid1| SECURITY ALERT: Host header forgery detected on local=3.227.250.226:443 remote=10.2.132.38:50635 FD 526 flags=33 (local IP does not match any domain IP)
2023/05/16 10:16:08 kid1| SECURITY ALERT: on URL: kinesis.us-east-1.amazonaws.com:443
2023/05/16 10:16:08 kid1| SECURITY ALERT: Host header forgery detected on local=3.227.250.226:443 remote=10.2.132.38:50636 FD 526 flags=33 (local IP does not match any domain IP)
2023/05/16 10:16:08 kid1| SECURITY ALERT: on URL: kinesis.us-east-1.amazonaws.com:443
2023/05/16 10:16:09 kid1| SECURITY ALERT: Host header forgery detected on local=17.253.29.208:443 remote=10.2.133.96:62179 FD 526 flags=33 (local IP does not match any domain IP)
2023/05/16 10:16:09 kid1| SECURITY ALERT: on URL: app-site-association.cdn-apple.com:443
2023/05/16 10:16:09 kid1| SECURITY ALERT: Host header forgery detected on local=3.227.250.226:443 remote=10.2.132.38:50637 FD 526 flags=33 (local IP does not match any domain IP)
2023/05/16 10:16:09 kid1| SECURITY ALERT: on URL: kinesis.us-east-1.amazonaws.com:443

These log entries show both the hostname requested by the client and the "local" IP address the client was trying to connect to. If you do multiple DNS lookups of the hostname over a couple of minutes, you will find that sometimes the results include that IP address and sometimes they do not.
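
If you want a quick tally of which hostnames are affected, a rough helper like this (assuming the cache.log path and format shown above) counts the forgery alerts per requested hostname:

import re
from collections import Counter

counts = Counter()
with open("/var/squid/logs/cache.log") as log:
    for line in log:
        # Match the "SECURITY ALERT: on URL: host:port" lines shown above
        m = re.search(r"SECURITY ALERT: on URL: ([^:\s]+)", line)
        if m:
            counts[m.group(1)] += 1

for host, count in counts.most_common():
    print(f"{count:6d}  {host}")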

While I appreciate that this is an upstream issue, I think it very unlikely that the Squid maintainers would accept a patch to walk back the change made in 2012. However, I would argue that transparent mode is barely usable because of this issue, so hopefully you would consider a patch to address it.

Regards,
Simon


Files

Transparent Proxy test.py (1.67 KB) - Python script - Simon Byrnand, 05/17/2023 10:18 AM

Related issues

Has duplicate Feature #14786: Add GUI option for host_verify_strict (Duplicate)

Actions #1

Updated by Simon Byrnand 12 months ago

I can't seem to edit my initial post, but I wanted to clarify that the Squid debug option should be debug_options ALL,1 rotate=7 - the underlining in my original post obscured the underscore in the option name.

Actions #2

Updated by Jim Pingle 12 months ago

  • Project changed from pfSense to pfSense Packages
  • Category changed from Services to Squid
  • Release Notes deleted (Default)
Actions #3

Updated by Simon Byrnand 12 months ago

I've written a small Python script to help reliably reproduce and demonstrate this issue.

To simulate an application re-using a resolved IP address beyond the official TTL, it looks up the IP address once at the beginning and then, in a loop, makes HTTPS requests to that original IP address every 25 seconds. During the loop it separately resolves all IP addresses for the hostname and prints them, to show whether the returned results are changing over time. (Which they are.)

At some point, usually between about 30 seconds and 3 minutes depending on which host is being tested, an HTTP/409 failure occurs. It is caught as an SSL error exception, because intercepting an HTTPS connection and answering it with an error response produces a TLS error on the client side.
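
I won't paste the whole attached script here, but a minimal sketch of the same approach (hostname and 25 second interval as in the run below; the actual attachment may differ in detail, and this uses only the Python standard library) looks like this:

import socket
import ssl
import time

HOSTNAME = "eu-irl-00001.s3.dualstack.eu-west-1.amazonaws.com"
INTERVAL = 25  # seconds between requests

def resolve_all(name):
    # Return the set of IPv4 addresses currently published for the name
    return {info[4][0] for info in socket.getaddrinfo(name, 443, socket.AF_INET)}

# Resolve once and keep using that address, simulating an application that
# caches a resolved IP address beyond the record's TTL.
cached_ip = sorted(resolve_all(HOSTNAME))[0]
print(f"Resolved hostname {HOSTNAME} to {cached_ip}.")

context = ssl.create_default_context()

while True:
    print(f"\nAll currently resolved IP addresses for hostname {HOSTNAME}: "
          f"{resolve_all(HOSTNAME)}")
    print(f"Sending request to originally resolved IP address: {cached_ip}\n")
    try:
        with socket.create_connection((cached_ip, 443), timeout=10) as raw:
            # The SNI is the real hostname, so an intercepting Squid will do
            # its own DNS lookup and compare the result against cached_ip.
            with context.wrap_socket(raw, server_hostname=HOSTNAME) as tls:
                tls.sendall(f"GET / HTTP/1.1\r\nHost: {HOSTNAME}\r\n"
                            f"Connection: close\r\n\r\n".encode())
                reply = tls.recv(8192).decode(errors="replace")
                status_line, _, rest = reply.partition("\r\n")
                print(f"HTTP response code: {status_line.split()[1]}")
                print("HTTP response headers:")
                print(rest.split("\r\n\r\n")[0])
    except ssl.SSLError as err:
        # When Squid rejects the request the TLS exchange fails, which is
        # what the WRONG_VERSION_NUMBER errors in the sample run below show.
        print(f"An SSL error occurred: {err}")
    time.sleep(INTERVAL)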

Below is an example run of the script.

Resolved hostname eu-irl-00001.s3.dualstack.eu-west-1.amazonaws.com to 3.5.67.184. This will be used for all further requests to simulate an application internally caching the resolved IP address.

All currently resolved IP addresses for hostname eu-irl-00001.s3.dualstack.eu-west-1.amazonaws.com: {'52.218.41.120', '52.92.16.58', '3.5.67.184', '52.218.56.16', '52.218.120.130', '3.5.64.142', '52.92.36.146', '52.218.101.136'}
Sending request to originally resolved IP address: 3.5.67.184

HTTP response code: 307
HTTP response headers:
x-amz-id-2: Gf7M4pJL525F3RFBFglTVsLenz5WQydrG1i+xEo1LBOzL+tBTdeHTuMDIeVHj30zdcMLARX2k7SToa8gCFTcsQ==
x-amz-request-id: 6WXHJC469WTK5M2P
Date: Wed, 17 May 2023 10:09:23 GMT
Location: https://aws.amazon.com/s3/
Server: AmazonS3
Content-Length: 0

All currently resolved IP addresses for hostname eu-irl-00001.s3.dualstack.eu-west-1.amazonaws.com: {'52.92.37.2', '52.218.41.208', '3.5.65.0', '52.218.42.8', '3.5.65.109', '52.218.20.179', '3.5.68.15', '52.218.90.136'}
Sending request to originally resolved IP address: 3.5.67.184

HTTP response code: 307
HTTP response headers:
x-amz-id-2: FmZauTfW6oO5tZ/A8rmiGfKlIWLh7+g1LtM/OZV/UHPEMGvVMvk8opl7Din+zasbKM6wTQLDzs32KoFcC8Xwdw==
x-amz-request-id: YC9HEE5YA8DKFPGE
Date: Wed, 17 May 2023 10:09:49 GMT
Location: https://aws.amazon.com/s3/
Server: AmazonS3
Content-Length: 0

All currently resolved IP addresses for hostname eu-irl-00001.s3.dualstack.eu-west-1.amazonaws.com: {'52.92.17.186', '52.92.20.18', '3.5.65.157', '3.5.70.193', '3.5.66.161', '52.92.36.74', '52.92.16.138', '3.5.71.10'}
Sending request to originally resolved IP address: 3.5.67.184

HTTP response code: 307
HTTP response headers:
x-amz-id-2: TfFSVyt0RCvRVnenB7H6mBQ+CabDJJieZ2LPPLLj4dUQcJ5XvYzcioL/NOyMC/vQ8jmh8U6j2dAvs/6fyFlSYQ==
x-amz-request-id: SGXF88X4YSF0K3NR
Date: Wed, 17 May 2023 10:10:14 GMT
Location: https://aws.amazon.com/s3/
Server: AmazonS3
Content-Length: 0

All currently resolved IP addresses for hostname eu-irl-00001.s3.dualstack.eu-west-1.amazonaws.com: {'52.218.117.138', '52.92.33.186', '3.5.70.181', '52.92.18.42', '3.5.68.159', '52.218.92.120', '52.218.24.112', '52.218.62.160'}
Sending request to originally resolved IP address: 3.5.67.184

An SSL error occurred: [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1129)

All currently resolved IP addresses for hostname eu-irl-00001.s3.dualstack.eu-west-1.amazonaws.com: {'52.92.32.218', '52.218.106.48', '52.92.20.194', '52.218.105.147', '52.218.108.176', '52.218.92.64', '52.218.105.115', '52.92.1.250'}
Sending request to originally resolved IP address: 3.5.67.184

An SSL error occurred: [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1129)

All currently resolved IP addresses for hostname eu-irl-00001.s3.dualstack.eu-west-1.amazonaws.com: {'52.92.0.250', '52.218.96.67', '52.92.16.162', '52.218.112.40', '52.218.96.235', '52.218.28.96', '52.92.35.34', '52.218.112.48'}
Sending request to originally resolved IP address: 3.5.67.184

An SSL error occurred: [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1129)

Something interesting: while the failure usually takes a minute or so to occur, it occasionally happens on the very first request, when the client's initial DNS result does not match the one Squid obtains a moment later. This would seem to happen if the client's query is made just as the record's TTL is about to expire and Squid's slightly delayed DNS query gets a different result.

So there is a small window of opportunity for the issue to trigger even if a client is not holding onto an IP address too long - if it makes its query just as the TTL is timing out it can still fail on the first attempt, and this is consistent with what I've observed on client devices.

Actions #4

Updated by Jonathan Lee 7 months ago

https://redmine.pfsense.org/issues/14786

I have also seen "UPP" utilizing this to get around non-transparent mode: when you have a single URL set to splice, UPP takes it and uses it to piggyback to other URLs with the same IP address - essentially a VPN to all the other URLs that are not normally spliced, utilizing the open spliced connection. In effect, a GET header forgery.

Actions #6

Updated by Jonathan Lee 7 months ago

https://github.com/rudiservo/pfsense_storeid

This program was made for CDNs; maybe it can be expanded.

Actions #7

Updated by Mike Moore 7 months ago

host_verify_strict is set to OFF by default, so technically we shouldn't be having these 409 errors.
My suspicion is that the Squid package is not respecting this setting. This is a regression in the package that hasn't been caught until now, IMO.

Actions #8

Updated by Simon Byrnand 7 months ago

Hi Mike, (and others)

Thanks for commenting and having a look at this - I agree: with "host_verify_strict off", which is the default, 409 errors should not be returned, so Squid is not behaving as documented at http://www.squid-cache.org/Doc/config/host_verify_strict/ and doesn't seem to be respecting the setting correctly.

From reading I've done since my original post, it seems that requests which fail the hostname/IP "coherency check" (let's call it that) should be marked as uncacheable (to avoid someone else retrieving them from the cache) and should not be forwarded to Squid peers, but instead go directly to the destination server using the original IP address the client requested - and that these measures address the original security concern.

Neither of these restrictions is a major issue in most use cases, as they would only affect performance, and only for those who actually use peers or have caching enabled - which can't be many people these days with SSL everywhere, unless they have certificate-based MITM interception and decryption enabled. (We don't.)

Probably like many, we only use Squid for Squidguard to allow domain name based filtering and to provide some logging mostly for troubleshooting purposes.

Rapidly changing/cycling DNS entries with very short TTLs (often 30 seconds) for CDNs unfortunately seem to be the norm now. Even with Squid using the same caching DNS server as the clients (which isn't always feasible), this coherency test fails often enough that client devices get random 409s, frequently enough for users to see problems - usually intermittent page load failures and difficulty logging into and using services like Office 365, which seems to be a particularly bad offender now.

My workaround for devices that can't get explicit proxy settings via mechanisms like group policy has been to set up WPAD to advertise an explicit proxy setting, but not all devices support this either. Apple devices, of which we have a lot on our network, have defaulted to automatic proxy configuration disabled for the last several major iOS versions, which means trying to convince the owners of several hundred BYO devices to enable automatic proxy settings correctly.

A fix for this issue would be a massive quality of life improvement for anyone who uses transparent proxying with PFSense and would solve a lot of headaches.

I'd also add that I'm running 2.7.0 now and the issue is the same - not surprising as I think it's running the same or nearly the same version of squid. So the ticket can be updated to list 2.7.0.

Actions #9

Updated by Denis Roy 7 months ago

I have a transparent deployment with pfSense 2.7.0, and my mitigation has been to rely on pfBlockerNG and custom NAT rules for interception - in essence, to bypass interception for ranges of IPs that are extremely likely to implement DNS-based load balancing for their web services. I am bypassing Squid for:
1- Azure Cloud and Azure Front Door, solutions which often rely on Azure Traffic Manager for DNS load balancing.
2- Akamai. This CDN also implements DNS-based load balancing with low TTLs.
3- AWS, whose solutions are somewhat likely to use Route 53 for load balancing, though not as frequently as #1 and #2.
4- Custom solutions, often larger corporations that own multiple datacenters and rely on other commercial solutions like F5 GSLB.

It is getting increasingly difficult to maintain, and a real solution would be greatly appreciated.

An alternative to disabling the check or marking requests uncacheable could be to replace the validation routine. Instead of DNS-based validation, verify that the SNI from the Client Hello matches the certificate obtained from the remote web server; if the certificate is trusted AND valid, the request can be considered legitimate. For future-proofing as a web security gateway, we could someday look at leveraging Certificate Transparency to ensure we are getting a certificate that has been logged (known to be in use).
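
As a rough illustration of that idea (a conceptual sketch only, not something Squid does today; the function name is just for illustration), the check could look roughly like this:

import socket
import ssl

def sni_matches_certificate(dest_ip: str, sni: str, port: int = 443) -> bool:
    # Connect to the client's original destination IP using the client's SNI
    # and let normal TLS validation decide: the handshake only succeeds if the
    # server presents a certificate that chains to a trusted CA and whose
    # names cover the SNI the client asked for.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((dest_ip, port), timeout=10) as raw:
            with context.wrap_socket(raw, server_hostname=sni):
                return True
    except (ssl.SSLError, OSError):
        return False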

At this point, I would be happy with whatever solution prevents Squid from dropping perfectly legitimate HTTPS traffic.

Actions #10

Updated by Simon Byrnand 7 months ago

Denis Roy wrote in #note-9:

I have a transparent deployment with pfSense 2.7.0, and my mitigation has been to rely on pfBlockerNG and custom NAT rules for interception - in essence, to bypass interception for ranges of IPs that are extremely likely to implement DNS-based load balancing for their web services.

Could you not just use "Bypass Proxy for These Destination IPs" under "Transparent Proxy Settings" on the Squid General settings page to achieve the same effect much more easily? Even though it doesn't autocomplete when you type in the box, you can put an alias there and then add networks/IPs to the alias. I use an alias there for a few specific destinations I want to bypass the transparent proxy for, albeit for different reasons than yours. As far as I know, behind the scenes this setting just adds exclusions to the ipfw redirect rules - essentially what you are doing manually.

I am bypassing Squid for:
1- Azure Cloud and Azure Front Door, solutions which often rely on Azure Traffic Manager for DNS load balancing.
2- Akamai. This CDN also implements DNS-based load balancing with low TTLs.
3- AWS, whose solutions are somewhat likely to use Route 53 for load balancing, though not as frequently as #1 and #2.
4- Custom solutions, often larger corporations that own multiple datacenters and rely on other commercial solutions like F5 GSLB.

It is getting increasingly difficult to maintain, and a real solution would be greatly appreciated.

I'm surprised you're able to maintain this at all, to be honest - it seems like a never-ending job. My workaround is WPAD-based proxy auto-configuration, but it still requires convincing end users (mostly school children....) to set their proxy settings correctly. Also a never-ending job. :)

Actions #11

Updated by Denis Roy 7 months ago

Simon Byrnand wrote in #note-10:

Could you not just use "Bypass Proxy for These Destination IPs" under "Transparent Proxy Settings" on the Squid General settings page to achieve the same effect much more easily? Even though it doesn't autocomplete when you type in the box, you can put an alias there and then add networks/IPs to the alias. I use an alias there for a few specific destinations I want to bypass the transparent proxy for, albeit for different reasons than yours. As far as I know, behind the scenes this setting just adds exclusions to the ipfw redirect rules - essentially what you are doing manually.

You are correct: I am just doing this stuff manually, and "Bypass Proxy for These Destination IPs" works as well. I just find it easier to maintain if I can look at the NAT rules, get statistics on them from the WebUI, etc. I do it manually for convenience. It takes a minute to set up, and it makes everything easier down the road.

I'm surprised you're able to maintain this at all, to be honest - it seems like a never-ending job. My workaround is WPAD-based proxy auto-configuration, but it still requires convincing end users (mostly school children....) to set their proxy settings correctly. Also a never-ending job. :)

It's actually not that hard. pfBlockerNG does it for me. I can share most of the lists that I currently have:

https://saasedl.paloaltonetworks.com/feeds/akamai/all/ipv4
https://ip-ranges.amazonaws.com/ip-ranges.json
https://saasedl.paloaltonetworks.com/feeds/azure/public/azurecloud/ipv4
https://saasedl.paloaltonetworks.com/feeds/azure/public/azurefrontdoor/ipv4
AS32934 [ FACEBOOK, US ] #Note pfBlockerNG automatically supports resolving ASNs too!
https://api.fastly.com/public-ip-list
https://saasedl.paloaltonetworks.com/feeds/googleworkspace/all/ipv4
https://saasedl.paloaltonetworks.com/feeds/m365/worldwide/skype/all/ipv4

I feed all my access.log entries into a SIEM so I can easily identify all the 409s that I get, get a count for each unique domain name, etc. I built that list fairly quickly, TBH, and it is quite static at this point.

The real issue with my approach is that whitelisting something like AWS could eventually mean opening up things I'd like to block - lots of domains are hosted on AWS these days. But what I can't block through Squidguard is blocked through DNSBL, and that's enough for my needs.

Actions #12

Updated by Mike Moore 7 months ago

Or….
We could have a proper fix for this issue rather than workarounds that aren't scalable.

Actions #13

Updated by Simon Byrnand 6 months ago

Can anyone advise on the feasibility of building a custom patched version of Squid from source (at least for testing purposes, to further investigate possible solutions) to install in PFSense as a package?

This problem affects us badly enough that I'd be willing to look into doing this; however, I'm not familiar enough with the build environment and processes for PFSense packages to know what is involved, or whether everything needed to build a package identical to the released ones is available to those outside Netgate. (I'm a Linux and Windows guy rather than a FreeBSD guy, but I'm fairly comfortable building open source software from source on Linux, so it's not too far out of my wheelhouse given some pointers in the right direction.)

Is there a mechanism in PFSense to install third-party custom-built packages that are not listed in the official repos? Could a package be built with a slightly later version number and installed from the command line using pkg, etc.? If anyone can point me in the right direction, it would be appreciated.

Actions #14

Updated by Marcos M 6 months ago

  • Has duplicate Feature #14786: Add GUI option for host_verify_strict added