
Bug #5168

squid doesn't function during/after HA failover

Added by Adam Thompson about 5 years ago. Updated 17 days ago.

Status: Resolved
Priority: High
Category: Squid
Target version: -
Start date: 09/18/2015
Due date:
% Done: 100%
Estimated time:
Affected Version: All
Affected Architecture:

Description

Per #2591, there is no supported way for squid to listen on a CARP VIP interface.
This means that HA isn't really HA in any scenario that enforces the use of a built-in proxy.

Per "Kill Bill"'s dismissive "use WPAD or whatever" answer: that solution ONLY works for desktop browsers that support WPAD "or whatever". Most non-browser software (e.g. pfSense itself!) requires a statically configured proxy.

Simple solution: allow squid to service a VIP that fails over during a firewall failover event. Alternative solutions welcomed.

Without a way to do this, it is impossible (or rather, pointless!) to deploy pfSense as a high-availability solution in any environment that blocks direct outbound HTTP and HTTPS. This is a bug, not a missing feature: it's equivalent to saying "HA works fine, just change your default gateway setting on every device".

History

#1 Updated by Kill Bill about 5 years ago

Dude. The XMLRPC feature was NEVER intended to do any HA. It's used (as in tons of other packages) to sync configuration between multiple boxes. That's it. Nothing else. Nothing in the GUI ever suggests it's supposed to do any HA/failover either.

#2 Updated by Kill Bill about 5 years ago

P.S. If you want your "HA"/"failover", then stick

<showlistenall/>
<showvirtualips/>

after this line: https://github.com/pfsense/pfsense-packages/blob/master/config/squid3/34/squid.xml#L257

I obviously won't submit any such PR, because it's BS that produces no real failover, as explained in the thread I linked on the other "bug" you filed.

#3 Updated by Chris Buechler about 5 years ago

It should be possible, and a good idea, to list VIPs in the binding list.

As a workaround, you can just bind it to localhost and use a port forward to redirect traffic hitting the CARP IP on the proxy port to 127.0.0.1.
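
Sketched as a pf rdr rule, that workaround looks like the following. The interface name and addresses are placeholders, and in practice this is configured as a Firewall > NAT > Port Forward rule in the GUI rather than hand-written pf.conf:

```pf
# Hypothetical sketch, assuming em1 is the LAN interface and
# 192.0.2.1 is the CARP VIP. Squid itself listens only on
# 127.0.0.1:3128; clients point at the VIP, and pf redirects
# those connections to loopback on whichever node currently
# holds the VIP.
rdr on em1 proto tcp from any to 192.0.2.1 port 3128 -> 127.0.0.1 port 3128
```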

#4 Updated by Adam Gibson over 2 years ago

Chris Buechler wrote:

It should be possible, and a good idea, to list VIPs in the binding list.

As a workaround, you can just bind it to localhost and use a port forward to redirect traffic hitting the CARP IP on the proxy port to 127.0.0.1.

I just use the following in the Advanced section of the Squid General setup page, where x.x.x.x is the virtual IP of the LAN.

Custom Options (Before Auth)
http_port x.x.x.x:3128

I also set the outgoing source address to the virtual IP so that I can proxy traffic through VPN tunnels if needed; it is not needed for plain Internet traffic. The source IP will match the IPsec tunnel policy and go through it, and external traffic gets NATed properly as well.
tcp_outgoing_address x.x.x.x

Failover works fine for me with this config. Connections that are mid-transfer get stuck, of course, but new connections go out properly.
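
Put together, the Custom Options (Before Auth) box would contain a squid.conf fragment like this (x.x.x.x is a placeholder for the LAN CARP VIP, as above):

```conf
# squid.conf fragment for Custom Options (Before Auth).
# x.x.x.x is the LAN CARP VIP (placeholder).

# Accept proxy connections on the VIP only, so the listener
# follows the VIP during a CARP failover:
http_port x.x.x.x:3128

# Source outgoing traffic from the VIP so it matches IPsec
# tunnel policies; not needed for plain Internet traffic:
tcp_outgoing_address x.x.x.x
```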

#5 Updated by Zeev Zalessky over 1 year ago

Hello,

Any updates on this issue?
I have 200 VLANs on my firewall, and adding 200 http_port lines is not a good option.
IMHO, Squid in an HA configuration should additionally start on the active node and stop on the passive node, like FRR.

#6 Updated by Adam Gibson over 1 year ago

Zeev Zalessky wrote:

Hello,

Any updates on this issue?
I have 200 VLANs on my firewall, and adding 200 http_port lines is not a good option.
IMHO, Squid in an HA configuration should additionally start on the active node and stop on the passive node, like FRR.

Bill seems to be of the opinion that if there is no way for the in-process connection state to migrate to the failover firewall's Squid process, then it is not true HA, and therefore not worth using the VIP to fail connections over automatically to the secondary's proxy server.

While I can understand the desire for full 100% failover, holding HA to that strict a requirement would disqualify a lot of the existing HA functionality. Relayd being removed from pfSense 2.5 (because of upstream support) will cause the same "not true HA" failover for load balancing for those who have to transition from relayd to HAProxy: the in-process connection state in HAProxy will also not fail over to the second HAProxy instance during an HA failover, the same issue Bill describes for Squid. Current connections will not survive, but new connections will go to the new instance on the secondary firewall.

Knowing that Squid and HAProxy connections will not fail over fully intact (their internal state lives inside the processes themselves, so TCP state failover is not enough), it is still useful for Squid or HAProxy to fail TCP connections over to the secondary instance to minimize downtime. Using the VIP for Squid or HAProxy is a very easy and good way to handle failover within those limitations, IMHO. A failover will impact current connections, but retries will automatically reach the new Squid process on the secondary instance, without any client-side configuration to juggle multiple Squid IPs/instances.

I disagree with Bill that TCP state failover would hurt in this scenario. It still helps the transition by ensuring the client gets a TCP reset when its traffic lands on the secondary firewall: the secondary will pass the existing connection's packets, the local IP stack will find no matching process state, and so it will send a TCP reset. If the TCP state were not pushed to the secondary firewall, the secondary would instead drop the packets, delaying recovery further.

To sum up, I think listing the VIP is a very good way to help with HA failover for Squid, without needing an extra NAT rule plus listening on 127.0.0.1, or a custom config section to force http_port onto the VIP. I have been using the custom config method for a few years now and it has been working well, with the known limitation that existing connections are reset while new connections go to the secondary firewall's Squid instance and work.

#7 Updated by Viktor Gurov 5 months ago

https://github.com/pfsense/FreeBSD-ports/pull/867

This is mainly for Transparent mode and IPv6 Squid configurations; for IPv4, it's easier to use a port forward to 127.0.0.1:3128:
https://redmine.pfsense.org/issues/2591#note-7

#8 Updated by Jim Pingle 5 months ago

  • Status changed from New to Pull Request Review

#9 Updated by Renato Botelho 4 months ago

  • Status changed from Pull Request Review to Feedback
  • Assignee set to Renato Botelho
  • % Done changed from 0 to 100

PR has been merged. Thanks!

#10 Updated by Azamat Khakimyanov about 1 month ago

  • Status changed from Feedback to Assigned

I've tested it on 2.4.4_p3: an HA cluster with a simple Squid config (Transparent mode), so Squid is active on both the Primary and Secondary nodes. When I put the Primary node into "Persistent CARP Maintenance" mode, all open HTTPS connections die, but I can still open new HTTPS websites from my laptop via Squid on the Secondary node. That's OK.

I tested it on 2.5-DEV (built on Wed Sep 16 01:00:40 EDT 2020): with the new "CARP Status VIP" feature, Squid is now active only on the current Master node. So when I put the Primary node into "Persistent CARP Maintenance" mode (the Secondary becomes Master, Squid stops on the Primary and starts on the Secondary), all open HTTPS connections die and I can't open any new ones. In the "Real Time" menu on the Secondary node I see no activity at all: the Squid Access Table is empty. It looks like Squid starts as a service (I see it in Status > Services) but doesn't actually start working. If I just go to Package > Proxy Server: General Settings > General and press "Save" without changing anything, Squid starts working on the Secondary node as it should.

When I press "Leave Persistent CARP Maintenance Mode" on the Primary node, I can open new HTTPS websites after Squid starts on the Primary and stops on the Secondary, without needing to press "Save" (Package > Proxy Server: General Settings > General).

So with this new "CARP Status VIP" feature, Squid does not work at all on an HA cluster after a failover. It would be better to revert to the 2.4.4_p3 behavior, or to write a detailed manual on how to use Squid on an HA cluster properly. These steps:
- Bind Squid to the Loopback (127.0.0.1) interface.
- Create a port forward from <CARP IP>:3128 to 127.0.0.1:3128.
- Have your users hit <CARP IP>:3128.
don't help.

#11 Updated by Viktor Gurov 28 days ago

Azamat Khakimyanov wrote:

I tested it on 2.5-DEV (built on Wed Sep 16 01:00:40 EDT 2020): with the new "CARP Status VIP" feature, Squid is now active only on the current Master node. So when I put the Primary node into "Persistent CARP Maintenance" mode (the Secondary becomes Master, Squid stops on the Primary and starts on the Secondary), all open HTTPS connections die and I can't open any new ones. In the "Real Time" menu on the Secondary node I see no activity at all: the Squid Access Table is empty. It looks like Squid starts as a service (I see it in Status > Services) but doesn't actually start working. If I just go to Package > Proxy Server: General Settings > General and press "Save" without changing anything, Squid starts working on the Secondary node as it should.

Fix:
https://github.com/pfsense/FreeBSD-ports/pull/941

#12 Updated by Jim Pingle 28 days ago

  • Status changed from Assigned to Pull Request Review

#13 Updated by Renato Botelho 28 days ago

  • Status changed from Pull Request Review to Feedback

PR has been merged. Thanks!

#14 Updated by Azamat Khakimyanov 17 days ago

  • Status changed from Feedback to Resolved

Tested on:
2.5.0-DEVELOPMENT (amd64)
built on Sun Oct 04 00:53:54 EDT 2020
FreeBSD 12.2-STABLE

I created HA cluster with:
- Squid enabled on LAN CARP VIP and Loopback (but not on LAN)
- Transparent HTTP Proxy enabled on LAN CARP VIP only (but not on LAN)
- HTTPS/SSL Interception (in 'Splice All' mode) enabled on LAN CARP VIP only (but not on LAN)
When I put the Primary into "Persistent CARP Maintenance" mode and the Secondary became Master, I was still able to open new websites, while websites on the blacklist were still blocked.
So Squid is now working on the HA cluster.

Bug can be marked RESOLVED.
