Load balancing for Zope using ZEO (updated)

by antonh last modified 07-Sep-06 11:48 AM

A client site has been so successful that they need to scale, and quickly. For anyone finding themselves in such a position, "this document":http://www.upfrontsystems.co.za/Members/jean/optimizing-plone on optimising Plone is a good place to start. Load balancing seems the optimal solution, given that it can be done with minimal downtime and will increase availability. The main options seem to be "mod_backhand":http://www.backhand.org/mod_backhand/ , "Pound":http://apsis.ch/pound/ and "Squid":http://www.squid-cache.org/ . But as ever, the devil is in the details. Hoping to fill the gaps left in what is some otherwise great documentation, I've collected a few resources together for those wanting to set up a ZEO cluster to scale their Zope/Plone site.

First of all, one is confronted with a choice between three options which all seem to take different approaches. So which to choose?

Option 1: Squid

In this thread Andreas Jung suggests that "[y]ou should think about using Squid as reverse proxy when your content is mostly static...this will improve your performance significantly and reduce your costs for ZEO clients." That suits our client's situation. If your content is not mostly static, Pound is a good choice for those who want minimal configuration and can live with static routing. mod_backhand has the advantage of being Apache-based, so mixed Zope/static sites can live happily together, and it balances per request rather than per connection. Note that you could also run Pound behind Apache in this situation.

If you choose Squid, then this document seems to be the authority on how to get the most out of it for ZEO. Plone.org also has a set of links collated on Squid/Plone integration. Option 2 is the one you will want, as Squid will treat the ZEO clients as cache peers, which gives you free access to all of its more sophisticated load-balancing gear. The defaults seem to give a fairly acceptable solution without any extra configuration, but you can tune things further if you want to get tricky.
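As a rough illustration of the cache-peer idea, here is a minimal squid.conf sketch. It assumes Squid 2.6 or later running in accelerator mode; the peer addresses and the zeo1..zeo3 names are hypothetical placeholders, and this is not the configuration from the documents mentioned above.

```
# Assumption: Squid 2.6+ accelerator (reverse proxy) mode
http_port 80 accel vhost

# One cache_peer line per ZEO client (hypothetical addresses).
# round-robin spreads requests across the peers; no-query disables ICP.
cache_peer 127.0.0.1   parent 8080 0 no-query originserver round-robin name=zeo1
cache_peer 192.168.0.2 parent 8080 0 no-query originserver round-robin name=zeo2
cache_peer 192.168.0.3 parent 8080 0 no-query originserver round-robin name=zeo3
```

Squid skips a peer it considers dead, so a cluster built this way can keep serving if one ZEO client goes down.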

Now one thing which isn't mentioned above is how to get the redirection (think Apache RewriteRules) working in Squid. There are a few references to redirect_program (the need to set it), but few examples. Not to fear. Thanks to a presentation by Andy McKay on Profiling, Benchmarking and Caching in Plone, I found the program SquidGuard which appears to fulfil this requirement. Here is the line for squid.conf:

redirect_program /usr/bin/squidGuard -c /etc/squid/squidGuard.conf

And an example SquidGuard configuration:

dbhome /var/lib/squidguard/db
logdir /var/log/squid
acl {
    default {
        redirect http://192.168.60.103:8080/VirtualHostBase/http/www.agmweb.ca:80/Plone/VirtualHostRoot/%p
    }
}
And a note: ensure cache headers are being set.
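One way to check that cache headers are being set is to look at the response headers directly. The sketch below is only an illustration: has_cache_headers is a hypothetical helper name, and the URL in the usage comment is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: reads raw HTTP response headers on stdin and
# succeeds if a Cache-Control or Expires header is present.
has_cache_headers() {
    grep -Eiq '^(Cache-Control|Expires):'
}

# Usage (placeholder URL), assuming curl is installed:
#   curl -sI http://www.example.com/ | has_cache_headers && echo "cache headers set"
```

If the check fails for pages that should be cacheable, Squid has little to work with and most requests will still reach Zope.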

So that seems to clear the way to implement a ZEO cluster using Squid which allows for VirtualHosts. Now it is just a matter of getting it up and running :)

Option 2: Apache + Pound + ZEO

The previous configuration at our client site has been:

[Apache] -> [ZEO Client] -> [ZEO Server]

So, rather than learning about Squid and reconfiguring everything, it seems easier to add Pound between Apache and the ZEO client(s). One thing which was initially a little mystifying about this whole load balancing business was how to send the URL information through Pound to ZEO. But of course, Pound simply forwards a request for a particular URL to a ZEO client, so it was in fact easier than you might think. The result of the initial Apache RewriteRule looked something like this:

http://localhost:8080/VirtualHostBase/http/www.clientsite.com/Plone/VirtualHostRoot/$1

Running Pound on port 81, the new resultant URL was:

http://localhost:81/VirtualHostBase/http/www.clientsite.com/Plone/VirtualHostRoot/$1
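For reference, an Apache rule producing that URL might look roughly like the fragment below. This is a sketch rather than the rule from the client site: it assumes mod_rewrite and mod_proxy are loaded, and the pattern ^/(.*)$ is a guess at the simplest possible mapping.

```
RewriteEngine On
# [P] proxies the request to Pound on port 81; [L] stops further rewriting
RewriteRule ^/(.*)$ http://localhost:81/VirtualHostBase/http/www.clientsite.com/Plone/VirtualHostRoot/$1 [P,L]
```

Only the port changes compared with the original setup, so the Virtual Host Monster part of the URL passes through Pound untouched.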

Configuring Pound

All Pound needed to do was select one of 3 ZEO clients, forward on the request and forward back the response. This was the configuration we went with, saved in /usr/local/etc/pound.cfg

ListenHTTP *,81
User apache
WebDAV 1
LogLevel 3

UrlGroup ".*"
#Backend Host,Port,Priority (greater number=greater priority)
BackEnd 127.0.0.1,8080,1
BackEnd 192.168.0.2,8080,1
BackEnd 192.168.0.3,8080,1
EndGroup
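
Before pointing Apache at Pound, it can help to confirm that each backend answers on its own. The following is a small sketch, assuming curl is installed; check_backends is a hypothetical helper, and the addresses in the usage comment simply mirror the BackEnd lines above.

```shell
#!/bin/sh
# Hypothetical helper: print the HTTP status code each backend returns
# for a HEAD request on "/". Arguments are host:port pairs.
check_backends() {
    for backend in "$@"; do
        # -m 5 gives up after five seconds; -w prints only the status code
        code=$(curl -s -o /dev/null -m 5 -I -w '%{http_code}' "http://$backend/")
        echo "$backend $code"
    done
}

# Usage:
#   check_backends 127.0.0.1:8080 192.168.0.2:8080 192.168.0.3:8080
```
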
One final tidbit I picked up was the following script to make pound into an init.d service:

#! /bin/sh
# written by Christian Brandlehner, http://chris.brandlehner.at
#
# /etc/init.d/pound
#
### BEGIN INIT INFO
# Provides: pound
# Required-Start: $network $syslog
# Required-Stop:
# Default-Start: 3 5
# Default-Stop:
# Description: Starts pound reverse proxy
### END INIT INFO

POUND_BIN=/usr/local/sbin/pound
POUND_PID=/var/run/pound.pid
POUND_CONF=/usr/local/etc/pound.cfg

if [ ! -x "$POUND_BIN" ] ; then
    echo -n "Pound not installed ! "
    exit 5
fi

# SUSE helper functions (rc_reset, rc_status, rc_exit)
. /etc/rc.status
rc_reset

case "$1" in
    start)
        echo -n "Starting pound "
        checkproc -p $POUND_PID $POUND_BIN
        if [ $? -eq 0 ] ; then
            echo -n "- Warning: Pound already running ! "
        else
            # A PID file without a running process is probably stale
            [ -e $POUND_PID ] && echo -n "- Warning: $POUND_PID exists ! "
        fi
        startproc -p $POUND_PID $POUND_BIN -f $POUND_CONF
        rc_status -v
        ;;
    stop)
        echo -n "Shutting down pound "
        checkproc -p $POUND_PID $POUND_BIN
        [ $? -ne 0 ] && echo -n "- Warning: pound not running ! "
        killproc -p $POUND_PID -TERM $POUND_BIN
        rc_status -v
        ;;
    try-restart)
        $0 stop && $0 start
        rc_status
        ;;
    restart)
        $0 stop
        $0 start
        rc_status
        ;;
    force-reload)
        $0 reload
        rc_status
        ;;
    reload)
        echo -n "Reloading pound "
        checkproc -p $POUND_PID $POUND_BIN
        [ $? -ne 0 ] && echo -n "- Warning: Pound not running ! "
        killproc -p $POUND_PID -HUP $POUND_BIN
        rc_status -v
        ;;
    status)
        echo -n "Checking for Pound "
        checkproc -p $POUND_PID $POUND_BIN
        rc_status -v
        ;;
    probe)
        # Suggest a reload if the config is newer than the PID file
        test $POUND_CONF -nt $POUND_PID && echo reload
        ;;
    *)
        echo "Usage: $0 {start|stop|status|try-restart|restart|force-reload|reload|probe}"
        exit 1
        ;;
esac
rc_exit

(Note that at the time of writing, I haven't tried this yet, but it came from the Pound mailing list and looks promising).

No need for redirector

Posted by Anonymous User at 09-Feb-05 01:42 AM
If you're just doing simple pass-through cached reverse proxying, there is no need to use redirector. The necessity of always using a redirector seems to be a common misconception - even the squid documentation is a bit misleading on this. Actually, if I remember correctly, I found the necessary hints in the squid newsgroup / mailing list rather than the documentation itself.

That's true but ...

Posted by antonh at 09-Feb-05 04:52 PM
If you want to use the Virtual Host Monster with anything more complicated than simple mappings then you do need something like Apache rewrites. I've actually decided to leave that to Apache and let Pound or Squid handle the load balancing.

That's true but ...

Posted by Anonymous User at 10-Mar-05 08:21 AM
Couldn't agree more. In my case Apache does all the redirecting properly (constructing URLs to utilize VHM) and all Squid does is reverse caching and load balancing. It works quite well and is simple to manage. Need another virtual host? Add a new Apache virtual host with appropriate proxy/rewrite rules and that's it: one spot for config and one thing to break. Once Squid is running you don't even have to remember about it :)
 

