Posted by Steve on Wed 27 Aug 2014 at 14:12
HAProxy is a TCP/HTTP load-balancer, allowing you to route incoming traffic destined for one address to a number of different back-ends. The routing is very flexible and it can be a useful component of a high-availability setup.
The last time we looked at load-balancers was in 2005, when we briefly examined webserver load-balancing with pound.
HAProxy is a little more flexible than pound when it comes to configuration, and in this article we’ll show how it can be used to balance traffic between the internet-at-large and a small number of local webservers, along with some of the more advanced facilities it supports.
The general use case for a load-balancer is to present a service on a network which is actually fulfilled by a number of different back-end hosts. Incoming traffic is accepted on a single IP address, and then passed to one of those back-end hosts to be handled.
Splitting traffic like this allows a service to scale pretty well, barring any other limiting factors (such as a shared filesystem, a single database host, etc, etc).
HAProxy can be used to route traffic regardless of the protocol. For example, it could provide load balancing to:
- Web servers
  - Such as a number of hosts running Apache, nginx, lighttpd, etc.
- Mail servers
  - Such as a small pool of hosts running postfix, exim4, qpsmtpd, etc.
- Arbitrary TCP services
  - Such as APIs implemented in go, lua, or node.js.
I’d imagine the most popular use-case, though, would be directing traffic to webservers. In this next example we’ll show how connections made to a single IP address can be passed to four backend hosts.
Getting Started with HAProxy
To get started first install HAProxy. Depending on the release of Debian you’re running you might find you need to enable the backports repository first.
If you see a “package not found” response, consult the package search results for a clue.
Once installed, the configuration is carried out solely by editing the configuration file /etc/haproxy/haproxy.cfg.
The following example is perhaps the simplest useful configuration, listening for incoming HTTP requests on port 80, and distributing those requests to one of four back-end hosts:
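A minimal sketch of such a configuration might look like the following. The section names (http-in, webservers), the server names, and the addresses are placeholders chosen for illustration, and the values in the global and defaults sections are assumptions rather than recommendations:

```haproxy
# Sketch only: names, addresses and limits below are illustrative assumptions.
global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

# Accept HTTP requests on port 80 and hand them to the backend pool.
frontend http-in
    bind *:80
    default_backend webservers

# Four back-end hosts, each listening on port 8080.
backend webservers
    server web1 10.0.0.10:8080 check
    server web2 10.0.0.20:8080 check
    server web3 10.0.0.30:8080 check
    server web4 10.0.0.40:8080 check
```

The “check” keyword on each server line asks HAProxy to probe that host periodically, so a host which stops answering is taken out of rotation.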
With this configuration file in place the service can be restarted to make it live; the configuration file will be tested for errors before the restart occurs.
This example is pretty similar to the one we demonstrated with pound so many years ago; however, HAProxy has many useful additions which we can now explore.
Obviously for this example to be useful to you it must be updated to refer to the real backends, and they must be reachable from the host you’re running the proxy upon. In this case our traffic is passed to port 8080 on a number of hosts in the 10.0.0.0/24 network. In my case I tend to run a small VPN to allow members of a VLAN to communicate securely. Even though I trust my hosting company I see no reason that my traffic should be sniffed.
Equally, although this example will give you increased availability, because any failing backend will be removed, it won’t provide high-availability because the proxy itself is now a single point of failure.
To use HAProxy for high-availability it should be coupled with IP failover to remove itself as a single point of failure.
The simple example listed previously routed traffic “randomly” between the various backend hosts.
There are various different options, which may be specified via the “balance” directive in the backend section. The three most common approaches are:
- Distributing each request in turn to the next server (“balance roundrobin”).
- Distributing each incoming request to the least loaded backend we have, measured by the number of active connections (“balance leastconn”).
- Distributing each request to a particular server, based upon the hash of the source IP making that request (“balance source”).
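As a rough illustration, the algorithm is selected with a single “balance” line in the backend section, and only one such line should be active at a time. The backend and server names here are hypothetical:

```haproxy
# Sketch only: pick exactly one of the balance lines.
backend webservers
    balance roundrobin      # each request goes to the next server in turn
    # balance leastconn     # prefer the server with the fewest connections
    # balance source        # hash the client IP to choose a server
    server web1 10.0.0.10:8080 check
    server web2 10.0.0.20:8080 check
```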
Of these options only “balance source” requires any real discussion. This method will ensure that a request from the IP address 184.108.40.206 will always go to the same backend, assuming it remains alive. This allows you to sidestep any issues with cookie persistence if sessions are stored locally.
The roundrobin mode also allows you to assign weights to the backends, such that bigger hosts can receive more of the traffic. The following example has four hosts, two of which have more RAM/CPU to burn, and receive more of the traffic:
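A sketch of such a weighted pool, assuming hypothetical server names and addresses, with the two larger hosts listed first:

```haproxy
# Sketch only: web1 and web2 are assumed to be the bigger hosts.
backend webservers
    balance roundrobin
    server web1 10.0.0.10:8080 weight 2 check
    server web2 10.0.0.20:8080 weight 2 check
    server web3 10.0.0.30:8080 weight 1 check
    server web4 10.0.0.40:8080 weight 1 check
```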
The “weight” parameter is used to adjust the server’s weight relative to other servers. All servers will receive a load proportional to their weight relative to the sum of all weights, so the higher the weight, the higher the load. By giving the first two servers weights twice those of the last two, we should see each of them handle twice as many requests as either of the last two.
HAProxy will notice if a back-end disappears entirely, because it will fail to connect.
Beyond that, though, you might wish to programmatically determine whether a host is in the pool. The way you do that is by defining the URL that the proxy will poll.
Each backend host can have a URI defined which will be used to determine whether the host is alive. If that URI fails to return a successful response such as “HTTP 200 OK” (HAProxy treats any 2xx or 3xx status as healthy), then the host will be removed, and receive no new connections.
The following example will request the file /check.php, sending the correct HTTP host header, against each of the named servers.
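A sketch of such a health check, written in the older “option httpchk” style that matches the HAProxy releases contemporary with this article. The hostname example.com and the server names are placeholders; more recent releases also offer “http-check send” for setting headers, so consult the documentation for the version you run:

```haproxy
# Sketch only: example.com stands in for your real virtual host name.
backend webservers
    # Poll /check.php on every server, sending a Host header with the request.
    option httpchk GET /check.php HTTP/1.1\r\nHost:\ example.com
    server web1 10.0.0.10:8080 check
    server web2 10.0.0.20:8080 check
    server web3 10.0.0.30:8080 check
    server web4 10.0.0.40:8080 check
```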
Although we’ve not talked about load-balancing TCP-connections, rather than HTTP-connections, this next example shows how you could test that a Redis server is still working:
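A sketch of such a check, assuming HAProxy 1.5 or later (where “tcp-check” is available) and hypothetical Redis hosts on the default port 6379:

```haproxy
# Sketch only: section name and addresses are illustrative assumptions.
listen redis
    bind *:6379
    mode tcp
    option tcp-check
    # Is the server answering at all?
    tcp-check send PING\r\n
    tcp-check expect string +PONG
    # Only treat the server as healthy if it is currently the master.
    tcp-check send info\ replication\r\n
    tcp-check expect string role:master
    # Close the check session politely.
    tcp-check send QUIT\r\n
    tcp-check expect string +OK
    server redis1 10.0.0.10:6379 check inter 1s
    server redis2 10.0.0.20:6379 check inter 1s
```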
This example first sends a “PING” string, expecting a “PONG” reply, then tests that the remote host is a Redis master. As you can see this is both simple to configure and extraordinarily powerful.
If you consult the HAProxy documentation you’ll find further details which can be used to specify the number of failures required to remove a host, and tweak things such that post-failure a host must respond positively a specific number of times before it is reintroduced, avoiding flaps as services come and go in quick succession.
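As a rough sketch of those settings, the “inter”, “fall” and “rise” parameters on a server line control how often a check runs, how many consecutive failures remove a host, and how many consecutive successes bring it back. The values and names below are illustrative assumptions, not recommendations:

```haproxy
# Sketch only: tune the numbers to suit your own services.
backend webservers
    # Check every 2 seconds; remove after 3 failures; restore after 5 successes.
    server web1 10.0.0.10:8080 check inter 2s fall 3 rise 5
    server web2 10.0.0.20:8080 check inter 2s fall 3 rise 5
```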
Adding Gzip Support
Although it isn’t possible to arbitrarily rewrite incoming requests, or massage the output received from the backend hosts, one thing that is supported by HAProxy is adding Gzip compression between itself and the requesting client.
To enable this, update your front-end definition:
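A sketch of a front-end with compression switched on, assuming HAProxy 1.5 or later. The front-end and backend names are hypothetical, and the list of MIME types is only an example of content that usually compresses well:

```haproxy
# Sketch only: adjust the MIME types to match what you actually serve.
frontend http-in
    bind *:80
    compression algo gzip
    compression type text/html text/plain text/css application/javascript
    default_backend webservers
```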
Obviously there is a trade-off to be made here:
- With no compression you’ll serve more network traffic.
- With compression enabled your CPU will have to do more work (to perform the compression).
In the general case HAProxy has sufficiently low overhead that it is probably a good idea to enable compression. If your load-levels start to rise too high you might want to reconsider that though. This is a common server tuning consideration.
(Adding single/static headers is supported, but complex rewrites are not possible.)
HAProxy can be used to protect servers from particular kinds of attacks, most notably the “slowloris” attack, where a remote host holds open multiple connections to your server and simply sends requests very, very slowly.
To mitigate this you’d add timeout options:
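A sketch of such timeouts, placed in the defaults section so they apply to every proxy. The 5-second request timeout follows the text; the other values are assumptions chosen for illustration:

```haproxy
# Sketch only: all values other than the 5s request timeout are assumptions.
defaults
    mode http
    # Maximum time a client may take to send its complete request.
    timeout http-request 5s
    # Maximum time to wait for a connection to a backend to be established.
    timeout connect 5s
    # Maximum inactivity allowed on the client and server sides.
    timeout client 30s
    timeout server 10s
```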
Here we’ve set up some timeout values which seem sane. For example, if a remote client takes longer than 5 seconds to send its request, the connection will be closed.
Further documentation on the timeout options is available on the HAProxy website, along with other notes on connection-counting.
If you have a sufficiently recent version of HAProxy it can be configured to keep a running count of the connections initiated by remote IP addresses – protecting you from a single host attempting to open many connections.
The version of HAProxy available to Debian’s stable release, as a backport, doesn’t currently support this connection-tracking, but you can install a later 1.5 version via the haproxy.debian.net site.
With a suitably recent release of HAProxy the following definition will allow you to reject more than ten simultaneous connections from a single source:
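A sketch of such a definition, assuming HAProxy 1.5 or later. The table size, expiry time and front-end name are illustrative assumptions:

```haproxy
# Sketch only: requires stick-table connection tracking (HAProxy 1.5+).
frontend http-in
    bind *:80
    # Remember, per source IP, how many connections are currently open.
    stick-table type ip size 100k expire 30s store conn_cur
    # Refuse a new connection when the source already has ten open.
    tcp-request connection reject if { src_conn_cur ge 10 }
    # Otherwise start counting this connection against its source address.
    tcp-request connection track-sc1 src
    default_backend webservers
```

Because the reject rule is evaluated before the new connection is tracked, the counter it tests reflects only the connections that are already open.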
The final version of our sample configuration file would look like this, taking advantage of all the options we’ve covered so far.
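A sketch of what such a combined file could look like, pulling together the pieces discussed above. It assumes HAProxy 1.5 or later, and every name, address, hostname and numeric value is a placeholder to be replaced with your own:

```haproxy
# Sketch only: a combined example built from the options covered in this article.
global
    daemon
    maxconn 256

defaults
    mode http
    timeout http-request 5s
    timeout connect 5s
    timeout client 30s
    timeout server 10s

frontend http-in
    bind *:80
    # Gzip responses sent to clients.
    compression algo gzip
    compression type text/html text/plain text/css application/javascript
    # Limit each source IP to ten simultaneous connections.
    stick-table type ip size 100k expire 30s store conn_cur
    tcp-request connection reject if { src_conn_cur ge 10 }
    tcp-request connection track-sc1 src
    default_backend webservers

backend webservers
    balance roundrobin
    # Health check; example.com stands in for your real host name.
    option httpchk GET /check.php HTTP/1.1\r\nHost:\ example.com
    server web1 10.0.0.10:8080 weight 2 check
    server web2 10.0.0.20:8080 weight 2 check
    server web3 10.0.0.30:8080 weight 1 check
    server web4 10.0.0.40:8080 weight 1 check
```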
There is a lot more to HAProxy than this brief introduction has covered, such as the SSL support, but I hope it was useful regardless.