~jehiah/txloadbalancer/bugfix-1.2

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
~~~~~~~~~~~~~~
txLoadBalancer
~~~~~~~~~~~~~~

.. contents::
   :depth: 1

============
Introduction
============

This is a pure python TCP load balancer. It takes inbound TCP connections and
connects them to one of a number of backend servers.

txLoadBalancer is a fork of Anthony Baxter's PythonDirector. It removed all
threading and asyncore code and the admin UI with the Twisted-based analogs. It
also significantly reorganized the API and provided many new features (see
below).


=====
Usage
=====

    $ twistd -noy ./bin/txlb.tac

This will use the default configuration file in ./etc/config.xml; you can edit
the .tac file to point to the config you prefer. Be sure to edit the
config.xml file to properly reflect your services in need of load-balancing.

To enable the admin interface, your config file must have the admin section
defined, with the required fields. For an example, be sure to see
./etc/config.xml. For more details, please see the configuration information in
the ./doc directory.

If you are creating your own script and don't want to use txlb.tac, you can
import the application setup functions in txlb.application.

You can also "embed" a load-balancer in your Twisted application, though in
it's current state this is rather limited. For example, if you had started web
servers on three other hosts, your main application could lood balance them in
the following manner:

    from twisted.application import service

    from txlb import manager
    from txlb.model import HostMapper
    from txlb.schedulers import leastc
    from txlb.application.service import LoadBalancedService

    proxyServices = [
      HostMapper(proxy='127.0.0.1:8080', lbType=leastc, host='host1',
          address='10.0.0.1:80'),
      HostMapper(proxy='127.0.0.1:8080', lbType=leastc, host='host2',
          address='10.0.0.2:80'),
      HostMapper(proxy='127.0.0.1:8080', lbType=leastc, host='host3',
          address='10.0.0.3:80'),
    ]

    application = service.Application('Demo LB Service')
    pm = manager.proxyManagerFactory(proxyServices)
    lbs = LoadBalancedService(pm)
    lbs.setServiceParent(application)


========
Features
========

* It is a pure-Twisted TCP loadbalancer.

* Thanks to Twisted, it's async i/o based, so much less overhead than
  fork/thread based balancers.

* It has multiple scheduling algorithms (random, round robin, leastconns,
  weighted). If a server fails to answer, it's removed from the pool - the
  client that failed to connect gets transparently failed over to a new host.

* Provides an optional builtin webserver for a built-in admin UI.

* Seperate management timer services that perform such tasks as periodically
  readding failed hosts to the rotation, updated on-disk config files with
  changes made to the running server.

* A built-in SSH server for managing (and modifying) a running load-balancer
  instance.

* A Twisted API for adding a load-balancing service to your Twisted application
  without the need to run a separate load-balancer.

* The application uses an XML-based configuration file.


===========
Performance
===========

(This section is currently incomplete)

Duncan's notes from 2008, tested on a 2 CPU Sun Netra 240:

* a single apache instance on Solaris 10
  * starting threads:
  * max threads

* 2 load-balanced apache instances on Solaris 10

* twisted.web instances on Solaris 10 (same docroot)

* 2 load-balanced twisted.web instances

* PythonDirector 1.0.0 proxies for 2 load-balanced twisted.web instances

Anthony's original notes on performance:

* On my notebook, load balancing an apache on the same local ethernet
  (serving a static 18K text file) gets 155 connections per second and
  2850 kbytes/s throughput (apachebench -n 2000 -c 10). Connecting directly
  to the apache gets 180 conns/sec and 3400kbytes/s. So unless you're
  serving really really stupidly high hit rates it's unlikely to be
  pythondirector causing you difficulties. (Note that 155 connections/sec
  is 13 million hits per day...)

* Running purely over the loopback interface to a local apache seems to
  max out at around 350 conns/second.


=======
Changes
=======

From txLoadBalancer 1.1.1 to 1.2.0
----------------------------------

* Down-ed hosts are now re-checked before they are put back into rotation; only
  hosts with no connection problems are re-added.
* The use of setuptools is now optional.

From txLoadBalancer 1.0.1 to 1.1.0
----------------------------------

* Massive API changes: competely reorganzied the code base.
* Integrated patches from Apple's Calendar Server project.
* A new API for creating load-balanced services within a Twisted application
  (without the need to run a separate load-balancingn daemon).
* Added support for live interaction with load-balancer via SSH connection to
  running Python interpretter (Twisted manhole).
* The ability to start listening on a new port without restaring the
  application.
* Added a weighted load balance scheduler.

From txLoadBalancer 0.9.1 to 1.0.1
----------------------------------

* 100% Twisted: removed all threading and asyncore code completely.
* Significan API changes.
* Dropped the web API.

From PyDirector 1.0.0 to 1.1.1 (AKA txLoadBalancer 0.9.1)
---------------------------------------------------------

* Added support for Twisted, providing the option for all management, admin and
  load-balancing to utilize the Twisted reactor, skipping threading and asycore
  altogether.

From PyDirector 0.0.7 to 1.0.0
------------------------------

* Very few, mostly this is to update the project to 'stable' status.
* The networking code now uses twisted if available, and falls back
  to asyncore.

From PyDirector 0.0.6 to 0.0.7
------------------------------

* You can specify a hostname of '*' to the listen directive for both
  the scheduler and the administrative interface to mean 'listen on
  all interfaces'. Considerably more obvious than '0.0.0.0'. Thanks
  to Andrew Sydelko for the idea.
* New "leastconnsrr" scheduler - this is leastconns, with a roundrobin
  as well. Previously, leastconns would keep the list of hosts sorted,
  which often meant one system got beaten up pretty badly.
* Twisted backend group selection works again.
* The client address is now passed to the scheduler's getHost() method.
  This allows the creation of "sticky" schedulers, where a client is
  (by preference) sent to the same backend server. The factory function
  for schedulers will change to allow things like "roundrobin,sticky".

From PyDirector 0.0.5 to 0.0.6
------------------------------

* fixed an error in the (hopefully rare) case where all backend servers
  are down.
* the main script uses resource.setrlimit() to boost the number of open
  filedescriptors (solaris has stupidly low defaults)
* when all backend servers are down, the manager thread goes into a much
  more aggressive mode re-adding them.
* handle comments in the config file

From PyDirector 0.0.4 to 0.0.5
------------------------------

* bunch of bugfixes to the logging
* re-implemented the networking code using the 'twisted' framework; a simple
  loopback test with asyncore based pydir:

      Requests per second:    107.72
      Transfer rate:          2462.69 kb/s received

  the same test with twisted-based pydir:

      Requests per second:    197.90
      Transfer rate:          4519.69 kb/s received

From PyDirector 0.0.3 to 0.0.4
------------------------------

* can now specify more than one listener for a service
* 'client' in the config XML is now 'host'
* fixed a bug in leastconns and roundrobin scheduler if all backends
  were unavailable.
* whole lotta documentation added.
* running display in web api now shows count of total connections
* running display now has refresh and auto-refresh
* compareconf module - takes a running config and a new config and
  emits the web api commands needed to make the running config match
  the new config
* first cut at enabling https for web interface (needs m2crypto)

From PyDirector 0.0.2 to 0.0.3
------------------------------

* delHost hooked up
* running.xml added - XML dump of current config
* centralised logging - the various things that write logfile
  entries need to be made consistent, and a lot of additional
  logging needs to be added.
* Python2.1 compatibility fix: no socket.gaierror exception on 2.1

From PyDirector 0.0.1 to 0.0.2
------------------------------

* refactored web publishing (babybobo)
* package-ised and distutil-ised the code