The accumulation logic generates data points for times that are a
multiple of a step size. In other words, if the step size is 300
seconds, any data reported by the accumulation code will always be for
a timestamp that is a multiple of 300. The purpose of this behaviour
is to (a) limit the amount of data that is sent to the server and (b)
provide data in a predictable format to make server-side handling of
the data straightforward. A nice side effect of providing data at a
known step interval is that the server can detect black holes in the
data simply by testing for the absence of data points at step
boundaries.
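
As a rough illustration of the gap detection described above, the following sketch lists the step boundaries for which no data point was received. The helper name `missing_steps` and its arguments are hypothetical and are not part of the accumulation code; it only assumes that every reported timestamp is a multiple of the step size.

```python
def missing_steps(timestamps, start, end, step_size=300):
    """Return the step boundaries in [start, end] that have no data point.

    Hypothetical helper: assumes every reported timestamp is a multiple
    of step_size, which is what the accumulation logic guarantees.
    """
    seen = set(timestamps)
    # Round start up to the first step boundary at or after it.
    first = -(-start // step_size) * step_size
    return [t for t in range(first, end + 1, step_size) if t not in seen]


# Data points arrived for 300, 600 and 1200 only.
print(missing_steps([300, 600, 1200], 300, 1500))  # [900, 1500]
```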

Limiting the amount of data sent to the server and making the data
format predictable are both desirable attributes, but we need to
ensure the data reported is accurate. We can't rely on plugins to
report data exactly at step boundaries and, even if we could, we
wouldn't necessarily end up with data points that are representative
of the resource being monitored. We need a way to calculate a
representative data point from the set of data points that a plugin
provided during a step period.

Suppose we want to calculate data points for timestamps 300 and 600.
Assume a plugin runs at an interval less than 300 seconds to get
values to provide to the accumulator. Each value received by the
accumulator is used to update a data point that will be sent to the
server when we cross the step boundary. The algorithm computes a
time-weighted sum, updating the accumulated value as follows:

  (current time - previous time) * value + last accumulated value

If the 'last accumulated value' isn't available, it defaults to 0.
For example, consider these timestamp/load average measurements:
300/2.0, 375/3.0, 550/3.5 and 650/0.5. Also assume we have no data
prior to 300/2.0. This data would be processed as follows:

  Input    Calculation                 Accumulated Value
  -----    -----------                 -----------------
  300/2.0  (300 - 300) * 2.0 + 0       0.0
  375/3.0  (375 - 300) * 3.0 + 0.0     225.0
  550/3.5  (550 - 375) * 3.5 + 225.0   837.5
  650/0.5  (600 - 550) * 0.5 + 837.5   862.5
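
The rows of the table above that stay inside one step period can be reproduced with a few lines of Python. This is only a sketch of the update rule; the function name `update` is hypothetical and step-boundary handling is deliberately left out.

```python
def update(previous_time, accumulated, current_time, value):
    # (current time - previous time) * value + last accumulated value
    return (current_time - previous_time) * value + accumulated


# No data prior to 300/2.0, so the accumulated value starts at 0.
previous_time, accumulated = 300, 0.0
for current_time, value in [(300, 2.0), (375, 3.0), (550, 3.5)]:
    accumulated = update(previous_time, accumulated, current_time, value)
    previous_time = current_time
    print(current_time, accumulated)  # 300 0.0, then 375 225.0, then 550 837.5
```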

Notice that the last value crosses a step boundary; the calculation
for this value is:

  (step boundary time - previous time) * value + last accumulated value

This yields the final accumulated value for the step period we've just
traversed. The data point sent to the server is generated using the
following calculation:

  accumulated value / step interval size

The data point sent to the server in our example would be:

  862.5 / 300 = 2.875

This value is representative of the activity that actually occurred
and is returned to the plugin to queue for delivery to the server.
The accumulated value for the next interval is calculated using the
portion of time that crossed into the new step period:

  Input    Calculation             Accumulated Value
  -----    -----------             -----------------
  650/0.5  (650 - 600) * 0.5 + 0   25.0

And so the logic goes, continuing in a similar fashion, yielding
representative data at each step boundary.

class Accumulator(object):

    def __init__(self, persist, step_size):
        self._persist = persist
        self._step_size = step_size

    def __call__(self, new_timestamp, new_free_space, key):
        # Load the state saved by the previous call for this key.
        previous_timestamp, accumulated_value = self._persist.get(key, (0, 0))
        accumulated_value, step_data = \
            accumulate(previous_timestamp, accumulated_value,
                       new_timestamp, new_free_space, self._step_size)
        self._persist.set(key, (new_timestamp, accumulated_value))
        # A (step_boundary, value) tuple, or None if no boundary was crossed.
        return step_data


def accumulate(previous_timestamp, accumulated_value,
               new_timestamp, new_value,
               step_size):
    previous_step = previous_timestamp // step_size
    new_step = new_timestamp // step_size
    step_boundary = new_step * step_size
    step_diff = new_step - previous_step
    step_data = None

    if step_diff == 0:
        # Still inside the same step period: keep accumulating.
        diff = new_timestamp - previous_timestamp
        accumulated_value += diff * new_value
    elif step_diff == 1:
        # Crossed one step boundary: close the period and emit a data point.
        diff = step_boundary - previous_timestamp
        accumulated_value += diff * new_value
        step_value = float(accumulated_value) / step_size
        step_data = (step_boundary, step_value)
        # Start the next period with the time past the boundary.
        diff = new_timestamp - step_boundary
        accumulated_value = diff * new_value
    else:
        # More than one step has passed: emit nothing and start over.
        diff = new_timestamp - step_boundary
        accumulated_value = diff * new_value

    return accumulated_value, step_data
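
The following usage sketch feeds the measurements from the worked example through an Accumulator. `DictPersist` is a hypothetical in-memory stand-in that offers only the `get(key, default)` and `set(key, value)` calls used above; the real persist object is not shown in this file. The function and class are restated so that the sketch runs on its own, and the store is seeded with `(300, 0)` to mirror the assumption that no data exists prior to 300/2.0.

```python
class DictPersist(object):
    """Hypothetical in-memory stand-in for the persist object."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


def accumulate(previous_timestamp, accumulated_value,
               new_timestamp, new_value, step_size):
    # Restated from the listing above so the sketch is self-contained.
    previous_step = previous_timestamp // step_size
    new_step = new_timestamp // step_size
    step_boundary = new_step * step_size
    step_diff = new_step - previous_step
    step_data = None
    if step_diff == 0:
        accumulated_value += (new_timestamp - previous_timestamp) * new_value
    elif step_diff == 1:
        accumulated_value += (step_boundary - previous_timestamp) * new_value
        step_data = (step_boundary, float(accumulated_value) / step_size)
        accumulated_value = (new_timestamp - step_boundary) * new_value
    else:
        accumulated_value = (new_timestamp - step_boundary) * new_value
    return accumulated_value, step_data


class Accumulator(object):

    def __init__(self, persist, step_size):
        self._persist = persist
        self._step_size = step_size

    def __call__(self, new_timestamp, new_free_space, key):
        previous_timestamp, accumulated_value = self._persist.get(key, (0, 0))
        accumulated_value, step_data = accumulate(
            previous_timestamp, accumulated_value,
            new_timestamp, new_free_space, self._step_size)
        self._persist.set(key, (new_timestamp, accumulated_value))
        return step_data


persist = DictPersist()
persist.set("load", (300, 0))  # assumption: no data prior to 300/2.0
accumulator = Accumulator(persist, 300)

samples = [(300, 2.0), (375, 3.0), (550, 3.5), (650, 0.5)]
results = [accumulator(timestamp, value, "load")
           for timestamp, value in samples]
print(results)              # [None, None, None, (600, 2.875)]
print(persist.get("load"))  # (650, 25.0)
```

Only the last sample returns a data point, because it is the only one that crosses a step boundary; the state left in the store is the 25.0 carried into the next step period.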