~graphite-dev/graphite/1.1

Viewing changes to docs/config-carbon.rst

Committer: Chris Davis
Date: 2011-08-08 03:42:23 UTC
mfrom: (337.5.162 graphite)
Revision ID: chrismd@gmail.com-20110808034223-lrvejc3tfmyn6o9g

cherrypicking trunk revs 500-507

files modified:
carbon/conf/carbon.conf.example

carbon/lib/carbon/conf.py

docs/config-carbon.rst

webapp/graphite/render/functions.py

whisper/whisper.py

docs/config-carbon.rst

--------------------

This file defines how much data to store, and at what precision.

Important notes before continuing:

* There can be many sections in this file.

* Each section must have a header in square brackets, a pattern and a retentions line.

* The sections are applied in order from the top (first) to the bottom (last).

* The patterns are regular expressions, as opposed to the wildcards used in the URL API.

* The first pattern that matches the metric name is used.

* Retentions are set at the time the first datapoint for a metric is sent.

* Changing this file will not affect .wsp files already created on disk. Use whisper-resize.py to change those.

* There are two very different ways to specify retentions. We will show the new, easier way first, and the old, more difficult way second for historical purposes.

Here's an example:

.. code-block:: none

   [garbage_collection]
   pattern = garbageCollections$
   retentions = 10s:14d

Here's a more complicated example with multiple retention rates:

.. code-block:: none

   [apache_busyWorkers]
   pattern = servers\.www.*\.workers\.busyWorkers$
   retentions = 15s:7d,1min:21d,15min:5y

The pattern matches metrics from servers whose names start with 'www', followed by anything, and whose names end in '.workers.busyWorkers'. This way, only those metrics get this type of retention, rather than every metric associated with your webservers.
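The "first pattern that matches wins" rule can be sketched in a few lines of Python. This is only an illustration, not carbon's actual code: the ``pick_schema`` helper and the metric names in the usage note are hypothetical, and it assumes patterns are applied as an unanchored regular-expression search.

```python
import re

# Sections in file order: (name, pattern, retentions), copied from the examples above.
SCHEMAS = [
    ("garbage_collection", r"garbageCollections$", "10s:14d"),
    ("apache_busyWorkers", r"servers\.www.*\.workers\.busyWorkers$",
     "15s:7d,1min:21d,15min:5y"),
]


def pick_schema(metric, schemas=SCHEMAS):
    """Return (name, retentions) of the first section whose pattern matches, else None."""
    for name, pattern, retentions in schemas:
        # Assumption: an unanchored search, so a pattern may match anywhere in the name.
        if re.search(pattern, metric):
            return name, retentions
    return None
```

With these two sections, a hypothetical metric such as ``servers.www01.workers.busyWorkers`` would pick up the ``apache_busyWorkers`` retentions, while ``servers.db01.workers.busyWorkers`` would match neither section.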

As you can see there are multiple retentions. Each is used in the order that it is provided. As a general rule, they should be ordered from most-precise:shortest-duration to least-precise:longest-duration. Retentions are merely a way to save you disk space and decrease I/O for graphs that span a long period of time. When data moves from a higher precision to a lower precision, it is **averaged**. This way, you can still find the **total** for a particular time period if you know the original precision.

Example: You store the number of sales per minute for 1 year, and the sales per hour for 5 years after that. You need to know the total sales for January 1st of the year before. You can query whisper for the raw data, and you'll get 24 datapoints, one for each hour. They will most likely be floating point numbers. You can take each datapoint, multiply by 60 (the ratio of high-precision to low-precision datapoints) and still get the total sales per hour.
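The arithmetic in this example can be checked with a small Python sketch. The sales figures are made up, and the sketch assumes every minute in the hour actually had a datapoint:

```python
# Hypothetical per-minute sales counts for one hour (60 datapoints).
per_minute = [3, 0, 5, 2] * 15

# Rolling up to hourly precision stores the average of the 60 values.
hourly_average = sum(per_minute) / len(per_minute)

# Multiplying by the ratio of datapoints (60 minutes per hour) recovers the total.
recovered_total = hourly_average * 60

print(hourly_average, recovered_total, sum(per_minute))
```

The stored hourly value is a floating point average, but multiplying it by 60 gives back the same total as summing the original per-minute values.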

The old retention format was specified as follows:

.. code-block:: none

   retentions = 60:1440
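To relate the two styles, the sketch below converts a new-style retention string into the old form, reading the old form as seconds-per-datapoint followed by the number of datapoints to store. The helper names and the unit table are assumptions made for illustration (in particular, a year is taken to be 365 days); this is not the parser carbon itself uses.

```python
# Seconds per unit. Assumption: a year counts as 365 days.
UNIT_SECONDS = {"s": 1, "min": 60, "h": 3600, "d": 86400, "y": 86400 * 365}


def to_seconds(value):
    """Convert a string such as '15s' or '1min' to a number of seconds."""
    # Try longer suffixes first so that 'min' is checked before one-letter units.
    for unit in sorted(UNIT_SECONDS, key=len, reverse=True):
        if value.endswith(unit):
            return int(value[: -len(unit)]) * UNIT_SECONDS[unit]
    raise ValueError("unknown unit in %r" % value)


def to_old_style(retentions):
    """Turn 'precision:duration' pairs into 'seconds_per_point:points' pairs."""
    out = []
    for archive in retentions.split(","):
        precision, duration = archive.strip().split(":")
        step = to_seconds(precision)
        out.append("%d:%d" % (step, to_seconds(duration) // step))
    return ",".join(out)


print(to_old_style("1min:1d"))
```

Under these assumptions, ``1min:1d`` corresponds to ``60:1440``: one datapoint every 60 seconds, 1440 datapoints in total.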

'output_template' filling in any captured fields from 'input_pattern'.

For example, if your metric naming scheme is:

.. code-block:: none

   <env>.applications.<app>.<server>.<metric>

aggregate metric 'prod.applications.apache.all.requests' would be calculated by summing their values.
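The idea of filling an 'output_template' from fields captured by an 'input_pattern' and then summing can be sketched as follows. This is a rough illustration only: the rule syntax is not shown in this excerpt, so the pattern and template strings, the helper names, and the server names ``www01`` and ``www02`` are all hypothetical, and the sketch ignores the time window over which values are buffered.

```python
import re
from collections import defaultdict


def pattern_to_regex(input_pattern):
    """Translate '<field>' placeholders to named groups and '*' to a one-segment wildcard."""
    parts = []
    for token in re.split(r"(<\w+>|\*)", input_pattern):
        if token == "*":
            parts.append(r"[^.]+")
        elif token.startswith("<") and token.endswith(">"):
            parts.append(r"(?P<%s>[^.]+)" % token[1:-1])
        else:
            parts.append(re.escape(token))
    return re.compile("^" + "".join(parts) + "$")


def aggregate_sum(datapoints, input_pattern, output_template):
    """Sum the values of matching metrics under the name built from the template."""
    regex = pattern_to_regex(input_pattern)
    totals = defaultdict(float)
    for metric, value in datapoints:
        match = regex.match(metric)
        if match:
            name = output_template
            # Fill each '<field>' in the template with the text captured from the metric.
            for field, text in match.groupdict().items():
                name = name.replace("<%s>" % field, text)
            totals[name] += value
    return dict(totals)


points = [
    ("prod.applications.apache.www01.requests", 10),
    ("prod.applications.apache.www02.requests", 15),
    ("prod.applications.apache.www01.latency", 3),
]
result = aggregate_sum(
    points,
    "<env>.applications.<app>.*.requests",
    "<env>.applications.<app>.all.requests",
)
print(result)
```

Here the two ``requests`` metrics are summed under 'prod.applications.apache.all.requests', and the ``latency`` metric is ignored because it does not match the input pattern.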