1
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
3
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
4
<title>The SQLite R*Tree Module</title>
5
<style type="text/css">
8
font-family: Verdana, sans-serif;
13
a:visited { color: #734559 }
15
.logo { position:absolute; margin:3px; }
31
.toolbar a { color: white; text-decoration: none; padding: 6px 12px; }
32
.toolbar a:visited { color: white; }
33
.toolbar a:hover { color: #044a64; background: white; }
35
.content { margin: 5%; }
36
.content dt { font-weight:bold; }
37
.content dd { margin-bottom: 25px; margin-left:20%; }
38
.content ul { padding:0px; padding-left: 15px; margin:0px; }
41
.se { background: url(images/se.gif) 100% 100% no-repeat #044a64}
42
.sw { background: url(images/sw.gif) 0% 100% no-repeat }
43
.ne { background: url(images/ne.gif) 100% 0% no-repeat }
44
.nw { background: url(images/nw.gif) 0% 0% no-repeat }
46
/* Things for "fancyformat" documents start here. */
47
.fancy img+p {font-style:italic}
48
.fancy .codeblock i { color: darkblue; }
49
.fancy h1,.fancy h2,.fancy h3,.fancy h4 {font-weight:normal;color:#044a64}
50
.fancy h2 { margin-left: 10px }
51
.fancy h3 { margin-left: 20px }
52
.fancy h4 { margin-left: 30px }
53
.fancy th {white-space:nowrap;text-align:left;border-bottom:solid 1px #444}
54
.fancy th, .fancy td {padding: 0.2em 1ex; vertical-align:top}
55
.fancy #toc a { color: darkblue ; text-decoration: none }
56
.fancy .todo { color: #AA3333 ; font-style : italic }
57
.fancy .todo:before { content: 'TODO:' }
58
.fancy p.todo { border: solid #AA3333 1px; padding: 1ex }
59
.fancy img { display:block; }
60
.fancy :link:hover, .fancy :visited:hover { background: wheat }
61
.fancy p,.fancy ul,.fancy ol { margin: 1em 5ex }
62
.fancy li p { margin: 1em 0 }
63
/* End of "fancyformat" specific rules. */
69
<div><!-- container div to satisfy validator -->
72
<img class="logo" src="images/sqlite370_banner.gif" alt="SQLite Logo"
74
<div><!-- IE hack to prevent disappearing logo--></div>
75
<div class="tagline">Small. Fast. Reliable.<br>Choose any three.</div>
77
<table width=100% style="clear:both"><tr><td>
78
<div class="se"><div class="sw"><div class="ne"><div class="nw">
79
<table width=100% style="padding:0;margin:0;cell-spacing:0"><tr>
82
<a href="about.html">About</a>
83
<a href="sitemap.html">Sitemap</a>
84
<a href="docs.html">Documentation</a>
85
<a href="download.html">Download</a>
86
<a href="copyright.html">License</a>
87
<a href="news.html">News</a>
88
<a href="support.html">Support</a>
91
gMsg = "Search SQLite Docs..."
92
function entersearch() {
93
var q = document.getElementById("q");
94
if( q.value == gMsg ) { q.value = "" }
95
q.style.color = "black"
96
q.style.fontStyle = "normal"
98
function leavesearch() {
99
var q = document.getElementById("q");
100
if( q.value == "" ) {
102
q.style.color = "#044a64"
103
q.style.fontStyle = "italic"
108
<div style="padding:0 1em 0px 0;white-space:nowrap">
109
<form name=f method="GET" action="http://www.sqlite.org/search">
110
<input id=q name=q type=text
111
onfocus="entersearch()" onblur="leavesearch()" style="width:24ex;padding:1px 1ex; border:solid white 1px; font-size:0.9em ; font-style:italic;color:#044a64;" value="Search SQLite Docs...">
112
<input type=submit value="Go" style="border:solid white 1px;background-color:#044a64;color:white;font-size:0.9em;padding:0 1ex">
116
</div></div></div></div>
118
<div class=startsearch></div>
122
<h1 align="center">The SQLite R*Tree Module</h1>
124
<h2>1.0 Overview</h2>
127
An <a href="http://en.wikipedia.org/wiki/R-tree">R-Tree</a> is a special
128
index that is designed for doing range queries. R-Trees are most commonly
129
used in geospatial systems where each entry is a rectangle with minimum and
130
maximum X and Y coordinates. Given a query rectangle, an R-Tree is able
131
to quickly find all entries that are contained within the query rectangle
132
or which overlap the query rectangle. This idea is easily extended to
133
three dimensions for use in CAD systems. R-Trees also find use in time-domain
134
range look-ups. For example, suppose a database records the starting and
135
ending times for a large number of events. A R-Tree is able to quickly
136
find all events, for example, that were active at any time during a given
137
time interval, or all events that started during a particular time interval,
138
or all events that both started and ended within a given time interval.
143
The R-Tree concept originated with
144
<a href="http://www.baymoon.com/~tg2/">Toni Guttman</a>:
145
<em>R-Trees: A Dynamic Index Structure for Spatial Searching</em>,
146
Proc. 1984 ACM SIGMOD International Conference on Management of Data,
148
The implementation found in SQLite is a refinement of Guttman's original
149
idea, commonly called "R*Trees", that was described by
150
Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider, Bernhard Seeger:
151
<em>The R*-Tree: An Efficient and Robust Access Method for Points
152
and Rectangles.</em> SIGMOD Conference 1990: 322-331.
155
<h2>2.0 Compiling The R*Tree Module</h2>
158
The source code to the SQLite R*Tree module is included as part
159
of the <a href="amalgamation.html">amalgamation</a> but is disabled by default. To enable the
160
R*Tree module, simply compile with the <a href="compile.html#enable_rtree">SQLITE_ENABLE_RTREE</a>
161
C-preprocessor macro defined. With many compilers, this is accomplished
162
by adding the option "-DSQLITE_ENABLE_RTREE=1" to the compiler
166
<h2>3.0 Using the R*Tree Module</h2>
169
The SQLite R*Tree module is implemented as a
170
<a href="c3ref/create_module.html">virtual table</a>. Each R*Tree index is a
171
virtual table with an odd number of columns between 3 and 11.
172
The first column is always a 64-bit signed integer primary key.
173
The other columns are minimum- and maximum-value pairs (stored as
174
32-bit floating point numbers) for each
175
dimension. A 1-dimensional R*Tree thus has 3 columns.
176
A 2-dimensional R*Tree (the most common case) has 5 columns.
177
A 5-dimensional R*Tree has 11 columns. The SQLite R*Tree implementation
178
does not support R*Trees wider than 5 dimensions.
182
The first column of an SQLite R*Tree must always be an integer
184
The min/max-value pair columns are always stored as
185
32-bit floating point values. Unlike regular SQLite tables which
186
can store data in a variety of datatypes and formats, the R*Tree
187
indices rigidly enforce these two storage types. Attempts to insert
188
something other than an integer into the first column, or something
189
other than a floating point value into the other columns, will result
193
<h3>3.1 Creating An R*Tree Index</h3>
196
A new R*Tree index is created as follows:
200
CREATE VIRTUAL TABLE <em><name></em> USING rtree(<em><column-names></em>);
204
The <em><name></em> is the name your application chooses for the
205
R*Tree index and <em><column-names></em> is a comma separated list
206
of between 3 and 11 columns.
207
The virtual <name> table creates three "shadow" tables to actually
208
store its content. The names of these shadow tables are:
212
<em><name></em><strong>_node</strong><br>
213
<em><name></em><strong>_rowid</strong><br>
214
<em><name></em><strong>_parent</strong>
218
The shadow tables are ordinary SQLite data tables. You can query them
219
directly if you like, though this unlikely to reveal anything particularly
221
And you can <a href="lang_update.html">UPDATE</a>, <a href="lang_delete.html">DELETE</a>, <a href="lang_insert.html">INSERT</a> or even <a href="lang_droptable.html">DROP</a>
222
the shadow tables, though doing so will corrupt your R*Tree index.
223
So it is best to simply ignore the shadow tables. Recognize that they
224
are there to hold your R*Tree index information and let it go as that.
228
As an example, consider creating a two-dimensional R*Tree index for use in
233
CREATE VIRTUAL TABLE demo_index USING rtree(
234
id, -- Integer primary key
235
minX, maxX, -- Minimum and maximum X coordinate
236
minY, maxY -- Minimum and maximum Y coordinate
240
<h3>3.2 Populating An R*Tree Index</h3>
243
The usual <a href="lang_insert.html">INSERT</a>, <a href="lang_update.html">UPDATE</a>, and <a href="lang_delete.html">DELETE</a> commands work on an R*Tree
244
index just like on regular tables. So to insert some data into our sample
245
R*Tree index, we can do something like this:
249
INSERT INTO demo_index VALUES(
251
-80.7749, -80.7747, -- Longitude range
252
30.3776, 30.3778 -- Latitude range
254
INSERT INTO demo_index VALUES(
262
The entries above might represent (for example) a bounding box around
263
the offices for SQLite.org and bounding box around the
264
12th Congressional District of North Carolina in which SQLite.org is located.
267
<h3>3.3 Querying An R*Tree Index</h3>
270
Any valid query will work against an R*Tree index. But the R*Tree
271
implementation is designed to make two kinds of queries especially
272
efficient. First, queries against the primary key are efficient:
276
SELECT * FROM demo_index WHERE id=1;
280
Of course, an ordinary SQLite table will do a query against its
281
integer primary key efficiently, so the previous is no big deal.
282
The real reason for using an R*Tree in the first place is so that
283
you can efficiently do inequality queries against the coordinate
284
ranges. To find all elements of the index that are contained within
285
the vicinity of Charlotte, North Carolina, one might do:
289
SELECT id FROM demo_index
290
WHERE minX>=-81.08 AND maxX<=-80.58
291
AND minY>=35.00 AND maxY<=35.44;
295
The query above would very quickly locate the id of 1 even if the
296
R*Tree contained millions of entries. The previous is an example
297
of a "contained-within" query. The R*Tree also supports "overlapping"
298
queries. For example, to find all bounding boxes that overlap the
303
SELECT id FROM demo_index
304
WHERE maxX>=-81.08 AND minX<=-80.58
305
AND maxY>=35.00 AND minY<=35.44;
309
This second query would find both entry 1 (the SQLite.org office) which
310
is entirely contained within the query box and also
311
the 12th Congressional District which extends well outside the
312
query box but still overlaps the query box.
316
Note that is not necessary for all coordinates in an R*Tree index
317
to be constrained in order for the index search to be efficient.
318
One might, for example, want to query all objects that overlap with
323
SELECT id FROM demo_index
324
WHERE maxY>=35.0 AND minY<=35.0;
328
But, generally speaking, the more constraints that the R*Tree module
329
has to work with, and the smaller the bounding box, the faster the
330
results will come back.
333
<h3>3.3 Roundoff Error</h3>
336
By default, coordinates are stored in an R*Tree using 32-bit floating
337
point values. When a coordinate cannot be exactly represented by a
338
32-bit floating point number, the lower-bound coordinates are rounded down
339
and the upper-bound coordinates are rounded up. Thus, bounding boxes might
340
be slightly larger than specified, but will never be any smaller. This
341
is exactly what is desired for doing the more common "overlapping" queries
342
where the application wants to find every entry in the R*Tree that overlaps
343
a query bounding box. Rounding the entry bounding boxes outward might cause a
344
few extra entries to appears in an overlapping query if the edge of the
345
entry bounding box corresponds to an edge of the query bounding box. But
346
the overlapping query will never miss a valid table entry.
348
<p>However, for a "contained-within" style query, rounding the bounding
349
boxes outward might cause some entries to be excluded from the result set
350
if the edge of the entry bounding box corresponds to the edge of the query
351
bounding box. To guard against this, applications should expand their
352
contained-within query boxes slightly (by 0.000012%) by rounding down the
353
lower coordinates and rounding up the top coordinates, in each dimension.
355
<h2>4.0 Using R*Trees Effectively</h2>
358
The only information that an R*Tree index stores about an object is
359
its integer ID and its bounding box. Additional information needs to
360
be stored in separate tables and related to the R*Tree index using
361
the primary key. For the example above, one might create an auxiliary
366
CREATE TABLE demo_data(
367
id INTEGER PRIMARY KEY, -- primary key
368
objname TEXT, -- name of the object
369
objtype TEXT, -- object type
370
boundary BLOB -- detailed boundary of object
375
In this example, the demo_data.boundary field is intended to hold some
376
kind of binary representation of the precise boundaries of the object.
377
The R*Tree index only holds an axis-aligned rectangular boundary for the
378
object. The R*Tree boundary is just an approximation of the true object
379
boundary. So what typically happens is that the R*Tree index is used to
380
narrow a search down to a list of candidate objects and then more detailed
381
and expensive computations are done on each candidate to find if the
382
candidate truly meets the search criteria.
386
<strong>Key Point:</strong>
387
An R*Tree index does not normally provide the exact answer but merely
388
reduces the set of potential answers from millions to dozens.
392
Suppose the demo_data.boundary field holds some proprietary data description
393
of a complex two-dimensional boundary for an object and suppose that the
394
application has used the <a href="c3ref/create_function.html">sqlite3_create_function()</a> interface to
395
created application-defined functions "contained_in" and
396
"overlaps" accepting two demo_data.boundary objects and return true or false.
397
One may assume that "contained_in" and "overlaps" are relatively slow
398
functions that we do not want to invoke too frequently.
399
Then an efficient way to find the name of all objects located within
400
the North Carolina 12th District, one may be to run a query like this:
404
SELECT objname FROM demo_data, demo_index
405
WHERE demo_data.id=demo_index.id
406
AND contained_in(demo_data.boundary, :boundary)
407
AND minX>=-81.0 AND maxX<=-79.6
408
AND minY>=35.0 AND maxY<=36.2;
411
<p>In the query above, one would presumably bind the binary BLOB
412
description of the precise boundary of the 12th district to the
413
":boundary" parameter.</p>
415
<p>Notice how the query above works: The R*Tree index runs in the outer
416
loop to find entries that are contained within the bounding box
417
of longitude -81..-79.6 and latitude 35.0..36.2.
418
For each object identifier found, SQLite looks up
419
the corresponding entry in the demo_data table. It then uses the boundary
420
field from the demo_data table as a parameter to the contained_in()
421
function and if that function returns true, the objname field from
422
the demo_data table is returned as the next row of query result.</p>
424
<p>One would get the same answer without the use of the R*Tree index
425
using the following simpler query:</p>
428
SELECT objname FROM demo_data
429
WHERE contained_in(demo_data.boundary, :boundary);
432
<p>The problem with this latter query is that it must apply the
433
contained_in() function to millions of entries in the demo_data table.
434
The use of the R*Tree in the penultimate query reduces the number of
435
calls to contained_in() function to a small subset of the entire table.
436
The R*Tree index did not find the exact answer itself, it merely
437
limited the search space.</p>
439
<a name="intrtree"></a>
441
<h2>5.0 Integer-Valued R-Trees</h2>
444
The default virtual table ("rtree") normally stores coordinates as
445
single-precision (4-byte) floating point numbers. If integer coordinates
446
are desired, declare the table using "rtree_i32" instead:
449
CREATE VIRTUAL TABLE intrtree USING rtree_i32(id,x0,x1,y0,y1,z0,y1);
453
An rtree_i32 stores coordinates as 32-bit signed integers. But it still
454
using floating point computations internally as part of the r-tree algorithm.
455
For applications running on processors without hardware floating point,
456
it might be desirable to have a pure integer implementation of r-trees.
457
This is accomplished by compiling with the <a href="compile.html#rtree_int_only">SQLITE_RTREE_INT_ONLY</a> option.
458
When SQLITE_RTREE_INT_ONLY is used, both the "rtree" and the "rtree_i32"
459
virtual tables store 32-bit integers and only integer values are used for
460
internal computations.
462
<a name="customquery"></a>
464
<h2>6.0 Custom R-Tree Queries</h2>
466
<p>By using standard SQL expressions in the WHERE clause of a SELECT query,
467
a user may query for all r-tree entries that intersect a specified
468
bounding-box, or for all entries that are completely encapsulated by a
469
specified bounding-box. Custom r-tree queries, which use the special MATCH
470
operator in the WHERE clause of a SELECT, allow the user to query for the set
471
of r-tree entries that intersect any arbitrarily defined region.
473
<p>Regions for custom r-tree queries are defined by r-tree geometry callbacks
474
implemented by the application and registered with SQLite via a call to the
478
int sqlite3_rtree_geometry_callback(
481
int (*xGeom)(sqlite3_rtree_geometry *, int nCoord, double *aCoord, int *pRes),
486
<p>A call to the above API registers an r-tree geometry callback named zGeom.
487
If an r-tree geometry callback or ordinary SQL user function named zGeom
488
already exists when sqlite3_rtree_geometry_callback() is called, it is replaced
489
by the new r-tree geometry callback. If the xGeom parameter is passed a NULL
490
value, then any existing r-tree geometry callback or SQL user function is
491
removed from the system, but no new r-tree geometry callback is registered.
493
<p>When the r-tree geometry callback is used in a custom r-tree query, the
494
registered callback is invoked one or more times by SQLite to test whether or
495
not the user-defined region intersects with specific bounding boxes.
496
The bounding box being tested is defined by the contents of the aCoord[]
497
array (size nCoord) passed to the callback. The aCoord[] array always contains
498
the same number of entries as there are coordinate columns in the r-tree table
499
(one less than the total number of columns, since the object id column does not
500
contain a coordinate). They define a bounding-box in the same way as each row
501
of the r-tree table itself does - the first scalar coordinate contains the
502
minimum value for the first dimension, followed by the maximum value for the
503
first dimension, followed by the minimum value for the second dimension, and so
504
on. If the specified bounding box intersects with the custom query region, then
505
the implementation of the callback must set the output parameter *pRes to
506
non-zero and return SQLITE_OK. If the specified bounding box does not intersect
507
the queried region, *pRes should be set to zero before returning SQLITE_OK. If
508
an error occurs, the callback may return an SQLite error code other than
509
SQLITE_OK, in which case the value of *pRes is ignored by SQLite and the query
512
<p>A registered r-tree geometry callback is used in an r-tree query by adding
513
a MATCH condition to the WHERE clause of a SELECT statement. For example,
514
assuming a custom geometry callback named "circle" has been registered, it may
515
be used in a query on the two-dimensional r-tree table "demo_index" defined in
516
earlier examples as follows:
519
SELECT * FROM demo_index WHERE id MATCH circle(45.3, 22.9, 5.0)
522
<p>The left-hand side of the MATCH operator may be any column from the r-tree
523
table, including the object id column. It makes no difference which column is
524
used. The right-hand side of the MATCH operator is passed the results of an
525
SQL function with the same name as the r-tree geometry callback. Zero or more
526
function parameters may be specified by the user. Parameters are always
527
interpreted as 64-bit real values. If a text, integer or blob value is passed
528
as a parameter to an r-tree geometry callback function, it is converted to
529
a real value as if by a <a href="lang_expr.html#castexpr">CAST expression</a>. If an SQL NULL value is passed to
530
an r-tree geometry callback function, it is converted to the value 0.0.
532
<p>Parameters passed to r-tree geometry callback functions may be used by the
533
implementation of the r-tree geometry callback to define the specified
534
region in any way. For example, the three parameters passed to the "circle"
535
geometry callback above could identify the center point and radius of the
536
circular region of interest.
538
<p>The first argument to each invocation of an r-tree geometry callback is
539
a pointer to a structure of the following type. The contents of the structure
540
are not modified between multiple calls to the r-tree geometry callback
541
associated with a single query (unless the pUser or xDelUser member variables
542
are modified by the callback implementation - see below).
545
typedef struct sqlite3_rtree_geometry sqlite3_rtree_geometry;
546
struct sqlite3_rtree_geometry {
547
void *pContext; /* Copy of pContext passed to s_r_g_c() */
548
int nParam; /* Size of array aParam */
549
double *aParam; /* Parameters passed to SQL geom function */
550
void *pUser; /* Callback implementation user data */
551
void (*xDelUser)(void *); /* Called by SQLite to clean up pUser */
555
<p>The pContext member of the structure is always set to a copy of the pContext
556
argument passed to sqlite3_rtree_geometry_callback() when the r-tree geometry
557
callback is registered. The aParam[] array (size nParam) contains the parameter
558
values passed to the r-tree geometry callback as part of the SQL query. In
559
the example "circle" query above, nParam would be set to 3 and the aParam[]
560
array would contain the three values 45.3, 22.9 and 5.0.
562
<p>The pUser and xDelUser members of the sqlite3_rtree_geometry structure are
563
initially set to NULL. The pUser variable may be set by the callback
564
implementation to any arbitrary value that may be useful to subsequent
565
invocations of the callback within the same custom r-tree query (for example, a
566
pointer to a complicated data structure used to test for region intersection).
567
If the xDelUser variable is set to a non-NULL value, then after the custom
568
r-tree query has finished running SQLite automatically invokes it with the
569
value of the pUser variable as the only argument. In other words, xDelUser
570
may be set to a destructor function for the pUser value.
572
<p>Example code implementing the "circle" r-tree geometry callback may be
574
<a href=http://www.sqlite.org/src/doc/trunk/src/test_rtree.c>test_rtree.c</a>.