1
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
3
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
4
<title>No Title</title>
5
<style type="text/css">
8
font-family: Verdana, sans-serif;
13
a:visited { color: #734559 }
15
.logo { position:absolute; margin:3px; }
31
.toolbar a { color: white; text-decoration: none; padding: 6px 12px; }
32
.toolbar a:visited { color: white; }
33
.toolbar a:hover { color: #044a64; background: white; }
35
.content { margin: 5%; }
36
.content dt { font-weight:bold; }
37
.content dd { margin-bottom: 25px; margin-left:20%; }
38
.content ul { padding:0px; padding-left: 15px; margin:0px; }
41
.se { background: url(images/se.gif) 100% 100% no-repeat #044a64}
42
.sw { background: url(images/sw.gif) 0% 100% no-repeat }
43
.ne { background: url(images/ne.gif) 100% 0% no-repeat }
44
.nw { background: url(images/nw.gif) 0% 0% no-repeat }
46
/* Things for "fancyformat" documents start here. */
47
.fancy img+p {font-style:italic}
48
.fancy .codeblock i { color: darkblue; }
49
.fancy h1,.fancy h2,.fancy h3,.fancy h4 {font-weight:normal;color:#044a64}
50
.fancy h2 { margin-left: 10px }
51
.fancy h3 { margin-left: 20px }
52
.fancy h4 { margin-left: 30px }
53
.fancy th {white-space:nowrap;text-align:left;border-bottom:solid 1px #444}
54
.fancy th, .fancy td {padding: 0.2em 1ex; vertical-align:top}
55
.fancy #toc a { color: darkblue ; text-decoration: none }
56
.fancy .todo { color: #AA3333 ; font-style : italic }
57
.fancy .todo:before { content: 'TODO:' }
58
.fancy p.todo { border: solid #AA3333 1px; padding: 1ex }
59
.fancy img { display:block; }
60
.fancy :link:hover, .fancy :visited:hover { background: wheat }
61
.fancy p,.fancy ul,.fancy ol { margin: 1em 5ex }
62
.fancy li p { margin: 1em 0 }
63
/* End of "fancyformat" specific rules. */
69
<div><!-- container div to satisfy validator -->
72
<img class="logo" src="images/sqlite370_banner.gif" alt="SQLite Logo"
74
<div><!-- IE hack to prevent disappearing logo--></div>
75
<div class="tagline">Small. Fast. Reliable.<br>Choose any three.</div>
77
<table width=100% style="clear:both"><tr><td>
78
<div class="se"><div class="sw"><div class="ne"><div class="nw">
79
<table width=100% style="padding:0;margin:0;cell-spacing:0"><tr>
82
<a href="about.html">About</a>
83
<a href="sitemap.html">Sitemap</a>
84
<a href="docs.html">Documentation</a>
85
<a href="download.html">Download</a>
86
<a href="copyright.html">License</a>
87
<a href="news.html">News</a>
88
<a href="support.html">Support</a>
91
gMsg = "Search SQLite Docs..."
92
function entersearch() {
93
var q = document.getElementById("q");
94
if( q.value == gMsg ) { q.value = "" }
95
q.style.color = "black"
96
q.style.fontStyle = "normal"
98
function leavesearch() {
99
var q = document.getElementById("q");
100
if( q.value == "" ) {
102
q.style.color = "#044a64"
103
q.style.fontStyle = "italic"
108
<div style="padding:0 1em 0px 0;white-space:nowrap">
109
<form name=f method="GET" action="http://www.sqlite.org/search">
110
<input id=q name=q type=text
111
onfocus="entersearch()" onblur="leavesearch()" style="width:24ex;padding:1px 1ex; border:solid white 1px; font-size:0.9em ; font-style:italic;color:#044a64;" value="Search SQLite Docs...">
112
<input type=submit value="Go" style="border:solid white 1px;background-color:#044a64;color:white;font-size:0.9em;padding:0 1ex">
116
</div></div></div></div>
118
<div class=startsearch></div>
120
<!-- title>SQLite B-Tree Module</title -->
126
<div style="font-size:2em;text-align:center;color:#80a796">SQLite B-Tree Module</div>
127
<div style="font-size:1.5em;margin:1em;color:#80a796">Table Of Contents</div>
130
<div style="margin-left:6ex">
131
<a href="#section_1">1 Document Overview</a>
134
<div style="margin-left:12ex">
135
<a href="#section_1_1">1.1 Scope and Purpose</a>
138
<div style="margin-left:12ex">
139
<a href="#section_1_2">1.2 Document and Requirements Organization</a>
142
<div style="margin-left:12ex">
143
<a href="#section_1_3">1.3 Glossary</a>
146
<div style="margin-left:6ex">
147
<a href="#section_2">2 Module Requirements</a>
150
<div style="margin-left:12ex">
151
<a href="#section_2_1">2.1 Functional Requirements</a>
154
<div style="margin-left:18ex">
155
<a href="#section_2_1_1">2.1.1 Opening and Closing Connections</a>
158
<div style="margin-left:18ex">
159
<a href="#section_2_1_2">2.1.2 New Database Image Configuration</a>
162
<div style="margin-left:18ex">
163
<a href="#hlr_transactions">2.1.3 Transaction and Savepoint Functions</a>
166
<div style="margin-left:18ex">
167
<a href="#hlr_reading_data">2.1.4 Reading From the Database Image</a>
170
<div style="margin-left:18ex">
171
<a href="#section_2_1_5">2.1.5 Writing to the Database Image</a>
174
<div style="margin-left:18ex">
175
<a href="#section_2_1_6">2.1.6 Page-Cache Configuration Requirements</a>
178
<div style="margin-left:18ex">
179
<a href="#section_2_1_7">2.1.7 Multi-User Database Requirements</a>
182
<div style="margin-left:18ex">
183
<a href="#section_2_1_8">2.1.8 Backup/Vacuum API Requirements</a>
186
<div style="margin-left:18ex">
187
<a href="#section_2_1_9">2.1.9 Integrity Check Requirements</a>
190
<div style="margin-left:12ex">
191
<a href="#section_2_2">2.2 Other Requirements and Constraints</a>
194
<div style="margin-left:18ex">
195
<a href="#hlr_memory">2.2.1 Caching and Memory Management Requirements</a>
198
<div style="margin-left:18ex">
199
<a href="#section_2_2_2">2.2.2 Exception Handling Requirements</a>
202
<div style="margin-left:18ex">
203
<a href="#section_2_2_3">2.2.3 Well-Formedness Requirements</a>
206
<div style="margin-left:6ex">
207
<a href="#section_3">3 Module API</a>
210
<div style="margin-left:12ex">
211
<a href="#section_3_1">3.1 Opening and Closing Connections</a>
214
<div style="margin-left:18ex">
215
<a href="#section_3_1_1">3.1.1 sqlite3BtreeOpen</a>
218
<div style="margin-left:18ex">
219
<a href="#section_3_1_2">3.1.2 sqlite3BtreeClose</a>
222
<div style="margin-left:12ex">
223
<a href="#section_3_2">3.2 Database Image Configuration</a>
226
<div style="margin-left:12ex">
227
<a href="#section_3_3">3.3 Connection Configuration</a>
230
<div style="margin-left:12ex">
231
<a href="#section_3_4">3.4 Query Interfaces</a>
234
<div style="margin-left:12ex">
235
<a href="#section_3_5">3.5 Mutex Functions</a>
238
<div style="margin-left:12ex">
239
<a href="#section_3_6">3.6 Transaction and Savepoint API</a>
242
<div style="margin-left:12ex">
243
<a href="#section_3_7">3.7 Reading and Traversing the Database Image</a>
246
<div style="margin-left:18ex">
247
<a href="#section_3_7_1">3.7.1 sqlite3BtreeMovetoUnpacked</a>
250
<div style="margin-left:18ex">
251
<a href="#section_3_7_2">3.7.2 sqlite3BtreeGetMeta</a>
254
<div style="margin-left:12ex">
255
<a href="#section_3_8">3.8 Modifying the Database Image</a>
258
<div style="margin-left:18ex">
259
<a href="#sqlite3BtreeCreateTable">3.8.1 sqlite3BtreeCreateTable</a>
262
<div style="margin-left:18ex">
263
<a href="#sqlite3BtreeDropTable">3.8.2 sqlite3BtreeDropTable</a>
266
<div style="margin-left:18ex">
267
<a href="#sqlite3BtreeClearTable">3.8.3 sqlite3BtreeClearTable</a>
270
<div style="margin-left:18ex">
271
<a href="#sqlite3BtreeCursorHasMoved">3.8.4 sqlite3BtreeCursorHasMoved</a>
274
<div style="margin-left:18ex">
275
<a href="#sqlite3BtreePutData">3.8.5 sqlite3BtreePutData</a>
278
<div style="margin-left:18ex">
279
<a href="#sqlite3BtreeUpdateMeta">3.8.6 sqlite3BtreeUpdateMeta</a>
282
<div style="margin-left:18ex">
283
<a href="#sqlite3BtreeDelete">3.8.7 sqlite3BtreeDelete</a>
286
<div style="margin-left:18ex">
287
<a href="#section_3_8_8">3.8.8 sqlite3BtreeInsert</a>
290
<div style="margin-left:18ex">
291
<a href="#sqlite3BtreeIncrVacuum">3.8.9 sqlite3BtreeIncrVacuum</a>
294
<div style="margin-left:12ex">
295
<a href="#section_3_9">3.9 Advisory B-Tree Locks</a>
298
<div style="margin-left:18ex">
299
<a href="#section_3_9_1">3.9.1 sqlite3BtreeLockTable</a>
302
<div style="margin-left:18ex">
303
<a href="#section_3_9_2">3.9.2 sqlite3BtreeSchemaLocked</a>
306
<div style="margin-left:12ex">
307
<a href="#section_3_10">3.10 What do these do?</a>
310
<div style="margin-left:12ex">
311
<a href="#section_3_11">3.11 APIs not branded sqlite3BtreeXXX()</a>
314
<div style="margin-left:6ex">
315
<a href="#section_4">4 Module Implementation</a>
318
<div style="margin-left:12ex">
319
<a href="#section_4_1">4.1 Database Image Traversal</a>
322
<div style="margin-left:12ex">
323
<a href="#section_4_2">4.2 Database Image Manipulation</a>
326
<div style="margin-left:18ex">
327
<a href="#section_4_2_1">4.2.1 Creating a B-Tree Structure</a>
330
<div style="margin-left:18ex">
331
<a href="#section_4_2_2">4.2.2 Clearing a B-Tree Structure</a>
334
<div style="margin-left:18ex">
335
<a href="#section_4_2_3">4.2.3 Deleting a B-Tree Structure</a>
338
<div style="margin-left:18ex">
339
<a href="#section_4_2_4">4.2.4 Inserting, Replacing and Deleting B-Tree Entries</a>
342
<div style="margin-left:24ex">
343
<a href="#section_4_2_4_1">4.2.4.1 B-Tree Insert/Replace Entry</a>
346
<div style="margin-left:24ex">
347
<a href="#section_4_2_4_2">4.2.4.2 B-Tree Delete Entry</a>
350
<div style="margin-left:18ex">
351
<a href="#btree_balancing_algorithm">4.2.5 B-Tree Balancing Algorithm</a>
354
<div style="margin-left:24ex">
355
<a href="#section_4_2_5_1">4.2.5.1 Balance Deeper</a>
358
<div style="margin-left:24ex">
359
<a href="#section_4_2_5_2">4.2.5.2 Balance Shallower</a>
362
<div style="margin-left:24ex">
363
<a href="#section_4_2_5_3">4.2.5.3 Balance Quick</a>
366
<div style="margin-left:24ex">
367
<a href="#balance_siblings">4.2.5.4 Balance Siblings</a>
370
<div style="margin-left:18ex">
371
<a href="#section_4_2_6">4.2.6 Page Allocation and Deallocation</a>
374
<div style="margin-left:24ex">
375
<a href="#free_overflow_chain">4.2.6.1 Moving an overflow-chain to the free-list</a>
378
<div style="margin-left:18ex">
379
<a href="#section_4_2_7">4.2.7 Incremental Vacuum Step</a>
382
<div style="margin-left:12ex">
383
<a href="#section_4_3">4.3 Transactions and Savepoints</a>
386
<div style="margin-left:6ex">
387
<a href="#section_5">5 References</a>
393
<h1 id="section_1">1 Document Overview</h1>
396
<h2 id="section_1_1">1.1 Scope and Purpose</h2>
400
This document provides a description of the functionality of and public
401
interface offered by the SQLite b-tree module. It also, to a certain extent,
402
describes the algorithm and implementation techniques used internally by
406
<li><p> To make it easier to maintain, test and improve this critical
407
sub-system of the SQLite software library.
409
<li><p> To facilitate development of compatible backend modules that can
410
be used with other SQLite sub-systems, either for experimental or
415
It is important to note that, even given the second bullet point above,
416
the interfaces and sub-systems described in this document are not stable.
417
They may be changed in any way with each new SQLite release. Any
418
external software development that uses these interfaces must be prepared
419
to adapt to interface refactoring without notice.
421
<h2 id="section_1_2">1.2 Document and Requirements Organization</h2>
425
Change the following so that those requirements that describe the API
426
are "low-level" requirements.
428
<table style="margin:1em auto;width:80%;border-spacing:0">
429
<tr style="text-align:left"> <th> Requirement ids <th> Contents
430
<tr style="text-align:left;background-color:#DDDDDD"> <td> H50**** <td> Requirement statements specifying the functionality required of the B-Tree module.
431
<tr style="text-align:left"> <td> H51**** <td> Requirement statements specifying the API provided by the B-Tree module.
432
<tr style="text-align:left;background-color:#DDDDDD"> <td> L****** <td> Requirement statements specifying some details of the internal workings of the B-Tree module.
435
<h2 id="section_1_3">1.3 Glossary</h2>
439
<tr><td class=defn><a name="glossary_Balance-Siblings_Algorithm"></a><a class=defnlink href="#glossary_Balance-Siblings_Algorithm">Balance-Siblings Algorithm</a> <td>
440
The <a class=defnlink href="#glossary_Balance-Siblings_Algorithm">balance-siblings algorithm</a> is one of four algorithms that may be used
441
used to redistribute data within a b-tree structure after an insert or
442
delete operation that causes a b-tree node to become overfull or underfull.
443
See section <cite><a href="#balance_siblings" title="Balance Siblings">4.2.5.4</a></cite> for details.
445
<tr><td class=defn><a name="glossary_B-Tree_Cursor"></a><a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree Cursor</a> <td>
446
<span class=todo>Define this.
448
<tr><td class=defn><a name="glossary_B-Tree_Database_Connection"></a><a class=defnlink href="#glossary_B-Tree_Database_Connection">B-Tree Database Connection</a> <td>
449
A <a class=defnlink href="#glossary_B-Tree_Database_Connection">B-Tree database connection</a> is a single client connection to an in-memory
450
<a class=defnlink href="#glossary_Page_cache">page cache</a>, through which a single temporary or <a class=defnlink href="#glossary_Persistent_database">persistent database</a> may
451
be accessed. This term is used throughout this document to avoid confusing
452
such connections with SQL level SQLite client connections, which are
453
sometime simply termed "database connections".
455
<tr><td class=defn><a name="glossary_Lazy-write_cache"></a><a class=defnlink href="#glossary_Lazy-write_cache">Lazy-write cache</a> <td>
456
<span class=todo>Define this.
458
<tr><td class=defn><a name="glossary_Overflow_Cell"></a><a class=defnlink href="#glossary_Overflow_Cell">Overflow Cell</a> <td>
459
<span class=todo>Define this.
461
<tr><td class=defn><a name="glossary_Page_cache"></a><a class=defnlink href="#glossary_Page_cache">Page cache</a> <td>
462
<span class=todo>Define this.
464
<tr><td class=defn><a name="glossary_Persistent_database"></a><a class=defnlink href="#glossary_Persistent_database">Persistent database</a> <td>
465
<span class=todo>Define this.
467
<tr><td class=defn><a name="glossary_Read-through_cache"></a><a class=defnlink href="#glossary_Read-through_cache">Read-through cache</a> <td>
468
<span class=todo>Define this.
470
<tr><td class=defn><a name="glossary_Shared-cache_mode"></a><a class=defnlink href="#glossary_Shared-cache_mode">Shared-cache mode</a> <td>
471
<span class=todo>Define this.
473
<tr><td class=defn><a name="glossary_SQLite_Error_Code"></a><a class=defnlink href="#glossary_SQLite_Error_Code">SQLite Error Code</a> <td>
474
<span class=todo>Define this.
476
<tr><td class=defn><a name="glossary_Temporary_database"></a><a class=defnlink href="#glossary_Temporary_database">Temporary database</a> <td>
477
<span class=todo>Define this.
482
<h1 id="section_2">2 Module Requirements</h1>
486
The SQLite B-Tree module, the software module described by this document,
487
is designed to query and modify a database stored using the database image
488
format described in <cite><a href="#ref_file_format" title="">[1]</a></cite>. Database images may
489
exist only in volatile main-memory (in-memory databases), or may be stored
490
persistently within the file-system (also described in
491
<cite><a href="#ref_file_format" title="">[1]</a></cite>). Or, a database image may be stored primarily
492
in main-memory with the file-system used as secondary storage if the
493
database image grows too large. Database images stored only in main-memory,
494
and those stored primarily in main-memory with the file-system used only to
495
provide secondary storage space are known collectively as <a class=defnlink href="#glossary_Temporary_database">temporary
496
databases</a>. Database images stored persistently in the file-system are termed
497
<a class=defnlink href="#glossary_Persistent_database">persistent databases</a>.
500
This module implements an in-memory <a class=defnlink href="#glossary_Page_cache">page cache</a> to manage database image
501
content. The size of the pages managed by the cache are the same as the
502
page-size of the database image. When operating on a <a class=defnlink href="#glossary_Persistent_database">persistent database</a>,
503
the cache operates as a read-through, <a class=defnlink href="#glossary_Lazy-write_cache">lazy-write cache</a>. When committing a
504
database transaction, the user explicitly directs the cache to flush all
505
dirty pages through to persistent storage. A single in-memory <a class=defnlink href="#glossary_Page_cache">page cache</a>
506
used to access the content of a <a class=defnlink href="#glossary_Persistent_database">persistent database</a> may support multiple
507
logical client connections. <span class=todo>Some brief explanation of what
508
this means. And maybe a pointer to the "Multi-User Database Requirements"
512
When operating on a <a class=defnlink href="#glossary_Temporary_database">temporary database</a>, there may only be one client for
513
each <a class=defnlink href="#glossary_Page_cache">page cache</a>. Depending on the SQLite configuration, either the database
514
or journal file, or both, may be omitted from the system.
518
<a name="figure_overview"></a>
519
<object data="images/btreemodule_overview.svg" type="image/svg+xml" width=640 height=271 style="overflow:hidden"></object>
520
<p><i>Figure 1 - Role of <a class=defnlink href="#glossary_Page_cache">Page Cache</a></i>
525
Figure <cite><a href="#figure_overview" title="Role of Page Cache">1</a></cite> depicts...
528
Roughly what is encapsulated by the module.
531
<h2 id="section_2_1">2.1 Functional Requirements</h2>
535
This section contains requirements describing the functionality required
536
from the B-Tree module.
539
Figure out where integrity-check goes.
541
<h3 id="section_2_1_1">2.1.1 Opening and Closing Connections</h3>
545
The B-Tree module provides an interface to open new <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connections</a>.
547
<p class=req id=H50010><span>The B-Tree module shall provide an interface to open a connection
548
to either a named <a class=defnlink href="#glossary_Persistent_database">persistent database</a> file, or an anonymous <a class=defnlink href="#glossary_Temporary_database">temporary
549
database</a>.</span> (C: * <a class=reqlink href=#H51001>H51001</a>)</p>
550
<p class=req id=H50020><span>When opening a <a class=defnlink href="#glossary_Persistent_database">persistent database</a>, the B-Tree module shall allow the user
551
to specify that the connection be opened for read-only access.</span></p>
552
<p class=req id=H50030><span>When opening a <a class=defnlink href="#glossary_Persistent_database">persistent database</a>, the B-Tree module shall allow the user
553
to specify that the connection only be opened if the specified file exists.</span></p>
554
<p class=req id=H50040><span>If SQLite is configured to run in <a class=defnlink href="#glossary_Shared-cache_mode">shared-cache mode</a>, and a connection is opened
555
to a <a class=defnlink href="#glossary_Persistent_database">persistent database</a> file for which there exists already a <a class=defnlink href="#glossary_Page_cache">page-cache</a> within
556
the current processes address space, then the connection opened shall be a
557
connection to the existing <a class=defnlink href="#glossary_Page_cache">page-cache</a>.</span></p>
558
<p class=req id=H50050><span>If a new <a class=defnlink href="#glossary_B-Tree_Database_Connection">B-Tree database connection</a> is opened and requirement <a class=reqlink href=#H50040>H50040</a> does not apply,
559
then a new <a class=defnlink href="#glossary_Page_cache">page-cache</a> shall be created within the processes address space. The
560
opened connection shall be a connection to the new <a class=defnlink href="#glossary_Page_cache">page-cache</a>.</span></p>
563
The B-Tree module also provides an interface to close existing <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database
566
<p class=req id=H50060><span>The B-Tree module shall provide an interface to close a <a class=defnlink href="#glossary_B-Tree_Database_Connection">B-Tree database connection</a>.</span></p>
567
<p class=req id=H50070><span>If a <a class=defnlink href="#glossary_B-Tree_Database_Connection">B-Tree database connection</a> is closed and this causes the associated
568
<a class=defnlink href="#glossary_Page_cache">page-cache</a> to have zero connections to it, then the <a class=defnlink href="#glossary_Page_cache">page-cache</a> shall be closed
569
and all associated resources released.</span></p>
571
<h3 id="section_2_1_2">2.1.2 New Database Image Configuration</h3>
575
The following requirements describe database configuration options that
576
are only applicable to new database images. For the purposes of the
577
following requirements, a "new database image" is defined as one that is
580
<p class=req id=H50080><span>The B-Tree module shall provide an interface to configure the page-size of a
581
new database image.</span></p>
582
<p class=req id=H50090><span>The B-Tree module shall provide an interface to configure whether or not a new
583
database image is auto-vacuum capable.</span></p>
585
<h3 id="hlr_transactions">2.1.3 Transaction and Savepoint Functions</h3>
589
This needs a lot of work...
592
All read and write operations performed on a database image via the
593
B-Tree module interfaces occur within the context of a read or write
594
transaction. <span class=todo>Something about the ACID nature of
595
transactions and how this applies to read and write transactions</span>)
597
<p class=req id=H50100><span>The B-Tree module shall provide an interface to open (start) a read-only transaction.</span></p>
598
<p class=req id=H50101><span>The B-Tree module shall provide an interface to close (finish) a read-only transaction.</span></p>
603
<p class=req id=H50102><span>The B-Tree module shall provide an interface to open a read/write transaction
604
or to upgrade from a read-only transaction to a read/write transaction.</span></p>
605
<p class=req id=H50103><span>The B-Tree module shall provide an interface to commit a read/write transaction.</span></p>
606
<p class=req id=H50104><span>The B-Tree module shall provide an interface to rollback a read/write transaction.</span></p>
609
Multi-file transaction support.
612
Transaction state query:
613
<p class=req id=H50108><span>The B-Tree module shall provide an interface to query a <a class=defnlink href="#glossary_B-Tree_Database_Connection">B-Tree database
614
connection</a> to determine if there is an open transaction, and if so if the open
615
transaction is read-only or read/write.</span></p>
621
Define "savepoint transactions" and fix the following requirements.
623
<p class=req id=H50105><span>The B-Tree module shall provide an interface to open savepoint transactions.</span></p>
624
<p class=req id=H50106><span>The B-Tree module shall provide an interface to commit savepoint transactions.</span></p>
625
<p class=req id=H50107><span>The B-Tree module shall provide an interface to rollback savepoint transactions.</span></p>
627
<h3 id="hlr_reading_data">2.1.4 Reading From the Database Image</h3>
631
The B-Tree module allows the user to read a subset of the fields from the
632
database image header. Each such field is stored in the header as a 4-byte
633
unsigned big-endian integer. A complete description of each field and its
634
interpretation may be found in <cite><a href="#ref_file_format" title="">[1]</a></cite>.
636
<p class=req id=H50109><span>The B-Tree module shall provide an interface to read the value of any of the
637
4-byte unsigned big-endian integer fields beginning at byte offset 36 of the
638
database image.</span> (C: <a class=reqlink href=#H51015>H51015</a> <a class=reqlink href=#H51016>H51016</a> <a class=reqlink href=#H51017>H51017</a>)</p>
641
In other words, the database image header fields that may be read via
645
<li> The number of free pages in the database image,
646
<li> The database image schema version (schema cookie).
647
<li> The database image schema layer file-format.
648
<li> The default <a class=defnlink href="#glossary_Page_cache">page-cache</a> size.
649
<li> The "auto-vacuum last root-page" field.
650
<li> The database image text-encoding field.
651
<li> The database image user-cookie value.
652
<li> The database image incremental-vacuum flag.
656
With the exception of the database image header fields described above,
657
all data is read from the database image using <a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree cursors</a>. A <a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree
658
cursor</a> is a control structure for traversing the contents of a single
659
table or index b-tree structure within a database image. As well as
660
"forward" and "back" operations, a <a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree cursor</a> supports fast seeking to
661
a table entry identified by key value, or to the first or last entry in
665
When a <a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree cursor</a> is created, the specific table or index b-tree that
666
it is used to traverse is identified by the database image page number
667
of its root page. Since the root-page of the schema table is always page
668
1, and the contents of the schema table includes the root page numbers of all
669
other index and table b-tree structures in the database image, it is
670
possible for the application to determine the set of valid root-page
671
numbers by first traversing the schema table.
673
<p class=req id=H50110><span>The B-Tree module shall provide an interface to open a <a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree cursor</a> on any table or
674
index b-tree within the database image, given its root page number.</span></p>
675
<p class=req id=H50111><span>The B-Tree module shall provide an interface to close a <a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree cursor</a>.</span></p>
676
<p class=req id=H50112><span>The B-Tree module shall provide an interface to move an open <a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree cursor</a> to
677
the entry associated with the largest key in the open b-tree structure.</span></p>
678
<p class=req id=H50113><span>The B-Tree module shall provide an interface to move an open <a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree cursor</a> to
679
the entry associated with the smallest key in the open b-tree structure.</span></p>
680
<p class=req id=H50114><span>The B-Tree module shall provide an interface to move an open <a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree cursor</a> that
681
currently points at a valid b-tree entry to the next entry in the b-tree
682
structure, sorted in order of key value, if any.</span></p>
683
<p class=req id=H50115><span>The B-Tree module shall provide an interface to move an open <a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree cursor</a> that
684
currently points at a valid b-tree entry to the previous entry in the b-tree
685
structure, sorted in order of key value, if any.</span></p>
686
<p class=req id=H50116><span>The B-Tree module shall provide an interface to retrieve the key value
687
associated with the b-tree structure entry that a <a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree cursor</a> is pointing to,
689
<p class=req id=H50117><span>The B-Tree module shall provide an interface to retrieve the blob of data (the
690
database record) associated with the b-tree structure entry that a <a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree
691
cursor</a> open on a table b-tree is pointing to, if any.</span></p>
692
<p class=req id=H50118><span>The B-Tree module shall provide an interface to return the number of entries
693
currently stored in the b-tree structure that a <a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree cursor</a> is open on.</span></p>
696
As well as traversing a b-tree structure using the operations enumerated
697
by the above requirements, it is also possible to use a cursor to search
698
a b-tree structure for a specified key value. If the key value can be
699
found, the cursor is left pointing at the entry with the specified key
700
value. Otherwise, the cursor is left pointing at either the entry with the
701
largest key that is smaller than the specified key, or to the entry with
702
the smallest key that is larger than the specified key. For table b-tree
703
structures, where the key values are 64-bit integers, the definition of
704
smaller, larger and equal to is straightforward. For index b-tree
705
structures, where the key values are database records, the manner in
706
which key values must be compared is more complicated. Refer to
707
<cite><a href="#ref_file_format" title="">[1]</a></cite> for a full explanation.
710
There is a specific section in <cite><a href="#ref_file_format" title="">[1]</a></cite> devoted to
711
record sort order in index b-tree structures. There needs to be some way to
712
point to it. Or, better, to the requirement or range of requirements.
715
Maybe a system that automatically links text like H30100 to the
716
corresponding requirement. Within a document if it can find it, or a
717
summary page (hlreq.html for example).
720
<p class=req id=H50119><span>Given a key value, the B-Tree module shall provide an interface to move a
721
<a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree cursor</a> open on a b-tree structure to the B-Tree entry with the matching
722
key value, if such an entry exists.</span></p>
723
<p class=req id=H50120><span>If the interface required by <a class=reqlink href=#H50119>H50119</a> is used to search for a key value that is
724
not present in the b-tree structure and the b-tree is not empty, the cursor shall
725
be moved to an existing entry that would be adjacent to a hypothetical
726
entry with the specified key value.</span></p>
727
<p class=req id=H50121><span>The interface required by <a class=reqlink href=#H50119>H50119</a> shall provide an indication to the caller as
728
to whether the cursor is left pointing at an entry with a key value that is
729
smaller, larger or equal to the requested value, or if it is pointing to no
730
entry at all (because the b-tree structure is empty).</span></p>
733
Does it depend on the structure of the tree whether the cursor is left
734
pointing to a smaller or larger entry after a failed search? Or is it
735
possible to determine which it will be based only on the set of keys
739
As well as the standard search operation described by the above
740
requirements, cursors open on index b-tree structures are required to
741
support several variants, as follows:
744
<li> <b>Ignore rowid search mode</b>. The final value in a database
745
record used as an index-btree key is always an integer "rowid"
746
field. A search in this mode proceeds as if each key in the b-tree
747
was missing this field.
749
<li> <b>Increment key mode</b>.
750
<li> <b>Prefix match mode</b>.
751
<li> <b>Prefix search mode</b>.
755
Finish the bullet points above and add HLR for each search mode.
758
More than one cursor can be open on a single b-tree structure at one time.
759
It is also possible for a write-cursor to modify the contents of a b-tree
760
structure while other cursors are open on it. The b-tree module does not
761
include any type of row-locking mechanism. It is possible for a write-cursor
762
to be used to delete an entry from a b-tree structure even if there are
763
one or more other cursors currently pointing to the entry being deleted.
766
Requirements to do with how the above is handled. Traceability to
767
sqlite3BtreeCursorHasMoved is required.
769
<h3 id="section_2_1_5">2.1.5 Writing to the Database Image</h3>
773
The B-Tree module allows the user to write values to a subset of the
774
fields from the database image header. The set of writable fields is
775
the same as the set of fields enumerated in section
776
<cite><a href="#hlr_reading_data" title="Reading From the Database Image">2.1.4</a></cite> that the B-Tree module is required to
777
provide read access to by requirement <a class=reqlink href=#H50109>H50109</a>.
779
<p class=req id=H50122><span>The B-Tree module shall provide an interface to write a value to any of the
780
4-byte unsigned big-endian integer fields beginning at byte offset 36 of the
781
database image.</span></p>
784
The B-Tree module also supports operations to create new b-tree
785
structures within the database image. Existing b-tree structures may be
786
deleted from the database image entirely, or their entire contents may be
787
deleted, leaving an empty b-tree structure.
789
<p class=req id=H50123><span>The B-Tree module shall provide an interface to create a new index or table
790
b-tree structures within the database image. The interface shall automatically
791
assign a root-page to the new b-tree structure.</span></p>
792
<p class=req id=H50124><span>The B-Tree module shall provide an interface to remove an existing index or
793
table b-tree structure from the database image, given the root page number of
794
the b-tree to remove.</span></p>
795
<p class=req id=H50125><span>The B-Tree module shall provide an interface to remove all entries from (delete
796
the contents of) an index or table b-tree, given the root page number of the
797
b-tree to empty.</span></p>
800
As one would expect, the B-Tree module also provides an interface to
801
insert and delete entries from b-tree structures. These operations are
802
performed using a B-Tree write cursor, a special type of <a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree cursor</a>
803
(see section <cite><a href="#hlr_reading_data" title="Reading From the Database Image">2.1.4</a></cite>).
805
<p class=req id=H50126><span>When opening a <a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree cursor</a> using the interface required by <a class=reqlink href=#H50110>H50110</a>, it shall
806
be possible to specify that the new cursor be a write cursor, or an ordinary
807
read-only cursor.</span></p>
808
<p class=req id=H50127><span>The B-Tree module shall provide an interface that allows the user to delete the
809
b-tree entry that a write cursor points to, if any.</span> (C: <a class=reqlink href=#L50013>L50013</a>)</p>
810
<p class=req id=H50128><span>The B-Tree module shall provide an interface to insert new entries into a table
811
or index B-Tree, given a write cursor open on the table or index b-tree the new
812
entry is to be inserted into.</span> (C: <a class=reqlink href=#L50001>L50001</a> <a class=reqlink href=#L50002>L50002</a> <a class=reqlink href=#L50003>L50003</a> <a class=reqlink href=#L50004>L50004</a> <a class=reqlink href=#L50012>L50012</a>)</p>
815
Incremental vacuum step.
817
<h3 id="section_2_1_6">2.1.6 <a class=defnlink href="#glossary_Page_cache">Page-Cache</a> Configuration Requirements</h3>
821
A <a class=defnlink href="#glossary_Page_cache">page-cache</a> has a number of operational parameters that may be configured
822
at run-time via an open <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a>. Note that even though the
823
interfaces provided by this module allow these parameters to be set via a
824
<a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a>, they are properties of the <a class=defnlink href="#glossary_Page_cache">page-cache</a>, not
825
the <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a>. In situations where more than one <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree
826
database connection</a> is connected to a single <a class=defnlink href="#glossary_Page_cache">page-cache</a>, writes made via
827
one <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> may overwrite the values set by another.
828
The following table summarizes the available configuration parameters.
830
<table style="margin:1em auto;width:80%;border-spacing:0">
831
<tr style="text-align:left"> <th>Parameter <th>Description <th>Requirements
832
<tr style="text-align:left;background-color:#DDDDDD"> <td>Locking-mode
833
<td><span class=todo>This!</span>
834
<td><a class=reqlink href=#H50138>H50138</a>, <a class=reqlink href=#H50139>H50139</a>, <a class=reqlink href=#H50140>H50140</a>
835
<tr style="text-align:left"> <td>Journal-mode
836
<td><span class=todo>This!</span>
837
<td><a class=reqlink href=#H50141>H50141</a>, <a class=reqlink href=#H50142>H50142</a>, <a class=reqlink href=#H50143>H50143</a>, <a class=reqlink href=#H50144>H50144</a>, <a class=reqlink href=#H50145>H50145</a>, <a class=reqlink href=#H50146>H50146</a>
838
<tr style="text-align:left;background-color:#DDDDDD"> <td>Journal-file size limit
839
<td>The journal-file size limit parameter may be set to any integer
840
value within the range of a 64-bit signed integer. Any negative
841
values is interpreted as "no limit". Otherwise, if the
842
journal-file size limit is set to zero or a positive number, it
843
represents an upper limit on the size of the journal file in
844
bytes. If the application executes a database write operation that
845
would normally cause the journal file to grow larger than this
846
configured limit, the operation fails and an error is returned
847
to the user. The default value of this parameter is -1 (no
849
<td><a class=reqlink href=#H50147>H50147</a>, <a class=reqlink href=#H50148>H50148</a>, <a class=reqlink href=#H50149>H50149</a>
850
<tr style="text-align:left"> <td style="white-space:nowrap">Database-file size limit
851
<td>The database-image size limit parameter may be set to any integer
852
value greater than zero within the range of a 32-bit signed
853
integer. The configured value represents an upper limit on the size of
854
the database image in pages. If the application executes a
855
database write operation that would normally cause the database image to
856
grow larger than this configured limit, the operation fails and
857
an error is returned to the user.
858
<td><a class=reqlink href=#H50150>H50150</a>, <a class=reqlink href=#H50151>H50151</a>, <a class=reqlink href=#H50152>H50152</a>
859
<tr style="text-align:left;background-color:#DDDDDD"> <td>Cache size
860
<td>The cache-size parameter may be set to any integer value. How it
861
affects operation depends on the specific P-Cache implementation used
862
by the <a class=defnlink href="#glossary_Page_cache">page-cache</a>. <span class=todo>Refer to details for the
863
behaviour of the built-in default P-Cache.</span>
864
<td><a class=reqlink href=#H50153>H50153</a>
865
<tr style="text-align:left"> <td>Safety level
866
<td>The safety-level parameter may be set to "none", "normal" or "full".
867
<span class=todo> Where will the effect of this defined/required?</span>
868
<td><a class=reqlink href=#H50154>H50154</a>, <a class=reqlink href=#H50155>H50155</a>
871
<p class=req id=H50138><span>The B-Tree module shall provide an interface allowing the application to set
872
the locking-mode of a <a class=defnlink href="#glossary_Page_cache">page-cache</a> to either "normal" or "exclusive", given an
873
open <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> to that <a class=defnlink href="#glossary_Page_cache">page-cache</a>.</span></p>
874
<p class=req id=H50139><span>If the locking-mode of a <a class=defnlink href="#glossary_Page_cache">page-cache</a> is set to "normal" when a read/write
875
or read-only transaction is ended, any locks held on the database file-system
876
representation by the <a class=defnlink href="#glossary_Page_cache">page-cache</a> shall be relinquished.</span></p>
877
<p class=req id=H50140><span>If the locking-mode of a <a class=defnlink href="#glossary_Page_cache">page-cache</a> is set to "exclusive" when a read/write
878
or read-only transaction is ended, any locks held on the database file-system
879
representation by the <a class=defnlink href="#glossary_Page_cache">page-cache</a> shall be retained.</span></p>
882
And if a read/write transaction is downgraded to a read-only transaction?
883
This scenario should also be dealt with in section <cite><a href="#hlr_transactions" title="Transaction and Savepoint Functions">2.1.3</a></cite>.
885
<p class=req id=H50141><span>The B-Tree module shall provide an interface allowing the application to set
886
the journal-mode of a <a class=defnlink href="#glossary_Page_cache">page-cache</a> to one of "off", "memory", "delete",
887
"persist", or "truncate", given an open <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> to that
888
<a class=defnlink href="#glossary_Page_cache">page-cache</a>.</span></p>
889
<p class=req id=H50142><span>If the journal-mode of a <a class=defnlink href="#glossary_Page_cache">page-cache</a> is set to "off" when a read/write
890
transaction is opened, then the transaction shall use no journal file.</span></p>
891
<p class=req id=H50143><span>If the journal-mode of a <a class=defnlink href="#glossary_Page_cache">page-cache</a> is set to "memory" when a read/write
892
transaction is opened, then instead of using the journal file located in the
893
file-system, journal-file data shall be stored in main-memory.</span></p>
894
<p class=req id=H50144><span>If the journal-mode of a <a class=defnlink href="#glossary_Page_cache">page-cache</a> is set to "delete" when a read/write
895
transaction is opened, then any journal file used by the transaction shall
896
be deleted at the conclusion of the transaction.</span></p>
897
<p class=req id=H50145><span>If the journal-mode of a <a class=defnlink href="#glossary_Page_cache">page-cache</a> is set to "truncate" when a read/write
898
transaction is opened, then any journal file used by the transaction shall
899
be truncated to zero bytes in size at the conclusion of the transaction.</span></p>
900
<p class=req id=H50146><span>If the journal-mode of a <a class=defnlink href="#glossary_Page_cache">page-cache</a> is set to "persist" when a read/write
901
transaction is opened, then any journal file used by the transaction shall
902
remain in the file-system at the conclusion of the transaction.</span></p>
905
The difference in functionality provided by "off", "memory" and the 3
906
modes that use a real journal file should also feature in
907
<cite><a href="#hlr_transactions" title="Transaction and Savepoint Functions">2.1.3</a></cite>.
909
<p class=req id=H50147><span>The B-Tree module shall provide an interface to set the value of the
910
journal-file size limit configuration parameter of a <a class=defnlink href="#glossary_Page_cache">page-cache</a>, given
911
an open <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> to that <a class=defnlink href="#glossary_Page_cache">page-cache</a>.</span></p>
912
<p class=req id=H50148><span>The default value assigned to the journal-file size limit configuration of a
913
<a class=defnlink href="#glossary_Page_cache">page-cache</a> shall be -1.</span></p>
914
<p class=req id=H50149><span>If the journal-file size limit parameter is set to a non-negative value, and
915
the user executes a write operation that would otherwise require the journal
916
file to be extended to a size greater than the configured value in bytes, then
917
the operation shall fail and an error be returned to the user.</span></p>
919
<p class=req id=H50150><span>The B-Tree module shall provide an interface to set the value of the
920
database-image size limit configuration parameter of a <a class=defnlink href="#glossary_Page_cache">page-cache</a>, given
921
an open <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> to that <a class=defnlink href="#glossary_Page_cache">page-cache</a>.</span></p>
922
<p class=req id=H50151><span>The default value assigned to the database-image size limit configuration of a
923
<a class=defnlink href="#glossary_Page_cache">page-cache</a> shall be the value of the compile time symbol SQLITE_MAX_PAGE_COUNT
924
(1073741823 by default).</span></p>
925
<p class=req id=H50152><span>If the database-image size limit parameter is set to a non-negative value, and
926
the user executes a write operation that would otherwise require the journal
927
file to be extended to a size greater than the configured value in bytes, then
928
the operation shall fail and an error be returned to the user.</span></p>
930
<p class=req id=H50153><span>The B-Tree module shall provide an interface to set the value of the
931
cache-size configuration parameter of a <a class=defnlink href="#glossary_Page_cache">page-cache</a>, given an open <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree
932
database connection</a> to that <a class=defnlink href="#glossary_Page_cache">page-cache</a>.</span></p>
935
See section <cite><a href="#hlr_memory" title="Caching and Memory Management Requirements">2.2.1</a></cite> for a description of and requirements
936
specifying how the value of the cache-size parameter affects the
937
operation of a <a class=defnlink href="#glossary_Page_cache">page-cache</a>. <span class=todo>Check this reference is
938
relevant after it is written. Refer to a specific requirement if possible
941
<p class=req id=H50154><span>The B-Tree module shall provide an interface allowing the application to set
942
the safety-level of a <a class=defnlink href="#glossary_Page_cache">page-cache</a> to one of "off", "normal" or "full",
943
given an open <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> to that <a class=defnlink href="#glossary_Page_cache">page-cache</a>.</span></p>
944
<p class=req id=H50155><span>The default value assigned to the safety-level configuration parameter of a
945
<a class=defnlink href="#glossary_Page_cache">page-cache</a> shall be "full".</span></p>
948
Description of what the safety-level actually does. Or pointer to where a
949
description and requirements can be found (transactions section?).
952
Interface to set the codec function (encryption).
955
The busy-handler. Where exactly does this come in? Transactions and
959
The six <a class=defnlink href="#glossary_Page_cache">page-cache</a> operational parameters listed above may also be
960
queried. The following requirements specify the required query
963
<p class=req id=H50132><span>The B-Tree module shall provide an interface to query the current locking-mode
964
of a <a class=defnlink href="#glossary_Page_cache">page-cache</a>, given an open <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> to that <a class=defnlink href="#glossary_Page_cache">page-cache</a>.</span></p>
965
<p class=req id=H50133><span>The B-Tree module shall provide an interface to query the current journal-mode
966
of a <a class=defnlink href="#glossary_Page_cache">page-cache</a>, given an open <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> to that <a class=defnlink href="#glossary_Page_cache">page-cache</a>.</span></p>
967
<p class=req id=H50134><span>The B-Tree module shall provide an interface to query the current journal file
968
size-limit of a <a class=defnlink href="#glossary_Page_cache">page-cache</a>, given an open <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> to that
969
<a class=defnlink href="#glossary_Page_cache">page-cache</a>.</span></p>
970
<p class=req id=H50135><span>The B-Tree module shall provide an interface to query the current database file
971
size-limit of a <a class=defnlink href="#glossary_Page_cache">page-cache</a>, given an open <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> to that
972
<a class=defnlink href="#glossary_Page_cache">page-cache</a>.</span></p>
973
<p class=req id=H50136><span>The B-Tree module shall provide an interface to query the current cache-size
974
of a <a class=defnlink href="#glossary_Page_cache">page-cache</a>, given an open <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> to that <a class=defnlink href="#glossary_Page_cache">page-cache</a>.</span></p>
975
<p class=req id=H50137><span>The B-Tree module shall provide an interface to query the current safety-level
976
of a <a class=defnlink href="#glossary_Page_cache">page-cache</a>, given an open <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> to that <a class=defnlink href="#glossary_Page_cache">page-cache</a>.</span></p>
979
It is also possible to interrogate a b-tree database handle to determine
980
if it was opened on a temporary or <a class=defnlink href="#glossary_Persistent_database">persistent database</a>. An b-tree
981
database handle opened on a <a class=defnlink href="#glossary_Persistent_database">persistent database</a> may be queried for the
982
name of (full-path to) either the database or journal file associated
983
with the open database.
985
<p class=req id=H50131><span>The B-Tree module shall provide an interface to query an open b-tree database
986
handle to determine if the underlying database is a <a class=defnlink href="#glossary_Persistent_database">persistent database</a> or a
987
<a class=defnlink href="#glossary_Temporary_database">temporary database</a>.</span></p>
988
<p class=req id=H50129><span>The B-Tree module shall provide an interface allowing the application to query
989
a <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> open on a <a class=defnlink href="#glossary_Persistent_database">persistent database</a> for the name of the
990
underlying database file within the file-system.</span></p>
991
<p class=req id=H50130><span>The B-Tree module shall provide an interface allowing the application to query
992
a <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> open on a <a class=defnlink href="#glossary_Persistent_database">persistent database</a> for the name of the
993
underlying journal file within the file-system.</span></p>
995
<h3 id="section_2_1_7">2.1.7 Multi-User Database Requirements</h3>
998
<p class=req id=H50156><span>The b-tree module shall provide an interface allowing database clients to
999
acquire advisory read (shared) or write (exclusive) locks on a specific b-tree
1000
structure within the database.</span></p>
1003
<li> Lock on schema memory object.
1004
<li> Locks on b-tree tables.
1005
<li> "Unlock notify" feature.
1006
<li> Mutexes/thread-safety features.
1010
The b-tree module preventing deadlock (by always grabbing mutexes in
1011
order of BtShared pointer) should be required here.
1013
<h3 id="section_2_1_8">2.1.8 Backup/Vacuum API Requirements</h3>
1016
<li> Callbacks for backup module.
1017
<li> Page read/write APIs for backup module.
1020
<h3 id="section_2_1_9">2.1.9 Integrity Check Requirements</h3>
1023
<li> Callbacks for backup module.
1024
<li> Page read/write APIs for backup module.
1027
<h2 id="section_2_2">2.2 Other Requirements and Constraints</h2>
1030
<h3 id="hlr_memory">2.2.1 Caching and Memory Management Requirements</h3>
1033
<li> Memory allocation related features (pcache, scratch memory, other...).
1034
<li> Default pcache implementation (sqlite3_release_memory()).
1035
<li> Schema memory object allocation (destructor registration).
1038
<h3 id="section_2_2_2">2.2.2 Exception Handling Requirements</h3>
1042
System failure. Do not corrupt the database image.
1044
Three kinds of exception:
1047
<li> Malloc request failure.
1048
<li> Database image corruption.
1051
<h3 id="section_2_2_3">2.2.3 Well-Formedness Requirements</h3>
1054
<li> Identify the subset of file-format well-formedness requirements that
1055
this module is responsible for implementing.
1056
<li> Define how the module should respond to corrupt database files: don't
1057
crash, return SQLITE_CORRUPT as early as is practical. Should it also
1058
put the b-tree into a permanent error state?
1064
<h1 id="section_3">3 Module API</h1>
1068
Description of the interface in btree.h. Also other interfaces accessed by
1069
external modules. Including release_memory() and those pager interfaces that
1070
are accessed directly by other modules. All of these requirements will be
1071
descended/derived from requirements in the previous sections. Some of the
1072
text could/should be pulled in from btree.h.
1075
The name of sqlite3BtreeBeginStmt() should probably change to
1076
sqlite3BtreeOpenSavepoint(). Matches the pager layer and is a more
1077
accurate description of the function.
1080
There are only a few places in which the pager object is used directly,
1081
always to call some trivial get/set configuration function. These should
1082
be replaced somehow with sqlite3BtreeXXX() APIs. Also, the current
1083
approach is probably Ok, but worth looking it over for thread-safety
1087
It would be easier to write up if the dependency between the B-Tree
1088
layer and the sqlite3 structure did not exist. At present, it is used for:
1090
<br> * The unlock-notify feature (arguments to sqlite3ConnectionBlocked() are database handles),
1091
<br> * Accessing the SQLITE_ReadUncommitted flag,
1092
<br> * Invoking the busy-handler callback,
1093
<br> * During sqlite3BtreeOpen(), to find the VFS to use,
1094
<br> * Accessing the SQLITE_SharedCache flag (for setting it),
1095
<br> * To check the same B-Tree is not attached more than once in <a class=defnlink href="#glossary_Shared-cache_mode">shared-cache mode</a>,
1096
<br> * To link the B-Tree into the pointer-order list of shared-cache b-trees used by the same handle (used for mutexes).
1097
<br> * To determine if an in-memory sub-journal should be used.
1098
<br> * To know how many savepoints are open in BtreeBeginTrans().
1099
<br> * Many, many times to assert() that the db mutex is held when the b-tree layer is accessed..
1102
<h2 id="section_3_1">3.1 Opening and Closing Connections</h2>
1106
This section describes the API offered by the B-Tree module to other
1107
SQLite sub-systems to open and close <a class=defnlink href="#glossary_B-Tree_Database_Connection">B-Tree database connections</a>.
1109
<pre class=api>typedef struct Btree Btree;</pre>
1112
A <a class=defnlink href="#glossary_B-Tree_Database_Connection">B-Tree database connection</a> is accessed by other SQLite sub-systems
1113
using an opaque handle, modelled in C code using the type "Btree*".
1115
<h3 id="section_3_1_1">3.1.1 sqlite3BtreeOpen</h3>
1118
<pre class=api>int sqlite3BtreeOpen(
1119
sqlite3_vfs *pVfs, /* VFS to use with this b-tree */
1120
const char *zFilename, /* Name of database file to open */
1121
sqlite3 *db, /* Associated database connection */
1122
Btree **ppBtree, /* Return open Btree* here */
1123
int flags, /* Flags */
1124
int vfsFlags /* Flags passed through to VFS open */
1127
<p class=req id=H51001><span>If successful, a call to the sqlite3BtreeOpen function shall return SQLITE_OK
1128
and set the value of *ppBtree to contain a new <a class=defnlink href="#glossary_B-Tree_Database_Connection">B-Tree database connection</a>
1129
handle.</span> (P: <a class=reqlink href=#H50010>H50010</a>)</p>
1130
<p class=req id=H51002><span>If unsuccessful, a call to the sqlite3BtreeOpen function shall return an <a class=defnlink href="#glossary_SQLite_Error_Code">SQLite
1131
error code</a> other than SQLITE_OK indicating the reason for the failure. The
1132
value of *ppBtree shall not be modified in this case.</span></p>
1134
<p class=req id=H51003><span>If the zFilename parameter to a call to sqlite3BtreeOpen is NULL or a pointer
1135
to a buffer of which the first byte is a nul (0x00), then sqlite3BtreeOpen
1136
shall attempt to open a connection to a <a class=defnlink href="#glossary_Temporary_database">temporary database</a>.</span></p>
1137
<p class=req id=H51004><span>If the zFilename parameter to a call to sqlite3BtreeOpen is a pointer to a
1138
buffer containing a nul-terminated UTF-8 encoded string, sqlite3BtreeOpen shall
1139
attempt to open a connection to a <a class=defnlink href="#glossary_Persistent_database">persistent database</a>.</span></p>
1142
The combination of the above two requirements implies that if the
1143
zFilename argument passed to sqlite3BtreeOpen is other than a NULL
1144
pointer or a pointer to a nul-terminated string, the type of or
1145
filename of the database that sqlite3BtreeOpen attempts to open a
1146
connection to are undefined.
1149
Valid values for the <i>flags</i> argument to the sqlite3BtreeOpen
1150
function consist of the bitwise OR of zero or more of the following
1153
<pre class=api>#define BTREE_OMIT_JOURNAL 1 /* Do not create or use a rollback journal */
1156
<p class=req id=H51005><span>If the BTREE_OMIT_JOURNAL bit is set in the flags parameter passed to a
1157
successful call to sqlite3BtreeOpen to open a <a class=defnlink href="#glossary_Temporary_database">temporary database</a>, then the
1158
<a class=defnlink href="#glossary_Page_cache">page-cache</a> created as a result shall not open or use a journal file for any
1162
When opening a connection to a <a class=defnlink href="#glossary_Persistent_database">persistent database</a>, the value of the
1163
BTREE_OMIT_JOURNAL bit in the flags parameter is ignored by
1166
<p class=req id=H51006><span>If the BTREE_NO_READLOCK bit is set in the flags parameter passed to a
1167
successful call to sqlite3BtreeOpen to open a <a class=defnlink href="#glossary_Persistent_database">persistent database</a> and a
1168
new <a class=defnlink href="#glossary_Page_cache">page-cache</a> is created as a result of the call, then the new <a class=defnlink href="#glossary_Page_cache">page-cache</a>
1169
shall only lock the database file-system representation when writing to
1173
When opening a connection to a <a class=defnlink href="#glossary_Temporary_database">temporary database</a>, the value of the
1174
BTREE_NO_READLOCK bit in the flags parameter is ignored, as <a class=defnlink href="#glossary_Temporary_database">temporary
1175
databases</a> are never locked for either reading or writing
1176
(<span class=todo>reference to some requirement for this statement.</span>).
1177
Whether or not a new <a class=defnlink href="#glossary_Page_cache">page-cache</a> is created when a connection to a
1178
<a class=defnlink href="#glossary_Persistent_database">persistent database</a> is opened is governed by requirements <a class=reqlink href=#H50040>H50040</a> and
1179
<a class=reqlink href=#H50050>H50050</a>.
1181
<p class=req id=H51007><span>If the sqlite3BtreeOpen function is called to open a connection to a <a class=defnlink href="#glossary_Persistent_database">persistent
1182
database</a>, and the call causes a new <a class=defnlink href="#glossary_Page_cache">page-cache</a> to be created, when opening the
1183
database file using the VFS interface xOpen method the 4th parameter passed to
1184
xOpen (flags) shall be a copy of the vfsFlags value passed to sqlite3BtreeOpen.</span></p>
1185
<p class=req id=H51008><span>If the sqlite3BtreeOpen function is called to open a connection to a <a class=defnlink href="#glossary_Temporary_database">temporary
1186
database</a>, if and when a temporary file is opened to use as secondary storage
1187
using the VFS interface xOpen method the 4th parameter passed to xOpen (flags)
1188
shall be a copy of the vfsFlags value passed to sqlite3BtreeOpen with the
1189
SQLITE_OPEN_READWRITE, SQLITE_OPEN_CREATE, SQLITE_OPEN_EXCLUSIVE and
1190
SQLITE_OPEN_DELETEONCLOSE bits also set.</span></p>
1193
Requirements explaining how the db parameter to sqlite3BtreeOpen is used. Must be there for something.
1195
<h3 id="section_3_1_2">3.1.2 sqlite3BtreeClose</h3>
1198
<pre class=api>int sqlite3BtreeClose(Btree*);</pre>
1200
<p class=req id=H51009><span>A call to the sqlite3BtreeClose function with a valid <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database
1201
connection</a> handle passed as the only argument shall invalidate the handle,
1202
close the <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> and release all associated resources.</span></p>
1205
If a call to sqlite3BtreeClose is made with a value that is not a valid
1206
<a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> handle passed as the only argument, the
1207
results are undefined.
1209
<p class=req id=H51010><span>If a call to sqlite3BtreeClose is made to close a <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a>
1210
while there exist open <a class=defnlink href="#glossary_B-Tree_Cursor">B-Tree cursors</a> that were opened using the specified
1211
<a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a>, they shall be closed automatically from within
1212
sqlite3BtreeClose, just as if their handles were passed to
1213
sqlite3BtreeCloseCursor.</span></p>
1216
See also requirement <a class=reqlink href=#H50070>H50070</a>.
1219
<h2 id="section_3_2">3.2 Database Image Configuration</h2>
1223
This category doesn't work all that well. These APIs are used for other
1224
things too (i.e. switching to incremental-vacuum mode).
1226
<pre class=api>#define BTREE_AUTOVACUUM_NONE 0 /* Do not do auto-vacuum */
1227
#define BTREE_AUTOVACUUM_FULL 1 /* Do full auto-vacuum */
1228
#define BTREE_AUTOVACUUM_INCR 2 /* Incremental vacuum */</pre>
1229
<pre class=api>int sqlite3BtreeSetAutoVacuum(Btree *, int);
1230
int sqlite3BtreeSetPageSize(Btree *p, int nPagesize, int nReserve, int eFix);</pre>
1235
<pre class=api>int sqlite3BtreeGetPageSize(Btree*);</pre>
1236
<pre class=api>int sqlite3BtreeGetReserve(Btree*);</pre>
1237
<pre class=api>int sqlite3BtreeGetAutoVacuum(Btree *);</pre>
1239
<h2 id="section_3_3">3.3 Connection Configuration</h2>
1242
<pre class=api>int sqlite3BtreeSetCacheSize(Btree*,int);
1243
int sqlite3BtreeSetSafetyLevel(Btree*,int,int,int);
1244
int sqlite3BtreeMaxPageCount(Btree*,int);</pre>
1247
And functions to query the current configuration:
1249
<pre class=api>int sqlite3BtreeSyncDisabled(Btree*);</pre>
1251
<h2 id="section_3_4">3.4 Query Interfaces</h2>
1254
<pre class=api>const char *sqlite3BtreeGetFilename(Btree *);</pre>
1256
<p class=req id=H51011><span>A call to the sqlite3BtreeGetFilename function with a valid <a class=defnlink href="#glossary_B-Tree_Database_Connection">B-Tree database
1257
connection</a> handle opened on a <a class=defnlink href="#glossary_Persistent_database">persistent database</a> as the first argument shall
1258
return a pointer to a buffer containing the full-path of the database file
1259
formatted as a nul-terminated, UTF-8 string.</span></p>
1260
<p class=req id=H51012><span>A call to the sqlite3BtreeGetFilename function with a valid <a class=defnlink href="#glossary_B-Tree_Database_Connection">B-Tree database
1261
connection</a> handle opened on a <a class=defnlink href="#glossary_Temporary_database">temporary database</a> as the first argument shall
1262
return a pointer to a buffer to a nul-terminated string zero bytes in length
1263
(i.e. the first byte of the buffer shall be 0x00).</span></p>
1265
<pre class=api>const char *sqlite3BtreeGetJournalname(Btree *);</pre>
1267
<p class=req id=H51013><span>A call to the sqlite3BtreeGetJournalname function with a valid <a class=defnlink href="#glossary_B-Tree_Database_Connection">B-Tree database
1268
connection</a> handle opened on a <a class=defnlink href="#glossary_Persistent_database">persistent database</a> as the first argument shall
1269
return a pointer to a buffer containing the full-path of the journal file
1270
formatted as a nul-terminated, UTF-8 string.</span></p>
1271
<p class=req id=H51014><span>A call to the sqlite3BtreeGetJournalname function with a valid <a class=defnlink href="#glossary_B-Tree_Database_Connection">B-Tree database
1272
connection</a> handle opened on a <a class=defnlink href="#glossary_Temporary_database">temporary database</a> as the first argument shall
1273
return a pointer to a buffer to a nul-terminated string zero bytes in length
1274
(i.e. the first byte of the buffer shall be 0x00).</span></p>
1277
Requirement <a class=reqlink href=#H51013>H51013</a> holds true even if the <a class=defnlink href="#glossary_B-Tree_Database_Connection">B-Tree database connection</a> is
1278
configured to use an in-memory journal file or no journal file at all
1279
(<span class=todo>ref requirements</span>). In these cases the buffer returned
1280
contains the full-path of the journal file that would be used if the
1281
connection were configured to use a journal file.
1283
<h2 id="section_3_5">3.5 Mutex Functions</h2>
1289
<pre class=api>void sqlite3BtreeEnter(Btree*);
1290
void sqlite3BtreeEnterAll(sqlite3*);
1291
void sqlite3BtreeLeave(Btree*);
1292
void sqlite3BtreeEnterCursor(BtCursor*);
1293
void sqlite3BtreeLeaveCursor(BtCursor*);
1294
void sqlite3BtreeLeaveAll(sqlite3*);
1299
<h2 id="section_3_6">3.6 Transaction and Savepoint API</h2>
1302
<pre class=api>int sqlite3BtreeBeginTrans(Btree*,int);
1303
int sqlite3BtreeCommitPhaseOne(Btree*, const char *zMaster);
1304
int sqlite3BtreeCommitPhaseTwo(Btree*, int);
1305
int sqlite3BtreeCommit(Btree*);
1306
int sqlite3BtreeRollback(Btree*,int);</pre>
1308
<pre class=api>int sqlite3BtreeBeginStmt(Btree*,int);
1309
int sqlite3BtreeSavepoint(Btree *, int, int);</pre>
1311
<pre class=api>int sqlite3BtreeIsInTrans(Btree*);
1312
int sqlite3BtreeIsInReadTrans(Btree*);
1313
int sqlite3BtreeIsInBackup(Btree*);</pre>
1316
<h2 id="section_3_7">3.7 Reading and Traversing the Database Image</h2>
1320
sqlite3BtreeMoveto is never called from outside of the b-tree layer. It
1321
could/should be removed from the API.
1324
The "bias" argument to sqlite3BtreeMovetoUnpacked is only ever true
1325
when it is called from within sqlite3BtreeInsert. This argument could/should
1326
also be removed from the API, if only to make it simpler to describe.
1328
<pre class=api>typedef struct BtCursor BtCursor;</pre>
1330
<pre class=api>int sqlite3BtreeCursor(
1331
Btree*, /* BTree containing table to open */
1332
int iTable, /* Index of root page */
1333
int wrFlag, /* 1 for writing. 0 for read-only */
1334
struct KeyInfo*, /* First argument to compare function */
1335
BtCursor *pCursor /* Space to write cursor structure */
1337
int sqlite3BtreeCursorSize(void);</pre>
1338
<pre class=api>int sqlite3BtreeCloseCursor(BtCursor*);
1339
void sqlite3BtreeClearCursor(BtCursor *);</pre>
1341
<pre class=api>int sqlite3BtreeFirst(BtCursor*, int *pRes);
1342
int sqlite3BtreeLast(BtCursor*, int *pRes);
1343
int sqlite3BtreeNext(BtCursor*, int *pRes);
1344
int sqlite3BtreePrevious(BtCursor*, int *pRes);
1345
int sqlite3BtreeEof(BtCursor*);</pre>
1347
<pre class=api>int sqlite3BtreeKeySize(BtCursor*, i64 *pSize);
1348
int sqlite3BtreeKey(BtCursor*, u32 offset, u32 amt, void*);
1349
const void *sqlite3BtreeKeyFetch(BtCursor*, int *pAmt);
1350
const void *sqlite3BtreeDataFetch(BtCursor*, int *pAmt);
1351
int sqlite3BtreeDataSize(BtCursor*, u32 *pSize);
1352
int sqlite3BtreeData(BtCursor*, u32 offset, u32 amt, void*);</pre>
1354
<pre class=api>int sqlite3BtreeCount(BtCursor *, i64 *);</pre>
1356
<h3 id="section_3_7_1">3.7.1 sqlite3BtreeMovetoUnpacked</h3>
1360
The sqlite3BtreeMovetoUnpacked function is used to
1362
<pre class=api>int sqlite3BtreeMovetoUnpacked(
1364
UnpackedRecord *pUnKey,
1371
The following requirements specify exactly how a <a class=defnlink href="#glossary_B-Tree_Cursor">b-tree cursor</a> is to be moved
1372
by a successful call to sqlite3BtreeMovetoUnpacked.
1374
<p class=req id=L50008><span>If a call is made to sqlite3BtreeMovetoUnpacked specifying a key value for
1375
which there exists an entry with a matching key value in the b-tree structure,
1376
the <a class=defnlink href="#glossary_B-Tree_Cursor">b-tree cursor</a> shall be moved to point to this entry. In this case *pRes
1377
(the value of the "int" variable pointed to by the pointer passed as the
1378
fifth parameter to sqlite3BtreeMovetoUnpacked) shall be set to 0 before
1379
returning.</span></p>
1380
<p class=req id=L50009><span>If a call is made to sqlite3BtreeMovetoUnpacked specifying a key value for
1381
which there does not exist an entry with a matching key value in the b-tree
1382
structure, the <a class=defnlink href="#glossary_B-Tree_Cursor">b-tree cursor</a> shall be moved to point to an entry located
1383
on the leaf page that would contain the requested entry, were it present.</span></p>
1384
<p class=req id=L50010><span>If the condition specified in <a class=reqlink href=#L50009>L50009</a> is met and the b-tree structure
1385
contains one or more entries (is not empty), the <a class=defnlink href="#glossary_B-Tree_Cursor">b-tree cursor</a> shall be left
1386
pointing to an entry that would lie adjacent (immediately before or after in
1387
order by key) to the requested entry on the leaf page, were it present.</span></p>
1388
<p class=req id=L50011><span>If the condition specified in <a class=reqlink href=#L50009>L50009</a> is met and the <a class=defnlink href="#glossary_B-Tree_Cursor">b-tree cursor</a> is left
1389
pointing to an entry with a smaller key than that requested, or the cursor
1390
is left pointing a no entry at all because the b-tree structure is completely
1391
empty, *pRes (the value of the "int" variable pointed to by the pointer passed
1392
as the fifth parameter to sqlite3BtreeMovetoUnpacked) shall be set to -1.
1393
Otherwise, if the <a class=defnlink href="#glossary_B-Tree_Cursor">b-tree cursor</a> is left pointing to an entry with a larger key
1394
than that requested, *pRes shall be set to 1.</span></p>
1398
Not clear how to deal with these. Define an external module to
1399
encapsulate these and define sorting order etc.? That's tricky as things are
1400
because the UnpackedRecord.flags field defines the "search mode" used
1401
by sqlite3BtreeMovetoUnpacked.
1403
<pre class=api>typedef struct KeyInfo KeyInfo;
1405
sqlite3 *db; /* The database connection */
1406
u8 enc; /* Text encoding - one of the SQLITE_UTF* values */
1407
u16 nField; /* Number of entries in aColl}; hd_resolve_one {}; hd_puts { */
1408
u8 *aSortOrder; /* Sort order for each column. May be NULL */
1409
CollSeq *aColl}; hd_resolve_one {1}; hd_puts {; /* Collating sequence for each term of the key */
1411
<pre class=api>typedef struct UnpackedRecord UnpackedRecord;
1412
struct UnpackedRecord {
1413
KeyInfo *pKeyInfo; /* Collation and sort-order information */
1414
u16 nField; /* Number of entries in apMem}; hd_resolve_one {}; hd_puts { */
1415
u8 flags; /* Boolean settings. UNPACKED_... below */
1416
i64 rowid; /* Used by UNPACKED_PREFIX_SEARCH */
1417
Mem *aMem; /* Values */
1420
<h3 id="section_3_7_2">3.7.2 sqlite3BtreeGetMeta</h3>
1424
The sqlite3BtreeGetMeta interface may be used to retrieve the current
1425
value of certain fields from the database image header.
1427
<pre class=api>void sqlite3BtreeGetMeta(Btree *pBtree, int idx, u32 *pValue);</pre>
1429
<p class=req id=H51015><span>If the first parameter is a <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> handle with an open
1430
read-only or read-write transaction, and the second parameter is an integer
1431
between 0 and 7 inclusive, and the database image consists of zero pages,
1432
a call to the sqlite3BtreeGetMeta function shall set the value of *pValue to
1433
zero.</span> (P: <a class=reqlink href=#H50109>H50109</a>)</p>
1434
<p class=req id=H51016><span>If the first parameter is a <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> handle with an open
1435
read-only or read-write transaction, and the second parameter is an integer
1436
between 0 and 7 inclusive, and the database image consists of one or more
1437
pages, a call to the sqlite3BtreeGetMeta function shall set the value of
1438
*pValue to the current value of the specified 32-bit unsigned integer in the
1439
database image database header.</span> (P: <a class=reqlink href=#H50109>H50109</a>)</p>
1442
The two requirements above imply that if sqlite3BtreeGetMeta is called with
1443
anything other than a <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> handle with an open read-only
1444
or read-write transaction as the first argument, or with anything other than
1445
an integer between 0 and 7 (inclusive) as the second, the results are undefined.
1447
<pre class=api>#define BTREE_FREE_PAGE_COUNT 0
1448
#define BTREE_SCHEMA_VERSION 1
1449
#define BTREE_FILE_FORMAT 2
1450
#define BTREE_DEFAULT_CACHE_SIZE 3
1451
#define BTREE_LARGEST_ROOT_PAGE 4
1452
#define BTREE_TEXT_ENCODING 5
1453
#define BTREE_USER_VERSION 6
1454
#define BTREE_INCR_VACUUM 7</pre>
1456
<p class=req id=H51017><span>The database header field read from the database image by a call to
1457
sqlite3BtreeGetMeta in the situation specified by <a class=reqlink href=#H51016>H51016</a> shall be the 32-bit
1458
unsigned integer header field stored at byte offset (36 + 4 * idx) of the
1459
database header, where idx is the value of the second parameter passed to
1460
sqlite3BtreeGetMeta.</span> (P: <a class=reqlink href=#H50109>H50109</a>)</p>
1465
<h2 id="section_3_8">3.8 Modifying the Database Image</h2>
1468
<h3 id="sqlite3BtreeCreateTable">3.8.1 sqlite3BtreeCreateTable</h3>
1470
<pre class=api>int sqlite3BtreeCreateTable(Btree*, int*, int flags);</pre>
1471
<pre class=api>** Every SQLite table must have either BTREE_INTKEY or BTREE_BLOBKEY set.
1475
<h3 id="sqlite3BtreeDropTable">3.8.2 sqlite3BtreeDropTable</h3>
1477
<pre class=api>int sqlite3BtreeDropTable(Btree*, int, int*);</pre>
1479
<h3 id="sqlite3BtreeClearTable">3.8.3 sqlite3BtreeClearTable</h3>
1481
<pre class=api>int sqlite3BtreeClearTable(Btree*, int, int*);</pre>
1483
<h3 id="sqlite3BtreeCursorHasMoved">3.8.4 sqlite3BtreeCursorHasMoved</h3>
1485
<pre class=api>int sqlite3BtreeCursorHasMoved(BtCursor*, int*);</pre>
1487
<h3 id="sqlite3BtreePutData">3.8.5 sqlite3BtreePutData</h3>
1489
<pre class=api>int sqlite3BtreePutData(BtCursor*, u32 offset, u32 amt, void*);</pre>
1491
<h3 id="sqlite3BtreeUpdateMeta">3.8.6 sqlite3BtreeUpdateMeta</h3>
1493
<pre class=api>int sqlite3BtreeUpdateMeta(Btree*, int idx, u32 value);</pre>
1495
<h3 id="sqlite3BtreeDelete">3.8.7 sqlite3BtreeDelete</h3>
1498
<pre class=api>int sqlite3BtreeDelete(BtCursor*);</pre>
1500
<p class=req id=L50013><span>A successful call to the sqlite3BtreeDelete function made with a read/write
1501
<a class=defnlink href="#glossary_B-Tree_Cursor">b-tree cursor</a> passed as the first argument shall remove the entry pointed to by
1502
the <a class=defnlink href="#glossary_B-Tree_Cursor">b-tree cursor</a> from the b-tree structure.</span> (P: <a class=reqlink href=#H50127>H50127</a>)</p>
1505
Effect of a delete operation on other cursors that are pointing to the
1506
deleted b-tree entry.
1509
Malloc and IO error handling. Same as for sqlite3BtreeInsert.
1511
<h3 id="section_3_8_8">3.8.8 sqlite3BtreeInsert</h3>
1514
<pre class=api>int sqlite3BtreeInsert(BtCursor*, const void *pKey, i64 nKey,
1515
const void *pData, int nData,
1516
int nZero, int bias, int seekResult);</pre>
1518
<p class=req id=L50001><span>A successful call to the sqlite3BtreeInsert function made with a read/write
1519
<a class=defnlink href="#glossary_B-Tree_Cursor">b-tree cursor</a> passed as the first argument shall insert a new entry into
1520
the b-tree structure the <a class=defnlink href="#glossary_B-Tree_Cursor">b-tree cursor</a> is open on.</span> (P: <a class=reqlink href=#H50128>H50128</a>)</p>
1523
The requirement above implies that the results of passing anything else as
1524
the first argument to sqlite3BtreeInsert, for example a read-only <a class=defnlink href="#glossary_B-Tree_Cursor">b-tree cursor</a>,
1527
<p class=req id=L50012><span>If a call to sqlite3BtreeInsert is made to insert an entry specifying a key
1528
value for which there already exists a matching key within the b-tree
1529
structure, the entry with the matching key shall be removed from the b-tree
1530
structure before the new entry is inserted.</span> (P: <a class=reqlink href=#H50128>H50128</a>)</p>
1533
In other words, the sqlite3BtreeInsert API could easily be renamed
1534
sqlite3BtreeInsertOrReplace. <span class=todo>We will probably need a module
1535
requirement for the "replace" operation.</span>
1537
<p class=req id=L50002><span>If the <a class=defnlink href="#glossary_B-Tree_Cursor">b-tree cursor</a> passed to sqlite3BtreeInsert as the first argument is
1538
open on a table b-tree, then the values passed as the second parameter (pKey)
1539
shall be ignored. The value passed as the third parameter (nKey) shall be
1540
used as the integer key for the new entry.</span> (P: <a class=reqlink href=#H50128>H50128</a>)</p>
1541
<p class=req id=L50003><span>If the <a class=defnlink href="#glossary_B-Tree_Cursor">b-tree cursor</a> passed to sqlite3BtreeInsert as the first argument is
1542
open on a table b-tree, then the database record associated with the new entry
1543
shall consist of a copy of the first nData bytes of the buffer pointed to by pData
1544
followed by nZero zero (0x00) bytes, where pData, nData and nZero are the
1545
fourth, fifth and sixth parameters passed to sqlite3BtreeInsert, respectively.</span> (P: <a class=reqlink href=#H50128>H50128</a>)</p>
1546
<p class=req id=L50004><span>If the <a class=defnlink href="#glossary_B-Tree_Cursor">b-tree cursor</a> passed to sqlite3BtreeInsert as the first argument is
1547
open on an index b-tree, then the values passed as the fourth, fifth and sixth
1548
parameters shall be ignored. The key (a database record) used by the new entry
1549
shall consist of the first nKey bytes of the buffer pointed to by pKey, where
1550
pKey and nKey are the second and third parameters passed to sqlite3BtreeInsert,
1551
respectively.</span> (P: <a class=reqlink href=#H50128>H50128</a>)</p>
1554
The following requirements describe the seventh and eighth paramaters passed
1555
to the sqlite3BtreeInsert function. Both of these are used to provide extra
1556
information used by sqlite3BtreeInsert to optimize the insert operation. They
1557
may be safely ignored by alternative b-tree implementations.
1560
There should be some rationalization for these, eventually. Some traceability
1561
from somewhere to show how the b-tree module offering these slightly esoteric
1562
interfaces is helpful to SQLite overall.
1564
<p class=req id=L50005><span>If the value passed as the seventh parameter to a call to sqlite3BtreeInsert
1565
is non-zero, sqlite3BtreeInsert shall interpret this to mean that it is likely
1566
(but not certain) that the key belonging to the new entry is larger than the
1567
largest key currently stored in the b-tree structure, and optimize accordingly.</span></p>
1568
<p class=req id=L50006><span>If the value passed as the eighth parameter to a call to sqlite3BtreeInsert
1569
is non-zero, then the B-Tree module shall interpret this to mean that the
1570
the <a class=defnlink href="#glossary_B-Tree_Cursor">b-tree cursor</a> has already been positioned by a successful call to
1571
sqlite3BtreeMovetoUnpacked specifying the same key value as is being inserted,
1572
and that sqlite3BtreeMovetoUnpacked has set the output value required by <a class=reqlink href=#L50011>L50011</a> to
1573
this value.</span></p>
1576
If a non-zero value is passed as the eighth parameter to sqlite3BtreeInsert
1577
and the <a class=defnlink href="#glossary_B-Tree_Cursor">b-tree cursor</a> has not been positioned as assumed by <a class=reqlink href=#L50006>L50006</a>, the
1578
results are undefined.
1581
Malloc and IO error handling. Maybe these should be grouped together
1582
for a whole bunch of APIs. And hook into the above via a definition of
1585
<h3 id="sqlite3BtreeIncrVacuum">3.8.9 sqlite3BtreeIncrVacuum</h3>
1587
<pre class=api>int sqlite3BtreeIncrVacuum(Btree *);</pre>
1589
<h2 id="section_3_9">3.9 Advisory B-Tree Locks</h2>
1593
This section describes the b-tree module interfaces used for acquiring
1594
and querying the advisory locks that can be placed on database image
1595
pages. The locking mechanisms described in this section are only used
1596
to arbitrate between multiple clients of the same in-memory <a class=defnlink href="#glossary_Page_cache">page-cache</a>.
1597
The locking mechanism used to control access to a file-system
1598
representation of the database when multiple in-memory <a class=defnlink href="#glossary_Page_cache">page caches</a>
1599
(possibly located in different OS processes) are open on it is
1600
described in <span class=todo>this</span>.
1603
As well as obtaining advisory locks explicitly using the
1604
sqlite3BtreeLockTable API (see below), a read-lock on page 1 of the
1605
database image is automatically obtained whenever a <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database
1606
connection</a> opens a read-only or read-write transaction (see
1607
<span class=todo>requirement number</span>). Note that this means
1608
that a write-lock on page 1 is effectively an exclusive lock on
1609
the entire <a class=defnlink href="#glossary_Page_cache">page-cache</a>, as it prevents any other connection from opening
1610
a transaction of any kind.
1612
<h3 id="section_3_9_1">3.9.1 sqlite3BtreeLockTable</h3>
1615
<pre class=api>int sqlite3BtreeLockTable(Btree *pBtree, int iTab, u8 isWriteLock);</pre>
1618
The sqlite3BtreeLockTable API allows database clients to place
1619
advisory read or write locks on a specified page of the database
1620
image. The specified page need not exist within the database image.
1621
By convention, SQLite acquires read and write locks on the root
1622
pages of table b-trees only, but this is not required to be enforced
1623
by the b-tree module. Locks may only be obtained when a database
1624
client has an open transaction. All locks are automatically released
1625
when the open transaction is concluded.
1627
<p class=req id=L50016><span>A call to sqlite3BtreeLockTable, specifying a <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> handle
1628
with an open read-only or read-write transaction as the first parameter, and
1629
zero as the third parameter, shall attempt to obtain a read-lock on the database
1630
page specified by the second parameter.</span></p>
1631
<p class=req id=L50017><span>A call to sqlite3BtreeLockTable, specifying a <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> handle
1632
with an open read-write transaction as the first parameter, and a non-zero value as
1633
the third parameter, shall attempt to obtain a write-lock on the database
1634
page specified by the second parameter.</span></p>
1637
The two requirements above imply that the results of calling
1638
sqlite3BtreeLockTable on a <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> handle that does
1639
not currently have an open transaction, or attempting to obtain
1640
a write-lock using a <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> handle that only has
1641
a read-only transaction open are undefined.
1644
<p class=req id=L50019><span>If, when attempting to obtain a read-lock as described in <a class=reqlink href=#L50016>L50016</a>, there exists
1645
another <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> connected to the same <a class=defnlink href="#glossary_Page_cache">page-cache</a> that is
1646
holding a write-lock on the same database image page, the read-lock shall not
1647
be granted and the call to sqlite3BtreeLockTable shall return SQLITE_LOCKED_SHAREDCACHE.</span></p>
1648
<p class=req id=L50020><span>If, when attempting to obtain a write-lock as described in <a class=reqlink href=#L50017>L50017</a>, there exists
1649
another <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> connected to the same <a class=defnlink href="#glossary_Page_cache">page-cache</a> that is
1650
holding a read or write-lock on the same database image page, the write-lock
1651
shall not be granted and the call to sqlite3BtreeLockTable shall return
1652
SQLITE_LOCKED_SHAREDCACHE.</span></p>
1655
Requirement <a class=reqlink href=#L50020>L50020</a> is overly conservative. Because a write-lock may
1656
only be requested if the <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> has an open read-write
1657
transaction (<a class=reqlink href=#L50017>L50017</a>), and at most a single <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a>
1658
may have such an open transaction at one time, it is not possible for
1659
a request for a write-lock to fail because another connection is holding
1660
a write-lock on the same b-tree database image page. It may, however,
1661
fail because another connection is holding a read-lock.
1664
All locks are held until the current transaction is concluded. Or, if
1665
a read-write transaction is downgraded to a read-only transaction, all
1666
currently held write-locks are changed to read-locks.
1668
<p class=req id=L50018><span>When a read-only or read-write transaction is concluded, all advisory b-tree locks
1669
held by the <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> shall be relinquished.</span></p>
1670
<p class=req id=L50021><span>When a read-write transaction is downgraded to a read-only transaction, all
1671
advisory b-tree write-locks held by the <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> shall be
1672
changed to read-locks.</span></p>
1674
<p class=todo> The requirement above should refer to some other
1675
requirement that defines when a read-write transaction is downgraded to
1676
a read-only transaction.
1678
<p class=todo> Malloc failure?
1680
<p class=todo> Read uncommitted flag. Maybe this should be handled
1681
outside of the b-tree module. Is there anything to stop connections
1682
with this flag set simply not obtaining read locks? There are assert()
1683
statements in the b-tree module that need to take this flag into account,
1684
but not actual functionality.
1686
<h3 id="section_3_9_2">3.9.2 sqlite3BtreeSchemaLocked</h3>
1688
<pre class=api>int sqlite3BtreeSchemaLocked(Btree *pBtree);</pre>
1690
<p class=req id=L50014><span>A call to the sqlite3BtreeSchemaLocked function with a valid <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree
1691
database connection</a> as the only argument shall return SQLITE_LOCKED_SHAREDCACHE
1692
if there exists another <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree database connection</a> connected to the
1693
same <a class=defnlink href="#glossary_Page_cache">page-cache</a> that currently holds a write-lock on database image
1695
<p class=req id=L50015><span>A call to the sqlite3BtreeSchemaLocked function with a valid <a class=defnlink href="#glossary_B-Tree_Database_Connection">b-tree
1696
database connection</a> as the only argument shall return SQLITE_OK if
1697
<a class=reqlink href=#H51017>H51017</a> does not apply.</span></p>
1699
<h2 id="section_3_10">3.10 What do these do?</h2>
1704
Where do the following go?
1706
<pre class=api>char *sqlite3BtreeIntegrityCheck(Btree*, int *aRoot, int nRoot, int, int*);
1707
struct Pager *sqlite3BtreePager(Btree*);
1708
int sqlite3BtreeCopyFile(Btree *, Btree *);</pre>
1710
<pre class=api>void *sqlite3BtreeSchema(Btree *, int, void(*)(void *));
1711
int sqlite3BtreeSchemaLocked(Btree *pBtree);
1712
int sqlite3BtreeLockTable(Btree *pBtree, int iTab, u8 isWriteLock);
1713
void sqlite3BtreeTripAllCursors(Btree*, int);</pre>
1716
I know what the following do, but is this mechanism ever used? Or has it been superceded by other tricks in OP_NewRowid?
1718
<pre class=api>void sqlite3BtreeSetCachedRowid(BtCursor*, sqlite3_int64);
1719
sqlite3_int64 sqlite3BtreeGetCachedRowid(BtCursor*);</pre>
1722
Should move to btreeInt.h
1724
<pre class=api>typedef struct BtShared BtShared;</pre>
1726
<h2 id="section_3_11">3.11 APIs not branded sqlite3BtreeXXX()</h2>
1730
<li> sqlite3PagerLockingMode
1731
<li> sqlite3PagerJournalMode
1732
<li> sqlite3PagerIsMemdb (vacuum and backup).
1733
<li> sqlite3PagerJournalSizeLimit
1734
<li> sqlite3PagerFile (used by sqlite3_file_control() and pragma lock_proxy_file).
1735
<li> sqlite3PagerPagecount (pragma page_count and backup).
1736
<li> Page APIs used by backup routines:
1738
<li> sqlite3PagerGet
1739
<li> sqlite3PagerWrite
1740
<li> sqlite3PagerGetData
1741
<li> sqlite3PagerGetExtra
1742
<li> sqlite3PagerUnref
1743
<li> sqlite3PagerTruncateImage
1744
<li> sqlite3PagerSync
1745
<li> sqlite3PagerFile
1746
<li> sqlite3PagerCommitPhaseOne, sqlite3PagerCommitPhaseTwo
1747
<li> sqlite3PagerBackupPtr
1752
<h1 id="section_4">4 Module Implementation</h1>
1755
<h2 id="section_4_1">4.1 Database Image Traversal</h2>
1758
<h2 id="section_4_2">4.2 Database Image Manipulation</h2>
1762
This section should describe exactly how bits and bytes are shifted
1763
around when the database image is traversed and modified. i.e. how the b-tree
1764
balancing works, deleting an internal cell from an index b-tree etc.
1766
<h3 id="section_4_2_1">4.2.1 Creating a B-Tree Structure</h3>
1768
<h3 id="section_4_2_2">4.2.2 Clearing a B-Tree Structure</h3>
1770
<h3 id="section_4_2_3">4.2.3 Deleting a B-Tree Structure</h3>
1773
<h3 id="section_4_2_4">4.2.4 Inserting, Replacing and Deleting B-Tree Entries</h3>
1777
The following two sections describe the way entries are added and removed
1778
from B-Tree structures within a database image.
1781
As one might expect, the algorithms described in the following sections
1782
involve adding and removing b-tree cells to and from b-tree node pages.
1783
The format of b-tree node pages is described in detail in
1784
<cite><a href="#ref_file_format" title="">[1]</a></cite>. This document does not describe the exact
1785
way in which content is manipulated within a page, as these details are
1786
considered not considered high-level enough to be documented outside of
1787
the SQLite source code itself. For the purposes of the descriptions in
1788
the following sections, a b-tree node page is considered to be a container
1789
for an ordered list of b-tree cells. Cells may be inserted into or removed
1790
from any position in the ordered list as required.
1793
A b-tree node page has a finite capacity. If one of the algorithms
1794
described here is required to insert a cell into a b-tree node page,
1795
and there is not enough free space within the page to accommodate the
1796
cell, it is still nominally inserted into the requested position within
1797
the node, but becomes an <a class=defnlink href="#glossary_Overflow_Cell">overflow cell</a>. <a class=defnlink href="#glossary_Overflow_Cell">Overflow cells</a> never remain so
1798
for very long. If an insert, replace or delete entry operation creates
1799
one or more <a class=defnlink href="#glossary_Overflow_Cell">overflow cells</a>, the b-tree structure is rearranged so that
1800
all cells are stored within the body of a b-tree node page before the
1801
operation is considered complete. This process of rearranging the b-tree
1802
structure is termed b-tree balancing, and is described in section
1803
<cite><a href="#btree_balancing_algorithm" title="B-Tree Balancing Algorithm">4.2.5</a></cite>.
1806
<h4 id="section_4_2_4_1">4.2.4.1 B-Tree Insert/Replace Entry</h4>
1810
This section describes the way in which new entries may be inserted
1811
into a b-tree structure, and how existing entries may be replaced. Both
1812
of these operations are accessed using the sqlite3BtreeInsert API.
1815
An insert/replace operation involves the following steps:
1818
<li> Based on the supplied key and value, and the type of b-tree being
1819
inserted into, allocate and populate any required overflow pages.
1820
<span class=todo>Should reference file-format requirements that
1821
provide the formula for doing this.</span>
1823
<li> Attempt to move the b-tree write cursor to an entry with a key
1824
that matches the new key being inserted. If a matching entry is
1825
found, then the operation is a replace. Otherwise, if the key is
1826
not found, an insert.
1829
<li> Requirements <a class=reqlink href=#L50008>L50008</a>, <a class=reqlink href=#L50009>L50009</a>, <a class=reqlink href=#L50010>L50010</a> and <a class=reqlink href=#L50011>L50011</a> apply to the cursor
1830
seek operation here. This ensures that if the search does not find
1831
an exact match, the cursor is left pointing to the leaf page that
1832
the new entry should be added into.
1834
<li> As specified by <a class=reqlink href=#L50006>L50006</a>, the cursor may already be positioned. In
1835
this case the seek operation is not required.
1838
<li> If a matching key was found in the b-tree, then it must be removed and
1839
the new entry added in its place.
1842
<li> If there are one or more overflow pages associated with the entry
1843
being replaced, they are moved to the free-list.
1844
<li> The cell corresponding to the entry being removed is removed from
1845
the b-tree node page.
1846
<li> The new cell is inserted in the position previously occupied by the
1847
cell removed in the previous step. If the page is not a leaf page,
1848
then the first four-bytes (the child-page pointer) of the old
1849
cell are copied to the first four bytes of the new cell. If the new
1850
cell is larger than the cell that it replaced, then it may become
1851
an <a class=defnlink href="#glossary_Overflow_Cell">overflow cell</a>.
1854
<li> If no matching key was found in the b-tree, then the new cell is inserted
1855
into the leaf page that the cursor was left pointing to by step 1. The
1856
new cell may become an <a class=defnlink href="#glossary_Overflow_Cell">overflow cell</a>.
1858
<li> If the new cell is now an <a class=defnlink href="#glossary_Overflow_Cell">overflow cell</a>, then the balancing algorithm
1859
(see section <cite><a href="#btree_balancing_algorithm" title="B-Tree Balancing Algorithm">4.2.5</a></cite>) is run on the
1860
overflowing b-tree node page.
1863
<h4 id="section_4_2_4_2">4.2.4.2 B-Tree Delete Entry</h4>
1867
This section describes the way in which entries may be removed from
1868
a b-tree structure, as required when the sqlite3BtreeDelete (section
1869
<cite><a href="#sqlite3BtreeDelete" title="sqlite3BtreeDelete">3.8.7</a></cite>) API is invoked. Removing an entry
1870
from a b-tree table involves the following steps:
1873
<li> All overflow pages in the overflow page chain (if any) associated
1874
with the entry must be moved to the database free-list. If the
1875
database image is an autovacuum database, the pointer-map entries
1876
that correspond to each overflow page in the chain must be updated.
1878
<li> The b-tree cell corresponding to the entry must be removed from
1879
the b-tree structure.
1883
Note about the optimization that makes it possible to move overflow pages
1884
to the free-list without reading their contents (i.e. without loading them
1888
If the b-tree entry being removed is located on a leaf page (as is always the
1889
case with table b-tree structures), then deleting an entry from a b-tree
1894
<a name="figure_delete1"></a>
1895
<object data="images/btreemodule_delete1.svg" type="image/svg+xml" width=901 height=231 style="overflow:hidden"></object>
1896
<p><i>Figure 2 - Delete from an Internal Node</i>
1900
<h3 id="btree_balancing_algorithm">4.2.5 B-Tree Balancing Algorithm</h3>
1904
<li><p>The <b>balance deeper</b> sub-algorithm is used when the root page of
1905
a b-tree is overfull. It creates a new page and copies the
1906
entire contents of the overfull root page to it. The root page
1907
is then zeroed and the new page installed as its only child.
1908
The balancing algorithm is then run on the new child page (in case
1911
<li><p>The <b>balance shallower</b> sub-algorithm is used when the root page
1912
of a b-tree has only a single child page. If possible, the data from
1913
the child page is copied into the root-page and the child page discarded.
1915
<li><p>The <b>balance quick</b> sub-algorithm is used in a very specific,
1916
but common scenario. It is used only for table b-trees, when a new entry that
1917
has a key value greater than all existing keys in the b-tree is inserted and
1918
causes the right-most leaf page of the b-tree structure to become overfull.
1920
<li><p>The <b>balance siblings</b> sub-algorithm is run when a b-tree page that
1921
is not the root-page of its b-tree structure is either overfull or underfull.
1928
<h4 id="section_4_2_5_1">4.2.5.1 Balance Deeper</h4>
1931
<li> Allocate a new page (the child-page).
1932
<li> Copy page data from root-page to child-page (including <a class=defnlink href="#glossary_Overflow_Cell">overflow cells</a>).
1933
<li> Fix pointer map entries associated with new child-page content.
1934
<li> Zero the root-page.
1935
<li> Set the right-child pointer of the root-page to point to the new child-page.
1936
<li> Set the pointer map entry for the new child page.
1937
<li> Execute the balance procedure on the new child page.
1942
<a name="figure_balance_deeper"></a>
1943
<object data="images/btreemodule_balance_deeper.svg" type="image/svg+xml" width=901 height=196 style="overflow:hidden"></object>
1944
<p><i>Figure 3 - Example Balance Deeper Transform</i>
1948
<h4 id="section_4_2_5_2">4.2.5.2 Balance Shallower</h4>
1952
<li> Copy node data from child-page to root-page.
1953
<li> Fix pointer map entries associated with new root-page content.
1954
<li> Move child-page to database image free-list.
1959
<a name="figure_balance_shallower"></a>
1960
<object data="images/btreemodule_balance_shallower.svg" type="image/svg+xml" width=901 height=196 style="overflow:hidden"></object>
1961
<p><i>Figure 4 - Example Balance Shallower Transform</i>
1965
<h4 id="section_4_2_5_3">4.2.5.3 Balance Quick</h4>
1969
<li> Allocate a new page (the new sibling-page).
1971
<li> Populate the new sibling page with the new b-tree entry.
1973
<li> Add a new divider cell to the parent. The divider cell contains a
1974
pointer to the page that is currently the right-child of the parent.
1975
The key in the new divider cell is a copy of the largest key in the
1976
page that is currently the right-child of the parent.
1978
<li> Set the right-child of the parent page to point to the new sibling page.
1980
<li> If the database is an auto-vacuum database, set the pointer map
1981
entry associated with the new sibling page. If the cell on the new
1982
sibling page contains a pointer to an overflow page, set the pointer map
1983
entry associated with the overflow page.
1985
<li> Execute the balance procedure on the parent page.
1990
<a name="figure_balance_quick"></a>
1991
<object data="images/btreemodule_balance_quick.svg" type="image/svg+xml" width=901 height=292 style="overflow:hidden"></object>
1992
<p><i>Figure 5 - Example Balance Quick Transform</i>
1996
<h4 id="balance_siblings">4.2.5.4 Balance Siblings</h4>
1999
<p class=req id=L51001><span>The <a class=defnlink href="#glossary_Balance-Siblings_Algorithm">balance-siblings algorithm</a> shall redistribute the b-tree cells currently
2000
stored on a overfull or underfull page and up to two sibling pages, adding
2001
or removing siblings as required, such that no sibling page is overfull and
2002
the minimum possible number of sibling pages is used to store the
2003
redistributed b-tree cells.</span></p>
2006
The following description describes how balance() is to be implemented. This
2007
represents (I think) the lowest level of detail that should be in this document.
2008
One skilled in the art could use this description to reimplement SQLite's
2009
<a class=defnlink href="#glossary_Balance-Siblings_Algorithm">balance-siblings algorithm</a>. We also need requirements at a higher level
2010
of detail in this section. Something to test!
2013
The <a class=defnlink href="#glossary_Balance-Siblings_Algorithm">balance-siblings algorithm</a>, as implemented by SQLite, is described as
2014
a series of steps below. <span class=todo> there are a few terms used
2015
below that need definitions/clarifications.</span>
2018
<li> Determine the set of sibling pages to redistribute the cells of, using
2019
the following rules:
2021
<li> If the parent page has three or fewer child pages, then all child
2022
pages are deemed to be sibling pages for the purposes of the <a class=defnlink href="#glossary_Balance-Siblings_Algorithm">balance-siblings
2024
<li> If the page being balanced is the left-most child of the parent
2025
page, then the three left-most child pages are used as the siblings.
2026
<li> If the page being balanced is the right-most child of the parent
2027
page, then the three right-most child pages are used as the siblings.
2028
<li> Otherwise, if none of the above three conditions are true, then the
2029
sibling pages are page being balanced and the child pages immediately
2030
to the left and right of it.
2033
<li> Determine an ordered list of cells to redistribute. There are several
2034
variations of this step, depending on the type of page being balanced.
2036
<li> If the page being balanced is a leaf page of a table b-tree,
2037
then the list of cells to redistribute is simply the concatenation
2038
of the ordered lists of cells stored on each sibling page, in order
2039
from left-most sibling to right-most.
2040
<li> If the page being balanced is a leaf page of an index b-tree, then
2041
the list of cells to redistribute is comprised of the cells on each
2042
of the sibling pages and the divider cells in the parent page that
2043
contain the pointers to each sibling page except the right-most. The
2044
list is arranged so that it contains:
2046
<li> The cells from the left-most sibling page, in order, followed by
2047
<li> the divider cell from the parent page that contains the pointer
2048
to the left-most sibling (if there is more than one sibling
2050
<li> the divider cell that contains the pointer to the second left-most
2051
sibling and the cells from the remaining sibling page (if there are three
2054
<li> If the page being balanced is an internal b-tree node, then the list of
2055
cells to redistribute is determined as described in the previous case.
2056
However, when balancing an internal node each cell is associated with
2057
the page number of a child page of one of the sibling pages. The page
2058
number associated with cells stored on a sibling page is the same as
2059
the page number stored as the first four bytes of the cell. The page
2060
number associated with a divider cell within the parent page is the page
2061
number of the right-child page of the sibling page to which the divider
2062
cell contains a pointer.
2065
<li> Determine the new cell distribution, using the following steps:
2067
<li> Assign as may cells as will fit from the start of the ordered list of
2068
cells to the left-most sibling page. Then, if any cells remain, assign
2069
one to be a divider cell, and as many as will fit to the next sibling
2070
page. Repeat until all cells have been assigned a location.
2071
<span class=todo> no divider cells for table b-tree leaf balance</span>
2073
<li> The previous step generates a distribution that is biased towards the
2074
left-hand side. The right-most sibling may even be completely
2075
empty (if the last cell in the ordered list was assigned to be a
2076
divider cell). To rectify this, cells are moved out of the second
2077
right-most sibling page and into the right-most, one at a time, until
2078
there is at least one cell in the right-most sibling page and to move
2079
another cell would mean that the right-most sibling page is more full
2080
than the next to right-most sibling page. This is repeated for the next
2081
right-most pair of sibling pages, shifting cells out of the third
2082
right-most sibling page and into the second right-most, and so on.
2083
<span class=todo> note about divider cells </span>
2086
<li> Determine the set of database pages to use as the new sibling pages.
2089
<li> If there were an equal or greater number of siblings identified
2090
in step 1 than are required by the distribution calculated in step 3,
2091
reuse as many as possible, starting with the left-most. If step 3
2092
calculated a distribution that requires more sibling pages than were
2093
identified in step 1, allocate the required extra pages using the
2094
<span class=todo>Refer to ???</span> algorithm.
2096
<li> Arrange the new sibling pages from left to right in ascending
2097
page number order. The new sibling page with the smallest page number
2098
becomes the left-most sibling page, and so forth.
2101
<li> Populate the new sibling pages.
2103
<li> Populate each new sibling page with the required set of cells. If the
2104
page being balanced is not a leaf page, then the child-page pointer
2105
field of each cell is populated with the page-number associated with
2106
the cell as part of step 2 above.
2108
<li> If the page being balanced is not a leaf page, then the right-child
2109
pointer stored in the page header of each new sibling page must also
2110
be populated. For each new sibling page except the right-most, this
2111
field is set to the page number associated with the cell that
2112
immediately follows the cells stored on the page (the cell that was
2113
assigned to be divider cell in step 3). For the right-most sibling page,
2114
the right-child pointer is set to the value that was stored in the
2115
right-child pointer of the right-most original sibling page identified
2118
<li> Populate the parent page.
2120
<li> If the page being balanced is (was) not a leaf page of a table
2121
b-tree, the cells that contained pointers to the old sibling
2122
pages are replaced by the cells designated as divider cells as part
2123
of step 3. The right-child pointer field of the first divider cell
2124
is overwritten with the page number of the first new sibling page, and
2127
<li> If the page being balanced is (was) a leaf page of a table
2128
b-tree, the cells that contained pointers to the old sibling
2129
pages are replaced by a divider cell associated with all but the
2130
right-most sibling page. The child-page number stored in each divider
2131
cell is set to the page number of the associated sibling. The integer key
2132
value stored in each divider cell is a copy of the largest integer key
2133
value stored on the associated sibling page.
2135
<li> Before balancing, the parent page contained a pointer to the right-most
2136
sibling page, either as part of a cell or as the right-child pointer
2137
stored in the page header. Either way, this value must be overwritten
2138
with the page number of the new right-most sibling page.
2142
<li> Populate pointer map entries.
2144
<li> For each sibling page that was not also an original sibling page, the
2145
associated pointer-map entry must be updated. Similarly, the pointer-map
2146
entry associated with each original sibling page that is no longer a
2147
sibling page must be updated.
2148
<li> For each cell containing an overflow pointer that has been moved from one
2149
page to another, the pointer-map entry associated with the overflow page
2151
<li> If the page being balanced is (was) not a leaf, then for each cell that
2152
has moved from one page to another the pointer-map entry associated with
2153
the cell's child page must be updated.
2154
<li> If the page being balanced is (was) not a leaf, then the pointer-map entry
2155
associated with each sibling's right-child page may need to be updated.
2159
<h3 id="section_4_2_6">4.2.6 Page Allocation and Deallocation</h3>
2163
Amongst other things, this section needs to explain our old pals the
2164
DontWrite() and DontRollback() optimizations.
2166
<h4 id="free_overflow_chain">4.2.6.1 Moving an overflow-chain to the free-list</h4>
2170
Describe how this can sometimes be done without reading the content of
2173
<h3 id="section_4_2_7">4.2.7 Incremental Vacuum Step</h3>
2179
<h2 id="section_4_3">4.3 Transactions and Savepoints</h2>
2183
Requirements surrounding how transactions are made atomic and isolated.
2184
Also how savepoints are implemented. What happens to active cursors after
2185
a rollback or savepoint-rollback.
2187
<h1 id="section_5">5 References</h1>
2190
<table id="refs" style="width:auto; margin: 1em 5ex">
2191
<tr><td style="width:5ex ; vertical-align:top" id="ref_file_format">[1]<td>
2192
SQLite Online Documentation,<u>SQLite Database File Format</u>,
2193
<a href="fileformat.html">http://www.sqlite.org/fileformat.html</a>.
2195
<tr><td style="width:5ex ; vertical-align:top" id="ref_pcache_interface">[2]<td>
2196
SQLite Online Documentation,<u>Application Defined <a class=defnlink href="#glossary_Page_cache">Page Cache</a></u>,
2197
<a href="c3ref/pcache_methods.html">http://www.sqlite.org/c3ref/pcache_methods.html</a>.
2199
<tr><td style="width:5ex ; vertical-align:top" id="ref_os_interface">[3]<td>
2200
SQLite Online Documentation,<u>OS Interface Object</u>,
2201
<a href="c3ref/vfs.html">http://www.sqlite.org/c3ref/vfs.html</a>.