
<h2>The Virtual Database Engine of SQLite</h2>

<blockquote>
This document describes the virtual machine used in SQLite version 2.8.0.
The virtual machine in SQLite version 3.0 and 3.1 is very similar in
concept but many of the opcodes have changed and the algorithms are
somewhat different. Use this document as a rough guide to the idea
behind the virtual machine in SQLite version 3, not as a reference on
how the virtual machine works.
</blockquote>

If you want to know how the SQLite library works internally,
you need to begin with a solid understanding of the Virtual Database
Engine or VDBE. The VDBE sits right in the middle of the
processing stream (see the <a href="arch.html">architecture diagram</a>)
and so it seems to touch most parts of the library. Even
parts of the code that do not directly interact with the VDBE
are usually in a supporting role. The VDBE really is the heart of
SQLite.

This article is a brief introduction to how the VDBE
works and in particular how the various VDBE instructions
(documented <a href="opcode.html">here</a>) work together
to do useful things with the database. The style is tutorial,
beginning with simple tasks and working toward solving more
complex problems. Along the way we will visit most
submodules in the SQLite library. After completing this tutorial,
you should have a pretty good understanding of how SQLite works
and will be ready to begin studying the actual source code.

<h2>Preliminaries</h2>

The VDBE implements a virtual computer that runs a program in
its virtual machine language. The goal of each program is to
interrogate or change the database. Toward this end, the machine
language that the VDBE implements is specifically designed to
search, read, and modify databases.

Each instruction of the VDBE language contains an opcode and
three operands labeled P1, P2, and P3. Operand P1 is an arbitrary
integer. P2 is a non-negative integer. P3 is a pointer to a data
structure or null-terminated string, possibly null. Only a few VDBE
instructions use all three operands. Many instructions use only
one or two operands. A significant number of instructions use
no operands at all but instead take their data and store their results
on the execution stack. The details of what each instruction
does and which operands it uses are described in the separate
<a href="opcode.html">opcode description</a> document.

A VDBE program begins
execution on instruction 0 and continues with successive instructions
until it either (1) encounters a fatal error, (2) executes a
Halt instruction, or (3) advances the program counter past the
last instruction of the program. When the VDBE completes execution,
all open database cursors are closed, all memory is freed, and
everything is popped from the stack.
So there are never any worries about memory leaks or
undeallocated resources.
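The instruction format and the three stopping conditions can be pictured with a small C sketch. This is not the SQLite source: the names <tt>ToyOp</tt> and <tt>toy_run</tt>, the three-opcode instruction set, and the fixed stack size are all assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcode numbers for this sketch; not SQLite's values. */
enum { TOY_INTEGER, TOY_ADD, TOY_HALT };
enum { TOY_STACK_MAX = 16 };

/* One instruction: an opcode plus the operands P1, P2 and P3. */
typedef struct ToyOp {
    int opcode;      /* which operation to perform */
    int p1;          /* arbitrary integer */
    int p2;          /* non-negative integer (unused by this toy set) */
    const char *p3;  /* string or structure pointer, possibly NULL */
} ToyOp;

/* Run a program from instruction 0. Returns 0 on a normal finish and
** 1 on a fatal error. *pTop receives the top stack value at the moment
** execution stops (0 if the stack is empty) and *pnStep the number of
** instructions executed. */
int toy_run(const ToyOp *aOp, int nOp, int *pTop, int *pnStep) {
    int stack[TOY_STACK_MAX];
    int sp = 0;       /* number of entries on the stack */
    int pc = 0;       /* program counter */
    int rc = 0;       /* becomes 1 on a fatal error: condition (1) */
    int halted = 0;   /* becomes 1 after a Halt: condition (2) */
    assert(aOp != NULL && pTop != NULL && pnStep != NULL);
    *pnStep = 0;
    while (!halted && rc == 0 && pc < nOp) {  /* pc >= nOp: condition (3) */
        const ToyOp *op = &aOp[pc++];
        (*pnStep)++;
        switch (op->opcode) {
            case TOY_INTEGER:   /* push P1 */
                if (sp == TOY_STACK_MAX) rc = 1;
                else stack[sp++] = op->p1;
                break;
            case TOY_ADD:       /* pop two values, push their sum */
                if (sp < 2) rc = 1;
                else { stack[sp - 2] += stack[sp - 1]; sp--; }
                break;
            case TOY_HALT:
                halted = 1;
                break;
            default:            /* unknown opcode */
                rc = 1;
                break;
        }
    }
    *pTop = (sp > 0) ? stack[sp - 1] : 0;
    /* The stack is a local array, so it is released on return. */
    return rc;
}
```

Note that the sketch needs no cleanup code on any exit path, which mirrors the point made above: the machine, not the program, is responsible for releasing resources.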

If you have done any assembly language programming or have
worked with any kind of abstract machine before, all of these
details should be familiar to you. So let's jump right in and
start looking at some code.

<h2>Inserting Records Into The Database</h2>

We begin with a problem that can be solved using a VDBE program
that is only a few instructions long. Suppose we have an SQL
table that was created like this:

<blockquote><pre>
CREATE TABLE examp(one text, two int);
</pre></blockquote>

In words, we have a database table named "examp" that has two
columns of data named "one" and "two". Now suppose we want to insert a single
record into this table, like this:

<blockquote><pre>
INSERT INTO examp VALUES('Hello, World!',99);
</pre></blockquote>

We can see the VDBE program that SQLite uses to implement this
INSERT using the sqlite command-line utility. First start
up sqlite on a new, empty database, then create the table.
Next change the output format of sqlite to a form that
is designed to work with VDBE program dumps by entering the
".explain" command.
Finally, enter the INSERT statement shown above, but precede the
INSERT with the special keyword EXPLAIN. The EXPLAIN keyword
will cause sqlite to print the VDBE program rather than
execute it. We have:

<blockquote><tt>$ sqlite test_database_1
sqlite> CREATE TABLE examp(one text, two int);
sqlite> .explain
sqlite> EXPLAIN INSERT INTO examp VALUES('Hello, World!',99);
addr  opcode        p1     p2     p3
----  ------------  -----  -----  -----------------------------------
0     Transaction   0      0
1     VerifyCookie  0      81
2     Transaction   1      0
3     Integer       0      0
4     OpenWrite     0      3      examp
5     NewRecno      0      0
6     String        0      0      Hello, World!
7     Integer       99     0      99
8     MakeRecord    2      0
9     PutIntKey     0      1
10    Close         0      0
11    Commit        0      0
12    Halt          0      0</tt></blockquote>
As you can see above, our simple insert statement is
implemented in 13 instructions (addresses 0 through 12). The first 3 and
last 2 instructions are a standard prologue and epilogue, so the real work
is done in the middle 8 instructions. There are no jumps, so the program
executes once through from top to bottom. Let's now look at each
instruction in detail.

<blockquote><tt>0     Transaction   0      0
1     VerifyCookie  0      81
2     Transaction   1      0</tt></blockquote>
The instruction <a href="opcode.html#Transaction">Transaction</a>
begins a transaction. The transaction ends when a Commit or Rollback
opcode is encountered. P1 is the index of the database file on which
the transaction is started. Index 0 is the main database file. A write
lock is obtained on the database file when a transaction is started.
No other process can read or write the file while the transaction is
underway. Starting a transaction also creates a rollback journal. A
transaction must be started before any changes can be made to the
database.

The instruction <a href="opcode.html#VerifyCookie">VerifyCookie</a>
checks cookie 0 (the database schema version) to make sure it is equal
to P2 (the value obtained when the database schema was last read).
P1 is the database number (0 for the main database). This is done to
make sure the database schema hasn't been changed by another thread, in
which case it has to be reread.

The second <a href="opcode.html#Transaction">Transaction</a>
instruction begins a transaction and starts a rollback journal for
database 1, the database used for temporary tables.

<blockquote><tt>3     Integer       0      0
4     OpenWrite     0      3      examp</tt></blockquote>
The instruction <a href="opcode.html#Integer">Integer</a> pushes
the integer value P1 (0) onto the stack. Here 0 is the number of the
database to use in the following OpenWrite instruction. If P3 is not
NULL then it is a string representation of the same integer. Afterwards
the stack looks like this:
<blockquote><table border=2><tr><td align=left>(integer) 0</td></tr></table></blockquote>
The instruction <a href="opcode.html#OpenWrite">OpenWrite</a> opens
a new read/write cursor with handle P1 (0 in this case) on table "examp",
whose root page is P2 (3, in this database file). Cursor handles can be
any non-negative integer. But the VDBE allocates cursors in an array
with the size of the array being one more than the largest cursor. So
to conserve memory, it is best to use handles beginning with zero and
working upward consecutively. Here P3 ("examp") is the name of the
table being opened, but this is unused, and only generated to make the
code easier to read. This instruction pops the database number to use
(0, the main database) from the top of the stack, so afterwards the
stack is empty again.

<blockquote><tt>5     NewRecno      0      0</tt></blockquote>
The instruction <a href="opcode.html#NewRecno">NewRecno</a> creates
a new integer record number for the table pointed to by cursor P1. The
record number is one not currently used as a key in the table. The new
record number is pushed onto the stack. Afterwards the stack looks like
this:
<blockquote><table border=2><tr><td align=left>(integer) new record key</td></tr></table></blockquote>
<blockquote><tt>6     String        0      0      Hello, World!</tt></blockquote>
The instruction <a href="opcode.html#String">String</a> pushes its
P3 operand onto the stack. Afterwards the stack looks like this:
<blockquote><table border=2><tr><td align=left>(string) "Hello, World!"</td></tr><tr><td align=left>(integer) new record key</td></tr></table></blockquote>
<blockquote><tt>7     Integer       99     0      99</tt></blockquote>
The instruction <a href="opcode.html#Integer">Integer</a> pushes
its P1 operand (99) onto the stack. Afterwards the stack looks like
this:
<blockquote><table border=2><tr><td align=left>(integer) 99</td></tr><tr><td align=left>(string) "Hello, World!"</td></tr><tr><td align=left>(integer) new record key</td></tr></table></blockquote>
<blockquote><tt>8     MakeRecord    2      0</tt></blockquote>
The instruction <a href="opcode.html#MakeRecord">MakeRecord</a> pops
the top P1 elements off the stack (2 in this case) and converts them into
the binary format used for storing records in a database file.
(See the <a href="fileformat.html">file format</a> description for
details.) The new record generated by the MakeRecord instruction is
pushed back onto the stack. Afterwards the stack looks like this:

<blockquote><table border=2><tr><td align=left>(record) "Hello, World!", 99</td></tr><tr><td align=left>(integer) new record key</td></tr></table></blockquote>
<blockquote><tt>9     PutIntKey     0      1</tt></blockquote>
The instruction <a href="opcode.html#PutIntKey">PutIntKey</a> uses
the top 2 stack entries to write an entry into the table pointed to by
cursor P1. A new entry is created if it doesn't already exist or the
data for an existing entry is overwritten. The record data is the top
stack entry, and the key is the next entry down. The stack is popped
twice by this instruction. Because operand P2 is 1 the row change count
is incremented and the rowid is stored for subsequent return by the
sqlite_last_insert_rowid() function. If P2 is 0 the row change count is
unmodified. This instruction is where the insert actually occurs.

<blockquote><tt>10    Close         0      0</tt></blockquote>
The instruction <a href="opcode.html#Close">Close</a> closes a
cursor previously opened as P1 (0, the only open cursor). If P1 is not
currently open, this instruction is a no-op.
<blockquote><tt>11    Commit        0      0</tt></blockquote>
The instruction <a href="opcode.html#Commit">Commit</a> causes all
modifications to the database that have been made since the last
Transaction to actually take effect. No additional modifications are
allowed until another transaction is started. The Commit instruction
deletes the journal file and releases the write lock on the database.
A read lock continues to be held if there are still cursors open.

The instruction <a href="opcode.html#Halt">Halt</a> causes the VDBE
engine to exit immediately. All open cursors, Lists, Sorts, etc. are
closed automatically. P1 is the result code returned by sqlite_exec().
For a normal halt, this should be SQLITE_OK (0). For errors, it can be
some other value. The operand P2 is only used when there is an error.
There is an implied "Halt 0 0 0" instruction at the end of every
program, which the VDBE appends when it prepares a program to run.
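To make the data flow of instructions 5 through 9 concrete, here is a toy model in C. Everything in it is an assumption made for illustration: the table is a small in-memory array rather than a database file, <tt>toy_make_record</tt> joins the fields as text instead of producing SQLite's binary record format, and none of the function names come from the SQLite source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { TOY_MAX_ROWS = 8, TOY_REC_MAX = 64 };

/* A toy table: integer keys mapped to text "records". */
typedef struct ToyTable {
    int nRow;
    int aKey[TOY_MAX_ROWS];
    char aRec[TOY_MAX_ROWS][TOY_REC_MAX];
} ToyTable;

/* NewRecno: pick an integer key that is not currently used. */
int toy_new_recno(const ToyTable *t) {
    int i, mx = 0;
    assert(t != NULL);
    for (i = 0; i < t->nRow; i++) {
        if (t->aKey[i] > mx) mx = t->aKey[i];
    }
    return mx + 1;
}

/* MakeRecord stand-in: combine two fields into one value. */
void toy_make_record(const char *one, int two, char *out, size_t nOut) {
    snprintf(out, nOut, "%s|%d", one, two);
}

/* PutIntKey: create the entry, or overwrite it if the key exists.
** Returns 0 on success and 1 if the toy table is full. */
int toy_put_int_key(ToyTable *t, int key, const char *rec) {
    int i;
    for (i = 0; i < t->nRow; i++) {
        if (t->aKey[i] == key) break;
    }
    if (i == t->nRow) {
        if (t->nRow == TOY_MAX_ROWS) return 1;
        t->aKey[t->nRow++] = key;
    }
    strncpy(t->aRec[i], rec, TOY_REC_MAX - 1);
    t->aRec[i][TOY_REC_MAX - 1] = '\0';
    return 0;
}

/* Instructions 5 through 9 in order. Returns the new key, or -1. */
int toy_insert(ToyTable *t, const char *one, int two) {
    char rec[TOY_REC_MAX];
    int key = toy_new_recno(t);                  /* 5: NewRecno */
    toy_make_record(one, two, rec, sizeof rec);  /* 6-8: String, Integer, MakeRecord */
    if (toy_put_int_key(t, key, rec)) return -1; /* 9: PutIntKey */
    return key;
}
```

The key is produced first and the record second, which matches the stack pictures above: when PutIntKey runs, the record is on top and the key is one entry down.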

<h2>Tracing VDBE Program Execution</h2>

If the SQLite library is compiled without the NDEBUG preprocessor
macro, then the PRAGMA <a href="pragma.html#pragma_vdbe_trace">vdbe_trace</a>
causes the VDBE to trace the execution of programs. Though this
feature was originally intended for testing and debugging, it can also
be useful in learning about how the VDBE operates.
Use "<tt>PRAGMA vdbe_trace=ON;</tt>" to turn tracing on and
"<tt>PRAGMA vdbe_trace=OFF</tt>" to turn tracing back off.
Like this:

<blockquote><tt>sqlite> PRAGMA vdbe_trace=ON;
0 Halt            0    0
sqlite> INSERT INTO examp VALUES('Hello, World!',99);
0 Transaction     0    0
1 VerifyCookie    0   81
2 Transaction     1    0
3 Integer         0    0
Stack: i:0
4 OpenWrite       0    3 examp
5 NewRecno        0    0
Stack: i:2
6 String          0    0 Hello, World!
Stack: t[Hello,.World!] i:2
7 Integer        99    0 99
Stack: si:99 t[Hello,.World!] i:2
8 MakeRecord      2    0
Stack: s[...Hello,.World!.99] i:2
9 PutIntKey       0    1
10 Close           0    0
11 Commit          0    0
12 Halt            0    0</tt></blockquote>

With tracing mode on, the VDBE prints each instruction prior
to executing it. After the instruction is executed, the top few
entries in the stack are displayed. The stack display is omitted
if the stack is empty.

On the stack display, most entries are shown with a prefix
that tells the datatype of that stack entry. Integers begin
with "<tt>i:</tt>". Floating point values begin with "<tt>r:</tt>".
(The "r" stands for "real-number".) Strings begin with either
"<tt>s:</tt>", "<tt>t:</tt>", "<tt>e:</tt>" or "<tt>z:</tt>".
The difference among the string prefixes is caused by how their
memory is allocated. The z: strings are stored in memory obtained
from malloc(). The t: strings are statically allocated.
The e: strings are ephemeral. All other strings have the s: prefix.
This doesn't make any difference to you,
the observer, but it is vitally important to the VDBE since the
z: strings need to be passed to free() when they are
popped to avoid a memory leak. Note that only the first 10
characters of string values are displayed and that binary
values (such as the result of the MakeRecord instruction) are
treated as strings. The only other datatype that can be stored
on the VDBE stack is a NULL, which is displayed without a prefix
as simply "<tt>NULL</tt>". If an integer has been placed on the
stack as both an integer and a string, its prefix is "<tt>si:</tt>".

<h2>Simple Queries</h2>

At this point, you should understand the basics of how the VDBE
writes to a database. Now let's look at how it does queries.
We will use the following simple SELECT statement as our example:

<blockquote><pre>
SELECT * FROM examp;
</pre></blockquote>

The VDBE program generated for this SQL statement is as follows:

<blockquote><tt>sqlite> EXPLAIN SELECT * FROM examp;
addr  opcode        p1     p2     p3
----  ------------  -----  -----  -----------------------------------
0     ColumnName    0      0      one
1     ColumnName    1      0      two
2     Integer       0      0
3     OpenRead      0      3      examp
4     VerifyCookie  0      81
5     Rewind        0      10
6     Column        0      0
7     Column        0      1
8     Callback      2      0
9     Next          0      6
10    Close         0      0
11    Halt          0      0</tt></blockquote>

Before we begin looking at this problem, let's briefly review
how queries work in SQLite so that we will know what we are trying
to accomplish. For each row in the result of a query,
SQLite will invoke a callback function with the following
prototype:

<blockquote><pre>
int Callback(void *pUserData, int nColumn, char *azData[], char *azColumnName[]);
</pre></blockquote>

The SQLite library supplies the VDBE with a pointer to the callback function
and the pUserData pointer. (Both the callback and the user data were
originally passed in as arguments to the sqlite_exec() API function.)
The job of the VDBE is to
come up with values for nColumn, azData[],
and azColumnName[].
nColumn is the number of columns in the results, of course.
azColumnName[] is an array of strings where each string is the name
of one of the result columns. azData[] is an array of strings holding
the actual data.
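As an illustration, a callback matching the prototype above might look like the following sketch. The <tt>RowCounter</tt> structure and the function name are hypothetical; only the parameter list comes from the prototype. The sketch assumes that an azData[] entry may be a NULL pointer for an SQL NULL and that a non-zero return value asks the library to abort the query.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical user data handed through pUserData. */
typedef struct RowCounter {
    int nRow;          /* number of rows seen so far */
    int nColumnSeen;   /* column count of the most recent row */
} RowCounter;

/* Print one result row as "name = value" lines and count it. */
int count_rows_callback(void *pUserData, int nColumn,
                        char *azData[], char *azColumnName[]) {
    RowCounter *p = (RowCounter *)pUserData;
    int i;
    assert(p != NULL);
    for (i = 0; i < nColumn; i++) {
        /* Assumption: a NULL pointer stands for an SQL NULL value. */
        printf("%s = %s\n", azColumnName[i], azData[i] ? azData[i] : "NULL");
    }
    p->nRow++;
    p->nColumnSeen = nColumn;
    return 0;   /* zero means "keep going" */
}
```

For the row inserted earlier, the VDBE would supply nColumn = 2, azColumnName = {"one", "two"} and azData = {"Hello, World!", "99"}.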

<blockquote><tt>0     ColumnName    0      0      one
1     ColumnName    1      0      two</tt></blockquote>
The first two instructions in the VDBE program for our query are
concerned with setting up values for azColumnName[].
The <a href="opcode.html#ColumnName">ColumnName</a> instructions tell
the VDBE what values to fill in for each element of the azColumnName[]
array. Every query will begin with one ColumnName instruction for each
column in the result, and there will be a matching Column instruction for
each one later in the query.

<blockquote><tt>2     Integer       0      0
3     OpenRead      0      3      examp
4     VerifyCookie  0      81</tt></blockquote>
Instructions 2 and 3 open a read cursor on the database table that is
to be queried. This works the same as the OpenWrite instruction in the
INSERT example except that the cursor is opened for reading this time
instead of for writing. Instruction 4 verifies the database schema as
in the INSERT example.

<blockquote><tt>5     Rewind        0      10</tt></blockquote>
The <a href="opcode.html#Rewind">Rewind</a> instruction initializes
a loop that iterates over the "examp" table. It rewinds the cursor P1
to the first entry in its table. This is required by the Column and
Next instructions, which use the cursor to iterate through the table.
If the table is empty, then jump to P2 (10), which is the instruction just
past the loop. If the table is not empty, fall through to the following
instruction at 6, which is the beginning of the loop body.

<blockquote><tt>6     Column        0      0
7     Column        0      1
8     Callback      2      0</tt></blockquote>
The instructions 6 through 8 form the body of the loop that will
execute once for each record in the database file.

The <a href="opcode.html#Column">Column</a> instructions at addresses 6
and 7 each take the P2-th column from the P1-th cursor and push it onto
the stack. In this example, the first Column instruction is pushing the
value for the column "one" onto the stack and the second Column
instruction is pushing the value for column "two".

The <a href="opcode.html#Callback">Callback</a> instruction at address 8
invokes the callback() function. The P1 operand to Callback becomes the
value for nColumn. The Callback instruction pops P1 values from
the stack and uses them to fill the azData[] array.

The instruction at address 9 implements the branching part of the
loop. Together with the Rewind at address 5 it forms the loop logic.
This is a key concept that you should pay close attention to.
The <a href="opcode.html#Next">Next</a> instruction advances the cursor
P1 to the next record. If the cursor advance was successful, then jump
immediately to P2 (6, the beginning of the loop body). If the cursor
was at the end, then fall through to the following instruction, which
ends the loop.
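The way Rewind and Next cooperate may be easier to see with the jump targets written out. The sketch below steps a program counter through addresses 5 to 10 of the listing over a toy cursor. The cursor type and function names are assumptions for illustration, and the Column and Callback instructions are reduced to a counter.

```c
#include <assert.h>

/* A toy cursor over a table with nRow rows. */
typedef struct ToyCursor {
    int iRow;   /* index of the current row */
    int nRow;   /* number of rows in the table */
} ToyCursor;

/* Rewind: move to the first row. Returns 1 (jump to P2) if empty. */
int toy_rewind(ToyCursor *c) {
    c->iRow = 0;
    return c->nRow == 0;
}

/* Next: advance one row. Returns 1 (jump to P2) if the advance worked. */
int toy_next(ToyCursor *c) {
    if (c->iRow + 1 < c->nRow) {
        c->iRow++;
        return 1;
    }
    return 0;
}

/* Walk addresses 5..10 of the SELECT program and return how many
** times the Callback instruction at address 8 would run. */
int toy_scan(int nRow) {
    ToyCursor cur;
    int pc = 5;
    int nCallback = 0;
    assert(nRow >= 0);
    cur.iRow = 0;
    cur.nRow = nRow;
    while (pc <= 10) {
        switch (pc) {
            case 5:  pc = toy_rewind(&cur) ? 10 : 6; break;  /* Rewind 0 10 */
            case 6:                                          /* Column 0 0 */
            case 7:  pc++; break;                            /* Column 0 1 */
            case 8:  nCallback++; pc = 9; break;             /* Callback 2 0 */
            case 9:  pc = toy_next(&cur) ? 6 : 10; break;    /* Next 0 6 */
            case 10: pc = 11; break;                         /* Close 0 0 */
            default: pc = 11; break;
        }
    }
    return nCallback;
}
```

An empty table goes straight from address 5 to address 10, so the loop body never runs. A table with three rows runs the body three times, with Next jumping back to address 6 twice and falling through once.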

<blockquote><tt>10    Close         0      0
11    Halt          0      0</tt></blockquote>
The Close instruction at the end of the program closes the
cursor that points into the table "examp". It is not really necessary
to call Close here since all cursors will be automatically closed
by the VDBE when the program halts. But we needed an instruction
for the Rewind to jump to so we might as well go ahead and have that
instruction do something useful.
The Halt instruction ends the VDBE program.

Note that the program for this SELECT query didn't contain the
Transaction and Commit instructions used in the INSERT example. Because
the SELECT is a read operation that doesn't alter the database, it
doesn't require a transaction.

<h2>A Slightly More Complex Query</h2>

The key points of the previous example were the use of the Callback
instruction to invoke the callback function, and the use of the Next
instruction to implement a loop over all records of the database file.
This example attempts to drive home those ideas by demonstrating a
slightly more complex query that involves more columns of
output, some of which are computed values, and a WHERE clause that
limits which records actually make it to the callback function.
Consider this query:

<blockquote><pre>
SELECT one, two, one || two AS 'both'
FROM examp
WHERE one LIKE 'H%'
</pre></blockquote>

This query is perhaps a bit contrived, but it does serve to
illustrate our points. The result will have three columns with
names "one", "two", and "both". The first two columns are direct
copies of the two columns in the table and the third result
column is a string formed by concatenating the first and
second columns of the table.
Finally, the
WHERE clause says that we will only choose rows for the
results where the "one" column begins with an "H".
Here is what the VDBE program looks like for this query:

<blockquote><tt>addr  opcode        p1     p2     p3
----  ------------  -----  -----  -----------------------------------
0     ColumnName    0      0      one
1     ColumnName    1      0      two
2     ColumnName    2      0      both
3     Integer       0      0
4     OpenRead      0      3      examp
5     VerifyCookie  0      81
6     Rewind        0      18
7     String        0      0      H%
8     Column        0      0
9     Function      2      0      ptr(0x7f1ac0)
10    IfNot         1      17
11    Column        0      0
12    Column        0      1
13    Column        0      0
14    Column        0      1
15    Concat        2      0
16    Callback      3      0
17    Next          0      7
18    Close         0      0
19    Halt          0      0</tt></blockquote>

Except for the WHERE clause, the structure of the program for
this example is very much like the prior example, just with an
extra column. There are now 3 columns, instead of 2 as before,
and there are three ColumnName instructions.
A cursor is opened using the OpenRead instruction, just like in the
prior example. The Rewind instruction at address 6 and the
Next at address 17 form a loop over all records of the table.
The Close instruction at the end is there to give the
Rewind instruction something to jump to when it is done. All of
this is just like in the first query demonstration.

The Callback instruction in this example has to generate
data for three result columns instead of two, but is otherwise
the same as in the first query. When the Callback instruction
is invoked, the left-most column of the result should be
the lowest in the stack and the right-most result column should
be the top of the stack. We can see the stack being set up
this way at addresses 11 through 15. The Column instructions at
11 and 12 push the values for the first two columns in the result.
The two Column instructions at 13 and 14 pull in the values needed
to compute the third result column and the Concat instruction at
15 joins them together into a single entry on the stack.

The only thing that is really new about the current example
is the WHERE clause, which is implemented by the instructions at
addresses 7 through 10. The instructions at addresses 7 and 8 push
onto the stack the literal string "H%" and the value of the "one"
column from the table.
The <a href="opcode.html#Function">Function</a> instruction at address 9
pops these two values from the stack and pushes the result of the LIKE()
function back onto the stack.
The <a href="opcode.html#IfNot">IfNot</a> instruction pops the top stack
value and causes an immediate jump forward to the Next instruction if the
top value was false (that is, if the "one" column is not like the literal
string "H%").
Taking this jump effectively skips the callback, which is the whole point
of the WHERE clause. If the result
of the comparison is true, the jump is not taken and control
falls through to the instructions below, which build the result row and
invoke the callback.

Notice how the LIKE operator is implemented. It is a user-defined
function in SQLite, so the address of its function definition is
specified in P3. The operand P1 is the number of function arguments for
it to take from the stack. In this case the LIKE() function takes 2
arguments. The arguments are taken off the stack in reverse order
(right-to-left), so the data to compare (the "one" column, pushed by
instruction 8) is the top stack element, and the next element down is
the pattern to match (pushed by instruction 7). The return value is
pushed onto the stack.
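For readers who have not met LIKE before, the following C function sketches what such a two-argument function computes. It is not the function SQLite installs at the P3 address. It is a minimal matcher written for this tutorial, assuming the usual LIKE rules: "%" matches any run of characters, "_" matches exactly one character, and letters compare without regard to case. The name <tt>toy_like</tt> and its argument order (pattern first) are assumptions.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Return 1 if zString matches the LIKE pattern zPattern, else 0. */
int toy_like(const char *zPattern, const char *zString) {
    assert(zPattern != NULL && zString != NULL);
    for (; *zPattern; zPattern++, zString++) {
        if (*zPattern == '%') {
            /* Try every possible length for the text matched by '%',
            ** including the empty string and the whole remainder. */
            do {
                if (toy_like(zPattern + 1, zString)) return 1;
            } while (*zString++);
            return 0;
        }
        if (*zString == 0) return 0;   /* pattern left over, string used up */
        if (*zPattern != '_' &&
            tolower((unsigned char)*zPattern) !=
            tolower((unsigned char)*zString)) {
            return 0;
        }
    }
    return *zString == 0;   /* match only if the string is used up too */
}
```

With this sketch, toy_like("H%", "Hello, World!") is true, so the IfNot jump at address 10 would not be taken for that row and the callback would run.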

<h2>A Template For SELECT Programs</h2>

The first two query examples illustrate a kind of template that
every SELECT program will follow. Basically, we have:

<ol>
<li>Initialize the azColumnName[] array for the callback.</li>
<li>Open a cursor into the table to be queried.</li>
<li>For each record in the table, do:
<ol>
<li>If the WHERE clause evaluates to FALSE, then skip the steps that
follow and continue to the next record.</li>
<li>Compute all columns for the current row of the result.</li>
<li>Invoke the callback function for the current row of the result.</li>
</ol>
</li>
<li>Close the cursor.</li>
</ol>
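Written as ordinary C rather than VDBE instructions, the template might look like the sketch below. The row type, the function-pointer types, and the two helper functions are all hypothetical, the table is just an array, and the callback uses const-qualified strings for convenience, unlike the prototype shown earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* A toy row of the "examp" table. */
typedef struct ToyRow { const char *one; int two; } ToyRow;

typedef int (*ToyWhere)(const ToyRow *pRow);
typedef int (*ToyCallback)(void *pUserData, int nColumn,
                           const char **azData, const char **azColumnName);

/* Run the SELECT template over an in-memory table. Returns the number
** of rows that passed the WHERE test. */
int toy_select(const ToyRow *aRow, int nRow, ToyWhere xWhere,
               ToyCallback xCallback, void *pUserData) {
    const char *azColumnName[2] = { "one", "two" };  /* set up column names */
    int i;                                           /* i plays the cursor */
    int nOut = 0;
    assert(xCallback != NULL);
    for (i = 0; i < nRow; i++) {                     /* for each record */
        char zTwo[32];
        const char *azData[2];
        if (xWhere != NULL && !xWhere(&aRow[i])) continue;  /* WHERE is false */
        snprintf(zTwo, sizeof zTwo, "%d", aRow[i].two);     /* compute columns */
        azData[0] = aRow[i].one;
        azData[1] = zTwo;
        nOut++;
        if (xCallback(pUserData, 2, azData, azColumnName)) break;  /* callback */
    }
    return nOut;   /* an array cursor needs no explicit close */
}

/* Example WHERE clause: two < 50. */
int toy_two_under_50(const ToyRow *pRow) { return pRow->two < 50; }

/* Example callback: count rows through pUserData. */
int toy_count_callback(void *pUserData, int nColumn,
                       const char **azData, const char **azColumnName) {
    (void)nColumn; (void)azData; (void)azColumnName;
    (*(int *)pUserData)++;
    return 0;
}
```

The <tt>continue</tt> statement plays the role of the jump to Next in the VDBE programs above: it skips the column computation and the callback for rows that fail the WHERE test.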

This template will be expanded considerably as we consider
additional complications such as joins, compound selects, using
indices to speed the search, sorting, and aggregate functions
with and without GROUP BY and HAVING clauses.
But the same basic ideas will continue to apply.

<h2>UPDATE And DELETE Statements</h2>

The UPDATE and DELETE statements are coded using a template
that is very similar to the SELECT statement template. The main
difference, of course, is that the end action is to modify the
database rather than invoke a callback function. Because they modify
the database, they also use transactions. Let's begin
by looking at a DELETE statement:

<blockquote><pre>
DELETE FROM examp WHERE two<50;
</pre></blockquote>

This DELETE statement will remove every record from the "examp"
table where the "two" column is less than 50.
The code generated to do this is as follows:

<blockquote><tt>addr  opcode        p1     p2     p3
----  ------------  -----  -----  -----------------------------------
0     Transaction   1      0
1     Transaction   0      0
2     VerifyCookie  0      178
3     Integer       0      0
4     OpenRead      0      3      examp
5     Rewind        0      12
6     Column        0      1
7     Integer       50     0      50
8     Ge            1      11
9     Recno         0      0
10    ListWrite     0      0
11    Next          0      6
12    Close         0      0
13    ListRewind    0      0
14    Integer       0      0
15    OpenWrite     0      3
16    ListRead      0      20
17    NotExists     0      19
18    Delete        0      1
19    Goto          0      16
20    ListReset     0      0
21    Close         0      0
22    Commit        0      0
23    Halt          0      0</tt></blockquote>

Here is what the program must do. First it has to locate all of
the records in the table "examp" that are to be deleted. This is
done using a loop very much like the loop used in the SELECT examples
above. Once all records have been located, then we can go back through
and delete them one by one. Note that we cannot delete each record
as soon as we find it. We have to locate all records first, then
go back and delete them. This is because the SQLite database
backend might change the scan order after a delete operation.
And if the scan
order changes in the middle of the scan, some records might be
visited more than once and other records might not be visited at all.

So the implementation of DELETE is really in two loops. The first loop
(instructions 5 through 11) locates the records that are to be deleted
and saves their keys onto a temporary list, and the second loop
(instructions 16 through 19) uses the key list to delete the records one
by one.
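The two-loop structure can be sketched in C as follows. The table here is an unordered array, and deletion fills the hole with the last row, an arbitrary choice made so that deleting really does disturb the scan order. The names and the fixed list size are assumptions for illustration and are not taken from the SQLite source.

```c
#include <assert.h>

enum { TOY_MAX = 16 };

/* A toy row: an integer key plus the value of column "two". */
typedef struct ToyRow { int key; int two; } ToyRow;

/* Delete every row whose "two" value is less than limit.
** Returns the number of rows deleted and updates *pnRow. */
int toy_delete_where_two_lt(ToyRow *aRow, int *pnRow, int limit) {
    int aList[TOY_MAX];   /* temporary key list */
    int nList = 0;
    int i, j, nDeleted = 0;
    assert(*pnRow >= 0 && *pnRow <= TOY_MAX);

    /* First loop: scan the table and remember the keys to delete. */
    for (i = 0; i < *pnRow; i++) {
        if (aRow[i].two >= limit) continue;   /* Ge: skip this row */
        aList[nList++] = aRow[i].key;         /* Recno, ListWrite */
    }

    /* Second loop: look up each saved key and delete its row. */
    for (i = 0; i < nList; i++) {             /* ListRead */
        for (j = 0; j < *pnRow; j++) {
            if (aRow[j].key == aList[i]) break;
        }
        if (j == *pnRow) continue;            /* NotExists: nothing to do */
        aRow[j] = aRow[*pnRow - 1];           /* Delete: reorders the table */
        (*pnRow)--;
        nDeleted++;
    }
    return nDeleted;
}
```

Because the second loop finds rows by key rather than by position, it does not matter that each deletion moves another row into the vacated slot.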

<blockquote><tt>0     Transaction   1      0
1     Transaction   0      0
2     VerifyCookie  0      178
3     Integer       0      0
4     OpenRead      0      3      examp</tt></blockquote>
Instructions 0 through 4 are as in the INSERT example. They start
transactions for the main and temporary databases, verify the database
schema for the main database, and open a read cursor on the table
"examp". Notice that the cursor is opened for reading, not writing. At
this stage of the program we are only going to be scanning the table,
not changing it. We will reopen the same table for writing later, at
instruction 15.

<blockquote><tt>5     Rewind        0      12</tt></blockquote>
As in the SELECT example, the <a href="opcode.html#Rewind">Rewind</a>
instruction rewinds the cursor to the beginning of the table, readying
it for use in the loop body.
<blockquote><tt>6     Column        0      1
7     Integer       50     0      50
8     Ge            1      11</tt></blockquote>
The WHERE clause is implemented by instructions 6 through 8.
The job of the WHERE clause is to skip the ListWrite if the WHERE
condition is false. To this end, it jumps ahead to the Next instruction
if the "two" column (extracted by the Column instruction) is
greater than or equal to 50.

As before, the Column instruction uses cursor P1 and pushes the data

714

record in column P2 (1, column "two") onto the stack. The Integer

715

instruction pushes the value 50 onto the top of the stack. After these

716

two instructions the stack looks like:

717

<blockquote><table border=2><tr><td align=left>(integer) 50</td></tr><tr><td align=left>(record) current record for column "two" </td></tr></table></blockquote>

718

The <a href="opcode.html#Ge">Ge</a> operator compares the top two

719

elements on the stack, pops them, and then branches based on the result

720

of the comparison. If the second element is >= the top element, then

721

jump to address P2 (the Next instruction at the end of the loop).

722

Because P1 is true, if either operand is NULL (and thus the result is

723

NULL) then take the jump. If we don't jump, just advance to the next

724

instruction.

725

<blockquote><tt>9     Recno         0      0
10    ListWrite     0      0</tt></blockquote>

727

The <a href="opcode.html#Recno">Recno</a> instruction pushes onto the

728

stack an integer which is the first 4 bytes of the key to the current

729

entry in a sequential scan of the table pointed to by cursor P1.

730

The <a href="opcode.html#ListWrite">ListWrite</a> instruction writes the

731

integer on the top of the stack into a temporary storage list and pops

732

the top element. This is the important work of this loop, to store the

733

keys of the records to be deleted so we can delete them in the second

734

loop. After this ListWrite instruction the stack is empty again.

735

<blockquote><tt>11    Next          0      6
12    Close         0      0</tt></blockquote>

737

The Next instruction increments the cursor to point to the next element in the table pointed to by cursor P1, and if it was successful branches to P2 (6, the beginning of the loop body). The Close instruction closes cursor P1. It doesn't affect the temporary storage list because it isn't associated with cursor P1; it is instead a global working list (which can be saved with ListPush).

743

<blockquote><tt>13    ListRewind    0      0</tt></blockquote>

744

The <a href="opcode.html#ListRewind">ListRewind</a> instruction

745

rewinds the temporary storage list to the beginning. This prepares it

746

for use in the second loop.

747

<blockquote><tt>14    Integer       0      0
15    OpenWrite     0      3</tt></blockquote>

749

As in the INSERT example, we push the database number P1 (0, the main

750

database) onto the stack and use OpenWrite to open the cursor P1 on table

751

P2 (base page 3, "examp") for modification.

752

<blockquote><tt>16    ListRead      0      20
17    NotExists     0      19
18    Delete        0      1
19    Goto          0      16</tt></blockquote>

756

This loop does the actual deleting. It is organized differently from

757

the one in the UPDATE example. The ListRead instruction plays the role

758

that the Next did in the INSERT loop, but because it jumps to P2 on

759

failure, and Next jumps on success, we put it at the start of the loop

760

instead of the end. This means that we have to put a Goto at the end of

761

the loop to jump back to the loop test at the beginning. So this

762

loop has the form of a C while(){...} loop, while the loop in the INSERT

763

example had the form of a do{...}while() loop. The Delete instruction

764

fills the role that the callback function did in the preceding examples.

765

766

The <a href="opcode.html#ListRead">ListRead</a> instruction reads an

767

element from the temporary storage list and pushes it onto the stack.

768

If this was successful, it continues to the next instruction. If this

769

fails because the list is empty, it branches to P2, which is the

770

instruction just after the loop. Afterwards the stack looks like:

771

<blockquote><table border=2><tr><td align=left>(integer) key for current record</td></tr></table></blockquote>

772

Notice the similarity between the ListRead and Next instructions.

773

Both operations work according to this rule:

774

775

776

<blockquote>
Push the next "thing" onto the stack and fall through OR jump to P2, depending on whether or not there is a next "thing" to push.
</blockquote>

779

One difference between Next and ListRead is their idea of a "thing".

780

The "things" for the Next instruction are records in a database file.

781

"Things" for ListRead are integer keys in a list. Another difference

782

is whether to jump or fall through if there is no next "thing". In this

783

case, Next falls through, and ListRead jumps. Later on, we will see

784

other looping instructions (NextIdx and SortNext) that operate using the

785

same principle.

786

787

The <a href="opcode.html#NotExists">NotExists</a> instruction pops

788

the top stack element and uses it as an integer key. If a record with

789

that key does not exist in table P1, then jump to P2. If a record does

790

exist, then fall through to the next instruction. In this case P2 takes

791

us to the Goto at the end of the loop, which jumps back to the ListRead

792

at the beginning. This could have been coded to have P2 be 16, the

793

ListRead at the start of the loop, but the SQLite parser which generated

794

this code didn't make that optimization.

795

The <a href="opcode.html#Delete">Delete</a> does the work of this loop; it deletes the record that cursor P1 is currently pointing to, which is the record that the preceding NotExists located using the key pushed by ListRead. Because P2 is true, the row change counter is incremented. The <a href="opcode.html#Goto">Goto</a> jumps back to the beginning of the loop. This is the end of the loop.

801

<blockquote><tt>20    ListReset     0      0
21    Close         0      0
22    Commit        0      0
23    Halt          0      0</tt></blockquote>

805

This block of instructions cleans up the VDBE program. Three of these instructions aren't really required, but are generated by the SQLite parser from its code templates, which are designed to handle more complicated cases.

809

The <a href="opcode.html#ListReset">ListReset</a> instruction empties

810

the temporary storage list. This list is emptied automatically when the

811

VDBE program terminates, so it isn't necessary in this case. The Close

812

instruction closes the cursor P1. Again, this is done by the VDBE

813

engine when it is finished running this program. The Commit ends the

814

current transaction successfully, and causes all changes that occurred

815

in this transaction to be saved to the database. The final Halt is also

816

unnecessary, since it is added to every VDBE program when it is

817

prepared to run.

818

819

820

UPDATE statements work very much like DELETE statements except that instead of deleting the record they replace it with a new one. Consider this example:

<blockquote><pre>
UPDATE examp SET one= '(' || one || ')' WHERE two < 50;
</pre></blockquote>

Instead of deleting records where the "two" column is less than 50, this statement just puts the "one" column in parentheses. The VDBE program to implement this statement follows:

832

<blockquote><tt>addr  opcode        p1     p2     p3
----  ------------  -----  -----  -----------------------------------
0     Transaction   1      0
1     Transaction   0      0
2     VerifyCookie  0      178
3     Integer       0      0
4     OpenRead      0      3      examp
5     Rewind        0      12
6     Column        0      1
7     Integer       50     0      50
8     Ge            1      11
9     Recno         0      0
10    ListWrite     0      0
11    Next          0      6
12    Close         0      0
13    Integer       0      0
14    OpenWrite     0      3
15    ListRewind    0      0
16    ListRead      0      28
17    Dup           0      0
18    NotExists     0      16
19    String        0      0      (
20    Column        0      0
21    Concat        2      0
22    String        0      0      )
23    Concat        2      0
24    Column        0      1
25    MakeRecord    2      0
26    PutIntKey     0      1
27    Goto          0      16
28    ListReset     0      0
29    Close         0      0
30    Commit        0      0
31    Halt          0      0</tt></blockquote>

866

This program is essentially the same as the DELETE program except that the body of the second loop has been replaced by a sequence of instructions (at addresses 17 through 26) that update the record rather than delete it. Most of this instruction sequence should already be familiar to you, but there are a couple of minor twists so we will go over it briefly. Also note that the order of some of the instructions before and after the second loop has changed. This is just the way the SQLite parser chose to output the code using a different template.

874

875

As we enter the interior of the second loop (at instruction 17) the stack contains a single integer which is the key of the record we want to modify. We are going to need to use this key twice: once to fetch the old value of the record and a second time to write back the revised record. So the first instruction is a Dup to make a duplicate of the key on the top of the stack. The Dup instruction will duplicate any element of the stack, not just the top element. You specify which element to duplicate using the P1 operand. When P1 is 0, the top of the stack is duplicated. When P1 is 1, the next element down on the stack is duplicated. And so forth.
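The effect of the P1 operand on Dup can be sketched with a Python list used as the stack, where the last list element is the top of the stack. The helper name `dup` is hypothetical; it only mirrors the behavior described above and is not SQLite code.

```python
def dup(stack, p1):
    """Push a copy of the element that sits p1 positions below the top."""
    stack.append(stack[-1 - p1])


stack = ["bottom", "middle", "top"]
dup(stack, 0)  # P1 = 0 copies the top of the stack
dup(stack, 2)  # P1 = 2 now reaches "middle", two below the new top
```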

886

887

After duplicating the key, the next instruction, NotExists,

888

pops the stack once and uses the value popped as a key to

889

check the existence of a record in the database file. If there is no record

890

for this key, it jumps back to the ListRead to get another key.

891

892

Instructions 19 through 25 construct a new database record

893

that will be used to replace the existing record. This is

894

the same kind of code that we saw

895

in the description of INSERT and will not be described further.

896

After instruction 25 executes, the stack looks like this:

897

<blockquote><table border=2><tr><td align=left>(record) new data record</td></tr><tr><td align=left>(integer) key</td></tr></table></blockquote>

898

The PutIntKey instruction (also described

899

during the discussion about INSERT) writes an entry into the

900

database file whose data is the top of the stack and whose key

901

is the next on the stack, and then pops the stack twice. The

902

PutIntKey instruction will overwrite the data of an existing record

903

with the same key, which is what we want here. Overwriting was not

904

an issue with INSERT because with INSERT the key was generated

905

by the NewRecno instruction which is guaranteed to provide a key

906

that has not been used before.
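The visible effect of the UPDATE statement can be checked from Python's standard `sqlite3` module. This is a minimal sketch: the table definition and sample rows are assumptions made for the illustration, and the module wraps SQLite version 3, whose bytecode differs from the 2.8 program shown here even though the SQL result is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema for the "examp" table used throughout this document.
conn.execute("CREATE TABLE examp(one text, two int)")
conn.executemany(
    "INSERT INTO examp VALUES(?, ?)",
    [("Hello, World!", 99), ("small", 10)],
)

# Only the row with two < 50 gets its "one" column wrapped in parentheses.
cur = conn.execute("UPDATE examp SET one = '(' || one || ')' WHERE two < 50")
changed = cur.rowcount
rows = conn.execute("SELECT one, two FROM examp ORDER BY two").fetchall()
```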

907

908

<h2>CREATE and DROP</h2>

909

910

Using CREATE or DROP to create or destroy a table or index is

911

really the same as doing an INSERT or DELETE from the special

912

"sqlite_master" table, at least from the point of view of the VDBE.

913

The sqlite_master table is a special table that is automatically

914

created for every SQLite database. It looks like this:

915

916

917

<blockquote><pre>
CREATE TABLE sqlite_master (
  type      TEXT,    -- either "table" or "index"
  name      TEXT,    -- name of this table or index
  tbl_name  TEXT,    -- for indices: name of associated table
  sql       TEXT     -- SQL text of the original CREATE statement
)
</pre></blockquote>

924

925

Every table (except the "sqlite_master" table itself)

926

and every named index in an SQLite database has an entry

927

in the sqlite_master table. You can query this table using

928

a SELECT statement just like any other table. But you are

929

not allowed to directly change the table using UPDATE, INSERT,

930

or DELETE. Changes to sqlite_master have to occur using

931

the CREATE and DROP commands because SQLite also has to update

932

some of its internal data structures when tables and indices

933

are added or destroyed.

934

935

But from the point of view of the VDBE, a CREATE works pretty much like an INSERT and a DROP works like a DELETE. When the SQLite library opens an existing database, the first thing it does is a SELECT to read the "sql" columns from all entries of the sqlite_master table. The "sql" column contains the complete SQL text of the CREATE statement that originally generated the index or table. This text is fed back into the SQLite parser and used to reconstruct the internal data structures describing the index or table.
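The sqlite_master table can be inspected with an ordinary SELECT. The following sketch uses Python's standard `sqlite3` module, which wraps SQLite version 3 (where the table also carries a rootpage column not listed above). The table and index names are assumptions chosen to match this document's examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE examp(one text, two int)")
conn.execute("CREATE INDEX examp_idx1 ON examp(two)")

# Each CREATE statement left one row behind in sqlite_master.
schema = conn.execute(
    "SELECT type, name, tbl_name, sql FROM sqlite_master ORDER BY name"
).fetchall()
```

Each row holds the original CREATE text in its last column, which is what the library re-parses when the database is opened.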

945

946

<h2>Using Indexes To Speed Searching</h2>

947

948

In the example queries above, every row of the table being

949

queried must be loaded off of the disk and examined, even if only

950

a small percentage of the rows end up in the result. This can

951

take a long time on a big table. To speed things up, SQLite

952

can use an index.

953

954

An SQLite file associates a key with some data. For an SQLite

955

table, the database file is set up so that the key is an integer

956

and the data is the information for one row of the table.

957

Indices in SQLite reverse this arrangement. The index key

958

is (some of) the information being stored and the index data

959

is an integer.

960

To access a table row that has some particular

961

content, we first look up the content in the index table to find

962

its integer index, then we use that integer to look up the

963

complete record in the table.

964

965

Note that SQLite uses b-trees, which are a sorted data structure,

966

so indices can be used when the WHERE clause of the SELECT statement

967

contains tests for equality or inequality. Queries like the following

968

can use an index if it is available:

969

970

971

<blockquote><pre>
SELECT * FROM examp WHERE two==50;
SELECT * FROM examp WHERE two<50;
SELECT * FROM examp WHERE two IN (50, 100);
</pre></blockquote>

975

976

If there exists an index that maps the "two" column of the "examp"

977

table into integers, then SQLite will use that index to find the integer

978

keys of all rows in examp that have a value of 50 for column two, or

979

all rows that are less than 50, etc.

980

But the following queries cannot use the index:

981

982

983

<blockquote><pre>
SELECT * FROM examp WHERE two%50 == 10;
SELECT * FROM examp WHERE two&127 == 3;
</pre></blockquote>
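Whether a given query uses an index can be checked with EXPLAIN QUERY PLAN. The sketch below uses Python's standard `sqlite3` module, so it reports what a current SQLite 3 planner does, which may differ from the 2.8 behavior described in this document. The schema, the index name, and the helper `plan` are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE examp(one text, two int)")
conn.execute("CREATE INDEX examp_idx1 ON examp(two)")


def plan(sql):
    """Join the planner's detail strings (the last column of each row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)


indexed = plan("SELECT * FROM examp WHERE two==50")
unindexed = plan("SELECT * FROM examp WHERE two%50 == 10")
```

The first plan names the index, while the second falls back to a full scan of the table because the expression `two%50` cannot be looked up in an index on `two`.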

986

987

Note that the SQLite parser will not always generate code to use an

988

index, even if it is possible to do so. The following queries will not

989

currently use the index:

990

991

992

<blockquote><pre>
SELECT * FROM examp WHERE two+10 == 50;
SELECT * FROM examp WHERE two==50 OR two==100;
</pre></blockquote>

995

996

To understand better how indices work, let's first look at how they are created. Let's go ahead and put an index on the two column of the examp table. We have:

<blockquote><pre>
CREATE INDEX examp_idx1 ON examp(two);
</pre></blockquote>

1003

1004

The VDBE code generated by the above statement looks like the

1005

following:

1006

<blockquote><tt>addr  opcode        p1     p2     p3
----  ------------  -----  -----  -----------------------------------
0     Transaction   1      0
1     Transaction   0      0
2     VerifyCookie  0      178
3     Integer       0      0
4     OpenWrite     0      2
5     NewRecno      0      0
6     String        0      0      index
7     String        0      0      examp_idx1
8     String        0      0      examp
9     CreateIndex   0      0      ptr(0x791380)
10    Dup           0      0
11    Integer       0      0
12    OpenWrite     1      0
13    String        0      0      CREATE INDEX examp_idx1 ON examp(tw
14    MakeRecord    5      0
15    PutIntKey     0      0
16    Integer       0      0
17    OpenRead      2      3      examp
18    Rewind        2      24
19    Recno         2      0
20    Column        2      1
21    MakeIdxKey    1      0      n
22    IdxPut        1      0      indexed columns are not unique
23    Next          2      19
24    Close         2      0
25    Close         1      0
26    Integer       333    0
27    SetCookie     0      0
28    Close         0      0
29    Commit        0      0
30    Halt          0      0</tt></blockquote>

1039

Remember that every table (except sqlite_master) and every named

1040

index has an entry in the sqlite_master table. Since we are creating

1041

a new index, we have to add a new entry to sqlite_master. This is

1042

handled by instructions 3 through 15. Adding an entry to sqlite_master

1043

works just like any other INSERT statement so we will not say any more

1044

about it here. In this example, we want to focus on populating the

1045

new index with valid data, which happens on instructions 16 through

1046

23.

1047

<blockquote><tt>16    Integer       0      0
17    OpenRead      2      3      examp</tt></blockquote>

1049

The first thing that happens is that we open the table being

1050

indexed for reading. In order to construct an index for a table,

1051

we have to know what is in that table. The index has already been

1052

opened for writing using cursor 0 by instructions 3 and 4.

1053

<blockquote><tt>18    Rewind        2      24
19    Recno         2      0
20    Column        2      1
21    MakeIdxKey    1      0      n
22    IdxPut        1      0      indexed columns are not unique
23    Next          2      19</tt></blockquote>

1059

Instructions 18 through 23 implement a loop over every row of the

1060

table being indexed. For each table row, we first extract the integer

1061

key for that row using Recno in instruction 19, then get the value of

1062

the "two" column using Column in instruction 20.

1063

The <a href="opcode.html#MakeIdxKey">MakeIdxKey</a> instruction at 21 converts data from the "two" column (which is on the top of the stack) into a valid index key. For an index on a single column, this is basically a no-op. But if the P1 operand to MakeIdxKey had been greater than one, multiple entries would have been popped from the stack and converted into a single index key.

1069

The <a href="opcode.html#IdxPut">IdxPut</a> instruction at 22 is what

1070

actually creates the index entry. IdxPut pops two elements from the

1071

stack. The top of the stack is used as a key to fetch an entry from the

1072

index table. Then the integer which was second on stack is added to the

1073

set of integers for that index and the new record is written back to the

1074

database file. Note

1075

that the same index entry can store multiple integers if there

1076

are two or more table entries with the same value for the two

1077

column.
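The index-building loop can be sketched in Python. This illustration assumes a dict stands in for the table and a sorted list of (value, key) pairs stands in for the index b-tree. The name `build_index` is hypothetical, and storing one pair per table row is a simplification of the "set of integers" per index entry described above.

```python
import bisect


def build_index(table, column):
    """Build a sorted list of (column value, integer key) pairs."""
    index = []
    for rowid, record in table.items():  # loop over every row of the table
        # Insert the pair while keeping the index sorted by value.
        bisect.insort(index, (record[column], rowid))
    return index


# Hypothetical sample data: key -> (one, two)
examp = {1: ("a", 50), 2: ("b", 10), 3: ("c", 50)}
examp_idx1 = build_index(examp, 1)
```

Rows 1 and 3 share the value 50, so their entries end up next to each other in the index.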

1078

1079

1080

Now let's look at how this index will be used. Consider the

1081

following query:

1082

1083

1084

<blockquote><pre>
SELECT * FROM examp WHERE two==50;
</pre></blockquote>

1086

1087

SQLite generates the following VDBE code to handle this query:

1088

<blockquote><tt>addr  opcode        p1     p2     p3
----  ------------  -----  -----  -----------------------------------
0     ColumnName    0      0      one
1     ColumnName    1      0      two
2     Integer       0      0
3     OpenRead      0      3      examp
4     VerifyCookie  0      256
5     Integer       0      0
6     OpenRead      1      4      examp_idx1
7     Integer       50     0      50
8     MakeKey       1      0      n
9     MemStore      0      0
10    MoveTo        1      19
11    MemLoad       0      0
12    IdxGT         1      19
13    IdxRecno      1      0
14    MoveTo        0      0
15    Column        0      0
16    Column        0      1
17    Callback      2      0
18    Next          1      11
19    Close         0      0
20    Close         1      0
21    Halt          0      0</tt></blockquote>

1112

The SELECT begins in a familiar fashion. First the column

1113

names are initialized and the table being queried is opened.

1114

Things become different beginning with instructions 5 and 6 where

1115

the index file is also opened. Instructions 7 and 8 make

1116

a key with the value of 50.

1117

The <a href="opcode.html#MemStore">MemStore</a> instruction at 9 stores

1118

the index key in VDBE memory location 0. The VDBE memory is used to

1119

avoid having to fetch a value from deep in the stack, which can be done,

1120

but makes the program harder to generate. The following instruction

1121

<a href="opcode.html#MoveTo">MoveTo</a> at address 10 pops the key off

1122

the stack and moves the index cursor to the first row of the index with

1123

that key. This initializes the cursor for use in the following loop.

1124

1125

Instructions 11 through 18 implement a loop over all index records

1126

with the key that was fetched by instruction 8. All of the index

1127

records with this key will be contiguous in the index table, so we walk

1128

through them and fetch the corresponding table key from the index.

1129

This table key is then used to move the cursor to that row in the table.

1130

The rest of the loop is the same as the loop for the non-indexed SELECT

1131

query.

1132

1133

The loop begins with the <a href="opcode.html#MemLoad">MemLoad</a> instruction at 11 which pushes a copy of the index key back onto the stack. The instruction <a href="opcode.html#IdxGT">IdxGT</a> at 12 compares the key to the key in the current index record pointed to by cursor P1. If the index key at the current cursor location is greater than the key we are looking for, then jump out of the loop.

1139

1140

The instruction <a href="opcode.html#IdxRecno">IdxRecno</a> at 13

1141

pushes onto the stack the table record number from the index. The

1142

following MoveTo pops it and moves the table cursor to that row. The

1143

next 3 instructions select the column data the same way as in the non-

1144

indexed case. The Column instructions fetch the column data and the

1145

callback function is invoked. The final Next instruction advances the

1146

index cursor, not the table cursor, to the next row, and then branches

1147

back to the start of the loop if there are any index records left.
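The shape of this indexed lookup can be sketched in Python. The sketch assumes a dict stands in for the table and a sorted list of (value, key) pairs stands in for the index. The function `indexed_select` is hypothetical, and the comments map each step loosely onto the opcodes discussed above.

```python
import bisect


def indexed_select(table, index, wanted):
    """Return the table rows whose indexed value equals `wanted`."""
    results = []
    # Like MoveTo: position on the first index entry with value >= wanted.
    pos = bisect.bisect_left(index, (wanted,))
    while pos < len(index):
        value, rowid = index[pos]
        if value > wanted:  # like IdxGT: we are past the matching entries
            break
        results.append(table[rowid])  # like IdxRecno, MoveTo and Column
        pos += 1  # like Next on the index cursor
    return results


# Hypothetical sample data: key -> (one, two)
examp = {1: ("a", 50), 2: ("b", 10), 3: ("c", 50)}
examp_idx1 = sorted((rec[1], rowid) for rowid, rec in examp.items())
rows = indexed_select(examp, examp_idx1, 50)
```

Only the index entries for the value 50 are visited; the row with value 10 is never touched.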

1148

1149

Since the index is used to look up values in the table,

1150

it is important that the index and table be kept consistent.

1151

Now that there is an index on the examp table, we will have

1152

to update that index whenever data is inserted, deleted, or

1153

changed in the examp table. Remember the first example above

1154

where we were able to insert a new row into the "examp" table using

1155

12 VDBE instructions. Now that this table is indexed, 19

1156

instructions are required. The SQL statement is this:

1157

1158

1159

<blockquote><pre>
INSERT INTO examp VALUES('Hello, World!',99);
</pre></blockquote>

1161

1162

And the generated code looks like this:

1163

<blockquote><tt>addr  opcode        p1     p2     p3
----  ------------  -----  -----  -----------------------------------
0     Transaction   1      0
1     Transaction   0      0
2     VerifyCookie  0      256
3     Integer       0      0
4     OpenWrite     0      3      examp
5     Integer       0      0
6     OpenWrite     1      4      examp_idx1
7     NewRecno      0      0
8     String        0      0      Hello, World!
9     Integer       99     0      99
10    Dup           2      1
11    Dup           1      1
12    MakeIdxKey    1      0      n
13    IdxPut        1      0
14    MakeRecord    2      0
15    PutIntKey     0      1
16    Close         0      0
17    Close         1      0
18    Commit        0      0
19    Halt          0      0</tt></blockquote>

1185

At this point, you should understand the VDBE well enough to

1186

figure out on your own how the above program works. So we will

1187

not discuss it further in this text.

1188

1189

<h2>Joins</h2>

1190

1191

In a join, two or more tables are combined to generate a single

1192

result. The result table consists of every possible combination

1193

of rows from the tables being joined. The easiest and most natural

1194

way to implement this is with nested loops.

1195

1196

Recall the query template discussed above where there was a

1197

single loop that searched through every record of the table.

1198

In a join we have basically the same thing except that there

1199

are nested loops. For example, to join two tables, the query

1200

template might look something like this:

1201

1202

1203

<ol>
<li>Initialize the azColumnName[] array for the callback.</li>
<li>Open two cursors, one to each of the two tables being queried.</li>
<li>For each record in the first table, do:
<ol>
<li>For each record in the second table do:
<ol>
<li>If the WHERE clause evaluates to FALSE, then skip the steps that follow and continue to the next record.</li>
<li>Compute all columns for the current row of the result.</li>
<li>Invoke the callback function for the current row of the result.</li>
</ol></li>
</ol></li>
<li>Close both cursors.</li>
</ol>

1218

1219

1220

This template will work, but it is likely to be slow since we are now dealing with an O(N<sup>2</sup>) loop. But it often works out that the WHERE clause can be factored into terms and that one or more of those terms will involve only columns in the first table. When this happens, we can factor part of the WHERE clause test out of the inner loop and gain a lot of efficiency. So a better template would be something like this:

1227

1228

1229

<ol>
<li>Initialize the azColumnName[] array for the callback.</li>
<li>Open two cursors, one to each of the two tables being queried.</li>
<li>For each record in the first table, do:
<ol>
<li>Evaluate terms of the WHERE clause that only involve columns from the first table. If any term is false (meaning that the whole WHERE clause must be false) then skip the rest of this loop and continue to the next record.</li>
<li>For each record in the second table do:
<ol>
<li>If the WHERE clause evaluates to FALSE, then skip the steps that follow and continue to the next record.</li>
<li>Compute all columns for the current row of the result.</li>
<li>Invoke the callback function for the current row of the result.</li>
</ol></li>
</ol></li>
<li>Close both cursors.</li>
</ol>

1248

1249

1250

Additional speed-up can occur if an index can be used to speed the search of either of the two loops.

1252

1253

SQLite always constructs the loops in the same order as the

1254

tables appear in the FROM clause of the SELECT statement. The

1255

left-most table becomes the outer loop and the right-most table

1256

becomes the inner loop. It is possible, in theory, to reorder

1257

the loops in some circumstances to speed the evaluation of the

1258

join. But SQLite does not attempt this optimization.

1259

1260

You can see how SQLite constructs nested loops in the following

1261

example:

1262

1263

1264

<blockquote><pre>
CREATE TABLE examp2(three int, four int);
SELECT * FROM examp, examp2 WHERE two<50 AND four==two;
</pre></blockquote>

1267

<blockquote><tt>addr  opcode        p1     p2     p3
----  ------------  -----  -----  -----------------------------------
0     ColumnName    0      0      examp.one
1     ColumnName    1      0      examp.two
2     ColumnName    2      0      examp2.three
3     ColumnName    3      0      examp2.four
4     Integer       0      0
5     OpenRead      0      3      examp
6     VerifyCookie  0      909
7     Integer       0      0
8     OpenRead      1      5      examp2
9     Rewind        0      24
10    Column        0      1
11    Integer       50     0      50
12    Ge            1      23
13    Rewind        1      23
14    Column        1      1
15    Column        0      1
16    Ne            1      22
17    Column        0      0
18    Column        0      1
19    Column        1      0
20    Column        1      1
21    Callback      4      0
22    Next          1      14
23    Next          0      10
24    Close         0      0
25    Close         1      0
26    Halt          0      0</tt></blockquote>

1296

The outer loop over table examp is implemented by instructions 7 through 23. The inner loop is instructions 13 through 22. Notice that the "two<50" term of the WHERE expression involves only columns from the first table and can be factored out of the inner loop. SQLite does this and implements the "two<50" test in instructions 10 through 12. The "four==two" test is implemented by instructions 14 through 16 in the inner loop.
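The same nested-loop structure, with the "two<50" test hoisted out of the inner loop, can be sketched in Python. The sketch assumes each table is a list of tuples and ignores NULL handling. The function name `join` and the sample rows are hypothetical.

```python
def join(examp, examp2):
    """Nested-loop join for: WHERE two < 50 AND four == two."""
    results = []
    for one, two in examp:  # outer loop over the first table
        if not two < 50:  # term that involves only the outer table
            continue  # skip the inner loop entirely
        for three, four in examp2:  # inner loop over the second table
            if four != two:
                continue
            results.append((one, two, three, four))  # the "callback"
    return results


rows = join([("a", 10), ("b", 99)], [(1, 10), (2, 99), (3, 10)])
```

The outer row with two = 99 never enters the inner loop, which is the efficiency gain described above.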

1303

1304

SQLite does not impose any arbitrary limits on the tables in

1305

a join. It also allows a table to be joined with itself.

1306

1307

<h2>The ORDER BY clause</h2>

1308

1309

For historical reasons, and for efficiency, all sorting is currently

1310

done in memory.

1311

1312

SQLite implements the ORDER BY clause using a special

1313

set of instructions to control an object called a sorter. In the

1314

inner-most loop of the query, where there would normally be

1315

a Callback instruction, instead a record is constructed that

1316

contains both callback parameters and a key. This record

1317

is added to the sorter (in a linked list). After the query loop

1318

finishes, the list of records is sorted and this list is walked. For

1319

each record on the list, the callback is invoked. Finally, the sorter

1320

is closed and memory is deallocated.

1321

1322

We can see the process in action in the following query:

1323

1324

1325

<blockquote><pre>
SELECT * FROM examp ORDER BY one DESC, two;
</pre></blockquote>

1327

<blockquote><tt>addr  opcode        p1     p2     p3
----  ------------  -----  -----  -----------------------------------
0     ColumnName    0      0      one
1     ColumnName    1      0      two
2     Integer       0      0
3     OpenRead      0      3      examp
4     VerifyCookie  0      909
5     Rewind        0      14
6     Column        0      0
7     Column        0      1
8     SortMakeRec   2      0
9     Column        0      0
10    Column        0      1
11    SortMakeKey   2      0      D+
12    SortPut       0      0
13    Next          0      6
14    Close         0      0
15    Sort          0      0
16    SortNext      0      19
17    SortCallback  2      0
18    Goto          0      16
19    SortReset     0      0
20    Halt          0      0</tt></blockquote>

There is only one sorter object, so there are no instructions to open
or close it. It is opened automatically when needed, and it is closed
when the VDBE program halts.

The query loop is built from instructions 5 through 13. Instructions
6 through 8 build a record that contains the azData[] values for a single
invocation of the callback. A sort key is generated by instructions
9 through 11. Instruction 12 combines the invocation record and the
sort key into a single entry and puts that entry on the sort list.

The P3 argument of instruction 11 is of particular interest. The
sort key is formed by prepending one character from P3 to each string
and concatenating all the strings. The sort comparison function will
look at this character to determine whether the sort order is
ascending or descending, and whether to sort as a string or number.
In this example, the first column should be sorted as a string
in descending order so its prefix is "D" and the second column should
be sorted numerically in ascending order so its prefix is "+". Ascending
string sorting uses "A", and descending numeric sorting uses "-".
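
As a rough illustration of the key format just described, the following
Python sketch builds a key from a P3 string such as "D+" and compares two
keys field by field. It is a simplified model and not SQLite's own
comparison routine: the helper names make_sort_key and compare_keys are
hypothetical, and the NUL terminator after each field is an assumption
made so that the concatenated key can be split apart again.

```python
from functools import cmp_to_key

def make_sort_key(values, p3):
    """Prefix each value with its P3 type character and concatenate.

    The "\\0" terminator is an assumption of this sketch; the text only
    says that the prefixed strings are concatenated.
    """
    return "".join(flag + str(v) + "\0" for flag, v in zip(p3, values))

def compare_keys(a, b):
    """Compare two keys field by field, honoring each field's prefix."""
    for fa, fb in zip(a.split("\0")[:-1], b.split("\0")[:-1]):
        flag, va, vb = fa[0], fa[1:], fb[1:]
        if flag in "+-":                  # "+" and "-" compare as numbers
            va, vb = float(va), float(vb)
        if va == vb:
            continue
        order = -1 if va < vb else 1
        return -order if flag in "D-" else order   # "D" and "-" descend
    return 0

rows = [("b", 2), ("a", 10), ("b", 1), ("a", 9)]
keys = [make_sort_key(r, "D+") for r in rows]
ordered = sorted(keys, key=cmp_to_key(compare_keys))
```

Note that the second field has to be compared as a number: compared as
text, "10" would sort before "9".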

After the query loop ends, the table being queried is closed at
instruction 14. This is done early in order to allow other processes
or threads to access that table, if desired. The list of records
that was built up inside the query loop is sorted by the instruction
at 15. Instructions 16 through 18 walk through the record list
(which is now in sorted order) and invoke the callback once for
each record. Finally, the sorter is closed at instruction 19.
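
The overall shape of the program (fill the sorter, sort it, then walk it)
can be sketched in Python as follows. This is only a model under stated
assumptions: run_order_by and emit are hypothetical names, a Python list
stands in for the sorter's linked list, and two stable sort passes stand
in for the "D+" sort key.

```python
def run_order_by(table, emit):
    """Model of: SELECT * FROM examp ORDER BY one DESC, two"""
    sorter = []                       # the single sorter object
    # Query loop (instructions 5-13): one sorter entry per table row.
    for one, two in table:
        record = (one, two)           # SortMakeRec: the callback arguments
        key = (one, two)              # SortMakeKey: the ORDER BY terms
        sorter.append((key, record))  # SortPut
    # Sort (instruction 15). Python sorts are stable, so sorting on the
    # minor term first and the major term second gives "one DESC, two".
    sorter.sort(key=lambda entry: entry[0][1])
    sorter.sort(key=lambda entry: entry[0][0], reverse=True)
    # Output loop (instructions 16-18): walk the sorted list.
    for _key, record in sorter:
        emit(record)                  # SortCallback
    sorter.clear()                    # SortReset

out = []
run_order_by([("a", 3), ("c", 2), ("b", 7), ("c", 1)], out.append)
```

The point to notice is that no callback fires inside the query loop; every
row is held in memory until the table has been scanned completely.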

<h2>Aggregate Functions And The GROUP BY and HAVING Clauses</h2>

To compute aggregate functions, the VDBE implements a special
data structure and instructions for controlling that data structure.
The data structure is an unordered set of buckets, where each bucket
has a key and one or more memory locations. Within the query
loop, the GROUP BY clause is used to construct a key and the bucket
with that key is brought into focus. A new bucket is created with
the key if one did not previously exist. Once the bucket is in
focus, the memory locations of the bucket are used to accumulate
the values of the various aggregate functions. After the query
loop terminates, each bucket is visited once to generate a
single row of the results.

An example will help to clarify this concept. Consider the
following query:

<blockquote><pre>
SELECT three, min(three+four)+avg(four)
FROM examp2
GROUP BY three;
</pre></blockquote>

The VDBE code generated for this query is as follows:
<blockquote><tt>addr  opcode        p1     p2     p3
----  ------------  -----  -----  -----------------------------------
0     ColumnName    0      0      three
1     ColumnName    1      0      min(three+four)+avg(four)
2     AggReset      0      3
3     AggInit       0      1      ptr(0x7903a0)
4     AggInit       0      2      ptr(0x790700)
5     Integer       0      0
6     OpenRead      0      5      examp2
7     VerifyCookie  0      909
8     Rewind        0      23
9     Column        0      0
10    MakeKey       1      0      n
11    AggFocus      0      14
12    Column        0      0
13    AggSet        0      0
14    Column        0      0
15    Column        0      1
16    Add           0      0
17    Integer       1      0
18    AggFunc       0      1      ptr(0x7903a0)
19    Column        0      1
20    Integer       2      0
21    AggFunc       0      1      ptr(0x790700)
22    Next          0      9
23    Close         0      0
24    AggNext       0      31
25    AggGet        0      0
26    AggGet        0      1
27    AggGet        0      2
28    Add           0      0
29    Callback      2      0
30    Goto          0      24
31    Noop          0      0
32    Halt          0      0</tt></blockquote>

The first instruction of interest is the
<a href="opcode.html#AggReset">AggReset</a> at 2.
The AggReset instruction initializes the set of buckets to be the
empty set and specifies the number of memory slots available in each
bucket as P2. In this example, each bucket will hold 3 memory slots.
It is not obvious, but if you look closely at the rest of the program
you can figure out what each of these slots is intended for.

<blockquote><table>
<tr><th>Memory Slot</th><th>Intended Use Of This Memory Slot</th></tr>
<tr><td>0</td><td>The "three" column -- the key to the bucket</td></tr>
<tr><td>1</td><td>The minimum "three+four" value</td></tr>
<tr><td>2</td><td>The sum of all "four" values. This is used to compute
"avg(four)".</td></tr>
</table></blockquote>

The query loop is implemented by instructions 8 through 22.
The aggregate key specified by the GROUP BY clause is computed
by instructions 9 and 10. Instruction 11 causes the appropriate
bucket to come into focus. If a bucket with the given key does
not already exist, a new bucket is created and control falls
through to instructions 12 and 13 which initialize the bucket.
If the bucket does already exist, then a jump is made to instruction
14. The values of aggregate functions are updated by instructions
14 through 21. Instructions 14 through 18 update memory
slot 1 to hold the next value "min(three+four)". Then the sum of the
"four" column is updated by instructions 19 through 21.

After the query loop is finished, the table "examp2" is closed at
instruction 23 so that its lock will be released and it can be
used by other threads or processes. The next step is to loop
over all aggregate buckets and output one row of the result for
each bucket. This is done by the loop at instructions 24
through 30. The AggNext instruction at 24 brings the next bucket
into focus, or jumps to the end of the loop if all buckets have
been examined already. The 3 memory slots of the bucket are fetched
in order at instructions 25 through 27, and the Add at instruction 28
combines the last two into the second column of the result.
Finally, the callback is invoked at instruction 29.

In summary then, any query with aggregate functions is implemented
by two loops. The first loop scans the input table and computes
aggregate information into buckets and the second loop scans through
all the buckets to compute the final result.
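
The two-loop structure can be modeled with a short Python sketch. It is an
illustration only: the function name aggregate and the dictionary layout
are hypothetical, and the per-bucket row count is an assumption added here
because an average cannot be derived from the sum alone (the slot table
above lists only the sum).

```python
def aggregate(rows):
    """Model of: SELECT three, min(three+four)+avg(four)
                 FROM examp2 GROUP BY three"""
    buckets = {}                              # AggReset: empty set of buckets
    # First loop: scan the table and accumulate into the bucket in focus.
    for three, four in rows:
        bucket = buckets.get(three)           # AggFocus on the GROUP BY key
        if bucket is None:                    # no such bucket yet: create it
            bucket = buckets[three] = {"key": three, "min": None,
                                       "sum": 0, "count": 0}
        total = three + four
        if bucket["min"] is None or total < bucket["min"]:
            bucket["min"] = total             # slot 1: min(three+four)
        bucket["sum"] += four                 # slot 2: running sum of "four"
        bucket["count"] += 1                  # assumed: row count for avg()
    # Second loop: visit each bucket once and emit one result row.
    result = []
    for bucket in buckets.values():
        avg = bucket["sum"] / bucket["count"]
        result.append((bucket["key"], bucket["min"] + avg))
    return result

rows = [(1, 10), (1, 20), (2, 5)]
result = aggregate(rows)
```

A Python dictionary happens to preserve insertion order, but nothing in
the sketch relies on that, which matches the description of the buckets as
an unordered set.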

The realization that an aggregate query is really two consecutive
loops makes it much easier to understand the difference between
a WHERE clause and a HAVING clause in an SQL query statement. The
WHERE clause is a restriction on the first loop and the HAVING
clause is a restriction on the second loop. You can see this
by adding both a WHERE and a HAVING clause to our example query:

<blockquote><pre>
SELECT three, min(three+four)+avg(four)
FROM examp2
WHERE three>four
GROUP BY three
HAVING avg(four)<10;
</pre></blockquote>
<blockquote><tt>addr  opcode        p1     p2     p3
----  ------------  -----  -----  -----------------------------------
0     ColumnName    0      0      three
1     ColumnName    1      0      min(three+four)+avg(four)
2     AggReset      0      3
3     AggInit       0      1      ptr(0x7903a0)
4     AggInit       0      2      ptr(0x790700)
5     Integer       0      0
6     OpenRead      0      5      examp2
7     VerifyCookie  0      909
8     Rewind        0      26
9     Column        0      0
10    Column        0      1
11    Le            1      25
12    Column        0      0
13    MakeKey       1      0      n
14    AggFocus      0      17
15    Column        0      0
16    AggSet        0      0
17    Column        0      0
18    Column        0      1
19    Add           0      0
20    Integer       1      0
21    AggFunc       0      1      ptr(0x7903a0)
22    Column        0      1
23    Integer       2      0
24    AggFunc       0      1      ptr(0x790700)
25    Next          0      9
26    Close         0      0
27    AggNext       0      37
28    AggGet        0      2
29    Integer       10     0      10
30    Ge            1      27
31    AggGet        0      0
32    AggGet        0      1
33    AggGet        0      2
34    Add           0      0
35    Callback      2      0
36    Goto          0      27
37    Noop          0      0
38    Halt          0      0</tt></blockquote>

The code generated in this last example is the same as the
previous except for the addition of two conditional jumps used
to implement the extra WHERE and HAVING clauses. The WHERE
clause is implemented by instructions 9 through 11 in the query
loop. The HAVING clause is implemented by instructions 28 through
30 in the output loop.
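
A small Python sketch makes the placement of the two tests concrete. It is
a model under assumptions, not generated code: aggregate_filtered is a
hypothetical name, and the row count kept in each bucket is an assumption
needed to turn the stored sum into an average.

```python
def aggregate_filtered(rows):
    """Model of: SELECT three, min(three+four)+avg(four) FROM examp2
                 WHERE three>four GROUP BY three HAVING avg(four)<10"""
    buckets = {}
    # First loop: the WHERE test rejects rows before they reach a bucket.
    for three, four in rows:
        if not three > four:          # WHERE (the Le jump at instruction 11)
            continue
        b = buckets.setdefault(three, {"min": None, "sum": 0, "count": 0})
        total = three + four
        b["min"] = total if b["min"] is None else min(b["min"], total)
        b["sum"] += four
        b["count"] += 1               # assumed: row count for avg()
    # Second loop: the HAVING test rejects whole buckets.
    result = []
    for three, b in buckets.items():
        avg = b["sum"] / b["count"]
        if not avg < 10:              # HAVING (the Ge jump at instruction 30)
            continue
        result.append((three, b["min"] + avg))
    return result

rows = [(5, 1), (5, 3), (2, 9), (30, 20), (30, 12)]
result = aggregate_filtered(rows)
```

In this data the row (2, 9) is dropped by WHERE and never creates a
bucket, while the bucket for 30 is built in full and only then discarded
by HAVING.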

<h2>Using SELECT Statements As Terms In An Expression</h2>

The very name "Structured Query Language" tells us that SQL should
support nested queries. And, in fact, two different kinds of nesting
are supported. Any SELECT statement that returns a single-row, single-column
result can be used as a term in an expression of another SELECT statement.
And, a SELECT statement that returns a single-column, multi-row result
can be used as the right-hand operand of the IN and NOT IN operators.
We will begin this section with an example of the first kind of nesting,
where a single-row, single-column SELECT is used as a term in an expression
of another SELECT. Here is our example:

<blockquote><pre>
SELECT * FROM examp
WHERE two!=(SELECT three FROM examp2
            WHERE four=5);
</pre></blockquote>

The way SQLite deals with this is to first run the inner SELECT
(the one against examp2) and store its result in a private memory
cell. SQLite then substitutes the value of this private memory
cell for the inner SELECT when it evaluates the outer SELECT.
The code looks like this:
<blockquote><tt>addr  opcode        p1     p2     p3
----  ------------  -----  -----  -----------------------------------
0     String        0      0
1     MemStore      0      1
2     Integer       0      0
3     OpenRead      1      5      examp2
4     VerifyCookie  0      909
5     Rewind        1      13
6     Column        1      1
7     Integer       5      0      5
8     Ne            1      12
9     Column        1      0
10    MemStore      0      1
11    Goto          0      13
12    Next          1      6
13    Close         1      0
14    ColumnName    0      0      one
15    ColumnName    1      0      two
16    Integer       0      0
17    OpenRead      0      3      examp
18    Rewind        0      26
19    Column        0      1
20    MemLoad       0      0
21    Eq            1      25
22    Column        0      0
23    Column        0      1
24    Callback      2      0
25    Next          0      19
26    Close         0      0
27    Halt          0      0</tt></blockquote>

The private memory cell is initialized to NULL by the first
two instructions. Instructions 2 through 13 implement the inner
SELECT statement against the examp2 table. Notice that instead of
sending the result to a callback or storing the result on a sorter,
the result of the query is pushed into the memory cell by instruction
10 and the loop is abandoned by the jump at instruction 11.
The jump at instruction 11 is vestigial and never executes.

The outer SELECT is implemented by instructions 14 through 25.
In particular, the WHERE clause that contains the nested select
is implemented by instructions 19 through 21. You can see that
the result of the inner select is loaded onto the stack by instruction
20 and used by the conditional jump at 21.
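
The memory-cell technique can be modeled in a few lines of Python. The
sketch is illustrative only: select_with_scalar_subquery is a hypothetical
name, None stands in for NULL, and the NULL handling follows ordinary SQL
comparison rules (a comparison with NULL is never true) rather than being
a transcription of the Eq opcode.

```python
def select_with_scalar_subquery(examp, examp2):
    """Model of: SELECT * FROM examp
                 WHERE two!=(SELECT three FROM examp2 WHERE four=5)"""
    mem = None                            # private memory cell, starts NULL
    # Inner SELECT runs first and leaves its result in the memory cell.
    for three, four in examp2:
        if four == 5:
            mem = three                   # MemStore at instruction 10
            break                         # leave the loop after the first hit
    # Outer SELECT reads the memory cell in place of the sub-select.
    result = []
    for one, two in examp:
        # Assumed SQL semantics: a comparison involving NULL is not true.
        if mem is not None and two is not None and two != mem:
            result.append((one, two))     # Callback
    return result

examp = [("a", 1), ("b", 2), ("c", None)]
examp2 = [(1, 4), (2, 5), (9, 5)]
result = select_with_scalar_subquery(examp, examp2)
```

Only the first matching row of the inner query is used, so the second row
with four=5 has no effect on the result.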

When the result of a sub-select is a scalar, a single private memory
cell can be used, as shown in the previous example. But when the
result of a sub-select is a vector, such as when the sub-select is
the right-hand operand of IN or NOT IN, a different approach is
needed. In this case, the result of the sub-select is stored in a
transient table and the contents of that table are tested using the
Found or NotFound operators. Consider this example:

<blockquote><pre>
SELECT * FROM examp
WHERE two IN (SELECT three FROM examp2);
</pre></blockquote>

The code generated to implement this last query is as follows:
<blockquote><tt>addr  opcode        p1     p2     p3
----  ------------  -----  -----  -----------------------------------
0     OpenTemp      1      1
1     Integer       0      0
2     OpenRead      2      5      examp2
3     VerifyCookie  0      909
4     Rewind        2      10
5     Column        2      0
6     IsNull        -1     9
7     String        0      0
8     PutStrKey     1      0
9     Next          2      5
10    Close         2      0
11    ColumnName    0      0      one
12    ColumnName    1      0      two
13    Integer       0      0
14    OpenRead      0      3      examp
15    Rewind        0      25
16    Column        0      1
17    NotNull       -1     20
18    Pop           1      0
19    Goto          0      24
20    NotFound      1      24
21    Column        0      0
22    Column        0      1
23    Callback      2      0
24    Next          0      16
25    Close         0      0
26    Halt          0      0</tt></blockquote>

The transient table in which the results of the inner SELECT are
stored is created by the <a href="opcode.html#OpenTemp">OpenTemp</a>
instruction at 0. This opcode is used for tables that exist for the
duration of a single SQL statement only. The transient cursor is always
opened read/write even if the main database is read-only. The transient
table is deleted automatically when the cursor is closed. The P2 value
of 1 means the cursor points to a BTree index, which has no data but can
have an arbitrary key.

The inner SELECT statement is implemented by instructions 1 through 10.
All this code does is make an entry in the temporary table for each
row of the examp2 table with a non-NULL value for the "three" column.
The key for each temporary table entry is the "three" column of examp2
and the data is an empty string since it is never used.

The outer SELECT is implemented by instructions 11 through 25. In
particular, the WHERE clause containing the IN operator is implemented
by instructions at 16, 17, and 20. Instruction 16 pushes the value of
the "two" column for the current row onto the stack and instruction 17
checks to see that it is non-NULL. If this is successful, execution
jumps to 20, where it tests to see if the top of the stack matches any key
in the temporary table. The rest of the code is the same as what has
been shown before.
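
The same flow can be sketched in Python, with a set playing the part of
the key-only transient table. This is a model, not SQLite code:
select_with_in_subquery is a hypothetical name and None stands in for
NULL.

```python
def select_with_in_subquery(examp, examp2):
    """Model of: SELECT * FROM examp
                 WHERE two IN (SELECT three FROM examp2)"""
    temp = set()                          # OpenTemp: transient key-only table
    # Inner SELECT: store every non-NULL "three" value as a key.
    for three, _four in examp2:
        if three is not None:             # IsNull skips NULL values
            temp.add(three)               # PutStrKey with empty data
    # Outer SELECT: probe the transient table for each row.
    result = []
    for one, two in examp:
        if two is None:                   # NotNull fails: Pop, skip the row
            continue
        if two in temp:                   # NotFound falls through on a match
            result.append((one, two))     # Callback
    return result

examp = [("a", 1), ("b", 2), ("c", None)]
examp2 = [(2, 0), (None, 0), (3, 0)]
result = select_with_in_subquery(examp, examp2)
```

Because only the keys matter, a duplicate value in examp2 simply lands on
the entry that is already present.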

<h2>Compound SELECT Statements</h2>

SQLite also allows two or more SELECT statements to be joined as
peers using operators UNION, UNION ALL, INTERSECT, and EXCEPT. These
compound select statements are implemented using transient tables.
The implementation is slightly different for each operator, but the
basic ideas are the same. For an example we will use the EXCEPT
operator.

<blockquote><pre>
SELECT two FROM examp
EXCEPT
SELECT four FROM examp2;
</pre></blockquote>

The result of this last example should be every unique value
of the "two" column in the examp table, except that any value that
also appears in the "four" column of examp2 is removed. The code to
implement this query is as follows:
<blockquote><tt>addr  opcode        p1     p2     p3
----  ------------  -----  -----  -----------------------------------
0     OpenTemp      0      1
1     KeyAsData     0      1
2     Integer       0      0
3     OpenRead      1      3      examp
4     VerifyCookie  0      909
5     Rewind        1      11
6     Column        1      1
7     MakeRecord    1      0
8     String        0      0
9     PutStrKey     0      0
10    Next          1      6
11    Close         1      0
12    Integer       0      0
13    OpenRead      2      5      examp2
14    Rewind        2      20
15    Column        2      1
16    MakeRecord    1      0
17    NotFound      0      19
18    Delete        0      0
19    Next          2      15
20    Close         2      0
21    ColumnName    0      0      four
22    Rewind        0      26
23    Column        0      0
24    Callback      1      0
25    Next          0      23
26    Close         0      0
27    Halt          0      0</tt></blockquote>

The transient table in which the result is built is created by
instruction 0. Three loops then follow. The loop at instructions
5 through 10 implements the first SELECT statement. The second
SELECT statement is implemented by the loop at instructions 14 through
19. Finally, a loop at instructions 22 through 25 reads the transient
table and invokes the callback once for each row in the result.

Instruction 1 is of particular importance in this example. Normally,
the Column instruction extracts the value of a column from a larger
record in the data of an SQLite file entry. Instruction 1 sets a flag on
the transient table so that Column will instead treat the key of the
SQLite file entry as if it were data and extract column information from
the key.

Here is what is going to happen: The first SELECT statement
will construct rows of the result and save each row as the key of
an entry in the transient table. The data for each entry in the
transient table is never used so we fill it in with an empty string.
The second SELECT statement also constructs rows, but the rows
constructed by the second SELECT are removed from the transient table.
That is why we want the rows to be stored in the key of the SQLite file
instead of in the data -- so they can be easily located and deleted.

Let's look more closely at what is happening here. The first
SELECT is implemented by the loop at instructions 5 through 10.
Instruction 5 initializes the loop by rewinding its cursor.
Instruction 6 extracts the value of the "two" column from "examp"
and instruction 7 converts this into a row. Instruction 8 pushes
an empty string onto the stack. Finally, instruction 9 writes the
row into the temporary table. But remember, the PutStrKey opcode uses
the top of the stack as the record data and the next on stack as the
key. For an INSERT statement, the row generated by the
MakeRecord opcode is the record data and the record key is an integer
created by the NewRecno opcode. But here the roles are reversed and
the row created by MakeRecord is the record key and the record data is
just an empty string.

The second SELECT is implemented by instructions 14 through 19.
Instruction 14 initializes the loop by rewinding its cursor.
A new result row is created from the "four" column of table "examp2"
by instructions 15 and 16. But instead of using PutStrKey to write this
new row into the temporary table, we instead call Delete to remove
it from the temporary table if it exists.

The result of the compound select is sent to the callback routine
by the loop at instructions 22 through 25. There is nothing new
or remarkable about this loop, except for the fact that the Column
instruction at 23 will be extracting a column out of the record key
rather than the record data.
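
The three loops can be modeled in Python with a dictionary standing in
for the transient table. This is a sketch under assumptions, not the
generated program: select_except is a hypothetical name, and the order in
which a real transient table returns its keys is not modeled here.

```python
def select_except(examp, examp2):
    """Model of: SELECT two FROM examp EXCEPT SELECT four FROM examp2"""
    temp = {}                             # OpenTemp: key -> data
    # First loop: each result row becomes the KEY of a transient entry.
    for _one, two in examp:
        row = (two,)                      # MakeRecord
        temp[row] = ""                    # PutStrKey: data is an empty string
    # Second loop: rows of the second SELECT are removed if present.
    for _three, four in examp2:
        row = (four,)
        if row in temp:                   # NotFound skips missing keys
            del temp[row]                 # Delete
    # Third loop: read each row back out of its key (KeyAsData).
    return [key[0] for key in temp]

examp = [("a", 1), ("b", 2), ("c", 2), ("d", 3)]
examp2 = [(0, 2), (0, 7)]
result = select_except(examp, examp2)
```

Storing the row as the key does two jobs at once: duplicate rows from the
first SELECT collapse into a single entry, and the rows of the second
SELECT can be found and deleted by a direct lookup.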

<h2>Summary</h2>

This article has reviewed all of the major techniques used by
SQLite's VDBE to implement SQL statements. What has not been shown
is that most of these techniques can be used in combination to
generate code for an appropriately complex query statement. For
example, we have shown how sorting is accomplished on a simple query
and we have shown how to implement a compound query. But we did
not give an example of sorting in a compound query. This is because
sorting a compound query does not introduce any new concepts: it
merely combines two previous ideas (sorting and compounding)
in the same VDBE program.

For additional information on how the SQLite library
functions, the reader is directed to look at the SQLite source
code directly. If you understand the material in this article,
you should not have much difficulty in following the sources.
Serious students of the internals of SQLite will probably
also want to make a careful study of the VDBE opcodes
as documented <a href="opcode.html">here</a>. Most of the
opcode documentation is extracted from comments in the source
code using a script so you can also get information about the
various opcodes directly from the vdbe.c source file.
If you have successfully read this far, you should have little
difficulty understanding the rest.

If you find errors in either the documentation or the code,
feel free to fix them and/or contact the author at
<a href="mailto:drh@hwaci.com">drh@hwaci.com</a>. Your bug fixes or
suggestions are always welcome.
