1
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
5
>Tables and Indexes</TITLE
8
CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK
10
HREF="mailto:pgsql-docs@postgresql.org"><LINK
12
TITLE="PostgreSQL 9.1beta1 Documentation"
13
HREF="index.html"><LINK
15
TITLE="Full Text Search"
16
HREF="textsearch.html"><LINK
19
HREF="textsearch-intro.html"><LINK
21
TITLE="Controlling Text Search"
22
HREF="textsearch-controls.html"><LINK
25
HREF="stylesheet.css"><META
26
HTTP-EQUIV="Content-Type"
27
CONTENT="text/html; charset=ISO-8859-1"><META
29
CONTENT="2011-04-27T21:20:33"></HEAD
35
SUMMARY="Header navigation table"
47
>PostgreSQL 9.1beta1 Documentation</A
57
HREF="textsearch-intro.html"
66
TITLE="Full Text Search"
67
HREF="textsearch.html"
74
>Chapter 12. Full Text Search</TD
80
TITLE="Full Text Search"
81
HREF="textsearch.html"
89
TITLE="Controlling Text Search"
90
HREF="textsearch-controls.html"
104
NAME="TEXTSEARCH-TABLES"
105
>12.2. Tables and Indexes</A
108
> The examples in the previous section illustrated full text matching using
109
simple constant strings. This section shows how to search table data,
110
optionally using indexes.
117
NAME="TEXTSEARCH-TABLES-SEARCH"
118
>12.2.1. Searching a Table</A
121
> It is possible to do a full text search without an index. A simple query
125
> of each row that contains the word
135
CLASS="PROGRAMLISTING"
138
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'friend');</PRE
141
This will also find related words such as <TT
148
>, since all these are reduced to the same
152
> The query above specifies that the <TT
156
is to be used to parse and normalize the strings. Alternatively we
157
could omit the configuration parameters:
160
CLASS="PROGRAMLISTING"
163
WHERE to_tsvector(body) @@ to_tsquery('friend');</PRE
166
This query will use the configuration set by <A
167
HREF="runtime-config-client.html#GUC-DEFAULT-TEXT-SEARCH-CONFIG"
168
>default_text_search_config</A
172
> A more complex example is to
173
select the ten most recent documents that contain <TT
189
CLASS="PROGRAMLISTING"
192
WHERE to_tsvector(title || ' ' || body) @@ to_tsquery('create & table')
193
ORDER BY last_mod_date DESC
197
For clarity we omitted the <CODE
201
which would be needed to find rows that contain <TT
205
in one of the two fields.
208
> Although these queries will work without an index, most applications
209
will find this approach too slow, except perhaps for occasional ad-hoc
210
searches. Practical use of text searching usually requires creating
219
NAME="TEXTSEARCH-TABLES-INDEX"
220
>12.2.2. Creating Indexes</A
223
> We can create a <ACRONYM
227
HREF="textsearch-indexes.html"
229
>) to speed up text searches:
232
CLASS="PROGRAMLISTING"
233
>CREATE INDEX pgweb_idx ON pgweb USING gin(to_tsvector('english', body));</PRE
236
Notice that the 2-argument version of <CODE
240
used. Only text search functions that specify a configuration name can
241
be used in expression indexes (<A
242
HREF="indexes-expressional.html"
245
This is because the index contents must be unaffected by <A
246
HREF="runtime-config-client.html#GUC-DEFAULT-TEXT-SEARCH-CONFIG"
247
>default_text_search_config</A
248
>. If they were affected, the
249
index contents might be inconsistent because different entries could
253
>s that were created with different text search
254
configurations, and there would be no way to guess which was which. It
255
would be impossible to dump and restore such an index correctly.
258
> Because the two-argument version of <CODE
262
used in the index above, only a query reference that uses the 2-argument
266
> with the same configuration
267
name will use that index. That is, <TT
270
to_tsvector('english', body) @@ 'a & b'</TT
274
>WHERE to_tsvector(body) @@ 'a & b'</TT
276
This ensures that an index will be used only with the same configuration
277
used to create the index entries.
280
> It is possible to set up more complex expression indexes wherein the
281
configuration name is specified by another column, e.g.:
284
CLASS="PROGRAMLISTING"
285
>CREATE INDEX pgweb_idx ON pgweb USING gin(to_tsvector(config_name, body));</PRE
291
> is a column in the <TT
295
table. This allows mixed configurations in the same index while
296
recording which configuration was used for each index entry. This
297
would be useful, for example, if the document collection contained
298
documents in different languages. Again,
299
queries that are meant to use the index must be phrased to match, e.g.,
302
>WHERE to_tsvector(config_name, body) @@ 'a & b'</TT
306
> Indexes can even concatenate columns:
309
CLASS="PROGRAMLISTING"
310
>CREATE INDEX pgweb_idx ON pgweb USING gin(to_tsvector('english', title || ' ' || body));</PRE
314
> Another approach is to create a separate <TT
318
to hold the output of <CODE
332
> to ensure that one field will still be
333
indexed when the other is <TT
339
CLASS="PROGRAMLISTING"
340
>ALTER TABLE pgweb ADD COLUMN textsearchable_index_col tsvector;
341
UPDATE pgweb SET textsearchable_index_col =
342
to_tsvector('english', coalesce(title,'') || ' ' || coalesce(body,''));</PRE
345
Then we create a <ACRONYM
348
> index to speed up the search:
351
CLASS="PROGRAMLISTING"
352
>CREATE INDEX textsearch_idx ON pgweb USING gin(textsearchable_index_col);</PRE
355
Now we are ready to perform a fast full text search:
358
CLASS="PROGRAMLISTING"
361
WHERE textsearchable_index_col @@ to_tsquery('create & table')
362
ORDER BY last_mod_date DESC
367
> When using a separate column to store the <TT
372
it is necessary to create a trigger to keep the <TT
376
column current anytime <TT
384
HREF="textsearch-features.html#TEXTSEARCH-UPDATE-TRIGGERS"
386
> explains how to do that.
389
> One advantage of the separate-column approach over an expression index
390
is that it is not necessary to explicitly specify the text search
391
configuration in queries in order to make use of the index. As shown
392
in the example above, the query can depend on
395
>default_text_search_config</TT
396
>. Another advantage is that
397
searches will be faster, since it will not be necessary to redo the
401
> calls to verify index matches. (This is more
402
important when using a GiST index than a GIN index; see <A
403
HREF="textsearch-indexes.html"
405
>.) The expression-index approach is
406
simpler to set up, however, and it requires less disk space since the
410
> representation is not stored explicitly.
419
SUMMARY="Footer navigation table"
430
HREF="textsearch-intro.html"
448
HREF="textsearch-controls.html"
464
HREF="textsearch.html"
472
>Controlling Text Search</TD
b'\\ No newline at end of file'