
<replaceable class="parameter">colname</replaceable> integer NOT NULL DEFAULT nextval('<replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq')

);
ALTER SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceable class="parameter">colname</replaceable>_seq OWNED BY <replaceable class="parameter">tablename</replaceable>.<replaceable class="parameter">colname</replaceable>;
</programlisting>

Thus, we have created an integer column and arranged for its default
values to be assigned from a sequence generator.  A <literal>NOT NULL</>
constraint is applied to ensure that a null value cannot be explicitly
inserted, either.  (In most cases you would also want to attach a
<literal>UNIQUE</> or <literal>PRIMARY KEY</> constraint to prevent
duplicate values from being inserted by accident, but this is
not automatic.)  Lastly, the sequence is marked as <quote>owned by</>
the column, so that it will be dropped if the column or table is dropped.
</para>

<note>
<para>
Prior to <productname>PostgreSQL</productname> 7.3, <type>serial</type>
implied <literal>UNIQUE</literal>.  This is no longer automatic.  If
you want a serial column to have a unique constraint or be a
primary key, it must now be specified, just as with
any other data type.
</para>
</note>

<para>
To insert the next value of the sequence into the <type>serial</type>
column, specify that the <type>serial</type>
column should be assigned its default value.  This can be done
either by excluding the column from the list of columns in
the <command>INSERT</command> statement, or through the use of
the <literal>DEFAULT</literal> key word.
</para>
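
<para>
As a brief illustration of the two approaches, the following sketch
assumes a hypothetical table <literal>orders</literal> that is not part
of the examples above:
<programlisting>
CREATE TABLE orders (
    id   serial PRIMARY KEY,   -- uniqueness must be requested explicitly
    item text NOT NULL
);

-- Omit the serial column so that its default (the next sequence value) is used
INSERT INTO orders (item) VALUES ('widget');

-- Or name the column and request the default explicitly
INSERT INTO orders (id, item) VALUES (DEFAULT, 'gadget');
</programlisting>
Both statements take the value of <literal>id</literal> from the sequence
that was created implicitly for the column.
</para>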

<para>
The type names <type>serial</type> and <type>serial4</type> are
equivalent: both create <type>integer</type> columns.  The type
names <type>bigserial</type> and <type>serial8</type> work just
the same way, except that they create a <type>bigint</type>
column.  <type>bigserial</type> should be used if you anticipate
the use of more than 2<superscript>31</> identifiers over the
lifetime of the table.
</para>

<para>
The sequence created for a <type>serial</type> column is
automatically dropped when the owning column is dropped.
You can drop the sequence without dropping the column, but this
will force removal of the column default expression.
</para>

</sect2>
</sect1>

<title>Monetary Types</title>

<para>
The <type>money</type> type stores a currency amount with a fixed
fractional precision; see <xref
linkend="datatype-money-table">.
Input is accepted in a variety of formats, including integer and
floating-point literals, as well as <quote>typical</quote>
currency formatting, such as <literal>'$1,000.00'</literal>.
Output is generally in the latter form but depends on the locale.
Non-quoted numeric values can be converted to <type>money</type> by
casting the numeric value to <type>text</type> and then
<type>money</type>:
<programlisting>
SELECT 1234::text::money;
</programlisting>
There is no simple way of doing the reverse in a locale-independent
manner, namely casting a <type>money</type> value to a numeric type.
If you know the currency symbol and thousands separator you can use
<function>regexp_replace()</>:
<programlisting>
SELECT regexp_replace('52093.89'::money::text, '[$,]', '', 'g')::numeric;
</programlisting>

</para>

<para>
Since the output of this data type is locale-sensitive, loading
<type>money</> data into a database that has a different
setting of <varname>lc_monetary</> might not work.  To avoid problems, before
restoring a dump make sure <varname>lc_monetary</> has the same or
equivalent value as in the database that was dumped.
</para>
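
<para>
For instance, the setting can be inspected on both systems before
restoring.  This is only a sketch; the locale name shown is an assumption
and must exist on the operating system:
<programlisting>
-- Run in the database that was dumped and again in the target database
SHOW lc_monetary;

-- If the values differ, align the target session before loading the data
SET lc_monetary TO 'en_US.UTF-8';   -- hypothetical locale name
</programlisting>
</para>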

<title>Monetary Types</title>

<thead>
<row>
<entry>Storage Size</entry>
<entry>Description</entry>
<entry>Range</entry>
</row>
</thead>
<tbody>
<row>
<entry>money</entry>
<entry>8 bytes</entry>
<entry>currency amount</entry>
</row>
</tbody>
</tgroup>
</table>
</sect1>

<title>Character Types</title>

<primary>character string</primary>
<secondary>data types</secondary>
</indexterm>

<primary>string</primary>
<see>character string</see>
</indexterm>

<primary>character</primary>
</indexterm>

<primary>character varying</primary>
</indexterm>

</indexterm>

</indexterm>

<primary>varchar</primary>
</indexterm>

<title>Character Types</title>

<thead>
<row>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><type>character varying(<replaceable>n</>)</type>, <type>varchar(<replaceable>n</>)</type></entry>
<entry>variable-length with limit</entry>
</row>
<row>
<entry><type>character(<replaceable>n</>)</type>, <type>char(<replaceable>n</>)</type></entry>
<entry>fixed-length, blank padded</entry>
</row>
<row>
<entry>variable unlimited length</entry>
</row>
</tbody>
</tgroup>
</table>

<para>
<xref linkend="datatype-character-table"> shows the
general-purpose character types available in
<productname>PostgreSQL</productname>.
</para>

<para>
<acronym>SQL</acronym> defines two primary character types:
<type>character varying(<replaceable>n</>)</type> and
<type>character(<replaceable>n</>)</type>, where <replaceable>n</>
is a positive integer.  Both of these types can store strings up to
<replaceable>n</> characters in length.  An attempt to store a
longer string into a column of these types will result in an
error, unless the excess characters are all spaces, in which case
the string will be truncated to the maximum length.  (This somewhat
bizarre exception is required by the <acronym>SQL</acronym>
standard.)  If the string to be stored is shorter than the declared
length, values of type <type>character</type> will be space-padded;
values of type <type>character varying</type> will simply store the
shorter string.
</para>

<para>
If one explicitly casts a value to <type>character
varying(<replaceable>n</>)</type> or
<type>character(<replaceable>n</>)</type>, then an over-length
value will be truncated to <replaceable>n</> characters without
raising an error.  (This too is required by the
<acronym>SQL</acronym> standard.)
</para>

<para>
The notations <type>varchar(<replaceable>n</>)</type> and
<type>char(<replaceable>n</>)</type> are aliases for <type>character
varying(<replaceable>n</>)</type> and
<type>character(<replaceable>n</>)</type>, respectively.
<type>character</type> without length specifier is equivalent to
<type>character(1)</type>.  If <type>character varying</type> is used
without length specifier, the type accepts strings of any size.  The
latter is a <productname>PostgreSQL</> extension.
</para>

<para>
In addition, <productname>PostgreSQL</productname> provides the
<type>text</type> type, which stores strings of any length.
Although the type <type>text</type> is not in the
<acronym>SQL</acronym> standard, several other SQL database
management systems have it as well.
</para>

<para>
Values of type <type>character</type> are physically padded
with spaces to the specified width <replaceable>n</>, and are
stored and displayed that way.  However, the padding spaces are
treated as semantically insignificant.  Trailing spaces are
disregarded when comparing two values of type <type>character</type>,
and they will be removed when converting a <type>character</type> value
to one of the other string types.  Note that trailing spaces
<emphasis>are</> semantically significant in
<type>character varying</type> and <type>text</type> values.
</para>
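
<para>
The difference can be observed directly.  The following queries are a
small sketch that uses literal values only; no table is assumed:
<programlisting>
-- Trailing spaces are insignificant when comparing character values
SELECT 'a  '::character(3) = 'a'::character(3) AS char_equal;               -- true

-- Trailing spaces are significant for character varying and text
SELECT 'a  '::character varying = 'a'::character varying AS varchar_equal;  -- false
SELECT 'a  '::text = 'a'::text AS text_equal;                               -- false

-- Converting character to text removes the padding
SELECT char_length('a'::character(3)::text) AS length_after_cast;           -- 1
</programlisting>
</para>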

<para>
The storage requirement for a short string (up to 126 bytes) is 1 byte
plus the actual string, which includes the space padding in the case of
<type>character</type>.  Longer strings have 4 bytes of overhead instead
of 1.  Long strings are compressed by the system automatically, so
the physical requirement on disk might be less.  Very long values are also
stored in background tables so that they do not interfere with rapid
access to shorter column values.  In any case, the longest
possible character string that can be stored is about 1 GB.  (The
maximum value that will be allowed for <replaceable>n</> in the data
type declaration is less than that.  It wouldn't be very useful to
change this because with multibyte character encodings the number of
characters and bytes can be quite different anyway.  If you want to
store long strings with no specific upper limit, use
<type>text</type> or <type>character varying</type> without a length
specifier, rather than making up an arbitrary length limit.)
</para>

<tip>
<para>
There are no performance differences between these three types,
apart from increased storage size when using the blank-padded
type, and a few extra cycles to check the length when storing into
a length-constrained column.  While
<type>character(<replaceable>n</>)</type> has performance
advantages in some other database systems, it has no such advantages in
<productname>PostgreSQL</productname>.  In most situations
<type>text</type> or <type>character varying</type> should be used
instead.
</para>
</tip>

<para>
Refer to <xref linkend="sql-syntax-strings"> for information about
the syntax of string literals, and to <xref linkend="functions">
for information about available operators and functions.  The
database character set determines the character set used to store
textual values; for more information on character set support,
refer to <xref linkend="multibyte">.
</para>

<example>
<title>Using the character types</title>

<programlisting>
CREATE TABLE test1 (a character(4));
INSERT INTO test1 VALUES ('ok');
SELECT a, char_length(a) FROM test1; -- <co id="co.datatype-char">
<computeroutput>
  a   | char_length
------+-------------
 ok   |           2
</computeroutput>

CREATE TABLE test2 (b varchar(5));
INSERT INTO test2 VALUES ('ok');
INSERT INTO test2 VALUES ('good ');
INSERT INTO test2 VALUES ('too long');
<computeroutput>ERROR: value too long for type character varying(5)</computeroutput>
INSERT INTO test2 VALUES ('too long'::varchar(5)); -- explicit truncation
SELECT b, char_length(b) FROM test2;
<computeroutput>
   b   | char_length
-------+-------------
 ok    |           2
 good  |           5
 too l |           5
</computeroutput>
</programlisting>

<calloutlist>
<callout arearefs="co.datatype-char">
<para>
The <function>char_length</function> function is discussed in
<xref linkend="functions-string">.
</para>
</callout>
</calloutlist>
</example>

<para>
There are two other fixed-length character types in
<productname>PostgreSQL</productname>, shown in <xref
linkend="datatype-character-special-table">.  The <type>name</type>
type exists <emphasis>only</emphasis> for storage of identifiers
in the internal system catalogs and is not intended for use by the general user.  Its
length is currently defined as 64 bytes (63 usable characters plus
terminator) but should be referenced using the constant
<symbol>NAMEDATALEN</symbol>.  The length is set at compile time (and
is therefore adjustable for special uses); the default maximum
length might change in a future release.  The type <type>"char"</type>
(note the quotes) is different from <type>char(1)</type> in that it
only uses one byte of storage.  It is used internally in the system
catalogs as a simplistic enumeration type.
</para>

<title>Special Character Types</title>

<thead>
<row>
<entry>Storage Size</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>single-byte internal type</entry>
</row>
<row>
<entry>64 bytes</entry>
<entry>internal type for object names</entry>
</row>
</tbody>
</tgroup>
</table>

</sect1>

<title>Binary Data Types</title>

<primary>binary data</primary>
</indexterm>

<primary>bytea</primary>
</indexterm>

<para>
The <type>bytea</type> data type allows storage of binary strings;
see <xref linkend="datatype-binary-table">.
</para>

<title>Binary Data Types</title>

<thead>
<row>
<entry>Storage Size</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><type>bytea</type></entry>
<entry>1 or 4 bytes plus the actual binary string</entry>
<entry>variable-length binary string</entry>
</row>
</tbody>
</tgroup>
</table>

<para>
A binary string is a sequence of octets (or bytes).  Binary
strings are distinguished from character strings by two
characteristics: First, binary strings specifically allow storing
octets of value zero and other <quote>non-printable</quote>
octets (usually, octets outside the range 32 to 126).
Character strings disallow zero octets, and also disallow any
other octet values and sequences of octet values that are invalid
according to the database's selected character set encoding.
Second, operations on binary strings process the actual bytes,
whereas the processing of character strings depends on locale settings.
In short, binary strings are appropriate for storing data that the
programmer thinks of as <quote>raw bytes</>, whereas character
strings are appropriate for storing text.
</para>

<para>
When entering <type>bytea</type> values, octets of certain
values <emphasis>must</emphasis> be escaped (but all octet
values <emphasis>can</emphasis> be escaped) when used as part
of a string literal in an <acronym>SQL</acronym> statement.  In
general, to escape an octet, it is converted into the three-digit
octal number equivalent of its decimal octet value, and preceded
by two backslashes.  <xref linkend="datatype-binary-sqlesc">
shows the characters that must be escaped, and gives the alternative
escape sequences where applicable.
</para>

<title><type>bytea</> Literal Escaped Octets</title>

<thead>
<row>
<entry>Decimal Octet Value</entry>
<entry>Description</entry>
<entry>Escaped Input Representation</entry>
<entry>Example</entry>
<entry>Output Representation</entry>
</row>
</thead>

<tbody>
<row>
<entry>zero octet</entry>
<entry><literal>SELECT E'\\000'::bytea;</literal></entry>
</row>

<row>
<entry>single quote</entry>
<entry><literal>SELECT E'\''::bytea;</literal></entry>
</row>

<row>
<entry>backslash</entry>
<entry><literal>SELECT E'\\\\'::bytea;</literal></entry>
</row>

<row>
<entry><quote>non-printable</quote> octets</entry>
<entry><literal>E'\\<replaceable>xxx</>'</literal> (octal value)</entry>
<entry><literal>SELECT E'\\001'::bytea;</literal></entry>
</row>

</tbody>
</tgroup>
</table>

<para>
The requirement to escape <quote>non-printable</quote> octets actually
varies depending on locale settings.  In some instances you can get away
with leaving them unescaped.  Note that the result in each of the examples
in <xref linkend="datatype-binary-sqlesc"> was exactly one octet in
length, even though the output representations of the zero octet and
backslash are more than one character long.
</para>

<para>
The reason that you have to write so many backslashes, as shown
in <xref linkend="datatype-binary-sqlesc">, is that an input
string written as a string literal must pass through two parse
phases in the <productname>PostgreSQL</productname> server.
The first backslash of each pair is interpreted as an escape
character by the string-literal parser (assuming escape string
syntax is used) and is therefore consumed, leaving the second backslash of the
pair.  (Dollar-quoted strings can be used to avoid this level
of escaping.)  The remaining backslash is then recognized by the
<type>bytea</type> input function as starting either a three-digit
octal value or escaping another backslash.  For example,
a string literal passed to the server as <literal>E'\\001'</literal>
becomes <literal>\001</literal> after passing through the
escape string parser.  The <literal>\001</literal> is then sent
to the <type>bytea</type> input function, where it is converted
to a single octet with a decimal value of 1.  Note that the
single-quote character is not treated specially by <type>bytea</type>,
so it follows the normal rules for string literals.  (See also
<xref linkend="sql-syntax-strings">.)
</para>
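
<para>
The two levels of parsing can be checked with a short sketch.  The
queries below use only literal values and the
<function>octet_length</function> function:
<programlisting>
-- Escape string syntax: the string parser reduces \\001 to \001,
-- and the bytea input function turns \001 into one octet
SELECT octet_length(E'\\001'::bytea);    -- 1

-- Dollar quoting bypasses the string-literal escaping, so a single
-- backslash reaches the bytea input function unchanged
SELECT octet_length($$\001$$::bytea);    -- 1

-- A literal backslash needs four backslashes in escape string syntax
SELECT octet_length(E'\\\\'::bytea);     -- 1
</programlisting>
</para>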

<para>
<type>Bytea</type> octets are also escaped in the output.  In general, each
<quote>non-printable</quote> octet is converted into
its equivalent three-digit octal value and preceded by one backslash.
Most <quote>printable</quote> octets are represented by their standard
representation in the client character set.  The octet with decimal
value 92 (backslash) has a special alternative output representation.
Details are in <xref linkend="datatype-binary-resesc">.
</para>

<title><type>bytea</> Output Escaped Octets</title>

<thead>
<row>
<entry>Decimal Octet Value</entry>
<entry>Description</entry>
<entry>Escaped Output Representation</entry>
<entry>Example</entry>
<entry>Output Result</entry>
</row>
</thead>

<tbody>

<row>
<entry>backslash</entry>
<entry><literal>SELECT E'\\134'::bytea;</literal></entry>
</row>

<row>
<entry><quote>non-printable</quote> octets</entry>
<entry><literal>\<replaceable>xxx</></literal> (octal value)</entry>
<entry><literal>SELECT E'\\001'::bytea;</literal></entry>
</row>

<row>
<entry><quote>printable</quote> octets</entry>
<entry>client character set representation</entry>
<entry><literal>SELECT E'\\176'::bytea;</literal></entry>
</row>

</tbody>
</tgroup>
</table>

<para>
Depending on the front end to <productname>PostgreSQL</> you use,
you might have additional work to do in terms of escaping and
unescaping <type>bytea</type> strings.  For example, you might also
have to escape line feeds and carriage returns if your interface
automatically translates these.
</para>

<para>
The <acronym>SQL</acronym> standard defines a different binary
string type, called <type>BLOB</type> or <type>BINARY LARGE
OBJECT</type>.  The input format is different from
<type>bytea</type>, but the provided functions and operators are
mostly the same.
</para>
</sect1>

<title>Date/Time Types</title>

</indexterm>

</indexterm>

<primary>time without time zone</primary>
</indexterm>

</indexterm>

<primary>timestamp</primary>
</indexterm>

<primary>timestamp with time zone</primary>
</indexterm>

<primary>timestamp without time zone</primary>
</indexterm>

<primary>interval</primary>
</indexterm>

</indexterm>

<para>
<productname>PostgreSQL</productname> supports the full set of
<acronym>SQL</acronym> date and time types, shown in <xref
linkend="datatype-datetime-table">.  The operations available
on these data types are described in
<xref linkend="functions-datetime">.
</para>

<title>Date/Time Types</title>

<thead>
<row>
<entry>Storage Size</entry>
<entry>Description</entry>
<entry>Low Value</entry>
<entry>High Value</entry>
<entry>Resolution</entry>
</row>
</thead>
<tbody>
<row>
<entry><type>timestamp [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
<entry>8 bytes</entry>
<entry>1 microsecond / 14 digits</entry>
</row>
<row>
<entry><type>timestamp [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
<entry>8 bytes</entry>
<entry>1 microsecond / 14 digits</entry>
</row>
<row>
<entry>4 bytes</entry>
<entry>dates only</entry>
</row>
<row>
<entry><type>time [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
<entry>8 bytes</entry>
<entry>times of day only</entry>
<entry>1 microsecond / 14 digits</entry>
</row>
<row>
<entry>12 bytes</entry>
<entry>times of day only, with time zone</entry>
<entry>1 microsecond / 14 digits</entry>
</row>
<row>
<entry><type>interval [ <replaceable>fields</replaceable> ] [ (<replaceable>p</replaceable>) ]</type></entry>
<entry>12 bytes</entry>
<entry>time intervals</entry>
<entry>-178000000 years</entry>
<entry>178000000 years</entry>
<entry>1 microsecond / 14 digits</entry>
</row>
</tbody>
</tgroup>
</table>

<note>
<para>
Prior to <productname>PostgreSQL</productname> 7.3, writing just
<type>timestamp</type> was equivalent to <type>timestamp with
time zone</type>.  This was changed for SQL compliance.
</para>
</note>

<para>
<type>time</type>, <type>timestamp</type>, and
<type>interval</type> accept an optional precision value
<replaceable>p</replaceable> which specifies the number of
fractional digits retained in the seconds field.  By default, there
is no explicit bound on precision.  The allowed range of
<replaceable>p</replaceable> is from 0 to 6 for the
<type>timestamp</type> and <type>interval</type> types.
</para>
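
<para>
As an illustration, the following sketch casts literals to several
precisions.  The literal values themselves are arbitrary:
<programlisting>
-- Keep two fractional digits in the seconds field
SELECT '2004-10-19 10:23:54.123456'::timestamp(2);

-- Keep no fractional digits
SELECT '2004-10-19 10:23:54.123456'::timestamp(0);

-- The same option applies to time and interval
SELECT '10:23:54.123456'::time(3);
SELECT '1 hour 2.345678 seconds'::interval(1);
</programlisting>
Values are rounded, not truncated, to the requested number of digits.
</para>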

<note>
<para>
When <type>timestamp</> values are stored as eight-byte integers
(currently the default), microsecond precision is available over
the full range of values.  When <type>timestamp</> values are
stored as double precision floating-point numbers instead (a
deprecated compile-time option), the effective limit of precision
might be less than 6.  <type>timestamp</type> values are stored as
seconds before or after midnight 2000-01-01.  When
<type>timestamp</type> values are implemented using floating-point
numbers, microsecond precision is achieved for dates within a few
years of 2000-01-01, but the precision degrades for dates further
away.  Note that using floating-point datetimes allows a larger
range of <type>timestamp</type> values to be represented than
shown above: from 4713 BC up to 5874897 AD.
</para>

<para>
The same compile-time option also determines whether
<type>time</type> and <type>interval</type> values are stored as
floating-point numbers or eight-byte integers.  In the
floating-point case, large <type>interval</type> values degrade in
precision as the size of the interval increases.
</para>
</note>

<para>
For the <type>time</type> types, the allowed range of
<replaceable>p</replaceable> is from 0 to 6 when eight-byte integer
storage is used, or from 0 to 10 when floating-point storage is used.
</para>

<para>
The <type>interval</type> type has an additional option, which is
to restrict the set of stored fields by writing one of these phrases:
<programlisting>
YEAR
MONTH
DAY
HOUR
MINUTE
SECOND
YEAR TO MONTH
DAY TO HOUR
DAY TO MINUTE
DAY TO SECOND
HOUR TO MINUTE
HOUR TO SECOND
MINUTE TO SECOND
</programlisting>
Input falling outside the specified set of fields is silently discarded.
Note that if both <replaceable>fields</replaceable> and
<replaceable>precision</replaceable> are specified, the
<replaceable>fields</replaceable> must include <literal>SECOND</>,
since the precision applies only to the seconds.
</para>
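
<para>
The effect of a field restriction can be sketched with a few typed
literals.  The interval values are arbitrary examples:
<programlisting>
-- Only years and months are kept; the days are discarded
SELECT INTERVAL '1 year 2 months 3 days' YEAR TO MONTH;

-- Fields below the hour are discarded
SELECT INTERVAL '1 day 2 hours 3 minutes 4 seconds' DAY TO HOUR;

-- A precision is allowed here because the field set includes SECOND
SELECT INTERVAL '1 day 2:03:04.5678' DAY TO SECOND(2);
</programlisting>
</para>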

<para>
The type <type>time with time zone</type> is defined by the SQL
standard, but the definition exhibits properties which lead to
questionable usefulness.  In most cases, a combination of
<type>date</type>, <type>time</type>, <type>timestamp without time
zone</type>, and <type>timestamp with time zone</type> should
provide a complete range of date/time functionality required by
any application.
</para>

<para>
The types <type>abstime</type>
and <type>reltime</type> are lower-precision types which are used internally.
You are discouraged from using these types in new
applications and are encouraged to move any old
ones over when appropriate.  Any or all of these internal types
might disappear in a future release.
</para>

<title>Date/Time Input</title>

<para>
Date and time input is accepted in almost any reasonable format, including
ISO 8601, <acronym>SQL</acronym>-compatible,
traditional <productname>POSTGRES</productname>, and others.
For some formats, the ordering of month, day, and year in date input is
ambiguous, so there is support for specifying the expected
ordering of these fields.  Set the <xref linkend="guc-datestyle"> parameter
to <literal>MDY</> to select month-day-year interpretation,
<literal>DMY</> to select day-month-year interpretation, or
<literal>YMD</> to select year-month-day interpretation.
</para>
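
<para>
A minimal sketch of how the setting changes the reading of an ambiguous
date follows.  The input string is an arbitrary example:
<programlisting>
SET datestyle TO 'ISO, MDY';
SELECT '1/8/1999'::date;    -- read as January 8, 1999

SET datestyle TO 'ISO, DMY';
SELECT '1/8/1999'::date;    -- read as August 1, 1999
</programlisting>
</para>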

<para>
<productname>PostgreSQL</productname> is more flexible in
handling date/time input than the
<acronym>SQL</acronym> standard requires.
See <xref linkend="datetime-appendix">
for the exact parsing rules of date/time input and for the
recognized text fields including months, days of the week, and
time zones.
</para>

<para>
Remember that any date or time literal input needs to be enclosed
in single quotes, like text strings.  Refer to
<xref linkend="sql-syntax-constants-generic"> for more
information.
<acronym>SQL</acronym> requires the following syntax:
<synopsis>
<replaceable>type</replaceable> [ (<replaceable>p</replaceable>) ] '<replaceable>value</replaceable>'
</synopsis>
where <replaceable>p</replaceable> in the optional precision
specification is an integer corresponding to the number of
fractional digits in the seconds field.  Precision can be
specified for <type>time</type>, <type>timestamp</type>, and
<type>interval</type> types.  The allowed values are mentioned
above.  If no precision is specified in a constant specification,
it defaults to the precision of the literal value.
</para>
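
<para>
For example, the following typed literals follow that syntax.  This is
only a sketch, and the values are arbitrary:
<programlisting>
SELECT DATE '1999-01-08';
SELECT TIME '04:05:06.789';
SELECT TIMESTAMP(2) '1999-01-08 04:05:06.789';   -- two fractional digits
SELECT INTERVAL '1 day 2 hours';
</programlisting>
</para>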

<sect3>
<title>Dates</title>

</indexterm>

<para>
<xref linkend="datatype-datetime-date-table"> shows some possible
inputs for the <type>date</type> type.
</para>

<title>Date Input</title>

<thead>
<row>
<entry>Example</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>January 8, 1999</entry>
<entry>unambiguous in any <varname>datestyle</varname> input mode</entry>
</row>
<row>
<entry>ISO 8601; January 8 in any mode
(recommended format)</entry>
</row>
<row>
<entry>January 8 in <literal>MDY</> mode;
August 1 in <literal>DMY</> mode</entry>
</row>
<row>
<entry>January 18 in <literal>MDY</> mode;
rejected in other modes</entry>
</row>
<row>
<entry>January 2, 2003 in <literal>MDY</> mode;
February 1, 2003 in <literal>DMY</> mode;
February 3, 2001 in <literal>YMD</> mode
</entry>
</row>
<row>
<entry>January 8 in any mode</entry>
</row>
<row>
<entry>January 8 in any mode</entry>
</row>
<row>
<entry>January 8 in any mode</entry>
</row>
<row>
<entry>January 8 in <literal>YMD</> mode, else error</entry>
</row>
<row>
<entry>January 8, except error in <literal>YMD</> mode</entry>
</row>
<row>
<entry>January 8, except error in <literal>YMD</> mode</entry>
</row>
<row>
<entry>ISO 8601; January 8, 1999 in any mode</entry>
</row>
<row>
<entry>ISO 8601; January 8, 1999 in any mode</entry>
</row>
<row>
</row>
<row>
<entry>Julian day</entry>
</row>
<row>
<entry>January 8, 99 BC</entry>
<entry>year 99 before the Common Era</entry>
</row>
</tbody>
</tgroup>
</table>
</sect3>

1690

1691

<sect3>

1692

<title>Times</title>

1693

1694

1695

1696

</indexterm>

1697

1698

<primary>time without time zone</primary>

1699

</indexterm>

1700

1701

1702

</indexterm>

1703

1704

<para>
The time-of-day types are <type>time [
(<replaceable>p</replaceable>) ] without time zone</type> and
<type>time [ (<replaceable>p</replaceable>) ] with time
zone</type>. Writing just <type>time</type> is equivalent to
<type>time without time zone</type>.
</para>

1711

1712

<para>
Valid input for these types consists of a time of day followed
by an optional time zone. (See <xref
linkend="datatype-datetime-time-table">
and <xref linkend="datatype-timezone-table">.) If a time zone is
specified in the input for <type>time without time zone</type>,
it is silently ignored. You can also specify a date, but it will
be ignored, except when you use a time zone name that involves a
daylight-savings rule, such as
<literal>America/New_York</literal>. In this case, specifying the date
is required in order to determine whether standard or daylight-savings
time applies. The appropriate time zone offset is recorded in the
<type>time with time zone</type> value.
</para>
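
<para>
As a brief sketch of the rules just described (the literals are
illustrative only):
<programlisting>
-- The zone is silently ignored for time without time zone.
SELECT TIME '04:05:06 PST';

-- A date is supplied so that the daylight-savings rule of the named
-- zone can be resolved. Daylight-savings time was in effect in New York
-- on this date, so the recorded offset should be -04.
SELECT TIME WITH TIME ZONE '2003-04-12 04:05:06 America/New_York';
</programlisting>
</para>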

1726

1727

1728

<title>Time Input</title>

1729

1730

<thead>

1731

<row>

1732

<entry>Example</entry>

1733

<entry>Description</entry>

1734

</row>

1735

</thead>

1736

<tbody>

1737

<row>

1738

1739

1740

</row>

1741

<row>

1742

1743

1744

</row>

1745

<row>

1746

1747

1748

</row>

1749

<row>

1750

1751

1752

</row>

1753

<row>

1754

1755

<entry>same as 04:05; AM does not affect value</entry>

1756

</row>

1757

<row>

1758

1759

<entry>same as 16:05; input hour must be &lt;= 12</entry>

1760

</row>

1761

<row>

1762

1763

1764

</row>

1765

<row>

1766

1767

1768

</row>

1769

<row>

1770

1771

1772

</row>

1773

<row>

1774

1775

1776

</row>

1777

<row>

1778

1779

<entry>time zone specified by abbreviation</entry>

1780

</row>

1781

<row>

1782

<entry><literal>2003-04-12 04:05:06 America/New_York</literal></entry>

1783

<entry>time zone specified by full name</entry>

1784

</row>

1785

</tbody>

1786

</tgroup>

1787

</table>

1788

1789

1790

<title>Time Zone Input</title>

1791

1792

<thead>

1793

<row>

1794

<entry>Example</entry>

1795

<entry>Description</entry>

1796

</row>

1797

</thead>

1798

<tbody>

1799

<row>

1800

1801

<entry>Abbreviation (for Pacific Standard Time)</entry>

1802

</row>

1803

<row>

1804

<entry><literal>America/New_York</literal></entry>

1805

1806

</row>

1807

<row>

1808

1809

<entry>POSIX-style time zone specification</entry>

1810

</row>

1811

<row>

1812

1813

<entry>ISO-8601 offset for PST</entry>

1814

</row>

1815

<row>

1816

1817

<entry>ISO-8601 offset for PST</entry>

1818

</row>

1819

<row>

1820

1821

<entry>ISO-8601 offset for PST</entry>

1822

</row>

1823

<row>

1824

1825

<entry>Military abbreviation for UTC</entry>

1826

</row>

1827

<row>

1828

1829

<entry>Short form of <literal>zulu</literal></entry>

1830

</row>

1831

</tbody>

1832

</tgroup>

1833

</table>

1834

1835

<para>
Refer to <xref linkend="datatype-timezones"> for more information on how
to specify time zones.
</para>

1839

</sect3>

1840

1841

<sect3>

1842

<title>Time Stamps</title>


<indexterm>
<primary>timestamp</primary>
</indexterm>

<indexterm>
<primary>timestamp with time zone</primary>
</indexterm>

<indexterm>
<primary>timestamp without time zone</primary>
</indexterm>

1855

1856

<para>
Valid input for the time stamp types consists of a concatenation
of a date and a time, followed by an optional time zone,
followed by an optional <literal>AD</literal> or <literal>BC</literal>.
(Alternatively, <literal>AD</literal>/<literal>BC</literal> can appear
before the time zone, but this is not the preferred ordering.)
Thus:

<programlisting>
1999-01-08 04:05:06
</programlisting>
and:
<programlisting>
1999-01-08 04:05:06 -8:00
</programlisting>

are valid values, which follow the <acronym>ISO</acronym> 8601
standard. In addition, the common format:
<programlisting>
January 8 04:05:06 1999 PST
</programlisting>
is supported.
</para>

1879

1880

<para>
The <acronym>SQL</acronym> standard differentiates <type>timestamp without time zone</type>
and <type>timestamp with time zone</type> literals by the presence of a
<quote>+</quote> or <quote>-</quote> after the time. Hence, according to the standard,
<programlisting>TIMESTAMP '2004-10-19 10:23:54'</programlisting>
is a <type>timestamp without time zone</type>, while
<programlisting>TIMESTAMP '2004-10-19 10:23:54+02'</programlisting>
is a <type>timestamp with time zone</type>.
<productname>PostgreSQL</productname> never examines the content of a
literal string before determining its type, and therefore will treat
both of the above as <type>timestamp without time zone</type>. To
ensure that a literal is treated as <type>timestamp with time
zone</type>, give it the correct explicit type:
<programlisting>TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02'</programlisting>
In a literal that has been determined to be <type>timestamp without time
zone</type>, <productname>PostgreSQL</productname> will silently ignore
any time zone indication.
That is, the resulting value is derived from the date/time
fields in the input value, and is not adjusted for time zone.
</para>
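
<para>
The difference can be observed directly. The following is only a
sketch, reusing the literals from the paragraph above:
<programlisting>
-- Typed as timestamp without time zone: the +02 is silently ignored,
-- so the date/time fields are kept as written.
SELECT TIMESTAMP '2004-10-19 10:23:54+02';

-- Typed explicitly: the +02 offset is honored and the value is
-- adjusted to the session's time zone on output.
SELECT TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02';
</programlisting>
</para>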

1900

1901

<para>
For <type>timestamp with time zone</type>, the internally stored
value is always in UTC (Coordinated Universal Time, traditionally
known as Greenwich Mean Time,
<acronym>GMT</>). An input value that has an explicit
time zone specified is converted to UTC using the appropriate offset
for that time zone. If no time zone is stated in the input string,
then it is assumed to be in the time zone indicated by the system's
<xref linkend="guc-timezone"> parameter, and is converted to UTC using the
offset for the <varname>timezone</> zone.
</para>

1912

1913

<para>
When a <type>timestamp with time
zone</type> value is output, it is always converted from UTC to the
current <varname>timezone</> zone, and displayed as local time in that
zone. To see the time in another time zone, either change
<varname>timezone</> or use the <literal>AT TIME ZONE</> construct
(see <xref linkend="functions-datetime-zoneconvert">).
</para>
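
<para>
A minimal sketch of both approaches, assuming the default
<literal>ISO</literal> output style:
<programlisting>
-- Change the session zone; the stored UTC value is redisplayed.
SET timezone TO 'UTC';
SELECT TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02';
-- displayed as 2004-10-19 08:23:54+00

-- Or convert a single value; the result is a timestamp without
-- time zone holding the local time in the named zone.
SELECT TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02'
       AT TIME ZONE 'America/New_York';
-- 2004-10-19 04:23:54
</programlisting>
</para>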

1921

1922

<para>
Conversions between <type>timestamp without time zone</type> and
<type>timestamp with time zone</type> normally assume that the
<type>timestamp without time zone</type> value should be taken or given
as <varname>timezone</> local time. A different time zone can
be specified for the conversion using <literal>AT TIME ZONE</>.
</para>

1929

</sect3>

1930

1931

<sect3>

1932

<title>Special Values</title>

1933

1934

1935

1936

<secondary>constants</secondary>

1937

</indexterm>

1938

1939

1940

1941

<secondary>constants</secondary>

1942

</indexterm>

1943

1944

<para>
<productname>PostgreSQL</productname> supports several
special date/time input values for convenience, as shown in <xref
linkend="datatype-datetime-special-table">. The values
<literal>infinity</literal> and <literal>-infinity</literal>
are specially represented inside the system and will be displayed
unchanged; the others are simply notational shorthands
that will be converted to ordinary date/time values when read.
(In particular, <literal>now</> and related strings are converted
to a specific time value as soon as they are read.)
All of these values need to be written in single quotes when used
as constants in SQL commands.
</para>

1957

1958

1959

<title>Special Date/Time Inputs</title>

1960

1961

<thead>

1962

<row>

1963

<entry>Input String</entry>

1964

<entry>Valid Types</entry>

1965

<entry>Description</entry>

1966

</row>

1967

</thead>

1968

<tbody>

1969

<row>

1970

<entry><literal>epoch</literal></entry>

1971

<entry><type>date</type>, <type>timestamp</type></entry>

1972

<entry>1970-01-01 00:00:00+00 (Unix system time zero)</entry>

1973

</row>

1974

<row>

1975

<entry><literal>infinity</literal></entry>

1976

<entry><type>date</type>, <type>timestamp</type></entry>

1977

<entry>later than all other time stamps</entry>

1978

</row>

1979

<row>

1980

<entry><literal>-infinity</literal></entry>

1981

<entry><type>date</type>, <type>timestamp</type></entry>

1982

<entry>earlier than all other time stamps</entry>

1983

</row>

1984

<row>

1985

1986

<entry><type>date</type>, <type>time</type>, <type>timestamp</type></entry>

1987

<entry>current transaction's start time</entry>

1988

</row>

1989

<row>

1990

<entry><literal>today</literal></entry>

1991

<entry><type>date</type>, <type>timestamp</type></entry>

1992

<entry>midnight today</entry>

1993

</row>

1994

<row>

1995

<entry><literal>tomorrow</literal></entry>

1996

<entry><type>date</type>, <type>timestamp</type></entry>

1997

<entry>midnight tomorrow</entry>

1998

</row>

1999

<row>

2000

<entry><literal>yesterday</literal></entry>

2001

<entry><type>date</type>, <type>timestamp</type></entry>

2002

<entry>midnight yesterday</entry>

2003

</row>

2004

<row>

2005

<entry><literal>allballs</literal></entry>

2006

2007

2008

</row>

2009

</tbody>

2010

</tgroup>

2011

</table>

2012

2013

<para>
The following <acronym>SQL</acronym>-compatible functions can also
be used to obtain the current time value for the corresponding data
type:
<literal>CURRENT_DATE</literal>, <literal>CURRENT_TIME</literal>,
<literal>CURRENT_TIMESTAMP</literal>, <literal>LOCALTIME</literal>,
<literal>LOCALTIMESTAMP</literal>. The latter four accept an
optional subsecond precision specification. (See <xref
linkend="functions-datetime-current">.) Note, however, that these are
SQL functions and are <emphasis>not</> recognized as data input strings.
</para>
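
<para>
The distinction between the input string <literal>now</literal> and the
functions matters most in column defaults. The sketch below uses
hypothetical table names chosen for illustration:
<programlisting>
-- 'now' is converted to a specific value as soon as the constant is
-- read, so this default is fixed when the table is created.
CREATE TABLE event_log_frozen (logged_at timestamp DEFAULT TIMESTAMP 'now');

-- The function is evaluated each time a default is needed.
CREATE TABLE event_log (logged_at timestamp DEFAULT CURRENT_TIMESTAMP);

-- Optional subsecond precision: keep two fractional digits.
SELECT CURRENT_TIMESTAMP(2);
</programlisting>
</para>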

2024

2025

</sect3>

2026

</sect2>

2027

2028

2029

<title>Date/Time Output</title>

2030

2031

2032

2033

<secondary>output format</secondary>

2034

<seealso>formatting</seealso>

2035

</indexterm>

2036

2037

2038

2039

<secondary>output format</secondary>

2040

<seealso>formatting</seealso>

2041

</indexterm>

2042

2043

<para>
The output format of the date/time types can be set to one of the four
styles ISO 8601,
<acronym>SQL</acronym> (Ingres), traditional POSTGRES, and
German, using the command <literal>SET datestyle</literal>. The default
is the <acronym>ISO</acronym> format. (The
<acronym>SQL</acronym> standard requires the use of the ISO 8601
format. The name of the <quote>SQL</quote> output format is a
historical accident.) <xref
linkend="datatype-datetime-output-table"> shows examples of each
output style. The output of the <type>date</type> and
<type>time</type> types is only the date or time part,
in accordance with the given examples.
</para>

2057

2058

2059

<title>Date/Time Output Styles</title>

2060

2061

<thead>

2062

<row>

2063

<entry>Style Specification</entry>

2064

<entry>Description</entry>

2065

<entry>Example</entry>

2066

</row>

2067

</thead>

2068

<tbody>

2069

<row>

2070

2071

<entry>ISO 8601/SQL standard</entry>

2072

2073

</row>

2074

<row>

2075

2076

<entry>traditional style</entry>

2077

2078

</row>

2079

<row>

2080

<entry>POSTGRES</entry>

2081

<entry>original style</entry>

2082

2083

</row>

2084

<row>

2085

<entry>German</entry>

2086

<entry>regional style</entry>

2087

2088

</row>

2089

</tbody>

2090

</tgroup>

2091

</table>

2092

2093

<para>
In the <acronym>SQL</acronym> and POSTGRES styles, day appears before
month if DMY field ordering has been specified; otherwise month appears
before day.
(See <xref linkend="datatype-datetime-input">
for how this setting also affects interpretation of input values.)
<xref linkend="datatype-datetime-output2-table"> shows an
example.
</para>

2102

2103

2104

<title>Date Order Conventions</title>

2105

2106

<thead>

2107

<row>

2108

<entry><varname>datestyle</varname> Setting</entry>

2109

<entry>Input Ordering</entry>

2110

<entry>Example Output</entry>

2111

</row>

2112

</thead>

2113

<tbody>

2114

<row>

2115

2116

<entry><replaceable>day</replaceable>/<replaceable>month</replaceable>/<replaceable>year</replaceable></entry>

2117

2118

</row>

2119

<row>

2120

2121

<entry><replaceable>month</replaceable>/<replaceable>day</replaceable>/<replaceable>year</replaceable></entry>

2122

2123

</row>

2124

<row>

2125

<entry><literal>Postgres, DMY</></entry>

2126

<entry><replaceable>day</replaceable>/<replaceable>month</replaceable>/<replaceable>year</replaceable></entry>

2127

2128

</row>

2129

</tbody>

2130

</tgroup>

2131

</table>

2132

2133

<para>
The date/time styles can be selected by the user using the
<command>SET datestyle</command> command, the <xref
linkend="guc-datestyle"> parameter in the
<filename>postgresql.conf</filename> configuration file, or the
<envar>PGDATESTYLE</envar> environment variable on the server or
client. The formatting function <function>to_char</function>
(see <xref linkend="functions-formatting">) is also available as
a more flexible way to format date/time output.
</para>
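
<para>
For instance, the following sketch switches the session to the
<literal>SQL</literal> style with day-before-month ordering, and then
formats the same value with <function>to_char</function>, whose result
does not depend on <varname>datestyle</varname>:
<programlisting>
SET datestyle TO SQL, DMY;
SELECT DATE '1997-12-17';                          -- 17/12/1997

SELECT to_char(DATE '1997-12-17', 'DD Mon YYYY');  -- 17 Dec 1997
</programlisting>
</para>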

2143

</sect2>

2144

2145

2146

<title>Time Zones</title>

2147

2148

2149

2150

</indexterm>

2151

2152

<para>
Time zones, and time-zone conventions, are influenced by
political decisions, not just earth geometry. Time zones around the
world became somewhat standardized during the 1900s,
but continue to be prone to arbitrary changes, particularly with
respect to daylight-savings rules.
<productname>PostgreSQL</productname> relies on the widely used
<literal>zoneinfo</> time zone database for information about
historical time zone rules. For times in the future, the assumption
is that the latest known rules for a given time zone will
continue to be observed indefinitely far into the future.
</para>

2164

2165

<para>
<productname>PostgreSQL</productname> endeavors to be compatible with
the <acronym>SQL</acronym> standard definitions for typical usage.
However, the <acronym>SQL</acronym> standard has an odd mix of date and
time types and capabilities. Two obvious problems are:

<itemizedlist>
<listitem>
<para>
Although the <type>date</type> type
does not have an associated time zone, the
<type>time</type> type can.
Time zones in the real world have little meaning unless
associated with a date as well as a time,
since the offset can vary through the year with daylight-saving
time boundaries.
</para>
</listitem>

<listitem>
<para>
The default time zone is specified as a constant numeric offset
from <acronym>UTC</>. It is therefore not possible to adapt to
daylight-saving time when doing date/time arithmetic across
<acronym>DST</acronym> boundaries.
</para>
</listitem>

</itemizedlist>
</para>

2195

2196

<para>
To address these difficulties, we recommend using date/time types
that contain both date and time when using time zones. We
recommend <emphasis>not</emphasis> using the type <type>time with
time zone</type> (though it is supported by
<productname>PostgreSQL</productname> for legacy applications and
for compliance with the <acronym>SQL</acronym> standard).
<productname>PostgreSQL</productname> assumes
your local time zone for any type containing only date or time.
</para>

2206

2207

<para>
All timezone-aware dates and times are stored internally in
<acronym>UTC</acronym>. They are converted to local time
in the zone specified by the <xref linkend="guc-timezone"> configuration
parameter before being displayed to the client.
</para>

2213

2214

<para>
<productname>PostgreSQL</productname> allows you to specify time zones in
three different forms:
<itemizedlist>
<listitem>
<para>
A full time zone name, for example <literal>America/New_York</>.
The recognized time zone names are listed in the
<literal>pg_timezone_names</literal> view (see <xref
linkend="view-pg-timezone-names">).
<productname>PostgreSQL</productname> uses the widely used
<literal>zoneinfo</> time zone data for this purpose, so the same
names are also recognized by many other software packages.
</para>
</listitem>
<listitem>
<para>
A time zone abbreviation, for example <literal>PST</>. Such a
specification merely defines a particular offset from UTC, in
contrast to full time zone names which might imply a set of daylight
savings transition-date rules as well. The recognized abbreviations
are listed in the <literal>pg_timezone_abbrevs</> view (see <xref
linkend="view-pg-timezone-abbrevs">). You cannot set the
configuration parameters <xref linkend="guc-timezone"> or
<xref linkend="guc-log-timezone"> using a time
zone abbreviation, but you can use abbreviations in
date/time input values and with the <literal>AT TIME ZONE</>
operator.
</para>
</listitem>
<listitem>
<para>
In addition to the timezone names and abbreviations,
<productname>PostgreSQL</productname> will accept POSIX-style time zone
specifications of the form <replaceable>STD</><replaceable>offset</> or
<replaceable>STD</><replaceable>offset</><replaceable>DST</>, where
<replaceable>STD</> is a zone abbreviation, <replaceable>offset</> is a
numeric offset in hours west from UTC, and <replaceable>DST</> is an
optional daylight-savings zone abbreviation, assumed to stand for one
hour ahead of the given offset. For example, if <literal>EST5EDT</>
were not already a recognized zone name, it would be accepted and would
be functionally equivalent to USA East Coast time. When a
daylight-savings zone name is present, it is assumed to be used
according to the same daylight-savings transition rules used in the
<literal>zoneinfo</> time zone database's <filename>posixrules</> entry.
In a standard <productname>PostgreSQL</productname> installation,
<filename>posixrules</> is the same as <literal>US/Eastern</>, so
that POSIX-style time zone specifications follow USA daylight-savings
rules. If needed, you can adjust this behavior by replacing the
<filename>posixrules</> file.
</para>
</listitem>
</itemizedlist>

There is a conceptual and practical difference between the abbreviations
and the full names: abbreviations always represent a fixed offset from
UTC, whereas most of the full names imply a local daylight-savings time
rule and so have two possible UTC offsets.
</para>

2273

2274

<para>
One should be wary that the POSIX-style time zone feature can
lead to silently accepting bogus input, since there is no check on the
reasonableness of the zone abbreviations. For example, <literal>SET
TIMEZONE TO FOOBAR0</> will work, leaving the system effectively using
a rather peculiar abbreviation for UTC.
Another issue to keep in mind is that in POSIX time zone names,
positive offsets are used for locations <emphasis>west</> of Greenwich.
Everywhere else, <productname>PostgreSQL</productname> follows the
ISO-8601 convention that positive timezone offsets are <emphasis>east</>
of Greenwich.
</para>
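
<para>
The opposite sign conventions are easy to confuse. In the sketch
below, <literal>XYZ8</literal> is a made-up POSIX-style specification
used only for illustration:
<programlisting>
-- POSIX style: a positive offset lies west of Greenwich, so this
-- session zone is eight hours behind UTC.
SET timezone TO 'XYZ8';
SELECT TIMESTAMP WITH TIME ZONE '2004-01-19 12:00:00+00';

-- ISO 8601 style inside a literal: a positive offset lies east of
-- Greenwich, so this input is eight hours ahead of UTC.
SELECT TIMESTAMP WITH TIME ZONE '2004-01-19 12:00:00+08';
</programlisting>
</para>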

2286

2287

<para>
In all cases, timezone names are recognized case-insensitively.
(This is a change from <productname>PostgreSQL</productname> versions
prior to 8.2, which were case-sensitive in some contexts and not others.)
</para>

2292

2293

<para>
Neither full names nor abbreviations are hard-wired into the server;
they are obtained from configuration files stored under
<filename>.../share/timezone/</> and <filename>.../share/timezonesets/</>
of the installation directory
(see <xref linkend="datetime-config-files">).
</para>

2300

2301

<para>
The <xref linkend="guc-timezone"> configuration parameter can
be set in the file <filename>postgresql.conf</>, or in any of the
other standard ways described in <xref linkend="runtime-config">.
There are also several special ways to set it:

<itemizedlist>
<listitem>
<para>
If <varname>timezone</> is not specified in
<filename>postgresql.conf</> or as a server command-line option,
the server attempts to use the value of the <envar>TZ</envar>
environment variable as the default time zone. If <envar>TZ</envar>
is not defined or is not any of the time zone names known to
<productname>PostgreSQL</productname>, the server attempts to
determine the operating system's default time zone by checking the
behavior of the C library function <literal>localtime()</>. The
default time zone is selected as the closest match among
<productname>PostgreSQL</productname>'s known time zones.
(These rules are also used to choose the default value of
<xref linkend="guc-log-timezone">, if it is not specified.)
</para>
</listitem>

<listitem>
<para>
The <acronym>SQL</acronym> command <command>SET TIME ZONE</command>
sets the time zone for the session. This is an alternative spelling
of <command>SET TIMEZONE TO</> with a more SQL-spec-compatible syntax.
</para>
</listitem>

<listitem>
<para>
The <envar>PGTZ</envar> environment variable, if set at the
client, is used by <application>libpq</application>
applications to send a <command>SET TIME ZONE</command>
command to the server upon connection.
</para>
</listitem>
</itemizedlist>
</para>
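
<para>
As a short sketch, the two session-level spellings mentioned above
have the same effect, and the current value can be inspected with
<command>SHOW</command>:
<programlisting>
SET TIME ZONE 'America/New_York';     -- SQL-spec-compatible spelling
SET timezone TO 'America/New_York';   -- equivalent spelling
SHOW timezone;
</programlisting>
</para>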

2343

</sect2>

2344

2345

2346

<title>Interval Input</title>

2347

2348

2349

<primary>interval</primary>

2350

</indexterm>

2351

2352

<para>
<type>interval</type> values can be written with the following
verbose syntax:

<synopsis>
<optional>@</> <replaceable>quantity</> <replaceable>unit</> <optional><replaceable>quantity</> <replaceable>unit</>...</> <optional><replaceable>direction</></optional>
</synopsis>

where <replaceable>quantity</> is a number (possibly signed);
<replaceable>unit</> is <literal>microsecond</literal>,
<literal>millisecond</literal>, <literal>second</literal>,
<literal>minute</literal>, <literal>hour</literal>, <literal>day</literal>,
<literal>week</literal>, <literal>month</literal>, <literal>year</literal>,
<literal>decade</literal>, <literal>century</literal>, <literal>millennium</literal>,
or abbreviations or plurals of these units;
<replaceable>direction</> can be <literal>ago</literal> or
empty. The at sign (<literal>@</>) is optional noise. The amounts
of different units are implicitly added up with appropriate
sign accounting. <literal>ago</literal> negates all the fields.
This syntax is also used for interval output, if
<xref linkend="guc-intervalstyle"> is set to
<literal>postgres_verbose</>.
</para>

2375

2376

<para>
Quantities of days, hours, minutes, and seconds can be specified without
explicit unit markings. For example, <literal>'1 12:59:10'</> is read
the same as <literal>'1 day 12 hours 59 min 10 sec'</>. Also,
a combination of years and months can be specified with a dash;
for example <literal>'200-10'</> is read the same as <literal>'200 years
10 months'</>. (These shorter forms are in fact the only ones allowed
by the <acronym>SQL</acronym> standard, and are used for output when
<varname>IntervalStyle</> is set to <literal>sql_standard</literal>.)
</para>
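
<para>
The equivalences stated above can be checked with a simple
comparison; each query in this sketch is expected to return true:
<programlisting>
SELECT INTERVAL '1 12:59:10' = INTERVAL '1 day 12 hours 59 min 10 sec';
SELECT INTERVAL '200-10' = INTERVAL '200 years 10 months';
</programlisting>
</para>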

2386

2387

<para>
Interval values can also be written as ISO 8601 time intervals, using
either the <quote>format with designators</> of the standard's section
4.4.3.2 or the <quote>alternative format</> of section 4.4.3.3. The
format with designators looks like this:
<synopsis>
P <replaceable>quantity</> <replaceable>unit</> <optional> <replaceable>quantity</> <replaceable>unit</> ...</optional> <optional> T <optional> <replaceable>quantity</> <replaceable>unit</> ...</optional></optional>
</synopsis>
The string must start with a <literal>P</>, and may include a
<literal>T</> that introduces the time-of-day units. The
available unit abbreviations are given in <xref
linkend="datatype-interval-iso8601-units">. Units may be
omitted, and may be specified in any order, but units smaller than
a day must appear after <literal>T</>. In particular, the meaning of
<literal>M</> depends on whether it is before or after
<literal>T</>.
</para>

2404

2405

2406

<title>ISO 8601 interval unit abbreviations</title>

2407

2408

<thead>

2409

<row>

2410

<entry>Abbreviation</entry>

2411

<entry>Meaning</entry>

2412

</row>

2413

</thead>

2414

<tbody>

2415

<row>

2416

2417

<entry>Years</entry>

2418

</row>

2419

<row>

2420

2421

<entry>Months (in the date part)</entry>

2422

</row>

2423

<row>

2424

2425

<entry>Weeks</entry>

2426

</row>

2427

<row>

2428

2429

2430

</row>

2431

<row>

2432

2433

<entry>Hours</entry>

2434

</row>

2435

<row>

2436

2437

<entry>Minutes (in the time part)</entry>

2438

</row>

2439

<row>

2440

2441

<entry>Seconds</entry>

2442

</row>

2443

</tbody>

2444

</tgroup>

2445

</table>

2446

2447

<para>
In the alternative format:
<synopsis>
P <optional> <replaceable>years</>-<replaceable>months</>-<replaceable>days</> </optional> <optional> T <replaceable>hours</>:<replaceable>minutes</>:<replaceable>seconds</> </optional>
</synopsis>
the string must begin with <literal>P</literal>, and a
<literal>T</> separates the date and time parts of the interval.
The values are given as numbers similar to ISO 8601 dates.
</para>
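
<para>
A sketch of both ISO 8601 spellings follows. The first two literals
are intended to denote the same interval of 1 year 2 months 3 days
4 hours 5 minutes 6 seconds:
<programlisting>
SELECT INTERVAL 'P1Y2M3DT4H5M6S';        -- format with designators
SELECT INTERVAL 'P0001-02-03T04:05:06';  -- alternative format

-- M denotes months before the T and minutes after it.
SELECT INTERVAL 'P1M', INTERVAL 'PT1M';
</programlisting>
</para>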

2456

2457

<para>
When writing an interval constant with a <replaceable>fields</>
specification, or when assigning to an interval column that was defined
with a <replaceable>fields</> specification, the interpretation of
unmarked quantities depends on the <replaceable>fields</>. For
example <literal>INTERVAL '1' YEAR</> is read as 1 year, whereas
<literal>INTERVAL '1'</> means 1 second.
</para>

2465

2466

<para>
According to the <acronym>SQL</> standard all fields of an interval
value must have the same sign, so a leading negative sign applies to all
fields; for example the negative sign in the interval literal
<literal>'-1 2:03:04'</> applies to both the days and hour/minute/second
parts. <productname>PostgreSQL</> allows the fields to have different
signs, and traditionally treats each field in the textual representation
as independently signed, so that the hour/minute/second part is
considered positive in this example. If <varname>IntervalStyle</> is
set to <literal>sql_standard</literal> then a leading sign is considered
to apply to all fields (but only if no additional signs appear).
Otherwise the traditional <productname>PostgreSQL</> interpretation is
used. To avoid ambiguity, it's recommended to attach an explicit sign
to each field if any field is negative.
</para>
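
<para>
The recommendation above might be applied as in this sketch, where
the second literal signs every field explicitly:
<programlisting>
-- Ambiguous: the reading of the time part depends on IntervalStyle.
SELECT INTERVAL '-1 2:03:04';

-- Unambiguous: both the day field and the time field carry a sign.
SELECT INTERVAL '-1 day -2:03:04';
</programlisting>
</para>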

2481

2482

<para>
Internally, <type>interval</> values are stored as months, days,
and seconds. This is done because the number of days in a month
varies, and a day can have 23 or 25 hours if a daylight savings
time adjustment is involved. The months and days fields are integers
while the seconds field can store fractions. Because intervals are
usually created from constant strings or <type>timestamp</> subtraction,
this storage method works well in most cases. Functions
<function>justify_days</> and <function>justify_hours</> are
available for adjusting days and hours that overflow their normal
ranges.
</para>
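
<para>
For example, assuming the default <literal>postgres</literal> interval
output style, the two functions normalize overflowing fields as in
this sketch:
<programlisting>
SELECT justify_hours(INTERVAL '27 hours');  -- 1 day 03:00:00
SELECT justify_days(INTERVAL '35 days');    -- 1 mon 5 days
</programlisting>
</para>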

2494

2495

<para>
In the verbose input format, and in some fields of the more compact
input formats, field values can have fractional parts; for example
<literal>'1.5 week'</> or <literal>'01:02:03.45'</>. Such input is
converted to the appropriate number of months, days, and seconds
for storage. When this would result in a fractional number of
months or days, the fraction is added to the lower-order fields
using the conversion factors 1 month = 30 days and 1 day = 24 hours.
For example, <literal>'1.5 month'</> becomes 1 month and 15 days.
Only seconds will ever be shown as fractional on output.
</para>
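
<para>
A sketch of the spill-over rule, again assuming the default
<literal>postgres</literal> output style:
<programlisting>
-- Half a month spills into the days field at 30 days per month.
SELECT INTERVAL '1.5 month';  -- 1 mon 15 days

-- 1.5 weeks is 10.5 days; the half day spills into the time field
-- at 24 hours per day.
SELECT INTERVAL '1.5 week';   -- 10 days 12:00:00
</programlisting>
</para>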

2506

2507

<para>
<xref linkend="datatype-interval-input-examples"> shows some examples
of valid <type>interval</> input.
</para>

2511

2512

2513

<title>Interval Input</title>

2514

2515

<thead>

2516

<row>

2517

<entry>Example</entry>

2518

<entry>Description</entry>

2519

</row>

2520

</thead>

2521

<tbody>

2522

<row>

2523

2524

<entry>SQL standard format: 1 year 2 months</entry>

2525

</row>

2526

<row>

2527

2528

<entry>SQL standard format: 3 days 4 hours 5 minutes 6 seconds</entry>

2529

</row>

2530

<row>

2531

<entry>1 year 2 months 3 days 4 hours 5 minutes 6 seconds</entry>

2532

<entry>Traditional Postgres format: 1 year 2 months 3 days 4 hours 5 minutes 6 seconds</entry>

2533

</row>

2534

<row>

2535

2536

<entry>ISO 8601 <quote>format with designators</>: same meaning as above</entry>

2537

</row>

2538

<row>

2539

2540

<entry>ISO 8601 <quote>alternative format</>: same meaning as above</entry>

2541

</row>

2542

</tbody>

2543

</tgroup>

2544

</table>

2545

2546

</sect2>

2547

2548

2549

<title>Interval Output</title>

2550

2551

2552

<primary>interval</primary>

2553

<secondary>output format</secondary>

2554

<seealso>formatting</seealso>

2555

</indexterm>

2556

2557

<para>
The output format of the interval type can be set to one of the
four styles <literal>sql_standard</>, <literal>postgres</>,
<literal>postgres_verbose</>, or <literal>iso_8601</>,
using the command <literal>SET intervalstyle</literal>.
The default is the <literal>postgres</> format.
<xref linkend="interval-style-output-table"> shows examples of each
output style.
</para>
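
<para>
As a minimal sketch, the style can be switched for the current
session and the same value displayed in two ways:
<programlisting>
SET intervalstyle TO iso_8601;
SELECT INTERVAL '1 year 2 months';  -- P1Y2M

SET intervalstyle TO postgres;      -- back to the default style
SELECT INTERVAL '1 year 2 months';  -- 1 year 2 mons
</programlisting>
</para>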

2566

2567

<para>
The <literal>sql_standard</> style produces output that conforms to
the SQL standard's specification for interval literal strings, if
the interval value meets the standard's restrictions (either year-month
only or day-time only, with no mixing of positive
and negative components). Otherwise the output looks like a standard
year-month literal string followed by a day-time literal string,
with explicit signs added to disambiguate mixed-sign intervals.
</para>

2576

2577

<para>

2578

The output of the <literal>postgres</> style matches the output of

2579

<productname>PostgreSQL</> releases prior to 8.4 when the

2580

<xref linkend="guc-datestyle"> parameter was set to <literal>ISO</>.

2581

</para>

2582

2583

<para>

2584

The output of the <literal>postgres_verbose</> style matches the output of

2585

<productname>PostgreSQL</> releases prior to 8.4 when the

2586

<varname>DateStyle</> parameter was set to non-<literal>ISO</> output.

2587

</para>

2588

2589

<para>

2590

The output of the <literal>iso_8601</> style matches the <quote>format

2591

with designators</> described in section 4.4.3.2 of the

2592

ISO 8601 standard.

2593

</para>
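<para>
 As a minimal sketch of switching styles within a session, the
 following statements display the same value under two of the styles
 described above.  The interval literal is chosen only for illustration,
 and the output is deliberately not reproduced here:
<programlisting>
-- Show one interval value in the ISO 8601 style
SET intervalstyle = 'iso_8601';
SELECT INTERVAL '1 year 2 months 3 days 4 hours 5 minutes 6 seconds';

-- Show the same value in the verbose traditional style
SET intervalstyle = 'postgres_verbose';
SELECT INTERVAL '1 year 2 months 3 days 4 hours 5 minutes 6 seconds';

-- Return to the configured default
RESET intervalstyle;
</programlisting>
</para>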

2594

2595

2596

<title>Interval Output Style Examples</title>

2597

2598

<thead>

2599

<row>

2600

<entry>Style Specification</entry>

2601

<entry>Year-Month Interval</entry>

2602

<entry>Day-Time Interval</entry>

2603

<entry>Mixed Interval</entry>

2604

</row>

2605

</thead>

2606

<tbody>

2607

<row>

2608

<entry><literal>sql_standard</></entry>

2609

2610

2611

2612

</row>

2613

<row>

2614

<entry><literal>postgres</></entry>

2615

2616

2617

2618

</row>

2619

<row>

2620

<entry><literal>postgres_verbose</></entry>

2621

2622

<entry>@ 3 days 4 hours 5 mins 6 secs</entry>

2623

<entry>@ 1 year 2 mons -3 days 4 hours 5 mins 6 secs ago</entry>

2624

</row>

2625

<row>

2626

2627

2628

2629

2630

</row>

2631

</tbody>

2632

</tgroup>

2633

</table>

2634

2635

</sect2>

2636

2637

2638

<title>Internals</title>

2639

2640

<para>

2641

<productname>PostgreSQL</productname> uses Julian dates
for all date/time calculations. This has the useful property of correctly
calculating dates from 4713 BC
to far into the future, using the assumption that the length of the
year is 365.2425 days.

2646

</para>

2647

2648

<para>

2649

Date conventions before the 19th century make for interesting reading,

2650

but are not consistent enough to warrant coding into a date/time handler.

2651

</para>

2652

</sect2>

2653

2654

</sect1>

2655

2656

2657

<title>Boolean Type</title>

2658

2659

2660

<primary>Boolean</primary>

2661

2662

</indexterm>

2663

2664

2665

2666

</indexterm>

2667

2668

2669

<primary>false</primary>

2670

</indexterm>

2671

2672

<para>

2673

<productname>PostgreSQL</productname> provides the

2674

standard <acronym>SQL</acronym> type <type>boolean</type>.

2675

<type>boolean</type> can have one of only two states:

2676

<quote>true</quote> or <quote>false</quote>. A third state,

2677

<quote>unknown</quote>, is represented by the

2678

<acronym>SQL</acronym> null value.

2679

</para>

2680

2681

<para>

2682

Valid literal values for the <quote>true</quote> state are:

2683

2684

2685

2686

2687

2688

2689

2690

2691

</simplelist>

2692

For the <quote>false</quote> state, the following values can be

2693

used:

2694

2695

<member><literal>FALSE</literal></member>

2696

2697

<member><literal>'false'</literal></member>

2698

2699

2700

2701

2702

</simplelist>

2703

Leading and trailing whitespace is ignored. Using the key words

2704

<literal>TRUE</literal> and <literal>FALSE</literal> is preferred

2705

(and <acronym>SQL</acronym>-compliant).

2706

</para>
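<para>
 The following sketch casts string literals to <type>boolean</type>
 to illustrate that surrounding whitespace is ignored, and uses
 <literal>IS UNKNOWN</literal> to test for the null
 (<quote>unknown</quote>) state.  The column aliases are arbitrary:
<programlisting>
SELECT 'true'::boolean AS a,
       '  false  '::boolean AS b,
       NULL::boolean IS UNKNOWN AS c;
 a | b | c
---+---+---
 t | f | t
</programlisting>
</para>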

2707

2708

2709

<title>Using the <type>boolean</type> type</title>

2710

<programlisting>
CREATE TABLE test1 (a boolean, b text);
INSERT INTO test1 VALUES (TRUE, 'sic est');
INSERT INTO test1 VALUES (FALSE, 'non est');
SELECT * FROM test1;
 a |    b
---+---------
 t | sic est
 f | non est

SELECT * FROM test1 WHERE a;
 a |    b
---+---------
 t | sic est

2725

</programlisting>

2726

</example>

2727

2728

<para>

2729

<xref linkend="datatype-boolean-example"> shows that

2730

<type>boolean</type> values are output using the letters

2731

<literal>t</literal> and <literal>f</literal>.

2732

</para>

2733

2734

<para>

2735

<type>boolean</type> uses 1 byte of storage.

2736

</para>

2737

</sect1>

2738

2739

2740

<title>Enumerated Types</title>

2741

2742

2743

2744

<secondary>enumerated (enum)</secondary>

2745

</indexterm>

2746

2747

2748

<primary>enumerated types</primary>

2749

</indexterm>

2750

2751

<para>

2752

Enumerated (enum) types are data types that
comprise a static, ordered set of values.
They are equivalent to the <type>enum</type>
types supported in a number of programming languages. An example of an enum
type might be the days of the week, or a set of status values for
a piece of data.

2758

</para>

2759

2760

<sect2>

2761

<title>Declaration of Enumerated Types</title>

2762

2763

<para>

2764

Enum types are created using the <xref

2765

linkend="sql-createtype" endterm="sql-createtype-title"> command,

2766

for example:

2767

<programlisting>
CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');

2770

</programlisting>

2771

2772

Once created, the enum type can be used in table and function

2773

definitions much like any other type:

2774

</para>

2775

2776

2777

<title>Basic Enum Usage</title>

<programlisting>
CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
CREATE TABLE person (
    name text,
    current_mood mood
);
INSERT INTO person VALUES ('Moe', 'happy');
SELECT * FROM person WHERE current_mood = 'happy';
 name | current_mood
------+--------------
 Moe  | happy
(1 row)

2790

</programlisting>

2791

</example>

2792

</sect2>

2793

2794

<sect2>

2795

<title>Ordering</title>

2796

2797

<para>

2798

The ordering of the values in an enum type is the

2799

order in which the values were listed when the type was declared.

2800

All standard comparison operators and related

2801

aggregate functions are supported for enums. For example:

2802

</para>

2803

2804

2805

<title>Enum Ordering</title>

<programlisting>
INSERT INTO person VALUES ('Larry', 'sad');
INSERT INTO person VALUES ('Curly', 'ok');
SELECT * FROM person WHERE current_mood > 'sad';
 name  | current_mood
-------+--------------
 Moe   | happy
 Curly | ok
(2 rows)

SELECT * FROM person WHERE current_mood > 'sad' ORDER BY current_mood;
 name  | current_mood
-------+--------------
 Curly | ok
 Moe   | happy
(2 rows)

SELECT name FROM person
  WHERE current_mood = (SELECT MIN(current_mood) FROM person);
 name
-------
 Larry
(1 row)

2829

</programlisting>

2830

</example>

2831

</sect2>

2832

2833

<sect2>

2834

<title>Type Safety</title>

2835

2836

<para>

2837

Each enumerated data type is separate and cannot
be compared with other enumerated types.

2839

</para>

2840

2841

2842

<title>Lack of Casting</title>

<programlisting>
CREATE TYPE happiness AS ENUM ('happy', 'very happy', 'ecstatic');
CREATE TABLE holidays (
    num_weeks int,
    happiness happiness
);
INSERT INTO holidays(num_weeks,happiness) VALUES (4, 'happy');
INSERT INTO holidays(num_weeks,happiness) VALUES (6, 'very happy');
INSERT INTO holidays(num_weeks,happiness) VALUES (8, 'ecstatic');
INSERT INTO holidays(num_weeks,happiness) VALUES (2, 'sad');
ERROR:  invalid input value for enum happiness: "sad"
SELECT person.name, holidays.num_weeks FROM person, holidays
  WHERE person.current_mood = holidays.happiness;
ERROR:  operator does not exist: mood = happiness

2857

</programlisting>

2858

</example>

2859

2860

<para>

2861

If you really need to compare values of different enum types, you can either
write a custom operator or add explicit casts to your query:

2863

</para>

2864

2865

2866

<title>Comparing Different Enums by Casting to Text</title>

<programlisting>
SELECT person.name, holidays.num_weeks FROM person, holidays
  WHERE person.current_mood::text = holidays.happiness::text;
 name | num_weeks
------+-----------
 Moe  |         4
(1 row)

2874

2875

</programlisting>

2876

</example>

2877

</sect2>

2878

2879

<sect2>

2880

<title>Implementation Details</title>

2881

2882

<para>

2883

An enum value occupies four bytes on disk. The length of an enum

2884

value's textual label is limited by the <symbol>NAMEDATALEN</symbol>

2885

setting compiled into <productname>PostgreSQL</productname>; in standard

2886

builds this means at most 63 bytes.

2887

</para>

2888

2889

<para>

2890

Enum labels are case sensitive, so

2891

<type>'happy'</type> is not the same as <type>'HAPPY'</type>.

2892

Spaces in the labels are significant, too.

2893

</para>

2894

2895

<para>

2896

The translations from internal enum values to textual labels are

2897

kept in the system catalog

2898

2899

Querying this catalog directly can be useful.

2900

</para>
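<para>
 As a sketch of such a direct query, and assuming that the catalog in
 question is <structname>pg_enum</structname> and that the
 <type>mood</type> type from the earlier examples exists, the labels of
 one enum type can be listed as follows.  Ordering by
 <structfield>oid</structfield> is an assumption about how this release
 tracks declaration order and should not be relied upon elsewhere:
<programlisting>
-- List the labels belonging to the enum type "mood"
SELECT enumlabel
  FROM pg_enum
 WHERE enumtypid = 'mood'::regtype
 ORDER BY oid;
</programlisting>
</para>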

2901

2902

</sect2>

2903

</sect1>

2904

2905

2906

<title>Geometric Types</title>

2907

2908

<para>

2909

Geometric data types represent two-dimensional spatial

2910

objects. <xref linkend="datatype-geo-table"> shows the geometric

2911

types available in <productname>PostgreSQL</productname>. The

2912

most fundamental type, the point, forms the basis for all of the

2913

other types.

2914

</para>

2915

2916

2917

<title>Geometric Types</title>

2918

2919

<thead>

2920

<row>

2921

2922

<entry>Storage Size</entry>

2923

<entry>Representation</entry>

2924

<entry>Description</entry>

2925

</row>

2926

</thead>

2927

<tbody>

2928

<row>

2929

<entry><type>point</type></entry>

2930

<entry>16 bytes</entry>

2931

<entry>Point on the plane</entry>

2932

2933

</row>

2934

<row>

2935

2936

<entry>32 bytes</entry>

2937

<entry>Infinite line (not fully implemented)</entry>

2938

2939

</row>

2940

<row>

2941

2942

<entry>32 bytes</entry>

2943

<entry>Finite line segment</entry>

2944

2945

</row>

2946

<row>

2947

2948

<entry>32 bytes</entry>

2949

<entry>Rectangular box</entry>

2950

2951

</row>

2952

<row>

2953

2954

<entry>16+16n bytes</entry>

2955

<entry>Closed path (similar to polygon)</entry>

2956

2957

</row>

2958

<row>

2959

2960

<entry>16+16n bytes</entry>

2961

2962

2963

</row>

2964

<row>

2965

<entry><type>polygon</type></entry>

2966

<entry>40+16n bytes</entry>

2967

<entry>Polygon (similar to closed path)</entry>

2968

2969

</row>

2970

<row>

2971

<entry><type>circle</type></entry>

2972

<entry>24 bytes</entry>

2973

<entry>Circle</entry>

2974

<entry><(x,y),r> (center and radius)</entry>

2975

</row>

2976

</tbody>

2977

</tgroup>

2978

</table>

2979

2980

<para>

2981

A rich set of functions and operators is available to perform various geometric

2982

operations such as scaling, translation, rotation, and determining

2983

intersections. They are explained in <xref linkend="functions-geometry">.

2984

</para>

2985

2986

<sect2>

2987

<title>Points</title>

2988

2989

2990

<primary>point</primary>

2991

</indexterm>

2992

2993

<para>

2994

Points are the fundamental two-dimensional building block for geometric types.

2995

Values of type <type>point</type> are specified using the following syntax:

2996

2997

2998

( <replaceable>x</replaceable> , <replaceable>y</replaceable> )

2999

3000

</synopsis>

3001

3002

where <replaceable>x</> and <replaceable>y</> are the respective

3003

coordinates as floating-point numbers.

3004

</para>
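<para>
 For illustration, a point can be written as a quoted literal in the
 syntax above or built with the <function>point</function> function;
 the coordinates here are arbitrary:
<programlisting>
-- Both expressions denote the same point
SELECT '(1.5, -2)'::point AS from_literal,
       point(1.5, -2)     AS from_function;
</programlisting>
</para>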

3005

</sect2>

3006

3007

<sect2>

3008

<title>Line Segments</title>

3009

3010

3011

3012

</indexterm>

3013

3014

3015

<primary>line segment</primary>

3016

</indexterm>

3017

3018

<para>

3019

Line segments (<type>lseg</type>) are represented by pairs of points.

3020

Values of type <type>lseg</type> are specified using the following syntax:

3021

3022

3023

( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) )

3024

( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> )

3025

3026

</synopsis>

3027

3028

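where
<literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal>
and
<literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal>
are the end points of the line segment.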

3033

</para>

3034

</sect2>

3035

3036

<sect2>

3037

<title>Boxes</title>

3038

3039

3040

3041

</indexterm>

3042

3043

3044

<primary>rectangle</primary>

3045

</indexterm>

3046

3047

<para>

3048

Boxes are represented by pairs of points that are opposite

3049

corners of the box.

3050

Values of type <type>box</type> are specified using the following syntax:

3051

3052

3053

( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) )

3054

( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> )

3055

3056

</synopsis>

3057

3058

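where
<literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal>
and
<literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal>
are any two opposite corners of the box.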

3063

</para>

3064

3065

<para>

3066

Boxes are output using the second syntax.
Any two opposite corners can be supplied on input, but the values
are reordered as needed to store the
upper right and lower left corners, in that order.

3071

</para>
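<para>
 The reordering of corners can be seen in a short sketch.  Both inputs
 below describe the same rectangle using different pairs of opposite
 corners (the values are arbitrary), and both are displayed with the
 upper right corner first:
<programlisting>
SELECT '((0,0),(2,3))'::box AS b1,
       '(2,0),(0,3)'::box   AS b2;
     b1      |     b2
-------------+-------------
 (2,3),(0,0) | (2,3),(0,0)
</programlisting>
</para>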

3072

</sect2>

3073

3074

<sect2>

3075

<title>Paths</title>

3076

3077

3078

3079

</indexterm>

3080

3081

<para>

3082

Paths are represented by lists of connected points. Paths can be

3083

<firstterm>open</firstterm>, where

3084

the first and last points in the list are not considered connected, or

3085

<firstterm>closed</firstterm>,

3086

where the first and last points are considered connected.

3087

</para>

3088

3089

<para>

3090

Values of type <type>path</type> are specified using the following syntax:

3091

3092

3093

( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) )

3094

[ ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) ]

3095

( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )

3096

( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> , ... , <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )

3097

3098

</synopsis>

3099

3100

where the points are the end points of the line segments

3101

comprising the path. Square brackets (<literal>[]</>) indicate

3102

an open path, while parentheses (<literal>()</>) indicate a

3103

closed path.

3104

</para>

3105

3106

<para>

3107

Paths are output using the first or second syntax, as appropriate.

3108

</para>
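<para>
 As an illustrative sketch, the <function>isclosed</function> and
 <function>isopen</function> functions report how a path literal was
 interpreted; the coordinates are arbitrary:
<programlisting>
-- Parentheses give a closed path, square brackets an open one
SELECT isclosed(path '((0,0),(1,1),(2,0))') AS closed_path,
       isopen(path '[(0,0),(1,1),(2,0)]')   AS open_path;
 closed_path | open_path
-------------+-----------
 t           | t
</programlisting>
</para>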

3109

</sect2>

3110

3111

<sect2>

3112

<title>Polygons</title>

3113

3114

3115

<primary>polygon</primary>

3116

</indexterm>

3117

3118

<para>

3119

Polygons are represented by lists of points (the vertexes of the
polygon). Polygons are very similar to closed paths, but are
stored differently
and have their own set of support routines.

3123

</para>

3124

3125

<para>

3126

Values of type <type>polygon</type> are specified using the following syntax:

3127

3128

3129

( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) )

3130

( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )

3131

( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> , ... , <replaceable>xn</replaceable> , <replaceable>yn</replaceable> )

3132

3133

</synopsis>

3134

3135

where the points are the end points of the line segments

3136

comprising the boundary of the polygon.

3137

</para>

3138

3139

<para>

3140

Polygons are output using the first syntax.

3141

</para>

3142

</sect2>

3143

3144

<sect2>

3145

<title>Circles</title>

3146

3147

3148

<primary>circle</primary>

3149

</indexterm>

3150

3151

<para>

3152

Circles are represented by a center point and a radius.

3153

Values of type <type>circle</type> are specified using the following syntax:

3154

3155

3156

< ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable> >

3157

( ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable> )

3158

( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable>

3159

3160

</synopsis>

3161

3162

where
<literal>(<replaceable>x</replaceable>,<replaceable>y</replaceable>)</literal>
is the center and <replaceable>r</replaceable> is the radius of the circle.

3165

</para>

3166

3167

<para>

3168

Circles are output using the first syntax.

3169

</para>
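<para>
 A brief sketch, using an arbitrary circle written in the parenthesized
 input form, shows how the components can be read back with the
 <function>center</function>, <function>radius</function>, and
 <function>diameter</function> functions:
<programlisting>
-- A circle centered at the origin with radius 5
SELECT center(circle '((0,0),5)')   AS c,
       radius(circle '((0,0),5)')   AS r,
       diameter(circle '((0,0),5)') AS d;
</programlisting>
</para>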

3170

</sect2>

3171

3172

</sect1>

3173

3174

3175

<title>Network Address Types</title>

3176

3177

3178

<primary>network</primary>

3179

<secondary>data types</secondary>

3180

</indexterm>

3181

3182

<para>

3183

<productname>PostgreSQL</> offers data types to store IPv4, IPv6, and MAC

3184

addresses, as shown in <xref linkend="datatype-net-types-table">. It

3185

is preferable to use these types instead of plain text types to store

3186

network addresses, because

3187

these types offer input error checking and several specialized

3188

operators and functions (see <xref linkend="functions-net">).

3189

</para>

3190

3191

3192

<title>Network Address Types</title>

3193

3194

<thead>

3195

<row>

3196

3197

<entry>Storage Size</entry>

3198

<entry>Description</entry>

3199

</row>

3200

</thead>

3201

<tbody>

3202

3203

<row>

3204

3205

<entry>7 or 19 bytes</entry>

3206

<entry>IPv4 and IPv6 networks</entry>

3207

</row>

3208

3209

<row>

3210

3211

<entry>7 or 19 bytes</entry>

3212

<entry>IPv4 and IPv6 hosts and networks</entry>

3213

</row>

3214

3215

<row>

3216

<entry><type>macaddr</type></entry>

3217

<entry>6 bytes</entry>

3218

<entry>MAC addresses</entry>

3219

</row>

3220

3221

</tbody>

3222

</tgroup>

3223

</table>

3224

3225

<para>

3226

When sorting <type>inet</type> or <type>cidr</type> data types,

3227

IPv4 addresses will always sort before IPv6 addresses, including

3228

IPv4 addresses encapsulated or mapped into IPv6 addresses, such as

3229

::10.2.3.4 or ::ffff:10.4.3.2.

3230

</para>
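<para>
 The sort order described above can be checked with a small sketch.
 The three addresses are illustrative; the plain IPv4 address is
 expected to come first even though the other two embed IPv4 addresses:
<programlisting>
SELECT a
  FROM (VALUES ('::10.2.3.4'::inet),
               ('10.1.2.3'::inet),
               ('::ffff:10.4.3.2'::inet)) AS t(a)
 ORDER BY a;
</programlisting>
</para>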

3231

3232

3233

3234

3235

3236

3237

3238

</indexterm>

3239

3240

<para>

3241

The <type>inet</type> type holds an IPv4 or IPv6 host address, and

3242

optionally the identity of the subnet it is in, all in one field.

3243

The subnet identity is represented by stating how many bits of

3244

the host address represent the network address (the

3245

<quote>netmask</quote>). If the netmask is 32 and the address is IPv4,

3246

then the value does not indicate a subnet, only a single host.

3247

In IPv6, the address length is 128 bits, so 128 bits specify a

3248

unique host address. Note that if you

3249

want to accept networks only, you should use the

3250

<type>cidr</type> type rather than <type>inet</type>.

3251

</para>

3252

3253

<para>

3254

The input format for this type is
<replaceable class="parameter">address/y</replaceable>
where
<replaceable class="parameter">address</replaceable>
is an IPv4 or IPv6 address and
<replaceable class="parameter">y</replaceable>
is the number of bits in the netmask. If the
<replaceable class="parameter">/y</replaceable>
part is left off, then the
netmask is 32 for IPv4 and 128 for IPv6, so the value represents
just a single host. On display, the
<replaceable class="parameter">/y</replaceable>
portion is suppressed if the netmask specifies a single host.

3267

</para>
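<para>
 The display rule can be illustrated with a short sketch using an
 arbitrary IPv4 address.  Only the first value is expected to be shown
 with its netmask, since the other two denote a single host:
<programlisting>
SELECT '192.168.1.5/24'::inet AS with_mask,
       '192.168.1.5/32'::inet AS single_host,
       '192.168.1.5'::inet    AS no_mask;
</programlisting>
</para>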

3268

</sect2>

3269

3270

3271

3272

3273

3274

3275

</indexterm>

3276

3277

<para>

3278

The <type>cidr</type> type holds an IPv4 or IPv6 network specification.

3279

Input and output formats follow Classless Inter-Domain Routing

3280

conventions.

3281

The format for specifying networks is <replaceable

3282

class="parameter">address/y</> where <replaceable

3283

class="parameter">address</> is the network represented as an

3284

IPv4 or IPv6 address, and <replaceable

3285

class="parameter">y</> is the number of bits in the netmask. If

3286

<replaceable class="parameter">y</> is omitted, it is calculated

3287

using assumptions from the older classful network numbering system, except

3288

that it will be at least large enough to include all of the octets

3289

written in the input. It is an error to specify a network address

3290

that has bits set to the right of the specified netmask.

3291

</para>
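<para>
 The restriction on bits to the right of the netmask can be sketched
 as follows.  The addresses are arbitrary, and the exact error message
 is not reproduced here:
<programlisting>
-- Accepted: no bits are set to the right of the 24-bit netmask
SELECT '192.168.1.0/24'::cidr;

-- Rejected with an error: the final octet lies outside the netmask
SELECT '192.168.1.5/24'::cidr;

-- The same text is a valid inet value (a host within a subnet)
SELECT '192.168.1.5/24'::inet;
</programlisting>
</para>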

3292

3293

<para>

3294

<xref linkend="datatype-net-cidr-table"> shows some examples.

3295

</para>

3296

3297

3298

<title><type>cidr</> Type Input Examples</title>

3299

3300

<thead>

3301

<row>

3302

<entry><type>cidr</type> Input</entry>

3303

<entry><type>cidr</type> Output</entry>

3304

<entry><literal><function>abbrev</function>(<type>cidr</type>)</literal></entry>

3305

</row>

3306

</thead>

3307

<tbody>

3308

<row>

3309

3310

3311

3312

</row>

3313

<row>

3314

3315

3316

3317

</row>

3318

<row>

3319

3320

3321

3322

</row>

3323

<row>

3324

3325

3326

3327

</row>

3328

<row>

3329

3330

3331

3332

</row>

3333

<row>

3334

3335

3336

3337

</row>

3338

<row>

3339

3340

3341

3342

</row>

3343

<row>

3344

3345

3346

3347

</row>

3348

<row>

3349

3350

3351

3352

</row>

3353

<row>

3354

3355

3356

3357

</row>

3358

<row>

3359

3360

3361

3362

</row>

3363

<row>

3364

3365

3366

3367

</row>

3368

<row>

3369

3370

3371

3372

</row>

3373

<row>

3374

3375

3376

3377

</row>

3378

<row>

3379

3380

3381

3382

</row>

3383

<row>

3384

3385

3386

3387

</row>

3388

</tbody>

3389

</tgroup>

3390

</table>

3391

</sect2>

3392

3393

3394

3395

3396

<para>

3397

The essential difference between <type>inet</type> and <type>cidr</type>

3398

data types is that <type>inet</type> accepts values with nonzero bits to

3399

the right of the netmask, whereas <type>cidr</type> does not.

3400

</para>

3401

3402

<tip>

3403

<para>

3404

If you do not like the output format for <type>inet</type> or

3405

<type>cidr</type> values, try the functions <function>host</>,

3406

<function>text</>, and <function>abbrev</>.

3407

</para>

3408

</tip>
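<para>
 As a sketch of the three functions named in the tip, applied to
 arbitrary example values (expected results are noted in the comments):
<programlisting>
SELECT host('192.168.1.5/24'::inet);   -- address only: 192.168.1.5
SELECT text('192.168.1.5'::inet);      -- address and netmask: 192.168.1.5/32
SELECT abbrev('10.1.0.0/16'::cidr);    -- abbreviated form: 10.1/16
</programlisting>
</para>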

3409

</sect2>

3410

3411

3412

<title><type>macaddr</></>

3413

3414

3415

<primary>macaddr (data type)</primary>

3416

</indexterm>

3417

3418

3419

<primary>MAC address</primary>

3420

<see>macaddr</see>

3421

</indexterm>

3422

3423

<para>

3424

The <type>macaddr</> type stores MAC addresses, known for example

3425

from Ethernet card hardware addresses (although MAC addresses are

3426

used for other purposes as well). Input is accepted in the

3427

following formats:

3428

3429

3430

3431

3432

3433

3434

3435

3436

</simplelist>

3437

3438

These examples would all specify the same address. Upper and

3439

lower case is accepted for the digits

3440

<literal>a</> through <literal>f</>. Output is always in the

3441

first of the forms shown.

3442

</para>

3443

3444

<para>

3445

IEEE Std 802-2001 specifies the second shown form (with hyphens)

3446

as the canonical form for MAC addresses, and specifies the first

3447

form (with colons) as the bit-reversed notation, so that

3448

08-00-2b-01-02-03 = 01:00:4D:08:04:0C. This convention is widely

3449

ignored nowadays, and it is only relevant for obsolete network

3450

protocols (such as Token Ring). PostgreSQL makes no provisions

3451

for bit reversal, and all accepted formats use the canonical LSB

3452

order.

3453

</para>

3454

3455

<para>

3456

The remaining four input formats are not part of any standard.

3457

</para>
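<para>
 A short sketch, reusing the address from the discussion above, shows
 that the hyphenated and colon-separated spellings denote the same
 value and that output uses lower-case digits separated by colons:
<programlisting>
SELECT '08-00-2B-01-02-03'::macaddr AS hyphens,
       '08:00:2b:01:02:03'::macaddr AS colons,
       '08-00-2B-01-02-03'::macaddr = '08:00:2b:01:02:03'::macaddr AS same_address;
      hyphens      |      colons       | same_address
-------------------+-------------------+--------------
 08:00:2b:01:02:03 | 08:00:2b:01:02:03 | t
</programlisting>
</para>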

3458

</sect2>

3459

3460

</sect1>

3461

3462

3463

<title>Bit String Types</title>

3464

3465

3466

<primary>bit string</primary>

3467

3468

</indexterm>

3469

3470

<para>

3471

Bit strings are strings of 1's and 0's. They can be used to store

3472

or visualize bit masks. There are two SQL bit types:

3473

<type>bit(<replaceable>n</replaceable>)</type> and <type>bit

3474

varying(<replaceable>n</replaceable>)</type>, where

3475

<replaceable>n</replaceable> is a positive integer.

3476

</para>

3477

3478

<para>

3479

<type>bit</type> type data must match the length

3480

<replaceable>n</replaceable> exactly; it is an error to attempt to

3481

store shorter or longer bit strings. <type>bit varying</type> data is

3482

of variable length up to the maximum length

3483

<replaceable>n</replaceable>; longer strings will be rejected.

3484

Writing <type>bit</type> without a length is equivalent to

3485

<literal>bit(1)</literal>, while <type>bit varying</type> without a length

3486

specification means unlimited length.

3487

</para>

3488

3489

<note>

3490

<para>

3491

If one explicitly casts a bit-string value to

3492

<type>bit(<replaceable>n</>)</type>, it will be truncated or

3493

zero-padded on the right to be exactly <replaceable>n</> bits,

3494

without raising an error. Similarly,

3495

if one explicitly casts a bit-string value to

3496

<type>bit varying(<replaceable>n</>)</type>, it will be truncated

3497

on the right if it is more than <replaceable>n</> bits.

3498

</para>

3499

</note>
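<para>
 The casting behavior described in the note can be sketched with a few
 arbitrary bit strings; the column aliases are for illustration only:
<programlisting>
SELECT B'10110'::bit(3)         AS truncated,
       B'10'::bit(5)            AS padded,
       B'10110'::bit varying(3) AS varying_truncated;
 truncated | padded | varying_truncated
-----------+--------+-------------------
 101       | 10000  | 101
</programlisting>
</para>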

3500

3501

<para>

3502

Refer to <xref

3503

linkend="sql-syntax-bit-strings"> for information about the syntax

3504

of bit string constants. Bit-logical operators and string

3505

manipulation functions are available; see <xref

3506

linkend="functions-bitstring">.

3507

</para>

3508

3509

3510

<title>Using the bit string types</title>

3511

<programlisting>
CREATE TABLE test (a BIT(3), b BIT VARYING(5));
INSERT INTO test VALUES (B'101', B'00');
INSERT INTO test VALUES (B'10', B'101');
<computeroutput>
ERROR:  bit string length 2 does not match type bit(3)
</computeroutput>
INSERT INTO test VALUES (B'10'::bit(3), B'101');
SELECT * FROM test;
<computeroutput>
  a  |  b
-----+-----
 101 | 00
 100 | 101

3526

</computeroutput>

3527

</programlisting>

3528

</example>

3529

3530

<para>

3531

A bit string value requires 1 byte for each group of 8 bits, plus

3532

5 or 8 bytes overhead depending on the length of the string

3533

(but long values may be compressed or moved out-of-line, as explained

3534

in <xref linkend="datatype-character"> for character strings).

3535

</para>

3536

</sect1>

3537

3538

3539

<title>Text Search Types</title>

3540

3541

3542

<primary>full text search</primary>

3543

<secondary>data types</secondary>

3544

</indexterm>

3545

3546

3547

<primary>text search</primary>

3548

<secondary>data types</secondary>

3549

</indexterm>

3550

3551

<para>

3552

<productname>PostgreSQL</productname> provides two data types that

3553

are designed to support full text search, which is the activity of

3554

searching through a collection of natural-language <firstterm>documents</>

3555

to locate those that best match a <firstterm>query</>.

3556

The <type>tsvector</type> type represents a document in a form suited

3557

for text search, while the <type>tsquery</type> type similarly represents

3558

a query.

3559

<xref linkend="textsearch"> provides a detailed explanation of this

3560

facility, and <xref linkend="functions-textsearch"> summarizes the

3561

related functions and operators.

3562

</para>

3563

3564

3565

<title><type>tsvector</type></title>

3566

3567

3568

<primary>tsvector (data type)</primary>

3569

</indexterm>

3570

3571

<para>

3572

A <type>tsvector</type> value is a sorted list of distinct

3573

<firstterm>lexemes</>, which are words that have been

3574

<firstterm>normalized</> to make different variants of the same word look

3575

alike (see <xref linkend="textsearch"> for details). Sorting and

3576

duplicate-elimination are done automatically during input, as shown in

3577

this example:

3578

<programlisting>
SELECT 'a fat cat sat on a mat and ate a fat rat'::tsvector;
                      tsvector
----------------------------------------------------
 'a' 'and' 'ate' 'cat' 'fat' 'mat' 'on' 'rat' 'sat'

3584

</programlisting>

3585

3586

To represent

3587

lexemes containing whitespace or punctuation, surround them with quotes:

3588

<programlisting>
SELECT $$the lexeme '    ' contains spaces$$::tsvector;
                 tsvector
-------------------------------------------
 '    ' 'contains' 'lexeme' 'spaces' 'the'

3594

</programlisting>

3595

3596

(We use dollar-quoted string literals in this example and the next one
to avoid the confusion of having to double quote marks within the
literals.) Embedded quotes and backslashes must be doubled:

3599

<programlisting>
SELECT $$the lexeme 'Joe''s' contains a quote$$::tsvector;
                    tsvector
------------------------------------------------
 'Joe''s' 'a' 'contains' 'lexeme' 'quote' 'the'

3605

</programlisting>

3606

3607

Optionally, integer <firstterm>position(s)</>

3608

can be attached to any or all of the lexemes:

3609

<programlisting>
SELECT 'a:1 fat:2 cat:3 sat:4 on:5 a:6 mat:7 and:8 ate:9 a:10 fat:11 rat:12'::tsvector;
                                   tsvector
-------------------------------------------------------------------------------
 'a':1,6,10 'and':8 'ate':9 'cat':3 'fat':2,11 'mat':7 'on':5 'rat':12 'sat':4

3615

</programlisting>

3616

3617

A position normally indicates the source word's location in the

3618

document. Positional information can be used for

3619

<firstterm>proximity ranking</firstterm>. Position values can

3620

range from 1 to 16383; larger numbers are silently clamped to 16383.

3621

Duplicate positions for the same lexeme are discarded.

3622

</para>

3623

3624

<para>

3625

Lexemes that have positions can further be labeled with a

3626

<firstterm>weight</>, which can be <literal>A</literal>,

3627

<literal>B</literal>, <literal>C</literal>, or <literal>D</literal>.

3628

<literal>D</literal> is the default and hence is not shown on output:

3629

<programlisting>
SELECT 'a:1A fat:2B,4C cat:5D'::tsvector;
          tsvector
----------------------------
 'a':1A 'cat':5 'fat':2B,4C

3635

</programlisting>

3636

3637

Weights are typically used to reflect document structure, for example

3638

by marking title words differently from body words. Text search

3639

ranking functions can assign different priorities to the different

3640

weight markers.

3641

</para>
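<para>
 As a minimal sketch of labeling lexemes after the fact, the
 <function>setweight</function> function assigns one weight to every
 position in a <type>tsvector</type>.  The input value is arbitrary:
<programlisting>
-- Mark every position with weight A, for example for title words
SELECT setweight('fat:2 rat:3'::tsvector, 'A');
     setweight
-------------------
 'fat':2A 'rat':3A
</programlisting>
</para>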

3642

3643

<para>

3644

It is important to understand that the

3645

<type>tsvector</type> type itself does not perform any normalization;

3646

it assumes that the words it is given are normalized appropriately

3647

for the application. For example,

3648

<programlisting>
SELECT 'The Fat Rats'::tsvector;
      tsvector
--------------------
 'Fat' 'Rats' 'The'

3654

</programlisting>

3655

3656

For most English-text-searching applications the above words would

3657

be considered non-normalized, but <type>tsvector</type> doesn't care.

3658

Raw document text should usually be passed through

3659

<function>to_tsvector</> to normalize the words appropriately

3660

for searching:

3661

<programlisting>
SELECT to_tsvector('english', 'The Fat Rats');
   to_tsvector
-----------------
 'fat':2 'rat':3

3667

</programlisting>

3668

3669

Again, see <xref linkend="textsearch"> for more detail.

3670

</para>

3671

3672

</sect2>

3673

3674

3675

<title><type>tsquery</type></title>

3676

3677

3678

<primary>tsquery (data type)</primary>

3679

</indexterm>

3680

3681

<para>

3682

A <type>tsquery</type> value stores lexemes that are to be

3683

searched for, and combines them using the boolean operators

3684

<literal>&</literal> (AND), <literal>|</literal> (OR), and

3685

<literal>!</> (NOT). Parentheses can be used to enforce grouping

3686

of the operators:

<programlisting>
SELECT 'fat & rat'::tsquery;
    tsquery
---------------
 'fat' & 'rat'

SELECT 'fat & (rat | cat)'::tsquery;
          tsquery
---------------------------
 'fat' & ( 'rat' | 'cat' )

SELECT 'fat & rat & ! cat'::tsquery;
        tsquery
------------------------
 'fat' & 'rat' & !'cat'
</programlisting>

3704

3705

In the absence of parentheses, <literal>!</> (NOT) binds most tightly,

3706

and <literal>&</literal> (AND) binds more tightly than

3707

<literal>|</literal> (OR).

3708

</para>
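<para>
The precedence rules can be checked against small hand-written
<type>tsvector</type> values. This is only a sketch, and the lexemes are
arbitrary:
<programlisting><![CDATA[
-- Parsed as fat | (rat & cat), so a vector containing only 'fat' matches.
SELECT 'fat'::tsvector @@ 'fat | rat & cat'::tsquery;      -- expected: true

-- Explicit parentheses change the grouping; 'cat' is now required.
SELECT 'fat'::tsvector @@ '(fat | rat) & cat'::tsquery;    -- expected: false

-- Parsed as (!fat) & cat, not !(fat & cat), so 'cat' is still required.
SELECT 'rat'::tsvector @@ '!fat & cat'::tsquery;           -- expected: false
]]></programlisting>
</para>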

3709

3710

<para>

3711

Optionally, lexemes in a <type>tsquery</type> can be labeled with

3712

one or more weight letters, which restricts them to match only

3713

<type>tsvector</> lexemes with one of those weights:

<programlisting>
SELECT 'fat:ab & cat'::tsquery;
     tsquery
------------------
 'fat':AB & 'cat'
</programlisting>

3721

</para>
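<para>
A short sketch of how a weight restriction affects matching, using a
hand-written <type>tsvector</type> in which <literal>fat</literal> carries
weight <literal>A</literal> and <literal>cat</literal> carries the default
weight:
<programlisting><![CDATA[
SELECT 'fat:2A cat:5'::tsvector @@ 'fat:A'::tsquery;         -- expected: true
SELECT 'fat:2A cat:5'::tsvector @@ 'fat:B'::tsquery;         -- expected: false
SELECT 'fat:2A cat:5'::tsvector @@ 'fat:AB & cat'::tsquery;  -- expected: true
]]></programlisting>
</para>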

3722

3723

<para>

3724

Also, lexemes in a <type>tsquery</type> can be labeled with <literal>*</>

3725

to specify prefix matching:

<programlisting>
SELECT 'super:*'::tsquery;
  tsquery
-----------
 'super':*
</programlisting>

3732

This query will match any word in a <type>tsvector</> that begins

3733

with <quote>super</>.

3734

</para>
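<para>
For illustration, a prefix query can be tried against hand-written
<type>tsvector</type> values (the words are arbitrary examples):
<programlisting><![CDATA[
-- 'supernova' begins with 'super', so the prefix query matches.
SELECT 'supernova star'::tsvector @@ 'super:*'::tsquery;   -- expected: true

-- No lexeme begins with 'super'.
SELECT 'star'::tsvector @@ 'super:*'::tsquery;             -- expected: false
]]></programlisting>
</para>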

3735

3736

<para>

3737

Quoting rules for lexemes are the same as described above for

3738

lexemes in <type>tsvector</>; and, as with <type>tsvector</>,

3739

any required normalization of words must be done before putting

3740

them into the <type>tsquery</> type. The <function>to_tsquery</>

3741

function is convenient for performing such normalization:

<programlisting>
SELECT to_tsquery('Fat:ab & Cats');
    to_tsquery
------------------
 'fat':AB & 'cat'
</programlisting>

3749

</para>

3750

3751

</sect2>

3752

3753

</sect1>

3754

3755

3756

3757

3758

3759

3760

</indexterm>

3761

3762

<para>

3763

The data type <type>uuid</type> stores Universally Unique Identifiers

3764

(UUID) as defined by RFC 4122, ISO/IEC 9834-8:2005, and related standards.

3765

(Some systems refer to this data type as globally unique identifier, or

3766

GUID,<indexterm><primary>GUID</primary></indexterm> instead.) Such an

3767

identifier is a 128-bit quantity that is generated by an algorithm chosen

3768

to make it very unlikely that the same identifier will be generated by

3769

anyone else in the known universe using the same algorithm. Therefore,

3770

for distributed systems, these identifiers provide a better uniqueness

3771

guarantee than that which can be achieved using sequence generators, which

3772

are only unique within a single database.

3773

</para>

3774

3775

<para>

3776

A UUID is written as a sequence of lower-case hexadecimal digits,

3777

in several groups separated by hyphens, specifically a group of 8

3778

digits followed by three groups of 4 digits followed by a group of

3779

12 digits, for a total of 32 digits representing the 128 bits. An

3780

example of a UUID in this standard form is:

<programlisting>
a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11
</programlisting>

3784

<productname>PostgreSQL</productname> also accepts the following
alternative forms for input:
upper-case digits, the standard format surrounded by
braces, omission of some or all hyphens, and a hyphen added after any
group of four digits. Examples are:
<programlisting>
A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11
{a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11}
a0eebc999c0b4ef8bb6d6bb9bd380a11
a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11
{a0eebc99-9c0b4ef8-bb6d6bb9-bd380a11}
</programlisting>

3796

Output is always in the standard form.

3797

</para>
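<para>
As a brief sketch, the alternative input forms can be compared after
casting to <type>uuid</type>; all of them denote the same 128-bit value:
<programlisting><![CDATA[
-- Upper-case input; the result is displayed in the standard lower-case form.
SELECT 'A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11'::uuid;

-- Braced and unhyphenated inputs compare equal once converted.
SELECT '{a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11}'::uuid
     = 'a0eebc999c0b4ef8bb6d6bb9bd380a11'::uuid;   -- expected: true
]]></programlisting>
</para>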

3798

3799

<para>

3800

<productname>PostgreSQL</productname> provides storage and comparison

3801

functions for UUIDs, but the core database does not include any

3802

function for generating UUIDs, because no single algorithm is well

3803

suited for every application. The contrib module

3804

<filename>contrib/uuid-ossp</filename> provides functions that implement

3805

several standard algorithms.

3806

Alternatively, UUIDs could be generated by client applications or

3807

other libraries invoked through a server-side function.

3808

</para>
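<para>
A minimal sketch of server-side generation, assuming the
<filename>contrib/uuid-ossp</filename> module has already been installed
in the current database. The table and column names are hypothetical:
<programlisting><![CDATA[
-- Version 4 (random) UUID from uuid-ossp.
SELECT uuid_generate_v4();

-- Using a generated UUID as a column default.
CREATE TABLE items (
    id    uuid PRIMARY KEY DEFAULT uuid_generate_v4(),
    label text
);
INSERT INTO items (label) VALUES ('first item');
SELECT id, label FROM items;
]]></programlisting>
Which generator is appropriate depends on the application; the module
also offers other variants such as <function>uuid_generate_v1</function>.
</para>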

3809

</sect1>

3810

3811

3812

3813

3814

3815

3816

</indexterm>

3817

3818

<para>

3819

The data type <type>xml</type> can be used to store XML data. Its

3820

advantage over storing XML data in a <type>text</type> field is that it

3821

checks the input values for well-formedness, and there are support

3822

functions to perform type-safe operations on it; see <xref

3823

linkend="functions-xml">. Use of this data type requires the

3824

installation to have been built with <command>configure

3825

--with-libxml</>.

3826

</para>

3827

3828

<para>

3829

The <type>xml</type> type can store well-formed

3830

<quote>documents</quote>, as defined by the XML standard, as well

3831

as <quote>content</quote> fragments, which are defined by the

3832

production <literal>XMLDecl? content</literal> in the XML

3833

standard. Roughly, this means that content fragments can have

3834

more than one top-level element or character node. The expression

3835

<literal><replaceable>xmlvalue</replaceable> IS DOCUMENT</literal>

3836

can be used to evaluate whether a particular <type>xml</type>

3837

value is a full document or only a content fragment.

3838

</para>
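<para>
For example, the following sketch (which requires a server built with
libxml support) distinguishes a document from a content fragment:
<programlisting><![CDATA[
-- A single root element: a full document.
SELECT XMLPARSE(CONTENT '<foo>bar</foo>') IS DOCUMENT;                   -- expected: true

-- Leading text and two top-level elements: only a content fragment.
SELECT XMLPARSE(CONTENT 'abc<foo>bar</foo><bar>foo</bar>') IS DOCUMENT;  -- expected: false
]]></programlisting>
</para>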

3839

3840

<sect2>

3841

<title>Creating XML Values</title>

3842

<para>

3843

To produce a value of type <type>xml</type> from character data,

3844

use the function

3845

<function>xmlparse</function>:<indexterm><primary>xmlparse</primary></indexterm>

<synopsis>
XMLPARSE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable>)
</synopsis>

3849

Examples:

3850

<programlisting><![CDATA[

3851

XMLPARSE (DOCUMENT '<?xml version="1.0"?><book><title>Manual</title><chapter>...</chapter></book>')

3852

XMLPARSE (CONTENT 'abc<foo>bar</foo><bar>foo</bar>')

3853

]]></programlisting>

3854

While this is the only way to convert character strings into XML

3855

values according to the SQL standard, the PostgreSQL-specific

3856

syntaxes:

3857

<programlisting><![CDATA[

3858

xml '<foo>bar</foo>'

3859

'<foo>bar</foo>'::xml

3860

]]></programlisting>

3861

can also be used.

3862

</para>

3863

3864

<para>

3865

The <type>xml</type> type does not validate its input values

3866

against a possibly included document type declaration

3867

(DTD).<indexterm><primary>DTD</primary></indexterm>

3868

</para>

3869

3870

<para>

3871

The inverse operation, producing character string type values from

3872

<type>xml</type>, uses the function

3873

<function>xmlserialize</function>:<indexterm><primary>xmlserialize</primary></indexterm>

<synopsis>
XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <replaceable>type</replaceable> )
</synopsis>

3877

<replaceable>type</replaceable> can be one of

3878

<type>character</type>, <type>character varying</type>, or

3879

<type>text</type> (or an alias name for those). Again, according

3880

to the SQL standard, this is the only way to convert between type

3881

<type>xml</type> and character types, but PostgreSQL also allows

3882

you to simply cast the value.

3883

</para>
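<para>
Both routes can be sketched side by side; the sample value is arbitrary:
<programlisting><![CDATA[
-- SQL-standard conversion from xml to a character type.
SELECT XMLSERIALIZE(CONTENT XMLPARSE(CONTENT '<foo>bar</foo>') AS text);

-- PostgreSQL-specific shortcut: an ordinary cast.
SELECT CAST(XMLPARSE(CONTENT '<foo>bar</foo>') AS text);
]]></programlisting>
</para>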

3884

3885

<para>

3886

When character string values are cast to or from type
<type>xml</type> without going through <function>XMLPARSE</function> or
<function>XMLSERIALIZE</function>, respectively, the choice of

3889

<literal>DOCUMENT</literal> versus <literal>CONTENT</literal> is

3890

determined by the <quote>XML option</quote>

3891

<indexterm><primary>XML option</primary></indexterm>

3892

session configuration parameter, which can be set using the

3893

standard command

<synopsis>
SET XML OPTION { DOCUMENT | CONTENT };
</synopsis>
or the more PostgreSQL-like syntax
<synopsis>
SET xmloption TO { DOCUMENT | CONTENT };
</synopsis>

3901

The default is <literal>CONTENT</literal>, so all forms of XML

3902

data are allowed.

3903

</para>
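<para>
The effect of the setting on plain casts can be sketched as follows; the
input string is an arbitrary content fragment:
<programlisting><![CDATA[
SET xmloption TO DOCUMENT;
-- Expected to be rejected: leading text means this is not a document.
SELECT 'abc<foo>bar</foo>'::xml;

SET xmloption TO CONTENT;
-- Accepted: content fragments are allowed under this setting.
SELECT 'abc<foo>bar</foo>'::xml;
]]></programlisting>
</para>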

3904

</sect2>

3905

3906

<sect2>

3907

<title>Encoding Handling</title>

3908

<para>

3909

Care must be taken when dealing with multiple character encodings

3910

on the client, server, and in the XML data passed through them.

3911

When using the text mode to pass queries to the server and query

3912

results to the client (which is the normal mode), PostgreSQL

3913

converts all character data passed between the client and the

3914

server and vice versa to the character encoding of the respective

3915

end; see <xref linkend="multibyte">. This includes string

3916

representations of XML values, such as in the above examples.

3917

This would ordinarily mean that encoding declarations contained in

3918

XML data might become invalid as the character data is converted

3919

to other encodings while travelling between client and server,

3920

while the embedded encoding declaration is not changed. To cope

3921

with this behavior, an encoding declaration contained in a

3922

character string presented for input to the <type>xml</type> type

3923

is <emphasis>ignored</emphasis>, and the content is always assumed

3924

to be in the current server encoding. Consequently, for correct

3925

processing, such character strings of XML data must be sent off

3926

from the client in the current client encoding. It is the

3927

responsibility of the client to either convert the document to the

3928

current client encoding before sending it off to the server or to

3929

adjust the client encoding appropriately. On output, values of

3930

type <type>xml</type> will not have an encoding declaration, and

3931

clients must assume that the data is in the current client

3932

encoding.

3933

</para>

3934

3935

<para>

3936

When using the binary mode to pass query parameters to the server

3937

and query results back to the client, no character set conversion

3938

is performed, so the situation is different. In this case, an

3939

encoding declaration in the XML data will be observed, and if it

3940

is absent, the data will be assumed to be in UTF-8 (as required by

3941

the XML standard; note that PostgreSQL does not support UTF-16 at

3942

all). On output, data will have an encoding declaration

3943

specifying the client encoding, unless the client encoding is

3944

UTF-8, in which case it will be omitted.

3945

</para>

3946

3947

<para>

3948

Needless to say, processing XML data with PostgreSQL will be less

3949

error-prone and more efficient if data encoding, client encoding,

3950

and server encoding are the same. Since XML data is internally

3951

processed in UTF-8, computations will be most efficient if the

3952

server encoding is also UTF-8.

3953

</para>

3954

</sect2>

3955

3956

<sect2>

3957

<title>Accessing XML Values</title>

3958

3959

<para>

3960

The <type>xml</type> data type is unusual in that it does not

3961

provide any comparison operators. This is because there is no

3962

well-defined and universally useful comparison algorithm for XML

3963

data. One consequence of this is that you cannot retrieve rows by

3964

comparing an <type>xml</type> column against a search value. XML

3965

values should therefore typically be accompanied by a separate key

3966

field such as an ID. An alternative solution for comparing XML

3967

values is to convert them to character strings first, but note

3968

that character string comparison has little to do with a useful

3969

XML comparison method.

3970

</para>
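<para>
A minimal sketch of both approaches, using a hypothetical table
<literal>xml_docs</literal>:
<programlisting><![CDATA[
CREATE TABLE xml_docs (
    id   integer PRIMARY KEY,   -- separate key used for retrieval
    body xml
);
INSERT INTO xml_docs VALUES (1, '<foo>bar</foo>');

-- Preferred: look rows up by the key column.
SELECT body FROM xml_docs WHERE id = 1;

-- Alternative: compare the character string form. This is a plain text
-- comparison, so documents that differ only in whitespace or attribute
-- order would not be considered equal.
SELECT id FROM xml_docs WHERE CAST(body AS text) = '<foo>bar</foo>';
]]></programlisting>
</para>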

3971

3972

<para>

3973

Since there are no comparison operators for the <type>xml</type>

3974

data type, it is not possible to create an index directly on a

3975

column of this type. If speedy searches in XML data are desired,

3976

possible workarounds would be casting the expression to a

3977

character string type and indexing that, or indexing an XPath

3978

expression. The actual query would of course have to be adjusted

3979

to search by the indexed expression.

3980

</para>

3981

3982

<para>

3983

The text-search functionality in PostgreSQL could also be used to speed

3984

up full-document searches in XML data. The necessary

3985

preprocessing support is, however, not available in the PostgreSQL

3986

distribution in this release.

3987

</para>

3988

</sect2>

3989

</sect1>

3990

3991

&array;

3992

3993

&rowtypes;

3994

3995

3996

<title>Object Identifier Types</title>

3997

3998

3999

<primary>object identifier</primary>

4000

4001

</indexterm>

4002

4003

4004

4005

</indexterm>

4006

4007

4008

<primary>regproc</primary>

4009

</indexterm>

4010

4011

4012

<primary>regprocedure</primary>

4013

</indexterm>

4014

4015

4016

<primary>regoper</primary>

4017

</indexterm>

4018

4019

4020

<primary>regoperator</primary>

4021

</indexterm>

4022

4023

4024

<primary>regclass</primary>

4025

</indexterm>

4026

4027

4028

<primary>regtype</primary>

4029

</indexterm>

4030

4031

4032

<primary>regconfig</primary>

4033

</indexterm>

4034

4035

4036

<primary>regdictionary</primary>

4037

</indexterm>

4038

4039

4040

4041

</indexterm>

4042

4043

4044

4045

</indexterm>

4046

4047

4048

4049

</indexterm>

4050

4051

<para>

4052

Object identifiers (OIDs) are used internally by

4053

<productname>PostgreSQL</productname> as primary keys for various

4054

system tables. OIDs are not added to user-created tables, unless

4055

<literal>WITH OIDS</literal> is specified when the table is

4056

created, or the <xref linkend="guc-default-with-oids">

4057

configuration variable is enabled. Type <type>oid</> represents

4058

an object identifier. There are also several alias types for

4059

<type>oid</>: <type>regproc</>, <type>regprocedure</>,

4060

<type>regoper</>, <type>regoperator</>, <type>regclass</>,

4061

<type>regtype</>, <type>regconfig</>, and <type>regdictionary</>.

4062

<xref linkend="datatype-oid-table"> shows an overview.

4063

</para>

4064

4065

<para>

4066

The <type>oid</> type is currently implemented as an unsigned

4067

four-byte integer. Therefore, it is not large enough to provide

4068

database-wide uniqueness in large databases, or even in large

4069

individual tables. So, using a user-created table's OID column as

4070

a primary key is discouraged. OIDs are best used only for

4071

references to system tables.

4072

</para>

4073

4074

<para>

4075

The <type>oid</> type itself has few operations beyond comparison.

4076

It can be cast to integer, however, and then manipulated using the

4077

standard integer operators. (Beware of possible

4078

signed-versus-unsigned confusion if you do this.)

4079

</para>
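<para>
As an illustrative sketch, an OID can be viewed in its integer form by
casting. Because <type>integer</type> is signed, values in the upper half
of the <type>oid</type> range can show up as negative numbers:
<programlisting><![CDATA[
-- OID of a system catalog, shown as oid and as integer.
SELECT oid, oid::integer FROM pg_class WHERE relname = 'pg_class';

-- The largest possible OID, reinterpreted as a signed four-byte integer.
SELECT '4294967295'::oid::integer;
]]></programlisting>
</para>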

4080

4081

<para>

4082

The OID alias types have no operations of their own except

4083

for specialized input and output routines. These routines are able

4084

to accept and display symbolic names for system objects, rather than

4085

the raw numeric value that type <type>oid</> would use. The alias

4086

types allow simplified lookup of OID values for objects. For example,

4087

to examine the <structname>pg_attribute</> rows related to a table

4088

<literal>mytable</>, one could write:

<programlisting>
SELECT * FROM pg_attribute WHERE attrelid = 'mytable'::regclass;
</programlisting>
rather than:
<programlisting>
SELECT * FROM pg_attribute
  WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = 'mytable');
</programlisting>

4097

While that doesn't look all that bad by itself, it's still oversimplified.

4098

A far more complicated sub-select would be needed to

4099

select the right OID if there are multiple tables named

4100

<literal>mytable</> in different schemas.

4101

The <type>regclass</> input converter handles the table lookup according

4102

to the schema path setting, and so it does the <quote>right thing</>

4103

automatically. Similarly, casting a table's OID to

4104

<type>regclass</> is handy for symbolic display of a numeric OID.

4105

</para>
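<para>
Both directions of the conversion can be sketched against the system
catalogs, which exist in every database:
<programlisting><![CDATA[
-- Name to OID: the regclass input routine performs the catalog lookup.
SELECT 'pg_attribute'::regclass::oid;

-- OID to name: casting the numeric OID to regclass displays it symbolically.
SELECT c.oid, c.oid::regclass AS name
  FROM pg_class c
 WHERE c.relname = 'pg_attribute';

-- Column names of a catalog, with the owning relation shown by name.
SELECT attrelid::regclass, attname
  FROM pg_attribute
 WHERE attrelid = 'pg_class'::regclass AND attnum > 0;
]]></programlisting>
</para>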

4106

4107

4108

<title>Object Identifier Types</title>

4109

4110

<thead>

4111

<row>

4112

4113

<entry>References</entry>

4114

<entry>Description</entry>

4115

<entry>Value Example</entry>

4116

</row>

4117

</thead>

4118

4119

<tbody>

4120

4121

<row>

4122

4123

4124

<entry>numeric object identifier</entry>

4125

4126

</row>

4127

4128

<row>

4129

<entry><type>regproc</></entry>

4130

4131

<entry>function name</entry>

4132

4133

</row>

4134

4135

<row>

4136

<entry><type>regprocedure</></entry>

4137

4138

<entry>function with argument types</entry>

4139

4140

</row>

4141

4142

<row>

4143

<entry><type>regoper</></entry>

4144

<entry><structname>pg_operator</></entry>

4145

<entry>operator name</entry>

4146

4147

</row>

4148

4149

<row>

4150

<entry><type>regoperator</></entry>

4151

<entry><structname>pg_operator</></entry>

4152

<entry>operator with argument types</entry>

4153

<entry><literal>*(integer,integer)</> or <literal>-(NONE,integer)</></entry>

4154

</row>

4155

4156

<row>

4157

<entry><type>regclass</></entry>

4158

<entry><structname>pg_class</></entry>

4159

<entry>relation name</entry>

4160

4161

</row>

4162

4163

<row>

4164

<entry><type>regtype</></entry>

4165

4166

4167

<entry><literal>integer</></entry>

4168

</row>

4169

4170

<row>

4171

<entry><type>regconfig</></entry>

4172

<entry><structname>pg_ts_config</></entry>

4173

<entry>text search configuration</entry>

4174

<entry><literal>english</></entry>

4175

</row>

4176

4177

<row>

4178

<entry><type>regdictionary</></entry>

4179

4180

<entry>text search dictionary</entry>

4181

<entry><literal>simple</></entry>

4182

</row>

4183

</tbody>

4184

</tgroup>

4185

</table>

4186

4187

<para>

4188

All of the OID alias types accept schema-qualified names, and will

4189

display schema-qualified names on output if the object would not

4190

be found in the current search path without being qualified.

4191

The <type>regproc</> and <type>regoper</> alias types will only

4192

accept input names that are unique (not overloaded), so they are

4193

of limited use; for most uses <type>regprocedure</> or

4194

<type>regoperator</> is more appropriate. For <type>regoperator</>,

4195

unary operators are identified by writing <literal>NONE</> for the unused

4196

operand.

4197

</para>
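<para>
The difference between the plain and the argument-qualified alias types
can be sketched with the built-in <function>abs</function> function,
which is overloaded for several numeric types:
<programlisting><![CDATA[
-- Argument types select one specific function.
SELECT 'abs(integer)'::regprocedure;
SELECT 'pg_catalog.abs(integer)'::regprocedure::oid;

-- A unary (prefix) operator: NONE stands for the unused left operand.
SELECT '-(NONE,integer)'::regoperator;

-- Expected to fail, because more than one function is named "abs":
-- SELECT 'abs'::regproc;
]]></programlisting>
</para>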

4198

4199

<para>

4200

An additional property of the OID alias types is that if a

4201

constant of one of these types appears in a stored expression

4202

(such as a column default expression or view), it creates a dependency

4203

on the referenced object. For example, if a column has a default

4204

expression <literal>nextval('my_seq'::regclass)</>,

4205

<productname>PostgreSQL</productname>

4206

understands that the default expression depends on the sequence

4207

<literal>my_seq</>; the system will not let the sequence be dropped

4208

without first removing the default expression.

4209

</para>
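<para>
The dependency can be observed with a small experiment. The table name
<literal>seq_demo</literal> is hypothetical:
<programlisting><![CDATA[
CREATE SEQUENCE my_seq;
CREATE TABLE seq_demo (
    id integer DEFAULT nextval('my_seq'::regclass)
);

-- Expected to fail: the column default depends on the sequence.
DROP SEQUENCE my_seq;

-- After the default expression is removed, the sequence can be dropped.
ALTER TABLE seq_demo ALTER COLUMN id DROP DEFAULT;
DROP SEQUENCE my_seq;
]]></programlisting>
</para>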

4210

4211

<para>

4212

Another identifier type used by the system is <type>xid</>, or transaction

4213

(abbreviated <abbrev>xact</>) identifier. This is the data type of the system columns

4214

<structfield>xmin</> and <structfield>xmax</>. Transaction identifiers are 32-bit quantities.

4215

</para>

4216

4217

<para>

4218

A third identifier type used by the system is <type>cid</>, or

4219

command identifier. This is the data type of the system columns

4220

<structfield>cmin</> and <structfield>cmax</>. Command identifiers are also 32-bit quantities.

4221

</para>

4222

4223

<para>

4224

A final identifier type used by the system is <type>tid</>, or tuple

4225

identifier (row identifier). This is the data type of the system column

4226

<structfield>ctid</>. A tuple ID is a pair

4227

(block number, tuple index within block) that identifies the

4228

physical location of the row within its table.

4229

</para>
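<para>
These system columns are not shown by <literal>SELECT *</literal> but can
be selected by name. A sketch, assuming a table named
<literal>mytable</literal> exists:
<programlisting><![CDATA[
-- Show the identifier-typed system columns next to the user columns.
SELECT ctid, xmin, xmax, cmin, cmax, * FROM mytable;

-- A tid literal is written as (block number, tuple index).
SELECT * FROM mytable WHERE ctid = '(0,1)';
]]></programlisting>
Since a row's <structfield>ctid</structfield> reflects its physical
location, it is not a stable long-term row identifier.
</para>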

4230

4231

<para>

4232

(The system columns are further explained in <xref

4233

linkend="ddl-system-columns">.)

4234

</para>

4235

</sect1>

4236

4237

4238

<title>Pseudo-Types</title>

4239

4240

4241

<primary>record</primary>

4242

</indexterm>

4243

4244

4245

4246

</indexterm>

4247

4248

4249

<primary>anyelement</primary>

4250

</indexterm>

4251

4252

4253

<primary>anyarray</primary>

4254

</indexterm>

4255

4256

4257

<primary>anynonarray</primary>

4258

</indexterm>

4259

4260

4261

<primary>anyenum</primary>

4262

</indexterm>

4263

4264

4265

4266

</indexterm>

4267

4268

4269

<primary>trigger</primary>

4270

</indexterm>

4271

4272

4273

<primary>language_handler</primary>

4274

</indexterm>

4275

4276

4277

<primary>cstring</primary>

4278

</indexterm>

4279

4280

4281

<primary>internal</primary>

4282

</indexterm>

4283

4284

4285

<primary>opaque</primary>

4286

</indexterm>

4287

4288

<para>

4289

The <productname>PostgreSQL</productname> type system contains a

4290

number of special-purpose entries that are collectively called

4291

<firstterm>pseudo-types</>. A pseudo-type cannot be used as a

4292

column data type, but it can be used to declare a function's

4293

argument or result type. Each of the available pseudo-types is

4294

useful in situations where a function's behavior does not

4295

correspond to simply taking or returning a value of a specific

4296

<acronym>SQL</acronym> data type. <xref

4297

linkend="datatype-pseudotypes-table"> lists the existing

4298

pseudo-types.

4299

</para>

4300

4301

4302

<title>Pseudo-Types</title>

4303

4304

<thead>

4305

<row>

4306

4307

<entry>Description</entry>

4308

</row>

4309

</thead>

4310

4311

<tbody>

4312

<row>

4313

4314

<entry>Indicates that a function accepts any input data type whatever.</entry>

4315

</row>

4316

4317

<row>

4318

<entry><type>anyarray</></entry>

4319

<entry>Indicates that a function accepts any array data type

4320

(see <xref linkend="extend-types-polymorphic">).</entry>

4321

</row>

4322

4323

<row>

4324

<entry><type>anyelement</></entry>

4325

<entry>Indicates that a function accepts any data type

4326

(see <xref linkend="extend-types-polymorphic">).</entry>

4327

</row>

4328

4329

<row>

4330

<entry><type>anyenum</></entry>

4331

<entry>Indicates that a function accepts any enum data type

4332

(see <xref linkend="extend-types-polymorphic"> and

4333

4334

</row>

4335

4336

<row>

4337

<entry><type>anynonarray</></entry>

4338

<entry>Indicates that a function accepts any non-array data type

4339

(see <xref linkend="extend-types-polymorphic">).</entry>

4340

</row>

4341

4342

<row>

4343

<entry><type>cstring</></entry>

4344

<entry>Indicates that a function accepts or returns a null-terminated C string.</entry>

4345

</row>

4346

4347

<row>

4348

<entry><type>internal</></entry>

4349

<entry>Indicates that a function accepts or returns a server-internal

4350

data type.</entry>

4351

</row>

4352

4353

<row>

4354

<entry><type>language_handler</></entry>

4355

<entry>A procedural language call handler is declared to return <type>language_handler</>.</entry>

4356

</row>

4357

4358

<row>

4359

<entry><type>record</></entry>

4360

<entry>Identifies a function returning an unspecified row type.</entry>

4361

</row>

4362

4363

<row>

4364

<entry><type>trigger</></entry>

4365

<entry>A trigger function is declared to return <type>trigger</>.</entry>

4366

</row>

4367

4368

<row>

4369

4370

<entry>Indicates that a function returns no value.</entry>

4371

</row>

4372

4373

<row>

4374

<entry><type>opaque</></entry>

4375

<entry>An obsolete type name that formerly served all the above purposes.</entry>

4376

</row>

4377

</tbody>

4378

</tgroup>

4379

</table>

4380

4381

<para>

4382

Functions coded in C (whether built-in or dynamically loaded) can be

4383

declared to accept or return any of these pseudo data types. It is up to

4384

the function author to ensure that the function will behave safely

4385

when a pseudo-type is used as an argument type.

4386

</para>

4387

4388

<para>

4389

Functions coded in procedural languages can use pseudo-types only as

4390

allowed by their implementation languages. At present the procedural

4391

languages all forbid use of a pseudo-type as argument type, and allow

4392

only <type>void</> and <type>record</> as a result type (plus

4393

<type>trigger</> when the function is used as a trigger). Some also

4394

support polymorphic functions using the types <type>anyarray</>,

4395

<type>anyelement</>, <type>anyenum</>, and <type>anynonarray</>.

4396

</para>
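<para>
As a sketch of a polymorphic function, the following uses the
<literal>SQL</literal> function language, which also supports the
polymorphic pseudo-types. The function name <function>make_pair</function>
is hypothetical:
<programlisting><![CDATA[
-- Accepts two values of any one data type and returns an array of that type.
CREATE FUNCTION make_pair(anyelement, anyelement) RETURNS anyarray AS $$
    SELECT ARRAY[$1, $2];
$$ LANGUAGE SQL;

-- The actual types are resolved separately for each call.
SELECT make_pair(1, 2) AS int_pair,
       make_pair('a'::text, 'b') AS text_pair;
]]></programlisting>
</para>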

4397

4398

<para>

4399

The <type>internal</> pseudo-type is used to declare functions

4400

that are meant only to be called internally by the database

4401

system, and not by direct invocation in a <acronym>SQL</acronym>

4402

query. If a function has at least one <type>internal</>-type

4403

argument then it cannot be called from <acronym>SQL</acronym>. To

4404

preserve the type safety of this restriction it is important to

4405

follow this coding rule: do not create any function that is

4406

declared to return <type>internal</> unless it has at least one

4407

<type>internal</> argument.

4408

</para>

4409

4410

</sect1>

4411

4412

</chapter>
