2
$PostgreSQL: pgsql/doc/src/sgml/xfunc.sgml,v 1.96.4.3 2005-02-21 06:12:41 neilc Exp $
6
<title>User-Defined Functions</title>
8
<indexterm zone="xfunc">
9
<primary>function</primary>
10
<secondary>user-defined</secondary>
14
<productname>PostgreSQL</productname> provides four kinds of
20
query language functions (functions written in
21
<acronym>SQL</acronym>) (<xref linkend="xfunc-sql">)
26
procedural language functions (functions written in, for
27
example, <application>PL/pgSQL</> or <application>PL/Tcl</>)
28
(<xref linkend="xfunc-pl">)
33
internal functions (<xref linkend="xfunc-internal">)
38
C-language functions (<xref linkend="xfunc-c">)
46
of function can take base types, composite types, or
47
combinations of these as arguments (parameters). In addition,
48
every kind of function can return a base type or
49
a composite type. Functions may also be defined to return
50
sets of base or composite values.
54
Many kinds of functions can take or return certain pseudo-types
55
(such as polymorphic types), but the available facilities vary.
56
Consult the description of each kind of function for more details.
60
It's easiest to define <acronym>SQL</acronym>
61
functions, so we'll start by discussing those.
62
Most of the concepts presented for <acronym>SQL</acronym> functions
63
will carry over to the other types of functions.
67
Throughout this chapter, it can be useful to look at the reference
68
page of the <xref linkend="sql-createfunction"
69
endterm="sql-createfunction-title"> command to
70
understand the examples better. Some examples from this chapter
71
can be found in <filename>funcs.sql</filename> and
72
<filename>funcs.c</filename> in the <filename>src/tutorial</>
73
directory in the <productname>PostgreSQL</productname> source
78
<sect1 id="xfunc-sql">
79
<title>Query Language (<acronym>SQL</acronym>) Functions</title>
81
<indexterm zone="xfunc-sql">
82
<primary>function</primary>
83
<secondary>user-defined</secondary>
84
<tertiary>in SQL</tertiary>
88
SQL functions execute an arbitrary list of SQL statements, returning
89
the result of the last query in the list.
90
In the simple (non-set)
91
case, the first row of the last query's result will be returned.
92
(Bear in mind that <quote>the first row</quote> of a multirow
93
result is not well-defined unless you use <literal>ORDER BY</>.)
94
If the last query happens
95
to return no rows at all, the null value will be returned.
99
<indexterm><primary>SETOF</><seealso>function</></> Alternatively,
100
an SQL function may be declared to return a set, by specifying the
101
function's return type as <literal>SETOF
102
<replaceable>sometype</></literal>.<indexterm><primary>SETOF</></>
103
In this case all rows of the last query's result are returned.
104
Further details appear below.
108
The body of an SQL function must be a list of SQL
109
statements separated by semicolons. A semicolon after the last
110
statement is optional. Unless the function is declared to return
111
<type>void</>, the last statement must be a <command>SELECT</>.
115
Any collection of commands in the <acronym>SQL</acronym>
116
language can be packaged together and defined as a function.
117
Besides <command>SELECT</command> queries, the commands can include data
118
modification queries (<command>INSERT</command>,
119
<command>UPDATE</command>, and <command>DELETE</command>), as well as
120
other SQL commands. (The only exception is that you can't put
121
<command>BEGIN</>, <command>COMMIT</>, <command>ROLLBACK</>, or
122
<command>SAVEPOINT</> commands into a <acronym>SQL</acronym> function.)
123
However, the final command
124
must be a <command>SELECT</command> that returns whatever is
125
specified as the function's return type. Alternatively, if you
126
want to define a SQL function that performs actions but has no
127
useful value to return, you can define it as returning <type>void</>.
128
In that case, the function body must not end with a <command>SELECT</command>.
129
For example, this function removes rows with negative salaries from
130
the <literal>emp</> table:
133
CREATE FUNCTION clean_emp() RETURNS void AS '
148
The syntax of the <command>CREATE FUNCTION</command> command requires
149
the function body to be written as a string constant. It is usually
150
most convenient to use dollar quoting (see <xref
151
linkend="sql-syntax-dollar-quoting">) for the string constant.
152
If you choose to use regular single-quoted string constant syntax,
153
you must escape single quote marks (<literal>'</>) and backslashes
154
(<literal>\</>) used in the body of the function, typically by
155
doubling them (see <xref linkend="sql-syntax-strings">).
159
Arguments to the SQL function are referenced in the function
160
body using the syntax <literal>$<replaceable>n</></>: <literal>$1</>
161
refers to the first argument, <literal>$2</> to the second, and so on.
162
If an argument is of a composite type, then the dot notation,
163
e.g., <literal>$1.name</literal>, may be used to access attributes
164
of the argument. The arguments can only be used as data values,
165
not as identifiers. Thus for example this is reasonable:
167
INSERT INTO mytable VALUES ($1);
169
but this will not work:
171
INSERT INTO $1 VALUES (42);
176
<title><acronym>SQL</acronym> Functions on Base Types</title>
179
The simplest possible <acronym>SQL</acronym> function has no arguments and
180
simply returns a base type, such as <type>integer</type>:
183
CREATE FUNCTION one() RETURNS integer AS $$
187
-- Alternative syntax for string literal:
188
CREATE FUNCTION one() RETURNS integer AS '
201
Notice that we defined a column alias within the function body for the result of the function
202
(with the name <literal>result</>), but this column alias is not visible
203
outside the function. Hence, the result is labeled <literal>one</>
204
instead of <literal>result</>.
208
It is almost as easy to define <acronym>SQL</acronym> functions
209
that take base types as arguments. In the example below, notice
210
how we refer to the arguments within the function as <literal>$1</>
214
CREATE FUNCTION add_em(integer, integer) RETURNS integer AS $$
218
SELECT add_em(1, 2) AS answer;
227
Here is a more useful function, which might be used to debit a
231
CREATE FUNCTION tf1 (integer, numeric) RETURNS integer AS $$
233
SET balance = balance - $2
234
WHERE accountno = $1;
239
A user could execute this function to debit account 17 by $100.00 as
243
SELECT tf1(17, 100.0);
248
In practice one would probably like a more useful result from the
249
function than a constant 1, so a more likely definition
253
CREATE FUNCTION tf1 (integer, numeric) RETURNS numeric AS $$
255
SET balance = balance - $2
256
WHERE accountno = $1;
257
SELECT balance FROM bank WHERE accountno = $1;
261
which adjusts the balance and returns the new balance.
266
<title><acronym>SQL</acronym> Functions on Composite Types</title>
269
When writing functions with arguments of composite
270
types, we must not only specify which
271
argument we want (as we did above with <literal>$1</> and <literal>$2</literal>) but
272
also the desired attribute (field) of that argument. For example,
274
<type>emp</type> is a table containing employee data, and therefore
275
also the name of the composite type of each row of the table. Here
276
is a function <function>double_salary</function> that computes what someone's
277
salary would be if it were doubled:
287
CREATE FUNCTION double_salary(emp) RETURNS numeric AS $$
288
SELECT $1.salary * 2 AS salary;
291
SELECT name, double_salary(emp.*) AS dream
293
WHERE emp.cubicle ~= point '(2,1)';
302
Notice the use of the syntax <literal>$1.salary</literal>
303
to select one field of the argument row value. Also notice
304
how the calling <command>SELECT</> command uses <literal>*</>
306
the entire current row of a table as a composite value. The table
307
row can alternatively be referenced using just the table name,
310
SELECT name, double_salary(emp) AS dream
312
WHERE emp.cubicle ~= point '(2,1)';
314
but this usage is deprecated since it's easy to get confused.
318
Sometimes it is handy to construct a composite argument value
319
on-the-fly. This can be done with the <literal>ROW</> construct.
320
For example, we could adjust the data being passed to the function:
322
SELECT name, double_salary(ROW(name, salary*1.1, age, cubicle)) AS dream
328
It is also possible to build a function that returns a composite type.
329
This is an example of a function
330
that returns a single <type>emp</type> row:
333
CREATE FUNCTION new_emp() RETURNS emp AS $$
334
SELECT text 'None' AS name,
337
point '(2,2)' AS cubicle;
341
In this example we have specified each of the attributes
342
with a constant value, but any computation
343
could have been substituted for these constants.
347
Note two important things about defining the function:
352
The select list order in the query must be exactly the same as
353
that in which the columns appear in the table associated
354
with the composite type. (Naming the columns, as we did above,
355
is irrelevant to the system.)
360
You must typecast the expressions to match the
361
definition of the composite type, or you will get errors like this:
364
ERROR: function declared to return emp returns varchar instead of text at column 1
373
A different way to define the same function is:
376
CREATE FUNCTION new_emp() RETURNS emp AS $$
377
SELECT ROW('None', 1000.0, 25, '(2,2)')::emp;
381
Here we wrote a <command>SELECT</> that returns just a single
382
column of the correct composite type. This isn't really better
383
in this situation, but it is a handy alternative in some cases
384
— for example, if we need to compute the result by calling
385
another function that returns the desired composite value.
389
We could call this function directly in either of two ways:
395
--------------------------
396
(None,1000.0,25,"(2,2)")
398
SELECT * FROM new_emp();
400
name | salary | age | cubicle
401
------+--------+-----+---------
402
None | 1000.0 | 25 | (2,2)
405
The second way is described more fully in <xref
406
linkend="xfunc-sql-table-functions">.
410
When you use a function that returns a composite type,
411
you might want only one field (attribute) from its result.
412
You can do that with syntax like this:
415
SELECT (new_emp()).name;
422
The extra parentheses are needed to keep the parser from getting
423
confused. If you try to do it without them, you get something like this:
426
SELECT new_emp().name;
427
ERROR: syntax error at or near "." at character 17
428
LINE 1: SELECT new_emp().name;
434
Another option is to use
435
functional notation for extracting an attribute. The simple way
436
to explain this is that we can use the
437
notations <literal>attribute(table)</> and <literal>table.attribute</>
441
SELECT name(new_emp());
449
-- This is the same as:
450
-- SELECT emp.name AS youngster FROM emp WHERE emp.age < 30;
452
SELECT name(emp) AS youngster FROM emp WHERE age(emp) < 30;
463
The equivalence between functional notation and attribute notation
464
makes it possible to use functions on composite types to emulate
465
<quote>computed fields</>.
467
<primary>computed field</primary>
470
<primary>field</primary>
471
<secondary>computed</secondary>
473
For example, using the previous definition
474
for <literal>double_salary(emp)</>, we can write
477
SELECT emp.name, emp.double_salary FROM emp;
480
An application using this wouldn't need to be directly aware that
481
<literal>double_salary</> isn't a real column of the table.
482
(You can also emulate computed fields with views.)
487
Another way to use a function returning a row result is to pass the
488
result to another function that accepts the correct row type as input:
491
CREATE FUNCTION getname(emp) RETURNS text AS $$
495
SELECT getname(new_emp());
504
Another way to use a function that returns a composite type is to
505
call it as a table function, as described below.
509
<sect2 id="xfunc-sql-table-functions">
510
<title><acronym>SQL</acronym> Functions as Table Sources</title>
513
All SQL functions may be used in the <literal>FROM</> clause of a query,
514
but it is particularly useful for functions returning composite types.
515
If the function is defined to return a base type, the table function
516
produces a one-column table. If the function is defined to return
517
a composite type, the table function produces a column for each attribute
518
of the composite type.
525
CREATE TABLE foo (fooid int, foosubid int, fooname text);
526
INSERT INTO foo VALUES (1, 1, 'Joe');
527
INSERT INTO foo VALUES (1, 2, 'Ed');
528
INSERT INTO foo VALUES (2, 1, 'Mary');
530
CREATE FUNCTION getfoo(int) RETURNS foo AS $$
531
SELECT * FROM foo WHERE fooid = $1;
534
SELECT *, upper(fooname) FROM getfoo(1) AS t1;
536
fooid | foosubid | fooname | upper
537
-------+----------+---------+-------
542
As the example shows, we can work with the columns of the function's
543
result just the same as if they were columns of a regular table.
547
Note that we only got one row out of the function. This is because
548
we did not use <literal>SETOF</>. That is described in the next section.
553
<title><acronym>SQL</acronym> Functions Returning Sets</title>
556
When an SQL function is declared as returning <literal>SETOF
557
<replaceable>sometype</></literal>, the function's final
558
<command>SELECT</> query is executed to completion, and each row it
559
outputs is returned as an element of the result set.
563
This feature is normally used when calling the function in the <literal>FROM</>
564
clause. In this case each row returned by the function becomes
565
a row of the table seen by the query. For example, assume that
566
table <literal>foo</> has the same contents as above, and we say:
569
CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
570
SELECT * FROM foo WHERE fooid = $1;
573
SELECT * FROM getfoo(1) AS t1;
578
fooid | foosubid | fooname
579
-------+----------+---------
587
Currently, functions returning sets may also be called in the select list
588
of a query. For each row that the query
589
generates by itself, the function returning set is invoked, and an output
590
row is generated for each element of the function's result set. Note,
591
however, that this capability is deprecated and may be removed in future
592
releases. The following is an example function returning a set from the
596
CREATE FUNCTION listchildren(text) RETURNS SETOF text AS $$
597
SELECT name FROM nodes WHERE parent = $1
611
SELECT listchildren('Top');
619
SELECT name, listchildren(name) FROM nodes;
621
--------+--------------
630
In the last <command>SELECT</command>,
631
notice that no output row appears for <literal>Child2</>, <literal>Child3</>, etc.
632
This happens because <function>listchildren</function> returns an empty set
633
for those arguments, so no result rows are generated.
638
<title>Polymorphic <acronym>SQL</acronym> Functions</title>
641
<acronym>SQL</acronym> functions may be declared to accept and
642
return the polymorphic types <type>anyelement</type> and
643
<type>anyarray</type>. See <xref
644
linkend="extend-types-polymorphic"> for a more detailed
645
explanation of polymorphic functions. Here is a polymorphic
646
function <function>make_array</function> that builds up an array
647
from two arbitrary data type elements:
649
CREATE FUNCTION make_array(anyelement, anyelement) RETURNS anyarray AS $$
650
SELECT ARRAY[$1, $2];
653
SELECT make_array(1, 2) AS intarray, make_array('a'::text, 'b') AS textarray;
655
----------+-----------
662
Notice the use of the typecast <literal>'a'::text</literal>
663
to specify that the argument is of type <type>text</type>. This is
664
required if the argument is just a string literal, since otherwise
665
it would be treated as type
666
<type>unknown</type>, and array of <type>unknown</type> is not a valid
668
Without the typecast, you will get errors like this:
671
ERROR: could not determine "anyarray"/"anyelement" type because input has type "unknown"
677
It is permitted to have polymorphic arguments with a fixed
678
return type, but the converse is not. For example:
680
CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS boolean AS $$
684
SELECT is_greater(1, 2);
690
CREATE FUNCTION invalid_func() RETURNS anyelement AS $$
693
ERROR: cannot determine result data type
694
DETAIL: A function returning "anyarray" or "anyelement" must have at least one argument of either type.
700
<sect1 id="xfunc-overload">
701
<title>Function Overloading</title>
703
<indexterm zone="xfunc-overload">
704
<primary>overloading</primary>
705
<secondary>functions</secondary>
709
More than one function may be defined with the same SQL name, so long
710
as the arguments they take are different. In other words,
711
function names can be <firstterm>overloaded</firstterm>. When a
712
query is executed, the server will determine which function to
713
call from the data types and the number of the provided arguments.
714
Overloading can also be used to simulate functions with a variable
715
number of arguments, up to a finite maximum number.
719
When creating a family of overloaded functions, one should be
720
careful not to create ambiguities. For instance, given the
723
CREATE FUNCTION test(int, real) RETURNS ...
724
CREATE FUNCTION test(smallint, double precision) RETURNS ...
726
it is not immediately clear which function would be called with
727
some trivial input like <literal>test(1, 1.5)</literal>. The
728
currently implemented resolution rules are described in
729
<xref linkend="typeconv">, but it is unwise to design a system that subtly
730
relies on this behavior.
734
A function that takes a single argument of a composite type should
735
generally not have the same name as any attribute (field) of that type.
736
Recall that <literal>attribute(table)</literal> is considered equivalent
737
to <literal>table.attribute</literal>. In the case that there is an
738
ambiguity between a function on a composite type and an attribute of
739
the composite type, the attribute will always be used. It is possible
740
to override that choice by schema-qualifying the function name
741
(that is, <literal>schema.func(table)</literal>) but it's better to
742
avoid the problem by not choosing conflicting names.
746
When overloading C-language functions, there is an additional
747
constraint: The C name of each function in the family of
748
overloaded functions must be different from the C names of all
749
other functions, either internal or dynamically loaded. If this
750
rule is violated, the behavior is not portable. You might get a
751
run-time linker error, or one of the functions will get called
752
(usually the internal one). The alternative form of the
753
<literal>AS</> clause for the SQL <command>CREATE
754
FUNCTION</command> command decouples the SQL function name from
755
the function name in the C source code. For instance,
757
CREATE FUNCTION test(int) RETURNS int
758
AS '<replaceable>filename</>', 'test_1arg'
760
CREATE FUNCTION test(int, int) RETURNS int
761
AS '<replaceable>filename</>', 'test_2arg'
764
The names of the C functions here reflect one of many possible conventions.
768
<sect1 id="xfunc-volatility">
769
<title>Function Volatility Categories</title>
771
<indexterm zone="xfunc-volatility">
772
<primary>volatility</primary>
773
<secondary>functions</secondary>
775
<indexterm zone="xfunc-volatility">
776
<primary>VOLATILE</primary>
778
<indexterm zone="xfunc-volatility">
779
<primary>STABLE</primary>
781
<indexterm zone="xfunc-volatility">
782
<primary>IMMUTABLE</primary>
786
Every function has a <firstterm>volatility</> classification, with
787
the possibilities being <literal>VOLATILE</>, <literal>STABLE</>, or
788
<literal>IMMUTABLE</>. <literal>VOLATILE</> is the default if the
789
<command>CREATE FUNCTION</command> command does not specify a category.
790
The volatility category is a promise to the optimizer about the behavior
796
A <literal>VOLATILE</> function can do anything, including modifying
797
the database. It can return different results on successive calls with
798
the same arguments. The optimizer makes no assumptions about the
799
behavior of such functions. A query using a volatile function will
800
re-evaluate the function at every row where its value is needed.
805
A <literal>STABLE</> function cannot modify the database and is
806
guaranteed to return the same results given the same arguments
807
for all calls within a single surrounding query. This category
808
allows the optimizer to optimize away multiple calls of the function
809
within a single query. In particular, it is safe to use an expression
810
containing such a function in an index scan condition. (Since an
811
index scan will evaluate the comparison value only once, not once at
812
each row, it is not valid to use a <literal>VOLATILE</> function in
813
an index scan condition.)
818
An <literal>IMMUTABLE</> function cannot modify the database and is
819
guaranteed to return the same results given the same arguments forever.
820
This category allows the optimizer to pre-evaluate the function when
821
a query calls it with constant arguments. For example, a query like
822
<literal>SELECT ... WHERE x = 2 + 2</> can be simplified on sight to
823
<literal>SELECT ... WHERE x = 4</>, because the function underlying
824
the integer addition operator is marked <literal>IMMUTABLE</>.
831
For best optimization results, you should label your functions with the
832
strictest volatility category that is valid for them.
836
Any function with side-effects <emphasis>must</> be labeled
837
<literal>VOLATILE</>, so that calls to it cannot be optimized away.
838
Even a function with no side-effects needs to be labeled
839
<literal>VOLATILE</> if its value can change within a single query;
840
some examples are <literal>random()</>, <literal>currval()</>,
841
<literal>timeofday()</>.
845
There is relatively little difference between <literal>STABLE</> and
846
<literal>IMMUTABLE</> categories when considering simple interactive
847
queries that are planned and immediately executed: it doesn't matter
848
a lot whether a function is executed once during planning or once during
849
query execution startup. But there is a big difference if the plan is
850
saved and reused later. Labeling a function <literal>IMMUTABLE</> when
851
it really isn't may allow it to be prematurely folded to a constant during
852
planning, resulting in a stale value being re-used during subsequent uses
853
of the plan. This is a hazard when using prepared statements or when
854
using function languages that cache plans (such as
855
<application>PL/pgSQL</>).
859
Because of the snapshotting behavior of MVCC (see <xref linkend="mvcc">)
860
a function containing only <command>SELECT</> commands can safely be
861
marked <literal>STABLE</>, even if it selects from tables that might be
862
undergoing modifications by concurrent queries.
863
<productname>PostgreSQL</productname> will execute a <literal>STABLE</>
864
function using the snapshot established for the calling query, and so it
865
will see a fixed view of the database throughout that query.
867
that the <function>current_timestamp</> family of functions qualify
868
as stable, since their values do not change within a transaction.
872
The same snapshotting behavior is used for <command>SELECT</> commands
873
within <literal>IMMUTABLE</> functions. It is generally unwise to select
874
from database tables within an <literal>IMMUTABLE</> function at all,
875
since the immutability will be broken if the table contents ever change.
876
However, <productname>PostgreSQL</productname> does not enforce that you
881
A common error is to label a function <literal>IMMUTABLE</> when its
882
results depend on a configuration parameter. For example, a function
883
that manipulates timestamps might well have results that depend on the
884
<xref linkend="guc-timezone"> setting. For safety, such functions should
885
be labeled <literal>STABLE</> instead.
890
Before <productname>PostgreSQL</productname> release 8.0, the requirement
891
that <literal>STABLE</> and <literal>IMMUTABLE</> functions cannot modify
892
the database was not enforced by the system. Release 8.0 enforces it
893
by requiring SQL functions and procedural language functions of these
894
categories to contain no SQL commands other than <command>SELECT</>.
895
(This is not a completely bulletproof test, since such functions could
896
still call <literal>VOLATILE</> functions that modify the database.
897
If you do that, you will find that the <literal>STABLE</> or
898
<literal>IMMUTABLE</> function does not notice the database changes
899
applied by the called function.)
904
<sect1 id="xfunc-pl">
905
<title>Procedural Language Functions</title>
908
<productname>PostgreSQL</productname> allows user-defined functions
909
to be written in other languages besides SQL and C. These other
910
languages are generically called <firstterm>procedural
911
languages</firstterm> (<acronym>PL</>s).
912
Procedural languages aren't built into the
913
<productname>PostgreSQL</productname> server; they are offered
915
See <xref linkend="xplang"> and following chapters for more
920
<sect1 id="xfunc-internal">
921
<title>Internal Functions</title>
923
<indexterm zone="xfunc-internal"><primary>function</><secondary>internal</></>
926
Internal functions are functions written in C that have been statically
927
linked into the <productname>PostgreSQL</productname> server.
928
The <quote>body</quote> of the function definition
929
specifies the C-language name of the function, which need not be the
930
same as the name being declared for SQL use.
931
(For reasons of backwards compatibility, an empty body
932
is accepted as meaning that the C-language function name is the
933
same as the SQL name.)
937
Normally, all internal functions present in the
938
server are declared during the initialization of the database cluster (<command>initdb</command>),
939
but a user could use <command>CREATE FUNCTION</command>
940
to create additional alias names for an internal function.
941
Internal functions are declared in <command>CREATE FUNCTION</command>
942
with language name <literal>internal</literal>. For instance, to
943
create an alias for the <function>sqrt</function> function:
945
CREATE FUNCTION square_root(double precision) RETURNS double precision
950
(Most internal functions expect to be declared <quote>strict</quote>.)
955
Not all <quote>predefined</quote> functions are
956
<quote>internal</quote> in the above sense. Some predefined
957
functions are written in SQL.
963
<title>C-Language Functions</title>
965
<indexterm zone="xfunc-sql">
966
<primary>function</primary>
967
<secondary>user-defined</secondary>
968
<tertiary>in C</tertiary>
972
User-defined functions can be written in C (or a language that can
973
be made compatible with C, such as C++). Such functions are
974
compiled into dynamically loadable objects (also called shared
975
libraries) and are loaded by the server on demand. The dynamic
976
loading feature is what distinguishes <quote>C language</> functions
977
from <quote>internal</> functions — the actual coding conventions
978
are essentially the same for both. (Hence, the standard internal
979
function library is a rich source of coding examples for user-defined
984
Two different calling conventions are currently used for C functions.
985
The newer <quote>version 1</quote> calling convention is indicated by writing
986
a <literal>PG_FUNCTION_INFO_V1()</literal> macro call for the function,
987
as illustrated below. Lack of such a macro indicates an old-style
988
(<quote>version 0</quote>) function. The language name specified in <command>CREATE FUNCTION</command>
989
is <literal>C</literal> in either case. Old-style functions are now deprecated
990
because of portability problems and lack of functionality, but they
991
are still supported for compatibility reasons.
994
<sect2 id="xfunc-c-dynload">
995
<title>Dynamic Loading</title>
997
<indexterm zone="xfunc-c-dynload">
998
<primary>dynamic loading</primary>
1002
The first time a user-defined function in a particular
1003
loadable object file is called in a session,
1004
the dynamic loader loads that object file into memory so that the
1005
function can be called. The <command>CREATE FUNCTION</command>
1006
for a user-defined C function must therefore specify two pieces of
1007
information for the function: the name of the loadable
1008
object file, and the C name (link symbol) of the specific function to call
1009
within that object file. If the C name is not explicitly specified then
1010
it is assumed to be the same as the SQL function name.
1014
The following algorithm is used to locate the shared object file
1015
based on the name given in the <command>CREATE FUNCTION</command>
1021
If the name is an absolute path, the given file is loaded.
1027
If the name starts with the string <literal>$libdir</literal>,
1028
that part is replaced by the <productname>PostgreSQL</> package
1030
name, which is determined at build time.<indexterm><primary>$libdir</></>
1036
If the name does not contain a directory part, the file is
1037
searched for in the path specified by the configuration variable
1038
<xref linkend="guc-dynamic-library-path">.<indexterm><primary>dynamic_library_path</></>
1044
Otherwise (the file was not found in the path, or it contains a
1045
non-absolute directory part), the dynamic loader will try to
1046
take the name as given, which will most likely fail. (It is
1047
unreliable to depend on the current working directory.)
1052
If this sequence does not work, the platform-specific shared
1053
library file name extension (often <filename>.so</filename>) is
1054
appended to the given name and this sequence is tried again. If
1055
that fails as well, the load will fail.
1059
The user ID the <productname>PostgreSQL</productname> server runs
1060
as must be able to traverse the path to the file you intend to
1061
load. Making the file or a higher-level directory not readable
1062
and/or not executable by the <systemitem>postgres</systemitem>
1063
user is a common mistake.
1067
In any case, the file name that is given in the
1068
<command>CREATE FUNCTION</command> command is recorded literally
1069
in the system catalogs, so if the file needs to be loaded again
1070
the same procedure is applied.
1075
<productname>PostgreSQL</productname> will not compile a C function
1076
automatically. The object file must be compiled before it is referenced
1077
in a <command>CREATE
1078
FUNCTION</> command. See <xref linkend="dfunc"> for additional
1084
After it is used for the first time, a dynamically loaded object
1085
file is retained in memory. Future calls in the same session to
1086
the function(s) in that file will only incur the small overhead of
1087
a symbol table lookup. If you need to force a reload of an object
1088
file, for example after recompiling it, use the <command>LOAD</>
1089
command or begin a fresh session.
1093
It is recommended to locate shared libraries either relative to
1094
<literal>$libdir</literal> or through the dynamic library path.
1095
This simplifies version upgrades if the new installation is at a
1096
different location. The actual directory that
1097
<literal>$libdir</literal> stands for can be found out with the
1098
command <literal>pg_config --pkglibdir</literal>.
1102
Before <productname>PostgreSQL</productname> release 7.2, only
1103
exact absolute paths to object files could be specified in
1104
<command>CREATE FUNCTION</>. This approach is now deprecated
1105
since it makes the function definition unnecessarily unportable.
1106
It's best to specify just the shared library name with no path nor
1107
extension, and let the search mechanism provide that information
1112
<sect2 id="xfunc-c-basetype">
1113
<title>Base Types in C-Language Functions</title>
1115
<indexterm zone="xfunc-c-basetype">
1116
<primary>data type</primary>
1117
<secondary>internal organisation</secondary>
1121
To know how to write C-language functions, you need to know how
1122
<productname>PostgreSQL</productname> internally represents base
1123
data types and how they can be passed to and from functions.
1124
Internally, <productname>PostgreSQL</productname> regards a base
1125
type as a <quote>blob of memory</quote>. The user-defined
1126
functions that you define over a type in turn define the way that
1127
<productname>PostgreSQL</productname> can operate on it. That
1128
is, <productname>PostgreSQL</productname> will only store and
1129
retrieve the data from disk and use your user-defined functions
1130
to input, process, and output the data.
1134
Base types can have one of three internal formats:
1139
pass by value, fixed-length
1144
pass by reference, fixed-length
1149
pass by reference, variable-length
1156
By-value types can only be 1, 2, or 4 bytes in length
1157
(also 8 bytes, if <literal>sizeof(Datum)</literal> is 8 on your machine).
1158
You should be careful
1159
to define your types such that they will be the same
1160
size (in bytes) on all architectures. For example, the
1161
<literal>long</literal> type is dangerous because it
1162
is 4 bytes on some machines and 8 bytes on others, whereas
1163
<type>int</type> type is 4 bytes on most
1164
Unix machines. A reasonable implementation of
1165
the <type>int4</type> type on Unix
1169
/* 4-byte integer, passed by value */
1175
On the other hand, fixed-length types of any size may
1176
be passed by-reference. For example, here is a sample
1177
implementation of a <productname>PostgreSQL</productname> type:
1180
/* 16-byte structure, passed by reference */
1187
Only pointers to such types can be used when passing
1188
them in and out of <productname>PostgreSQL</productname> functions.
1189
To return a value of such a type, allocate the right amount of
1190
memory with <literal>palloc</literal>, fill in the allocated memory,
1191
and return a pointer to it. (You can also return an input value
1192
that has the same type as the return value directly by returning
1193
the pointer to the input value. <emphasis>Never</> modify the
1194
contents of a pass-by-reference input value, however.)
1198
Finally, all variable-length types must also be passed
1199
by reference. All variable-length types must begin
1200
with a length field of exactly 4 bytes, and all data to
1201
be stored within that type must be located in the memory
1202
immediately following that length field. The
1203
length field contains the total length of the structure,
1204
that is, it includes the size of the length field
1209
As an example, we can define the type <type>text</type> as
1219
Obviously, the data field declared here is not long enough to hold
1220
all possible strings. Since it's impossible to declare a variable-size
1221
structure in <acronym>C</acronym>, we rely on the knowledge that the
1222
<acronym>C</acronym> compiler won't range-check array subscripts. We
1223
just allocate the necessary amount of space and then access the array as
1224
if it were declared the right length. (This is a common trick, which
1225
you can read about in many textbooks about C.)
1230
variable-length types, we must be careful to allocate
1231
the correct amount of memory and set the length field correctly.
1232
For example, if we wanted to store 40 bytes in a <structname>text</>
1233
structure, we might use a code fragment like this:
1236
#include "postgres.h"
1238
char buffer[40]; /* our source data */
1240
text *destination = (text *) palloc(VARHDRSZ + 40);
1241
destination->length = VARHDRSZ + 40;
1242
memcpy(destination->data, buffer, 40);
1246
<literal>VARHDRSZ</> is the same as <literal>sizeof(int4)</>, but
1247
it's considered good style to use the macro <literal>VARHDRSZ</>
1248
to refer to the size of the overhead for a variable-length type.
1252
<xref linkend="xfunc-c-type-table"> specifies which C type
1253
corresponds to which SQL type when writing a C-language function
1254
that uses a built-in type of <productname>PostgreSQL</>.
1255
The <quote>Defined In</quote> column gives the header file that
1256
needs to be included to get the type definition. (The actual
1257
definition may be in a different file that is included by the
1258
listed file. It is recommended that users stick to the defined
1259
interface.) Note that you should always include
1260
<filename>postgres.h</filename> first in any source file, because
1261
it declares a number of things that you will need anyway.
1264
<table tocentry="1" id="xfunc-c-type-table">
1265
<title>Equivalent C Types for Built-In SQL Types</title>
1282
<entry><type>abstime</type></entry>
1283
<entry><type>AbsoluteTime</type></entry>
1284
<entry><filename>utils/nabstime.h</filename></entry>
1287
<entry><type>boolean</type></entry>
1288
<entry><type>bool</type></entry>
1289
<entry><filename>postgres.h</filename> (maybe compiler built-in)</entry>
1292
<entry><type>box</type></entry>
1293
<entry><type>BOX*</type></entry>
1294
<entry><filename>utils/geo_decls.h</filename></entry>
1297
<entry><type>bytea</type></entry>
1298
<entry><type>bytea*</type></entry>
1299
<entry><filename>postgres.h</filename></entry>
1302
<entry><type>"char"</type></entry>
1303
<entry><type>char</type></entry>
1304
<entry>(compiler built-in)</entry>
1307
<entry><type>character</type></entry>
1308
<entry><type>BpChar*</type></entry>
1309
<entry><filename>postgres.h</filename></entry>
1312
<entry><type>cid</type></entry>
1313
<entry><type>CommandId</type></entry>
1314
<entry><filename>postgres.h</filename></entry>
1317
<entry><type>date</type></entry>
1318
<entry><type>DateADT</type></entry>
1319
<entry><filename>utils/date.h</filename></entry>
1322
<entry><type>smallint</type> (<type>int2</type>)</entry>
1323
<entry><type>int2</type> or <type>int16</type></entry>
1324
<entry><filename>postgres.h</filename></entry>
1327
<entry><type>int2vector</type></entry>
1328
<entry><type>int2vector*</type></entry>
1329
<entry><filename>postgres.h</filename></entry>
1332
<entry><type>integer</type> (<type>int4</type>)</entry>
1333
<entry><type>int4</type> or <type>int32</type></entry>
1334
<entry><filename>postgres.h</filename></entry>
1337
<entry><type>real</type> (<type>float4</type>)</entry>
1338
<entry><type>float4*</type></entry>
1339
<entry><filename>postgres.h</filename></entry>
1342
<entry><type>double precision</type> (<type>float8</type>)</entry>
1343
<entry><type>float8*</type></entry>
1344
<entry><filename>postgres.h</filename></entry>
1347
<entry><type>interval</type></entry>
1348
<entry><type>Interval*</type></entry>
1349
<entry><filename>utils/timestamp.h</filename></entry>
1352
<entry><type>lseg</type></entry>
1353
<entry><type>LSEG*</type></entry>
1354
<entry><filename>utils/geo_decls.h</filename></entry>
1357
<entry><type>name</type></entry>
1358
<entry><type>Name</type></entry>
1359
<entry><filename>postgres.h</filename></entry>
1362
<entry><type>oid</type></entry>
1363
<entry><type>Oid</type></entry>
1364
<entry><filename>postgres.h</filename></entry>
1367
<entry><type>oidvector</type></entry>
1368
<entry><type>oidvector*</type></entry>
1369
<entry><filename>postgres.h</filename></entry>
1372
<entry><type>path</type></entry>
1373
<entry><type>PATH*</type></entry>
1374
<entry><filename>utils/geo_decls.h</filename></entry>
1377
<entry><type>point</type></entry>
1378
<entry><type>POINT*</type></entry>
1379
<entry><filename>utils/geo_decls.h</filename></entry>
1382
<entry><type>regproc</type></entry>
1383
<entry><type>regproc</type></entry>
1384
<entry><filename>postgres.h</filename></entry>
1387
<entry><type>reltime</type></entry>
1388
<entry><type>RelativeTime</type></entry>
1389
<entry><filename>utils/nabstime.h</filename></entry>
1392
<entry><type>text</type></entry>
1393
<entry><type>text*</type></entry>
1394
<entry><filename>postgres.h</filename></entry>
1397
<entry><type>tid</type></entry>
1398
<entry><type>ItemPointer</type></entry>
1399
<entry><filename>storage/itemptr.h</filename></entry>
1402
<entry><type>time</type></entry>
1403
<entry><type>TimeADT</type></entry>
1404
<entry><filename>utils/date.h</filename></entry>
1407
<entry><type>time with time zone</type></entry>
1408
<entry><type>TimeTzADT</type></entry>
1409
<entry><filename>utils/date.h</filename></entry>
1412
<entry><type>timestamp</type></entry>
1413
<entry><type>Timestamp*</type></entry>
1414
<entry><filename>utils/timestamp.h</filename></entry>
1417
<entry><type>tinterval</type></entry>
1418
<entry><type>TimeInterval</type></entry>
1419
<entry><filename>utils/nabstime.h</filename></entry>
1422
<entry><type>varchar</type></entry>
1423
<entry><type>VarChar*</type></entry>
1424
<entry><filename>postgres.h</filename></entry>
1427
<entry><type>xid</type></entry>
1428
<entry><type>TransactionId</type></entry>
1429
<entry><filename>postgres.h</filename></entry>
1436
Now that we've gone over all of the possible structures
1437
for base types, we can show some examples of real functions.
1442
<title>Calling Conventions Version 0 for C-Language Functions</title>
1445
We present the <quote>old style</quote> calling convention first — although
1446
this approach is now deprecated, it's easier to get a handle on
1447
initially. In the version-0 method, the arguments and result
1448
of the C function are just declared in normal C style, but being
1449
careful to use the C representation of each SQL data type as shown
1454
Here are some examples:
1457
#include "postgres.h"
1458
#include <string.h>
1468
/* by reference, fixed length */
1471
add_one_float8(float8 *arg)
1473
float8 *result = (float8 *) palloc(sizeof(float8));
1475
*result = *arg + 1.0;
1481
makepoint(Point *pointx, Point *pointy)
1483
Point *new_point = (Point *) palloc(sizeof(Point));
1485
new_point->x = pointx->x;
1486
new_point->y = pointy->y;
1491
/* by reference, variable length */
1497
* VARSIZE is the total size of the struct in bytes.
1499
text *new_t = (text *) palloc(VARSIZE(t));
1500
VARATT_SIZEP(new_t) = VARSIZE(t);
1502
* VARDATA is a pointer to the data region of the struct.
1504
memcpy((void *) VARDATA(new_t), /* destination */
1505
(void *) VARDATA(t), /* source */
1506
VARSIZE(t)-VARHDRSZ); /* how many bytes */
1511
concat_text(text *arg1, text *arg2)
1513
int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
1514
text *new_text = (text *) palloc(new_text_size);
1516
VARATT_SIZEP(new_text) = new_text_size;
1517
memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1)-VARHDRSZ);
1518
memcpy(VARDATA(new_text) + (VARSIZE(arg1)-VARHDRSZ),
1519
VARDATA(arg2), VARSIZE(arg2)-VARHDRSZ);
1526
Supposing that the above code has been prepared in file
1527
<filename>funcs.c</filename> and compiled into a shared object,
1528
we could define the functions to <productname>PostgreSQL</productname>
1529
with commands like this:
1532
CREATE FUNCTION add_one(integer) RETURNS integer
1533
AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one'
1536
-- note overloading of SQL function name "add_one"
1537
CREATE FUNCTION add_one(double precision) RETURNS double precision
1538
AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one_float8'
1541
CREATE FUNCTION makepoint(point, point) RETURNS point
1542
AS '<replaceable>DIRECTORY</replaceable>/funcs', 'makepoint'
1545
CREATE FUNCTION copytext(text) RETURNS text
1546
AS '<replaceable>DIRECTORY</replaceable>/funcs', 'copytext'
1549
CREATE FUNCTION concat_text(text, text) RETURNS text
1550
AS '<replaceable>DIRECTORY</replaceable>/funcs', 'concat_text',
1556
Here, <replaceable>DIRECTORY</replaceable> stands for the
1557
directory of the shared library file (for instance the
1558
<productname>PostgreSQL</productname> tutorial directory, which
1559
contains the code for the examples used in this section).
1560
(Better style would be to use just <literal>'funcs'</> in the
1561
<literal>AS</> clause, after having added
1562
<replaceable>DIRECTORY</replaceable> to the search path. In any
1563
case, we may omit the system-specific extension for a shared
1564
library, commonly <literal>.so</literal> or
1565
<literal>.sl</literal>.)
1569
Notice that we have specified the functions as <quote>strict</quote>,
1571
the system should automatically assume a null result if any input
1572
value is null. By doing this, we avoid having to check for null inputs
1573
in the function code. Without this, we'd have to check for null values
1574
explicitly, by checking for a null pointer for each
1575
pass-by-reference argument. (For pass-by-value arguments, we don't
1576
even have a way to check!)
1580
Although this calling convention is simple to use,
1581
it is not very portable; on some architectures there are problems
1582
with passing data types that are smaller than <type>int</type> this way. Also, there is
1583
no simple way to return a null result, nor to cope with null arguments
1584
in any way other than making the function strict. The version-1
1585
convention, presented next, overcomes these objections.
1590
<title>Calling Conventions Version 1 for C-Language Functions</title>
1593
The version-1 calling convention relies on macros to suppress most
1594
of the complexity of passing arguments and results. The C declaration
1595
of a version-1 function is always
1597
Datum funcname(PG_FUNCTION_ARGS)
1599
In addition, the macro call
1601
PG_FUNCTION_INFO_V1(funcname);
1603
must appear in the same source file. (Conventionally. it's
1604
written just before the function itself.) This macro call is not
1605
needed for <literal>internal</>-language functions, since
1606
<productname>PostgreSQL</> assumes that all internal functions
1607
use the version-1 convention. It is, however, required for
1608
dynamically-loaded functions.
1612
In a version-1 function, each actual argument is fetched using a
1613
<function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
1614
macro that corresponds to the argument's data type, and the
1615
result is returned using a
1616
<function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
1617
macro for the return type.
1618
<function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
1619
takes as its argument the number of the function argument to
1620
fetch, where the count starts at 0.
1621
<function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
1622
takes as its argument the actual value to return.
1626
Here we show the same functions as above, coded in version-1 style:
1629
#include "postgres.h"
1630
#include <string.h>
1635
PG_FUNCTION_INFO_V1(add_one);
1638
add_one(PG_FUNCTION_ARGS)
1640
int32 arg = PG_GETARG_INT32(0);
1642
PG_RETURN_INT32(arg + 1);
1645
/* by reference, fixed length */
1647
PG_FUNCTION_INFO_V1(add_one_float8);
1650
add_one_float8(PG_FUNCTION_ARGS)
1652
/* The macros for FLOAT8 hide its pass-by-reference nature. */
1653
float8 arg = PG_GETARG_FLOAT8(0);
1655
PG_RETURN_FLOAT8(arg + 1.0);
1658
PG_FUNCTION_INFO_V1(makepoint);
1661
makepoint(PG_FUNCTION_ARGS)
1663
/* Here, the pass-by-reference nature of Point is not hidden. */
1664
Point *pointx = PG_GETARG_POINT_P(0);
1665
Point *pointy = PG_GETARG_POINT_P(1);
1666
Point *new_point = (Point *) palloc(sizeof(Point));
1668
new_point->x = pointx->x;
1669
new_point->y = pointy->y;
1671
PG_RETURN_POINT_P(new_point);
1674
/* by reference, variable length */
1676
PG_FUNCTION_INFO_V1(copytext);
1679
copytext(PG_FUNCTION_ARGS)
1681
text *t = PG_GETARG_TEXT_P(0);
1683
* VARSIZE is the total size of the struct in bytes.
1685
text *new_t = (text *) palloc(VARSIZE(t));
1686
VARATT_SIZEP(new_t) = VARSIZE(t);
1688
* VARDATA is a pointer to the data region of the struct.
1690
memcpy((void *) VARDATA(new_t), /* destination */
1691
(void *) VARDATA(t), /* source */
1692
VARSIZE(t)-VARHDRSZ); /* how many bytes */
1693
PG_RETURN_TEXT_P(new_t);
1696
PG_FUNCTION_INFO_V1(concat_text);
1699
concat_text(PG_FUNCTION_ARGS)
1701
text *arg1 = PG_GETARG_TEXT_P(0);
1702
text *arg2 = PG_GETARG_TEXT_P(1);
1703
int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
1704
text *new_text = (text *) palloc(new_text_size);
1706
VARATT_SIZEP(new_text) = new_text_size;
1707
memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1)-VARHDRSZ);
1708
memcpy(VARDATA(new_text) + (VARSIZE(arg1)-VARHDRSZ),
1709
VARDATA(arg2), VARSIZE(arg2)-VARHDRSZ);
1710
PG_RETURN_TEXT_P(new_text);
1716
The <command>CREATE FUNCTION</command> commands are the same as
1717
for the version-0 equivalents.
1721
At first glance, the version-1 coding conventions may appear to
1722
be just pointless obscurantism. They do, however, offer a number
1723
of improvements, because the macros can hide unnecessary detail.
1724
An example is that in coding <function>add_one_float8</>, we no longer need to
1725
be aware that <type>float8</type> is a pass-by-reference type. Another
1726
example is that the <literal>GETARG</> macros for variable-length types allow
1727
for more efficient fetching of <quote>toasted</quote> (compressed or
1728
out-of-line) values.
1732
One big improvement in version-1 functions is better handling of null
1733
inputs and results. The macro <function>PG_ARGISNULL(<replaceable>n</>)</function>
1734
allows a function to test whether each input is null. (Of course, doing
1735
this is only necessary in functions not declared <quote>strict</>.)
1737
<function>PG_GETARG_<replaceable>xxx</replaceable>()</function> macros,
1738
the input arguments are counted beginning at zero. Note that one
1739
should refrain from executing
1740
<function>PG_GETARG_<replaceable>xxx</replaceable>()</function> until
1741
one has verified that the argument isn't null.
1742
To return a null result, execute <function>PG_RETURN_NULL()</function>;
1743
this works in both strict and nonstrict functions.
1747
Other options provided in the new-style interface are two
1749
<function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
1750
macros. The first of these,
1751
<function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>,
1752
guarantees to return a copy of the specified argument that is
1753
safe for writing into. (The normal macros will sometimes return a
1754
pointer to a value that is physically stored in a table, which
1755
must not be written to. Using the
1756
<function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>
1757
macros guarantees a writable result.)
1758
The second variant consists of the
1759
<function>PG_GETARG_<replaceable>xxx</replaceable>_SLICE()</function>
1760
macros which take three arguments. The first is the number of the
1761
function argument (as above). The second and third are the offset and
1762
length of the segment to be returned. Offsets are counted from
1763
zero, and a negative length requests that the remainder of the
1764
value be returned. These macros provide more efficient access to
1765
parts of large values in the case where they have storage type
1766
<quote>external</quote>. (The storage type of a column can be specified using
1767
<literal>ALTER TABLE <replaceable>tablename</replaceable> ALTER
1768
COLUMN <replaceable>colname</replaceable> SET STORAGE
1769
<replaceable>storagetype</replaceable></literal>. <replaceable>storagetype</replaceable> is one of
1770
<literal>plain</>, <literal>external</>, <literal>extended</literal>,
1771
or <literal>main</>.)
1775
Finally, the version-1 function call conventions make it possible
1776
to return set results (<xref linkend="xfunc-c-return-set">) and
1777
implement trigger functions (<xref linkend="triggers">) and
1778
procedural-language call handlers (<xref
1779
linkend="plhandler">). Version-1 code is also more
1780
portable than version-0, because it does not break restrictions
1781
on function call protocol in the C standard. For more details
1782
see <filename>src/backend/utils/fmgr/README</filename> in the
1783
source distribution.
1788
<title>Writing Code</title>
1791
Before we turn to the more advanced topics, we should discuss
1792
some coding rules for <productname>PostgreSQL</productname>
1793
C-language functions. While it may be possible to load functions
1794
written in languages other than C into
1795
<productname>PostgreSQL</productname>, this is usually difficult
1796
(when it is possible at all) because other languages, such as
1797
C++, FORTRAN, or Pascal often do not follow the same calling
1798
convention as C. That is, other languages do not pass argument
1799
and return values between functions in the same way. For this
1800
reason, we will assume that your C-language functions are
1801
actually written in C.
1805
The basic rules for writing and building C functions are as follows:
1810
Use <literal>pg_config
1811
--includedir-server</literal><indexterm><primary>pg_config</><secondary>with user-defined C functions</></>
1812
to find out where the <productname>PostgreSQL</> server header
1813
files are installed on your system (or the system that your
1814
users will be running on). This option is new with
1815
<productname>PostgreSQL</> 7.2. For
1816
<productname>PostgreSQL</> 7.1 you should use the option
1817
<option>--includedir</option>. (<command>pg_config</command>
1818
will exit with a non-zero status if it encounters an unknown
1819
option.) For releases prior to 7.1 you will have to guess,
1820
but since that was before the current calling conventions were
1821
introduced, it is unlikely that you want to support those
1828
When allocating memory, use the
1829
<productname>PostgreSQL</productname> functions
1830
<function>palloc</function><indexterm><primary>palloc</></> and <function>pfree</function><indexterm><primary>pfree</></>
1831
instead of the corresponding C library functions
1832
<function>malloc</function> and <function>free</function>.
1833
The memory allocated by <function>palloc</function> will be
1834
freed automatically at the end of each transaction, preventing
1841
Always zero the bytes of your structures using
1842
<function>memset</function>. Without this, it's difficult to
1843
support hash indexes or hash joins, as you must pick out only
1844
the significant bits of your data structure to compute a hash.
1845
Even if you initialize all fields of your structure, there may be
1846
alignment padding (holes in the structure) that may contain
1853
Most of the internal <productname>PostgreSQL</productname>
1854
types are declared in <filename>postgres.h</filename>, while
1855
the function manager interfaces
1856
(<symbol>PG_FUNCTION_ARGS</symbol>, etc.) are in
1857
<filename>fmgr.h</filename>, so you will need to include at
1858
least these two files. For portability reasons it's best to
1859
include <filename>postgres.h</filename> <emphasis>first</>,
1860
before any other system or user header files. Including
1861
<filename>postgres.h</filename> will also include
1862
<filename>elog.h</filename> and <filename>palloc.h</filename>
1869
Symbol names defined within object files must not conflict
1870
with each other or with symbols defined in the
1871
<productname>PostgreSQL</productname> server executable. You
1872
will have to rename your functions or variables if you get
1873
error messages to this effect.
1879
Compiling and linking your code so that it can be dynamically
1880
loaded into <productname>PostgreSQL</productname> always
1881
requires special flags. See <xref linkend="dfunc"> for a
1882
detailed explanation of how to do it for your particular
1892
<sect2 id="xfunc-c-pgxs">
1893
<title>Extension Building Infrastructure</title>
1895
<indexterm zone="xfunc-c-pgxs">
1896
<primary>pgxs</primary>
1900
If you are thinking about distributing your
1901
<productname>PostgreSQL</> extension modules, setting up a
1902
portable build system for them can be fairly difficult. Therefore
1903
the <productname>PostgreSQL</> installation provides a build
1904
infrastructure for extensions, called <acronym>PGXS</acronym>, so
1905
that simple extension modules can be built simply against an
1906
already installed server. Note that this infrastructure is not
1907
intended to be a universal build system framework that can be used
1908
to build all software interfacing to <productname>PostgreSQL</>;
1909
it simply automates common build rules for simple server extension
1910
modules. For more complicated packages, you need to write your
1915
To use the infrastructure for your extension, you must write a
1916
simple makefile. In that makefile, you need to set some variables
1917
and finally include the global <acronym>PGXS</acronym> makefile.
1918
Here is an example that builds an extension module named
1919
<literal>isbn_issn</literal> consisting of a shared library, an
1920
SQL script, and a documentation text file:
1923
DATA_built = isbn_issn.sql
1924
DOCS = README.isbn_issn
1926
PGXS := $(shell pg_config --pgxs)
1929
The last two lines should always be the same. Earlier in the
1930
file, you assign variables or add custom
1931
<application>make</application> rules.
1935
The following variables can be set:
1939
<term><varname>MODULES</varname></term>
1942
list of shared objects to be built from source file with same
1943
stem (do not include suffix in this list)
1949
<term><varname>DATA</varname></term>
1952
random files to install into <literal><replaceable>prefix</replaceable>/share/contrib</literal>
1958
<term><varname>DATA_built</varname></term>
1961
random files to install into
1962
<literal><replaceable>prefix</replaceable>/share/contrib</literal>,
1963
which need to be built first
1969
<term><varname>DOCS</varname></term>
1972
random files to install under
1973
<literal><replaceable>prefix</replaceable>/doc/contrib</literal>
1979
<term><varname>SCRIPTS</varname></term>
1982
script files (not binaries) to install into
1983
<literal><replaceable>prefix</replaceable>/bin</literal>
1989
<term><varname>SCRIPTS_built</varname></term>
1992
script files (not binaries) to install into
1993
<literal><replaceable>prefix</replaceable>/bin</literal>,
1994
which need to be built first
2000
<term><varname>REGRESS</varname></term>
2003
list of regression test cases (without suffix)
2009
or at most one of these two:
2013
<term><varname>PROGRAM</varname></term>
2016
a binary program to build (list objects files in <varname>OBJS</varname>)
2022
<term><varname>MODULE_big</varname></term>
2025
a shared object to build (list object files in <varname>OBJS</varname>)
2031
The following can also be set:
2036
<term><varname>EXTRA_CLEAN</varname></term>
2039
extra files to remove in <literal>make clean</literal>
2045
<term><varname>PG_CPPFLAGS</varname></term>
2048
will be added to <varname>CPPFLAGS</varname>
2054
<term><varname>PG_LIBS</varname></term>
2057
will be added to <varname>PROGRAM</varname> link line
2063
<term><varname>SHLIB_LINK</varname></term>
2066
will be added to <varname>MODULE_big</varname> link line
2074
Put this makefile as <literal>Makefile</literal> in the directory
2075
which holds your extension. Then you can do
2076
<literal>make</literal> to compile, and later <literal>make
2077
install</literal> to install your module. The extension is
2078
compiled and installed for the
2079
<productname>PostgreSQL</productname> installation that
2080
corresponds to the first <command>pg_config</command> command
2087
<title>Composite-Type Arguments in C-Language Functions</title>
2090
Composite types do not have a fixed layout like C structures.
2091
Instances of a composite type may contain null fields. In
2092
addition, composite types that are part of an inheritance
2093
hierarchy may have different fields than other members of the
2094
same inheritance hierarchy. Therefore,
2095
<productname>PostgreSQL</productname> provides a function
2096
interface for accessing fields of composite types from C.
2100
Suppose we want to write a function to answer the query
2103
SELECT name, c_overpaid(emp, 1500) AS overpaid
2105
WHERE name = 'Bill' OR name = 'Sam';
2108
Using call conventions version 0, we can define
2109
<function>c_overpaid</> as:
2112
#include "postgres.h"
2113
#include "executor/executor.h" /* for GetAttributeByName() */
2116
c_overpaid(HeapTupleHeader t, /* the current row of emp */
2122
salary = DatumGetInt32(GetAttributeByName(t, "salary", &isnull));
2125
return salary > limit;
2129
In version-1 coding, the above would look like this:
2132
#include "postgres.h"
2133
#include "executor/executor.h" /* for GetAttributeByName() */
2135
PG_FUNCTION_INFO_V1(c_overpaid);
2138
c_overpaid(PG_FUNCTION_ARGS)
2140
HeapTupleHeader t = PG_GETARG_HEAPTUPLEHEADER(0);
2141
int32 limit = PG_GETARG_INT32(1);
2145
salary = GetAttributeByName(t, "salary", &isnull);
2147
PG_RETURN_BOOL(false);
2148
/* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */
2150
PG_RETURN_BOOL(DatumGetInt32(salary) > limit);
2156
<function>GetAttributeByName</function> is the
2157
<productname>PostgreSQL</productname> system function that
2158
returns attributes out of the specified row. It has
2159
three arguments: the argument of type <type>HeapTupleHeader</type> passed
2161
the function, the name of the desired attribute, and a
2162
return parameter that tells whether the attribute
2163
is null. <function>GetAttributeByName</function> returns a <type>Datum</type>
2164
value that you can convert to the proper data type by using the
2165
appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function>
2166
macro. Note that the return value is meaningless if the null flag is
2167
set; always check the null flag before trying to do anything with the
2172
There is also <function>GetAttributeByNum</function>, which selects
2173
the target attribute by column number instead of name.
2177
The following command declares the function
2178
<function>c_overpaid</function> in SQL:
2181
CREATE FUNCTION c_overpaid(emp, integer) RETURNS boolean
2182
AS '<replaceable>DIRECTORY</replaceable>/funcs', 'c_overpaid'
2186
Notice we have used <literal>STRICT</> so that we did not have to
2187
check whether the input arguments were NULL.
2192
<title>Returning Rows (Composite Types) from C-Language Functions</title>
2195
To return a row or composite-type value from a C-language
2196
function, you can use a special API that provides macros and
2197
functions to hide most of the complexity of building composite
2198
data types. To use this API, the source file must include:
2200
#include "funcapi.h"
2205
There are two ways you can build a composite data value (henceforth
2206
a <quote>tuple</>): you can build it from an array of Datum values,
2207
or from an array of C strings that can be passed to the input
2208
conversion functions of the tuple's column data types. In either
2209
case, you first need to obtain or construct a <structname>TupleDesc</>
2210
descriptor for the tuple structure. When working with Datums, you
2211
pass the <structname>TupleDesc</> to <function>BlessTupleDesc</>,
2212
and then call <function>heap_formtuple</> for each row. When working
2213
with C strings, you pass the <structname>TupleDesc</> to
2214
<function>TupleDescGetAttInMetadata</>, and then call
2215
<function>BuildTupleFromCStrings</> for each row. In the case of a
2216
function returning a set of tuples, the setup steps can all be done
2217
once during the first call of the function.
2221
Several helper functions are available for setting up the initial
2222
<structname>TupleDesc</>. If you want to use a named composite type,
2223
you can fetch the information from the system catalogs. Use
2225
TupleDesc RelationNameGetTupleDesc(const char *relname)
2227
to get a <structname>TupleDesc</> for a named relation, or
2229
TupleDesc TypeGetTupleDesc(Oid typeoid, List *colaliases)
2231
to get a <structname>TupleDesc</> based on a type OID. This can
2232
be used to get a <structname>TupleDesc</> for a base or
2233
composite type. When writing a function that returns
2234
<structname>record</>, the expected <structname>TupleDesc</>
2235
must be passed in by the caller.
2239
Once you have a <structname>TupleDesc</>, call
2241
TupleDesc BlessTupleDesc(TupleDesc tupdesc)
2243
if you plan to work with Datums, or
2245
AttInMetadata *TupleDescGetAttInMetadata(TupleDesc tupdesc)
2247
if you plan to work with C strings. If you are writing a function
2248
returning set, you can save the results of these functions in the
2249
<structname>FuncCallContext</> structure — use the
2250
<structfield>tuple_desc</> or <structfield>attinmeta</> field
2255
When working with Datums, use
2257
HeapTuple heap_formtuple(TupleDesc tupdesc, Datum *values, char *nulls)
2259
to build a <structname>HeapTuple</> given user data in Datum form.
2263
When working with C strings, use
2265
HeapTuple BuildTupleFromCStrings(AttInMetadata *attinmeta, char **values)
2267
to build a <structname>HeapTuple</> given user data
2268
in C string form. <literal>values</literal> is an array of C strings,
2269
one for each attribute of the return row. Each C string should be in
2270
the form expected by the input function of the attribute data
2271
type. In order to return a null value for one of the attributes,
2272
the corresponding pointer in the <parameter>values</> array
2273
should be set to <symbol>NULL</>. This function will need to
2274
be called again for each row you return.
2278
Once you have built a tuple to return from your function, it
2279
must be converted into a <type>Datum</>. Use
2281
HeapTupleGetDatum(HeapTuple tuple)
2283
to convert a <structname>HeapTuple</> into a valid Datum. This
2284
<type>Datum</> can be returned directly if you intend to return
2285
just a single row, or it can be used as the current return value
2286
in a set-returning function.
2290
An example appears in the next section.
2295
<sect2 id="xfunc-c-return-set">
2296
<title>Returning Sets from C-Language Functions</title>
2299
There is also a special API that provides support for returning
2300
sets (multiple rows) from a C-language function. A set-returning
2301
function must follow the version-1 calling conventions. Also,
2302
source files must include <filename>funcapi.h</filename>, as
2307
A set-returning function (<acronym>SRF</>) is called
2308
once for each item it returns. The <acronym>SRF</> must
2309
therefore save enough state to remember what it was doing and
2310
return the next item on each call.
2311
The structure <structname>FuncCallContext</> is provided to help
2312
control this process. Within a function, <literal>fcinfo->flinfo->fn_extra</>
2313
is used to hold a pointer to <structname>FuncCallContext</>
2319
* Number of times we've been called before
2321
* call_cntr is initialized to 0 for you by SRF_FIRSTCALL_INIT(), and
2322
* incremented for you every time SRF_RETURN_NEXT() is called.
2327
* OPTIONAL maximum number of calls
2329
* max_calls is here for convenience only and setting it is optional.
2330
* If not set, you must provide alternative means to know when the
2336
* OPTIONAL pointer to result slot
2338
* This is obsolete and only present for backwards compatibility, viz,
2339
* user-defined SRFs that use the deprecated TupleDescGetSlot().
2341
TupleTableSlot *slot;
2344
* OPTIONAL pointer to miscellaneous user-provided context information
2346
* user_fctx is for use as a pointer to your own data to retain
2347
* arbitrary context information between calls of your function.
2352
* OPTIONAL pointer to struct containing attribute type input metadata
2354
* attinmeta is for use when returning tuples (i.e., composite data types)
2355
* and is not used when returning base data types. It is only needed
2356
* if you intend to use BuildTupleFromCStrings() to create the return
2359
AttInMetadata *attinmeta;
2362
* memory context used for structures that must live for multiple calls
2364
* multi_call_memory_ctx is set by SRF_FIRSTCALL_INIT() for you, and used
2365
* by SRF_RETURN_DONE() for cleanup. It is the most appropriate memory
2366
* context for any memory that is to be reused across multiple calls
2369
MemoryContext multi_call_memory_ctx;
2372
* OPTIONAL pointer to struct containing tuple description
2374
* tuple_desc is for use when returning tuples (i.e. composite data types)
2375
* and is only needed if you are going to build the tuples with
2376
* heap_formtuple() rather than with BuildTupleFromCStrings(). Note that
2377
* the TupleDesc pointer stored here should usually have been run through
2378
* BlessTupleDesc() first.
2380
TupleDesc tuple_desc;
2387
An <acronym>SRF</> uses several functions and macros that
2388
automatically manipulate the <structname>FuncCallContext</>
2389
structure (and expect to find it via <literal>fn_extra</>). Use
2393
to determine if your function is being called for the first or a
2394
subsequent time. On the first call (only) use
2396
SRF_FIRSTCALL_INIT()
2398
to initialize the <structname>FuncCallContext</>. On every function call,
2399
including the first, use
2403
to properly set up for using the <structname>FuncCallContext</>
2404
and clearing any previously returned data left over from the
2409
If your function has data to return, use
2411
SRF_RETURN_NEXT(funcctx, result)
2413
to return it to the caller. (<literal>result</> must be of type
2414
<type>Datum</>, either a single value or a tuple prepared as
2415
described above.) Finally, when your function is finished
2418
SRF_RETURN_DONE(funcctx)
2420
to clean up and end the <acronym>SRF</>.
2424
The memory context that is current when the <acronym>SRF</> is called is
2425
a transient context that will be cleared between calls. This means
2426
that you do not need to call <function>pfree</> on everything
2427
you allocated using <function>palloc</>; it will go away anyway. However, if you want to allocate
2428
any data structures to live across calls, you need to put them somewhere
2429
else. The memory context referenced by
2430
<structfield>multi_call_memory_ctx</> is a suitable location for any
2431
data that needs to survive until the <acronym>SRF</> is finished running. In most
2432
cases, this means that you should switch into
2433
<structfield>multi_call_memory_ctx</> while doing the first-call setup.
2437
A complete pseudo-code example looks like the following:
2440
my_set_returning_function(PG_FUNCTION_ARGS)
2442
FuncCallContext *funcctx;
2444
MemoryContext oldcontext;
2445
<replaceable>further declarations as needed</replaceable>
2447
if (SRF_IS_FIRSTCALL())
2449
funcctx = SRF_FIRSTCALL_INIT();
2450
oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
2451
/* One-time setup code appears here: */
2452
<replaceable>user code</replaceable>
2453
<replaceable>if returning composite</replaceable>
2454
<replaceable>build TupleDesc, and perhaps AttInMetadata</replaceable>
2455
<replaceable>endif returning composite</replaceable>
2456
<replaceable>user code</replaceable>
2457
MemoryContextSwitchTo(oldcontext);
2460
/* Each-time setup code appears here: */
2461
<replaceable>user code</replaceable>
2462
funcctx = SRF_PERCALL_SETUP();
2463
<replaceable>user code</replaceable>
2465
/* this is just one way we might test whether we are done: */
2466
if (funcctx->call_cntr < funcctx->max_calls)
2468
/* Here we want to return another item: */
2469
<replaceable>user code</replaceable>
2470
<replaceable>obtain result Datum</replaceable>
2471
SRF_RETURN_NEXT(funcctx, result);
2475
/* Here we are done returning items and just need to clean up: */
2476
<replaceable>user code</replaceable>
2477
SRF_RETURN_DONE(funcctx);
2484
A complete example of a simple <acronym>SRF</> returning a composite type looks like:
2486
PG_FUNCTION_INFO_V1(testpassbyval);
2489
testpassbyval(PG_FUNCTION_ARGS)
2491
FuncCallContext *funcctx;
2495
AttInMetadata *attinmeta;
2497
/* stuff done only on the first call of the function */
2498
if (SRF_IS_FIRSTCALL())
2500
MemoryContext oldcontext;
2502
/* create a function context for cross-call persistence */
2503
funcctx = SRF_FIRSTCALL_INIT();
2505
/* switch to memory context appropriate for multiple function calls */
2506
oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
2508
/* total number of tuples to be returned */
2509
funcctx->max_calls = PG_GETARG_UINT32(0);
2511
/* Build a tuple description for a __testpassbyval tuple */
2512
tupdesc = RelationNameGetTupleDesc("__testpassbyval");
2515
* generate attribute metadata needed later to produce tuples from raw
2518
attinmeta = TupleDescGetAttInMetadata(tupdesc);
2519
funcctx->attinmeta = attinmeta;
2521
MemoryContextSwitchTo(oldcontext);
2524
/* stuff done on every call of the function */
2525
funcctx = SRF_PERCALL_SETUP();
2527
call_cntr = funcctx->call_cntr;
2528
max_calls = funcctx->max_calls;
2529
attinmeta = funcctx->attinmeta;
2531
if (call_cntr < max_calls) /* do when there is more left to send */
2538
* Prepare a values array for building the returned tuple.
2539
* This should be an array of C strings which will
2540
* be processed later by the type input functions.
2542
values = (char **) palloc(3 * sizeof(char *));
2543
values[0] = (char *) palloc(16 * sizeof(char));
2544
values[1] = (char *) palloc(16 * sizeof(char));
2545
values[2] = (char *) palloc(16 * sizeof(char));
2547
snprintf(values[0], 16, "%d", 1 * PG_GETARG_INT32(1));
2548
snprintf(values[1], 16, "%d", 2 * PG_GETARG_INT32(1));
2549
snprintf(values[2], 16, "%d", 3 * PG_GETARG_INT32(1));
2552
tuple = BuildTupleFromCStrings(attinmeta, values);
2554
/* make the tuple into a datum */
2555
result = HeapTupleGetDatum(tuple);
2557
/* clean up (this is not really necessary) */
2563
SRF_RETURN_NEXT(funcctx, result);
2565
else /* do when there is no more left */
2567
SRF_RETURN_DONE(funcctx);
2572
The SQL code to declare this function is:
2574
CREATE TYPE __testpassbyval AS (f1 integer, f2 integer, f3 integer);
2576
CREATE OR REPLACE FUNCTION testpassbyval(integer, integer) RETURNS SETOF __testpassbyval
2577
AS '<replaceable>filename</>', 'testpassbyval'
2578
LANGUAGE C IMMUTABLE STRICT;
2583
The directory <filename>contrib/tablefunc</> in the source
2584
distribution contains more examples of set-returning functions.
2589
<title>Polymorphic Arguments and Return Types</title>
2592
C-language functions may be declared to accept and
2593
return the polymorphic types
2594
<type>anyelement</type> and <type>anyarray</type>.
2595
See <xref linkend="extend-types-polymorphic"> for a more detailed explanation
2596
of polymorphic functions. When function arguments or return types
2597
are defined as polymorphic types, the function author cannot know
2598
in advance what data type it will be called with, or
2599
need to return. There are two routines provided in <filename>fmgr.h</>
2600
to allow a version-1 C function to discover the actual data types
2601
of its arguments and the type it is expected to return. The routines are
2602
called <literal>get_fn_expr_rettype(FmgrInfo *flinfo)</> and
2603
<literal>get_fn_expr_argtype(FmgrInfo *flinfo, int argnum)</>.
2604
They return the result or argument type OID, or <symbol>InvalidOid</symbol> if the
2605
information is not available.
2606
The structure <literal>flinfo</> is normally accessed as
2607
<literal>fcinfo->flinfo</>. The parameter <literal>argnum</>
2612
For example, suppose we want to write a function to accept a single
2613
element of any type, and return a one-dimensional array of that type:
2616
PG_FUNCTION_INFO_V1(make_array);
2618
make_array(PG_FUNCTION_ARGS)
2621
Oid element_type = get_fn_expr_argtype(fcinfo->flinfo, 0);
2630
if (!OidIsValid(element_type))
2631
elog(ERROR, "could not determine data type of input");
2633
/* get the provided element */
2634
element = PG_GETARG_DATUM(0);
2636
/* we have one dimension */
2638
/* and one element */
2640
/* and lower bound is 1 */
2643
/* get required info about the element type */
2644
get_typlenbyvalalign(element_type, &typlen, &typbyval, &typalign);
2646
/* now build the array */
2647
result = construct_md_array(&element, ndims, dims, lbs,
2648
element_type, typlen, typbyval, typalign);
2650
PG_RETURN_ARRAYTYPE_P(result);
2656
The following command declares the function
2657
<function>make_array</function> in SQL:
2660
CREATE FUNCTION make_array(anyelement) RETURNS anyarray
2661
AS '<replaceable>DIRECTORY</replaceable>/funcs', 'make_array'
2665
Note the use of <literal>STRICT</literal>; this is essential
2666
since the code is not bothering to test for a null input.
2671
<!-- Keep this comment at the end of the file
2676
sgml-minimize-attributes:nil
2677
sgml-always-quote-attributes:t
2680
sgml-parent-document:nil
2681
sgml-default-dtd-file:"./reference.ced"
2682
sgml-exposed-tags:nil
2683
sgml-local-catalogs:("/usr/lib/sgml/catalog")
2684
sgml-local-ecat-files:nil